What is a Compiler?
Created | Updated Jul 17, 2003
Computer programs are written in programming languages such as Java, Pascal or Basic, but a processor cannot run them directly, because processors can only understand machine code. Something is therefore needed to translate the program into a form understood by the CPU (Central Processing Unit - the brain of the computer). This is what a compiler does (G. M. Schneider, J. L. Gersting 2000).
We don’t usually write programs directly in machine code, even though that is the form the processor understands. One reason is that it is very difficult to write complicated programs in machine code: each instruction can only do something very simple, and the entire language consists of zeros and ones, which is not user friendly. Another reason is that each kind of processor has its own version of machine code, so programs written to work on one kind of processor may not work on a computer with a different processor. A program written in a higher level language (a language which is more user friendly and needs to be translated to low level machine code) can, however, be translated to the machine code of any processor for which a compiler exists (C. Horstmann 2000).
The process of compilation happens in several phases.
Phase I - Lexical Analysis
Phase II - Parsing
Phase III - Semantic Analysis and Code Generation
Phase IV - Code Optimisation
Despite the technical-sounding names of these phases, what each phase actually does is quite easy to understand.
Phase I: Lexical analysis
In this first stage of compilation the compiler reads through the entire program and breaks it up into the words and symbols that it recognises (known as tokens), such as keywords, names, numbers and operators.
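As a rough illustration of this phase, the following Python sketch splits a line of a made-up toy language into tokens. The token categories (NUMBER, NAME, OP) and the function name tokenize are assumptions chosen for the example, not part of any particular compiler.

```python
import re

# Hypothetical token categories for a toy language (assumed for illustration).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source text into (kind, text) pairs, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one itself
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("total = price * 2"))
```

A character that belongs to none of the categories, such as $ in this toy language, is reported as an error straight away.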
Phase II: Parsing
In this phase the compiler checks whether the words found in Phase I are arranged into commands that make sense and follow all of the grammatical rules of the language.
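A minimal sketch of such a check, assuming a toy grammar for arithmetic expressions and a list of tokens given as plain strings, might look like the following. The function name check_expression and the grammar itself are made up for the example; real parsers usually also build a tree describing the program.

```python
def check_expression(tokens):
    """Return True if the tokens form a valid arithmetic expression.

    Assumed toy grammar:
        expression := term (('+' | '-') term)*
        term       := factor (('*' | '/') factor)*
        factor     := number | name | '(' expression ')'
    Raises SyntaxError if the rules are broken.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is not None and (tok.isdigit() or tok.isidentifier()):
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() in ("*", "/"):
            advance()
            factor()

    def expression():
        term()
        while peek() in ("+", "-"):
            advance()
            term()

    expression()
    if peek() is not None:  # leftover tokens mean the input was not one expression
        raise SyntaxError(f"unexpected token {peek()!r}")
    return True


print(check_expression(["2", "*", "(", "x", "+", "1", ")"]))
```

Every token in a sequence such as 2 * + is a word the compiler recognises, yet the sequence is rejected because the words are not arranged according to the rules.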
Phase III: Semantic analysis and code generation
Once the compiler has found the words it understands and decided that the commands make sense, it translates them into the equivalent machine code.
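Real machine code is specific to each processor, so the sketch below translates an arithmetic expression into instructions for an imaginary stack machine instead. The instruction names (PUSH, LOAD, ADD and so on) are invented for the example, and Python's standard ast module is used as a stand-in for the first two phases.

```python
import ast

# Hypothetical instruction names for an imaginary stack machine.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}


def generate(node):
    """Translate a parsed expression into a list of stack-machine instructions."""
    if isinstance(node, ast.Expression):
        return generate(node.body)
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
        # Evaluate both operands first, then apply the operator to them.
        return generate(node.left) + generate(node.right) + [OPCODES[type(node.op)]]
    raise NotImplementedError(f"cannot translate {type(node).__name__}")


def compile_expression(source):
    """Parse the source text, then generate instructions for it."""
    return generate(ast.parse(source, mode="eval"))


print(compile_expression("price * 2 + 1"))
```

Each high level operation becomes several very simple instructions, which is one reason why writing the low level form by hand is so laborious.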
Phase IV: Code optimisation
In this phase the machine code is examined to see whether it can be made more efficient, for example by removing parts of the code which don’t do anything or are unnecessary.
(G. M. Schneider, J. L. Gersting 2000).
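One simple way to remove code that does nothing can be sketched as follows, again using instructions for an imaginary stack machine. The instruction names and the list of useless pairs are assumptions made for the example, and integer arithmetic is assumed; real optimisers apply many more techniques than this.

```python
# Pairs of (hypothetical) instructions that leave the value on top of the
# stack unchanged: adding or subtracting 0, multiplying or dividing by 1.
USELESS_PAIRS = {
    ("PUSH 0", "ADD"),
    ("PUSH 0", "SUB"),
    ("PUSH 1", "MUL"),
    ("PUSH 1", "DIV"),
}


def optimise(instructions):
    """Return a copy of the instruction list without instructions that do nothing."""
    result = []
    for instruction in instructions:
        if instruction == "NOP":  # an instruction that explicitly does nothing
            continue
        if result and (result[-1], instruction) in USELESS_PAIRS:
            result.pop()  # drop the PUSH and skip the operator as well
            continue
        result.append(instruction)
    return result


print(optimise(["LOAD price", "PUSH 1", "MUL", "NOP", "PUSH 0", "ADD"]))
```

The six instructions in the usage line reduce to the single instruction LOAD price, because multiplying by 1 and adding 0 change nothing.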
One bonus of using a compiler is that it will inform you of any errors it finds in your code so that they can be corrected. However, most compilers will not finish compilation if they find errors in the code.
An alternative to a compiler is an interpreter. An interpreter also translates high level languages so the processor can understand them, but it works differently to a compiler. An interpreter translates each programmed command only when it is needed, so no permanent machine code is made. Basically, interpreters work in the same way as human interpreters, translating what is said as it is said (G. M. Schneider, J. L. Gersting 2000).
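The idea of translating one command at a time can be sketched with a very small interpreter for a made-up command language. The commands set, add and print, and the function name interpret, are invented for the example.

```python
def interpret(program):
    """Run a toy program one line at a time and return the values it printed."""
    variables = {}
    output = []
    for number, line in enumerate(program.splitlines(), start=1):
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        command, args = parts[0], parts[1:]
        # Each line is examined only when execution reaches it.
        if command == "set":
            variables[args[0]] = int(args[1])
        elif command == "add":
            variables[args[0]] += int(args[1])
        elif command == "print":
            output.append(variables[args[0]])
        else:
            raise SyntaxError(f"line {number}: unknown command {command!r}")
    return output


print(interpret("set x 5\nadd x 2\nprint x"))
```

Nothing is saved after the program has run: running it again means reading and translating every line again.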
The fact that a compiler translates the entire code can be both an advantage and a disadvantage, depending on the situation. Because the compiler checks every line of code, it can report errors anywhere in the program, including in parts that are rarely run, so that the programmer can correct them. An interpreter, however, will only find errors in the lines of code which have actually been used. If a part of the program is used very rarely, errors in it may go unnoticed, so much more stringent testing of interpreted programs is required. On the other hand, the fact that interpreters don’t check the entire program can be an advantage: it allows programmers to run unfinished programs, prototypes and programs with known errors, whereas most compilers will not compile programs with errors in them.
Another disadvantage of a compiler is that each time a program is modified it has to be recompiled before it is run, which can be time consuming if it is a large program. However, this does mean that each time the program is modified the whole program is checked for errors.
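The point about rarely used code can be demonstrated in Python, which checks the grammar of a whole file before running it but only looks up names when a line is actually executed. The function report and its deliberate bug are made up for the example.

```python
def report(total):
    """Describe a total; the branch for negative totals contains a deliberate bug."""
    if total < 0:
        # Bug: 'amount' is never defined. Nothing complains until this line runs.
        return "refund of " + amount
    return "total is " + str(total)


print(report(10))  # the commonly used path works without any complaint

try:
    report(-5)  # the rarely used path fails only when it is finally reached
except NameError as error:
    print("found only at run time:", error)
```

A compiler for a language that checks every name before producing machine code would typically refuse to compile the equivalent program, so the mistake would be found before the program was ever run.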