Introduction of Compiler Design
The compiler is software that converts a program written in a high-level language (Source Language) to a low-level language (Object/Target/Machine Language/0, 1’s).
A translator or language processor is a program that translates an input program written in a programming language into an equivalent program in another language. The compiler is a type of translator, which takes a program written in a high-level programming language as input and translates it into an equivalent program in low-level languages such as machine language or assembly language.
The program written in a high-level language is known as a source program, and the program converted into a low-level language is known as an object (or target) program. Without compilation, no program written in a high-level language can be executed. For every programming language, we have a different compiler; however, the basic tasks performed by every compiler are the same. The process of translating the source code into machine code involves several stages, including lexical analysis, syntax analysis, semantic analysis, code generation, and optimization.
High-Level Programming Language
A high-level programming language is a language that has an abstraction of attributes of the computer. High-level programming is more convenient to the user in writing a program.
Low-Level Programming Language
A low-Level Programming language is a language that doesn’t require programming ideas and concepts.
Stages of Compiler Design
- Lexical Analysis: The first stage of compiler design is lexical analysis, also known as scanning. In this stage, the compiler reads the source code character by character and breaks it down into a series of tokens, such as keywords, identifiers, and operators. These tokens are then passed on to the next stage of the compilation process.
- Syntax Analysis: The second stage of compiler design is syntax analysis, also known as parsing. In this stage, the compiler checks the syntax of the source code to ensure that it conforms to the rules of the programming language. The compiler builds a parse tree, which is a hierarchical representation of the program’s structure, and uses it to check for syntax errors.
- Semantic Analysis: The third stage of compiler design is semantic analysis. In this stage, the compiler checks the meaning of the source code to ensure that it makes sense. The compiler performs type checking, which ensures that variables are used correctly and that operations are performed on compatible data types. The compiler also checks for other semantic errors, such as undeclared variables and incorrect function calls.
- Code Generation: The fourth stage of compiler design is code generation. In this stage, the compiler translates the parse tree into machine code that can be executed by the computer. The code generated by the compiler must be efficient and optimized for the target platform.
- Optimization: The final stage of compiler design is optimization. In this stage, the compiler analyzes the generated code and makes optimizations to improve its performance. The compiler may perform optimizations such as constant folding, loop unrolling, and function inlining.
Overall, compiler design is a complex process that involves multiple stages and requires a deep understanding of both the programming language and the target platform. A well-designed compiler can greatly improve the efficiency and performance of software programs, making them more useful and valuable for users.
- Cross Compiler that runs on a machine ‘A’ and produces a code for another machine ‘B’. It is capable of creating code for a platform other than the one on which the compiler is running.
- Source-to-source Compiler or transcompiler or transpiler is a compiler that translates source code written in one programming language into the source code of another programming language.
Language Processing Systems
We know a computer is a logical assembly of Software and Hardware. The hardware knows a language, that is hard for us to grasp, consequently, we tend to write programs in a high-level language, that is much less complicated for us to comprehend and maintain in our thoughts. Now, these programs go through a series of transformations so that they can readily be used by machines. This is where language procedure systems come in handy.
- High-Level Language: If a program contains #define or #include directives such as #include or #define it is called HLL. They are closer to humans but far from machines. These (#) tags are called preprocessor directives. They direct the pre-processor about what to do.
- Pre-Processor: The pre-processor removes all the #include directives by including the files called file inclusion and all the #define directives using macro expansion. It performs file inclusion, augmentation, macro-processing, etc.
- Assembly Language: It’s neither in binary form nor high level. It is an intermediate state that is a combination of machine instructions and some other useful data needed for execution.
- Assembler: For every platform (Hardware + OS) we will have an assembler. They are not universal since for each platform we have one. The output of the assembler is called an object file. Its translates assembly language to machine code.
- Interpreter: An interpreter converts high-level language into low-level machine language, just like a compiler. But they are different in the way they read the input. The Compiler in one go reads the inputs, does the processing, and executes the source code whereas the interpreter does the same line by line. A compiler scans the entire program and translates it as a whole into machine code whereas an interpreter translates the program one statement at a time. Interpreted programs are usually slower concerning compiled ones. For example: Let in the source program, it is written #include “Stdio. h”. Pre-Processor replaces this file with its contents in the produced output. The basic work of a linker is to merge object codes (that have not even been connected), produced by the compiler, assembler, standard library function, and operating system resources. The codes generated by the compiler, assembler, and linker are generally re-located by their nature, which means to say, the starting location of these codes is not determined, which means they can be anywhere in the computer memory, Thus the basic task of loaders to find/calculate the exact address of these memory locations.
- Relocatable Machine Code: It can be loaded at any point and can be run. The address within the program will be in such a way that it will cooperate with the program movement.
- Loader/Linker: Loader/Linker converts the relocatable code into absolute code and tries to run the program resulting in a running program or an error message (or sometimes both can happen). Linker loads a variety of object files into a single file to make it executable. Then loader loads it in memory and executes it.
Types of Compiler
There are mainly three types of compilers.
- Single Pass Compilers
- Two Pass Compilers
- Multipass Compilers
Single Pass Compiler
When all the phases of the compiler are present inside a single module, it is simply called a single-pass compiler. It performs the work of converting source code to machine code.
Two Pass Compiler
Two-pass compiler is a compiler in which the program is translated twice, once from the front end and the back from the back end known as Two Pass Compiler.
When several intermediate codes are created in a program and a syntax tree is processed many times, it is called Multi pass Compiler. It breaks codes into smaller programs.
Phases of a Compiler
There are two major phases of compilation, which in turn have many parts. Each of them takes input from the output of the previous level and works in a coordinated way.
An intermediate representation is created from the given source code :
The lexical analyzer divides the program into “tokens”, the Syntax analyzer recognizes “sentences” in the program using the syntax of the language and the Semantic analyzer checks the static semantics of each construct. Intermediate Code Generator generates “abstract” code.
An equivalent target program is created from the intermediate representation. It has two parts :
Code Optimizer optimizes the abstract code, and the final Code Generator translates abstract intermediate code into specific machine instructions.
Operations of Compiler
These are some operations that are done by the compiler.
- It breaks source programs into smaller parts.
- It enables the creation of symbol tables and intermediate representations.
- It helps in code compilation and error detection.
- it saves all codes and variables.
- It analyses the full program and translates it.
- Convert source code to machine code.
Advantages of Compiler Design
- Efficiency: Compiled programs are generally more efficient than interpreted programs because the machine code produced by the compiler is optimized for the specific hardware platform on which it will run.
- Portability: Once a program is compiled, the resulting machine code can be run on any computer or device that has the appropriate hardware and operating system, making it highly portable.
- Error Checking: Compilers perform comprehensive error checking during the compilation process, which can help catch syntax, semantic, and logical errors in the code before it is run.
- Optimizations: Compilers can make various optimizations to the generated machine code, such as eliminating redundant instructions or rearranging code for better performance.
Disadvantages of Compiler Design
- Longer Development Time: Developing a compiler is a complex and time-consuming process that requires a deep understanding of both the programming language and the target hardware platform.
- Debugging Difficulties: Debugging compiled code can be more difficult than debugging interpreted code because the generated machine code may not be easy to read or understand.
- Lack of Interactivity: Compiled programs are typically less interactive than interpreted programs because they must be compiled before they can be run, which can slow down the development and testing process.
- Platform-Specific Code: If the compiler is designed to generate machine code for a specific hardware platform, the resulting code may not be portable to other platforms.
GATE CS Corner Questions
Practicing the following questions will help you test your knowledge. All questions have been asked in GATE in previous years or GATE Mock Tests. It is highly recommended that you practice them.
- GATE CS 2011, Question 1
- GATE CS 2011, Question 19
- GATE CS 2009, Question 17
- GATE CS 1998, Question 27
- GATE CS 2008, Question 85
- GATE CS 1997, Question 8
- GATE CS 2014 (Set 3), Question 65
- GATE CS 2015 (Set 2), Question 29
Please Login to comment...