Intermediate Code Generation in Compiler Design

In the analysis-synthesis model of a compiler, the front end translates a source program into a machine-independent intermediate code, and the back end then uses this intermediate code to generate the target code (which the machine can execute). The benefits of using machine-independent intermediate code are:

  • Portability is enhanced. For example, if a compiler translates the source language directly into its target machine language with no intermediate code, then a full native compiler is required for each new machine, because the compiler itself has to be modified to match each machine's specifications.
  • Retargeting is facilitated.
  • It is easier to improve the performance of the generated code, because optimizations can be applied to the intermediate code.

If we generate machine code directly from source code, then for n target machines we need n optimizers and n code generators. With a machine-independent intermediate code, we need only one optimizer, shared by all n code generators. Intermediate code can be either language-specific (e.g., bytecode for Java) or language-independent (e.g., three-address code). The following are commonly used intermediate code representations:

  1. Postfix Notation: Also known as reverse Polish notation or suffix notation. The ordinary (infix) way of writing the sum of a and b is with the operator in the middle: a + b. The postfix notation for the same expression places the operator at the right end: ab +. In general, if e1 and e2 are any postfix expressions and + is any binary operator, the result of applying + to the values denoted by e1 and e2 is written in postfix notation as e1 e2 +. No parentheses are needed in postfix notation, because the position and arity (number of arguments) of the operators permit only one way to decode a postfix expression. In postfix notation, the operator follows its operands.
    Example 1: The postfix representation of the expression (a + b) * c is : ab + c *
    Example 2: The postfix representation of the expression (a – b) * (c + d) + (a – b) is : ab – cd + * ab – +
    Read more: Infix to Postfix
  2. Three-Address Code: A statement involving no more than three references (two for operands and one for the result) is known as a three-address statement. A sequence of three-address statements is known as three-address code. A three-address statement has the form x = y op z, where x, y, and z each have an address (memory location) and op is an operator. A statement may contain fewer than three references and still be called a three-address statement.
    Example: The three-address code for the expression a + b * c + d :
    T1 = b * c
    T2 = a + T1
    T3 = T2 + d
    T1, T2, T3 are temporary variables.

    There are 3 ways to represent a Three-Address Code in compiler design: 
    i) Quadruples
    ii) Triples
    iii) Indirect Triples
    Read more: Three-address code

  3. Syntax Tree: A syntax tree is a condensed form of a parse tree. The operator and keyword nodes of the parse tree are moved to their parents, and a chain of single productions is replaced by a single link. In the syntax tree, the internal nodes are operators and the child nodes are operands. To form a syntax tree, first fully parenthesize the expression; this makes it easy to see which operation should be performed first.
    Example: x = (a + b * c) / (a – b * c)
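
The three representations above are closely related, which a small sketch can make concrete. The Python below is an illustration only, not a production compiler pass: the names Leaf, BinOp, to_postfix, to_three_address, format_quads and eval_postfix are hypothetical, the syntax tree is built by hand rather than by a parser, and the three-address code is stored as quadruples (op, arg1, arg2, result). Tokens in the postfix string are separated by spaces, and ASCII "-" stands for the minus sign.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    name: str              # an operand such as "a"


@dataclass
class BinOp:
    op: str                # an operator such as "+" or "*"
    left: "Node"
    right: "Node"


Node = Union[Leaf, BinOp]
Quad = Tuple[str, str, str, str]   # (op, arg1, arg2, result)


def to_postfix(node: Node) -> str:
    """Post-order walk: both operands first, then the operator."""
    if isinstance(node, Leaf):
        return node.name
    return f"{to_postfix(node.left)} {to_postfix(node.right)} {node.op}"


def to_three_address(node: Node) -> Tuple[str, List[Quad]]:
    """Return the name holding the final value and the list of quadruples."""
    quads: List[Quad] = []

    def walk(n: Node) -> str:
        if isinstance(n, Leaf):
            return n.name
        left = walk(n.left)
        right = walk(n.right)
        temp = f"T{len(quads) + 1}"     # fresh temporary for this operator
        quads.append((n.op, left, right, temp))
        return temp

    return walk(node), quads


def format_quads(quads: List[Quad]) -> List[str]:
    """Render each quadruple in the x = y op z form."""
    return [f"{res} = {a1} {op} {a2}" for op, a1, a2, res in quads]


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_postfix(expr: str, env: Dict[str, float]) -> float:
    """Decode a postfix string with a stack; no parentheses are needed."""
    stack: List[float] = []
    for tok in expr.split():
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])
    return stack.pop()


if __name__ == "__main__":
    # Syntax tree for a + b * c + d, i.e. ((a + (b * c)) + d)
    tree = BinOp("+",
                 BinOp("+", Leaf("a"), BinOp("*", Leaf("b"), Leaf("c"))),
                 Leaf("d"))
    print(to_postfix(tree))                  # a b c * + d +
    _, quads = to_three_address(tree)
    for line in format_quads(quads):         # T1 = b * c, T2 = a + T1, T3 = T2 + d
        print(line)
    print(eval_postfix(to_postfix(tree), {"a": 1, "b": 2, "c": 3, "d": 4}))  # 11
```

Both outputs come from the same post-order walk of the tree: emitting operator symbols gives postfix notation, while emitting a fresh temporary for each operator node gives three-address code. Note that this sketch numbers temporaries in the order operators are visited and does not detect repeated subexpressions such as b * c in the syntax tree example.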


Advantages of Intermediate Code Generation:

Easier to implement: Splitting translation into a front end that produces intermediate code and a back end that consumes it can simplify code generation, because the back end works on a small, regular set of operations instead of the full source language.

Facilitates code optimization: Intermediate code generation can enable the use of various code optimization techniques, leading to improved performance and efficiency of the generated code.

Platform independence: Intermediate code is platform-independent, meaning that it can be translated into machine code or bytecode for any platform for which a back end exists.

Code reuse: The same intermediate code, and the front end that produces it, can be reused to generate code for other target platforms.

Easier debugging: Intermediate code can be easier to debug than machine code or bytecode, as it is closer to the original source code.

Disadvantages of Intermediate Code Generation:

Increased compilation time: Generating and then processing an intermediate representation adds extra work to the compiler, which can increase compilation time. This matters most where compilation itself is time-critical.

Additional memory usage: Intermediate code generation requires additional memory to store the intermediate representation, which can be a concern for memory-limited systems.

Increased complexity: Intermediate code generation can increase the complexity of the compiler design, making it harder to implement and maintain.

Reduced performance: Translating through a generic intermediate form can, in some cases, produce target code that executes slower than code generated directly for a specific machine.


Last Updated : 10 Apr, 2023