Introduction to Intermediate Representation (IR)
Intermediate Representation (IR), as the name suggests, is any representation of a program between the source and target languages. The intermediate form of the program being compiled is the central data structure in a compiler. A compiler may use a single IR or a series of IRs. The decisions made while designing the IR affect the speed and efficiency of the compiler.
Properties of IRs:
Different compilers give these properties different priorities.
The five properties of IRs are:
- Ease of generation
- Ease of manipulation
- Freedom of expression
- Size of the procedure
- Level of abstraction
Level of Abstraction:
- The amount of detail exposed in the IR influences the feasibility and profitability of different optimizations.
- Generally, IRs are classified into three levels of abstraction.
(i) High Level: It is close to the source language and is good for memory disambiguation. It retains procedure calls and structured objects such as structures and arrays, so a single IR operation may represent an array access or a procedure call. Here, one IR operation implies several target machine operations.
(ii) Medium Level: It does not involve any structured objects, but it is still independent of the target language. It can be source-oriented or target-oriented.
(iii) Low Level: It is extremely close to the target language. A single IR operation represents one simple operation in the procedure. Here, several IR operations may implement one target machine operation.
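To make the difference between the levels concrete, the sketch below expands one high-level operation (loading an array element) into the address arithmetic that a lower-level IR would expose. The tuple layout, the function name lower_array_load, the temporaries t1 and t2, and the 4-byte element size are all illustrative assumptions, not the format of any particular compiler.

```python
def lower_array_load(dest, base, index, elem_size=4):
    """Expand the high-level operation `dest = base[index]` into low-level steps.

    Each low-level operation is a tuple (opcode, result, operand1, operand2).
    The temporaries t1 and t2 are hypothetical names chosen for this sketch.
    """
    return [
        ("mul", "t1", index, elem_size),  # byte offset = index * element size
        ("add", "t2", base, "t1"),        # address = base + byte offset
        ("load", dest, "t2", None),       # fetch the value stored at that address
    ]


# High level: a single operation stands for the whole access x = A[i].
high_level = ("array_load", "x", "A", "i")

# Low level: the same access spelled out as three simpler operations.
low_level = lower_array_load("x", "A", "i")

for op in low_level:
    print(op)
```

The high-level form keeps the fact that A is an array visible, which helps analyses such as memory disambiguation. The low-level form exposes the multiply and add, which gives the optimizer a chance to improve the address calculation.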
Reasons for using Intermediate Representations (IRs):
- To separate the analysis of the source code from the synthesis of the target code, since translating code from one form to another requires both.
- To perform machine-independent optimizations.
- To make translation simpler.
Types of IRs:
There are three major categories of Intermediate Representation (IR).
1. Graphical IR:
- It is graphically oriented and it tends to be large. This type of IR is used heavily in source-to-source translators.
- The underlying code is represented as a graph.
- Examples: Directed Acyclic Graphs (DAGs), Trees
- It is further divided into two types:
(i) Syntax-Related Trees
- Examples: Parse Trees, Abstract Syntax Trees (ASTs), Directed Acyclic Graphs (DAGs)
(ii) Graphs
- Examples: Control Flow Graphs, Dependency Graphs, and Call Graphs
- Even though all of them contain nodes and edges, they differ in:
(i) Structure of the graph
(ii) Level of abstraction
(iii) Relationship between the graph and underlying code
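As an illustration of how two graphical IRs for the same code can differ in structure, the sketch below builds the expression (x * y) + (x * y) first as a tree and then as a DAG that shares identical subexpressions. The class names Node and DagBuilder and the helper count_nodes are hypothetical names introduced only for this example.

```python
class Node:
    """One node of a graphical IR: an operator or a leaf name."""

    def __init__(self, op, left=None, right=None):
        self.op = op
        self.left = left
        self.right = right


def count_nodes(node, seen=None):
    """Count the distinct node objects reachable from `node`."""
    if seen is None:
        seen = set()
    if node is None or id(node) in seen:
        return 0
    seen.add(id(node))
    return 1 + count_nodes(node.left, seen) + count_nodes(node.right, seen)


class DagBuilder:
    """Return an existing node when the same (op, left, right) was built before."""

    def __init__(self):
        self.table = {}

    def make(self, op, left=None, right=None):
        key = (op, id(left), id(right))
        if key not in self.table:
            self.table[key] = Node(op, left, right)
        return self.table[key]


# Tree: every occurrence of x, y and x * y gets its own node.
tree = Node("+",
            Node("*", Node("x"), Node("y")),
            Node("*", Node("x"), Node("y")))

# DAG: repeated subexpressions are represented by one shared node.
b = DagBuilder()
dag = b.make("+",
             b.make("*", b.make("x"), b.make("y")),
             b.make("*", b.make("x"), b.make("y")))

print(count_nodes(tree))  # 7
print(count_nodes(dag))   # 4
```

The tree mirrors the shape of the source expression, while the DAG is smaller and makes the repeated computation of x * y explicit.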
2. Linear IR:
- It is pseudo-code for an abstract machine. This type of IR uses simple, compact data structures and is easier to rearrange. The level of abstraction is not uniform; it varies from one linear IR to another.
- Examples: Stack machine code, Three-address code
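The two linear forms named above can be compared on a small expression. The sketch below translates x + 5 * y, written as a nested tuple, into stack machine code and into three-address code. The tuple encoding, the function names, the mnemonics push, add and multiply, and the temporary names t1, t2 are assumptions made for this illustration.

```python
def to_stack_code(expr):
    """Emit stack machine code: operands are pushed, operators pop two values."""
    if not isinstance(expr, tuple):
        return [f"push {expr}"]
    op, left, right = expr
    mnemonic = {"+": "add", "*": "multiply"}[op]
    return to_stack_code(left) + to_stack_code(right) + [mnemonic]


def to_three_address(expr):
    """Emit three-address code: each operation names its result explicitly."""
    code = []
    counter = 0

    def walk(e):
        nonlocal counter
        if not isinstance(e, tuple):
            return str(e)
        op, left, right = e
        l = walk(left)
        r = walk(right)
        counter += 1
        name = f"t{counter}"          # fresh temporary for this result
        code.append(f"{name} <- {l} {op} {r}")
        return name

    walk(expr)
    return code


expr = ("+", "x", ("*", 5, "y"))      # x + 5 * y

print(to_stack_code(expr))
# ['push x', 'push 5', 'push y', 'multiply', 'add']
print(to_three_address(expr))
# ['t1 <- 5 * y', 't2 <- x + t1']
```

Stack machine code is compact because operands are implicit, whereas three-address code names every intermediate value, which makes it easier to rearrange and reuse results.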
3. Hybrid IR:
- As the name suggests, Hybrid IR is the combination of both Graphical IR and Linear IR.
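One common way to combine the two, shown here as a minimal sketch, is a control flow graph whose nodes are basic blocks holding linear code. The BasicBlock class, the block names, the instruction strings, and the helper reachable are hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    name: str
    instructions: list = field(default_factory=list)  # linear part: straight-line code
    successors: list = field(default_factory=list)    # graph part: names of next blocks


def build_example_cfg():
    """Build a small graph for: if x > 0 then y = x else y = 0 - x; return y."""
    blocks = [
        BasicBlock("entry", ["t1 <- x > 0", "branch t1, then, else"], ["then", "else"]),
        BasicBlock("then", ["y <- x"], ["exit"]),
        BasicBlock("else", ["y <- 0 - x"], ["exit"]),
        BasicBlock("exit", ["return y"], []),
    ]
    return {block.name: block for block in blocks}


def reachable(cfg, start):
    """Follow the graph edges to collect every block reachable from `start`."""
    seen = set()
    worklist = [start]
    while worklist:
        name = worklist.pop()
        if name in seen:
            continue
        seen.add(name)
        worklist.extend(cfg[name].successors)
    return seen


cfg = build_example_cfg()
print(sorted(reachable(cfg, "entry")))  # ['else', 'entry', 'exit', 'then']
```

The graph captures control flow between blocks, while the instruction list inside each block stays simple to scan and rearrange.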
Naming Discipline in IRs:
- While translating the source code to lower-level code, the compiler needs to choose the names for a wide variety of distinct values.
- Example: Consider the expression: x+5*y
- The sequence of operations followed to evaluate the expression is:
(i) t1 <- y
(ii) t2 <- 5*t1
(iii) t3 <- x
(iv) t4 <- t3+t2
- Here the names used are t1, t2, t3, and t4.
- We can reduce the number of names by replacing the occurrences of t2 and t4 with t1.
- This cuts the number of names in half, from four to two (t1 and t3).
- The naming scheme also has an impact on compile time because it determines the sizes of many compile-time data structures.
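The effect of the two naming schemes can be checked with a short sketch. It assumes the four-step evaluation of x+5*y that uses t1 through t4 (copy y, multiply by 5, copy x, add) and then replaces t2 and t4 with t1. The tuple layout (destination, operator, operand, operand) and the helper names rename and names_used are assumptions for illustration.

```python
def rename(code, mapping):
    """Apply a name substitution to every field of every instruction."""
    return [tuple(mapping.get(item, item) for item in instr) for instr in code]


def names_used(code):
    """Collect the distinct destination names that the code defines."""
    return {instr[0] for instr in code}


# Scheme 1: a fresh name for every value.
fresh = [
    ("t1", "copy", "y", None),   # t1 <- y
    ("t2", "*", "5", "t1"),      # t2 <- 5*t1
    ("t3", "copy", "x", None),   # t3 <- x
    ("t4", "+", "t3", "t2"),     # t4 <- t3+t2
]

# Scheme 2: replace the occurrences of t2 and t4 with t1.
reused = rename(fresh, {"t2": "t1", "t4": "t1"})

print(sorted(names_used(fresh)))   # ['t1', 't2', 't3', 't4']
print(sorted(names_used(reused)))  # ['t1', 't3']
```

Fewer names mean smaller compile-time tables, but reusing t1 also overwrites earlier values, so a later pass can no longer refer to the intermediate results 5*y or y by name.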