Interprocedural Optimization using Inline Substitution
The different scopes of code optimization include Local Optimization, Regional Optimization, Global Optimization, and Interprocedural Optimization.
Local Optimization works within a single Basic Block. Regional Optimization works within a region larger than one block, such as an Extended Basic Block. Global Optimization works on a whole procedure. Interprocedural Optimization works on a whole program, which spans multiple procedures. The term interprocedural means across different procedures, whereas the term intraprocedural means within a single procedure, so Global Optimization is intraprocedural optimization. A program is generally divided into several procedures to make it easier to write, analyze, and debug. However, dividing a program into multiple procedures has both advantages and disadvantages for the compiler.
The advantages and disadvantages of dividing a program into multiple procedures are as follows:
- Advantage: It limits the amount of code that the compiler considers at any particular time.
- Advantage: This keeps the compile-time data structures small and limits the cost of various compile-time algorithms.
- Disadvantage: While compiling the caller, the compiler cannot see what happens inside a call.
- Disadvantage: Every call requires jumps between the caller and the callee at run time.
Each call executes a precall and a postreturn sequence in the caller and a prolog and an epilog sequence in the callee. These sequences, together with the transitions between them, take a considerable amount of time at run time and constitute the overhead of a call. Separate procedures therefore limit the compiler's compile-time knowledge and add run-time actions. Intraprocedural optimization cannot address these issues because it is limited to a single procedure. To reduce the inefficiencies introduced by separate procedures, the compiler may analyze and transform multiple procedures together. This is what Interprocedural Optimization does.
Interprocedural Optimization can be done by using two techniques:
- Inline Substitution
- Procedure Placement
To optimize the whole program, the compiler must have access to all of the code that is being analyzed and transformed. Whole-program optimization therefore also has implications for the structure of the compiler.
A procedure call involves several run-time actions that are pure overhead, such as transferring control from the caller to the callee and back, and returning values from the callee to the caller. The compiler can improve efficiency by replacing the call site with a copy of the callee’s body. This is known as inline substitution. It allows the compiler to avoid most of the procedure linkage code. It also alters the program’s call graph.
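As a minimal sketch of the idea, the Python functions below show one caller before and after inline substitution is applied by hand. The names scale, sum_scaled, and sum_scaled_inlined are hypothetical and chosen only for illustration. A real compiler performs this rewrite on its intermediate representation rather than on source text.

```python
def scale(x, factor):
    # Callee: a small leaf procedure.
    return x * factor


def sum_scaled(values, factor):
    # Caller before inlining: every iteration pays for a call to scale().
    total = 0
    for v in values:
        total += scale(v, factor)  # call site
    return total


def sum_scaled_inlined(values, factor):
    # Caller after inlining: the call site is replaced by a copy of the
    # callee's body, with the formal x bound to the actual v.
    total = 0
    for v in values:
        total += v * factor
    return total


print(sum_scaled([1, 2, 3], 4))
print(sum_scaled_inlined([1, 2, 3], 4))
```

Both versions compute the same result. The inlined version no longer has an edge from sum_scaled to scale in the call graph.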
Inline substitution has two subproblems:
- Decision Procedure
- Actual Transformation
The decision procedure involves choosing the call sites that are to be inlined. This is a complex task that requires a lot of analysis. The compiler should consider the characteristics of the caller, the callee, and the call site. Inlining may increase the code size and the namespace size, and it may also increase the demand for registers. The decision made at one call site affects the decisions made at other call sites.
Criteria for Decision Procedure:
The criteria that are considered for the Decision Procedure are:
- Callee size: If the callee is smaller than the procedure linkage code (pre-call, post-return, prolog, and epilog), then inlining the callee should reduce code size and also reduce the number of operations executed.
- Caller size: To counterbalance the increase in compile time and the decrease in optimization effectiveness that large procedures cause, the compiler may limit the overall size of a procedure.
- Dynamic call count: Inlining a frequently executed call site provides more benefits than inlining a less frequently executed call site.
- Static call count: This is the number of distinct sites that call a procedure. Compilers often track this count because it indicates how much code growth inlining the procedure at all of its call sites would cause.
- Parameter count: The number of parameters affects the cost of the procedure linkage.
- Constant-valued actual parameters: When constants are passed as actual parameters, inlining exposes them inside the callee's body and enables constant folding.
- Calls in the procedure: Tracking the number of calls in a procedure helps in finding the leaves of the call graph, which are often good candidates for inlining.
- Loop nesting depth: Call sites inside the loops execute more frequently than the call sites outside the loop.
- Fraction of execution time: This is computed from profile data. It prevents the compiler from inlining procedures that cannot have a major effect on performance.
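The criteria above can be combined into a decision heuristic. The following is only a sketch under stated assumptions: the CallSite record, the should_inline function, both threshold values, and the order of the tests are hypothetical and are not taken from any particular compiler. The rule for procedures with a single call site additionally assumes that the original copy of the callee can be discarded after inlining.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    callee_size: int        # operations in the callee's body
    caller_size: int        # operations in the caller
    linkage_cost: int       # operations in precall, postreturn, prolog, epilog
    static_call_count: int  # distinct sites that call the callee
    constant_actuals: int   # actual parameters that are constants
    callee_is_leaf: bool    # callee contains no calls
    loop_depth: int         # loop nesting depth of the call site
    time_fraction: float    # share of run time spent in the callee (profile)


# Hypothetical thresholds, chosen only for this illustration.
MAX_CALLER_SIZE = 1000
MIN_TIME_FRACTION = 0.01


def should_inline(site):
    # Callee smaller than the linkage code: inlining shrinks code and work.
    if site.callee_size < site.linkage_cost:
        return True
    # Keep the caller from growing past the size budget.
    if site.caller_size + site.callee_size > MAX_CALLER_SIZE:
        return False
    # Assumption: with one call site, the original callee can be dropped,
    # so inlining causes no code growth.
    if site.static_call_count == 1:
        return True
    # Skip procedures that cannot matter much for performance.
    if site.time_fraction < MIN_TIME_FRACTION:
        return False
    # Otherwise favor leaf callees that sit in loops or receive constants.
    return site.callee_is_leaf and (
        site.loop_depth > 0 or site.constant_actuals > 0
    )


hot_leaf = CallSite(callee_size=50, caller_size=100, linkage_cost=10,
                    static_call_count=3, constant_actuals=0,
                    callee_is_leaf=True, loop_depth=2, time_fraction=0.2)
print(should_inline(hot_leaf))
```

Because a decision at one call site changes the caller's size, a fuller implementation would recompute these measurements after each inlining step instead of evaluating every site once.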
The actual transformation involves rewriting the call site with the body of the callee. The compiler also modifies the copied body to model the effects of parameter binding, for example by replacing references to the formal parameters with the actual parameters.
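The rewrite and the parameter binding can be sketched on a toy intermediate representation. Everything in the example is an assumption made for illustration: the tuple-based expression format, the PROCS table, and the helper names substitute, inline, and fold. The sketch also assumes that expressions have no side effects and that procedures are not recursive, so replacing a formal parameter with the actual expression is safe and the expansion terminates.

```python
# Expressions: ("const", n), ("var", name), ("add", e1, e2),
# ("mul", e1, e2), ("call", procedure_name, [argument expressions]).
PROCS = {
    # name: (formal parameters, body expression)
    "area": (["w", "h"], ("mul", ("var", "w"), ("var", "h"))),
}

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def substitute(expr, binding):
    # Model parameter binding: replace each formal with its actual.
    # Only used on bodies that no longer contain call nodes.
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "var":
        return binding.get(expr[1], expr)
    return (kind, substitute(expr[1], binding), substitute(expr[2], binding))


def inline(expr):
    # Replace every call node with a copy of the callee's body.
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    if kind == "call":
        formals, body = PROCS[expr[1]]
        actuals = [inline(arg) for arg in expr[2]]
        return substitute(inline(body), dict(zip(formals, actuals)))
    return (kind, inline(expr[1]), inline(expr[2]))


def fold(expr):
    # Constant folding, applied after inline() has removed all calls.
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    left, right = fold(expr[1]), fold(expr[2])
    if left[0] == "const" and right[0] == "const":
        return ("const", OPS[kind](left[1], right[1]))
    return (kind, left, right)


# area(4, n) + area(2, 5)
site = ("add",
        ("call", "area", [("const", 4), ("var", "n")]),
        ("call", "area", [("const", 2), ("const", 5)]))
inlined = inline(site)
print(inlined)
print(fold(inlined))
```

After inlining, the second call has only constant actual parameters, so folding reduces it to the constant 10. That is the benefit the constant-valued actual parameters criterion refers to.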