
# Symbolic Analysis in Compiler Design

Symbolic analysis represents the values computed by a program as symbolic expressions rather than concrete numbers. Ordinary execution computes numeric values but discards the information about how they were derived. Symbolic analysis keeps the algebraic form of each computation, which exposes the relationships between different computations. This information supports optimizations such as constant propagation, strength reduction, and the elimination of redundant computations. It is also useful for parallelization, for region-based analysis, and for understanding the program.

Example:

## C++

```cpp
#include <iostream>
using namespace std;

int main()
{
    int a, b, c;
    cin >> a;
    b = a + 1;
    c = a - 1;
    if (c > a)
        c = c + 1; // never executed: c is always a - 1
    return 0;
}
```

In the above code, symbolic analysis shows that c always equals a - 1, so the condition "if (c > a)" is never true and the line "c = c + 1" is never executed. This allows the optimizer to remove that block of code.
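The reasoning above can be sketched in code. The following is a minimal illustration, not how any particular compiler implements it: `SymExpr`, `Verdict`, `isGreater`, and `checkExampleBranch` are hypothetical names, each value is modeled as `coeff * a + constant` for the unknown input `a`, and integer overflow is ignored.

```cpp
#include <cassert>

// Hypothetical helper type: the value coeff * a + constant, where `a` is the
// unknown program input. Overflow is ignored in this sketch.
struct SymExpr {
    long coeff;
    long constant;
};

enum class Verdict { AlwaysTrue, AlwaysFalse, Unknown };

// Decide `lhs > rhs` for every possible value of `a`.
// When both sides have the same coefficient, `a` cancels out and only the
// constants matter; otherwise the answer depends on `a`.
Verdict isGreater(const SymExpr& lhs, const SymExpr& rhs) {
    if (lhs.coeff != rhs.coeff) return Verdict::Unknown;
    return lhs.constant > rhs.constant ? Verdict::AlwaysTrue
                                       : Verdict::AlwaysFalse;
}

// Usage: the condition `c > a` with c = a - 1 can never hold.
void checkExampleBranch() {
    SymExpr a{1, 0};   // a
    SymExpr c{1, -1};  // a - 1
    assert(isGreater(c, a) == Verdict::AlwaysFalse);
}
```

Because the comparison is made on the symbolic forms, no concrete value of `a` is ever needed to conclude that the branch is dead.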

### 1. Affine Expressions:

An affine function is a linear function plus a constant term, of the form c0 + c1*v1 + ... + cn*vn. In symbolic analysis, we try to express variables as affine expressions of reference variables whenever possible. Affine expressions appear most often in array indexing, so knowing them helps the compiler decide how to optimize and parallelize array accesses.
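One possible way to represent such an expression is sketched below. The type `AffineExpr` and the helpers `add`, `scale`, and `evaluate` are hypothetical names chosen for this illustration. The sketch only shows that sums and constant multiples of affine expressions stay affine.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical representation of c0 + c1*v1 + ... + cn*vn.
struct AffineExpr {
    long constant = 0;
    std::map<std::string, long> coeffs;  // reference variable -> coefficient
};

// The sum of two affine expressions is again affine.
AffineExpr add(const AffineExpr& x, const AffineExpr& y) {
    AffineExpr result = x;
    result.constant += y.constant;
    for (const auto& term : y.coeffs) result.coeffs[term.first] += term.second;
    return result;
}

// Scaling by a constant keeps the expression affine.
AffineExpr scale(const AffineExpr& x, long factor) {
    AffineExpr result;
    result.constant = x.constant * factor;
    for (const auto& term : x.coeffs)
        result.coeffs[term.first] = term.second * factor;
    return result;
}

// Evaluate the expression for concrete values of the reference variables.
long evaluate(const AffineExpr& x, const std::map<std::string, long>& values) {
    long total = x.constant;
    for (const auto& term : x.coeffs) {
        auto it = values.find(term.first);
        assert(it != values.end() && "every reference variable needs a value");
        total += term.second * it->second;
    }
    return total;
}
```

Multiplying two such expressions is deliberately not provided, since the product of two affine expressions is in general no longer affine.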

An affine expression can also be written in terms of the number of iterations of a loop. A variable whose value has this form is called an induction variable.

## C++

```cpp
#include <iostream>
using namespace std;

int main()
{
    int a[101]; // indices 10, 20, ..., 100 are written below
    for (int induced_loop = 1; induced_loop <= 10;
         induced_loop++) {
        int induced_var = induced_loop * 10;
        a[induced_var] = 0;
    }
    return 0;
}
```

induced_loop takes the values 1, 2, 3, …, 10 and induced_var takes the values 10, 20, 30, …, 100. Each is an affine function of the iteration count, so both induced_loop and induced_var are induction variables of this loop.

The above program can be optimized using strength reduction, which replaces the multiplication in each iteration with an addition, a less costly operation.

Optimized Code:

## C++

```cpp
#include <iostream>
using namespace std;

int main()
{
    int a[101]; // indices 10, 20, ..., 100 are written below
    int induced_var = 0;
    for (int induced_loop = 1; induced_loop <= 10;
         induced_loop++) {
        induced_var += 10; // replaces induced_loop * 10
        a[induced_var] = 0;
    }
    return 0;
}
```
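As a quick sanity check, the two loop forms can be compared directly. The sketch below uses two hypothetical helper functions, `indicesByMultiplication` and `indicesByAddition`, that collect the array indices each version would touch, so the sequences can be compared.

```cpp
#include <cassert>
#include <vector>

// Indices produced by the original loop: one multiplication per iteration.
std::vector<int> indicesByMultiplication(int iterations) {
    assert(iterations >= 0);
    std::vector<int> indices;
    for (int induced_loop = 1; induced_loop <= iterations; induced_loop++)
        indices.push_back(induced_loop * 10);
    return indices;
}

// Indices produced after strength reduction: one addition per iteration.
std::vector<int> indicesByAddition(int iterations) {
    assert(iterations >= 0);
    std::vector<int> indices;
    int induced_var = 0;
    for (int induced_loop = 1; induced_loop <= iterations; induced_loop++) {
        induced_var += 10;
        indices.push_back(induced_var);
    }
    return indices;
}
```

Calling both functions with 10 iterations yields the same sequence 10, 20, …, 100, which is the property the optimizer relies on when it rewrites the loop.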

Sometimes the value held by a variable after a function call cannot be expressed as an affine function. Symbolic analysis can still determine other properties of that variable, such as how it compares with another variable, as shown in the example below.

## C++

```cpp
#include <iostream>
using namespace std;

int sum() { return 10; }

int main()
{
    int a = sum();
    int b = a + 10;
    int c = a + 11;
    return 0;
}
```

Using symbolic analysis we can state that c > b, because c = a + 11 = b + 1, without needing to know the value returned by sum().

### 2. Data-Flow Problem:

Symbolic analysis can be formulated as a data-flow problem. The analysis tracks which value each variable holds at each program point and expresses those values in terms of loop iteration counts. It uses symbolic maps: a symbolic map is a function that maps each variable in the program to a value. Consider the code below:

## C++

```cpp
#include <iostream>
using namespace std;

int main()
{
    int gfg = 0; // start of region 1
    for (int outer = 100; outer <= 200;
         outer++) { // start of region 2
        gfg++;
        int temp_outer = gfg * 10;
        int var = 0;
        for (int inner = 10; inner <= 20;
             inner++) { // start of region 3
            int temp_inner = temp_outer + var;
            var++;
        } // end of region 3
    } // end of region 2
} // end of region 1
```

Data-flow analysis divides the program into nested regions: here region 1 is the whole function, region 2 is the body of the outer loop, and region 3 is the body of the inner loop. For each region, a symbolic map assigns every variable a value, which is reduced to an affine expression where possible. Loop iteration counts serve as reference variables: let i count the iterations of the outer loop and j those of the inner loop, both starting at 1. Then gfg = i and temp_outer = 10 * i in region 2, and var = j - 1 at the start of each iteration of region 3. We also try to keep the variables of each block exclusive to it. The variable temp_outer is used in region 3 but belongs to region 2, so once its symbolic value is understood, its use can be replaced by 10 * i, which gives temp_inner = 10 * i + j - 1. Hence the code can be reduced to:

## C++

```cpp
#include <iostream>
using namespace std;

int main()
{
    int gfg = 0; // start of region 1
    // i counts the 101 iterations of the outer loop (outer = 100..200)
    for (int i = 1; i <= 101; i++) { // start of region 2
        gfg = i;
        int temp_outer = 10 * i;
        int var = 0;
        // j counts the 11 iterations of the inner loop (inner = 10..20)
        for (int j = 1; j <= 11; j++) { // start of region 3
            int temp_inner = 10 * i + j - 1;
            var = j;
        } // end of region 3
    } // end of region 2
} // end of region 1
```
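The closed forms used in the reduced code can be checked against the first listing. The sketch below is only a test harness, not part of the analysis itself: the hypothetical function `checkSymbolicMap` runs the original loops, adds explicit iteration counters `i` and `j` (an assumption of this illustration), and asserts the affine expressions in every inner iteration.

```cpp
#include <cassert>

// Runs the original nested loops and checks the symbolic map
//   gfg = i, temp_outer = 10*i, temp_inner = 10*i + j - 1, var = j (after var++)
// in every inner iteration. Returns the number of inner iterations checked.
int checkSymbolicMap() {
    int checked = 0;
    int gfg = 0;
    int i = 0;  // iteration count of the outer loop
    for (int outer = 100; outer <= 200; outer++) {
        i++;
        gfg++;
        int temp_outer = gfg * 10;
        int var = 0;
        int j = 0;  // iteration count of the inner loop
        for (int inner = 10; inner <= 20; inner++) {
            j++;
            int temp_inner = temp_outer + var;
            var++;
            assert(gfg == i);
            assert(temp_outer == 10 * i);
            assert(temp_inner == 10 * i + j - 1);
            assert(var == j);
            checked++;
        }
    }
    return checked;
}
```

The outer loop runs 101 times and the inner loop 11 times, so the function checks 1111 inner iterations in total.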

The data-flow problem is solved with transfer functions: the transfer function of a block takes the symbolic map at the block's entry and produces the symbolic map at its exit.

### 3. Region-Based Symbolic Analysis:

Region-based analysis has two parts: a bottom-up pass and a top-down pass. The bottom-up pass summarizes each region by a transfer function, which takes the symbolic map at the region's entry and produces the symbolic map at its exit. The top-down pass then passes the symbolic maps at region entries down to the nested regions, such as the inner loops of the program.
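The two passes can be illustrated with a heavily simplified sketch. It assumes a single variable and transfer functions of the form x -> scale * x + offset. The names `Transfer`, `applyTransfer`, `compose`, and `summarizeLoop` are hypothetical, and a real analysis would derive a closed form for the loop instead of composing the body repeatedly.

```cpp
#include <cassert>

// Hypothetical single-variable transfer function: maps the value x of one
// variable at region entry to scale * x + offset at region exit.
struct Transfer {
    long scale;
    long offset;
};

// Top-down step: apply a region summary to the value known at region entry.
long applyTransfer(const Transfer& f, long x) {
    return f.scale * x + f.offset;
}

// Effect of running `first` and then `second`:
// second(first(x)) = second.scale * (first.scale * x + first.offset)
//                    + second.offset
Transfer compose(const Transfer& first, const Transfer& second) {
    return Transfer{second.scale * first.scale,
                    second.scale * first.offset + second.offset};
}

// Bottom-up step: summarize a loop by composing the transfer function of its
// body with itself once per iteration.
Transfer summarizeLoop(const Transfer& body, int iterations) {
    assert(iterations >= 0);
    Transfer summary{1, 0};  // identity: zero iterations leave x unchanged
    for (int k = 0; k < iterations; k++) summary = compose(summary, body);
    return summary;
}
```

For example, a loop body that increments a variable has the transfer function `Transfer{1, 1}`. Summarizing 101 iterations gives `Transfer{1, 101}`, and applying that summary to an entry value of 0 yields 101.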
