Symbol Table in Compiler

Last Updated : 31 Mar, 2023

Definition

The symbol table is defined as the set of Name and Value pairs.

The symbol table is an important data structure created and maintained by the compiler to keep track of the semantics of names. It stores scope and binding information for names, along with information about the entities they denote, such as variables, functions, classes, and objects.

It is built during the lexical and syntax analysis phases.
The information is collected by the analysis phases of the compiler and is used by the synthesis phases of the compiler to generate code.
It is used by the compiler to achieve compile-time efficiency.
It is used by the various phases of the compiler as follows:
1. Lexical Analysis: Creates new entries in the table, for example entries for tokens.
2. Syntax Analysis: Adds information such as attribute type, scope, dimension, line of reference, and use to the table.
3. Semantic Analysis: Uses the information in the table to verify that expressions and assignments are semantically correct (type checking) and updates the table accordingly.
4. Intermediate Code Generation: Refers to the symbol table to find out how much run-time storage is allocated and of what type; the table also helps in adding temporary variable information.
5. Code Optimization: Uses information present in the symbol table for machine-dependent optimization.
6. Target Code Generation: Generates code using the address information of the identifiers present in the table.
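
The semantic analysis step can be illustrated with a minimal sketch. It assumes the earlier phases have already filled a table that maps each declared name to its type; the function name `check_assignment`, the table contents, and the error wording are all hypothetical, chosen only for illustration.

```python
def check_assignment(symbols, target, source):
    """Return a list of semantic errors for the statement `target = source`."""
    errors = []
    # Both names must have been declared, i.e. be present in the table.
    for name in (target, source):
        if name not in symbols:
            errors.append(f"use of undeclared name: {name}")
    # Only compare types when both names were found.
    if not errors and symbols[target] != symbols[source]:
        errors.append(
            f"type mismatch: cannot assign {symbols[source]} to {symbols[target]}"
        )
    return errors


# Hypothetical table contents: name -> declared type.
symbols = {"count": "int", "total": "int", "ratio": "float"}

print(check_assignment(symbols, "total", "count"))
print(check_assignment(symbols, "count", "ratio"))
```

A real type checker would also handle implicit conversions and full expressions; this sketch only shows how the table is consulted.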

Symbol Table entries – Each entry in the symbol table is associated with attributes that support the compiler in different phases.

Use of Symbol Table

Symbol tables are typically used in compilers. A compiler is a program that scans an application program (for instance, your C program) and produces machine code.

During this scan, the compiler stores the identifiers of the application program in the symbol table. Each identifier is stored as a record of name, value, address, and type.

Here, the name is the name of the identifier, the value is the value stored in the identifier, the address is the memory location of the identifier, and the type is its data type.

Thus the compiler can keep track of all identifiers along with all the necessary information.
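
As a rough sketch, the record described above could be modeled as follows. The class name `SymbolEntry`, the helper `record_identifier`, and the sample identifiers are assumptions made for illustration; they are not taken from any particular compiler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolEntry:
    """One identifier record with the four fields named in the text."""
    name: str                       # identifier as written in the source
    type: str                       # declared data type, e.g. "int"
    address: Optional[int] = None   # memory location, filled in once known
    value: Optional[object] = None  # value stored in the identifier, if known


def record_identifier(table, name, type_, address=None, value=None):
    """Store an identifier in the table, keyed by its name."""
    table[name] = SymbolEntry(name, type_, address, value)
    return table[name]


table = {}
record_identifier(table, "count", "int", address=0, value=10)
record_identifier(table, "ratio", "float", address=4)
print(table["count"])
```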

Items stored in Symbol table:

Variable names and constants
Procedure and function names
Literal constants and strings
Compiler generated temporaries
Labels in source languages

Information used by the compiler from Symbol table:

Data type and name
Declaring procedures
Offset in storage
For a structure or record, a pointer to the structure table
For parameters, whether they are passed by value or by reference
Number and types of the arguments passed to a function
Base Address
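
To show how the offset and base address fit together, here is a minimal sketch that lays declared names out one after another and computes an address as base address plus offset. The type sizes, the function names, and the absence of alignment padding are simplifying assumptions; real sizes and layout rules depend on the target machine.

```python
# Assumed type sizes in bytes; real sizes depend on the target machine.
TYPE_SIZES = {"char": 1, "int": 4, "float": 4, "double": 8}


def assign_offsets(declarations):
    """Map each name to (type, offset), placing names in declaration order."""
    layout = {}
    offset = 0
    for name, type_ in declarations:
        layout[name] = (type_, offset)
        offset += TYPE_SIZES[type_]  # no alignment padding in this sketch
    return layout, offset


def absolute_address(base_address, layout, name):
    """Address of a name = base address of its storage area + its offset."""
    return base_address + layout[name][1]


layout, total_size = assign_offsets([("i", "int"), ("c", "char"), ("d", "double")])
print(layout, total_size)
print(absolute_address(1000, layout, "d"))
```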

Operations on Symbol Table – The following basic operations can be performed on a symbol table:

1. Insertion of an item in the symbol table.

2. Deletion of any item from the symbol table.

3. Searching for a desired item in the symbol table.
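
The three operations can be sketched as a small class. This version is backed by a Python dictionary purely for brevity; the class and method names are hypothetical, and the data structures a compiler might actually use are discussed in the next section.

```python
class SymbolTable:
    """Minimal symbol table exposing insert, delete, and lookup."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, info):
        # Refuse to overwrite an existing declaration.
        if name in self._entries:
            raise KeyError(f"multiply defined name: {name}")
        self._entries[name] = info

    def delete(self, name):
        # Remove the entry and return its info, or None if it was absent.
        return self._entries.pop(name, None)

    def lookup(self, name):
        # Return the stored info, or None if the name is not in the table.
        return self._entries.get(name)


st = SymbolTable()
st.insert("x", {"type": "int"})
print(st.lookup("x"))
st.delete("x")
print(st.lookup("x"))
```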

Implementation of Symbol table –
The following data structures are commonly used for implementing a symbol table:

List –

We use a single array, or equivalently several arrays, to store names and their associated information. New names are added to the list in the order in which they are encountered. The position of the end of the array is marked by the pointer available, which points to where the next symbol-table entry will go. The search for a name proceeds backwards from the end of the array to the beginning. When the name is located, the associated information can be found in the words that follow it.

id1     info1
id2     info2
……..
id_n    info_n

In this method, an array is used to store names and associated information.
A pointer “available” is maintained at the end of all stored records, and new names are added in the order in which they arrive.
To search for a name, we scan the records between the start of the list and the available pointer; if the name is not found, we report the error “use of the undeclared name”.
While inserting a new name, we must ensure that it is not already present; otherwise an error occurs, i.e. “Multiple defined names”.
Appending an entry is fast, O(1), but lookup is slow for large tables, O(n) on average; an insertion that first checks for duplicates therefore also costs O(n).
The advantage is that it takes a minimum amount of space.
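
A minimal sketch of the list scheme, assuming two parallel arrays for names and information and a fixed capacity. The class name, the capacity, and the exception types are illustrative choices; the error messages follow the ones quoted above.

```python
class ListSymbolTable:
    """Array-based symbol table with an `available` pointer."""

    def __init__(self, capacity=64):
        self.names = [None] * capacity
        self.infos = [None] * capacity
        self.available = 0  # index where the next entry will go

    def _find(self, name):
        # Search backwards from the most recent entry to the first one.
        for i in range(self.available - 1, -1, -1):
            if self.names[i] == name:
                return i
        return -1

    def insert(self, name, info):
        if self._find(name) != -1:  # duplicate check costs O(n)
            raise ValueError(f"multiply defined name: {name}")
        if self.available == len(self.names):
            raise OverflowError("symbol table is full")
        self.names[self.available] = name
        self.infos[self.available] = info
        self.available += 1  # the append itself is O(1)

    def lookup(self, name):
        i = self._find(name)
        if i == -1:
            raise LookupError(f"use of undeclared name: {name}")
        return self.infos[i]


lst = ListSymbolTable()
lst.insert("id1", "info1")
lst.insert("id2", "info2")
print(lst.lookup("id1"), lst.available)
```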

Linked List –
- This implementation uses a linked list. A link field is added to each record.
- Names are searched in the order given by the link fields.
- A pointer “First” is maintained to point to the first record of the symbol table.
- Insertion is fast O(1), but lookup is slow for large tables – O(n) on average
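
The linked-list scheme might look like the following sketch. Inserting at the head of the list is one way to get O(1) insertion and is an assumption here, as are the class names; no duplicate check is performed in this version.

```python
class Record:
    """A symbol-table record with a link field to the next record."""

    def __init__(self, name, info, link=None):
        self.name = name
        self.info = info
        self.link = link


class LinkedSymbolTable:
    def __init__(self):
        self.first = None  # pointer to the first record

    def insert(self, name, info):
        # New record becomes the first one: O(1), no traversal needed.
        self.first = Record(name, info, self.first)

    def lookup(self, name):
        # Follow the link fields until the name is found: O(n).
        record = self.first
        while record is not None:
            if record.name == name:
                return record.info
            record = record.link
        return None


linked = LinkedSymbolTable()
linked.insert("a", "int")
linked.insert("b", "float")
print(linked.lookup("a"), linked.lookup("missing"))
```
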
Hash Table –
- In the hashing scheme, two tables are maintained: a hash table and a symbol table. This is the most commonly used method of implementing symbol tables.
- A hash table is an array with an index range of 0 to table size – 1. Its entries are pointers to the names in the symbol table.
- To search for a name, we use a hash function that yields an integer between 0 and table size – 1.
- Insertion and lookup can be made very fast, O(1) on average.
- The advantage is that quick search is possible; the disadvantage is that hashing is complicated to implement.
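
The two-table arrangement can be sketched as below. Collisions are resolved by chaining records that hash to the same slot, which is one common choice and an assumption here; the hash function, the default table size, and all names are likewise illustrative.

```python
class HashedSymbolTable:
    def __init__(self, table_size=211):
        self.table_size = table_size
        # Hash table: each slot holds the index of the first record in its chain.
        self.buckets = [None] * table_size
        # Symbol table: records stored as (name, info, index_of_next_in_chain).
        self.records = []

    def _hash(self, name):
        # Simple polynomial hash giving an integer in 0 .. table_size - 1.
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) % self.table_size
        return h

    def insert(self, name, info):
        slot = self._hash(name)
        # Link the new record in front of the slot's existing chain.
        self.records.append((name, info, self.buckets[slot]))
        self.buckets[slot] = len(self.records) - 1

    def lookup(self, name):
        index = self.buckets[self._hash(name)]
        while index is not None:
            rec_name, info, next_index = self.records[index]
            if rec_name == name:
                return info
            index = next_index
        return None


hashed = HashedSymbolTable()
hashed.insert("x", "int")
hashed.insert("y", "float")
print(hashed.lookup("x"), hashed.lookup("z"))
```
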
Binary Search Tree –
- Another approach to implementing a symbol table is to use a binary search tree, i.e. we add two link fields, left and right, to each record.
- Each name is inserted as a descendant of the root node in a position that preserves the binary search tree property.
- Insertion and lookup are O(log₂ n) on average.
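
A minimal sketch of the binary search tree scheme, ordering names by ordinary string comparison. The tree is not rebalanced, so a sorted insertion order would degrade it to a list; the class names and the duplicate-name error are assumptions for illustration.

```python
class Node:
    """A record with two link fields, left and right."""

    def __init__(self, name, info):
        self.name = name
        self.info = info
        self.left = None
        self.right = None


class BSTSymbolTable:
    def __init__(self):
        self.root = None

    def insert(self, name, info):
        if self.root is None:
            self.root = Node(name, info)
            return
        node = self.root
        while True:
            if name == node.name:
                raise ValueError(f"multiply defined name: {name}")
            # Smaller names go left, larger names go right.
            side = "left" if name < node.name else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(name, info))
                return
            node = child

    def lookup(self, name):
        node = self.root
        while node is not None:
            if name == node.name:
                return node.info
            node = node.left if name < node.name else node.right
        return None


bst = BSTSymbolTable()
for ident in ["m", "c", "x", "a"]:
    bst.insert(ident, "int")
print(bst.lookup("a"), bst.lookup("q"))
```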

Advantages of Symbol Table

Increased efficiency: Symbol tables give quick and simple access to crucial data such as variable and function names, data types, and memory locations.
Better code structure: Symbol tables can be used to organize and simplify code, making it easier to comprehend and to discover and correct problems.
Faster code execution: By offering quick access to information such as memory addresses, symbol tables can be used to optimize code execution by lowering the number of memory accesses required during execution.
Portability: Symbol tables offer a standardized method of storing and retrieving data, which can make it simpler to migrate code between different systems or programming languages.
Improved code reuse: By offering a standardized method of storing and accessing information, symbol tables can increase the reuse of code across multiple projects.
Enhanced debugging: Symbol tables can facilitate easy access to and examination of a program’s state during execution, making it simpler to identify and correct mistakes.

Disadvantages of Symbol Table

Increased memory consumption: Systems with low memory resources may suffer from symbol tables’ high memory requirements.
Increased processing time: The creation and processing of symbol tables can take a long time, which can be problematic in systems with constrained processing power.
Complexity: Developers who are not familiar with compiler design may find symbol tables difficult to construct and maintain.
Limited scalability: Symbol tables may not be appropriate for large-scale projects or applications that require the management of enormous amounts of data.
Upkeep: Maintaining and updating symbol tables on a regular basis can be time- and resource-consuming.
Limited functionality: It’s possible that symbol tables don’t offer all the features a developer needs, and therefore more tools or libraries will be needed to round out their capabilities.

Applications of Symbol Table

Resolution of variable and function names: Symbol tables are used to identify the data types and memory locations of variables and functions as well as to resolve their names.
Resolution of scope issues: Symbol tables are used to resolve naming conflicts and to determine the scope of variables and functions.
Code optimization: Symbol tables, which offer quick access to information such as memory locations, are used to optimize code execution.
Code generation: By supplying details such as memory locations and data types, symbol tables are used to create machine code from source code.
Error checking and code debugging: By supplying details about the status of a program during execution, symbol tables are used to check for faults and debug code.
Code organization and documentation: By supplying details about a program’s structure, symbol tables can be used to organize code and make it simpler to understand.
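
Scope resolution, mentioned in the list above, is often handled by keeping one table per open scope. The sketch below assumes a stack of dictionaries, with the innermost scope searched first; the class and method names are hypothetical.

```python
class ScopedSymbolTable:
    def __init__(self):
        self.scopes = [{}]  # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, info):
        current = self.scopes[-1]
        # A name may be redeclared in an inner scope, but not twice in one scope.
        if name in current:
            raise ValueError(f"multiply defined name: {name}")
        current[name] = info

    def resolve(self, name):
        # Innermost scope wins, so an inner declaration shadows an outer one.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise LookupError(f"use of undeclared name: {name}")


scoped = ScopedSymbolTable()
scoped.declare("x", "int")
scoped.enter_scope()
scoped.declare("x", "float")
print(scoped.resolve("x"))  # inner declaration
scoped.exit_scope()
print(scoped.resolve("x"))  # outer declaration again
```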
