
Input Buffering in Compiler Design

Last Updated : 18 Apr, 2023

The lexical analyzer scans the input from left to right, one character at a time. It uses two pointers, the begin pointer (bp) and the forward pointer (fp), to keep track of the portion of the input that has been scanned.

Input buffering is an important concept in compiler design that refers to the way the compiler reads the source code. Reading the source one character at a time is slow, because each character can require a separate read request to the operating system. Input buffering lets the compiler read input in larger blocks, which reduces this overhead and improves performance.

  1. The basic idea behind input buffering is to read a block of input from the source code into a buffer, and then process that buffer before reading the next block. The size of the buffer can vary depending on the needs of the compiler and the system it runs on. For example, the buffer is often sized to match a disk block, such as 4096 bytes, so that a single read request fills the whole buffer.
  2. One of the main advantages of input buffering is that it reduces the number of system calls required to read the source code. Since each system call carries some overhead, fewer calls mean better performance. Input buffering can also simplify the design of the compiler, because the code that manages input is kept in one place.
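The effect of the block size on the number of read requests can be illustrated with a minimal Python sketch. The function name `count_reads` and the sample source text are assumptions made for illustration, and the in-memory `io.StringIO` stream only stands in for a real file, so the counts are read calls on the stream object and not actual system calls.

```python
import io

def count_reads(stream, block_size):
    """Read the stream to the end in chunks of block_size characters.

    Returns (number of read calls, total characters read).
    """
    calls = 0
    total = 0
    while True:
        chunk = stream.read(block_size)
        calls += 1
        if not chunk:          # an empty result signals the end of the input
            break
        total += len(chunk)
    return calls, total

# Hypothetical source text: 100 lines of 12 characters each.
source = "int a = 10;\n" * 100

print(count_reads(io.StringIO(source), 1))     # one character per call
print(count_reads(io.StringIO(source), 4096))  # one block per call
```

With a block size of 1 the loop issues one call per character plus a final call that detects the end of the input; with a block size of 4096 the same text is consumed in two calls.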

However, there are also some potential disadvantages to input buffering. For example, if the size of the buffer is too large, it may consume too much memory, leading to slower performance or even crashes. Additionally, if the buffer is not properly managed, it can lead to errors in the output of the compiler.

Overall, input buffering is an important technique in compiler design that can help improve performance and reduce overhead. However, it must be used carefully and appropriately to avoid potential problems.

Initially, both pointers point to the first character of the input string. The forward pointer moves ahead to search for the end of the lexeme. As soon as a blank space is encountered, it indicates the end of the lexeme. For example, if the input begins with “int” followed by a blank space, the lexeme “int” is identified as soon as fp encounters that blank space. When fp encounters white space, it ignores it and moves ahead. Then both the begin pointer (bp) and the forward pointer (fp) are set to the start of the next token.

Read in this way, every input character comes from secondary storage, and reading from secondary storage is costly. Hence a buffering technique is used: a block of data is first read into a buffer and then scanned by the lexical analyzer. Two methods are used in this context: the One Buffer Scheme and the Two Buffer Scheme. These are explained below.
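The movement of the two pointers can be sketched in Python as follows. The function name `split_lexemes` is an assumption for illustration, and the sketch follows the simplification used in this article, where white space alone marks the end of a lexeme; a real lexical analyzer would recognize lexemes by matching token patterns.

```python
def split_lexemes(text):
    """Split text into white-space-delimited lexemes using two indices.

    bp marks the beginning of the current lexeme, fp scans forward
    until it reaches white space or the end of the input.
    """
    lexemes = []
    n = len(text)
    bp = 0
    while bp < n:
        if text[bp].isspace():
            bp += 1            # ignore white space between lexemes
            continue
        fp = bp
        while fp < n and not text[fp].isspace():
            fp += 1            # move forward to the end of the lexeme
        lexemes.append(text[bp:fp])
        bp = fp                # both pointers continue from here
    return lexemes

print(split_lexemes("int a = 10 ;"))
```

For the sample input, the first pass stops fp at the blank after `int`, so the slice between bp and fp is the lexeme `int`.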

  1. One Buffer Scheme: In this scheme, only one buffer is used to store the input string. The problem with this scheme is that if a lexeme is very long, it crosses the buffer boundary. To scan the rest of the lexeme the buffer has to be refilled, which overwrites the first part of the lexeme.
  2. Two Buffer Scheme: To overcome the problem of the one buffer scheme, this method uses two buffers to store the input string. The first buffer and the second buffer are scanned alternately. When the end of the current buffer is reached, the other buffer is filled. The only problem with this method is that if a lexeme is longer than the buffer, it cannot be scanned completely. Initially both bp and fp point to the first character of the first buffer. Then fp moves to the right in search of the end of the lexeme. As soon as a blank character is recognized, the string between bp and fp is identified as the corresponding token. To identify the boundary of the first buffer, an end-of-buffer character (eof) is placed at the end of the first buffer. Similarly, the end of the second buffer is recognized by the end-of-buffer mark placed at its end. When fp encounters the first eof, the end of the first buffer is recognized and filling of the second buffer begins. In the same way, when the second eof is encountered, it indicates the end of the second buffer, and the first buffer is refilled. The two buffers are filled alternately in this way until the end of the input program is reached and the stream of tokens is identified. The eof character placed at the end of each buffer is called a sentinel, and it is used to identify the end of the buffer.
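The two buffer scheme with sentinels can be sketched in Python as shown here. The class name `TwoBufferReader`, the helper `lexemes`, the buffer size of 4, and the choice of the NUL character as the sentinel are all assumptions made for illustration. For brevity, the sketch collects the characters of each lexeme in a list instead of keeping a begin pointer into the buffers, and it again treats white space as the only lexeme delimiter.

```python
import io

EOF = "\0"  # assumed sentinel; it must not occur in the source text

class TwoBufferReader:
    """Reads a stream through two alternating buffers ended by sentinels."""

    def __init__(self, stream, size):
        self.stream = stream
        self.size = size
        self.halves = [[EOF], [EOF]]
        self.cur = 0   # index of the buffer being scanned
        self.fp = 0    # forward pointer inside the current buffer
        self._fill(0)

    def _fill(self, i):
        # Load up to `size` characters and place the sentinel after them.
        data = self.stream.read(self.size)
        self.halves[i] = list(data) + [EOF]

    def next_char(self):
        """Return the next character, or None at the end of the input."""
        ch = self.halves[self.cur][self.fp]
        if ch != EOF:              # common case: one comparison per character
            self.fp += 1
            return ch
        if self.fp == self.size:
            # Sentinel at the end of a full buffer: switch to the other one.
            self.cur = 1 - self.cur
            self._fill(self.cur)
            self.fp = 0
            return self.next_char()
        return None                # sentinel inside a buffer: end of input

def lexemes(reader):
    """Collect white-space-delimited lexemes from the reader."""
    out, current = [], []
    while True:
        ch = reader.next_char()
        if ch is None or ch.isspace():
            if current:
                out.append("".join(current))
                current = []
            if ch is None:
                return out
        else:
            current.append(ch)

reader = TwoBufferReader(io.StringIO("int count = 10;"), 4)
print(lexemes(reader))
```

With a buffer size of 4, the lexeme `count` begins in the second buffer and ends in the refilled first buffer, yet it is still returned whole because the switch between buffers happens inside `next_char`. The sentinel lets the common case test a single condition per character; the buffer-boundary check runs only when the sentinel is seen.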


Advantages of input buffering:

  1. It reduces the number of system calls required to read the source code, which improves performance.
  2. It can simplify the design of the compiler by reducing the amount of code required to manage input.

Disadvantages of input buffering:

  1. If the buffer is too large, it may consume too much memory, leading to slower performance or even crashes.
  2. If the buffer is not properly managed, it can lead to errors in the output of the compiler.

Overall, the advantages of input buffering generally outweigh the disadvantages when it is used appropriately, as it can improve performance and simplify the compiler design.
