# FOLLOW Set in Syntax Analysis

The FOLLOW set is a concept used in syntax analysis, chiefly when building LL(1) and SLR(1) parsing tables. It is the set of terminals that can appear immediately after a given non-terminal in some sentential form of the grammar.

The FOLLOW set of a non-terminal A is defined as the set of terminals that can appear immediately after A in some sentential form derived from the start symbol. If A appears at the end of the right-hand side of a production, or if everything after A in that right-hand side can derive the empty string, then the FOLLOW set of the production's left-hand side non-terminal is added to the FOLLOW set of A.

In SLR parsing, the FOLLOW set determines when to reduce by a production. If the parser has recognized the complete right-hand side of a production A -> α and the next input symbol is in FOLLOW(A), it may reduce by that production; otherwise the reduction is not attempted. In LL(1) parsing, FOLLOW(A) tells the parser when to apply a production of A that derives the empty string.
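
The reduce decision described above can be sketched as a membership test. This is a minimal illustration, not a full parser: the helper name `should_reduce` and the dictionary layout are assumptions, and the FOLLOW sets are hand-computed for the small grammar S -> Aa | Ac, A -> b.

```python
# FOLLOW sets for the grammar S -> Aa | Ac, A -> b (hand-computed).
FOLLOW = {"S": {"$"}, "A": {"a", "c"}}


def should_reduce(head, lookahead, follow=FOLLOW):
    """SLR-style check, applied once the whole right-hand side of a
    production `head -> ...` has been recognized: reduce only if the
    next input symbol is in FOLLOW(head)."""
    return lookahead in follow[head]


print(should_reduce("A", "a"))  # True
print(should_reduce("A", "$"))  # False
```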

To compute the FOLLOW sets of a grammar, start by placing the end-of-input marker $ in the FOLLOW set of the start symbol. Then, for each occurrence of a non-terminal B on the right-hand side of a production, add to FOLLOW(B) the terminals that can begin whatever follows B; if nothing follows B, or what follows can derive the empty string, also add the FOLLOW set of the left-hand side non-terminal to FOLLOW(B). Repeat this process until no new element can be added to any set.

The FOLLOW set is a fundamental concept in syntax analysis. Computing it is a required step in constructing LL(1) and SLR(1) parsing tables, which let the parser choose its next action by table lookup.

In this post, FOLLOW Set is discussed.

Follow(X) is defined as the set of terminals that can appear immediately to the right of Non-Terminal X in some sentential form.
Example:

```
S -> Aa | Ac
A -> b

    S                  S
   / \                / \
  A   a              A   c
  |                  |
  b                  b

Here, FOLLOW(A) = { a, c }
```

Rules to compute the FOLLOW set:

```
1) FOLLOW(S) = { $ }   // where S is the starting Non-Terminal

2) If A -> pBq is a production, where B is a Non-Terminal and p and q are
   strings of grammar symbols, then everything in FIRST(q) except Є is in FOLLOW(B).

3) If A -> pB is a production, then everything in FOLLOW(A) is in FOLLOW(B).

4) If A -> pBq is a production and FIRST(q) contains Є,
   then FOLLOW(B) contains { FIRST(q) – Є } U FOLLOW(A)
```
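
The four rules above can be applied repeatedly until no set changes. The following is a minimal sketch of that fixed-point computation, not a definitive implementation. The grammar encoding (a dict from non-terminal to a list of right-hand sides, with `[]` standing for an Є-production) and the function names are assumptions made for illustration.

```python
EPS = "Є"   # empty string, written as in the rules above
END = "$"   # end-of-input marker


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of each non-terminal."""
    result = set()
    for sym in symbols:
        # A symbol with no FIRST entry is treated as a terminal.
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol (or none at all) can derive Є
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                     # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:     # only non-terminals have FOLLOW
                        continue
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}         # rule 2
                    if EPS in rest:            # rules 3 and 4
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


# The small grammar from the tree example: S -> Aa | Ac, A -> b
grammar = {
    "S": [["A", "a"], ["A", "c"]],
    "A": [["b"]],
}
follow = compute_follow(grammar, "S")
print(sorted(follow["A"]))  # ['a', 'c']
```

Iterating until nothing changes is what lets rule 3 and rule 4 propagate FOLLOW sets through chains of productions, regardless of the order in which the productions are visited.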

Example 1:

```
Production Rules:
E -> TE’
E’ -> +T E’ | Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id

FIRST set
FIRST(E) = FIRST(T) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }

FOLLOW set
FOLLOW(E)  = { $ , ) }  // Note ')' is there because of the 5th production, F -> (E)
FOLLOW(E’) = FOLLOW(E) = { $ , ) }  // See 1st production
FOLLOW(T)  = { FIRST(E’) – Є } U FOLLOW(E’) U FOLLOW(E) = { + , $ , ) }
FOLLOW(T’) = FOLLOW(T) = { + , $ , ) }
FOLLOW(F)  = { FIRST(T’) – Є } U FOLLOW(T’) U FOLLOW(T) = { * , + , $ , ) }
```

Example 2:

```
Production Rules:
S -> aBDh
B -> cC
C -> bC | Є
D -> EF
E -> g | Є
F -> f | Є

FIRST set
FIRST(S) = { a }
FIRST(B) = { c }
FIRST(C) = { b , Є }
FIRST(D) = FIRST(E) U FIRST(F) = { g , f , Є }
FIRST(E) = { g , Є }
FIRST(F) = { f , Є }

FOLLOW set
FOLLOW(S) = { $ }
FOLLOW(B) = { FIRST(D) – Є } U FIRST(h) = { g , f , h }
FOLLOW(C) = FOLLOW(B) = { g , f , h }
FOLLOW(D) = FIRST(h) = { h }
FOLLOW(E) = { FIRST(F) – Є } U FOLLOW(D) = { f , h }
FOLLOW(F) = FOLLOW(D) = { h }
```
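
The step that does most of the work in Example 2 is taking FIRST of the string of symbols that follows a non-terminal. A small sketch of that step is shown below, with the FIRST sets copied from the example. The helper name `first_of_string` and the dictionary layout are assumptions for illustration.

```python
EPS = "Є"

# FIRST sets copied from Example 2; a terminal is its own FIRST set.
FIRST = {
    "S": {"a"}, "B": {"c"}, "C": {"b", EPS},
    "D": {"g", "f", EPS}, "E": {"g", EPS}, "F": {"f", EPS},
}


def first_of_string(symbols):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result      # this symbol cannot vanish, so stop here
    result.add(EPS)            # every symbol in the sequence can vanish
    return result


# FOLLOW(B) from S -> aBDh: everything that can start "Dh".
print(sorted(first_of_string(["D", "h"])))  # ['f', 'g', 'h']

# FOLLOW(E) from D -> EF: FIRST(F) without Є, plus FOLLOW(D) because F can vanish.
after_e = first_of_string(["F"])
follow_d = {"h"}
follow_e = (after_e - {EPS}) | (follow_d if EPS in after_e else set())
print(sorted(follow_e))  # ['f', 'h']
```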

Example 3:

```
Production Rules:
S -> ACB | Cbb | Ba
A -> da | BC
B -> g | Є
C -> h | Є

FIRST set
FIRST(S) = FIRST(ACB) U FIRST(Cbb) U FIRST(Ba) = { d, g, h, Є, b, a }
FIRST(A) = { d } U { FIRST(B) – Є } U FIRST(C) = { d, g, h, Є }
FIRST(B) = { g, Є }
FIRST(C) = { h, Є }

FOLLOW set
FOLLOW(S) = { $ }
FOLLOW(A) = { h, g, $ }
FOLLOW(B) = { a, $, h, g }
FOLLOW(C) = { b, g, $, h }
```

Advantages:

Ambiguity Resolution: The FOLLOW set helps the parser decide between productions based on the next input symbol, for example whether to apply a production that derives Є. This settles the choice between productions, although it cannot make a genuinely ambiguous grammar unambiguous.

Parsing Efficiency: Because FIRST and FOLLOW sets let a predictive parser pick a production from a single lookahead symbol, the parser does not need to backtrack, which reduces the overall parsing time.

Error Detection: The FOLLOW set helps detect errors in the input. If the next input symbol cannot start any production of the current non-terminal and, for a non-terminal that can derive Є, is not in its FOLLOW set either, the parser can report a syntax error such as a missing or extra symbol.
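
The following sketch shows how these points play out in a table-driven predictive (LL(1)) parser for the grammar of Example 1. It is an illustration under stated assumptions rather than a reference implementation: the FIRST set of each right-hand side and the FOLLOW sets are hand-copied from Example 1, and the names `build_ll1_table` and `parse` are hypothetical.

```python
EPS = "Є"

# Productions of Example 1 with FIRST of each right-hand side (hand-computed).
PRODUCTIONS = [
    ("E",  ["T", "E'"],      {"(", "id"}),
    ("E'", ["+", "T", "E'"], {"+"}),
    ("E'", [],               {EPS}),
    ("T",  ["F", "T'"],      {"(", "id"}),
    ("T'", ["*", "F", "T'"], {"*"}),
    ("T'", [],               {EPS}),
    ("F",  ["(", "E", ")"],  {"("}),
    ("F",  ["id"],           {"id"}),
]
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}


def build_ll1_table():
    table = {}
    for head, body, body_first in PRODUCTIONS:
        lookaheads = body_first - {EPS}
        if EPS in body_first:
            lookaheads |= FOLLOW[head]   # Є-production: predict on FOLLOW(head)
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"LL(1) conflict at ({head}, {terminal})")
            table[(head, terminal)] = body
    return table


def parse(tokens):
    """Return True if the token list is a sentence of the grammar."""
    table = build_ll1_table()
    stack = ["$", "E"]
    tokens = tokens + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in FOLLOW:                 # top of stack is a non-terminal
            body = table.get((top, look))
            if body is None:
                return False              # empty table entry: syntax error
            stack.extend(reversed(body))  # no backtracking is ever needed
        elif top == look:
            pos += 1                      # terminal matches the input
        else:
            return False                  # terminal mismatch: syntax error
    return pos == len(tokens)


print(parse(["id", "+", "id", "*", "id"]))  # True
print(parse(["id", "+", "+", "id"]))        # False
```

The Є-productions for E’ and T’ are entered under the terminals of their FOLLOW sets, which is the only place the FOLLOW sets are consulted. A lookahead with no table entry is reported as an error immediately.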

Disadvantages:

Complexity: Computing FOLLOW sets requires the FIRST sets and repeated passes over all productions until nothing changes. For grammars with a large number of non-terminals and productions this is tedious and error-prone to do by hand.

Incomplete Parsing: FOLLOW sets alone are not enough when the grammar is ambiguous or otherwise produces conflicting table entries. In these cases the parser cannot uniquely determine the structure of the input, and the grammar must be rewritten or a more powerful parsing method used.

Overhead: Computing FOLLOW sets adds work to parser construction, especially for large grammars. The sets are computed once when the parsing table is built, so this cost is paid at construction time rather than on every parse.

Note:

1. Є never appears in a FOLLOW set (Є is the empty string, not an input symbol).
2. $ is called the end-marker. It represents the end of the input string and is used during parsing to indicate that the input has been completely processed.
3. The grammars used above are Context-Free Grammars (CFGs). The syntax of a programming language can be specified using a CFG.
4. A CFG production has the form A -> B, where A is a single Non-Terminal and B is a string of grammar symbols (Terminals as well as Non-Terminals), possibly empty.