Compiler Design | Why FIRST and FOLLOW?

Why FIRST?
In the previous article, Introduction to Syntax Analysis, we saw that a parser may need to backtrack, which is a complex process to implement. There is an easier way to sort out this problem:

If the compiler knew in advance the “first character of the string produced when a production rule is applied”, it could compare that character with the current character or token in the input string and wisely decide which production rule to apply.

Let’s take the same grammar from the previous article:

S -> cAd
A -> bc | a
And the input string is “cad”.

In the example above, the parser reads the character ‘c’ and applies S -> cAd. If it knows that the next character in the input string is ‘a’, it can ignore the production rule A -> bc (because ‘b’, not ‘a’, is the first character of the string produced by this rule) and directly use the production rule A -> a (because ‘a’ is the first character of the string produced by this rule, and it matches the current input character, which is also ‘a’).
Hence, if the compiler/parser knows the first character of the string that can be obtained by applying a production rule, it can pick the correct production rule without backtracking and build the correct syntax tree for the given input string.
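
The idea can be sketched in a few lines of Python. This is only an illustration and not a general parser: the names GRAMMAR, choose_production, parse and accepts are made up for this sketch, and it assumes that every alternative starts with a terminal, so the first character is directly visible on the right-hand side of the rule.

```python
# Grammar from the example: S -> cAd, A -> bc | a
# Uppercase letters are non-terminals, lowercase letters are terminals.
GRAMMAR = {
    "S": ["cAd"],
    "A": ["bc", "a"],
}


def choose_production(nonterminal, lookahead):
    """Return the alternative whose first character equals the lookahead.

    Assumption: every alternative starts with a terminal, so its first
    character can be read straight off the right-hand side.
    """
    for rhs in GRAMMAR[nonterminal]:
        if rhs[0] == lookahead:
            return rhs
    return None  # no alternative can start with this character


def parse(symbol, text, pos):
    """Match `symbol` against text[pos:]; return the new position or None."""
    if symbol.islower():  # terminal: must equal the current character
        if pos < len(text) and text[pos] == symbol:
            return pos + 1
        return None
    lookahead = text[pos] if pos < len(text) else None
    rhs = choose_production(symbol, lookahead)  # decided without backtracking
    if rhs is None:
        return None
    for sym in rhs:
        pos = parse(sym, text, pos)
        if pos is None:
            return None
    return pos


def accepts(text):
    """True if the whole input can be derived from the start symbol S."""
    return parse("S", text, 0) == len(text)
```

For the input “cad”, choose_production("A", "a") picks A -> a directly, so the rule A -> bc is never tried.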

Why FOLLOW?
The parser faces one more problem. Consider the grammar below to understand it.

 A -> aBb
 B -> c | ε
 And suppose the input string to parse is “ab”.

As the first character in the input is a, the parser applies the rule A->aBb.

          A
        / |  \
      a   B   b

Now the parser looks at the second character of the input string, which is b, and the non-terminal to expand is B. But B cannot derive any string that has b as its first character.
The grammar does, however, contain the production rule B -> ε. If that rule is applied, B vanishes and the parser obtains the input “ab”, as shown below. But the parser can apply it only when it knows that the character that follows B is the same as the current character in the input.

In the RHS of A -> aBb, b follows the non-terminal B, i.e. FOLLOW(B) = {b}, and the current input character is also b. Hence the parser applies the rule B -> ε, and it is able to derive the string “ab” from the given grammar.

           A                    A
        /  |  \              /    \                                                
      a    B    b    =>     a      b       
           |
           ε 

So FOLLOW tells the parser when a non-terminal may vanish (derive ε) so that the input string can be generated from the parse tree.
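
A minimal Python sketch of this decision is given below. It is an illustration under stated assumptions, not a general algorithm: the FOLLOW table is written by hand from the discussion above rather than computed, the names GRAMMAR, FOLLOW, EPSILON, choose_production, parse and accepts are made up for this sketch, and every non-empty alternative is assumed to start with a terminal.

```python
EPSILON = ""  # an empty right-hand side stands for ε

# Grammar from the example: A -> aBb, B -> c | ε
GRAMMAR = {
    "A": ["aBb"],
    "B": ["c", EPSILON],
}

# Written by hand from the text: in A -> aBb, the terminal b follows B.
FOLLOW = {"B": {"b"}}


def choose_production(nonterminal, lookahead):
    """Pick an alternative using the first character, else ε via FOLLOW."""
    epsilon_allowed = False
    for rhs in GRAMMAR[nonterminal]:
        if rhs == EPSILON:
            epsilon_allowed = True
        elif rhs[0] == lookahead:
            return rhs
    # No alternative starts with the lookahead: the non-terminal may vanish
    # only if the lookahead is something that can follow it.
    if epsilon_allowed and lookahead in FOLLOW.get(nonterminal, set()):
        return EPSILON
    return None


def parse(symbol, text, pos):
    """Match `symbol` against text[pos:]; return the new position or None."""
    if symbol.islower():  # terminal: must equal the current character
        if pos < len(text) and text[pos] == symbol:
            return pos + 1
        return None
    lookahead = text[pos] if pos < len(text) else None
    rhs = choose_production(symbol, lookahead)
    if rhs is None:
        return None
    for sym in rhs:  # an ε right-hand side consumes no input
        pos = parse(sym, text, pos)
        if pos is None:
            return None
    return pos


def accepts(text):
    """True if the whole input can be derived from the start symbol A."""
    return parse("A", text, 0) == len(text)
```

For the input “ab”, the lookahead at B is b. No alternative of B starts with b, but b is in FOLLOW(B), so the sketch chooses B -> ε and then matches the final b.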

 

The conclusion is that we need to find the FIRST and FOLLOW sets for a given grammar, so that the parser can apply the right rule at the right position.

In the next article, we will discuss the formal definitions of FIRST and FOLLOW, and some easy rules to compute these sets.

This article is compiled by Vaibhav Bajpai.
