# Compiler Design LL(1) Parser in Python

In this article, we will see how to design an LL(1) parser in Python.

## LL(1) grammar

The first ‘L’ in LL(1) stands for scanning the input from left to right, the second ‘L’ for producing a leftmost derivation, and the ‘1’ for using one input symbol of lookahead at each step to make parsing decisions. LL(1) grammars are parsed with the top-down method. For the class of grammars called LL(1), we can construct a predictive parser: a recursive-descent parser that needs no backtracking.

### Elimination of Left Recursion

A grammar is left recursive if it has a non-terminal A with productions of the form A → A α | β. Top-down parsing methods cannot handle left-recursive grammars, so a transformation is needed to eliminate left recursion. Left recursion can be eliminated by rewriting the rule as follows (A’ is a new non-terminal and ε represents epsilon):

```
A  → β A’
A’ → α A’ | ε
```
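As a minimal, self-contained sketch of this transformation (the function and variable names here are illustrative, and ‘#’ stands for ε, matching the convention used by the code in this article), immediate left recursion for a single non-terminal can be removed like this:

```python
# Hedged sketch: remove immediate left recursion for one non-terminal.
# Productions are lists of symbols; '#' denotes epsilon.
def eliminate_left_recursion(lhs, productions):
    """Turn A -> A a | b into A -> b A' and A' -> a A' | #."""
    alphas = [p[1:] for p in productions if p[0] == lhs]   # A -> A alpha
    betas = [p for p in productions if p[0] != lhs]        # A -> beta
    if not alphas:
        return {lhs: productions}                          # nothing to do
    new_lhs = lhs + "'"
    return {
        lhs: [beta + [new_lhs] for beta in betas],
        new_lhs: [alpha + [new_lhs] for alpha in alphas] + [['#']],
    }

# A -> A a | b  becomes  A -> b A',  A' -> a A' | #
result = eliminate_left_recursion('A', [['A', 'a'], ['b']])
print(result)  # → {'A': [['b', "A'"]], "A'": [['a', "A'"], ['#']]}
```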

### Left Factoring

Left factoring is a grammar transformation useful for producing a grammar suitable for predictive, or top-down, parsing. When the choice between two alternative productions is not clear, we rewrite the productions to defer the decision until enough of the input has been seen to make the right choice.

For example, if we have the grammar rule A → α β1 | α β2, left factoring yields:

```
A  → α A’
A’ → β1 | β2
```
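A minimal sketch of one level of this transformation (names are illustrative, productions are symbol lists, and alternatives are grouped by their first symbol):

```python
# Hedged sketch: one level of left factoring for a single non-terminal.
def left_factor(lhs, productions):
    """Turn A -> a b1 | a b2 | k into A -> a A' | k and A' -> b1 | b2."""
    groups = {}
    for p in productions:                  # group alternatives by first symbol
        groups.setdefault(p[0], []).append(p)
    result = {lhs: []}
    for head, group in groups.items():
        if len(group) > 1:                 # common prefix: defer the choice
            new_lhs = lhs + "'"
            result[lhs].append([head, new_lhs])
            result[new_lhs] = [p[1:] for p in group]
        else:                              # unique prefix: keep as-is
            result[lhs].append(group[0])
    return result

result = left_factor('A', [['a', 'b1'], ['a', 'b2'], ['k']])
print(result)  # → {'A': [['a', "A'"], ['k']], "A'": [['b1'], ['b2']]}
```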

## Concept of First and Follow

The construction of a top-down parser is aided by the FIRST and FOLLOW functions associated with a grammar G. During top-down parsing, FIRST and FOLLOW let us choose which production to apply, based on the next input symbol.

Rules for FIRST computation:

1) If x is a terminal, then FIRST(x) = {x}

2) If X → ε is a production, then add ε to FIRST(X)

3) If X is a non-terminal and X → PQR, then FIRST(X) = FIRST(P)

If FIRST(P) contains ε, then

FIRST(X) = (FIRST(P) – {ε}) U FIRST(QR)
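These rules can be sketched as an iterate-to-fixed-point computation over a grammar dictionary (the grammar representation and names here are my own choices, with ‘#’ denoting ε):

```python
# Hedged sketch: FIRST sets by fixed-point iteration; '#' denotes epsilon.
def compute_firsts(grammar, terminals):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                              # repeat until no set grows
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                before = len(first[nt])
                for sym in prod:
                    if sym == '#':              # rule 2: X -> epsilon
                        first[nt].add('#')
                        break
                    if sym in terminals:        # rule 1: leading terminal
                        first[nt].add(sym)
                        break
                    first[nt] |= first[sym] - {'#'}   # rule 3: leading NT
                    if '#' not in first[sym]:
                        break
                else:
                    first[nt].add('#')          # every symbol was nullable
                changed |= len(first[nt]) != before
    return first

g = {'S': [['A', 'b']], 'A': [['a'], ['#']]}
fs = compute_firsts(g, {'a', 'b'})
print(fs)  # → {'S': {'a', 'b'}, 'A': {'a', '#'}}
```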

Rules for FOLLOW computation:

Note: Epsilon (ε) can never be present in the FOLLOW of any non-terminal symbol.

1) For the start symbol S, place $ in FOLLOW(S)

2) If A → α B, then add FOLLOW(A) to FOLLOW(B)

3) If A → α B β, then

if ε is not in FIRST(β),

FOLLOW(B) = FIRST(β)

else,

FOLLOW(B) = (FIRST(β) – {ε}) U FOLLOW(A)
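These rules, too, can be sketched as a fixed-point iteration. In the sketch below (names are mine, not from the article's full program), `firsts` maps non-terminals to already-computed FIRST sets and ‘#’ denotes ε:

```python
# Hedged sketch: FOLLOW sets by fixed-point iteration.
def compute_follows(grammar, firsts, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add('$')                      # rule 1: $ in FOLLOW(start)
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for prod in productions:
                for i, sym in enumerate(prod):
                    if sym not in grammar:
                        continue                # only non-terminals have FOLLOW
                    before = len(follow[sym])
                    nullable_rest = True
                    for nxt in prod[i + 1:]:
                        f = firsts.get(nxt, {nxt})   # FIRST of a terminal is itself
                        follow[sym] |= f - {'#'}     # rule 3: FIRST of what follows
                        if '#' not in f:
                            nullable_rest = False
                            break
                    if nullable_rest:
                        follow[sym] |= follow[lhs]   # rule 2: FOLLOW(lhs) carries over
                    changed |= len(follow[sym]) != before
    return follow

g = {'S': [['A', 'b']], 'A': [['a'], ['#']]}
fw = compute_follows(g, {'S': {'a', 'b'}, 'A': {'a', '#'}}, 'S')
print(fw)  # → {'S': {'$'}, 'A': {'b'}}
```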

## Parsing table

After the parsing table is constructed, if any cell T[A,a] (non-terminal row A, terminal column a) contains more than one production rule, the grammar is not LL(1); otherwise, the grammar is LL(1).

Rules for construction of parsing table:

Step 1: For each production A → α of the given grammar, perform Step 2 and Step 3.

Step 2: For each terminal symbol ‘a’ in FIRST(α), add A → α to T[A,a], where ‘A’ determines the row and ‘a’ determines the column.

Step 3: If ε is present in FIRST(α), then find FOLLOW(A) and add A → α at every column ‘b’ in FOLLOW(A), i.e., T[A,b].

Step 4: If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to T[A,$].

a) The LHS symbol of the first rule is taken as the start symbol.

b) ‘#’ represents the epsilon symbol in the code.
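The steps above can be sketched compactly as follows (a hedged, self-contained illustration: the grammar, FIRST, and FOLLOW sets are supplied directly, names are mine, and ‘#’ denotes ε):

```python
# Hedged sketch of parse-table construction from FIRST/FOLLOW sets.
def build_parse_table(grammar, firsts, follows):
    table = {}
    grammar_is_ll1 = True

    def first_of(seq):                        # FIRST of a symbol string
        out = set()
        for s in seq:
            f = {'#'} if s == '#' else firsts.get(s, {s})
            out |= f - {'#'}
            if '#' not in f:
                return out
        out.add('#')                          # whole string can vanish
        return out

    for lhs, productions in grammar.items():  # Step 1: every production
        for prod in productions:
            fset = first_of(prod)
            cols = fset - {'#'}               # Step 2: terminals in FIRST
            if '#' in fset:
                cols |= follows[lhs]          # Steps 3-4: FOLLOW columns
            for t in cols:
                if (lhs, t) in table:
                    grammar_is_ll1 = False    # conflict: two rules in one cell
                table[(lhs, t)] = prod
    return table, grammar_is_ll1

g = {'S': [['A', 'b']], 'A': [['a'], ['#']]}
table, ok = build_parse_table(g,
                              {'S': {'a', 'b'}, 'A': {'a', '#'}},
                              {'S': {'$'}, 'A': {'b'}})
print(ok)                 # → True (no cell holds two rules)
print(table[('A', 'b')])  # → ['#']
```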

## Approach

• There are seven functions plus the driver code; together they perform all the calculations. The code takes grammar rules as input, and the user must specify which symbols are terminals (list: term_userdef) and which are non-terminals (list: nonterm_userdef).
• For the demo, the code runs on sample set 7, which contains both left recursion and left factoring. First, computeAllFirsts() is called; it invokes removeLeftRecursion() and LeftFactoring(), which implement the transformations described above. Then first() is called for every non-terminal; its recursive logic follows the FIRST-computation rules listed earlier.
• The base condition of first() is reaching an epsilon or a terminal symbol. For every non-terminal, the code enters its recursive logic; if the FIRST set has multiple symbols a list is returned, otherwise a string. The extra in-between lines of code simply keep the program type-safe.
• FIRST computation is a prerequisite for FOLLOW computation, since follow() makes multiple calls to first(). The start_symbol is the LHS of the first rule in the ‘rules’ list (this can be modified). For FOLLOW computation, computeAllFollows() is called, which in turn calls follow() on every non-terminal. follow() is recursive, with the base case being the start symbol, for which it returns the $ symbol.
• The remaining cases are handled according to the FOLLOW-computation rules listed earlier. For the target non-terminal, all rules are traversed, and the required FIRST sets and circular FOLLOW sets are computed. Intermediate results are accumulated during the traversal, and the final list of computed follows is returned at the end. Here, too, the base case returns a string while multiple symbols yield a list, so extra lines are added for type safety.
• After the FIRSTs and FOLLOWs are calculated, createParseTable() is called. It prints the firsts and follows in a formatted way, then prepares a 2D list named ‘mat’ for the parse table: non-terminals form the rows and terminals the columns, with ‘$’ as an extra column. The parse-table construction rules above form the foundation of the implemented logic.

## Approach for String Validation

• After the parsing table is generated, we validate the input string against the grammar using a stack and a buffer. If the grammar is not LL(1), string validation cannot be performed. First, push ‘$’ (the follow of the start symbol) onto both the stack and the buffer, then load the input string into the buffer in reverse order, and finally push the start symbol onto the top of the stack.
• We then iterate over the current state of the stack and the buffer, looking up Table[x][y], where ‘x’ is the symbol on top of the stack and ‘y’ is the symbol at the rear of the buffer. We take the corresponding rule from the table and expand the non-terminal at the top of the stack in the next iteration.
• If x and y are the same terminal, we pop it from both the stack and the buffer and continue. If only the ‘$’ symbol remains in both the stack and the buffer, all input symbols have been matched and the string belongs to the given grammar. If no rule is present in the parsing table, or x and y are different terminal symbols, the string does not belong to the given grammar.
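This loop can be sketched against a small hand-built parse table (for the toy grammar S → A b, A → a | #; names and representation are my own, not the article's full implementation):

```python
# Hedged sketch of the stack-buffer validation loop; '#' denotes epsilon.
def validate(table, nonterminals, start, tokens):
    stack = [start, '$']
    buffer = ['$'] + list(reversed(tokens))
    while True:
        top, look = stack[0], buffer[-1]
        if top == look == '$':
            return True                           # everything matched
        if top in nonterminals:
            prod = table.get((top, look))
            if prod is None:
                return False                      # no rule in the table: reject
            rhs = [s for s in prod if s != '#']   # expanding epsilon just pops
            stack = rhs + stack[1:]
        elif top == look:                         # equal terminals: pop both
            stack = stack[1:]
            buffer = buffer[:-1]
        else:
            return False                          # unmatched terminal symbols

table = {('S', 'a'): ['A', 'b'], ('S', 'b'): ['A', 'b'],
         ('A', 'a'): ['a'], ('A', 'b'): ['#']}
ok1 = validate(table, {'S', 'A'}, 'S', ['a', 'b'])  # "a b" is derivable
ok2 = validate(table, {'S', 'A'}, 'S', ['a'])       # "a" is not
print(ok1, ok2)  # → True False
```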

Below is the complete implementation:

## Python3

```python
# LL(1) parser code in python

def removeLeftRecursion(rulesDiction):
    # for rule: A->Aa|b
    # result: A->bA',A'->aA'|#

    # 'store' has new rules to be added
    store = {}
    # traverse over rules
    for lhs in rulesDiction:
        # alphaRules stores subrules with left-recursion
        # betaRules stores subrules without left-recursion
        alphaRules = []
        betaRules = []
        # get rhs for current lhs
        allrhs = rulesDiction[lhs]
        for subrhs in allrhs:
            if subrhs[0] == lhs:
                alphaRules.append(subrhs[1:])
            else:
                betaRules.append(subrhs)
        # alpha and beta containing subrules are separated
        # now form two new rules
        if len(alphaRules) != 0:
            # to generate new unique symbol
            # add ' till unique not generated
            lhs_ = lhs + "'"
            while (lhs_ in rulesDiction.keys()) \
                    or (lhs_ in store.keys()):
                lhs_ += "'"
            # make beta rule
            for b in range(0, len(betaRules)):
                betaRules[b].append(lhs_)
            rulesDiction[lhs] = betaRules
            # make alpha rule
            for a in range(0, len(alphaRules)):
                alphaRules[a].append(lhs_)
            alphaRules.append(['#'])
            # store in temp dict, append to
            # - rulesDiction at end of traversal
            store[lhs_] = alphaRules
    # add newly generated rules
    # - after removing left recursion
    for left in store:
        rulesDiction[left] = store[left]
    return rulesDiction


def LeftFactoring(rulesDiction):
    # for rule: A->aDF|aCV|k
    # result: A->aA'|k, A'->DF|CV

    # newDict stores newly generated
    # - rules after left factoring
    newDict = {}
    # iterate over all rules of dictionary
    for lhs in rulesDiction:
        # get rhs for given lhs
        allrhs = rulesDiction[lhs]
        # temp dictionary helps detect left factoring
        temp = dict()
        for subrhs in allrhs:
            if subrhs[0] not in list(temp.keys()):
                temp[subrhs[0]] = [subrhs]
            else:
                temp[subrhs[0]].append(subrhs)
        # if value list count for any key in temp is > 1,
        # - it has left factoring
        # new_rule stores new subrules for current LHS symbol
        new_rule = []
        # tempo_dict stores new subrules for left factoring
        tempo_dict = {}
        for term_key in temp:
            # get value from temp for term_key
            allStartingWithTermKey = temp[term_key]
            if len(allStartingWithTermKey) > 1:
                # left factoring required
                # to generate new unique symbol
                # - add ' till unique not generated
                lhs_ = lhs + "'"
                while (lhs_ in rulesDiction.keys()) \
                        or (lhs_ in tempo_dict.keys()):
                    lhs_ += "'"
                # append the left factored result
                new_rule.append([term_key, lhs_])
                # add expanded rules to tempo_dict
                ex_rules = []
                for g in temp[term_key]:
                    ex_rules.append(g[1:])
                tempo_dict[lhs_] = ex_rules
            else:
                # no left factoring required
                new_rule.append(allStartingWithTermKey[0])
        # add original rule
        newDict[lhs] = new_rule
        # add newly generated rules after left factoring
        for key in tempo_dict:
            newDict[key] = tempo_dict[key]
    return newDict


# calculation of first
# epsilon is denoted by '#' (hash)

# pass rule in first function
def first(rule):
    global rules, nonterm_userdef, \
        term_userdef, diction, firsts
    # recursion base condition
    # (for terminal or epsilon)
    if len(rule) != 0 and (rule is not None):
        if rule[0] in term_userdef:
            return rule[0]
        elif rule[0] == '#':
            return '#'

    # condition for Non-Terminals
    if len(rule) != 0:
        if rule[0] in list(diction.keys()):
            # fres temporary list of result
            fres = []
            rhs_rules = diction[rule[0]]
            # call first on each rule of RHS
            # fetched (& take union)
            for itr in rhs_rules:
                indivRes = first(itr)
                if type(indivRes) is list:
                    for i in indivRes:
                        fres.append(i)
                else:
                    fres.append(indivRes)

            # if no epsilon in result
            # - received return fres
            if '#' not in fres:
                return fres
            else:
                # apply epsilon
                # rule => f(ABC)=f(A)-{e} U f(BC)
                newList = []
                fres.remove('#')
                if len(rule) > 1:
                    ansNew = first(rule[1:])
                    if ansNew != None:
                        if type(ansNew) is list:
                            newList = fres + ansNew
                        else:
                            newList = fres + [ansNew]
                    else:
                        newList = fres
                    return newList
                # if result is not already returned
                # - control reaches here
                # lastly if epsilon still persists
                # - keep it in result of first
                fres.append('#')
                return fres


# calculation of follow
# use 'rules' list, and 'diction' dict from above

# follow function input is the split result on
# - Non-Terminal whose Follow we want to compute
def follow(nt):
    global start_symbol, rules, nonterm_userdef, \
        term_userdef, diction, firsts, follows
    # for start symbol return $ (recursion base case)

    solset = set()
    if nt == start_symbol:
        # return '$'
        solset.add('$')

    # check all occurrences
    # solset - is result of computed 'follow' so far

    # For input, check in all rules
    for curNT in diction:
        rhs = diction[curNT]
        # go for all productions of NT
        for subrule in rhs:
            if nt in subrule:
                # call for all occurrences on
                # - non-terminal in subrule
                while nt in subrule:
                    index_nt = subrule.index(nt)
                    subrule = subrule[index_nt + 1:]
                    # empty condition - call follow on LHS
                    if len(subrule) != 0:
                        # compute first if symbols on
                        # - RHS of target Non-Terminal exists
                        res = first(subrule)
                        # if epsilon in result apply rule
                        # - (A->aBX)- follow of -
                        # - follow(B)=(first(X)-{ep}) U follow(A)
                        if '#' in res:
                            newList = []
                            res.remove('#')
                            ansNew = follow(curNT)
                            if ansNew != None:
                                if type(ansNew) is list:
                                    newList = res + ansNew
                                else:
                                    newList = res + [ansNew]
                            else:
                                newList = res
                            res = newList
                    else:
                        # when nothing in RHS, go circular
                        # - and take follow of LHS
                        # only if (NT in LHS)!=curNT
                        if nt != curNT:
                            res = follow(curNT)

                    # add follow result in set form
                    if res is not None:
                        if type(res) is list:
                            for g in res:
                                solset.add(g)
                        else:
                            solset.add(res)
    return list(solset)


def computeAllFirsts():
    global rules, nonterm_userdef, \
        term_userdef, diction, firsts
    for rule in rules:
        k = rule.split("->")
        # remove un-necessary spaces
        k[0] = k[0].strip()
        k[1] = k[1].strip()
        rhs = k[1]
        multirhs = rhs.split('|')
        # remove un-necessary spaces
        for i in range(len(multirhs)):
            multirhs[i] = multirhs[i].strip()
            multirhs[i] = multirhs[i].split()
        diction[k[0]] = multirhs

    print(f"\nRules: \n")
    for y in diction:
        print(f"{y}->{diction[y]}")
    print(f"\nAfter elimination of left recursion:\n")

    diction = removeLeftRecursion(diction)
    for y in diction:
        print(f"{y}->{diction[y]}")
    print("\nAfter left factoring:\n")

    diction = LeftFactoring(diction)
    for y in diction:
        print(f"{y}->{diction[y]}")

    # calculate first for each rule
    # - (call first() on all RHS)
    for y in list(diction.keys()):
        t = set()
        for sub in diction.get(y):
            res = first(sub)
            if res != None:
                if type(res) is list:
                    for u in res:
                        t.add(u)
                else:
                    t.add(res)

        # save result in 'firsts' list
        firsts[y] = t

    print("\nCalculated firsts: ")
    key_list = list(firsts.keys())
    index = 0
    for gg in firsts:
        print(f"first({key_list[index]}) "
              f"=> {firsts.get(gg)}")
        index += 1


def computeAllFollows():
    global start_symbol, rules, nonterm_userdef,\
        term_userdef, diction, firsts, follows
    for NT in diction:
        solset = set()
        sol = follow(NT)
        if sol is not None:
            for g in sol:
                solset.add(g)
        follows[NT] = solset

    print("\nCalculated follows: ")
    key_list = list(follows.keys())
    index = 0
    for gg in follows:
        print(f"follow({key_list[index]})"
              f" => {follows[gg]}")
        index += 1


# create parse table
def createParseTable():
    import copy
    global diction, firsts, follows, term_userdef
    print("\nFirsts and Follow Result table\n")

    # find space size
    mx_len_first = 0
    mx_len_fol = 0
    for u in diction:
        k1 = len(str(firsts[u]))
        k2 = len(str(follows[u]))
        if k1 > mx_len_first:
            mx_len_first = k1
        if k2 > mx_len_fol:
            mx_len_fol = k2

    print(f"{{:<{10}}} "
          f"{{:<{mx_len_first + 5}}} "
          f"{{:<{mx_len_fol + 5}}}"
          .format("Non-T", "FIRST", "FOLLOW"))
    for u in diction:
        print(f"{{:<{10}}} "
              f"{{:<{mx_len_first + 5}}} "
              f"{{:<{mx_len_fol + 5}}}"
              .format(u, str(firsts[u]), str(follows[u])))

    # create matrix of row(NT) x [col(T) + 1($)]
    # create list of non-terminals
    ntlist = list(diction.keys())
    terminals = copy.deepcopy(term_userdef)
    terminals.append('$')

    # create the initial empty state of matrix
    # ('$' is already included in terminals)
    mat = []
    for x in diction:
        row = []
        for y in terminals:
            row.append('')
        mat.append(row)

    # Classifying grammar as LL(1) or not LL(1)
    grammar_is_LL = True

    # rules implementation
    for lhs in diction:
        rhs = diction[lhs]
        for y in rhs:
            res = first(y)
            # epsilon is present,
            # - take union with follow
            if '#' in res:
                if type(res) == str:
                    firstFollow = []
                    fol_op = follows[lhs]
                    if type(fol_op) is str:
                        firstFollow.append(fol_op)
                    else:
                        for u in fol_op:
                            firstFollow.append(u)
                    res = firstFollow
                else:
                    res.remove('#')
                    res = list(res) +\
                          list(follows[lhs])
            # add rules to table
            ttemp = []
            if type(res) is str:
                ttemp.append(res)
                res = copy.deepcopy(ttemp)
            for c in res:
                xnt = ntlist.index(lhs)
                yt = terminals.index(c)
                if mat[xnt][yt] == '':
                    mat[xnt][yt] = mat[xnt][yt] \
                                   + f"{lhs}->{' '.join(y)}"
                else:
                    # if rule already present
                    if f"{lhs}->{y}" in mat[xnt][yt]:
                        continue
                    else:
                        grammar_is_LL = False
                        mat[xnt][yt] = mat[xnt][yt] \
                                       + f",{lhs}->{' '.join(y)}"

    # final state of parse table
    print("\nGenerated parsing table:\n")
    frmt = "{:>12}" * len(terminals)
    print(frmt.format(*terminals))

    j = 0
    for y in mat:
        frmt1 = "{:>12}" * len(y)
        print(f"{ntlist[j]} {frmt1.format(*y)}")
        j += 1

    return (mat, grammar_is_LL, terminals)


def validateStringUsingStackBuffer(parsing_table, grammarll1,
                                   table_term_list, input_string,
                                   term_userdef, start_symbol):

    print(f"\nValidate String => {input_string}\n")

    # for more than one entries
    # - in one cell of parsing table
    if grammarll1 == False:
        return f"\nInput String = " \
               f"\"{input_string}\"\n" \
               f"Grammar is not LL(1)"

    # implementing stack buffer

    stack = [start_symbol, '$']
    buffer = []

    # reverse input string store in buffer
    input_string = input_string.split()
    input_string.reverse()
    buffer = ['$'] + input_string

    print("{:>20} {:>20} {:>20}".
          format("Buffer", "Stack", "Action"))

    while True:
        # end loop if all symbols matched
        if stack == ['$'] and buffer == ['$']:
            print("{:>20} {:>20} {:>20}"
                  .format(' '.join(buffer),
                          ' '.join(stack),
                          "Valid"))
            return "\nValid String!"
        elif stack[0] not in term_userdef:
            # take front of buffer (y) and tos (x)
            x = list(diction.keys()).index(stack[0])
            y = table_term_list.index(buffer[-1])
            if parsing_table[x][y] != '':
                # format table entry received
                entry = parsing_table[x][y]
                print("{:>20} {:>20} {:>25}".
                      format(' '.join(buffer),
                             ' '.join(stack),
                             f"T[{stack[0]}][{buffer[-1]}] = {entry}"))
                lhs_rhs = entry.split("->")
                lhs_rhs[1] = lhs_rhs[1].replace('#', '').strip()
                entryrhs = lhs_rhs[1].split()
                stack = entryrhs + stack[1:]
            else:
                return f"\nInvalid String! No rule at " \
                       f"Table[{stack[0]}][{buffer[-1]}]."
        else:
            # stack top is Terminal
            if stack[0] == buffer[-1]:
                print("{:>20} {:>20} {:>20}"
                      .format(' '.join(buffer),
                              ' '.join(stack),
                              f"Matched:{stack[0]}"))
                buffer = buffer[:-1]
                stack = stack[1:]
            else:
                return "\nInvalid String! " \
                       "Unmatched terminal symbols"


# DRIVER CODE - MAIN

# NOTE: To test any of the sample sets, uncomment ->
# 'rules' list, 'nonterm_userdef' list, 'term_userdef' list
# and for any String validation uncomment following line with
# 'sample_input_string' variable.

sample_input_string = None

# sample set 1 (Result: Not LL(1))
# rules=["A -> S B | B",
#        "S -> a | B c | #",
#        "B -> b | d"]
# nonterm_userdef=['A','S','B']
# term_userdef=['a','c','b','d']
# sample_input_string="b c b"

# sample set 2 (Result: LL(1))
# rules=["S -> A | B C",
#        "A -> a | b",
#        "B -> p | #",
#        "C -> c"]
# nonterm_userdef=['A','S','B','C']
# term_userdef=['a','c','b','p']
# sample_input_string="p c"

# sample set 3 (Result: LL(1))
# rules=["S -> A B | C",
#        "A -> a | b | #",
#        "B-> p | #",
#        "C -> c"]
# nonterm_userdef=['A','S','B','C']
# term_userdef=['a','c','b','p']
# sample_input_string="a c b"

# sample set 4 (Result: Not LL(1))
# rules = ["S -> A B C | C",
#          "A -> a | b B | #",
#          "B -> p | #",
#          "C -> c"]
# nonterm_userdef=['A','S','B','C']
# term_userdef=['a','c','b','p']
# sample_input_string="b p p c"

# sample set 5 (With left recursion)
# rules=["A -> B C c | g D B",
#        "B -> b C D E | #",
#        "C -> D a B | c a",
#        "D -> # | d D",
#        "E -> E a f | c"
#       ]
# nonterm_userdef=['A','B','C','D','E']
# term_userdef=["a","b","c","d","f","g"]
# sample_input_string="b a c a c"

# sample set 6
# rules=["E -> T E'",
#        "E' -> + T E' | #",
#        "T -> F T'",
#        "T' -> * F T' | #",
#        "F -> ( E ) | id"
# ]
# nonterm_userdef=['E','E\'','F','T','T\'']
# term_userdef=['id','+','*','(',')']
# sample_input_string="id * * id"
# example string 1
# sample_input_string="( id * id )"
# example string 2
# sample_input_string="( id ) * id + id"

# sample set 7 (left factoring & recursion present)
rules = ["S -> A k O",
         "A -> A d | a B | a C",
         "C -> c",
         "B -> b B C | r"]

nonterm_userdef = ['A', 'B', 'C']
term_userdef = ['k', 'O', 'd', 'a', 'c', 'b', 'r']
sample_input_string = "a r k O"

# sample set 8 (Multiple char symbols T & NT)
# rules = ["S -> NP VP",
#          "NP -> P | PN | D N",
#          "VP -> V NP",
#          "N -> championship | ball | toss",
#          "V -> is | want | won | played",
#          "P -> me | I | you",
#          "PN -> India | Australia | Steve | John",
#          "D -> the | a | an"]
#
# nonterm_userdef = ['S', 'NP', 'VP', 'N', 'V', 'P', 'PN', 'D']
# term_userdef = ["championship", "ball", "toss", "is", "want",
#                 "won", "played", "me", "I", "you", "India",
#                 "Australia", "Steve", "John", "the", "a", "an"]
# sample_input_string = "India won the championship"

# diction - store rules inputted
# firsts - store computed firsts
diction = {}
firsts = {}
follows = {}

# computes all FIRSTs for all non terminals
computeAllFirsts()
# assuming first rule has start_symbol
# start symbol can be modified in below line of code
start_symbol = list(diction.keys())[0]
# computes all FOLLOWs for all occurrences
computeAllFollows()
# generate formatted first and follow table
# then generate parse table

(parsing_table, result, tabTerm) = createParseTable()

# validate string input using stack-buffer concept
if sample_input_string != None:
    validity = validateStringUsingStackBuffer(parsing_table, result,
                                              tabTerm, sample_input_string,
                                              term_userdef, start_symbol)
    print(validity)
else:
    print("\nNo input String detected")

# Author: Tanmay P. Bisen
```

Output

```
Rules: 

S->[['A', 'k', 'O']]
A->[['A', 'd'], ['a', 'B'], ['a', 'C']]
C->[['c']]
B->[['b', 'B', 'C'], ['r']]

After elimination of left recursion:

S->[['A', 'k', 'O']]
A->[['a', 'B', "A'"], ['a', 'C', "A'"]]
C->[['c']]
B->[['b', 'B', 'C'], ['r']]
A'->[['d', "A'"], ['#']]

After left factoring:

S->[['A', 'k', 'O']]
A->[['a', "A''"]]
A''->[['B', "A'"], ['C', "A'"]]
C->[['c']]
B->[['b', 'B', 'C'], ['r']]
A'->[['d', "A'"], ['#']]

Calculated firsts: 
first(S) => {'a'}
first(A) => {'a'}
first(A'') => {'c', 'r', 'b'}
first(C) => {'c'}
first(B) => {'r', 'b'}
first(A') => {'d', '#'}

Calculated follows: 
follow(S) => {'$'}
follow(A) => {'k'}
follow(A'') => {'k'}
follow(C) => {'d', 'c', 'k'}
follow(B) => {'d', 'c', 'k'}
follow(A') => {'k'}

Firsts and Follow Result table

Non-T      FIRST                FOLLOW
S          {'a'}                {'$'}
A          {'a'}                {'k'}
A''        {'c', 'r', 'b'}      {'k'}
C          {'c'}                {'d', 'c', 'k'}
B          {'r', 'b'}           {'d', 'c', 'k'}
A'         {'d', '#'}           {'k'}

Generated parsing table:

           k           O           d           a           c           b           r           $
S                                         S->A k O
A                                         A->a A''
A''                                                    A''->C A'   A''->B A'   A''->B A'
C                                                         C->c
B                                                                 B->b B C        B->r
A'        A'->#                A'->d A'

Validate String => a r k O

              Buffer                Stack               Action
           $ O k r a                  S $        T[S][a] = S->A k O
           $ O k r a              A k O $        T[A][a] = A->a A''
           $ O k r a          a A'' k O $            Matched:a
             $ O k r            A'' k O $     T[A''][r] = A''->B A'
             $ O k r           B A' k O $            T[B][r] = B->r
             $ O k r           r A' k O $            Matched:r
               $ O k             A' k O $          T[A'][k] = A'->#
               $ O k                k O $            Matched:k
                 $ O                  O $            Matched:O
                   $                    $                Valid

Valid String!
```

## Inference

Sample set 7 is used to produce the output shown above; it covers all the aspects of LL(1) parsing. The grammar is printed after left-recursion removal and again after left factoring. Next come the FIRST and FOLLOW computation results. Then the parsing table is generated; if no position (Table[NT][T]) in the table holds more than one entry, the grammar is LL(1). Finally, the sample input string is validated using stack-buffer validation.

