Removing Direct and Indirect Left Recursion in a Grammar

Prerequisite – Classification of Context Free Grammars, Ambiguity and Parsers

Introduction: Left recursion is a common problem encountered during syntax analysis, the parsing phase of compilation. It has to be removed before a grammar is handed to a top-down parser, because a left-recursive rule can send such a parser into an infinite loop. This article presents an algorithm for removing left recursion from a grammar, with a worked example and an explanation of each step.

Left Recursion: A grammar of the form

`S ⇒ S a | b`

is called left recursive, where S is any nonterminal and a and b are any strings of terminals. The alternative S a begins with S itself, the same nonterminal that appears on the left-hand side.
Problem with Left Recursion: If a grammar contains left recursion, a top-down parser can fall into an infinite loop during syntax analysis. To expand S, the parser must first expand the S at the start of its own right-hand side, and it does so again and again without consuming any input or checking any condition.
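
The loop can be seen in a minimal sketch of a naive recursive-descent parser for the illustrative grammar `S ⇒ S a | c`. The function names `parse_S` and `loops_forever` are hypothetical and only serve this demonstration; in Python the endless recursion surfaces as a `RecursionError`.

```python
def parse_S(tokens, pos=0):
    """Naive recursive-descent attempt for S -> S a | c (illustrative only)."""
    # Alternative 1: S -> S a. The parser must first recognize S at the
    # same position, so it calls itself without consuming any token.
    end = parse_S(tokens, pos)
    if end is not None and end < len(tokens) and tokens[end] == "a":
        return end + 1
    # Alternative 2: S -> c. Never reached, because the call above
    # never returns normally.
    if pos < len(tokens) and tokens[pos] == "c":
        return pos + 1
    return None


def loops_forever(tokens):
    """Return True if parsing exhausts Python's recursion limit."""
    try:
        parse_S(tokens)
    except RecursionError:
        return True
    return False


print(loops_forever(["c", "a"]))  # prints True
```

The input does not matter here: the recursion starts before any token is inspected.
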
Algorithm to Remove Left Recursion with an example: Suppose we have a grammar that contains left recursion:

`S ⇒ S a | S b | c | d`

Check whether the given grammar contains left recursion. If it does, isolate the left-recursive production and work on it. In our example:

`S ⇒ S a | S b | c | d`

Introduce a new nonterminal and append it to every alternative that does not begin with S. We create a new nonterminal S' and write the new production as:

`S ⇒ cS' | dS'`

Put the new nonterminal S' on the left-hand side of a second production. Its right-hand side is either ε or one of the strings that followed S in the left-recursive alternatives (here a and b), each with S' appended at the end.

`S' ⇒ ε | aS' | bS'`

So, after conversion, the new equivalent grammar is:

```
S ⇒ cS' | dS'
S' ⇒ ε | aS' | bS'
```
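
Because the converted grammar always consumes a terminal before recursing, it can be parsed top-down. The following is a minimal sketch of a recognizer for it, assuming the input is a string of the single-character terminals a, b, c, d; the function names are illustrative and not part of the article's implementation.

```python
def parse_S_prime(s, pos):
    """S' -> a S' | b S' | ε. Returns the position after the match."""
    if pos < len(s) and s[pos] in "ab":
        # One terminal is consumed before recursing, so this terminates.
        return parse_S_prime(s, pos + 1)
    return pos  # ε alternative: match nothing


def parse_S(s, pos=0):
    """S -> c S' | d S'. Returns the end position, or None on failure."""
    if pos < len(s) and s[pos] in "cd":
        return parse_S_prime(s, pos + 1)
    return None


def accepts(s):
    """True if the whole string is derived from S."""
    return parse_S(s) == len(s)


print(accepts("cab"))  # prints True
print(accepts("ac"))   # prints False
```

The accepted strings are exactly those that start with c or d and continue with any number of a's and b's, which is also what the original left-recursive grammar generates.
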

Indirect Left Recursion: A grammar is said to have indirect left recursion if, starting from some nonterminal, it is possible to derive in two or more steps a string whose leftmost symbol is that same nonterminal. For example,

```
A ⇒ B r
B ⇒ C d
C ⇒ A t
```

where A, B, C are nonterminals and r, d, t are terminals. Here, starting with A, we can derive a string that begins with A again by replacing B with C d and then C with A t: A ⇒ B r ⇒ C d r ⇒ A t d r.
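
One way to detect this situation is to follow only the leftmost symbol of each alternative and check whether a nonterminal can reach itself. The sketch below assumes the grammar is given as a dictionary from nonterminal to a list of alternatives (each a non-empty list of symbols) and that there are no ε-productions; the representation and the function name are assumptions made for this example.

```python
def left_recursive_nonterminals(grammar):
    """Return the nonterminals that can derive a string starting with themselves.

    Assumes every alternative is a non-empty list (no ε-productions).
    """
    # first_edges[X] = nonterminals that appear leftmost in some alternative of X
    first_edges = {
        head: {alt[0] for alt in alts if alt[0] in grammar}
        for head, alts in grammar.items()
    }

    def reachable(start):
        # Depth-first search along leftmost-symbol edges.
        seen, stack = set(), list(first_edges[start])
        while stack:
            sym = stack.pop()
            if sym not in seen:
                seen.add(sym)
                stack.extend(first_edges[sym])
        return seen

    return {head for head in grammar if head in reachable(head)}


grammar = {
    "A": [["B", "r"]],
    "B": [["C", "d"]],
    "C": [["A", "t"]],
}
print(sorted(left_recursive_nonterminals(grammar)))  # prints ['A', 'B', 'C']
```
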

Algorithm to remove indirect left recursion, with the help of an example:

```
A1 ⇒ A2 A3
A2 ⇒ A3 A1 | b
A3 ⇒ A1 A1 | a
```

where A1, A2, A3 are nonterminals and a, b are terminals.

Identify the production that can cause indirect left recursion. In our case,

`A3 ⇒ A1 A1 | a`

Replace the leading nonterminal of that production with its own right-hand side: substitute A1 ⇒ A2 A3 for the first A1 in the production of A3.

`A3 ⇒ A2 A3 A1 | a`

Now, in this production, substitute A2 ⇒ A3 A1 | b for the leading A2:

`A3 ⇒ (A3 A1 | b) A3 A1 | a`

and then distribute:

`A3 ⇒ A3 A1 A3 A1 | b A3 A1 | a`

The production for A3 now has direct left recursion, which can be removed with the method for direct left recursion described earlier.
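
The two substitutions above can be sketched as a small helper, assuming each alternative is stored as a list of symbols. The name `substitute_leading` is hypothetical.

```python
def substitute_leading(alternatives, target, target_alternatives):
    """Replace a leading `target` by every alternative of `target`."""
    result = []
    for alt in alternatives:
        if alt and alt[0] == target:
            # One new alternative per body of the substituted nonterminal.
            for body in target_alternatives:
                result.append(body + alt[1:])
        else:
            result.append(alt)
    return result


grammar = {
    "A1": [["A2", "A3"]],
    "A2": [["A3", "A1"], ["b"]],
    "A3": [["A1", "A1"], ["a"]],
}

# Substitute A1 first, then A2, exactly as in the steps above.
a3 = substitute_leading(grammar["A3"], "A1", grammar["A1"])
a3 = substitute_leading(a3, "A2", grammar["A2"])
print(" | ".join(" ".join(alt) for alt in a3))
# prints: A3 A1 A3 A1 | b A3 A1 | a
```
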

To eliminate the direct left recursion as above, introduce a new nonterminal and append it to every alternative that does not begin with A3. We create a new nonterminal A' and write the new productions as:

```
A3 ⇒ b A3 A1 A' | a A'
A' ⇒ ε | A1 A3 A1 A'
```
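
The same step can be written for symbol lists rather than strings. This is a minimal sketch, assuming every alternative is a non-empty list of symbols and that at least one alternative does not start with the head; `eliminate_immediate` and `EPSILON` are names chosen for this example.

```python
EPSILON = "\u03b5"  # the symbol ε


def eliminate_immediate(head, alternatives, new_head):
    """Split head -> head α | β into head -> β new_head, new_head -> ε | α new_head."""
    # α parts: what follows the head in left-recursive alternatives.
    alphas = [alt[1:] for alt in alternatives if alt[0] == head]
    # β parts: alternatives that do not start with the head.
    betas = [alt for alt in alternatives if alt[0] != head]
    if not alphas:
        return {head: alternatives}  # nothing to do
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [[EPSILON]] + [alpha + [new_head] for alpha in alphas],
    }


rules = eliminate_immediate(
    "A3",
    [["A3", "A1", "A3", "A1"], ["b", "A3", "A1"], ["a"]],
    "A'",
)
# rules["A3"] holds  b A3 A1 A' | a A'
# rules["A'"] holds  ε | A1 A3 A1 A'
```
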

The ε alternative can be removed by distributing it: every alternative that ends in A' is written both with and without the trailing A'.

```
A3 ⇒ b A3 A1 | a | b A3 A1 A' | a A'
A' ⇒ A1 A3 A1 | A1 A3 A1 A'
```
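
A sketch of this distribution step is shown below. It only handles the shape produced by the elimination step, where the new nonterminal appears at the end of an alternative and is preceded by at least one other symbol; the helper names are hypothetical, and the order of the alternatives it produces may differ from the order shown above.

```python
EPSILON = "\u03b5"  # the symbol ε


def drop_trailing(alt, symbol):
    """Return alt without a trailing `symbol`, or alt unchanged."""
    return alt[:-1] if alt and alt[-1] == symbol else alt


def distribute_epsilon(rules, new_head):
    """Remove new_head -> ε by adding a copy of each alternative without the trailing new_head."""
    result = {}
    for head, alts in rules.items():
        expanded = []
        for alt in alts:
            if alt == [EPSILON]:
                continue  # the ε alternative itself disappears
            shorter = drop_trailing(alt, new_head)
            for candidate in (shorter, alt):
                if candidate and candidate not in expanded:
                    expanded.append(candidate)
        result[head] = expanded
    return result


rules = {
    "A3": [["b", "A3", "A1", "A'"], ["a", "A'"]],
    "A'": [[EPSILON], ["A1", "A3", "A1", "A'"]],
}
eps_free = distribute_epsilon(rules, "A'")
# eps_free["A3"] holds  b A3 A1 | b A3 A1 A' | a | a A'
# eps_free["A'"] holds  A1 A3 A1 | A1 A3 A1 A'
```
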

The resulting grammar is then:

```
A1 ⇒ A2 A3
A2 ⇒ A3 A1 | b
A3 ⇒ b A3 A1 | a | b A3 A1 A' | a A'
A' ⇒ A1 A3 A1 | A1 A3 A1 A'
```

Implementation:

C++

```cpp
#include <iostream>
#include <string>
#include <vector>
using namespace std;

class NonTerminal {
    string name;                    // Stores the head of the production rule
    vector<string> productionRules; // Stores the bodies of the production rules

public:
    NonTerminal(string name) {
        this->name = name;
    }

    // Returns the head of the production rule
    string getName() {
        return name;
    }

    // Replaces the bodies of the production rules
    void setRules(vector<string> rules) {
        productionRules.clear();
        for (auto rule : rules) {
            productionRules.push_back(rule);
        }
    }

    // Returns the bodies of the production rules
    vector<string> getRules() {
        return productionRules;
    }

    void addRule(string rule) {
        productionRules.push_back(rule);
    }

    // Prints the production rules
    void printRule() {
        string toPrint = "";
        toPrint += name + " ->";

        for (string s : productionRules) {
            toPrint += " " + s + " |";
        }

        toPrint.pop_back(); // Drop the trailing '|'
        cout << toPrint << endl;
    }
};

class Grammar {
    vector<NonTerminal> nonTerminals;

public:
    // Add rules to the grammar; expects the form "S -> Sa | Sb | c | d"
    void addRule(string rule) {
        bool nt = false; // Becomes true once the head has been read
        string parse = "";

        for (char c : rule) {
            if (c == ' ') {
                if (!nt) {
                    NonTerminal newNonTerminal(parse);
                    nonTerminals.push_back(newNonTerminal);
                    nt = true;
                    parse = "";
                } else if (parse.size()) {
                    nonTerminals.back().addRule(parse);
                    parse = "";
                }
            } else if (c != '|' && c != '-' && c != '>') {
                parse += c;
            }
        }
        if (parse.size()) {
            nonTerminals.back().addRule(parse);
        }
    }

    void inputData() {
        addRule("S -> Sa | Sb | c | d");
    }

    // Algorithm for eliminating the non-immediate left recursion:
    // replaces a leading B in the rules of A with every rule of B
    void solveNonImmediateLR(NonTerminal &A, NonTerminal &B) {
        string nameB = B.getName();

        vector<string> rulesA, rulesB, newRulesA;
        rulesA = A.getRules();
        rulesB = B.getRules();

        for (auto rule : rulesA) {
            if (rule.substr(0, nameB.size()) == nameB) {
                for (auto rule1 : rulesB) {
                    newRulesA.push_back(rule1 + rule.substr(nameB.size()));
                }
            } else {
                newRulesA.push_back(rule);
            }
        }
        A.setRules(newRulesA);
    }

    // Algorithm for eliminating immediate left recursion
    void solveImmediateLR(NonTerminal &A) {
        string name = A.getName();
        string newName = name + "'";

        vector<string> alphas, betas, rules, newRulesA, newRulesA1;
        rules = A.getRules();

        // Checks if there is left recursion or not
        for (auto rule : rules) {
            if (rule.substr(0, name.size()) == name) {
                alphas.push_back(rule.substr(name.size()));
            } else {
                betas.push_back(rule);
            }
        }

        // If no left recursion, exit
        if (!alphas.size())
            return;

        if (!betas.size())
            newRulesA.push_back(newName);

        for (auto beta : betas)
            newRulesA.push_back(beta + newName);

        for (auto alpha : alphas)
            newRulesA1.push_back(alpha + newName);

        // Amends the original rule
        A.setRules(newRulesA);
        newRulesA1.push_back("\u03B5");

        // Adds new production rule. This must stay the last statement:
        // push_back may reallocate the vector and invalidate the reference A.
        NonTerminal newNonTerminal(newName);
        newNonTerminal.setRules(newRulesA1);
        nonTerminals.push_back(newNonTerminal);
    }

    // Eliminates left recursion
    void applyAlgorithm() {
        int size = nonTerminals.size();
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < i; j++) {
                solveNonImmediateLR(nonTerminals[i], nonTerminals[j]);
            }
            solveImmediateLR(nonTerminals[i]);
        }
    }

    // Print all the rules of grammar
    void printRules() {
        for (auto nonTerminal : nonTerminals) {
            nonTerminal.printRule();
        }
    }
};

int main() {
    Grammar grammar;
    grammar.inputData();
    grammar.applyAlgorithm();
    grammar.printRules();

    return 0;
}
```

Java

```java
import java.util.*;

class NonTerminal {
    private String name;
    private ArrayList<String> rules;

    public NonTerminal(String name) {
        this.name = name;
        rules = new ArrayList<>();
    }

    public void addRule(String rule) {
        rules.add(rule);
    }

    public void setRules(ArrayList<String> rules) {
        this.rules = rules;
    }

    public String getName() {
        return name;
    }

    public ArrayList<String> getRules() {
        return rules;
    }

    public void printRule() {
        System.out.print(name + " -> ");
        for (int i = 0; i < rules.size(); i++) {
            System.out.print(rules.get(i));
            if (i != rules.size() - 1)
                System.out.print(" | ");
        }
        System.out.println();
    }
}

class Grammar {
    private ArrayList<NonTerminal> nonTerminals;

    public Grammar() {
        nonTerminals = new ArrayList<>();
    }

    // Add rules to the grammar; expects the form "S -> Sa | Sb | c | d"
    public void addRule(String rule) {
        boolean nt = false; // Becomes true once the head has been read
        String parse = "";

        for (int i = 0; i < rule.length(); i++) {
            char c = rule.charAt(i);
            if (c == ' ') {
                if (!nt) {
                    NonTerminal newNonTerminal = new NonTerminal(parse);
                    nonTerminals.add(newNonTerminal);
                    nt = true;
                    parse = "";
                } else if (parse.length() != 0) {
                    nonTerminals.get(nonTerminals.size() - 1).addRule(parse);
                    parse = "";
                }
            } else if (c != '|' && c != '-' && c != '>') {
                parse += c;
            }
        }
        if (parse.length() != 0) {
            nonTerminals.get(nonTerminals.size() - 1).addRule(parse);
        }
    }

    public void inputData() {
        addRule("S -> Sa | Sb | c | d");
    }

    // Replaces a leading B in the rules of A with every rule of B
    public void solveNonImmediateLR(NonTerminal A, NonTerminal B) {
        String nameB = B.getName();

        ArrayList<String> rulesA = A.getRules();
        ArrayList<String> rulesB = B.getRules();
        ArrayList<String> newRulesA = new ArrayList<>();

        for (String rule : rulesA) {
            // startsWith avoids an exception when the rule is shorter than the name
            if (rule.startsWith(nameB)) {
                for (String rule1 : rulesB) {
                    newRulesA.add(rule1 + rule.substring(nameB.length()));
                }
            } else {
                newRulesA.add(rule);
            }
        }
        A.setRules(newRulesA);
    }

    public void solveImmediateLR(NonTerminal A) {
        String name = A.getName();
        String newName = name + "'";

        ArrayList<String> alphas = new ArrayList<>();
        ArrayList<String> betas = new ArrayList<>();
        ArrayList<String> rules = A.getRules();
        ArrayList<String> newRulesA = new ArrayList<>();
        ArrayList<String> newRulesA1 = new ArrayList<>();

        // Checks if there is left recursion or not
        for (String rule : rules) {
            if (rule.startsWith(name)) {
                alphas.add(rule.substring(name.length()));
            } else {
                betas.add(rule);
            }
        }

        // If no left recursion, exit
        if (alphas.size() == 0)
            return;

        if (betas.size() == 0)
            newRulesA.add(newName);

        for (String beta : betas)
            newRulesA.add(beta + newName);

        for (String alpha : alphas)
            newRulesA1.add(alpha + newName);

        // Amends the original rule
        A.setRules(newRulesA);
        newRulesA1.add("\u03B5");

        // Adds new production rule
        NonTerminal newNonTerminal = new NonTerminal(newName);
        newNonTerminal.setRules(newRulesA1);
        nonTerminals.add(newNonTerminal);
    }

    // Eliminates left recursion
    public void applyAlgorithm() {
        int size = nonTerminals.size();
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < i; j++) {
                solveNonImmediateLR(nonTerminals.get(i), nonTerminals.get(j));
            }
            solveImmediateLR(nonTerminals.get(i));
        }
    }

    void printRules() {
        for (NonTerminal nonTerminal : nonTerminals) {
            nonTerminal.printRule();
        }
    }
}

class Main {
    public static void main(String[] args) {
        Grammar grammar = new Grammar();
        grammar.inputData();
        grammar.applyAlgorithm();
        grammar.printRules();
    }
}
```

Python3

```python
class NonTerminal:
    def __init__(self, name):
        self.name = name
        self.rules = []

    def addRule(self, rule):
        self.rules.append(rule)

    def setRules(self, rules):
        self.rules = rules

    def getName(self):
        return self.name

    def getRules(self):
        return self.rules

    def printRule(self):
        print(self.name + " -> ", end="")
        for i in range(len(self.rules)):
            print(self.rules[i], end="")
            if i != len(self.rules) - 1:
                print(" | ", end="")
        print()


class Grammar:
    def __init__(self):
        self.nonTerminals = []

    # Add rules to the grammar; expects the form "S -> Sa | Sb | c | d"
    def addRule(self, rule):
        nt = False  # Becomes True once the head has been read
        parse = ""

        for c in rule:
            if c == ' ':
                if not nt:
                    newNonTerminal = NonTerminal(parse)
                    self.nonTerminals.append(newNonTerminal)
                    nt = True
                    parse = ""
                elif parse != "":
                    self.nonTerminals[-1].addRule(parse)
                    parse = ""
            elif c != '|' and c != '-' and c != '>':
                parse += c
        if parse != "":
            self.nonTerminals[-1].addRule(parse)

    def inputData(self):
        self.addRule("S -> Sa | Sb | c | d")

    # Replaces a leading B in the rules of A with every rule of B
    def solveNonImmediateLR(self, A, B):
        nameB = B.getName()

        rulesA = A.getRules()
        rulesB = B.getRules()
        newRulesA = []

        for rule in rulesA:
            if rule[0:len(nameB)] == nameB:
                for rule1 in rulesB:
                    newRulesA.append(rule1 + rule[len(nameB):])
            else:
                newRulesA.append(rule)
        A.setRules(newRulesA)

    def solveImmediateLR(self, A):
        name = A.getName()
        newName = name + "'"

        alphas = []
        betas = []
        rules = A.getRules()
        newRulesA = []
        newRulesA1 = []

        # Checks if there is left recursion or not
        for rule in rules:
            if rule[0:len(name)] == name:
                alphas.append(rule[len(name):])
            else:
                betas.append(rule)

        # If no left recursion, exit
        if len(alphas) == 0:
            return

        if len(betas) == 0:
            newRulesA.append(newName)

        for beta in betas:
            newRulesA.append(beta + newName)

        for alpha in alphas:
            newRulesA1.append(alpha + newName)

        # Amends the original rule
        A.setRules(newRulesA)
        newRulesA1.append("\u03B5")

        # Adds new production rule
        newNonTerminal = NonTerminal(newName)
        newNonTerminal.setRules(newRulesA1)
        self.nonTerminals.append(newNonTerminal)

    # Eliminates left recursion
    def applyAlgorithm(self):
        size = len(self.nonTerminals)
        for i in range(size):
            for j in range(i):
                self.solveNonImmediateLR(self.nonTerminals[i], self.nonTerminals[j])
            self.solveImmediateLR(self.nonTerminals[i])

    def printRules(self):
        for nonTerminal in self.nonTerminals:
            nonTerminal.printRule()


grammar = Grammar()
grammar.inputData()
grammar.applyAlgorithm()
grammar.printRules()
```

Output

```
S -> cS' | dS'
S' -> aS' | bS' | ε
```

Time Complexity: The time complexity of the algorithm is O(n*s), where n is the number of production rules and s is the maximum string length of a rule.
