Prerequisite – Simplifying Context Free Grammars

A context free grammar (CFG) is in Chomsky Normal Form (CNF) if all production rules satisfy one of the following conditions:

- A non-terminal generating a terminal (e.g., X->x)
- A non-terminal generating two non-terminals (e.g., X->YZ)
- The start symbol generating ε (e.g., S->ε)

Consider the following grammars:

G1 = {S->a, S->AZ, A->a, Z->z}

G2 = {S->a, S->aZ, Z->a}

The grammar G1 is in CNF because every one of its production rules satisfies one of the conditions above. The grammar G2 is not in CNF because the RHS of the production rule S->aZ is a terminal followed by a non-terminal, which matches none of the conditions.
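The three conditions can be checked mechanically. The following is a minimal sketch, not a reference implementation: the function name `is_cnf` and the encoding of a grammar as a dictionary from non-terminals to lists of tuples are assumptions made for illustration, with the empty tuple standing for ε.

```python
def is_cnf(grammar, start):
    """Return True if every production matches one of the three CNF shapes.

    grammar maps each non-terminal to a list of right-hand sides.
    Each right-hand side is a tuple of symbols, and () stands for ε.
    Any symbol that is not a key of grammar is treated as a terminal.
    """
    nonterminals = set(grammar)
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1 and body[0] not in nonterminals:
                continue  # X -> x
            if len(body) == 2 and all(s in nonterminals for s in body):
                continue  # X -> YZ
            if len(body) == 0 and head == start:
                continue  # S -> ε, allowed only for the start symbol
            return False
    return True


G1 = {"S": [("a",), ("A", "Z")], "A": [("a",)], "Z": [("z",)]}
G2 = {"S": [("a",), ("a", "Z")], "Z": [("a",)]}

print(is_cnf(G1, "S"))  # True
print(is_cnf(G2, "S"))  # False, because of S -> aZ
```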

**Note –**

- For a given grammar, there can be more than one equivalent grammar in CNF.
- The CNF grammar generates the same language as the original CFG.
- CNF is used as a preprocessing step for many CFG algorithms, such as CYK (the membership algorithm) and bottom-up parsers.
- Deriving a string w of length n (n ≥ 1) from a CNF grammar takes exactly 2n-1 production steps.
- Any context free grammar that does not have ε in its language has an equivalent grammar in CNF.
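The 2n-1 count in the notes above follows from the shape of the rules. Each rule X->YZ lengthens the sentential form by one symbol, so reaching n non-terminals from the single start symbol takes n-1 such steps, and each of those n non-terminals then needs one rule X->x. As a short worked equation, using G1 from above and the string az:

```latex
\underbrace{(n-1)}_{\text{rules } X \to YZ} \;+\; \underbrace{n}_{\text{rules } X \to x} \;=\; 2n-1,
\qquad
S \Rightarrow AZ \Rightarrow aZ \Rightarrow az \quad (n = 2,\ 3 \text{ steps})
```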

**How to convert CFG to CNF?**

**Step 1.** Eliminate start symbol from RHS.

If the start symbol S appears on the RHS of any production in the grammar, create a new production:

S0->S

where S0 is the new start symbol.
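Step 1 can be sketched in a few lines. The function name `add_new_start`, the default name `S0`, and the dictionary-of-tuples grammar encoding are assumptions of this sketch. The small grammar S->aSb|ab is only an illustration, and the sketch assumes `S0` is not already used as a symbol.

```python
def add_new_start(grammar, start, new_start="S0"):
    """Add new_start -> start when start occurs on some right-hand side.

    grammar maps each non-terminal to a list of right-hand sides (tuples
    of symbols). Returns the (possibly new) grammar and its start symbol.
    """
    appears_on_rhs = any(
        start in body for bodies in grammar.values() for body in bodies
    )
    if not appears_on_rhs:
        return grammar, start  # nothing to do
    augmented = {new_start: [(start,)]}
    augmented.update(grammar)
    return augmented, new_start


g = {"S": [("a", "S", "b"), ("a", "b")]}  # S -> aSb | ab
new_g, new_start = add_new_start(g, "S")
print(new_start)    # S0
print(new_g["S0"])  # [('S',)]
```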

**Step 2.** Eliminate null, unit and useless productions.

If the CFG contains null, unit or useless production rules, eliminate them. Refer to the prerequisite article, Simplifying Context Free Grammars, for how to eliminate these types of production rules.

**Step 3.** Eliminate terminals from the RHS if they appear alongside other terminals or non-terminals.

For example, the production rule X->xY can be decomposed as:

X->ZY

Z->x
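A minimal sketch of this step, assuming the dictionary-of-tuples grammar encoding used for illustration here. The function name `lift_terminals` and the fresh names of the form `T_x` are hypothetical, and the sketch assumes those names do not clash with existing symbols. Unlike the worked example later in the article, which introduces a separate non-terminal for each occurrence in bb, this sketch reuses one new non-terminal per terminal.

```python
def lift_terminals(grammar):
    """Replace terminals inside right-hand sides of length >= 2.

    grammar maps each non-terminal to a list of right-hand sides (tuples
    of symbols). Any symbol that is not a key of grammar is a terminal.
    """
    nonterminals = set(grammar)
    proxies = {}  # terminal -> fresh non-terminal that derives it
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            if len(body) >= 2:
                body = tuple(
                    s if s in nonterminals else proxies.setdefault(s, "T_" + s)
                    for s in body
                )
            new_bodies.append(body)  # bodies of length 1 are left untouched
        result[head] = new_bodies
    for terminal, proxy in proxies.items():
        result[proxy] = [(terminal,)]
    return result


g = {"X": [("x", "Y")], "Y": [("y",)]}
print(lift_terminals(g))
# {'X': [('T_x', 'Y')], 'Y': [('y',)], 'T_x': [('x',)]}
```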

**Step 4.** Eliminate RHS with more than two non-terminals.

For example, the production rule X->XYZ can be decomposed as:

X->PZ

P->XY
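The same decomposition can be applied repeatedly to a RHS of any length. The sketch below groups symbols from the left, as in the X->XYZ example. The function name `binarize` and the fresh names `N1`, `N2`, ... are assumptions, and the sketch assumes those names are not already in use.

```python
def binarize(grammar):
    """Split every right-hand side longer than two symbols into pairs.

    grammar maps each non-terminal to a list of right-hand sides (tuples
    of symbols). Each pass replaces the first two symbols with a fresh
    non-terminal until at most two symbols remain.
    """
    result = {}
    counter = 0
    for head, bodies in grammar.items():
        result.setdefault(head, [])
        for body in bodies:
            while len(body) > 2:
                counter += 1
                fresh = f"N{counter}"
                result[fresh] = [body[:2]]       # fresh -> first two symbols
                body = (fresh,) + body[2:]       # shorten the original body
            result[head].append(body)
    return result


g = {"X": [("X", "Y", "Z")]}
print(binarize(g))
# {'X': [('N1', 'Z')], 'N1': [('X', 'Y')]}
```

Here `N1` plays the role of P in the decomposition above.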

**Example –** Let us convert a CFG to CNF. Consider the following grammar:

S → ASB

A → aAS | a | ε

B → SbS | A | bb

**Step 1.** As the start symbol S appears on the RHS, we create a new production rule S0->S. The grammar becomes:

S0 → S

S → ASB

A → aAS | a | ε

B → SbS | A | bb

**Step 2.** The grammar contains the null production A->ε. Removing it yields:

S0 → S

S → ASB | SB

A → aAS | aS | a

B → SbS | A | ε | bb

This creates the null production B->ε. Removing it yields:

S0 → S

S → AS | ASB | SB | S

A → aAS | aS | a

B → SbS | A | bb
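The two null-removal passes above can be reproduced in one go by first computing every nullable non-terminal and then emitting each RHS with every combination of nullable symbols dropped. This is a sketch under assumptions: the function name `remove_null_productions` and the dictionary-of-tuples encoding (with `()` for ε) are chosen for illustration only.

```python
from itertools import product


def remove_null_productions(grammar, start):
    """Remove ε-productions, keeping start -> ε only if start is nullable."""
    # 1. Find all nullable non-terminals by iterating to a fixed point.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    # 2. For every body, keep or drop each nullable symbol in all ways.
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            options = [(True, False) if s in nullable else (True,) for s in body]
            for keep in product(*options):
                candidate = tuple(s for s, k in zip(body, keep) if k)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    if start in nullable:
        result[start].append(())
    return result


# The example grammar after Step 1.
g = {
    "S0": [("S",)],
    "S": [("A", "S", "B")],
    "A": [("a", "A", "S"), ("a",), ()],
    "B": [("S", "b", "S"), ("A",), ("b", "b")],
}
no_null = remove_null_productions(g, "S0")
print(no_null["S"])  # [('A', 'S', 'B'), ('A', 'S'), ('S', 'B'), ('S',)]
print(no_null["B"])  # [('S', 'b', 'S'), ('A',), ('b', 'b')]
```

The output agrees with the grammar above up to the order of the alternatives.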

The grammar now contains the unit production B->A. Replacing it with the productions of A yields:

S0 → S

S → AS | ASB | SB | S

A → aAS | aS | a

B → SbS | bb | aAS | aS | a

Next, replacing the unit production S0->S with the productions of S yields:

S0 → AS | ASB | SB | S

S → AS | ASB | SB | S

A → aAS | aS | a

B → SbS | bb | aAS | aS | a

Finally, removing the remaining unit productions S->S and S0->S yields:

S0 → AS | ASB | SB

S → AS | ASB | SB

A → aAS | aS | a

B → SbS | bb | aAS | aS | a
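The unit-production removals above can also be done in a single pass: for each non-terminal, collect everything reachable through unit rules alone and inherit the non-unit bodies found there. The function name `remove_unit_productions` and the grammar encoding are assumptions of this sketch.

```python
def remove_unit_productions(grammar):
    """Remove productions of the form X -> Y where Y is a non-terminal."""
    nonterminals = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in nonterminals

    result = {}
    for head in grammar:
        # Non-terminals reachable from head using unit rules only.
        # The list grows while it is scanned, which is what drives the search.
        reachable = [head]
        for current in reachable:
            for body in grammar[current]:
                if is_unit(body) and body[0] not in reachable:
                    reachable.append(body[0])
        # head inherits every non-unit body of those non-terminals.
        new_bodies = []
        for current in reachable:
            for body in grammar[current]:
                if not is_unit(body) and body not in new_bodies:
                    new_bodies.append(body)
        result[head] = new_bodies
    return result


# The example grammar after null productions have been removed.
g = {
    "S0": [("S",)],
    "S": [("A", "S"), ("A", "S", "B"), ("S", "B"), ("S",)],
    "A": [("a", "A", "S"), ("a", "S"), ("a",)],
    "B": [("S", "b", "S"), ("A",), ("b", "b")],
}
no_unit = remove_unit_productions(g)
print(no_unit["S0"])  # [('A', 'S'), ('A', 'S', 'B'), ('S', 'B')]
print(no_unit["B"])
# [('S', 'b', 'S'), ('b', 'b'), ('a', 'A', 'S'), ('a', 'S'), ('a',)]
```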

**Step 3.** In the production rules A->aAS|aS and B->SbS|aAS|aS, the terminals a and b appear on the RHS alongside non-terminals. Replacing them with the new non-terminals X and Y yields:

S0 → AS | ASB | SB

S → AS | ASB | SB

A → XAS | XS | a

B → SYS | bb | XAS | XS | a

X → a

Y → b

Also, B->bb cannot be part of a CNF grammar because its RHS consists of two terminals. Replacing it with B->VV, where V->b, yields:

S0 → AS | ASB | SB

S → AS | ASB | SB

A → XAS | XS | a

B → SYS | VV | XAS | XS | a

X → a

Y → b

V → b

**Step 4.** In the production rule S0->ASB, the RHS has more than two symbols. Replacing AS with the new non-terminal P yields:

S0 → AS | PB | SB

S → AS | ASB | SB

A → XAS | XS | a

B → SYS | VV | XAS | XS | a

X → a

Y → b

V → b

P → AS

Similarly, S->ASB has more than two symbols on its RHS. Replacing AS with the new non-terminal Q yields:

S0 → AS | PB | SB

S → AS | QB | SB

A → XAS | XS | a

B → SYS | VV | XAS | XS | a

X → a

Y → b

V → b

P → AS

Q → AS

Similarly, A->XAS has more than two symbols on its RHS. Replacing XA with the new non-terminal R yields:

S0 → AS | PB | SB

S → AS | QB | SB

A → RS | XS | a

B → SYS | VV | XAS | XS | a

X → a

Y → b

V → b

P → AS

Q → AS

R → XA

Similarly, B->SYS has more than two symbols on its RHS. Replacing SY with the new non-terminal T yields:

S0 → AS | PB | SB

S → AS | QB | SB

A → RS | XS | a

B → TS | VV | XAS | XS | a

X → a

Y → b

V → b

P → AS

Q → AS

R → XA

T → SY

Similarly, B->XAS has more than two symbols on its RHS. Replacing XA with the new non-terminal U yields:

S0 → AS | PB | SB

S → AS | QB | SB

A → RS | XS | a

B → TS | VV | US | XS | a

X → a

Y → b

V → b

P → AS

Q → AS

R → XA

T → SY

U → XA

This is the required CNF for the given grammar.


This article is contributed by **Sonal Tuteja**.