
Parikh’s Theorem

Introduction :
Parikh’s theorem in theoretical computer science says that if one looks only at the number of occurrences of each terminal symbol in the words of a context-free language, without regard to their order, then the language is indistinguishable from a regular language. It is useful for showing that strings with a given number of each terminal symbol are not generated by a context-free grammar. It was first proved by Rohit Parikh in 1961 and republished in 1966.

 Theorem :
Parikh’s theorem states that the Parikh image of a context-free language is semi-linear or, equivalently, that every context-free language has the same Parikh image as some regular language. There is a simple construction that, given a context-free grammar, produces a finite automaton recognizing such a regular language.
The theorem can also be proved simply by using a strengthened form of the pumping lemma for context-free languages.
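The statement above relies on the notion of a semi-linear set. As a rough illustration, assuming the usual definition (a linear set is { base + n1·p1 + … + nm·pm | ni ≥ 0 } for fixed vectors, and a semi-linear set is a finite union of linear sets), the sketch below tests membership by brute force. The names `in_linear_set`, `in_semilinear_set`, `DIAGONAL` and the bound `max_coeff` are hypothetical, chosen only for this example.

```python
from itertools import product

def in_linear_set(vector, base, periods, max_coeff=20):
    """Check whether vector == base + sum(n_i * periods[i]) for some
    coefficients 0 <= n_i <= max_coeff (bounded brute-force search)."""
    for coeffs in product(range(max_coeff + 1), repeat=len(periods)):
        candidate = tuple(
            b + sum(c * p[i] for c, p in zip(coeffs, periods))
            for i, b in enumerate(base)
        )
        if candidate == tuple(vector):
            return True
    return False

def in_semilinear_set(vector, linear_sets, max_coeff=20):
    """A semi-linear set is given here as a list of (base, periods) pairs."""
    return any(
        in_linear_set(vector, base, periods, max_coeff)
        for base, periods in linear_sets
    )

# The set {(n, n) | n >= 0}: base (0, 0) with the single period (1, 1).
DIAGONAL = [((0, 0), [(1, 1)])]

print(in_semilinear_set((3, 3), DIAGONAL))  # True
print(in_semilinear_set((3, 2), DIAGONAL))  # False
```

The search is only exhaustive up to `max_coeff`, so it is a demonstration of the definition rather than a general decision procedure.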



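1. Parikh Image –
The Parikh image of a word records how many times each letter of the alphabet occurs in it, ignoring the order of the letters.

   Examples –

  1. Π{a,b,c}(bccba) = (1, 2, 2) where (1, 2, 2) stands for {(a, 1),(b, 2),(c, 2)}.
  2. Π{a,b,c}(cabaaabb) = (4, 3, 1).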
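The two examples above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `parikh_vector` and the convention of passing the alphabet as an ordered string are assumptions made for illustration.

```python
from collections import Counter

def parikh_vector(word, alphabet):
    """Return the occurrence count of each alphabet letter in word,
    listed in the order in which the letters appear in alphabet."""
    counts = Counter(word)
    unknown = set(counts) - set(alphabet)
    if unknown:
        raise ValueError(f"letters not in alphabet: {sorted(unknown)}")
    # Counter returns 0 for letters that do not occur in the word.
    return tuple(counts[letter] for letter in alphabet)

print(parikh_vector("bccba", "abc"))     # (1, 2, 2)
print(parikh_vector("cabaaabb", "abc"))  # (4, 3, 1)
```

Words that are rearrangements of each other, such as `"bccba"` and `"abbcc"`, get the same vector, which is exactly the sense in which order is disregarded.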

2. Derivation –
Let ∑ = {a1, a2, …, ak} be an alphabet. The Parikh vector of a word is given by the function p: ∑* -> N^k, defined by p(w) = (|w|_a1, |w|_a2, …, |w|_ak), where |w|_ai denotes the number of occurrences of the letter ai in the word w.

For Bounded Languages :
A language L is bounded if L is a subset of w1* … wk* for some fixed words w1, …, wk. Ginsburg and Spanier gave a necessary and sufficient condition, similar to Parikh’s theorem, for bounded languages.
The Ginsburg-Spanier theorem says that a bounded language L is context-free if and only if {(n1, …, nk) | w1^n1 … wk^nk ∈ L} is a stratified semi-linear set.
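To make the exponent set in the Ginsburg-Spanier theorem concrete, the sketch below lists every exponent vector (n1, …, nk) that writes a given word as a product of powers of the fixed words. The helper name `exponent_vectors` is hypothetical, and the code only enumerates exponents; it does not test whether the resulting set is stratified or semi-linear.

```python
def exponent_vectors(word, blocks):
    """Yield every tuple (n1, ..., nk) such that
    word == blocks[0]*n1 + blocks[1]*n2 + ... + blocks[k-1]*nk."""
    if any(block == "" for block in blocks):
        raise ValueError("the fixed words must be non-empty")
    if not blocks:
        if word == "":
            yield ()
        return
    first, rest = blocks[0], blocks[1:]
    n, prefix = 0, ""
    # Try n = 0, 1, 2, ... copies of the first block as a prefix.
    while word.startswith(prefix):
        for tail in exponent_vectors(word[len(prefix):], rest):
            yield (n,) + tail
        n += 1
        prefix += first

# With w1 = "a" and w2 = "b", the word "aabb" has exponent vector (2, 2).
print(list(exponent_vectors("aabb", ("a", "b"))))  # [(2, 2)]
# "aba" is not in a* b*, so there is no exponent vector at all.
print(list(exponent_vectors("aba", ("a", "b"))))   # []
```

For the bounded language of words a^n b^n inside a* b*, the vectors collected this way all have the form (n, n).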

Example –

Parikh vector :
P(‘000111’) = (3, 3)

For a string w over the alphabet {0, 1}, the Parikh vector records the number of 0s, |w|_0, and the number of 1s, |w|_1:
P(w) = (|w|_0, |w|_1)
So, P(‘01’) = (1, 1)
and similarly, P(‘0011’) = (2, 2)
P(L) is the set of Parikh vectors of the words in L.
Then, for L = {‘01’, ‘0011’}, P(L) = { P(‘01’), P(‘0011’) } = { (1, 1), (2, 2) }
Since this set of Parikh vectors is finite, we can easily construct a DFA for a language with the same Parikh image, i.e. L is commutatively equivalent to some regular language.
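Commutative equivalence can also be observed on an infinite language. As a standard illustration (not part of the example above), the context-free language of words 0^n 1^n and the regular language (01)* have the same Parikh image, namely all vectors (n, n). The sketch below compares finite samples of both languages; the names `parikh_image` and `BOUND` are assumptions made for this example, and a finite sample can only suggest, not prove, the equality.

```python
def parikh_vector(word, alphabet="01"):
    """Count the occurrences of each alphabet letter in word."""
    return tuple(word.count(letter) for letter in alphabet)

def parikh_image(words, alphabet="01"):
    """Set of Parikh vectors of a finite collection of words."""
    return {parikh_vector(word, alphabet) for word in words}

BOUND = 5  # sample the languages for n = 0, ..., BOUND

# Words 0^n 1^n (context-free, not regular).
context_free_sample = ["0" * n + "1" * n for n in range(BOUND + 1)]
# Words (01)^n (regular).
regular_sample = ["01" * n for n in range(BOUND + 1)]

print(parikh_image(context_free_sample) == parikh_image(regular_sample))  # True
```

The two samples contain different words for n ≥ 2, yet their sets of Parikh vectors coincide.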

Some corollaries :

Significance :
The theorem has several consequences. It shows that a context-free language over a singleton alphabet must be a regular language. It can also be used to show that some context-free languages can only have ambiguous grammars. Such languages are called inherently ambiguous languages. From a formal grammar perspective, this means that some ambiguous context-free grammars cannot be converted to equivalent unambiguous context-free grammars.
