# Regular Expressions, Regular Grammar and Regular Languages

As discussed in Chomsky Hierarchy, Regular Languages are the most restricted types of languages and are accepted by finite automata.

**Regular Expressions**

Regular Expressions are used to denote regular languages. They are built using the following rules:

- ɸ is a regular expression denoting the empty language ɸ.
- ɛ is a regular expression denoting the language {ɛ}.
- If a ∈ Σ (Σ represents the input alphabet), a is a regular expression denoting the language {a}.
- If r and s are regular expressions, r + s is also a regular expression, denoting the union L(r) ∪ L(s). For two symbols a and b, a + b denotes {a, b}.
- If r and s are regular expressions, rs (concatenation of r and s) is also a regular expression.
- If r is a regular expression, r* (r repeated 0 or more times) is also a regular expression.
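The construction rules can be tried out on small languages. The following is a minimal Python sketch, assuming a language is represented as a set of strings truncated to a maximum length. The names `MAX_LEN`, `union`, `concat` and `star` are illustrative choices, not part of any library.

```python
MAX_LEN = 4  # assumption: only strings up to this length are enumerated


def union(l1, l2):
    """Language of r + s."""
    return l1 | l2


def concat(l1, l2, max_len=MAX_LEN):
    """Language of rs, truncated to strings of length <= max_len."""
    return {x + y for x in l1 for y in l2 if len(x) + len(y) <= max_len}


def star(lang, max_len=MAX_LEN):
    """Language of r*, truncated to strings of length <= max_len."""
    result = {""}      # zero repetitions give the empty string
    frontier = {""}
    while frontier:
        # Append one more factor and keep only strings not seen before.
        frontier = concat(frontier, lang, max_len) - result
        result |= frontier
    return result


EMPTY = set()     # language of ɸ
EPSILON = {""}    # language of ɛ
A, B = {"a"}, {"b"}

print(sorted(union(A, B)))         # ['a', 'b']
print(sorted(concat(A, B)))        # ['ab']
print(sorted(star(A), key=len))    # ['', 'a', 'aa', 'aaa', 'aaaa']
print(star(EMPTY) == EPSILON)      # True: the star of ɸ is {ɛ}
```

Because the sets are cut off at `MAX_LEN`, this only approximates infinite languages such as a*, but it is enough to see how each operator combines its operands.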

Description | Regular Expression | Regular Language |
---|---|---|
Set of vowels | ( a ∪ e ∪ i ∪ o ∪ u ) | {a, e, i, o, u} |
a followed by 0 or more b | (a.b^{*}) | {a, ab, abb, abbb, abbbb, …} |
Any no. of vowels followed by any no. of consonants | v^{*}.c^{*} (where v – vowels and c – consonants) | {ε, a, aou, aiou, b, abcd, …} where ε represents the empty string (in the case of 0 vowels and 0 consonants) |
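The three table rows can be checked with Python's standard `re` module. This is a small sketch, assuming lowercase English letters only; note that `re` writes union as `|` (or a character class) and concatenation by juxtaposition, and the helper name `accepts` is hypothetical.

```python
import re

# Set of vowels: a ∪ e ∪ i ∪ o ∪ u
VOWEL = re.compile(r"a|e|i|o|u")

# a followed by 0 or more b
A_THEN_BS = re.compile(r"ab*")

# Any number of vowels followed by any number of consonants.
# The consonant class below is an assumption: lowercase letters minus vowels.
VOWELS_THEN_CONSONANTS = re.compile(r"[aeiou]*[b-df-hj-np-tv-z]*")


def accepts(pattern, s):
    """True if the whole string s belongs to the language of pattern."""
    return pattern.fullmatch(s) is not None


print(accepts(VOWEL, "e"))                       # True
print(accepts(A_THEN_BS, "abbb"))                # True
print(accepts(VOWELS_THEN_CONSONANTS, ""))       # True (the empty string)
print(accepts(VOWELS_THEN_CONSONANTS, "abcd"))   # True
print(accepts(VOWELS_THEN_CONSONANTS, "ba"))     # False (vowel after consonant)
```

`fullmatch` is used rather than `search` because membership in a language is a statement about the whole string, not about a substring.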

**Regular Grammar :** A grammar is regular if all of its rules have the form A -> a, A -> aB or A -> ɛ, where A and B are non-terminals, a is a terminal and ɛ is a special symbol called NULL that denotes the empty string.

**Regular Languages :** A language is regular if it can be expressed by a regular expression.

**Closure Properties of Regular Languages**

**Union :** If L1 and L2 are two regular languages, their union L1 ∪ L2 will also be regular. For example, L1 = {a^{n} | n ≥ 0} and L2 = {b^{n} | n ≥ 0}

L3 = L1 ∪ L2 = {a^{n} | n ≥ 0} ∪ {b^{n} | n ≥ 0} is also regular.

**Intersection :** If L1 and L2 are two regular languages, their intersection L1 ∩ L2 will also be regular. For example,

L1 = {a^{m} b^{n} | n ≥ 0 and m ≥ 0} and L2 = {a^{m} b^{n} | n ≥ 0 and m ≥ 0} ∪ {b^{n} a^{m} | n ≥ 0 and m ≥ 0}

L3 = L1 ∩ L2 = {a^{m} b^{n} | n ≥ 0 and m ≥ 0} is also regular. Here the intersection is L1 itself, because every string of L1 is also in L2.
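One standard way to see why intersection preserves regularity is the product construction on DFAs: run both automata in parallel and accept only when both accept. The sketch below applies it to the example languages; the dictionary layout, state names and helper names are assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ab"

# DFA for L1 = a^m b^n (state names are illustrative).
D1 = {
    "start": "as",
    "accept": {"as", "bs"},
    "delta": {
        ("as", "a"): "as", ("as", "b"): "bs",
        ("bs", "a"): "dead", ("bs", "b"): "bs",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}

# DFA for L2 = a^m b^n ∪ b^n a^m.
D2 = {
    "start": "s",
    "accept": {"s", "A", "B", "AB", "BA"},
    "delta": {
        ("s", "a"): "A", ("s", "b"): "B",
        ("A", "a"): "A", ("A", "b"): "AB",
        ("B", "a"): "BA", ("B", "b"): "B",
        ("AB", "a"): "dead", ("AB", "b"): "AB",
        ("BA", "a"): "BA", ("BA", "b"): "dead",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}


def accepts(dfa, word):
    state = dfa["start"]
    for ch in word:
        state = dfa["delta"][state, ch]
    return state in dfa["accept"]


def intersection(d1, d2, alphabet=ALPHABET):
    """Product DFA: states are pairs, accepting when both components accept."""
    start = (d1["start"], d2["start"])
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        for ch in alphabet:
            nxt = (d1["delta"][p, ch], d2["delta"][q, ch])
            delta[(p, q), ch] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    accept = {(p, q) for p, q in seen
              if p in d1["accept"] and q in d2["accept"]}
    return {"start": start, "accept": accept, "delta": delta}


D3 = intersection(D1, D2)
words = ["".join(t) for n in range(7) for t in product(ALPHABET, repeat=n)]
# L1 ∩ L2 should coincide with L1 on every string tested.
print(all(accepts(D3, w) == accepts(D1, w) for w in words))  # True
```

The check only covers strings up to length 6, so it illustrates the construction rather than proving the equality.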

**Concatenation :** If L1 and L2 are two regular languages, their concatenation L1.L2 will also be regular. For example,

L1 = {a^{n} | n ≥ 0} and L2 = {b^{n} | n ≥ 0}

L3 = L1.L2 = {a^{m} . b^{n} | m ≥ 0 and n ≥ 0} is also regular.

**Kleene Closure :** If L1 is a regular language, its Kleene closure L1* will also be regular. For example,

L1 = (a ∪ b)

L1* = (a ∪ b)*

**Complement :** If L(G) is a regular language, its complement L’(G) will also be regular. The complement of a language can be found by subtracting the strings which are in L(G) from the set of all possible strings over the alphabet. For example, over the alphabet {a},

L(G) = {a^{n} | n > 3}

L’(G) = {a^{n} | n ≤ 3}
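At the automaton level, complementing a regular language amounts to swapping accepting and non-accepting states of a complete DFA. The following is a minimal sketch for the example above, assuming the alphabet is {a}; the dictionary layout and function names are illustrative.

```python
def make_dfa():
    """DFA for {a^n | n > 3}: state i counts the a's read so far, capped at 4."""
    delta = {(i, "a"): min(i + 1, 4) for i in range(5)}
    return {"states": set(range(5)), "start": 0, "accept": {4}, "delta": delta}


def complement(dfa):
    """Flip accepting states. Valid because the DFA is complete
    (every state has a transition for every symbol)."""
    return {**dfa, "accept": dfa["states"] - dfa["accept"]}


def accepts(dfa, word):
    state = dfa["start"]
    for ch in word:
        state = dfa["delta"][state, ch]
    return state in dfa["accept"]


D = make_dfa()
C = complement(D)
for n in (0, 3, 4, 7):
    print(n, accepts(D, "a" * n), accepts(C, "a" * n))
# 0 False True
# 3 False True
# 4 True False
# 7 True False
```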

**Note :** Two regular expressions are equivalent if the languages generated by them are the same. For example, (a+b*)* and (a+b)* generate the same language. Every string which is generated by (a+b*)* is also generated by (a+b)* and vice versa.
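The equivalence of (a+b*)* and (a+b)* can be sanity-checked by enumerating both languages up to a fixed length. This sketch assumes languages are stored as sets of strings no longer than `MAX_LEN`; the helper names are hypothetical, and agreement up to a bound is evidence rather than a proof.

```python
MAX_LEN = 6  # assumption: compare only strings up to this length


def concat(l1, l2):
    return {x + y for x in l1 for y in l2 if len(x) + len(y) <= MAX_LEN}


def star(lang):
    result, frontier = {""}, {""}
    while frontier:
        # Extend by one more factor; stop when nothing new appears.
        frontier = concat(frontier, lang) - result
        result |= frontier
    return result


A, B = {"a"}, {"b"}
left = star(A | star(B))   # (a + b*)*
right = star(A | B)        # (a + b)*

print(left == right)       # True
print(len(left))           # 127 strings of length 0..6 over {a, b}
```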

### How to solve problems on regular expression and regular languages?

**Question 1 :** Which one of the following languages over the alphabet {0,1} is described by the regular expression?

(0+1)*0(0+1)*0(0+1)*

(A) The set of all strings containing the substring 00.

(B) The set of all strings containing at most two 0’s.

(C) The set of all strings containing at least two 0’s.

(D) The set of all strings that begin and end with either 0 or 1.

**Solution :** Option A says that it must have substring 00. But 10101 is also a part of the language and it does not contain 00 as a substring. So it is not the correct option.

Option B says that it can have at most two 0’s, but 00000 is also a part of the language. So it is not the correct option.

Option C says that it must contain at least two 0’s. The regular expression contains two explicit 0’s, each surrounded by (0+1)*, so it generates exactly the strings with at least two 0’s. So this is the correct option.

Option D describes every non-empty string, since any such string begins and ends with either 0 or 1. But strings such as 1 and 01 contain fewer than two 0’s and are not part of the language. So it is not correct.
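The reasoning for Question 1 can be cross-checked by brute force with Python's `re` module (where union is written `|`). This is a sketch that compares the expression with the "at least two 0’s" condition on all strings up to an assumed bound of length 8.

```python
import re
from itertools import product

# (0+1)*0(0+1)*0(0+1)* in Python syntax
PATTERN = re.compile(r"(0|1)*0(0|1)*0(0|1)*")


def in_language(w):
    return PATTERN.fullmatch(w) is not None


words = ["".join(t) for n in range(9) for t in product("01", repeat=n)]

# Option C: at least two 0's.
print(all(in_language(w) == (w.count("0") >= 2) for w in words))  # True

# Counterexamples used against the other options.
print(in_language("10101"), "00" in "10101")  # True False  (rules out A)
print(in_language("00000"))                   # True        (rules out B)
print(in_language("01"))                      # False       (rules out D)
```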

**Question 2 :** Which of the following languages is generated by the given grammar?

S -> aS | bS | ∊

(A) {a^{n}b^{m} | n,m ≥ 0}

(B) {w ∈ {a,b}* | w has equal number of a’s and b’s}

(C) {a^{n} | n ≥ 0} ∪ {b^{n} | n ≥ 0} ∪ {a^{n}b^{n} | n ≥ 0}

(D) {a,b}*

**Solution :** Option (A) says that it will have 0 or more a followed by 0 or more b. But S -> bS => baS => ba is also a part of language. So (A) is not correct.

Option (B) says that it will have an equal no. of a’s and b’s. But S -> bS => b is also a part of the language. So (B) is not correct.

Option (C) says either it will have 0 or more a’s, or 0 or more b’s, or a’s followed by an equal number of b’s. But as shown in option (A), ba is also part of the language. So (C) is not correct.

Option (D) says it can have any number of a’s and any numbers of b’s in any order. So (D) is correct.
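A regular grammar of this form can be run like a finite automaton: the current non-terminal plays the role of the state. The sketch below encodes the grammar S -> aS | bS | ∊ and tests membership; the encoding of productions as `(terminal, next_nonterminal)` pairs and the function name `generates` are assumptions made for illustration.

```python
from itertools import product

# Each production is (terminal, next_nonterminal).
# ("", None) encodes A -> ɛ, and (t, None) would encode A -> t.
GRAMMAR = {"S": [("a", "S"), ("b", "S"), ("", None)]}


def generates(grammar, start, word):
    """True if the right-linear grammar derives word from start."""
    current = {start}          # non-terminals that may still be expanded
    for ch in word:
        nxt = set()
        for nt in current:
            if nt is None:     # derivation already finished, cannot read more
                continue
            for terminal, target in grammar[nt]:
                if terminal == ch:
                    nxt.add(target)
        current = nxt
    # Accept if the derivation finished, or the remaining
    # non-terminal can be erased with an ɛ-production.
    return any(nt is None or ("", None) in grammar[nt] for nt in current)


words = ["".join(t) for n in range(6) for t in product("ab", repeat=n)]
print(all(generates(GRAMMAR, "S", w) for w in words))  # True: matches {a,b}*
print(generates(GRAMMAR, "S", "ba"))                   # True
print(generates(GRAMMAR, "S", "abc"))                  # False: c is not in the alphabet
```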

**Question 3 :** The regular expression 0*(10*)* denotes the same set as

(A) (1*0)*1*

(B) 0 + (0 + 10)*

(C) (0 + 1)* 10(0 + 1)*

(D) none of these

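**Solution :** Two regular expressions are equivalent if the languages generated by them are the same.

Option (A) generates every string over {0,1}, and so does 0*(10*)*: any string can be split into leading 0’s followed by blocks that each start with 1. So they are equivalent.

Option (B) cannot generate 1, because every 1 it produces is followed by a 0, but 0*(10*)* can. So they are not equivalent.

Option (C) will always have 10 as a substring but strings of 0*(10*)*, such as 0, may not. So they are not equivalent.

So option (A) is correct.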
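For equivalence questions like this one, a quick brute-force search for a distinguishing string is a useful check. The sketch below uses Python's `re` module (union written as `|`) and an assumed length bound of 7; `first_difference` is a hypothetical helper that returns the shortest string accepted by exactly one of two patterns, or `None` if none is found within the bound.

```python
import re
from itertools import product

TARGET = re.compile(r"0*(10*)*")
OPTIONS = {
    "A": re.compile(r"(1*0)*1*"),
    "B": re.compile(r"0|(0|10)*"),
    "C": re.compile(r"(0|1)*10(0|1)*"),
}


def first_difference(p, q, max_len=7):
    """Shortest string matched by exactly one of p and q, else None."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            if (p.fullmatch(w) is None) != (q.fullmatch(w) is None):
                return w
    return None


for name, pattern in OPTIONS.items():
    print(name, repr(first_difference(TARGET, pattern)))
# A None
# B '1'
# C ''
```

A result of `None` only means no difference exists up to the bound; for option (A) the equivalence also follows because both expressions generate every string over {0,1}.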

**Question 4 :** The regular expression for the language over the input alphabet {a, b} in which two a’s do not come together:

(A) (b + ab)* + (b +ab)*a

(B) a(b + ba)* + (b + ba)*

(C) both options (A) and (B)

(D) none of the above

**Solution:**

Option (C), stating that both options (A) and (B) are correct regular expressions for the stated language, is the answer.

The language in the question can be expressed as L = {ε, a, b, bb, ab, aba, ba, bab, baba, abab, …}.

In option (A), ‘ab’ is considered the building block for finding the required regular expression. (b + ab)* covers all strings of the language that end with ‘b’ (and the empty string). (b + ab)*a covers all strings of the language that end with ‘a’.

Applying similar logic to option (B), we can see that the regular expression is derived by considering ‘ba’ as the building block: a(b + ba)* covers all strings of the language starting with a, and (b + ba)* covers those starting with b (and the empty string).
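Both options can be compared against the condition "no two consecutive a’s" by enumeration. This is a sketch using Python's `re` module (union written as `|`) on all strings over {a, b} up to an assumed bound of length 8.

```python
import re
from itertools import product

OPTION_A = re.compile(r"(b|ab)*|(b|ab)*a")   # (b + ab)* + (b + ab)*a
OPTION_B = re.compile(r"a(b|ba)*|(b|ba)*")   # a(b + ba)* + (b + ba)*


def no_double_a(w):
    """The language of the question: aa never occurs as a substring."""
    return "aa" not in w


words = ["".join(t) for n in range(9) for t in product("ab", repeat=n)]

ok_a = all((OPTION_A.fullmatch(w) is not None) == no_double_a(w) for w in words)
ok_b = all((OPTION_B.fullmatch(w) is not None) == no_double_a(w) for w in words)
print(ok_a, ok_b)  # True True
```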

This article has been contributed by Sonal Tuteja.
