Regular Expressions, Regular Grammar and Regular Languages

As discussed in the Chomsky hierarchy, regular languages are the most restricted class of languages and are exactly the languages accepted by finite automata.
 
Regular Expressions
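Regular expressions are used to denote regular languages. Over an input alphabet Σ they are defined inductively, where L(r) stands for the language denoted by the expression r:

  • ɸ is a regular expression denoting the empty language ɸ (no strings at all).
  • ɛ is a regular expression denoting the language {ɛ}, which contains only the empty string.
  • If a ∈ Σ, then a is a regular expression denoting the language {a}.
  • If r and s are regular expressions, r + s is also a regular expression, denoting the union L(r) ∪ L(s). For two symbols a and b, a + b denotes {a, b}.
  • If r and s are regular expressions, rs (concatenation of r and s) is also a regular expression, denoting every string of L(r) followed by a string of L(s).
  • If r is a regular expression, r* (zero or more repetitions of r) is also a regular expression.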
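The three operators above map almost directly onto Python's `re` module, which can be handy for experimenting with small expressions. The sketch below is only an illustration: `to_python_regex` and `matches` are hypothetical helper names, and the translation assumes that `+` always means union and that the expression uses no other special characters.

```python
import re

def to_python_regex(textbook: str) -> str:
    """Translate textbook notation into Python `re` syntax (hypothetical helper).

    Assumption: '+' always means union and 'ɛ' means the empty string.
    Concatenation and '*' are written the same way in both notations.
    """
    return textbook.replace(" ", "").replace("+", "|").replace("ɛ", "")

def matches(textbook: str, word: str) -> bool:
    """True if `word` belongs to the language denoted by the expression."""
    return re.fullmatch(to_python_regex(textbook), word) is not None

print(matches("a + b", "a"))     # union: a is in {a, b}
print(matches("ab", "ab"))       # concatenation
print(matches("a*", ""))         # star includes the empty string
print(matches("(a+b)*", "abba")) # any string over {a, b}
```

`re.fullmatch` is used rather than `re.search` because membership in a language is about the whole string, not a substring.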
     
    Regular Grammar : A grammar is regular if every rule has the form A -> a, A -> aB or A -> ɛ, where A and B are non-terminals, a is a terminal and ɛ denotes the empty (NULL) string.
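Because every rule produces at most one terminal followed by at most one non-terminal, membership can be tested by reading the word left to right while tracking which non-terminals could still be pending. The sketch below assumes a hypothetical encoding in which each right-hand side is the string "" (for A -> ɛ), "a" (for A -> a) or "aB" (for A -> aB), with single-character symbols; `derives` is an illustrative name, not a library function.

```python
def derives(rules, start, word):
    """Check whether `word` can be derived from the non-terminal `start`.

    `rules` maps a non-terminal to a list of right-hand sides, each written as
    "" (A -> ɛ), "a" (A -> a) or "aB" (A -> aB). This encoding is an assumption
    made for the sketch.
    """
    DONE = None  # marks a derivation that ended with a rule of the form A -> a
    current = {start}
    for symbol in word:
        following = set()
        for nonterminal in current:
            if nonterminal is DONE:
                continue  # derivation already finished, nothing more can be read
            for rhs in rules.get(nonterminal, []):
                if rhs[:1] == symbol:
                    following.add(rhs[1:] if len(rhs) == 2 else DONE)
        current = following
    # Accept if some derivation has finished, or can finish with A -> ɛ.
    return any(nt is DONE or "" in rules.get(nt, []) for nt in current)

grammar = {"S": ["aS", "bS", ""]}   # S -> aS | bS | ɛ
print(derives(grammar, "S", "ba"))
print(derives(grammar, "S", "abc"))
```

Tracking a set of pending non-terminals is the same idea as simulating a nondeterministic finite automaton whose states are the non-terminals.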
     
    Regular Languages : A language is regular if it can be described by a regular expression. Equivalently, it is generated by a regular grammar or accepted by a finite automaton.
     
    Closure Properties of Regular Languages
    Union : If L1 and L2 are two regular languages, their union L1 ∪ L2 is also regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
    L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.
    Intersection : If L1 and L2 are two regular languages, their intersection L1 ∩ L2 is also regular. For example,
    L1 = {a^m b^n | n ≥ 0 and m ≥ 0} and L2 = {a^m b^n | n ≥ 0 and m ≥ 0} ∪ {b^n a^m | n ≥ 0 and m ≥ 0}
    L3 = L1 ∩ L2 = {a^m b^n | n ≥ 0 and m ≥ 0} is also regular. Here L1 is a subset of L2, so the intersection is L1 itself.
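One standard way to see why union and intersection preserve regularity is the product construction: run a DFA for each language in lockstep and decide acceptance from the pair of states reached. The sketch below is a minimal illustration using hand-written DFAs for a* and b* over the alphabet {a, b}; the dictionary layout and the names `product_dfa` and `accepts` are assumptions made for this example.

```python
ALPHABET = "ab"

# DFA for a*: state 0 accepts, state 1 is a dead state entered on the first b.
A_STAR = {"start": 0, "accept": {0},
          "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}}
# DFA for b*: state 0 accepts, state 1 is a dead state entered on the first a.
B_STAR = {"start": 0, "accept": {0},
          "delta": {(0, "b"): 0, (0, "a"): 1, (1, "a"): 1, (1, "b"): 1}}

def product_dfa(d1, d2, both):
    """Product of two DFAs: both=True gives intersection, both=False gives union."""
    states1 = {q for q, _ in d1["delta"]}
    states2 = {q for q, _ in d2["delta"]}

    def accepting(p, q):
        in1, in2 = p in d1["accept"], q in d2["accept"]
        return (in1 and in2) if both else (in1 or in2)

    delta = {((p, q), c): (d1["delta"][(p, c)], d2["delta"][(q, c)])
             for p in states1 for q in states2 for c in ALPHABET}
    accept = {(p, q) for p in states1 for q in states2 if accepting(p, q)}
    return {"start": (d1["start"], d2["start"]), "accept": accept, "delta": delta}

def accepts(dfa, word):
    """Follow the transitions for `word` and report whether the end state accepts."""
    state = dfa["start"]
    for c in word:
        state = dfa["delta"][(state, c)]
    return state in dfa["accept"]

union = product_dfa(A_STAR, B_STAR, both=False)
intersection = product_dfa(A_STAR, B_STAR, both=True)
print(accepts(union, "aaa"), accepts(union, "ab"))
print(accepts(intersection, ""), accepts(intersection, "a"))
```

For these two languages the union is a* ∪ b* and the intersection contains only the empty string, since that is the only word made entirely of a's and entirely of b's.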
    Concatenation : If L1 and L2 are two regular languages, their concatenation L1.L2 is also regular. For example,
    L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
    L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.
    Kleene Closure : If L1 is a regular language, its Kleene closure L1* is also regular. For example,
    L1 = {a, b}, denoted by the regular expression (a + b)
    L1* = {a, b}*, the set of all strings over a and b, denoted by (a + b)*
    Complement : If L(G) is a regular language, its complement L’(G) is also regular. The complement is obtained by removing the strings of L(G) from the set of all possible strings over the alphabet. For example, over the alphabet {a},
    L(G) = {a^n | n > 3}
    L’(G) = {a^n | n ≤ 3}
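Closure under complement can be illustrated with a DFA: if the automaton is complete (every state has a transition on every symbol), swapping accepting and non-accepting states yields an automaton for the complement. The sketch below assumes the one-letter alphabet {a} and a hand-built DFA for the example language; the names `MORE_THAN_THREE`, `complement` and `accepts` are illustrative only.

```python
# DFA over the alphabet {a}: state i means "i letters read so far",
# and state 4 means "four or more letters read".
STATES = {0, 1, 2, 3, 4}
MORE_THAN_THREE = {
    "start": 0,
    "accept": {4},
    "delta": {(i, "a"): min(i + 1, 4) for i in STATES},
}

def complement(dfa, states):
    """Same DFA with accepting and non-accepting states swapped."""
    return {"start": dfa["start"],
            "accept": states - dfa["accept"],
            "delta": dfa["delta"]}

def accepts(dfa, word):
    """Follow the transitions for `word` and report whether the end state accepts."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accept"]

AT_MOST_THREE = complement(MORE_THAN_THREE, STATES)
print(accepts(MORE_THAN_THREE, "aaaa"), accepts(AT_MOST_THREE, "aaaa"))
print(accepts(MORE_THAN_THREE, "aaa"), accepts(AT_MOST_THREE, "aaa"))
```

The swap only works on a deterministic, complete automaton; applying it to an NFA does not in general give the complement.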

    Note : Two regular expressions are equivalent if the languages generated by them are the same. For example, (a+b*)* and (a+b)* generate the same language, namely every string over {a,b}. Every string generated by (a+b*)* is also generated by (a+b)* and vice versa.
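A quick way to gain confidence in such an equivalence is to compare the two expressions on every string up to some length. The sketch below does this with Python's `re` module, writing `+` as `|`. It is a bounded sanity check under that translation, not a proof, and `language_up_to` is a hypothetical helper name.

```python
import re
from itertools import product

def language_up_to(pattern, alphabet, max_len):
    """All words over `alphabet` of length <= max_len that `pattern` fully matches."""
    compiled = re.compile(pattern)
    return {
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(letters))
    }

# '+' in the text is written '|' in Python's `re` syntax.
same = language_up_to(r"(a|b*)*", "ab", 6) == language_up_to(r"(a|b)*", "ab", 6)
print(same)
```

A mismatch at any length would disprove equivalence outright; agreement up to a bound only suggests it. A full proof would compare minimal DFAs instead.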

    How to solve problems on regular expression and regular languages?

    Question 1 : Which one of the following languages over the alphabet {0,1} is described by the regular expression?
    (0+1)*0(0+1)*0(0+1)*
    (A) The set of all strings containing the substring 00.
    (B) The set of all strings containing at most two 0’s.
    (C) The set of all strings containing at least two 0’s.
    (D) The set of all strings that begin and end with either 0 or 1.

    Solution :
    Option A says that it must have the substring 00. But 10101 is also part of the language and does not contain 00 as a substring. So it is not the correct option.
    Option B says that it can have at most two 0’s, but 00000 is also part of the language. So it is not the correct option.
    Option C says that it must contain at least two 0’s. The expression contains two explicit 0’s, and each (0+1)* around them can generate any string, so it generates exactly the strings with at least two 0’s. So this is the correct option.
    Option D says that it contains all strings that begin and end with either 0 or 1, which is every non-empty string. But strings such as 1 and 11 are not part of the language, since they do not contain two 0’s. So it is not correct.
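The reasoning for option C can be sanity-checked by brute force: enumerate every binary string up to a chosen length and confirm that the expression matches exactly those with at least two 0's. The sketch below uses Python's `re` module with `+` written as `|`; the function name and the length bound are arbitrary choices for illustration.

```python
import re
from itertools import product

PATTERN = re.compile(r"(0|1)*0(0|1)*0(0|1)*")

def agrees_with_option_c(max_len):
    """True if, for every binary word of length <= max_len, the expression
    matches exactly when the word contains at least two 0's."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            word = "".join(bits)
            in_language = PATTERN.fullmatch(word) is not None
            if in_language != (word.count("0") >= 2):
                return False
    return True

print(agrees_with_option_c(10))
print(PATTERN.fullmatch("10101") is not None, "00" in "10101")
```

The second line reproduces the counterexample to option A: 10101 is matched although it has no 00 substring.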
     
    Question 2 : Which of the following languages is generated by given grammar?
    S -> aS | bS | ∊
    (A) {a^n b^m | n,m ≥ 0}
    (B) {w ∈ {a,b}* | w has equal number of a’s and b’s}
    (C) {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} ∪ {a^n b^n | n ≥ 0}
    (D) {a,b}*

    Solution : Option (A) says that it will have 0 or more a’s followed by 0 or more b’s. But S => bS => baS => ba shows that ba is also part of the language. So (A) is not correct.
    Option (B) says that it will have an equal number of a’s and b’s. But S => bS => b shows that b is also part of the language. So (B) is not correct.
    Option (C) says it will have either 0 or more a’s, or 0 or more b’s, or a’s followed by an equal number of b’s. But as shown for option (A), ba is also part of the language. So (C) is not correct.
    Option (D) says it can have any number of a’s and any number of b’s in any order. Each step of a derivation adds either an a or a b, and S -> ∊ can end it at any point, so every string over {a,b} is generated. So (D) is correct.
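The claim that the grammar generates every string over {a,b} can be checked for short strings by enumerating derivations. The sketch below hard-codes the three rules of this grammar and compares the result with all strings up to a chosen length; `generated_words` and `all_words` are hypothetical names used only here.

```python
from itertools import product

def generated_words(max_len):
    """Words of length <= max_len derivable from S -> aS | bS | ɛ."""
    words = set()
    frontier = [""]  # each prefix p stands for the sentential form p + "S"
    while frontier:
        prefix = frontier.pop()
        words.add(prefix)                  # apply S -> ɛ to finish here
        if len(prefix) < max_len:
            frontier.append(prefix + "a")  # apply S -> aS
            frontier.append(prefix + "b")  # apply S -> bS
    return words

def all_words(max_len):
    """Every string over {a, b} of length <= max_len."""
    return {"".join(w) for n in range(max_len + 1) for w in product("ab", repeat=n)}

print(generated_words(5) == all_words(5))
print("ba" in generated_words(2), "b" in generated_words(2))
```

The two membership checks correspond to the counterexamples ba and b used against options (A), (B) and (C).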
     
    Question 3 : The regular expression 0*(10*)* denotes the same set as
    (A) (1*0)*1*
    (B) 0 + (0 + 10)*
    (C) (0 + 1)* 10(0 + 1)*
    (D) none of these

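    Solution :
    Two regular expressions are equivalent if the languages generated by them are the same.
    0*(10*)* generates every string over {0,1}: any string can be split into its leading 0’s followed by blocks that each consist of a 1 and the 0’s after it. For example, 101 = (10)(1) and 0100 = (0)(100).
    Option (A) also generates every string over {0,1}: any string can be split into blocks that each end at a 0, followed by the trailing 1’s. So they are equivalent and (A) is the correct option.
    Option (B) cannot generate 1, but 0*(10*)* can. So they are not equivalent.
    Option (C) always has 10 as a substring, but 0*(10*)* also generates strings without it, such as ɛ and 01. So they are not equivalent.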
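A bounded brute-force comparison is one way to sanity-check each option against 0*(10*)*. The sketch below enumerates every binary string up to a chosen length and compares the sets matched by each expression, with `+` written as `|` for Python's `re` module. Agreement up to a bound is evidence rather than proof, and the names `language_up_to` and `compare` are assumptions of this example.

```python
import re
from itertools import product

def language_up_to(pattern, max_len):
    """All binary words of length <= max_len that `pattern` fully matches."""
    compiled = re.compile(pattern)
    words = ("".join(bits)
             for n in range(max_len + 1)
             for bits in product("01", repeat=n))
    return {w for w in words if compiled.fullmatch(w)}

TARGET = r"0*(10*)*"
OPTIONS = {"A": r"(1*0)*1*", "B": r"0|(0|10)*", "C": r"(0|1)*10(0|1)*"}

def compare(max_len=8):
    """Map each option to whether it agrees with TARGET on all short words."""
    target = language_up_to(TARGET, max_len)
    return {name: language_up_to(pattern, max_len) == target
            for name, pattern in OPTIONS.items()}

print(compare())
print(re.fullmatch(TARGET, "101") is not None, re.fullmatch(TARGET, "0100") is not None)
```

On this check only option (A) agrees with 0*(10*)*, and both 101 and 0100 are matched by 0*(10*)*.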
     
    This article has been contributed by Sonal Tuteja.
     