Given two strings S and P where S consists of only lowercase English alphabets while P consists of lowercase English alphabets as well as special characters ‘.’ and ‘*’, the task is to implement a function to test regular expression such that:
-
'.'
Matches any single character. -
'*'
Matches zero or more of the preceding element.
Note: For each appearance of the character ‘
*'
, there will be a previous valid character to match.
Examples:
Input:
s = "aaa", p = "a"
Output:
false
Explanation:
"a" does not match the entire string "aaa".
Input:
s = "abb", p = "a.*"
Output:
true
Explanation:
replace
.
with b then p becomes ab* now replace * with one preceeding character hence p becomes abb.
Approach: To solve the problem follow the below observations:
- Suppose: s = “aa” and p = “aa”, since all the characters in both s and p are the same, it’s a match.
- Suppose: s = “aabb” and p = “aab*”, We know that substrings bb and b* match because * can be replaced by one b. Since we already know that the remaining substrings “aa” and “aa” are matched, the whole strings are also a match.
What can we infer from this? Right, if we have a solution of part of a problem, we can use that partial result and can go forward. Also, we can use the already calculated result without calculating it again.
Does this ring a bell ????? Yes, this problem satisfies the following two properties –
- Optimal Substructure — Any problem has optimal substructure property if its overall optimal solution can be constructed from the optimal solutions of its subproblems.
- Overlapping Subproblems — Any problem has overlapping sub-problems if finding its solution involves solving the same subproblem multiple times.
It is now evident that we can use Dynamic Programming to solve this problem. Below are the steps —
-
Create a boolean 2D
dp
array with sizes asboolean[][] dp = new boolean[s.length() + 1][p.length() + 1]
. We are adding extra 1 to incorporate the case in case either or both of the strings are empty. -
If both strings are empty, then it’s a match, thus,
dp[0][0] = true
. -
Let’s take an example
s = "aab"
andp = "c*a*b"
and create a DP table.
c |
* |
a |
* |
b |
|||
---|---|---|---|---|---|---|---|
0 |
1 |
2 |
3 |
4 |
5 |
||
0 |
TRUE |
FALSE |
TRUE |
FALSE |
TRUE |
FALSE |
|
a |
1 |
FALSE |
FALSE |
FALSE |
TRUE |
TRUE |
FALSE |
a |
2 |
FALSE |
FALSE |
FALSE |
FALSE |
TRUE |
FALSE |
b |
3 |
FALSE |
FALSE |
FALSE |
FALSE |
FALSE |
TRUE |
-
First column — it means
p
is empty and it will match tos
only ifs
is also empty which we have stored indp[0][0]
. Thus, remaining values of dp[0][i] will befalse
. -
First row — this is not so easy. It means which
p
matches emptys
. The answer is either an empty pattern or a pattern that represents an empty string such as"a*"
,"x*y*"
,"l*m*n*"
and so on. In the above example, ifs = ""
andp = "c*"
, then due to*
,c
can be replaced by 0c
s which gives us an empty string. Hence,dp[0][2] = true
. -
For non-empty strings, let’s say that
s[i - 1] == p[j - 1]
this means the (i – 1)th and (j – 1)th characters are same. This means, we have to check if the remaining strings are a match or not. If they are a match, then the current substrings will be a match, otherwise they won’t be a match i.e.,dp[i][j] = dp[i - 1][j - 1]
. We’re taking (i – 1)th and (j – 1)th characters to offset empty strings as we’re assuming our strings start from index 1. -
If
p[j - 1] == "."
, then it means any single character can be matched. Therefore, here also, we will have to check if the remaining string is a match or not. Thus,dp[i][j] = dp[i - 1][j - 1]
. -
If
p[j - 1] == "*"
, then it means either it’s represents an empty string (0 characters), thusdp[i][j] = dp[i][j - 2]
ors[i - 1] == p[j - 2] || p[j - 2] == "."
, then current character of string equals the char preceding ‘*'
in pattern so the result isdp[i-1][j]
.
Below is the implementation of above approach:
#include <iostream> #include <vector> using namespace std;
bool isMatch(string s, string p) {
int rows = s.length();
int columns = p.length();
// Base conditions
if (rows == 0 && columns == 0) {
return true ;
}
if (columns == 0) {
return false ;
}
// DP array
vector<vector< bool >> dp(rows + 1, vector< bool >(columns + 1, false ));
// Empty string and empty pattern are a match
dp[0][0] = true ;
// Deals with patterns with *
for ( int i = 2; i <= columns; i++) {
if (p[i - 1] == '*' ) {
dp[0][i] = dp[0][i - 2];
}
}
// For remaining characters
for ( int i = 1; i <= rows; i++) {
for ( int j = 1; j <= columns; j++) {
if (s[i - 1] == p[j - 1] || p[j - 1] == '.' ) {
dp[i][j] = dp[i - 1][j - 1];
} else if (j > 1 && p[j - 1] == '*' ) {
dp[i][j] = dp[i][j - 2];
if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
dp[i][j] = dp[i][j] || dp[i - 1][j];
}
}
}
}
return dp[rows][columns];
} int main() {
cout << boolalpha << isMatch( "aab" , "a.*" ) << endl;
return 0;
} |
// Java Code for above approach import java.io.*;
import java.util.Scanner;
public class RegularExpressionMatching {
public static boolean isMatch(String s, String p)
{
int rows = s.length();
int columns = p.length();
/// Base conditions
if (rows == 0 && columns == 0 ) {
return true ;
}
if (columns == 0 ) {
return false ;
}
// DP array
boolean [][] dp = new boolean [rows + 1 ][columns + 1 ];
// Empty string and empty pattern
// are a match
dp[ 0 ][ 0 ] = true ;
// Deals with patterns with *
for ( int i = 2 ; i < columns + 1 ; i++) {
if (p.charAt(i - 1 ) == '*' ) {
dp[ 0 ][i] = dp[ 0 ][i - 2 ];
}
}
// For remaining characters
for ( int i = 1 ; i < rows + 1 ; i++) {
for ( int j = 1 ; j < columns + 1 ; j++) {
if (s.charAt(i - 1 ) == p.charAt(j - 1 )
|| p.charAt(j - 1 ) == '.' ) {
dp[i][j] = dp[i - 1 ][j - 1 ];
}
else if (j > 1 && p.charAt(j - 1 ) == '*' ) {
dp[i][j] = dp[i][j - 2 ];
if (p.charAt(j - 2 ) == '.'
|| p.charAt(j - 2 )
== s.charAt(i - 1 )) {
dp[i][j] = dp[i][j] | dp[i - 1 ][j];
}
}
}
}
return dp[rows][columns];
}
public static void main(String[] args)
{
System.out.println(isMatch("aab", "a.*"));
}
} |
def isMatch(s: str , p: str ) - > bool :
rows, columns = ( len (s), len (p))
# Base conditions
if rows = = 0 and columns = = 0 :
return True
if columns = = 0 :
return False
# DP array
dp = [[ False for j in range (columns + 1 )] for i in range (rows + 1 )]
# Since empty string and empty pattern are a match
dp[ 0 ][ 0 ] = True
# Deals with patterns containing *
for i in range ( 2 , columns + 1 ):
if p[i - 1 ] = = '*' :
dp[ 0 ][i] = dp[ 0 ][i - 2 ]
# For remaining characters
for i in range ( 1 , rows + 1 ):
for j in range ( 1 , columns + 1 ):
if s[i - 1 ] = = p[j - 1 ] or p[j - 1 ] = = '.' :
dp[i][j] = dp[i - 1 ][j - 1 ]
elif j > 1 and p[j - 1 ] = = '*' :
dp[i][j] = dp[i][j - 2 ]
if p[j - 2 ] = = '.' or p[j - 2 ] = = s[i - 1 ]:
dp[i][j] = dp[i][j] or dp[i - 1 ][j]
return dp[rows][columns]
if __name__ = = '__main__' :
print (isMatch("aa", "a"))
print (isMatch("aa", "a * "))
print (isMatch("ab", "."))
print (isMatch("aab", "c * a * b"))
print (isMatch("mississippi", "mis * is * p * ."))
print (isMatch("", ". * "))
|
using System;
public class RegularExpressionMatching
{ public static bool IsMatch( string s, string p)
{
int rows = s.Length;
int columns = p.Length;
// Base conditions
if (rows == 0 && columns == 0)
{
return true ;
}
if (columns == 0)
{
return false ;
}
// DP array
bool [][] dp = new bool [rows + 1][];
for ( int i = 0; i <= rows; i++)
{
dp[i] = new bool [columns + 1];
}
// Empty string and empty pattern are a match
dp[0][0] = true ;
// Deals with patterns with *
for ( int i = 2; i <= columns; i++)
{
if (p[i - 1] == '*' )
{
dp[0][i] = dp[0][i - 2];
}
}
// For remaining characters
for ( int i = 1; i <= rows; i++)
{
for ( int j = 1; j <= columns; j++)
{
if (s[i - 1] == p[j - 1] || p[j - 1] == '.' )
{
dp[i][j] = dp[i - 1][j - 1];
}
else if (j > 1 && p[j - 1] == '*' )
{
dp[i][j] = dp[i][j - 2];
if (p[j - 2] == '.' || p[j - 2] == s[i - 1])
{
dp[i][j] = dp[i][j] || dp[i - 1][j];
}
}
}
}
return dp[rows][columns];
}
public static void Main( string [] args)
{
Console.WriteLine(IsMatch( "aab" , "a.*" ));
}
} |
var isMatch = function (s, p) {
const rows = s.length;
const columns = p.length;
/// Base conditions
if (rows == 0 && columns == 0) {
return true ;
}
if (columns == 0) {
return false ;
}
// DP array
const dp = Array.from({ length: s.length + 1 }, () => [ false ]);
// Empty string and empty pattern are a match
dp[0][0] = true ;
// Deals with patterns with *
for (let i = 1; i < columns + 1; i++) {
if (p[i - 1] === '*' ) {
dp[0][i] = dp[0][i - 2];
}
else {
dp[0][i] = false ;
};
}
// For remaining characters
for (let i = 1; i < rows + 1; i++) {
for (let j = 1; j < columns + 1; j++) {
if (p[j - 1] === '*' ) {
if (p[j - 2] === s[i - 1] || p[j - 2] === '.' ) {
dp[i][j] = dp[i][j - 2] || dp[i - 1][j];
} else {
dp[i][j] = dp[i][j - 2];
}
} else if (p[j - 1] === s[i - 1] || p[j - 1] === '.' ) {
dp[i][j] = dp[i - 1][j - 1];
} else {
dp[i][j] = false ;
}
}
}
return dp[rows][columns];
}; console.log(isMatch("aa", "a")); console.log(isMatch("aa", "a*")); console.log(isMatch("an", ".")); console.log(isMatch("aab", "c*a*b")); console.log(isMatch("mississippi", "mis*is*p*.")); console.log(isMatch("", ".*")); console.log(isMatch("ab", ".*c")); |
true
Time Complexity: O(m×n)
Auxiliary Space: O(m×n)