Python – Wildcard Substring search

Sometimes, while working with Python strings, we need to search for a substring in which some characters are unknown, and we still want to find the match. This problem arises in many domains. Let's discuss several ways in which this task can be performed.

Method #1 : Using re.search() This is one way to perform the task. The substring, in which '.' acts as a wildcard for any single character other than a newline, is compiled as a regular expression and searched for in the main string using search(), which returns the first match.

Python3

# Python3 code to demonstrate working of
# Wildcard Substring search
# Using re.search()
import re
     
# initializing string
test_str = 'geeksforgeeks is best for geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
# Wildcard Substring search
# Using re.search()
temp = re.compile(sub_str)
res = temp.search(test_str)
 
# printing result
print("The substring match is : " + str(res.group(0)))

                    
Output : 
The original string is : geeksforgeeks is best for geeks                                                                
The substring match is : best     
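
The listing above assumes that a match exists. re.search() returns None when nothing matches, so calling group(0) on the result would then raise an AttributeError. A minimal sketch of a guarded version follows; the helper name wildcard_search is hypothetical and not part of the original code.

```python
import re

def wildcard_search(pattern, text):
    """Return the first substring of text matching pattern, or None."""
    match = re.search(pattern, text)
    # re.search() returns None when there is no match, so check before calling group()
    return match.group(0) if match else None

print(wildcard_search('..st', 'geeksforgeeks is best for geeks'))  # best
print(wildcard_search('..zz', 'geeksforgeeks is best for geeks'))  # None
```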

Method #2 : Using re.finditer() This is another way to solve the problem. finditer() yields a match object for every non-overlapping match, so the position of each match can also be extracted if required.

Python3

# Python3 code to demonstrate working of
# Wildcard Substring search
# Using re.finditer()
import re
     
# initializing string
test_str = 'geeksforgeeks is best for geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
# Wildcard Substring search
# Using re.finditer()
temp = re.compile(sub_str)

# printing each match together with its start index
for match in temp.finditer(test_str):
    print("The substring match is : " + match.group(0) + " at index " + str(match.start()))

                    
Output : 
The original string is : geeksforgeeks is best for geeks
The substring match is : best at index 17
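
The sample string contains only one match, so it does not show what re.finditer() adds over re.search(). The sketch below uses a different, made-up string with several matches and collects the start index, end index and text of each one; the helper name wildcard_positions is hypothetical.

```python
import re

def wildcard_positions(pattern, text):
    """Return (start, end, substring) for every non-overlapping match."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]

print(wildcard_positions('..st', 'best test rest'))
# [(0, 4, 'best'), (5, 9, 'test'), (10, 14, 'rest')]
```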

The time and space complexity are the same for both methods above:

Time Complexity: O(n) for a short fixed-length pattern such as '..st', where n is the length of the string

Space Complexity: O(n)

Method #3 : Using re.findall() This approach also uses the re library. It is similar to re.search(), but re.findall() returns a list of all non-overlapping matches as strings.

Python3

import re
 
# initializing string
test_str = 'geeksforgeeks is best for geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
# Wildcard Substring search using re.findall()
temp = re.compile(sub_str)
res = temp.findall(test_str)
 
# printing result
print("The substring match(es) are: " + str(res))
#This code is contributed by Edula Vinay Kumar Reddy

                    

Output
The original string is : geeksforgeeks is best for geeks
The substring match(es) are: ['best']

The time and space complexity for this approach are the same as for the previous methods:

Time Complexity: O(n)
Auxiliary Space: O(n)
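
Because re.findall() returns only non-overlapping matches, a match that begins inside an earlier one is skipped. If overlapping matches are wanted, one common workaround is to wrap the pattern in a lookahead with a capturing group. The following is a sketch under the assumption that the pattern itself contains no capturing groups; the helper name overlapping_matches and the test string are made up for illustration.

```python
import re

def overlapping_matches(pattern, text):
    """Return all matches of pattern in text, including overlapping ones."""
    # The lookahead consumes no characters, so the scan advances one position
    # at a time and the capturing group records what would have matched there.
    return re.findall('(?=(' + pattern + '))', text)

print(re.findall('..st', 'aastst'))           # ['aast']
print(overlapping_matches('..st', 'aastst'))  # ['aast', 'stst']
```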

Method #4 : Using split(),replace(),endswith(),join() methods

Approach 

  1. Split the given string into words using split()
  2. Remove every . from sub_str using replace(), leaving only the literal characters
  3. Loop over the list of words
  4. Check whether the length of the word equals the original substring length and whether the word ends with the stripped substring (using endswith())
  5. If so, append the word to the output list
  6. Finally, join the output list and display it (using join())

Note that this approach only matches whole words, and it only works when all the wildcards come before the literal characters, as in '..st'.

Python3

# Python3 code to demonstrate working of
# Wildcard Substring search
 
# initializing string
test_str = 'geeksforgeeks is best for geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
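# Wildcard Substring search
words = test_str.split()
res = []
pattern_len = len(sub_str)

# keep only the literal characters of the pattern
suffix = sub_str.replace(".", "")
for word in words:
    # a word matches if it has the pattern's length and ends with the literal part
    if len(word) == pattern_len and word.endswith(suffix):
        res.append(word)
 
# printing result
print("The substring match is : " + "".join(res))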

                    

Output
The original string is : geeksforgeeks is best for geeks
The substring match is : best

Time Complexity: O(n)
Auxiliary Space: O(n)
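
The word-based method above depends on the wildcards being at the start of the pattern. As a possible alternative that still works word by word, the standard fnmatch module can compare each word against a pattern with wildcards in any position. This is a different technique from the one in the listing: fnmatch uses '?' for a single character, so the sketch translates '.' to '?', and it assumes the pattern contains no other characters that fnmatch treats specially (such as '*' or '['). The helper name wildcard_words is hypothetical.

```python
from fnmatch import fnmatchcase

def wildcard_words(pattern, text):
    """Return the words of text that match pattern, where '.' is any one character."""
    # fnmatch uses '?' for "any single character", so translate the '.' wildcard
    glob_pattern = pattern.replace('.', '?')
    return [word for word in text.split() if fnmatchcase(word, glob_pattern)]

test_str = 'geeksforgeeks is best for geeks'
print(wildcard_words('..st', test_str))  # ['best']
print(wildcard_words('b..t', test_str))  # ['best']
print(wildcard_words('f.r', test_str))   # ['for']
```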

Method #5 : Using string slicing and a list comprehension

1. Initialize the string and the pattern to search for.
2. Print the original string.
3. Use a list comprehension to create a list of all substrings that match the pattern.
  a. For each possible starting index i in the original string, extract a substring of length equal to the length of the pattern.
  b. Check if each character in the substring matches the corresponding character in the pattern, or if the pattern character is a wildcard “.”.
  c. If all characters match or are wildcards, add the substring to the list of matches.
4. If there is at least one match, print the first one.
 

Python3

test_str = 'geeksforgeeks is best for geeks'
sub_str = '..st'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Use a list comprehension to collect every substring that matches the pattern.
# Each window has the same length as the pattern, and each of its characters
# must equal the pattern character unless that pattern character is the wildcard "."
m = len(sub_str)
matches = [
    test_str[i:i + m]
    for i in range(len(test_str) - m + 1)
    if all(a == b or b == '.' for a, b in zip(test_str[i:i + m], sub_str))
]
 
# If there is at least one match, print the first one
if len(matches) > 0:
    print("The substring match is : " + matches[0])
#This code is contributed by Jyothi pinjala.

                    

Output
The original string is : geeksforgeeks is best for geeks
The substring match is : best

Time complexity: O(nm), where n is the length of the original string test_str and m is the length of the pattern sub_str. The list comprehension visits each of the n-m+1 possible starting indices, and at each one the zip-based comparison checks up to m characters, so the overall time complexity is O(nm).

Auxiliary space: O(k) stored substrings, where k is the number of matches found, because every matching substring is kept in the matches list. Since k is at most n-m+1, this is O(n) substrings in the worst case, each of length m.
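
When only the first match is needed, the list of all matches does not have to be built. The sketch below expresses the same sliding-window comparison as a generator, so the scan stops as soon as the caller stops asking for matches. The function name iter_wildcard_matches and its wildcard parameter are hypothetical and not part of the original code.

```python
def iter_wildcard_matches(text, pattern, wildcard='.'):
    """Yield (index, substring) for each window of text that matches pattern."""
    m = len(pattern)
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        # a window matches if every pattern character is the wildcard
        # or equals the character at the same position in the window
        if all(p == wildcard or p == c for c, p in zip(window, pattern)):
            yield i, window

# next() pulls only the first match; None is returned if there is none
first = next(iter_wildcard_matches('geeksforgeeks is best for geeks', '..st'), None)
print(first)  # (17, 'best')
```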

Method #6: Using list comprehension and string comparison without zip()

Steps:

  • Use a list comprehension to iterate over every starting index of the input string and collect the substrings that match the given wildcard pattern. Here, string slicing and the endswith() method are used for the comparison, so the approach only suits patterns whose wildcards are all leading.
  • Extract the matched substring from the list of matches: take the first match if there is one, otherwise set the result to None.
  • Print the result.

Python3

# Python3 code to demonstrate working of
# Wildcard Substring search
# Using list comprehension and string comparison
 
# initializing string
test_str = 'geeksforgeeks is best for geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
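# Wildcard Substring search
# Using list comprehension and string comparison
# compare against the literal part of the pattern ('st'), not just its last character;
# this assumes all wildcards are leading, as in '..st'
suffix = sub_str.lstrip('.')
matches = [test_str[i:i+len(sub_str)] for i in range(len(test_str)-len(sub_str)+1) if test_str[i:i+len(sub_str)].endswith(suffix)]
res = matches[0] if matches else None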
 
# printing result
print("The substring match is : " + str(res))

                    

Output
The original string is : geeksforgeeks is best for geeks
The substring match is : best

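Time Complexity: O(nm), where n is the length of the test string and m is the length of the pattern, since each of the n-m+1 windows is sliced and compared in O(m) time.
Auxiliary Space: O(n) in the worst case, since every matching window is stored in the list before the first one is taken.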



Last Updated : 09 Apr, 2023