
Python | Filter String with substring at specific position


Sometimes, while working with lists of strings in Python, we need to keep only those strings that contain a specific substring at a specific position. This kind of problem comes up in data processing and web development. Let us discuss several ways to perform this task.

Method #1: Using list comprehension + string slicing

Slicing each string with the range i to j extracts the characters at the position of interest, and the list comprehension keeps only the strings whose slice equals the substring.

Python3




# Python3 code to demonstrate
# Filter String with substring at specific position
# using list comprehension + string slicing
 
# Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
# printing original list
print("The original list is : " + str(test_list))
 
# Initializing substring
sub_str = 'geeks'
 
# Initializing range
i, j = 0, 5
 
# Filter String with substring at specific position
# using list comprehension + string slicing
res = [ele for ele in test_list if ele[i:j] == sub_str]

# printing result
print("Filtered list : " + str(res))


Output : 

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

Time Complexity: O(n * m), where n is the number of strings in the input list and m = j - i is the width of the slice (here equal to the length of the substring). Each string is sliced and compared once, and both steps touch at most m characters.
Auxiliary Space: O(n) in the worst case, since res can end up holding every string from the input list.
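The example above only checks position 0. As a minimal sketch, the same slicing idea can be wrapped in a helper that derives the end of the range from the substring length. The name filter_at_position and its parameters are hypothetical and not part of the original code, and the sketch assumes a non-negative start index.

```python
def filter_at_position(strings, sub_str, start):
    """Return the strings that contain sub_str beginning at index start.

    Hypothetical helper for illustration; assumes start >= 0.
    """
    end = start + len(sub_str)
    # A slice past the end of a short string is simply shorter (or empty),
    # so no explicit length check is needed.
    return [s for s in strings if s[start:end] == sub_str]


words = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']

print(filter_at_position(words, 'geeks', 0))  # ['geeksforgeeks', 'geeks']
print(filter_at_position(words, 'geeks', 8))  # ['geeksforgeeks']
print(filter_at_position(words, 'st', 2))     # ['best']
```

Computing the end index inside the helper avoids passing a range whose width does not match the substring, in which case the comparison could never succeed.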

Method #2: Using filter() + lambda

Here the lambda holds the slice comparison, and filter() applies it to every string in the list, keeping those for which it returns True.

Python3




# Python3 code to demonstrate
# Filter String with substring at specific position
# using filter() + lambda
 
# Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
# printing original list
print("The original list is : " + str(test_list))
 
# Initializing substring
sub_str = 'geeks'
 
# Initializing range
i, j = 0, 5
 
# Filter String with substring at specific position
# using filter() + lambda
res = list(filter(lambda ele: ele[i: j] == sub_str, test_list))
 
# printing result
print ("Filtered list : " + str(res))


Output : 

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

Time Complexity: O(n * m), where n is the number of strings in the input list and m is the length of the substring.

Auxiliary Space: O(k), where k is the number of strings that meet the condition.

Method #3: Using find() method

Python3




# Python3 code to demonstrate
# Filter String with substring at specific position
 
# Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
# printing original list
print("The original list is : " + str(test_list))
 
# Initializing substring
sub_str = 'geeks'
 
# Initializing range
i, j = 0, 5
 
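# Filter String with substring at specific position
# find() is limited to the window i..j so that an earlier
# occurrence of sub_str cannot hide a match at index i
res = []
for k in test_list:
    if k.find(sub_str, i, j) == i:
        res.append(k)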
     
 
# printing result
print ("Filtered list : " + str(res))


Output

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

Time Complexity: O(N * L * M) in the worst case, assuming a naive substring search, where N is the number of strings in the list, L is the length of the longest string, and M is the length of the substring. Called without start and end arguments, find() scans a string from the beginning until it reaches the first occurrence of the substring, so the cost of each call can grow with the length of the whole string. Passing i and j as the start and end arguments limits each search to j - i characters.

Auxiliary Space: O(N). The list res holds the strings that pass the filter, which can be at most all N strings of the original list.
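One caveat with find() is worth illustrating: without start and end arguments it reports only the first occurrence of the substring, so a string in which the substring appears both before and at the target index is missed. The short sketch below contrasts the two calls; the sample list and the index 8 are chosen only for illustration.

```python
sample = ['geeksforgeeks', 'forgeeks', 'geeks']
sub_str = 'geeks'

# Look for sub_str starting exactly at index 8
i = 8
j = i + len(sub_str)

# Unbounded find() returns the FIRST occurrence (index 0 in 'geeksforgeeks'),
# so the occurrence at index 8 is never seen.
unbounded = [s for s in sample if s.find(sub_str) == i]

# Bounded find() searches only s[i:j] and still returns an index
# relative to the whole string.
bounded = [s for s in sample if s.find(sub_str, i, j) == i]

print(unbounded)  # []
print(bounded)    # ['geeksforgeeks']
```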

Method #4: Using RegEx
 

Python3




import re
 
#Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
#printing original list
print("The original list is : " + str(test_list))
 
#Initializing substring
sub_str = 'geeks'
 
#Initializing range
i, j = 0, 5
 
#Filter String with substring at specific position
#using re.match(); re.escape() makes sub_str match literally
res = [ele for ele in test_list if re.match(re.escape(sub_str), ele[i: j])]
 
#printing result
print ("Filtered list : " + str(res))
#This code is contributed by Edula Vinay Kumar Reddy


Output

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

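Time complexity: O(n * m), where n is the number of strings in the list and m is the length of the substring, since each match examines at most j - i characters.
Auxiliary Space: O(n) in the worst case for the result list, plus O(j - i) for the temporary slice of each string.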
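If the substring can contain regex metacharacters such as a dot, it has to be escaped, otherwise it is interpreted as a pattern. The sketch below shows one possible variant that escapes the substring with re.escape() and uses the pos argument of a compiled pattern's match() method instead of slicing. The helper name filter_with_regex and the sample data are hypothetical.

```python
import re


def filter_with_regex(strings, sub_str, start):
    """Keep strings where sub_str occurs literally at index start.

    Hypothetical helper for illustration; assumes start >= 0.
    """
    # re.escape() turns metacharacters such as '.' into literals
    pattern = re.compile(re.escape(sub_str))
    # Pattern.match(string, pos) anchors the match at index pos
    return [s for s in strings if pattern.match(s, start)]


names = ['a.b_config', 'axb_config', 'xa.b', 'a.b']

print(filter_with_regex(names, 'a.b', 0))  # ['a.b_config', 'a.b']
print(filter_with_regex(names, 'a.b', 1))  # ['xa.b']

# Without escaping, '.' matches any character, so 'axb_config' slips through
print(bool(re.match('a.b', 'axb_config')))  # True
```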

Method #5: Using str.startswith()

Algorithm :

  1. Initialize a list of strings.
  2. Initialize the substring and the range for the substring to check.
  3. Use a for loop to iterate through each element in the input list.
  4. For each element, use the startswith() method to check if the substring is present at the specified range.
  5. If the substring is present at the specified range, add the element to a new list.
  6. Return the new list containing the filtered elements.

Python3




#Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
#printing original list
print("The original list is : " + str(test_list))
 
#Initializing substring
sub_str = 'geeks'
 
#Initializing range
i, j = 0, 5
 
#Filter String with substring at specific position
res = [ele for ele in test_list if ele.startswith(sub_str, i, j)]
 
#printing result
print ("Filtered list : " + str(res))
 
#This code is contributed by Jyothi pinjala.


Output

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

Time complexity: O(n * k), where n is the number of strings in the input list and k is the length of the substring. For each string, startswith() compares at most k characters at the given position.

Auxiliary space: O(m), where m is the length of the resulting filtered list, since a new list is created to store the filtered strings. Unlike slicing, startswith() takes the start and end positions as arguments, so the code does not build a temporary substring itself.
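The optional start and end arguments of startswith() define the window in which the prefix must fit. A small sketch of how they behave, using only the sample string from this article (the variable names are illustrative):

```python
text = 'geeksforgeeks'

# The window text[0:5] is 'geeks', so the prefix fits exactly
full_window = text.startswith('geeks', 0, 5)

# The window text[0:4] is 'geek', too short to hold 'geeks'
short_window = text.startswith('geeks', 0, 4)

# With no end argument the window runs to the end of the string
open_ended = text.startswith('geeks', 8)

# A prefix check in the middle of the string: text[5:8] is 'for'
middle = text.startswith('for', 5, 8)

print(full_window, short_window, open_ended, middle)  # True False True True
```

Because an end value that leaves less room than the substring needs always yields False, omitting the end argument is a reasonable choice when only the starting position matters.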

Method #6: Using a loop and string slicing

Step-by-step approach:

  • Initialize the list of strings to be filtered and print it.
  • Initialize the substring to match and the range of positions in which to check the substring.
  • Create an empty list to store the filtered strings.
  • Loop through each element in the original list.
  • Check if the substring at the specified range matches the target substring. If it does, append the element to the filtered list.
  • Print the resulting filtered list.

Python3




# Initializing list
test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
 
# Printing original list
print("The original list is : " + str(test_list))
 
# Initializing substring and range
sub_str = 'geeks'
i, j = 0, 5
 
# Filter String with substring at specific position
res = []
for ele in test_list:
    if ele[i:j] == sub_str:
        res.append(ele)
 
# Printing result
print("Filtered list : " + str(res))


Output

The original list is : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
Filtered list : ['geeksforgeeks', 'geeks']

Time complexity: O(n * m), where n is the number of strings in the input list and m = j - i is the width of the slice compared for each string.

Auxiliary space: O(n) in the worst case, as the filtered list can contain all n strings of the input list.
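As a quick sanity check, the approaches discussed in this article can be compared side by side. The sketch below wraps each one in a small function (all function names are hypothetical) and asserts that they agree for several positions. It assumes a non-empty substring and a non-negative position, since the approaches can differ on those edge cases.

```python
import re


def by_slice(strings, sub, i):
    return [s for s in strings if s[i:i + len(sub)] == sub]


def by_filter(strings, sub, i):
    return list(filter(lambda s: s[i:i + len(sub)] == sub, strings))


def by_find(strings, sub, i):
    # Bounded search so an earlier occurrence cannot hide the match at i
    return [s for s in strings if s.find(sub, i, i + len(sub)) == i]


def by_regex(strings, sub, i):
    pattern = re.compile(re.escape(sub))
    return [s for s in strings if pattern.match(s, i)]


def by_startswith(strings, sub, i):
    return [s for s in strings if s.startswith(sub, i)]


test_list = ['geeksforgeeks', 'is', 'best', 'for', 'geeks']
approaches = (by_slice, by_filter, by_find, by_regex, by_startswith)

for pos in (0, 5, 8):
    results = [func(test_list, 'geeks', pos) for func in approaches]
    # Every approach should produce the same filtered list
    assert all(r == results[0] for r in results)
    print(pos, results[0])
```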



Last Updated : 13 Apr, 2023