Python | Filter list of strings based on the substring list
Given two lists of strings, string and substr, write a Python program that keeps only those strings in string that contain at least one of the strings in substr.
Examples:
Input : string = ['city1', 'class5', 'room2', 'city2']
substr = ['class', 'city']
Output : ['city1', 'class5', 'city2']

Input : string = ['coordinates', 'xyCoord', '123abc']
substr = ['abc', 'xy']
Output : ['xyCoord', '123abc']
Method #1: Using List comprehension.
We can use a list comprehension along with the in operator to check, for each string in ‘string’, whether any string in ‘substr’ is contained in it.
Python3
# Python3 program to filter a list of
# strings based on another list

def Filter(string, substr):
    # Keep each string that contains at least one of the substrings
    return [s for s in string
            if any(sub in s for sub in substr)]

# Driver code
string = ['city1', 'class5', 'room2', 'city2']
substr = ['class', 'city']
print(Filter(string, substr))

Output:
['city1', 'class5', 'city2']
Time complexity: O(n * m), where n is the number of strings in the input list “string” and m is the number of substrings in the input list “substr”.
Auxiliary space: O(k), where k is the number of strings in the result, since the list comprehension builds a new list to hold them.
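As a quick check, the same comprehension can be run against the second example from the Examples section. This is a minimal sketch; the function name filter_by_substrings is a hypothetical choice for illustration and does not appear in the original program.

```python
def filter_by_substrings(strings, substrings):
    # Keep each string that contains at least one substring;
    # any() stops at the first substring that matches.
    return [s for s in strings if any(sub in s for sub in substrings)]


result = filter_by_substrings(['coordinates', 'xyCoord', '123abc'],
                              ['abc', 'xy'])
print(result)  # ['xyCoord', '123abc']
```

Because any() short-circuits, a string that matches the first substring is not tested against the remaining ones.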
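Method #2: Python Regex
Here re.match extracts the leading run of non-digit characters from each string, and the string is kept if that prefix is one of the entries in substr. Note that this is an exact comparison of the prefix rather than a general substring test, so it fits inputs shaped like 'city1' or 'class5' but would miss 'xyCoord' and '123abc' from the second example.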
Python3
# Python3 program to filter a list of
# strings based on another list
import re

def Filter(string, substr):
    # group(0) is the leading run of non-digit characters
    # (or '' if the string starts with a digit)
    return [s for s in string
            if re.match(r'[^\d]+|^', s).group(0) in substr]

# Driver code
string = ['city1', 'class5', 'room2', 'city2']
substr = ['class', 'city']
print(Filter(string, substr))

Output:
['city1', 'class5', 'city2']
The time complexity of this program is O(nm), where n is the length of the string list and m is the length of the substr list.
The auxiliary space of this program is O(k) for the output list, where k is the number of strings kept, plus the space for the prefix extracted from the current string.
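The pattern used in Method #2 compares only the leading non-digit prefix of each string. For inputs that do not follow that shape, one possible alternative is to build a single pattern from the substrings and search anywhere in the string. The sketch below is an assumption about how that could be done, not the original author's method; the name filter_with_regex is hypothetical.

```python
import re


def filter_with_regex(strings, substrings):
    # With no substrings there is nothing to match; an empty pattern
    # would otherwise match every string.
    if not substrings:
        return []
    # re.escape makes each substring match literally; "|" joins them
    # into one alternation such as "abc|xy".
    pattern = re.compile('|'.join(re.escape(sub) for sub in substrings))
    # re.search looks for a match anywhere in the string,
    # whereas re.match only tries the start.
    return [s for s in strings if pattern.search(s)]


print(filter_with_regex(['city1', 'class5', 'room2', 'city2'],
                        ['class', 'city']))
# ['city1', 'class5', 'city2']
print(filter_with_regex(['coordinates', 'xyCoord', '123abc'],
                        ['abc', 'xy']))
# ['xyCoord', '123abc']
```

Escaping matters when a substring contains characters such as "." or "+", which would otherwise be read as regex operators.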
Method #3 : Using find() method.
The find() method searches the given string for the substring passed as its argument and returns the index of the first occurrence, or -1 if the substring is not found.
Python3
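# Python3 program to filter a list of
# strings based on another list
string = ['city1', 'class5', 'room2', 'city2']
substr = ['class', 'city']

x = []
for i in substr:
    for j in string:
        # find() returns -1 when i does not occur in j;
        # "j not in x" avoids adding the same string twice
        if j.find(i) != -1 and j not in x:
            x.append(j)
print(x)

Output:
['class5', 'city1', 'city2']

Because the outer loop runs over substr, the result is grouped by substring rather than following the original order of string.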
The time complexity of this program is O(mn) calls to find(), where m is the length of the substr list and n is the length of the string list; each successful match additionally scans the result list for the j not in x check.
The auxiliary space complexity of this program is O(k), where k is the size of the resulting list that contains the filtered strings.
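The output of Method #3 is grouped by substring rather than following the order of the input list. If the original order matters, one option is to loop over the strings first and test each against all substrings. The following is a sketch under that assumption; filter_with_find is a hypothetical name, and unlike Method #3 it keeps duplicate strings that appear in the input.

```python
def filter_with_find(strings, substrings):
    result = []
    for s in strings:
        # str.find returns -1 when the substring is absent
        if any(s.find(sub) != -1 for sub in substrings):
            result.append(s)
    return result


ordered = filter_with_find(['city1', 'class5', 'room2', 'city2'],
                           ['class', 'city'])
print(ordered)  # ['city1', 'class5', 'city2']
```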
Method #4 : Using the filter function and a lambda function:
The filter function is a built-in Python function that takes in two arguments: a function and an iterable. It returns an iterator that returns the elements of the iterable for which the function returns True.
In this case, we are using a lambda function as the first argument to the filter function. The lambda function takes in a string x and returns True if any of the substrings in the substrings list appear in x, and False otherwise. The second argument to the filter function is the strings list, which is the iterable that we want to filter.
Therefore, the filter function returns an iterator that returns all the elements of the strings list for which the lambda function returns True. In this case, the lambda function returns True for the elements ‘city1’, ‘class5’, and ‘city2’, so the iterator returned by the filter function will contain those elements.
Python3
# Initialize the list of strings and the list of substrings
strings = ['city1', 'class5', 'room2', 'city2']
substrings = ['class', 'city']

# Use the filter function and a lambda function to filter the strings
filtered_strings = list(filter(lambda x: any(substring in x for substring in substrings), strings))

# Print the filtered strings
print(filtered_strings)
# This code is contributed by Edula Vinay Kumar Reddy

Output:
['city1', 'class5', 'city2']
Time complexity: O(n * m), where n is the length of the strings list and m is the length of the substrings list.
Auxiliary space: O(k), where k is the length of the filtered_strings list.
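Since filter returns an iterator rather than a list, its elements are produced on demand and can be consumed only once. The short sketch below illustrates this with the same data; the variable names matches, first, rest and leftover are chosen here for illustration.

```python
strings = ['city1', 'class5', 'room2', 'city2']
substrings = ['class', 'city']

# No string is tested until the iterator is advanced
matches = filter(lambda x: any(sub in x for sub in substrings), strings)

first = next(matches)      # 'city1'
rest = list(matches)       # ['class5', 'city2']
leftover = list(matches)   # [] because the iterator is already exhausted
print(first, rest, leftover)
```

Wrapping the call in list(), as the program above does, collects every match at once and lets the result be reused.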
Method #5: Using a for loop:
Step-by-step approach for the given program:
- Define a function called “Filter” that takes two arguments: “string” and “substr”.
- Create an empty list called “filtered_list” to store the filtered strings.
- Use a for loop to iterate over each string in the input list “string”.
- Inside the first for loop, use another for loop to iterate over each substring in the filter list “substr”.
- Use an if statement to check if the current substring is present in the current string.
- If the substring is found in the string, add the string to the “filtered_list” using the “append” method and break out of the inner loop using the “break” keyword.
- Once all the substrings have been checked for the current string, move on to the next string in the input list.
- After all the strings have been checked against all the substrings, return the final filtered list using the “return” keyword.
- Define the input list of strings as “string” and the filter list of substrings as “substr”.
- Call the “Filter” function with the “string” and “substr” arguments and store the result in “filtered_list”.
- Print the “filtered_list” using the “print” function.
Python3
# Define a function to filter a list of strings based on another list of substrings
def Filter(string, substr):
    # Create an empty list to store the filtered strings
    filtered_list = []
    # Loop over each string in the input list
    for s in string:
        # Loop over each substring in the filter list
        for sub in substr:
            # Check if the substring is in the current string
            if sub in s:
                # If it is, add the string to the filtered list
                # and break out of the inner loop
                filtered_list.append(s)
                break
    # Return the final filtered list
    return filtered_list

# Define the input list of strings and the filter list of substrings
string = ['city1', 'class5', 'room2', 'city2']
substr = ['class', 'city']

# Call the filter function with the input lists and print the result
filtered_list = Filter(string, substr)
print(filtered_list)

Output:
['city1', 'class5', 'city2']
Time complexity: O(nm), where n is the length of the input string list and m is the length of the filter substring list.
Auxiliary space: O(k), where k is the length of the filtered list.
Method #6: Using the “any” function and a generator expression:
Step-by-step approach:
- Define a function named “filter_strings” that takes two arguments: a list of strings and a list of substrings.
- Use the “any” function inside a generator expression to produce, for each string, a flag that tells whether any substring from the filter list occurs in it.
- Pair each string with its flag using “zip”, and use the built-in “filter” function to keep only the pairs whose flag is True.
- Extract the strings from the remaining pairs into a list and return it.
Below is the implementation of the above approach:
Python3
def filter_strings(string_list, substr_list):
    # Create a filter condition using the "any" function and a generator expression
    filter_cond = (any(sub in s for sub in substr_list) for s in string_list)
    # Use the "filter" function to keep the (string, flag) pairs whose flag is True
    filtered_iterator = filter(lambda x: x[1], zip(string_list, filter_cond))
    # Extract the strings from the remaining pairs and return them as a list
    filtered_list = [x[0] for x in filtered_iterator]
    return filtered_list

string_list = ['city1', 'class5', 'room2', 'city2']
substr_list = ['class', 'city']
filtered_list = filter_strings(string_list, substr_list)
print(filtered_list)

Output:
['city1', 'class5', 'city2']
Time complexity: O(n*m), where n is the length of the input list and m is the number of substrings in the filter list.
Auxiliary space: O(n), where n is the length of the input list.
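Selecting items by a parallel sequence of boolean flags, as Method #6 does with zip and filter, can also be written with itertools.compress from the standard library. The sketch below is offered as one possible variation rather than the original approach; filter_with_compress is a hypothetical name.

```python
from itertools import compress


def filter_with_compress(string_list, substr_list):
    # One flag per string: True if any substring occurs in it
    flags = (any(sub in s for sub in substr_list) for s in string_list)
    # compress yields the items of string_list whose flag is true
    return list(compress(string_list, flags))


selected = filter_with_compress(['city1', 'class5', 'room2', 'city2'],
                                ['class', 'city'])
print(selected)  # ['city1', 'class5', 'city2']
```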