Python | Filter a list based on the given list of strings
Given a list of strings, the task is to filter its elements based on a second list of strings. Problems of this type are quite common when scraping websites.
Example:
Input:
List_string1 = ['key', 'keys', 'keyword', 'keychain', 'keynote']
List_string2 = ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
Output: ['home/Desktop/5.pdf']
Explanation: We keep only those elements of List_string2 that do not contain any string from List_string1 as a substring.
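The rule in the explanation is a plain substring test: a path is dropped as soon as any keyword occurs anywhere inside it. A minimal sketch of that rule on single strings follows; the helper name `contains_any` and the paths `home/monkey/7.pdf` and `home/Key/8.pdf` are illustrative assumptions, not part of the original example.

```python
def contains_any(text, keywords):
    """Return True if at least one keyword occurs as a substring of text."""
    return any(keyword in text for keyword in keywords)


keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']

# 'keys' appears inside the path, so the path would be dropped.
print(contains_any('home/keys/2.pdf', keywords))      # True

# No keyword appears in this path, so it would be kept.
print(contains_any('home/Desktop/5.pdf', keywords))   # False

# Substring matching is not limited to whole directory names:
# 'monkey' contains 'key' (hypothetical path, not in the original input).
print(contains_any('home/monkey/7.pdf', keywords))    # True

# The test is case-sensitive: 'Key' does not match 'key'.
print(contains_any('home/Key/8.pdf', keywords))       # False
```

If only whole directory names or case-insensitive matches should count, the test itself has to change; the methods below all use the plain substring test.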
Below are some ways to achieve the above task.
Method #1: Using Iteration
Python3
# Python code to filter element from list
# based on another list of string.

# List Initialization
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf', 'home/keys/2.pdf',
                'home/keyword/3.pdf', 'home/keychain/4.pdf',
                'home/Desktop/5.pdf', 'home/keynote/6.pdf']

Output = Input_string.copy()
temp = []

# Using iteration
for elem in Input_string:
    for n in Input:
        if n in elem:
            temp.append(elem)

for elem in temp:
    if elem in Output:
        Output.remove(elem)

# Printing
print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
Output:
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
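Time Complexity: O(n*m) substring checks for the nested loops, where n is the length of Input_string and m is the length of Input. The removal pass then searches Output once for every entry in temp, so the worst case is O(n*n*m).
Auxiliary Space: O(n*m) in the worst case, because temp receives one entry for every (string, keyword) match; Output is a copy of size n.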
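The same iteration can also be written as a single pass that never builds a temporary list or calls remove(). This is a minimal sketch on the same data, using Python's for/else construct; the names `keywords`, `paths` and `filtered` are illustrative and not taken from the code above.

```python
keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']
paths = ['home/key/1.pdf', 'home/keys/2.pdf',
         'home/keyword/3.pdf', 'home/keychain/4.pdf',
         'home/Desktop/5.pdf', 'home/keynote/6.pdf']

filtered = []
for path in paths:
    for keyword in keywords:
        if keyword in path:
            # One match is enough; stop checking this path.
            break
    else:
        # The inner loop ended without break: no keyword matched.
        filtered.append(path)

print("filtered list is :", filtered)
# filtered list is : ['home/Desktop/5.pdf']
```

Because the inner loop stops at the first match, each path is appended at most once and the original order is preserved.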
Method #2: Using list comprehension
Python3
# Python code to filter element from list
# based on another list of string.

# List Initialization
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf', 'home/keys/2.pdf',
                'home/keyword/3.pdf', 'home/keychain/4.pdf',
                'home/Desktop/5.pdf', 'home/keynote/6.pdf']

# Using list comprehension
Output = [b for b in Input_string
          if all(a not in b for a in Input)]

# Printing
print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
Output:
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
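Time Complexity: O(n*m), where n is the number of strings in Input_string and m is the number of keywords in Input, since each string may be checked against every keyword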
Auxiliary Space: O(n), where n is the number of elements in the new output list
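The condition `all(a not in b for a in Input)` can equivalently be written as `not any(a in b for a in Input)`, which some readers find closer to the wording "contains none of the keywords". Both forms stop at the first keyword that matches. The sketch below checks the equivalence on the same data; the names `keywords`, `paths`, `with_all` and `with_any` are illustrative.

```python
keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']
paths = ['home/key/1.pdf', 'home/keys/2.pdf',
         'home/keyword/3.pdf', 'home/keychain/4.pdf',
         'home/Desktop/5.pdf', 'home/keynote/6.pdf']

# Keep a path when every keyword is absent from it.
with_all = [p for p in paths if all(k not in p for k in keywords)]

# Keep a path when no keyword is present in it (same condition, negated).
with_any = [p for p in paths if not any(k in p for k in keywords)]

print(with_all == with_any)  # True
print(with_any)              # ['home/Desktop/5.pdf']
```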
Method #3: Using the filter function and a lambda function
This approach uses the built-in filter function together with a lambda function. The lambda returns True if none of the strings in Input occurs as a substring of the element being considered, and False otherwise. filter keeps only the elements of Input_string for which the lambda returns True. Since filter returns an iterator in Python 3, the result is wrapped in list().
Python3
# Python code to filter element from list
# based on another list of string.

# List Initialization
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf', 'home/keys/2.pdf',
                'home/keyword/3.pdf', 'home/keychain/4.pdf',
                'home/Desktop/5.pdf', 'home/keynote/6.pdf']

# Using the filter function and a lambda function
Output = list(filter(lambda x: all(y not in x for y in Input),
                     Input_string))

# Printing
print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
# This code is contributed by Edula Vinay Kumar Reddy
Output:
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
Time complexity: O(n*m), where n is the length of Input_string and m is the length of Input
Auxiliary Space: O(n)
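As a further variation that the methods above do not use, the keywords can be combined into one compiled regular expression so that each string is searched once instead of once per keyword. This is only a sketch under the assumption that plain substring semantics are wanted; the names `keywords`, `paths`, `pattern` and `filtered` are illustrative, and whether it is faster than the nested checks depends on the data and is not measured here.

```python
import re

keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']
paths = ['home/key/1.pdf', 'home/keys/2.pdf',
         'home/keyword/3.pdf', 'home/keychain/4.pdf',
         'home/Desktop/5.pdf', 'home/keynote/6.pdf']

# re.escape keeps characters such as '.' or '+' in a keyword literal,
# so the pattern matches the same substrings as the `in` operator.
pattern = re.compile("|".join(re.escape(k) for k in keywords))

# Keep only the paths in which the pattern matches nowhere.
filtered = [p for p in paths if pattern.search(p) is None]

print("filtered list is :", filtered)
# filtered list is : ['home/Desktop/5.pdf']
```

One difference to keep in mind: with an empty keyword list the pattern is empty and matches every string, so nothing is kept, whereas the all(...) versions keep everything. Guard that case separately if it can occur.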