Python | Filter a list based on the given list of strings
Given a list, the task is to filter its elements based on another list of strings. Problems of this type are quite common when scraping websites.
Examples:
Input:
List_string1 = ['key', 'keys', 'keyword', 'keychain', 'keynote']
List_string2 = ['home/key/1.pdf',
                'home/keys/2.pdf',
                'home/keyword/3.pdf',
                'home/keychain/4.pdf',
                'home/Desktop/5.pdf',
                'home/keynote/6.pdf']
Output:
['home/Desktop/5.pdf']
Explanation: We keep only those elements of List_string2 that do not contain any string from List_string1 as a substring.
Below are some ways to achieve the above task.
Method #1: Using Iteration
Python3
# Keywords to filter on and the list to be filtered
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf',
                'home/keys/2.pdf',
                'home/keyword/3.pdf',
                'home/keychain/4.pdf',
                'home/Desktop/5.pdf',
                'home/keynote/6.pdf']

# Work on a copy so the original list stays intact
Output = Input_string.copy()
temp = []

# Collect every element that contains at least one keyword
for elem in Input_string:
    for n in Input:
        if n in elem:
            temp.append(elem)
            break  # one match is enough; avoids duplicate entries in temp

# Remove the collected elements from the copy
for elem in temp:
    if elem in Output:
        Output.remove(elem)

print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
Output
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
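Time Complexity: O(n*m) substring checks, where n is the length of Input_string and m is the length of Input, plus the cost of removing the matched elements from Output (each list.remove call is O(n))
Auxiliary Space: O(n) for the copy Output, plus the temp list of matched elements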
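Method #1 copies the list and then removes the matches from the copy. As a minimal sketch of an alternative, the result can be built in a single pass with Python's for/else construct, which avoids the copy and the repeated remove calls. The names keywords, paths and filtered are illustrative and do not appear in the original code.

```python
keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']
paths = ['home/key/1.pdf',
         'home/keys/2.pdf',
         'home/keyword/3.pdf',
         'home/keychain/4.pdf',
         'home/Desktop/5.pdf',
         'home/keynote/6.pdf']

filtered = []
for path in paths:
    for keyword in keywords:
        if keyword in path:
            break  # a keyword matched, so this path is skipped
    else:
        # The inner loop ended without break: no keyword matched
        filtered.append(path)

print("filtered list is :", filtered)
```

The else branch of a for loop runs only when the loop finishes without hitting break, so each path is appended only if no keyword is a substring of it.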
Method #2: Using list comprehension
Python3
# Keywords to filter on and the list to be filtered
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf',
                'home/keys/2.pdf',
                'home/keyword/3.pdf',
                'home/keychain/4.pdf',
                'home/Desktop/5.pdf',
                'home/keynote/6.pdf']

# Keep b only if no keyword a is a substring of it
Output = [b for b in Input_string
          if all(a not in b for a in Input)]

print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
Output
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
Time Complexity: O(n*m), where n is the length of Input_string and m is the length of Input
Auxiliary Space: O(n), where n is the number of elements in the new output list
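The comprehension above drops every element that contains a keyword. As a small sketch of the complementary filter, any() can be used to keep the matching elements instead, and negating it gives the same result as the all(... not in ...) form. The names keywords, paths, matching and non_matching are illustrative and are not part of the original code.

```python
keywords = ['key', 'keys', 'keyword', 'keychain', 'keynote']
paths = ['home/key/1.pdf',
         'home/keys/2.pdf',
         'home/keyword/3.pdf',
         'home/keychain/4.pdf',
         'home/Desktop/5.pdf',
         'home/keynote/6.pdf']

# Keep the paths that contain at least one keyword
matching = [p for p in paths if any(k in p for k in keywords)]

# Keep the paths that contain no keyword (same result as the all() form)
non_matching = [p for p in paths if not any(k in p for k in keywords)]

print("matching:", matching)
print("non-matching:", non_matching)
```

Both any() and all() stop at the first keyword that decides the outcome, so an element that matches early is not checked against the remaining keywords.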
Method #3: Using the filter function and a lambda function
This approach uses the built-in filter function together with a lambda function. The lambda function returns True if none of the strings in Input is a substring of the element being considered, and False otherwise. The filter function keeps only those elements of Input_string for which the lambda function returns True.
Python3
# Keywords to filter on and the list to be filtered
Input = ['key', 'keys', 'keyword', 'keychain', 'keynote']
Input_string = ['home/key/1.pdf',
                'home/keys/2.pdf',
                'home/keyword/3.pdf',
                'home/keychain/4.pdf',
                'home/Desktop/5.pdf',
                'home/keynote/6.pdf']

# filter() keeps x only when the lambda returns True,
# i.e. when no keyword y is a substring of x
Output = list(filter(lambda x: all(y not in x for y in Input), Input_string))

print("List of keywords are:", Input)
print("Given list:", Input_string)
print("filtered list is :", Output)
Output
List of keywords are: ['key', 'keys', 'keyword', 'keychain', 'keynote']
Given list: ['home/key/1.pdf', 'home/keys/2.pdf', 'home/keyword/3.pdf', 'home/keychain/4.pdf', 'home/Desktop/5.pdf', 'home/keynote/6.pdf']
filtered list is : ['home/Desktop/5.pdf']
Time complexity: O(n*m), where n is the length of Input_string and m is the length of Input
Auxiliary Space: O(n)
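All three methods test for substrings, so a keyword such as 'key' also matches a path like 'home/monkey/2.pdf'. If whole path components should be compared instead, one possible sketch is to split each path on '/' and check the parts against a set of keywords. This is an assumption about the desired behaviour, not something the methods above do, and the path 'home/monkey/2.pdf' and the names keywords, paths, by_substring and by_component are hypothetical.

```python
keywords = {'key', 'keys', 'keyword', 'keychain', 'keynote'}
paths = ['home/key/1.pdf',
         'home/monkey/2.pdf',   # hypothetical path: contains 'key' only as a substring
         'home/Desktop/5.pdf']

# Substring test, as in the methods above: 'monkey' contains 'key'
by_substring = [p for p in paths if all(k not in p for k in keywords)]

# Component test: drop a path only if one of its '/'-separated parts
# is exactly equal to a keyword
by_component = [p for p in paths if keywords.isdisjoint(p.split('/'))]

print("substring filter:", by_substring)
print("component filter:", by_component)
```

Storing the keywords in a set makes each component lookup a hash lookup, so the work per path depends on the number of components rather than on the number of keywords.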
Last Updated: 06 Apr, 2023