Python – Filter the List of String whose index in second List contains the given Substring

Last Updated : 03 Jun, 2023

Given two lists, extract every element of the first list whose counterpart at the same index in the second list contains the required substring.

Examples:

Input : test_list1 = ["Gfg", "is", "not", "best", "and", "not", "CS"],
test_list2 = ["Its ok", "all ok", "wrong", "looks ok", "ok", "wrong", "thats ok"], sub_str = "ok"
Output : ['Gfg', 'is', 'best', 'and', 'CS']
Explanation : Every retained element has "ok" as a substring at the corresponding index of test_list2, e.g. "Gfg" is kept because "Its ok" contains "ok".

Input : test_list1 = ["Gfg", "not", "best"],
test_list2 = ["yes", "noo", "its yes"], sub_str = "yes"
Output : ['Gfg', 'best']
Explanation : Every retained element has "yes" as a substring at the corresponding index of test_list2, e.g. "Gfg" is kept because "yes" contains "yes".
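
As a quick check of the two examples above, the task can be wrapped in a small helper. The function name filter_by_substring is hypothetical and used only for illustration; this is a minimal sketch rather than the only way to solve the problem.

```python
def filter_by_substring(list1, list2, sub_str):
    """Return the items of list1 whose same-index item in list2 contains sub_str."""
    # hypothetical helper name, introduced only for this illustration
    return [a for a, b in zip(list1, list2) if sub_str in b]


# first example
first = filter_by_substring(
    ["Gfg", "is", "not", "best", "and", "not", "CS"],
    ["Its ok", "all ok", "wrong", "looks ok", "ok", "wrong", "thats ok"],
    "ok",
)
print(first)

# second example
second = filter_by_substring(["Gfg", "not", "best"], ["yes", "noo", "its yes"], "yes")
print(second)
```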

Method #1 : Using zip() + loop + in operator

In this method, zip() pairs up the elements that share an index, the in operator checks for the substring, and a loop performs the iteration.

The program takes two lists, test_list1 and test_list2, and extracts the elements of test_list1 whose counterparts in test_list2 contain the substring "ok". It then prints the extracted list.

Follow the below steps to implement the above idea:

Initialize the two lists.
Print the original lists.
Initialize the substring.
Initialize an empty list to store the extracted elements.
Use zip() to iterate through both lists at the same time and map elements with the same index together.
Check if the substring is in the second element (ele2) using the in operator.
If the substring is present, append the corresponding element from test_list1 to res.
Print the extracted list.

Below is the implementation of the above approach:

Python3

# Python3 code to demonstrate working of
# Extract elements filtered by substring
# from other list Using zip() + loop + in
# operator
 
# initializing list
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok",
              "ok", "wrong", "ok", "thats ok"]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# initializing substr
sub_str = "ok"
 
res = []
# using zip() to map by index
for ele1, ele2 in zip(test_list1, test_list2):
 
    # checking for substring
    if sub_str in ele2:
        res.append(ele1)
 
# printing result
print("The extracted list : " + str(res))

Output

The original list 1 is : ['Gfg', 'is', 'not', 'best', 'and', 'not', 'for', 'CS']
The original list 2 is : ['Its ok', 'all ok', 'wrong', 'looks ok', 'ok', 'wrong', 'ok', 'thats ok']
The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time complexity: O(n), where n is the length of the shorter list (zip() stops at the shorter of test_list1 and test_list2), treating each substring check as constant time.
Auxiliary space: O(m), where m is the number of elements in the result list (res).
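
One detail worth keeping in mind with zip() is that it stops at the shorter input without raising an error. The sketch below illustrates this on two small lists of unequal length and shows one possible guard; the names names, notes and extract_checked are hypothetical and not part of the program above.

```python
names = ["Gfg", "is", "best"]
notes = ["Its ok", "wrong"]          # one element shorter than names

# zip() stops at the shorter input, so "best" is never examined
pairs = list(zip(names, notes))
print(pairs)


def extract_checked(list1, list2, sub_str):
    # hypothetical helper: refuse inputs whose lengths differ
    if len(list1) != len(list2):
        raise ValueError("both lists must have the same length")
    return [a for a, b in zip(list1, list2) if sub_str in b]


print(extract_checked(["Gfg", "is"], ["Its ok", "wrong"], "ok"))
```

Whether silent truncation or an explicit error is preferable depends on where the lists come from; the guard is only one option.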

Method #2 : Using list comprehension + zip()

This is similar to above method. The only difference here is that list comprehension is used as shorthand to solve the problem.

Step-by-step approach for the program:

Initialize two lists test_list1 and test_list2 that contain the elements to be filtered based on a substring.
Print the original lists using the print() function and string concatenation.
Initialize the substring sub_str that will be used to filter the lists.
Use the zip() function to iterate over both lists simultaneously and create a tuple of elements from each list.
Use list comprehension to filter the elements from the first list that have the substring sub_str in the corresponding element of the second list. Here, we use the in operator to check if sub_str is present in the string element of the second list.
Store the filtered elements in a new list res.
Print the extracted list using the print() function and string concatenation.

Python3

# Python3 code to demonstrate working of
# Extract elements filtered by substring
# from other list Using list comprehension + zip()
 
# initializing list
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "no", "looks ok",
              "ok", "wrong", "ok", "thats ok"]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# initializing substr
sub_str = "ok"
 
# using list comprehension to perform task
res = [ele1 for ele1, ele2 in zip(test_list1, test_list2) if sub_str in ele2]
 
# printing result
print("The extracted list : " + str(res))

Output

The original list 1 is : ['Gfg', 'is', 'not', 'best', 'and', 'not', 'for', 'CS']
The original list 2 is : ['Its ok', 'all ok', 'no', 'looks ok', 'ok', 'wrong', 'ok', 'thats ok']
The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #3 : Using find() method

Here the lists are traversed by index, and the string method find() performs the check: it returns the index of the first occurrence of the substring, or -1 if the substring is absent.

Python3

# Python3 code to demonstrate working of
# Extract elements filtered by substring
 
# initializing list
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok",
              "ok", "wrong", "ok", "thats ok"]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# initializing substr
sub_str = "ok"
 
res = []
for i in range(0, len(test_list2)):
    if test_list2[i].find(sub_str) != -1:
        res.append(test_list1[i])
 
 
# printing result
print("The extracted list : " + str(res))

Output

The original list 1 is : ['Gfg', 'is', 'not', 'best', 'and', 'not', 'for', 'CS']
The original list 2 is : ['Its ok', 'all ok', 'wrong', 'looks ok', 'ok', 'wrong', 'ok', 'thats ok']
The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time Complexity: O(n), where n is the length of the list test_list2.
Auxiliary Space: O(n), as we are storing the extracted elements in a new list, res.
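
The comparison with -1 matters because find() returns a position, not a boolean. The following sketch, using a hypothetical samples list, shows the values find() returns and what goes wrong if the result is used directly as a condition.

```python
samples = ["Its ok", "looks ok", "ok", "wrong"]
sub_str = "ok"

# index of the first occurrence, or -1 when the substring is absent
positions = [s.find(sub_str) for s in samples]
print(positions)

# using the bare result as a truth value is a bug:
# 0 (match at the start) is falsy, while -1 (no match) is truthy
naive = [s for s in samples if s.find(sub_str)]

# comparing against -1 gives the intended result
correct = [s for s in samples if s.find(sub_str) != -1]

print(naive)
print(correct)
```

When only a yes/no answer is needed, the in operator used in the earlier methods avoids this pitfall altogether.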

Method #4: Using filter() function with lambda function

The filter() function can be used to filter out the elements of the first list based on the condition given by a lambda function. In this case, the lambda function will check if the substring is present in the corresponding element of the second list.

Python3

# Python3 code to demonstrate working of
# Extract elements filtered by substring
# from other list using filter() function
 
# initializing list
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok",
              "ok", "wrong", "ok", "thats ok"]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# initializing substr
sub_str = "ok"
 
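# using filter() over positions rather than values: looking an element up
# with test_list1.index(x) returns only its first occurrence, which breaks
# when test_list1 holds duplicates and costs O(n) per lookup
n = min(len(test_list1), len(test_list2))
indices = filter(lambda i: sub_str in test_list2[i], range(n))
res = [test_list1[i] for i in indices]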
 
# printing result
print("The extracted list : " + str(res))

Output

The original list 1 is : ['Gfg', 'is', 'not', 'best', 'and', 'not', 'for', 'CS']
The original list 2 is : ['Its ok', 'all ok', 'wrong', 'looks ok', 'ok', 'wrong', 'ok', 'thats ok']
The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time complexity: O(n) where n is the length of the list.
Auxiliary space: O(n)
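
One caveat when combining filter() with a value lookup such as list.index(): index() always returns the first position of a value, so repeated values in the first list are all judged by the same entry of the second list. The sketch below uses small hypothetical lists (list1, list2) to show the difference between looking up by value and looking up by position.

```python
list1 = ["a", "b", "a"]          # "a" appears twice
list2 = ["ok", "no", "no"]
sub_str = "ok"

# lookup by value: both "a" entries are checked against list2[0]
by_value = list(filter(lambda x: sub_str in list2[list1.index(x)], list1))

# lookup by position: each entry is checked against its own counterpart
by_position = [list1[i] for i in range(len(list1)) if sub_str in list2[i]]

print(by_value)
print(by_position)
```

The value-based lookup also rescans the list on every call, which makes the whole filter quadratic in the list length.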

Method #5: Using itertools.compress() function

This code snippet is functionally equivalent to Method #1, but it replaces the for loop and append() method with the compress() function. The compress() function takes two arguments: the first is the iterable to be filtered, and the second is the selector iterable. Each selector is evaluated for truthiness, and compress() stops as soon as either iterable is exhausted, so the selectors should normally have the same length as the iterable to be filtered.

Python3

# Python3 code to demonstrate working of
# Extract elements filtered by substring
# from other list Using itertools.compress()
 
import itertools
 
# initializing list
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok",
              "ok", "wrong", "ok", "thats ok"]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# initializing substr
sub_str = "ok"
 
# using compress() to filter corresponding elements from test_list1
filtered_list = itertools.compress(test_list1, [sub_str in ele for ele in test_list2])
 
# converting the compress object (a lazy iterator) to a list
res = list(filtered_list)
 
# printing result
print("The extracted list : " + str(res))

Output

The original list 1 is : ['Gfg', 'is', 'not', 'best', 'and', 'not', 'for', 'CS']
The original list 2 is : ['Its ok', 'all ok', 'wrong', 'looks ok', 'ok', 'wrong', 'ok', 'thats ok']
The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time complexity: O(N), where N is the length of the input lists test_list1 and test_list2.
Auxiliary space: O(N), where N is the length of the input lists test_list1 and test_list2.
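
As a variation, the selectors passed to compress() do not have to be a list; any iterable works, including a generator expression, which avoids building the intermediate list of booleans. This is a minimal sketch on shortened copies of the lists, and the variable names selectors and truncated are chosen here for illustration.

```python
from itertools import compress

test_list1 = ["Gfg", "is", "not", "best"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok"]
sub_str = "ok"

# generator expression: selectors are produced lazily, one at a time
selectors = (sub_str in ele for ele in test_list2)
res = list(compress(test_list1, selectors))
print(res)

# compress() stops when the shorter of its two inputs is exhausted
truncated = list(compress("ABCDEF", [1, 0, 1]))
print(truncated)
```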

Method #6: Using map() function with lambda function and filter() function

This method uses the map() function along with the lambda function to extract only the first element of each tuple returned by the filter() function. The filter() function uses a lambda function to check if the substring is present in the second element of each tuple returned by the zip() function.

Follow the below steps to implement the above idea:

Initialize two lists test_list1 and test_list2 with some string elements.
Initialize a substring sub_str.
Use the zip() function to combine the two lists test_list1 and test_list2 into an iterator of tuples where the i-th tuple contains the i-th element from both lists.
Use the filter() function with a lambda function to filter out the tuples whose second element (i.e., the string from test_list2) does not contain the substring sub_str. The lambda function returns True if the substring is present and False otherwise.
Convert the filtered tuples to a list and store it in the res variable.
Use the map() function with a lambda function to extract the first element of each tuple in res. The lambda function simply returns the first element of each tuple.
Convert the extracted elements to a list and store it in the extracted_list variable.
Print the extracted_list variable.

Below is the implementation of the above approach:

Python3

test_list1 = ["Gfg", "is", "not", "best",
              "and", "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", 
              "looks ok", "ok", "wrong", "ok", "thats ok"]
 
sub_str = "ok"
 
res = list(filter(lambda x: x[1].find(sub_str) != -1,
                  zip(test_list1, test_list2)))
extracted_list = list(map(lambda x: x[0], res))
 
print("The extracted list : " + str(extracted_list))

Output

The extracted list : ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time complexity: O(n), where n is the length of the longer of the two lists test_list1 and test_list2.
Auxiliary space: O(m), where m is the number of elements in the extracted_list.
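
The same pipeline can also be written without the intermediate list of tuples, since filter() and map() both return lazy iterators in Python 3. The sketch below is one possible variant; it swaps the lambda passed to map() for operator.itemgetter(0) and uses the in operator instead of find(), so it is an alternative formulation rather than the code above.

```python
from operator import itemgetter

test_list1 = ["Gfg", "is", "not", "best",
              "and", "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong",
              "looks ok", "ok", "wrong", "ok", "thats ok"]
sub_str = "ok"

# zip -> filter -> map are chained lazily; only the final list is materialized
pairs = zip(test_list1, test_list2)
matching = filter(lambda pair: sub_str in pair[1], pairs)
extracted_list = list(map(itemgetter(0), matching))

print("The extracted list : " + str(extracted_list))
```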

Method #7 : Using numpy and string comparison

Import the numpy library.
Define the lists test_list1 and test_list2 containing the elements to be compared.
Define the substring sub_str that we want to find in the elements of test_list2.
Convert test_list1 and test_list2 to numpy arrays using the np.array() function, creating test_array1 and test_array2.
Use the np.char.find() function to find the index of the first occurrence of the substring sub_str in each element of test_array2. This returns an integer array holding that index for each element, or -1 where the substring is absent.
Compare that array with -1 to obtain a boolean array, np.char.find(test_array2, sub_str) != -1, and use it for boolean indexing on test_array1. This keeps only the elements of test_array1 whose positions hold True in the boolean array.
Convert the resulting numpy array back to a Python list using the tolist() method, creating extracted_list.
Print the extracted list by converting it to a string and concatenating it with the rest of the output message.

Python3

import numpy as np
 
test_list1 = ["Gfg", "is", "not", "best", "and",
              "not", "for", "CS"]
test_list2 = ["Its ok", "all ok", "wrong", "looks ok", 
              "ok", "wrong", "ok", "thats ok"]
sub_str = "ok"
 
test_array1 = np.array(test_list1)
test_array2 = np.array(test_list2)
 
extracted_list = test_array1[np.char.find(test_array2, sub_str) != -1].tolist()
 
print("The extracted list: " + str(extracted_list))

Output

The extracted list: ['Gfg', 'is', 'best', 'and', 'for', 'CS']

Time complexity: O(n), where n is the length of the lists test_list1 and test_list2.

Auxiliary space complexity: O(n), where n is the length of the lists test_list1 and test_list2, due to the creation of numpy arrays.
