Python – All occurrences of Substring from the list of strings

Last Updated : 20 Mar, 2023

Given a list of strings and a list of substrings, the task is to extract every string that contains all of the substrings.

Examples:

Input : test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]

subs_list = ["gfg", "CS"]

Output : ['gfg is good for CS', 'gfg is recommended for CS']

Explanation : The result strings contain both "gfg" and "CS".

Input : test_list = ["gfg is best", "gfg is recommended for CS"]

subs_list = ["gfg"]

Output : ['gfg is best', 'gfg is recommended for CS']

Explanation : The result strings contain "gfg".

Method #1 : Using loop + in operator

This approach uses nested loops: the outer loop iterates over the strings and the inner loop iterates over the substrings. The in operator checks whether each substring occurs in the current string, and a string is kept only if none of the substrings is missing.

Python3

# Python3 code to demonstrate working of 
# Strings with all Substring Matches
# Using loop + in operator
 
# initializing list
test_list = ["gfg is best", "gfg is good for CS",
             "gfg is recommended for CS"] 
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing Substring List 
subs_list = ["gfg", "CS"]
 
res = []
for sub in test_list:
    flag = 0
    for ele in subs_list:
         
        # checking for non existence of 
        # any string 
        if ele not in sub:
            flag = 1
            break
    if flag == 0:
        res.append(sub)
 
# printing result 
print("The extracted values : " + str(res))

Output:

The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']

Time Complexity: O(n * m) substring checks, where n is the number of strings and m is the number of substrings

Auxiliary Space: O(n)
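The flag variable in the loop above can be avoided with Python's for ... else construct, in which the else branch runs only when the inner loop finishes without hitting break. The following is a minimal sketch of that variant; the helper name strings_with_all_subs is hypothetical and not part of the original code.

```python
def strings_with_all_subs(strings, subs):
    """Return the strings that contain every substring in subs."""
    res = []
    for s in strings:
        for ele in subs:
            if ele not in s:
                break  # one substring is missing, so skip this string
        else:
            # the inner loop ended without break: every substring was found
            res.append(s)
    return res


test_list = ["gfg is best", "gfg is recommended for CS"]
print(strings_with_all_subs(test_list, ["gfg"]))
# ['gfg is best', 'gfg is recommended for CS']
```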

Method #2 : Using all() + list comprehension

This is a one-liner approach. all() checks that every substring occurs in a string, and a list comprehension iterates over the strings.

Python3

# Python3 code to demonstrate working of 
# Strings with all Substring Matches
# Using all() + list comprehension
 
# initializing list
test_list = ["gfg is best", "gfg is good for CS",
             "gfg is recommended for CS"] 
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing Substring List 
subs_list = ["gfg", "CS"]
 
# using all() to check for all values
res = [sub for sub in test_list 
       if all((ele in sub) for ele in subs_list)]
 
# printing result 
print("The extracted values : " + str(res))

Output:

The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']

Time Complexity: O(n * m) substring checks, where n is the number of strings and m is the number of substrings

Auxiliary Space: O(n)
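One property of all() is worth keeping in mind with this approach: it returns True for an empty iterable, so an empty substring list keeps every string. The sketch below illustrates this, assuming a hypothetical wrapper named filter_all that is not part of the original code.

```python
def filter_all(strings, subs):
    """Keep the strings that contain every element of subs."""
    return [s for s in strings if all(ele in s for ele in subs)]


test_list = ["gfg is best", "gfg is good for CS",
             "gfg is recommended for CS"]

# an empty substring list filters nothing, because all() of nothing is True
print(filter_all(test_list, []) == test_list)   # True

# only the strings containing both "gfg" and "CS" are kept
print(filter_all(test_list, ["gfg", "CS"]))
```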

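Method #3: Using Counter() function

In this method, each string is split into words with split() and the words are counted with Counter(). A string is kept if every element of subs_list appears as a key of the counter. Note that this matches whole words only, so it agrees with the substring-based methods above only when each substring is a complete word of the string.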

Python3

# Python3 code to demonstrate working of
# Strings with all Substring Matches
# Using Counter() function
from collections import Counter
# initializing list
test_list = ["gfg is best", "gfg is good for CS",
             "gfg is recommended for CS"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing Substring List
subs_list = ["gfg", "CS"]
 
res = []
for sub in test_list:
    flag = 0
    freq = Counter(sub.split())
    for ele in subs_list:
 
        # checking for non existence of
        # any string
        if ele not in freq:
            flag = 1
            break
    if flag == 0:
        res.append(sub)
 
# printing result
print("The extracted values : " + str(res))

Output

The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']

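Time Complexity: roughly O(n * (w + m)), where n is the number of strings, w is the number of words per string and m is the number of substrings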

Auxiliary Space: O(n)
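To see how splitting into words differs from the substring test of Methods #1 and #2, consider a string in which "CS" occurs only inside a longer word. This is a small sketch with a made-up input string that does not appear in the original example.

```python
from collections import Counter

# hypothetical input: "CS" appears only as part of the word "CSE"
text = "gfg is good for CSE"

# substring test, as used in Methods #1 and #2
print("CS" in text)                    # True

# whole-word test, as used in Method #3
print("CS" in Counter(text.split()))   # False
```

If partial matches such as "CS" inside "CSE" should count, the in operator on the string is the appropriate test.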

Method #4: Using set() and issubset() function

Step by step Algorithm:

Initialize the original list test_list and the set of substrings subs_list.
Create a new empty list res to store the extracted values.
Loop through each string sub in test_list.
Convert sub into a set of words, sub_words, using split() and set().
Check whether subs_list is a subset of sub_words using the issubset() function.
If it is, append sub to res.
After looping through all the strings in test_list, print the extracted values in res.

Python3

test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]
subs_list = {"gfg", "CS"}
 
# printing original list
print("The original list is : " + str(test_list))
 
res = [sub for sub in test_list if subs_list.issubset(set(sub.split()))]
print("The extracted values : " + str(res))

Output

The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']
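The steps listed for this method correspond to an explicit loop, and the list comprehension in the code is a compact form of it. A sketch of the loop, using sub_words as the intermediate name from the steps:

```python
test_list = ["gfg is best", "gfg is good for CS",
             "gfg is recommended for CS"]
subs_list = {"gfg", "CS"}

res = []
for sub in test_list:
    # set of whitespace-separated words in the current string
    sub_words = set(sub.split())
    # keep the string only if every required word is present
    if subs_list.issubset(sub_words):
        res.append(sub)

print("The extracted values : " + str(res))
```

Because the comparison is made on whole words, this method, like Method #3, does not match a substring that occurs only inside a longer word.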

Time Complexity: O(n * (m + k))

The loop through test_list runs n times, where n is the length of test_list. For each string, building the set of words with split() takes O(m) time, where m is the number of words in the string, and issubset() then takes O(k) time on average, where k is the number of substrings in subs_list. These two steps run one after the other rather than nested, so the overall time complexity of the algorithm is O(n * (m + k)).

Space Complexity:

The space used by the res list is O(n * L), where n is the length of test_list and L is the average length of the strings in the list. The temporary set of words built for each string adds O(m).
