Python | Remove Redundant Substrings from Strings List
Given a list of strings, the task is to remove every string that is a substring of another string in the list.
Input : test_list = ["Gfg", "Gfg is best", "Geeks", "for", "Gfg is for Geeks"]
Output : ['Gfg is best', 'Gfg is for Geeks']
Explanation : "Gfg", "for" and "Geeks" are present as substrings in other strings.
Input : test_list = ["Gfg", "Geeks", "for", "Gfg is for Geeks"]
Output : ['Gfg is for Geeks']
Explanation : "Gfg", "for" and "Geeks" are present as substrings in other strings.
Method #1 : Using enumerate() + join() + sort()
The combination of the above functions can be used to solve this problem. The list is first sorted by length, then each string is checked against the longer strings that follow it. If it occurs as a substring of any of them, it is excluded from the filtered result.
Python3
# Initialize the list
test_list = ["Gfg", "Gfg is best", "Geeks", "Gfg is for Geeks"]

print("The original list : " + str(test_list))

# Sort by length so any string that could contain val comes after it
test_list.sort(key=len)

res = []
for idx, val in enumerate(test_list):
    # Keep val only if it does not occur in the joined text of the later strings
    if val not in ', '.join(test_list[idx + 1:]):
        res.append(val)

print("The filtered list : " + str(res))
Output
The original list : ['Gfg', 'Gfg is best', 'Geeks', 'Gfg is for Geeks']
The filtered list : ['Gfg is best', 'Gfg is for Geeks']
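Time complexity: O(n log n) for the sort plus roughly O(n^2 * m) for the checks, where n is the number of strings in test_list and m is the maximum string length. Each of the n iterations builds and searches a joined string of up to n * m characters, so the checks dominate the sort.
Auxiliary Space: O(n * m), for the joined string built on each iteration and the result list.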
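One caveat with joining the later strings: the ', ' separator becomes part of the searched text, so a string that itself contains ', ' can match across the boundary between two neighbours even though neither of them contains it. A minimal sketch of one way around this, assuming the same length-sorted approach, checks each later string separately with any(). The name filter_substrings and the sample words are illustrative and not part of the original method.

```python
def filter_substrings(strings):
    """Keep the strings that are not contained in any later string (sorted by length)."""
    ordered = sorted(strings, key=len)  # sorted() leaves the caller's list untouched
    return [
        val
        for idx, val in enumerate(ordered)
        # Test every later string on its own instead of one joined string
        if not any(val in other for other in ordered[idx + 1:])
    ]


words = ["a, b", "pizza", "burger"]
ordered = sorted(words, key=len)

# join()-based check: "a, b" matches inside "pizza, burger" and is dropped by mistake
joined = [v for i, v in enumerate(ordered) if v not in ", ".join(ordered[i + 1:])]
print(joined)                    # ['pizza', 'burger']

# Per-string check: "a, b" is in neither "pizza" nor "burger", so it is kept
print(filter_substrings(words))  # ['a, b', 'pizza', 'burger']
```

The per-string check also avoids building a new joined string on every iteration.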
Method #2 : Using list comprehension + join() + enumerate()
The combination of the above functions can be used to solve this problem. The task is performed in the same way as above; the only difference is that the loop is written more compactly as a list comprehension.
Python3
# Initialize the list
test_list = ["Gfg", "Gfg is best", "Geeks", "Gfg is for Geeks"]

print("The original list : " + str(test_list))

# Sort by length, then keep each string that is absent from the joined later strings
test_list.sort(key=len)
res = [val for idx, val in enumerate(test_list) if val not in ', '.join(test_list[idx + 1:])]

print("The filtered list : " + str(res))
Output
The original list : ['Gfg', 'Gfg is best', 'Geeks', 'Gfg is for Geeks']
The filtered list : ['Gfg is best', 'Gfg is for Geeks']
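The list comprehension performs the same sort, join() and substring checks as an explicit loop, so the complexity is unchanged (n is the number of strings, m the maximum string length):
Time Complexity: O(n log n) for the sort plus roughly O(n^2 * m) for the checks
Space Complexity: O(n * m)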
Method #3 : Using recursion
Algorithm
- If the list has at most one string, return it unchanged (base case).
- Sort the list of strings by length and remove the shortest string.
- Recursively filter the remaining, longer strings.
- If the removed string is a substring of any string in the filtered result, discard it; otherwise append it to the result.
- Return the result list.
Python3
def remove_redundant_substrings(strings):
    # Base case: an empty or single-element list has nothing to filter
    if len(strings) <= 1:
        return strings
    # Sort by length so the first element is the shortest (this modifies the list in place)
    strings.sort(key=len)
    current_string = strings.pop(0)
    # Filter the longer strings first
    remaining_strings = remove_redundant_substrings(strings)
    # Drop current_string if any surviving string contains it
    for string in remaining_strings:
        if current_string in string:
            return remaining_strings
    remaining_strings.append(current_string)
    return remaining_strings


test_list = ["Gfg", "Gfg is best", "Geeks", "Gfg is for Geeks"]
print("The original list : " + str(test_list))
res = remove_redundant_substrings(test_list)
print("The filtered list : " + str(res))
Output
The original list : ['Gfg', 'Gfg is best', 'Geeks', 'Gfg is for Geeks']
The filtered list : ['Gfg is for Geeks', 'Gfg is best']
The time complexity of this algorithm is O(n^2 * m), where n is the number of strings in the input list and m is the maximum length of a string in the list. The worst case occurs when none of the strings is a substring of any other, so each string has to be checked against every longer string in the list. The initial sort takes O(n log n) time, and the roughly n^2 / 2 substring checks take at least O(m) time each, so the checks dominate and the overall time complexity is O(n^2 * m).
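To make the quadratic term concrete, the sketch below counts substring checks on inputs where no string contains another. It uses a loop-based equivalent of the filter rather than the recursive function itself, and the name count_containment_checks and the generated "<0>", "<1>", ... inputs are illustrative assumptions.

```python
def count_containment_checks(strings):
    """Filter out substrings and report how many `in` checks were needed."""
    ordered = sorted(strings, key=len)
    kept, checks = [], 0
    for idx, current in enumerate(ordered):
        redundant = False
        for later in ordered[idx + 1:]:
            checks += 1
            if current in later:
                redundant = True
                break  # one containing string is enough
        if not redundant:
            kept.append(current)
    return kept, checks


for n in (4, 8, 16):
    # The angle brackets ensure that no generated string contains another
    data = [f"<{i}>" for i in range(n)]
    kept, checks = count_containment_checks(data)
    print(n, len(kept), checks)
# 4 4 6
# 8 8 28
# 16 16 120
```

Doubling n roughly quadruples the number of checks, which matches n * (n - 1) / 2 comparisons in the worst case.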
The auxiliary space of this algorithm is O(n * m) if the storage of the strings themselves is counted: the list holds up to n strings of length up to m, and the result can contain all of them when none of the strings is a redundant substring of another. The recursion also adds up to n stack frames.
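Note that the recursive function in Method #3 sorts and pops from the list it receives, so the caller's list is modified and ends up being the same object as the result. When the input must be preserved, a variant along the following lines may help. It is a sketch under the assumption that slicing copies are acceptable, and the names remove_redundant_substrings_pure and _filter_sorted are hypothetical.

```python
def _filter_sorted(ordered):
    """Recursively filter a list that is already sorted by length."""
    if len(ordered) <= 1:
        return list(ordered)
    # Slicing builds new lists, so nothing is popped from the input
    shortest, rest = ordered[0], ordered[1:]
    kept = _filter_sorted(rest)
    if any(shortest in other for other in kept):
        return kept              # shortest is redundant
    return kept + [shortest]


def remove_redundant_substrings_pure(strings):
    """Recursive filter that leaves the caller's list unchanged."""
    return _filter_sorted(sorted(strings, key=len))  # sorted() returns a new list


test_list = ["Gfg", "Gfg is best", "Geeks", "Gfg is for Geeks"]
res = remove_redundant_substrings_pure(test_list)
print(res)        # ['Gfg is for Geeks', 'Gfg is best']
print(test_list)  # ['Gfg', 'Gfg is best', 'Geeks', 'Gfg is for Geeks']
```

Sorting once in the wrapper also avoids re-sorting on every recursive call. Like any recursion that goes one level deeper per element, this can hit Python's recursion limit on very long lists.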
Last Updated :
07 Apr, 2023