Open In App

Python – Retain smallest subsets from strings

Last Updated : 20 Feb, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Given a set of strings, the task is to write a python program to retain strings from sets that are the smallest possible subset of strings found. 

Input : test_set = {‘cbabca’, ‘cba’, ‘bdb’, ‘bdbab’, ‘abcx’}

Output : {‘bdb’, ‘abcx’, ‘cba’}

Explanation : bdbab is removed as bdb ( smaller subset ) is retained.

Input : test_set = {‘cbabca’, ‘cba’,  ‘bdbab’, ‘abcx’}

Output : {‘bdbab’, ‘abcx’, ‘cba’}

Explanation : cbabca is removed as cba ( smaller subset ) is retained.

Method : Using sorted() + any() + string slicing 

In this, we perform the task of getting the smallest of substrings by sorting the substring set and use any() to test if any of the subset matches the substring of string present in strings extracted as results smaller than the current string, and also a subset of a string. 

Python3




# Python3 code to demonstrate working of
# Retain smallest subsets from string
# Using sorted() + any() + string slicing
 
# initializing strings set
test_set = {'cbabca', 'cba', 'bdb', 'bdbab', 'abcx'}
 
# printing original string
print("The original set is : " + str(test_set))
 
res = set()
for st_r in sorted(test_set, key=len):
 
    # getting smallest set and checking for already
    # present smaller set for subset
    if not any(st_r[idx: idx + y + 1] in res
               for idx in range(len(st_r))
               for y in range(len(st_r) - idx)):
        res.add(st_r)
 
# printing result
print("Required extracted strings : " + str(res))


Output:

The original set is : {'cba', 'abcx', 'bdb', 'bdbab', 'cbabca'}
Required extracted strings : {'abcx', 'cba', 'bdb'}

Time complexity :  O(N^3log N) where N is the length of the longest string in the set. This is because the code iterates over all strings in the set, and for each string, it generates all possible substrings and checks if they are already in the result set, which takes O(N^2) time. The sorted() function takes O(Nlog N) time.

Space complexity : O(N^2), which is the maximum size of the result set.


Like Article
Suggest improvement
Previous
Next
Share your thoughts in the comments

Similar Reads