
Python Program to Group Strings by K length Using Suffix

Given a list of strings, the task is to write a Python program that groups them by their K-length suffix, that is, by the last K characters of each string.

Input : test_list = ["food", "peak", "geek", "good", "weak", "sneek"], K = 3
Output : {'ood': ['food', 'good'], 'eak': ['peak', 'weak'], 'eek': ['geek', 'sneek']}
Explanation : The words ending with "ood" are "food" and "good", so they are grouped together.



Input : test_list = ["peak", "geek", "good", "weak"], K = 3
Output : {'ood': ['good'], 'eak': ['peak', 'weak'], 'eek': ['geek']}
Explanation : The only word ending with "ood" is "good", so it forms a group of its own.
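
The grouping key in every method below is the last K characters of each word, obtained with a negative slice. The following is a minimal sketch of how that slice behaves; the short word "ok" is an assumed input that does not appear in the examples above, included only to show what happens when a word is shorter than K.

```python
K = 3

# A negative slice keeps the last K characters of the string.
suffix_long = "sneek"[-K:]

# If the word is shorter than K, the slice returns the whole word
# instead of raising an error.
suffix_short = "ok"[-K:]

print(suffix_long)   # eek
print(suffix_short)  # ok
```

So a word shorter than K ends up keyed by the whole word rather than causing an exception.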

Method 1 : Using try/except + loop



In this method, we extract the last K characters of each string and append the string to the list stored under that suffix. If the suffix is not yet a key in the dictionary, the lookup raises a KeyError, and the except block creates the key with a list that contains this first word.




# Python3 code to demonstrate working of
# Group Strings by K length Suffix
# Using try/except + loop
 
# initializing list
test_list = ["food", "peak", "geek",
             "good", "weak", "sneek"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 3
 
res = {}
for ele in test_list:
     
    # extracting suffix
    suff = ele[-K : ]
     
    # appending if key found, else creating new one
    try:
        res[suff].append(ele)
    except KeyError:
        res[suff] = [ele]
         
# printing result
print("The grouped suffix Strings : " + str(res))

Output
The original list is : ['food', 'peak', 'geek', 'good', 'weak', 'sneek']
The grouped suffix Strings : {'ood': ['food', 'good'], 'eak': ['peak', 'weak'], 'eek': ['geek', 'sneek']}

Time Complexity: O(n), where n is the number of strings (slicing a K-length suffix is treated as constant time)
Auxiliary Space: O(n)

Method 2 : Using defaultdict() + loop

This method avoids the try/except block, because defaultdict(list) automatically creates an empty list the first time a missing key is accessed.




# Python3 code to demonstrate working of
# Group Strings by K length Suffix
# Using defaultdict() + loop
from collections import defaultdict
 
# initializing list
test_list = ["food", "peak", "geek",
             "good", "weak", "sneek"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 3
 
res = defaultdict(list)
for ele in test_list:
     
    # extracting suffix
    suff = ele[-K : ]
     
    # appending into matched suffix key
    res[suff].append(ele)
         
# printing result
print("The grouped suffix Strings : " + str(dict(res)))

Output
The original list is : ['food', 'peak', 'geek', 'good', 'weak', 'sneek']
The grouped suffix Strings : {'ood': ['food', 'good'], 'eak': ['peak', 'weak'], 'eek': ['geek', 'sneek']}

Time Complexity: O(n), where n is the number of strings (slicing a K-length suffix is treated as constant time)
Auxiliary Space: O(n)
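
To reuse this approach, the loop can be wrapped in a function. The sketch below is one possible way to do that: the name group_by_suffix and the check on k are assumptions added for illustration and are not part of the original program. The check is there because word[-0:] is the same as word[0:], which returns the whole word rather than an empty suffix.

```python
from collections import defaultdict


def group_by_suffix(words, k):
    """Group words by their last k characters (hypothetical helper)."""
    if k <= 0:
        # word[-0:] would return the whole word, so reject non-positive k
        raise ValueError("k must be a positive integer")

    groups = defaultdict(list)
    for word in words:
        # defaultdict creates the empty list on first access
        groups[word[-k:]].append(word)

    # convert to a plain dict so missing keys raise KeyError again
    return dict(groups)


print(group_by_suffix(["peak", "geek", "good", "weak"], 3))
# {'eak': ['peak', 'weak'], 'eek': ['geek'], 'ood': ['good']}
```

Returning a plain dict keeps callers from accidentally creating empty groups when they look up a suffix that was never seen.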

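Method 3 : Using for loops and endswith()

In this method, we first collect the distinct K-length suffixes in order of first appearance. Then, for each suffix, we scan the list again and collect every string for which endswith() returns True.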




# Python3 code to demonstrate working of
# Group Strings by K length Suffix
 
# initializing list
test_list = ["food", "peak", "geek",
            "good", "weak", "sneek"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 3
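res = dict()
 
# collecting the distinct suffixes in order of first appearance
x = []
for i in test_list:
    if i[-K:] not in x:
        x.append(i[-K:])
 
# for each suffix, gathering the strings that end with it
for i in x:
    a = []
    for j in test_list:
        if j.endswith(i):
            a.append(j)
    res[i] = a
 
# printing result
print(res)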

Output
The original list is : ['food', 'peak', 'geek', 'good', 'weak', 'sneek']
{'ood': ['food', 'good'], 'eak': ['peak', 'weak'], 'eek': ['geek', 'sneek']}

Time Complexity: O(n*n) in the worst case, which occurs when every string has a distinct suffix
Auxiliary Space: O(n)
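
One caveat with matching by endswith(): if a string is shorter than K, its "suffix" is the whole string, and longer strings that happen to end with it are pulled into the same group. The sketch below illustrates this with an assumed input containing the short word "ak" (not taken from the examples above) and compares it with an exact comparison of the sliced suffix.

```python
K = 3
words = ["peak", "ak"]  # "ak" is shorter than K (hypothetical input)

# distinct suffixes in order of first appearance: ['eak', 'ak']
suffixes = []
for w in words:
    if w[-K:] not in suffixes:
        suffixes.append(w[-K:])

# matching with endswith(): "peak" also ends with "ak"
by_endswith = {s: [w for w in words if w.endswith(s)] for s in suffixes}

# matching on the exact K-length slice instead
by_slice = {s: [w for w in words if w[-K:] == s] for s in suffixes}

print(by_endswith)  # {'eak': ['peak'], 'ak': ['peak', 'ak']}
print(by_slice)     # {'eak': ['peak'], 'ak': ['ak']}
```

When every string has at least K characters, as in the examples in this article, both comparisons give the same groups.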

Method 4 : Using Dictionary and List Comprehensions

This approach combines a dictionary comprehension with a list comprehension to group the strings by their K-length suffix. We first build the set of unique K-length suffixes from the strings in the input list. Then, using a dictionary comprehension, we map each suffix to a list of all the strings that end with that suffix. Because the suffixes come from a set, the order of the keys in the result is not guaranteed and may differ between runs.




# Python3 code to demonstrate working of
# Group Strings by K length Suffix
 
# initializing list
test_list = ["food", "peak", "geek",
            "good", "weak", "sneek"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 3
 
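# building the set of unique suffixes, then mapping each suffix
# to the list of strings that end with it
result = {suffix: [word for word in test_list if word.endswith(suffix)]
          for suffix in {word[-K:] for word in test_list}}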
 
# printing result
print("The grouped suffix Strings : " + str(result))

Output
The original list is : ['food', 'peak', 'geek', 'good', 'weak', 'sneek']
The grouped suffix Strings : {'ood': ['food', 'good'], 'eek': ['geek', 'sneek'], 'eak': ['peak', 'weak']}

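Time Complexity: O(n * s), where n is the length of the input list test_list and s is the number of distinct suffixes, because the list is scanned once per suffix. In the worst case, when every string has a distinct suffix, this is O(n*n).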
Auxiliary Space: O(n)

Method 5 : Using itertools.groupby()

Step-by-step approach:

  1. Import the itertools module for working with iterators.
  2. Sort the list of words by their last K characters using the key parameter of the sorted() function.
  3. Use the itertools.groupby() function to group the words by their last K characters.
  4. Create an empty dictionary to store the grouped suffix strings.
  5. Loop through the groups of words and add them to the dictionary.
  6. Print the final dictionary.




import itertools
 
# initializing list
test_list = ["food", "peak", "geek", "good", "weak", "sneek"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 3
 
# sort list by last K characters
sorted_list = sorted(test_list, key=lambda word: word[-K:])
 
# group words by suffix using itertools.groupby()
groups = itertools.groupby(sorted_list, key=lambda word: word[-K:])
 
# create empty dictionary to store grouped suffix strings
result = {}
 
# loop through groups and add them to dictionary
for suffix, words in groups:
    result[suffix] = list(words)
 
# printing result
print("The grouped suffix Strings : " + str(result))

Output
The original list is : ['food', 'peak', 'geek', 'good', 'weak', 'sneek']
The grouped suffix Strings : {'eak': ['peak', 'weak'], 'eek': ['geek', 'sneek'], 'ood': ['food', 'good']}

The time complexity of this method is O(n*log(n)), where n is the length of the original list, due to the sorting operation.
The auxiliary space used by this method is O(n), where n is the length of the original list, due to the creation of the sorted list.
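
The sorting step matters because itertools.groupby() only groups consecutive items that share a key. The sketch below, using a shortened assumed input list and a hypothetical helper named suffix, shows what happens when the sort is skipped.

```python
import itertools

K = 3
words = ["food", "peak", "good"]


def suffix(word):
    # key function: the last K characters of the word
    return word[-K:]


# Without sorting, "food" and "good" are not adjacent,
# so groupby() yields two separate 'ood' runs.
unsorted_runs = [(s, list(g)) for s, g in itertools.groupby(words, key=suffix)]

# After sorting by the same key, equal suffixes are adjacent
# and each suffix appears exactly once.
sorted_words = sorted(words, key=suffix)
sorted_runs = [(s, list(g)) for s, g in itertools.groupby(sorted_words, key=suffix)]

print(unsorted_runs)  # [('ood', ['food']), ('eak', ['peak']), ('ood', ['good'])]
print(sorted_runs)    # [('eak', ['peak']), ('ood', ['food', 'good'])]
```

If the unsorted runs were stored in a dictionary, the second 'ood' run would overwrite the first and "food" would be lost, which is why the list is sorted by suffix before grouping.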

