Python – Group Similar Start and End character words

Sometimes, while working with Python data, we need to group words by their first and last characters. This kind of task comes up in domains such as web development. Let's discuss a few ways to perform it.

Method #1: Using defaultdict() + loop

In this method, each word's first and last characters are joined (using string indexing) to form a dictionary key, and the rest of the word, obtained with the slice [1:-1], is added to the set stored under that key. A defaultdict(set) creates the empty set automatically the first time a key is seen.

Python3

# Python3 code to demonstrate working of
# Group Similar Start and End character words
# Using defaultdict() + loop

from collections import defaultdict


def end_check(word):
    # Build the grouping key: first character + last character
    word = word.strip()
    return word[0] + word[-1]


def front_check(word):
    # Return the interior of the word, without its first and last characters
    return word.strip()[1:-1]


# initializing string
test_str = 'geeksforgeeks is indias best and bright for geeks'

# printing original string
print("The original string is : " + str(test_str))

# Group Similar Start and End character words
# Using defaultdict() + loop
test_list = test_str.split()
res = defaultdict(set)
for ele in test_list:
    res[end_check(ele)].add(front_check(ele))

# printing result
print("The grouped dictionary is : " + str(dict(res)))

Output

The original string is : geeksforgeeks is indias best and bright for geeks
The grouped dictionary is : {'fr': {'o'}, 'bt': {'righ', 'es'}, 'ad': {'n'}, 'gs': {'eeksforgeek', 'eek'}, 'is': {'', 'ndia'}}

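Time Complexity: O(n), where n is the total number of characters in the input string.
Space Complexity: O(n), for the word list and the grouped dictionary.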
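
The empty string in the 'is' group comes from the two-letter word "is": once its first and last characters are used as the key, nothing is left in the middle. The short sketch below shows what the index and slice expressions return for words of different lengths. The helper name split_ends is introduced here only for illustration and is not part of the code above.

```python
def split_ends(word):
    """Return (key, middle) for a non-empty word.

    key    -> first character + last character
    middle -> everything between them (may be empty)
    """
    return word[0] + word[-1], word[1:-1]


for sample in ("geeks", "is", "a"):
    key, middle = split_ends(sample)
    print(repr(sample), "->", repr(key), repr(middle))
```

A one-letter word such as "a" is its own first and last character, so its key is 'aa'. An empty string would raise an IndexError at word[0], but split() called without arguments never produces empty words, so the loop above does not hit that case.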

Method #2: Using loop + split() + add()

Step-by-step approach:

Initialize the original string.
Convert the string into a list of words.
Initialize an empty dictionary to store the grouped words.
Iterate through the list of words.
a. Get the first and last character of the word as a tuple.
b. If the tuple is not already a key in the dictionary, add the key with an empty set as the value.
c. Add the word to the set corresponding to the key.
Print the original string and the grouped dictionary.

Python3

# define the original string
str1 = "geeksforgeeks is indias best and bright for geeks"

# convert the string into a list of words
word_list = str1.split()

# initialize an empty dictionary to store the grouped words
group_dict = {}

# iterate through the list of words
for word in word_list:
    # get the first and last character of the word as a tuple
    key = (word[0], word[-1])
    # if the tuple is not already a key in the dictionary,
    # add the key with an empty set as the value
    if key not in group_dict:
        group_dict[key] = set()
    # add the word to the set corresponding to the key
    group_dict[key].add(word)

# print the original string and the grouped dictionary
print("The original string is :", str1)
print("The grouped dictionary is :", group_dict)

Output

The original string is : geeksforgeeks is indias best and bright for geeks
The grouped dictionary is : {('g', 's'): {'geeksforgeeks', 'geeks'}, ('i', 's'): {'indias', 'is'}, ('b', 't'): {'bright', 'best'}, ('a', 'd'): {'and'}, ('f', 'r'): {'for'}}

Time Complexity:
The time complexity of the code is O(n), where n is the total number of characters in the input string. Splitting the string is one pass over it, and each word is then hashed and added to a set once, which costs time proportional to its length on average.

Space Complexity:
The space complexity of the code is also O(n), where n is the total number of characters in the input string. This is because we store the words in a list and again in the sets of the dictionary.
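
Steps b and c of the loop can also be folded into a single call with dict.setdefault(), which returns the existing set for a key or stores a new empty one first. The following is a minimal sketch of that variant, not a replacement for the code above; the function name group_by_ends is an assumption made for this example.

```python
def group_by_ends(text):
    """Group the words of `text` by (first character, last character)."""
    groups = {}
    for word in text.split():
        key = (word[0], word[-1])
        # setdefault returns the set already stored under `key`,
        # or stores and returns a new empty set if the key is missing
        groups.setdefault(key, set()).add(word)
    return groups


groups = group_by_ends("geeksforgeeks is indias best and bright for geeks")
print(sorted(groups[("g", "s")]))  # ['geeks', 'geeksforgeeks']
```

Wrapping the loop in a function makes it reusable for other strings; the result is the same dictionary that the explicit "if key not in group_dict" check builds.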

Method #3: Using dictionary comprehension + split()

Step-by-step approach:

Initialize the string test_str to the input string.
Print the original string.
Split the string into a list of words using the split() method and assign it to the variable test_list.
Use a dictionary comprehension to build res: each key is the first-and-last-character pair of a word (obtained by calling end_check() on it), and the corresponding value is the set of interior substrings (obtained by calling front_check()) of every word in test_list that produces the same key.
Print the resulting grouped dictionary.

Below is the implementation of the above approach:

Python3

# Python3 code to demonstrate working of
# Group Similar Start and End character words
# Using dictionary comprehension + split()

# Define end_check and front_check functions
def end_check(word):
    # Build the grouping key: first character + last character
    word = word.strip()
    return word[0] + word[-1]


def front_check(word):
    # Return the interior of the word, without its first and last characters
    return word.strip()[1:-1]


# initializing string
test_str = 'geeksforgeeks is indias best and bright for geeks'

# printing original string
print("The original string is : " + str(test_str))

# Group Similar Start and End character words
# Using dictionary comprehension + split()
test_list = test_str.split()
res = {
    end_check(ele): set(
        front_check(word) for word in test_list
        if end_check(word) == end_check(ele)
    )
    for ele in test_list
}

# printing result
print("The grouped dictionary is : " + str(res))

Output

The original string is : geeksforgeeks is indias best and bright for geeks
The grouped dictionary is : {'gs': {'eeksforgeek', 'eek'}, 'is': {'', 'ndia'}, 'bt': {'righ', 'es'}, 'ad': {'n'}, 'fr': {'o'}}

Time complexity: O(n^2), where n is the number of words, because the comprehension rescans the whole list test_list for every word (not just once per unique key).
Auxiliary space: O(n), because each word contributes to exactly one key, so the resulting dictionary holds at most n keys and at most n substrings in total.
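
If the repeated rescanning is a concern, one possible way to keep the comprehension style is to sort the words by key first and let itertools.groupby() collect each run of equal keys. This is a swapped-in technique rather than the method shown above; it costs O(n log n) for the sort, and the function name group_sorted is hypothetical. The helpers are redefined without strip() so the sketch is self-contained.

```python
from itertools import groupby


def end_check(word):
    # Grouping key: first character + last character
    return word[0] + word[-1]


def front_check(word):
    # Interior of the word, without its first and last characters
    return word[1:-1]


def group_sorted(text):
    # groupby only merges adjacent items, so sort by the same key first
    words = sorted(text.split(), key=end_check)
    return {
        key: {front_check(w) for w in run}
        for key, run in groupby(words, key=end_check)
    }


res = group_sorted("geeksforgeeks is indias best and bright for geeks")
print(list(res))  # ['ad', 'bt', 'fr', 'gs', 'is']
```

The groups are the same as in the output above, but the keys come out in sorted order instead of first-seen order.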
