
Python – Split Strings on Prefix Occurrence

Given a list of strings, split it into sublists, starting a new sublist at each string that begins with the given prefix.

Input : test_list = ["geeksforgeeks", "best", "geeks", "and"], pref = "geek"
Output : [['geeksforgeeks', 'best'], ['geeks', 'and']]
Explanation : A new sublist starts at each string that begins with the prefix "geek".

Input : test_list = ["good", "fruits", "goodness", "spreading"], pref = "good"
Output : [['good', 'fruits'], ['goodness', 'spreading']]
Explanation : A new sublist starts at each string that begins with the prefix "good".

Method #1 : Using loop + startswith()



In this, we iterate over the elements of the list and use startswith() to check each one for the prefix. When the prefix is found, a new sublist is started; otherwise the element is appended to the current sublist.




# Python3 code to demonstrate working of
# Split Strings on Prefix Occurrence
# Using loop + startswith()

# initializing list
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]

# printing original list
print("The original list is : " + str(test_list))

# initializing prefix
pref = "geek"

res = []
for val in test_list:

    # start a new sublist on a prefix match; also when res is still
    # empty, so res[-1] below never fails on leading non-matching strings
    if val.startswith(pref) or not res:
        res.append([val])
    else:

        # otherwise append to the current sublist
        res[-1].append(val)

# printing result
print("Prefix Split List : " + str(res))

Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]

Method #2 : Using loop + zip_longest() + startswith()

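In this, we pair each element with its successor using zip_longest() and append the current element to a temporary sublist. When the successor starts with the prefix, checked with startswith(), the temporary sublist is appended to the result and a new one is started.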




# Python3 code to demonstrate working of
# Split Strings on Prefix Occurrence
# Using loop + zip_longest() + startswith()
from itertools import zip_longest
 
# initializing list
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing prefix
pref = "geek"
 
 
res, temp = [], []
 
for x, y in zip_longest(test_list, test_list[1:]):
    temp.append(x)
     
    # if prefix is found, split and start new list
    if y and y.startswith(pref):
        res.append(temp)
        temp = []
# append the last sublist, skipping it when the input list was empty
if temp:
    res.append(temp)
 
# printing result
print("Prefix Split List : " + str(res))

Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]
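To see how the loop looks one element ahead, it may help to print the pairs that zip_longest() yields. The following is a small sketch on a shortened list; the name `pairs` is used only for illustration. The last element is paired with None, the default fill value, which is why the loop checks y before calling startswith().

```python
from itertools import zip_longest

test_list = ["geeksforgeeks", "best", "geeks", "and"]

# pair each element with its successor; the shorter iterable is padded with None
pairs = list(zip_longest(test_list, test_list[1:]))
print(pairs)
# [('geeksforgeeks', 'best'), ('best', 'geeks'), ('geeks', 'and'), ('and', None)]
```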

Method #3 : Using a function with a running sublist

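In this, we wrap the loop in a function that keeps a running sublist. Whenever a string starts with the prefix, the non-empty running sublist is appended to the result and a new one is started; every string is then added to the running sublist. No recursion is involved.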




def split_list_at_prefix(test_list, pref):
    # Initialize an empty list to hold the sublists.
    result = []
    # Initialize an empty sublist to hold the strings that come before the prefix.
    sublist = []
    # Iterate over the input list of strings.
    for string in test_list:
        # For each string, check if it starts with the given prefix.
        if string.startswith(pref):
            # If the string starts with the prefix, add the current sublist to the result list,
            # and create a new empty sublist.
            if sublist:
                result.append(sublist)
                sublist = []
        # Add the string to the current sublist in either case.
        sublist.append(string)
    # After the loop, add the final sublist to the result list.
    if sublist:
        result.append(sublist)
    # Return the result list.
    return result
test_list = ['geeksforgeeks', 'best', 'geeks', 'and']
prefix = 'geek'
result = split_list_at_prefix(test_list, prefix)
print(result)

Output
[['geeksforgeeks', 'best'], ['geeks', 'and']]

Time complexity: O(n)
Auxiliary space: O(n)

Method #4 : Using loop + find() method

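In this, we check for the prefix using find(), which returns 0 when the string begins with the prefix, and otherwise proceed as in Method #1.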




# Python3 code to demonstrate working of
# Split Strings on Prefix Occurrence
# Using loop + find()
 
# initializing list
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing prefix
pref = "geek"
res = []
for val in test_list:
    # start a new sublist on a prefix match, or when no sublist exists yet
    # (so that res[-1] below cannot fail on an empty res)
    if val.find(pref) == 0 or not res:
        res.append([val])
    else:
        # else append in current list
        res[-1].append(val)
# printing result
print("Prefix Split List : " + str(res))

Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]

Time Complexity: O(N), where N is the length of the list
Auxiliary Space: O(N)
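The comparison with 0 matters because find() returns the index of the first occurrence anywhere in the string, and -1 when there is none. Here is a short sketch of the three cases, with sample strings chosen only for illustration:

```python
pref = "geek"

# prefix at the start: index 0
print("geeksforgeeks".find(pref))   # 0
# prefix present but not at the start: a positive index
print("forgeeks".find(pref))        # 3
# prefix absent: -1
print("best".find(pref))            # -1

# for these strings, find(pref) == 0 agrees with startswith(pref)
samples = ["geeksforgeeks", "forgeeks", "best"]
agree = all((s.find(pref) == 0) == s.startswith(pref) for s in samples)
print(agree)                        # True
```

Unlike startswith(), find() keeps scanning the rest of the string when the prefix is not at the start, so startswith() is usually the more direct check.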

Method #5 : Using itertools.groupby()

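In this, groupby() splits the list into runs of consecutive strings that either all start with the prefix or all do not. Each string in a matching run starts a new sublist, and a non-matching run is appended to the most recent sublist.




# Python3 code to demonstrate working of
# Split Strings on Prefix Occurrence
# Using itertools.groupby()

from itertools import groupby

# initializing list
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]

# printing original list
print("The original list is : " + str(test_list))

# initializing prefix
pref = "geek"

# Using groupby() + startswith()
res = []
for k, g in groupby(test_list, lambda x: x.startswith(pref)):
    if k:
        # every string in a matching run starts its own sublist
        res.extend([x] for x in g)
    elif res:
        # a non-matching run joins the most recent sublist; strings
        # before the first match are skipped
        res[-1].extend(g)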
 
# printing result
print("Prefix Split List : " + str(res))
#This code is contributed by Jyothi Pinjala.

Output
The original list is : ['geeksforgeeks', 'best', 'geeks', 'and', 'geeks', 'love', 'CS']
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]

Time complexity: O(n), where n is the length of the input list, since each element is visited once.
Auxiliary space: O(n), since the sublists in the result together hold up to n strings.
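groupby() yields one (key, group) pair per run of consecutive elements that share the same key. As an illustration (the name `runs` is not part of the method above), listing the runs for the sample list shows the alternating True and False groups that the loop stitches together:

```python
from itertools import groupby

test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
pref = "geek"

# materialise each run as (key, list of members)
runs = [(k, list(g)) for k, g in groupby(test_list, lambda x: x.startswith(pref))]
for run in runs:
    print(run)
# (True, ['geeksforgeeks'])
# (False, ['best'])
# (True, ['geeks'])
# (False, ['and'])
# (True, ['geeks'])
# (False, ['love', 'CS'])
```

Two adjacent strings that both start with the prefix would fall into a single True run, so code built on groupby() has to decide whether such a run is one sublist or several.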

Method #6: Using the reduce() function from the functools module




from functools import reduce
 
# initializing list of strings
test_list = ["geeksforgeeks", "best", "geeks", "and", "geeks", "love", "CS"]
 
# initializing prefix
pref = "geek"
 
# define function to split list of strings on prefix occurrence
def split_on_prefix(res, x):
    if x.startswith(pref):
        res.append([x])
    elif res:
        res[-1].append(x)
    return res
 
# apply function using reduce()
res = reduce(split_on_prefix, test_list, [])
 
# print result
print("Prefix Split List : " + str(res))

Output
Prefix Split List : [['geeksforgeeks', 'best'], ['geeks', 'and'], ['geeks', 'love', 'CS']]

Time complexity: O(n), where n is the length of the input list of strings, since the list is traversed only once.
Auxiliary space: O(n), since the sublists in the result together hold up to n strings; the number of sublists equals the number of prefix occurrences.
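Because of the `elif res` branch, strings that appear before the first prefix match have no sublist to join and are left out of the result. Below is a minimal sketch of that behaviour, wrapping the same logic in a hypothetical helper `split_with_reduce` that takes the prefix as a parameter instead of reading a global variable:

```python
from functools import reduce

def split_with_reduce(strings, pref):
    # hypothetical wrapper around the reduce() approach shown above
    def step(res, x):
        if x.startswith(pref):
            res.append([x])      # prefix match: open a new sublist
        elif res:
            res[-1].append(x)    # otherwise extend the latest sublist, if any
        return res
    return reduce(step, strings, [])

print(split_with_reduce(["the", "geeks", "and", "geeksforgeeks"], "geek"))
# [['geeks', 'and'], ['geeksforgeeks']]
print(split_with_reduce([], "geek"))
# []
```

Whether leading strings should be dropped or kept as their own first sublist depends on the use case; Method #3, for instance, keeps them.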

