
Python | Split by repeating substring

Last Updated : 23 Apr, 2023

Sometimes, while working with Python strings, we need to split a string that is made up of one substring repeated several times into a list of those repetitions. For example, "gfggfggfg" with the substring "gfg" should give ['gfg', 'gfg', 'gfg']. Let us discuss some ways in which this task can be performed.

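Method #1 : Using * operator + len()

This is one way to perform this task. We divide the length of the string by the length of the repeated substring to get the number of repetitions, then build the result by repeating the one-element list [K] with the * operator. Note that this approach never inspects the contents of the string, so it is only correct when the string consists entirely of repetitions of K.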

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
# Using * operator + len()
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
# Using * operator + len()
temp = len(test_str) // len(K)
res = [K] * temp
 
# printing result
print("The split string is : " + str(res))


Output : 

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
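Because Method #1 relies only on lengths, it can help to confirm first that the string really is a whole number of repetitions. The following is a minimal sketch of such a check; the helper name `split_repeats` is hypothetical and not part of the original code.

```python
def split_repeats(text, unit):
    """Return [unit, unit, ...] if text is exactly unit repeated, else raise ValueError."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    # Number of whole repetitions and any leftover characters.
    count, remainder = divmod(len(text), len(unit))
    # Rebuild the string from the unit and compare it with the input.
    if remainder != 0 or unit * count != text:
        raise ValueError("text is not a whole number of repetitions of unit")
    return [unit] * count


print(split_repeats("gfggfggfg", "gfg"))  # ['gfg', 'gfg', 'gfg']
```

With this check, an input such as "gfgx" raises an error instead of silently returning a wrong list.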

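Method #2 : Using re.findall()

This is another way to solve the problem. re.findall() scans the string from left to right and returns every non-overlapping match of the pattern as a list, so no explicit splitting is needed. If K may contain regex metacharacters such as "." or "*", it should be passed through re.escape() so that it is matched literally.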

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
# Using re.findall()
import re
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
# Using re.findall(); re.escape() makes K match literally
res = re.findall(re.escape(K), test_str)
 
# printing result
print("The split string is : " + str(res))


Output : 

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
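The pattern passed to re.findall() is interpreted as a regular expression, which matters when the target contains special characters. The short sketch below uses an illustrative input (not from the original example) to show the difference between an unescaped and an escaped target.

```python
import re

text = "a.ba.baxb"
unit = "a.b"

# Unescaped: "." matches any character, so "axb" is returned as well.
unescaped = re.findall(unit, text)
print(unescaped)  # ['a.b', 'a.b', 'axb']

# Escaped: only the literal text "a.b" is matched.
escaped = re.findall(re.escape(unit), text)
print(escaped)  # ['a.b', 'a.b']
```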

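Method #3 : Using count() method and * operator

In this, str.count() returns the number of non-overlapping occurrences of K in the string, and the * operator repeats the one-element list [K] that many times.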

Python3




# Python3 code to demonstrate working of
# Split by repeating substring
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring
count = test_str.count(K)
res = [K] * count
 
# printing result
print("The split string is : " + str(res))


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
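Unlike the length-based calculation in Method #1, count() looks at the actual contents of the string. The sketch below uses illustrative inputs (assumptions, not from the original example) to show how the two differ when other characters appear between repetitions, and that count() does not count overlapping occurrences.

```python
text = "gfg--gfg--gfg"
unit = "gfg"

# Length-based estimate: 13 // 3 == 4, which overcounts.
by_length = len(text) // len(unit)

# Content-based count: only the three real occurrences.
by_count = text.count(unit)

print(by_length, by_count)  # 4 3
print([unit] * by_count)    # ['gfg', 'gfg', 'gfg']

# count() is non-overlapping: "aaaa" contains "aa" twice, not three times.
print("aaaa".count("aa"))   # 2
```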

Methods #1 to #3 have the same time and space complexity, where n is the length of the input string:

Time Complexity: O(n)
Auxiliary Space: O(n)

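Method #4 : Using loop and slicing

This scans the string from left to right. Whenever the slice starting at the current index equals K, K is appended to the result and the index jumps past the match; otherwise the index advances by one character.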

Python3




# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# printing original string
print("The original string is : " + test_str)
 
# initializing target
K = 'gfg'
 
# Split by repeating substring using loop and slicing
res = []
start = 0
while start < len(test_str):
    end = start + len(K)
    if test_str[start:end] == K:
        res.append(K)
        start = end
    else:
        start += 1
 
# printing result
print("The split string is : " + str(res))
#This code is contributed by Vinay Pinjala.


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']

Time complexity: O(n * m) in the worst case, where n is the length of the input string and m is the length of K. The loop visits each position at most once, but every slice comparison can cost up to O(m). For a short, fixed K this is effectively O(n).
Auxiliary Space: O(n) in the worst case, since the result list can hold up to n // m entries.
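The same left-to-right scan can also be written with str.find(), which jumps straight to the next occurrence instead of advancing one character at a time. This is a sketch of that variant rather than the method above; the helper name `find_repeats` is hypothetical.

```python
def find_repeats(text, unit):
    """Collect non-overlapping occurrences of unit, left to right, using str.find()."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    result = []
    start = text.find(unit)
    while start != -1:
        result.append(unit)
        # Resume the search just past the current match.
        start = text.find(unit, start + len(unit))
    return result


print(find_repeats("gfggfggfggfggfggfggfggfg", "gfg"))
```

For the example string this produces the same eight-element list as the loop-and-slicing version.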

Method #5 : Using re.findall(), step by step

This is the same technique as Method #2, broken down into steps:

  1. Import the ‘re’ module, which provides regular expression support in Python.
  2. Initialize a string ‘test_str’ that consists of a repeated substring.
  3. Initialize the target string ‘K’ with the substring to split by.
  4. Call ‘re.findall()’ with ‘K’ as the pattern and ‘test_str’ as the string to search. It returns a list of all non-overlapping matches, found by scanning from left to right.
  5. Store the returned list in a variable named ‘res’.
  6. Print the original string by concatenating “The original string is : ” with ‘test_str’.
  7. Convert ‘res’ to a string with ‘str()’, concatenate it with “The split string is : ”, and print the result.

Python3




import re
 
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"
 
# initializing target
K = 'gfg'
 
# Split by repeating substring using re.findall() method
# re.escape() makes K match literally even if it contains regex metacharacters
res = re.findall(re.escape(K), test_str)
 
# printing result
print("The original string is : " + test_str)
print("The split string is : " + str(res))


Output

The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']

Time Complexity: O(n), where n is the length of the input string.
Auxiliary Space: O(k), where k is the number of occurrences of the target substring in the input string, since the result list holds one entry per match.
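If the position of each repetition is also of interest, re.finditer() yields match objects rather than plain strings. The following is a small sketch under that assumption, using a shortened version of the example string.

```python
import re

text = "gfggfggfg"
unit = "gfg"

# Each match object exposes the (start, end) indices of one repetition.
spans = [m.span() for m in re.finditer(re.escape(unit), text)]
print(spans)  # [(0, 3), (3, 6), (6, 9)]
```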


