Python – Maximum Consecutive Substring Occurrence

Sometimes, while working with Python, we can have a problem in which we must check for substrings occurring in consecutive repetition. This can have applications in data domains. Let us discuss a way in which this task can be performed.

Method #1: Using max() + re.findall()

A combination of the above methods can be used to perform this task. In this, we extract the substrings repetitions using findall() and the maximum of them using max().

Python3

# Python3 code to demonstrate working of
# Maximum Consecutive Substring Occurrence
# Using max() + re.findall()

import re
 
# Initializing string

test_str = 'geeksgeeks are geeks for all geeksgeeksgeeks'
 
# Printing original string

print("The original string is : " + str(test_str))
 
# Initializing subs

sub_str = 'geeks'
 
# Maximum Consecutive Substring Occurrence
# using max() + re.findall()

res = max(re.findall('((?:' + re.escape(sub_str) + ')*)', test_str), key=len)
 
# Printing result

print("The maximum run of Substring : " + res)

Output :

The original string is : geeksgeeks are geeks for all geeksgeeksgeeks
The maximum run of Substring : geeksgeeksgeeks

The Time and Space Complexity for all the methods is the same:

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #2: Using list comprehension+ loop + max() + len()

Algorithm:

Initialize a variable “sub_str” with the substring to be searched.
Calculate the length of the input string “test_str” and the length of the substring “sub_str”.
Initialize a variable “max_sub” with an empty string.
Loop through a range of values from 0 to the quotient of the length of the input string divided by the length of the substring.
Multiply the substring by the current value of the loop counter and check if it exists in the input string “test_str”.
If the substring exists in the input string, update the “max_sub” variable with the current substring if its length is greater than the length of the previous “max_sub” substring.
Return the “max_sub” variable.

Python3

# initializing string

test_str = 'geeksgeeks are geeks for all geeksgeeksgeeks'
# initializing subs 

sub_str = 'geeks'
# printing original string

print("The original string is : " + str(test_str))
 
max_sub = max([sub_str * n for n in range(len(test_str)//len(sub_str)+1) if sub_str * n in test_str], key=len)
 
# printing result 

print("The maximum run of Substring : " + max_sub)

Output

The original string is : geeksgeeks are geeks for all geeksgeeksgeeks
The maximum run of Substring : geeksgeeksgeeks

Time complexity: O(n^2), where n is the length of the input string. This is because the nested loop runs for a maximum of n/len(sub_str) times and the inbuilt max() function also takes O(n) time.

Auxiliary Space: O(n), where n is the length of the input string. The space is used to store the input string, substring, and the “max_sub” variable.

Method #3: Using the lambda function

In this method, we define a lambda function called max_substring that takes two arguments, a string s and a substring sub, and returns the maximum consecutive occurrence of the substring in the string.
We then use the lambda function max_substring to find the maximum consecutive occurrence of the substring in the input string test_str. Finally, we print the result res.

Below is the code for the above method:

Python3

# Python3 code to demonstrate working of
# Maximum Consecutive Substring Occurrence

import re
 
# initializing string

test_str = 'geeksgeeks are geeks for all geeksgeeksgeeks'
 
# printing original string

print("The original string is : " + str(test_str))
 
# initializing subs

sub_str = 'geeks'
 
# using lambda function to find maximum consecutive substring

max_substring = lambda s, sub: max(re.findall('((?:' + re.escape(sub) + ')*)', s), key=len)
 
# Maximum Consecutive Substring Occurrence

res = max_substring(test_str, sub_str)
 
# printing result

print("The maximum run of Substring : " + res)

Output

The original string is : geeksgeeks are geeks for all geeksgeeksgeeks
The maximum run of Substring : geeksgeeksgeeks

Time complexity: O(n^2), where n is the length of the input string test_str.

Auxiliary Space: O(n^2)

Method 4: Using Loops

Initialize a variable, max_count, to 0 to keep track of the maximum count of the substring in any window.
Initialize a variable, max_sub, to an empty string to keep track of the substring with the maximum count.
Split the input string into a list of words using the split() method.
Loop over the words in the list.
For each word, count the number of occurrences of the given substring using the count() method.
If the count is greater than or equal to max_count, update max_count and max_sub.
Repeat steps 4-6 for all words in the list.
Return the substring with the maximum count

Python3

# initializing string

test_str = 'geeksgeeks are geeks for all geeksgeeksgeeks'
# initializing subs 

sub_str = 'geeks'
# printing original string

print("The original string is : " + str(test_str))
 
# initializing variables

max_count = 0

max_sub = ""
 
# split the input string into a list of words

words = test_str.split()
 
# loop over the words in the list

for word in words:

    # count the number of occurrences of the given substring in the word

    count = word.count(sub_str)

    # update max_count and max_sub if necessary

    if count >= max_count:

        max_count = count

        max_sub = sub_str * max_count
 
# printing result 

print("The maximum run of Substring : " + max_sub)

Output

The original string is : geeksgeeks are geeks for all geeksgeeksgeeks
The maximum run of Substring : geeksgeeksgeeks

Time complexity: O(NK), where n is the number of words in the input string and k is the length of the substring. The count() method has a time complexity of O(k), and it is called once for each word in the input string.
Auxiliary space: O(K), where k is the length of the substring. The max_sub variable can have at most k characters.

Article Tags :

Python

Python Programs

Python string-programs