Open In App

Python – Extract String till all occurrence of characters from other string

Improve
Improve
Like Article
Like
Save
Share
Report

Given a string, the task is to write a Python program to extract till all characters from other strings are found.

Input : test_str = “geeksforgeeks is best for all geeks”, check_str = “freak”
Output : geeksforgeeks is best for a
Explanation : a is last letter in freak to be present in string. The string is printed till first occurrence of a.

Input : test_str = “geeksforgeeks is best for all geeks”, check_str = “geeki”
Output : geeksforgeeks i
Explanation : i is last letter in freak to be present in string. The string is printed till first occurrence of i.

Method #1 : Using all() + slicing + loop

In this, the substring is constructed till the current index is in a loop, then from the string, all the elements from other strings are checked using all(). If all are present, the substring is returned.

step-by-step approach to understand the code:

  1. The program starts by initializing a string test_str with the value “geeksforgeeks is best for all geeks”.
  2. The original string is printed using the print() function with a message “The original string is: ” and the original string concatenated to it using the + operator.
  3. Next, the program initializes another string check_str with the value “freak”. This is the string until all characters of which we have to extract the substring.
  4. The program then enters a loop that iterates over the length of the original string test_str, starting from index 1 (i.e., excluding the first character of the original string).
  5. Inside the loop, a temporary substring temp is created by slicing the original string test_str from index 0 to the current index value idx.
  6. The all() function is used to check if all the characters of the check_str string are present in the temporary substring temp. This is done by using a list comprehension that checks if each character of the check_str string is present in the temporary substring temp.
  7. If all characters of check_str are present in temp, then the temp substring is stored in a variable res and the loop is exited using the break statement.
  8. Finally, the result is printed using the print() function with a message “String till all characters occurred: ” and the value of the res variable concatenated to it using the + operator.

Python3




# Python3 code to demonstrate working of
# Extract String till all occurrence of characters from other string
# Using all() + slicing + loop
 
# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check string
check_str = "freak"
 
for idx in range(1, len(test_str)):
    temp = test_str[:idx]
 
    # checking for all chars of check_str in substring
    if all([char in temp for char in check_str]):
        res = temp
        break
 
# printing result
print("String till all characters occurred : " + str(res))


Output:

The original string is : geeksforgeeks is best for all geeks
String till all characters occurred : geeksforgeeks is best for a

Time Complexity: O(n)
Auxiliary Space: O(n)

Method #2 : Using find() + max() + slicing

In this, each character in the check string is iterated, the index of all are checked and the maximum of those is recorded. The substring sliced till the max index is required answer.

Python3




# Python3 code to demonstrate working of
# Extract String till all occurrence of characters from other string
# Using find() + max() + slice
 
# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check string
check_str = "freak"
 
# max() find maximum index of all characters
res = test_str[:max([test_str.find(idx) for idx in check_str]) + 1]
 
# printing result
print("String till all characters occurred : " + str(res))


Output:

The original string is : geeksforgeeks is best for all geeks
String till all characters occurred : geeksforgeeks is best for a

Time Complexity: O(n)
Auxiliary Space: O(n)

Method 3: Use index() + find() + max()

This approach is similar to the previous one, but it uses the index() method instead of the find() method to find the indices of each character in check_str in test_str. If any character is not found, the result is an empty string. Otherwise, the maximum index is found using the max() function, and the substring up to that index is extracted. The set() function is used to remove duplicates from the list of indices, and the result is an empty string if the length of the resulting set is less than the length of check_str.

Python3




# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# initializing check string
check_str = "freak"
 
# create a list of indices where the characters in check_str occur in test_str
indices = [test_str.index(c) for c in check_str]
 
# if any character is not found in test_str, the result is an empty string
if len(set(indices)) < len(check_str):
    res = ''
else:
    # find the maximum index and extract the substring up to that index
    res = test_str[:max(indices)+1]
 
# print the result
print("String till all characters occurred : " + str(res))


Output

String till all characters occurred : geeksforgeeks is best for a

Time complexity: O(nm), where n is the length of test_str and m is the length of check_str, 
Auxiliary space: O(m)

Method #4: Using set() and sorted()

Use a set to check if all the characters in the check_str are present in the test_str. If they are present, then create a list of indices where these characters occur in test_str using a loop and then return the substring of test_str till the maximum index.

However, Use the sorted() function to sort the list of indices in ascending order and then return the substring of test_str till the maximum index, which is the last element of the sorted list.

Python3




# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# initializing check string
check_str = "freak"
 
# create a set of characters in check_str
check_set = set(check_str)
 
# check if all characters in check_str are present in test_str
if check_set - set(test_str):
    res = ''
else:
    # create a list of indices where the characters in check_str occur in test_str
    indices = [test_str.index(c) for c in check_str]
    # find the maximum index from the sorted list of indices and extract the substring up to that index
    res = test_str[:sorted(indices)[-1] + 1]
 
# print the result
print("String till all characters occurred : " + str(res))


Output

String till all characters occurred : geeksforgeeks is best for a

Time complexity: O(n log n), where n is the length of the test_str. This is because we are using the sorted() function, which has a time complexity of O(n log n).
Auxiliary space: O(k), where k is the length of the check_str. This is because we are using a set to store the characters in check_str, which has a space complexity of O(k).

Method #5: Using regular expressions

This method involves using regular expressions to find the substring that matches the pattern of the check string.

Approach:

  1. Import the re-module, which provides support for regular expressions in Python.
  2. Define the test_str and check_str variables as in the previous methods.
  3. Use the re.search() function to search for the pattern of the check string in the test string. The pattern should match any number of characters (represented by the . character in regular expressions) followed by the check string (check_str), and should be anchored to the beginning of the string (^). This will return a match object if a matching substring is found, or None if no match is found.
  4. If a match is found, extract the matched substring using the group() method of the match object.
  5. Print the extracted substring.

Example:

Python3




import re
 
# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check string
check_str = "freak"
 
# using regular expressions to find the substring
match = re.search(f"^{re.escape(check_str)}", test_str)
if match:
    res = match.group()
else:
    res = ""
 
# printing result
print("String till all characters occurred : " + str(res))


Output

The original string is : geeksforgeeks is best for all geeks
String till all characters occurred : 

Time complexity: O(n), where n is the length of the test string.
Auxiliary space: This method requires O(1) auxiliary space, as it does not use any additional data structures.

Method#6: Using the Recursive method:

Algorithms:

  1. Declare a recursive function that takes as input the strings test_str and check_str, and an optional index idx that defaults to 0.
  2. It checks if idx is equal to the length of test_str. If it is, it returns an empty string. Otherwise, it creates a temporary substring temp that consists of the first idx+1 characters of test_str.
  3. If all characters of check_str are present in temp, it returns temp. Otherwise, it calls itself recursively with idx+1. The function keeps calling itself recursively until all characters of check_str are present in the temporary substring or until the end of test_str is reached.

Below is the implementation of the above approach:

Python3




# Python program for the above approach
 
# Function to extract string from the given
# string till any characters
def extract_till_chars(test_str, check_str, idx=0):
    if idx == len(test_str):
        return ""
    temp = test_str[:idx+1]
    if all(char in temp for char in check_str):
        return temp
    else:
        return extract_till_chars(test_str, check_str, idx+1)
 
# Driver Code
test_str = "geeksforgeeks is best for all geeks"
check_str = "freak"
 
res = extract_till_chars(test_str, check_str)
 
print("The original string is : " + str(test_str))
 
print("String till all characters occurred: {}".format(res))


Output

The original string is : geeksforgeeks is best for all geeks
String till all characters occurred: geeksforgeeks is best for a

Time Complexity: The time complexity of the recursive function extract_till_chars is O(n*m), where n is the length of the input string test_str and m is the length of the input string check_str. This is because the function processes each character of test_str at most once and for each character it checks if all characters of check_str are present in the temporary substring.

Space Complexity: The space complexity of this function is O(n) due to the recursive call stack. Each recursive call adds a new frame to the call stack, and there can be up to n recursive calls before the base case is reached. This means that the maximum size of the call stack is n.

Method#7:  Using numpy:
Algorithm:

  1. Initialize the given string and check string.
  2. Using numpy, find the index of each character in the check string from the given string.
  3. Find the maximum index of all characters in the check string from the given string.
  4. Extract the substring till the maximum index from the given string.
  5. Print the final result.

Python3




import numpy as np
 
# initializing string
test_str = "geeksforgeeks is best for all geeks"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check string
check_str = "freak"
 
# using numpy to find the index of each character in check string
check_idx = np.array([test_str.find(char) for char in check_str])
 
# finding the maximum index of all characters in check string
max_idx = np.max(check_idx)
 
# extracting the substring till the maximum index
res = test_str[:max_idx+1]
 
# printing result
print("String till all characters occurred: " + str(res))
#This code is contributed by Rayudu.


Output:

The original string is : geeksforgeeks is best for all geeks
String till all characters occurred: geeksforgeeks is best for a

Time Complexity: O(n)

The time complexity of finding the index of each character in the check string using numpy is O(m*n) where m is the length of the check string and n is the length of the given string. However, since m is constant and usually small, the time complexity is approximately O(n).
Finding the maximum index of all characters in the check string using numpy takes O(1) time.
Extracting the substring till the maximum index from the given string takes O(1) time.
Therefore, the overall time complexity is O(n).
Space Complexity: O(m)

The space complexity of creating an array to store the index of each character in the check string using numpy is O(m) where m is the length of the check string.
Therefore, the overall space complexity is O(m).



Last Updated : 14 Apr, 2023
Like Article
Save Article
Previous
Next
Share your thoughts in the comments
Similar Reads