Python program to find start and end indices of all Words in a String

Last Updated : 16 May, 2023

Given a string, return the start and end index of each word. Both indices are inclusive.

Examples:

Input : test_str = ' Geekforgeeks   is Best'
Output : [(1, 12), (16, 17), (19, 22)]
Explanation : "Best" starts at index 19 and ends at index 22.

Input : test_str = ' Geekforgeeks    is Best'
Output : [(1, 12), (17, 18), (20, 23)]
Explanation : "Best" starts at index 20 and ends at index 23.
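
Because the end index is inclusive, slicing with end + 1 should recover each word. A minimal sketch of that check, assuming a string with three spaces before "is" and the ranges hard-coded from the expected output:

```python
# Ranges are (start, inclusive_end) pairs for each word in test_str.
test_str = ' Geekforgeeks   is Best'
ranges = [(1, 12), (16, 17), (19, 22)]

# Python slices exclude the stop index, hence end + 1.
words = [test_str[start:end + 1] for start, end in ranges]
print(words)  # ['Geekforgeeks', 'is', 'Best']
```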

Approach#1: Using list comprehension + regex + finditer()

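Here, re.finditer() with the pattern \S+ yields one match object per word, where a word is a run of non-whitespace characters. For each match, start() gives the first index and end() - 1 gives the last index, because end() points one past the final character. A list comprehension collects these pairs into a list of tuples.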

Python3




# Python3 code to demonstrate working of
# Word Ranges in String
# Using list comprehension + regex + finditer()
import re
 
# initializing string
test_str = ' Geekforgeeks   is Best    for  geeks'
 
# printing original string
print("The original string is : " + str(test_str))
 
# regex to get words, loop to get each start and end index
res = [(ele.start(), ele.end() - 1) for ele in re.finditer(r'\S+', test_str)]
 
# printing result
print("Word Ranges are : " + str(res))


Output

The original string is :  Geekforgeeks   is Best    for  geeks
Word Ranges are : [(1, 12), (16, 17), (19, 22), (27, 29), (32, 36)]

Time Complexity: O(n)
Auxiliary Space: O(n)
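
The same idea can be wrapped in a reusable function. The sketch below is one possible variant, not the article's own code: the name word_ranges is hypothetical, and it uses the match object's span() method, which returns the (start, end) pair in one call. Since \S+ matches any non-whitespace run, tabs and newlines act as separators too.

```python
import re


def word_ranges(text):
    """Return (start, inclusive_end) for every run of non-whitespace characters."""
    ranges = []
    for match in re.finditer(r'\S+', text):
        start, stop = match.span()  # stop is exclusive
        ranges.append((start, stop - 1))
    return ranges


# Tab and newline separate words just like spaces do.
print(word_ranges('Geek\tfor\ngeeks'))  # [(0, 3), (5, 7), (9, 13)]
```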

Approach#2: Using loop

The approach used in this code is to iterate through each character of the input string using a for loop. If a space character is encountered, the indices of the previous word are recorded and the start_index is updated to the next character. The final word indices are recorded after the loop ends.

  1. Initialize an empty list word_indices to store the start and end indices of each word.
  2. Initialize a variable start_index to 0.
  3. Loop through each index of the input string using a for loop and the range() function. If the current character is a space:
    • Check if the previous word was non-empty, i.e., start_index is not equal to the current index.
    • If yes, append the start and end indices of the previous word to word_indices.
    • Update the start_index to the next character’s index.
  4. After the loop, check if the last word was non-empty, i.e., start_index is not equal to the length of the string.
  5. If yes, append the start and end indices of the last word to word_indices.
  6. Return the word_indices list.

Python3




# Python program for the above approach
 
# Function to find the word indices
def find_word_indices(test_str):
    word_indices = []
    start_index = 0
    for i in range(len(test_str)):
       
        if test_str[i] == " ":
            if start_index != i:
                word_indices.append((start_index, i - 1))
            start_index = i + 1
             
    if start_index != len(test_str):
        word_indices.append((start_index, len(test_str) - 1))
    return word_indices
 
 
# Driver Code
test_str = 'Geekforgeeks is Best'
print(find_word_indices(test_str))


Output

[(0, 11), (13, 14), (16, 19)]

Time Complexity: O(n) where n is the length of the input string test_str. The for loop iterates through each character in the string once, and the time taken for each iteration is constant.

Space Complexity: O(m) where m is the number of words in the input string test_str. The word_indices list stores the start and end indices of each word, and the maximum size of this list is the number of words in the string. The start_index and i variables used in the loop take constant space.
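
Note that the loop above treats only the space character ' ' as a separator, so a tab or newline would be counted as part of a word. A possible variant that accepts any whitespace is sketched below, assuming str.isspace() is an acceptable definition of a separator; the function name find_word_indices_any_space is hypothetical.

```python
def find_word_indices_any_space(text):
    """Return (start, inclusive_end) pairs, splitting on any whitespace."""
    word_indices = []
    start_index = None  # None means we are not currently inside a word
    for i, ch in enumerate(text):
        if ch.isspace():
            # A whitespace character closes the word in progress, if any.
            if start_index is not None:
                word_indices.append((start_index, i - 1))
                start_index = None
        elif start_index is None:
            # First non-whitespace character after a gap opens a new word.
            start_index = i
    # Close a word that runs to the end of the string.
    if start_index is not None:
        word_indices.append((start_index, len(text) - 1))
    return word_indices


print(find_word_indices_any_space('Geek\tfor  geeks\n'))  # [(0, 3), (5, 7), (10, 14)]
```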

Approach#3: Using split()+ index()

Split the string into individual words using the split() method. For each word, find its starting and ending indices using the index() method.

Algorithm

1. Split the input string into words using the split() method.
2. Initialize an empty list indices to store the starting and ending indices of each word.
3. For each word in the list of words:
    • Find the starting index of the word using the index() method and store it in a variable start.
    • Find the ending index by adding the length of the word minus 1 to start, and store it in a variable end.
4. Append the tuple (start, end) to indices.

Python3




test_str = 'Geekforgeeks is Best'
indices = []
pos = 0  # resume each search after the previous word so repeated words resolve correctly
for word in test_str.split():
    start = test_str.index(word, pos)
    indices.append((start, start + len(word) - 1))
    pos = start + len(word)
print(indices)


Output

[(0, 11), (13, 14), (16, 19)]

Time complexity: O(n), where n is the length of the input string. split() scans the string once, and each index() call resumes from the end of the previous word, so the searches together cover the string once.

Auxiliary Space: O(n), since split() builds a list holding every word in the string.
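
One caveat with index(): called without a start position, it returns the first occurrence of the substring, so a repeated word (or a word that also appears inside an earlier word) maps to the wrong range. A minimal sketch of the difference follows; the helper name ranges_with_offset is hypothetical.

```python
def ranges_with_offset(text):
    """Locate each word, searching from where the previous word ended."""
    ranges = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)  # second argument is the search start
        ranges.append((start, start + len(word) - 1))
        pos = start + len(word)
    return ranges


text = 'is this is'

# Without a start position, the second 'is' resolves to index 0 again.
naive = [(text.index(w), text.index(w) + len(w) - 1) for w in text.split()]
print(naive)                     # [(0, 1), (3, 6), (0, 1)]
print(ranges_with_offset(text))  # [(0, 1), (3, 6), (8, 9)]
```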


