
Python – Start and End Indices of words from list in String

Given a string and a list of words, the task is to write a Python program that finds the start and end index (zero-based, end inclusive) of each word from the list that occurs in the string.

Input : test_str = "gfg is best for all CS geeks and engineering job seekers", check_list = ["geeks", "engineering", "best", "gfg"]
Output : {'geeks': [23, 27], 'engineering': [33, 43], 'best': [7, 10], 'gfg': [0, 2]}
Explanation : "geeks" occupies indices 23 through 27 of the string, so it maps to [23, 27]. The other words are handled the same way.

Input : test_str = "gfg is best for all CS geeks and engineering job seekers", check_list = ["geeks", "gfg"]
Output : {'geeks': [23, 27], 'gfg': [0, 2]}
Explanation : "geeks" occupies indices 23 through 27 and "gfg" occupies indices 0 through 2.

Method #1 : Using loop + index() + len()



Here a loop visits each word in the list. index() returns the start index of the word's first occurrence in the string, and adding len(word) - 1 to that start gives the inclusive end index. Only the first occurrence of each word is reported.




# Python3 code to demonstrate working of
# Start and End Indices of words from list in String
# Using loop + index() + len()
 
# initializing string
test_str = "gfg is best for all CS geeks and engineering job seekers"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check_list
check_list = ["geeks", "engineering", "best", "gfg"]
 
res = dict()
for ele in check_list :
    if ele in test_str:
         
        # getting front index
        strt = test_str.index(ele)
         
        # getting ending index
        res[ele] = [strt, strt + len(ele) - 1]
 
# printing result
print("Required extracted indices  : " + str(res))

Output:

The original string is : gfg is best for all CS geeks and engineering job seekers
Required extracted indices  : {'geeks': [23, 27], 'engineering': [33, 43], 'best': [7, 10], 'gfg': [0, 2]}

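Time Complexity: O(n * m), where n is the length of test_str and m is the number of words in check_list, treating the length of each word as a small constant.
Auxiliary Space: O(m) for the result dictionary.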
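Because index() stops at the first occurrence, a word that appears several times is reported only once. The following is a minimal sketch of one way to collect every occurrence, using the optional start argument of str.find(). The helper name find_all_spans and the sample string are illustrative assumptions, not part of the original article.

```python
def find_all_spans(text, word):
    """Return [start, end] (end inclusive) for every occurrence of word in text."""
    if not word:
        # An empty word would match at every position, so report nothing.
        return []
    spans = []
    start = text.find(word)
    while start != -1:
        spans.append([start, start + len(word) - 1])
        # Resume one character later so overlapping occurrences are found too.
        start = text.find(word, start + 1)
    return spans


sample = "geeks for geeks"  # hypothetical input with a repeated word
print(find_all_spans(sample, "geeks"))  # [[0, 4], [10, 14]]
```

Resuming at start + 1 rather than after the whole match is a design choice: it reports overlapping occurrences, which may or may not be wanted for a given use case.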

Method #2 : Using dictionary comprehension + len() + index()

This performs the same steps as Method #1, but builds the result dictionary in a single expression using a dictionary comprehension.




# Python3 code to demonstrate working of
# Start and End Indices of words from list in String
# Using dictionary comprehension + len() + index()
 
# initializing string
test_str = "gfg is best for all CS geeks and engineering job seekers"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check_list
check_list = ["geeks", "engineering", "best", "gfg"]
 
# Dictionary comprehension to be used as shorthand for
# forming result Dictionary
res = {key: [test_str.index(key), test_str.index(key) + len(key) - 1]
       for key in check_list if key in test_str}
 
# printing result
print("Required extracted indices  : " + str(res))

Output:

The original string is : gfg is best for all CS geeks and engineering job seekers
Required extracted indices  : {'geeks': [23, 27], 'engineering': [33, 43], 'best': [7, 10], 'gfg': [0, 2]}

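Time Complexity: O(n * m), the same as Method #1, where n is the length of test_str and m is the number of words in check_list.
Auxiliary Space: O(m) for the result dictionary.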
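The comprehension in this method scans the string up to three times per word: once for the membership test and twice for index(). As a hedged alternative, the sketch below uses an assignment expression (Python 3.8+) so that each word is looked up once. The extra word "python" in check_list is an assumption added here to show that absent words are skipped.

```python
test_str = "gfg is best for all CS geeks and engineering job seekers"
check_list = ["geeks", "engineering", "best", "gfg", "python"]  # "python" is absent

# find() returns -1 for a missing word; the walrus operator keeps the
# position so the value expression can reuse it without searching again.
res = {
    key: [start, start + len(key) - 1]
    for key in check_list
    if (start := test_str.find(key)) != -1
}

print(res)
# {'geeks': [23, 27], 'engineering': [33, 43], 'best': [7, 10], 'gfg': [0, 2]}
```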

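Method #3 : Using loop + find() + len()

find() works like index(), but returns -1 instead of raising ValueError when the word is not present in the string.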




# Python3 code to demonstrate working of
# Start and End Indices of words from list in String
# Using loop + find() + len()
 
# initializing string
test_str = "gfg is best for all CS geeks and engineering job seekers"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check_list
check_list = ["geeks", "engineering", "best", "gfg"]
 
res = dict()
for ele in check_list:

    # getting front index; find() returns -1 when ele is absent,
    # so no separate membership test is needed
    strt = test_str.find(ele)

    if strt != -1:
        # getting ending index (inclusive)
        res[ele] = [strt, strt + len(ele) - 1]
 
# printing result
print("Required extracted indices : " + str(res))

Output
The original string is : gfg is best for all CS geeks and engineering job seekers
Required extracted indices : {'geeks': [23, 27], 'engineering': [33, 43], 'best': [7, 10], 'gfg': [0, 2]}

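Time complexity: O(n * m), where n is the length of test_str and m is the number of words in check_list.
Auxiliary space: O(k), where k is the number of words from check_list found in test_str.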
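The practical difference between find() and index() shows up only for words that are missing from the string. A small sketch of that difference follows; the word "python" is an assumed example of a missing word.

```python
test_str = "gfg is best for all CS geeks and engineering job seekers"

# find() signals a missing substring with -1 ...
missing_pos = test_str.find("python")

# ... whereas index() raises ValueError for the same input.
try:
    test_str.index("python")
    raised = False
except ValueError:
    raised = True

print(missing_pos, raised)  # -1 True
```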

Method #4: Using regular expression module re.finditer()

This method uses the finditer() method from the regular expression module to search for all occurrences of the words in the check_list in the given string test_str. For each match, it extracts the start and end indices and stores them in a dictionary.




import re
 
# initializing string
test_str = "gfg is best for all CS geeks and engineering job seekers"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing check_list
check_list = ["geeks", "engineering", "best", "gfg"]
 
# initializing result dictionary
res = {}
 
# searching for all occurrences of words in check_list using regular expression
for ele in check_list:
    # re.escape() makes any regex metacharacters in ele match literally
    for match in re.finditer(re.escape(ele), test_str):
        # getting start index of match
        start_index = match.start()
         
        # getting end index of match
        end_index = match.end() - 1
         
        # adding match indices to result dictionary
        if ele in res:
            res[ele].append((start_index, end_index))
        else:
            res[ele] = [(start_index, end_index)]
 
# printing result
print("Required extracted indices : " + str(res))

Output
The original string is : gfg is best for all CS geeks and engineering job seekers
Required extracted indices : {'geeks': [(23, 27)], 'engineering': [(33, 43)], 'best': [(7, 10)], 'gfg': [(0, 2)]}

Time complexity: O(n * m), where n is the length of the string test_str and m is the number of words in check_list.
Auxiliary space: O(k * l), where k is the number of words in check_list and l is the maximum number of occurrences of any word in test_str.
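The loop above runs one regex search per word and matches substrings, so a word such as "seek" would also match inside "seekers". A possible variation, sketched below under the assumption that every word starts and ends with a letter, digit, or underscore, combines the escaped words into one alternation wrapped in \b word boundaries and scans the string once. The extra entry "seek" in check_list is added here only to illustrate the whole-word behaviour.

```python
import re

test_str = "gfg is best for all CS geeks and engineering job seekers"
check_list = ["geeks", "engineering", "best", "gfg", "seek"]

# Escape each word, join the words into a single alternation, and require
# a word boundary on both sides so that only whole words match.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, check_list)) + r")\b")

res = {}
for match in pattern.finditer(test_str):
    # match.end() is exclusive, so subtract 1 for an inclusive end index.
    res.setdefault(match.group(), []).append((match.start(), match.end() - 1))

print(res)
# {'gfg': [(0, 2)], 'best': [(7, 10)], 'geeks': [(23, 27)], 'engineering': [(33, 43)]}
```

The keys appear in the order the words occur in the string rather than in the order of check_list, and "seek" is absent because it occurs only as part of "seekers".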

Method #5: Using split() + index() + len()

This method splits test_str into whitespace-separated tokens and compares each token with the words in check_list, so only whole words match, unlike the substring search in the earlier methods. Each match is stored as a (start, end) tuple.

# Python3 code to demonstrate working of
# Start and End Indices of words from list in String
# Using split() + index() + len()

# initializing string
test_str = "gfg is best for all CS geeks and engineering job seekers"

# printing original string
print("The original string is : " + str(test_str))

# initializing check_list
check_list = ["geeks", "engineering", "best", "gfg"]

# initialize result dictionary
res = {}

# iterate over the words in check_list and record the span of every matching token
for word in check_list:
    # pos marks where the search for the next token begins, so a repeated
    # word gets its own position instead of that of its first occurrence
    pos = 0
    for val in test_str.split():
        start_idx = test_str.index(val, pos)
        end_idx = start_idx + len(val) - 1
        pos = end_idx + 1
        if val == word:
            if word in res:
                res[word].append((start_idx, end_idx))
            else:
                res[word] = [(start_idx, end_idx)]
 
# print result dictionary
print("Required extracted indices  : " + str(res))

Output
The original string is : gfg is best for all CS geeks and engineering job seekers
Required extracted indices  : {'geeks': [(23, 27)], 'engineering': [(33, 43)], 'best': [(7, 10)], 'gfg': [(0, 2)]}

Time complexity: O(n*m), where n is the length of the string and m is the length of the check_list.
Auxiliary space: O(k), where k is the number of words found in the string.
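The nested loops above re-split the string for every word in check_list. As a rough sketch of a single-pass alternative, the tokens and their positions can be produced once with re.finditer(r"\S+", ...) and tested against a set of wanted words. The helper name word_spans and the sample string are assumptions made for illustration.

```python
import re


def word_spans(text, words):
    """Map each whole-word match to a list of (start, end) tuples, end inclusive."""
    wanted = set(words)  # set membership is O(1) on average
    spans = {}
    # \S+ yields each whitespace-separated token together with its position.
    for token in re.finditer(r"\S+", text):
        if token.group() in wanted:
            spans.setdefault(token.group(), []).append(
                (token.start(), token.end() - 1)
            )
    return spans


sample = "geeks for geeks and gfg"  # hypothetical input with a repeated word
print(word_spans(sample, ["geeks", "gfg"]))
# {'geeks': [(0, 4), (10, 14)], 'gfg': [(20, 22)]}
```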

