Python | Pandas Series.str.find()

Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.

The Pandas str.find() method searches for a substring in each string of a Series. If the substring is found, it returns the lowest index at which it occurs. If the substring is not found, it returns -1.

Start and end points can also be passed to restrict the search for the character or substring to a specific part of each string.
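As a quick illustration of this behaviour, here is a minimal sketch on a small hand-made Series. The three names are sample values chosen for the illustration, not the CSV used later in the article.

```python
import pandas as pd

# small illustrative Series (sample values, not loaded from the CSV)
names = pd.Series(["Avery Bradley", "Jae Crowder", "Kelly Olynyk"])

# lowest index of 'a' in each string, or -1 when 'a' is absent
positions = names.str.find("a")

print(positions.tolist())  # [8, 1, -1]
```

The last name contains no lowercase 'a', so its entry is -1.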



Syntax: Series.str.find(sub, start=0, end=None)

Parameters:
sub: String or character to be searched for in each text value of the series
start: int value, index position at which the search starts. Default is 0, which means from the beginning of the string
end: int value, index position at which the search stops. Default is None, which means the search runs to the end of the string

Return type: Series holding the lowest index of the substring in each string, or -1 where it is not found
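The effect of the start and end parameters can be sketched as follows. The two words are made-up sample values, and the variable names are only for illustration.

```python
import pandas as pd

# made-up sample strings for illustration
text = pd.Series(["banana", "bandana"])

# default arguments: the whole string is searched
whole = text.str.find("an")

# start=2: the search begins at index position 2
from_two = text.str.find("an", start=2)

# end=2: only the characters before index position 2 are searched
up_to_two = text.str.find("an", end=2)

print(whole.tolist())      # [1, 1]
print(from_two.tolist())   # [3, 4]
print(up_to_two.tolist())  # [-1, -1]
```

The returned positions are always counted from the beginning of the full string, even when start is greater than 0.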

To download the CSV used in code, click here.

In the following examples, the data frame used contains data of some NBA players. The image of data frame before any operations is attached below.

 
Example #1: Finding single character

In this example, a single character ‘a’ is searched for in each string of the Name column using the str.find() method. The start and end parameters are left at their defaults. The returned series is stored in a new column so that the indexes can be compared directly with the names. Before applying this method, rows with null values are dropped using .dropna(); str.find() itself passes missing values through as NaN, so this mainly keeps the new column free of NaN entries.

# importing pandas module
import pandas as pd

# reading csv file (placeholder path: point it at the downloaded CSV)
data = pd.read_csv("nba.csv")

# dropping rows with null values
data.dropna(inplace=True)

# substring to be searched
sub = 'a'

# creating and passing series to new column
data["Indexes"] = data["Name"].str.find(sub)

# display
print(data)


Output:
As shown in the output image, each value in the Indexes column is the position of the first occurrence of the character in the corresponding string. If the substring doesn’t exist in the text, -1 is returned. The first row also shows that the leading ‘A’ wasn’t matched, which demonstrates that this method is case sensitive.
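The case sensitivity can be checked on a couple of sample names. Lowercasing the strings first with str.lower() is one possible way to get a case-insensitive search; it is shown here as an assumption about what a reader might want, not as part of the original example.

```python
import pandas as pd

# sample names for illustration, both starting with an uppercase 'A'
names = pd.Series(["Avery Bradley", "Amir Johnson"])

# case-sensitive search: the leading 'A' is not matched
exact = names.str.find("a")

# one possible workaround: lowercase the strings before searching
folded = names.str.lower().str.find("a")

print(exact.tolist())   # [8, -1]
print(folded.tolist())  # [0, 0]
```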

 
Example #2: Searching substring (More than one character)

In this example, the substring ‘er’ is searched for in the Name column of the data frame. The start parameter is set to 2 so that the search begins at the 3rd character (index position 2).

# importing pandas module
import pandas as pd

# reading csv file (placeholder path: point it at the downloaded CSV)
data = pd.read_csv("nba.csv")

# dropping rows with null values
data.dropna(inplace=True)

# substring to be searched
sub = 'er'

# index position at which the search starts
start = 2

# creating and passing series to new column
data["Indexes"] = data["Name"].str.find(sub, start)

# display
print(data)


Output:
As shown in the output image, the lowest index at which the substring occurs is returned. However, in the case of Terry Rozier (row 9 in the data frame), 10 was returned instead of the index of the first occurrence of ‘er’. This is because the start parameter was set to 2 and the first ‘er’ occurs before that position.
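The Terry Rozier case can be reproduced in isolation with a one-element Series, as a minimal sketch that does not need the CSV:

```python
import pandas as pd

# single-element Series holding the name discussed above
name = pd.Series(["Terry Rozier"])

# default start: the 'er' in "Terry" is found at index position 1
from_beginning = name.str.find("er")

# start=2: the first 'er' is skipped, so the one in "Rozier" is reported
from_third_char = name.str.find("er", 2)

print(from_beginning.tolist())   # [1]
print(from_third_char.tolist())  # [10]
```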




