Get the substring of the column in Pandas-Python
Let’s see how to get a substring from every value of a column in a Pandas DataFrame. This kind of extraction is often useful when cleaning data. For example, suppose a column holds the first and last names of different people and we need the first 3 letters of each name to create a username.
Example 1:
We can loop over the row positions of the DataFrame and take the substring of each value in the column. Note that this overwrites the original values in the Name column.
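import pandas as pd

# Sample data: full names stored in a single column
# (named "data" so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
df = pd.DataFrame.from_dict(data)

# Replace each name with its first three characters, one row at a time
for i in range(len(df)):
    # .loc writes to df itself; df.iloc[i].Name = ... may only change a copy
    df.loc[i, 'Name'] = df.loc[i, 'Name'][:3]
df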
Output:
   Name
0   Joh
1   Mar
2   Ros
3   Emi
Note: For more information, refer to Python Extracting Rows Using Pandas.
Example 2: In this example we’ll use str.slice().
import pandas as pd

data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
df = pd.DataFrame.from_dict(data)

# str.slice(0, 3) keeps characters 0, 1 and 2 of every value in the column
df['UserName'] = df['Name'].str.slice(0, 3)
df
Output:
              Name  UserName
0       John Smith       Joh
1  Mark Wellington       Mar
2      Rosie Bates       Ros
3     Emily Edward       Emi
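The str.slice() call above uses only a start and a stop. As a minimal sketch, the snippet below also tries a negative start and a step, and includes one missing value to show that it is passed through rather than raising an error. The Series `names` and the result variable names are illustrative and not part of the original example.

```python
import pandas as pd

# Illustrative data: two names and one missing value
names = pd.Series(["John Smith", "Mark Wellington", None])

# First three characters, same as Example 2
first_three = names.str.slice(0, 3)

# A negative start counts from the end of each string
last_three = names.str.slice(start=-3)

# A step of 2 keeps every other character of the first four
every_other = names.str.slice(0, 4, 2)

print(first_three.tolist())
print(last_three.tolist())
print(every_other.tolist())
```

Missing entries stay missing in every result, so no separate null check is needed before slicing.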
Example 3: We can also slice through the str accessor directly with square brackets, using the same syntax as slicing an ordinary Python string.
import pandas as pd

data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
df = pd.DataFrame.from_dict(data)

# .str[:3] applies the slice [:3] to every value in the column
df['UserName'] = df['Name'].str[:3]
df
Output:
              Name  UserName
0       John Smith       Joh
1  Mark Wellington       Mar
2      Rosie Bates       Ros
3     Emily Edward       Emi
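Because the result of a str slice is itself a Series of strings, further string methods can be chained onto it. The sketch below assumes usernames should be lower-case, which the article does not state, and adds a hypothetical `Initial` column to show single-character indexing.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Emily Edward"]})

# Take the first three characters, then lower-case them
# (lower-casing is an assumption made for this illustration)
df["UserName"] = df["Name"].str[:3].str.lower()

# A single index returns one character per row
df["Initial"] = df["Name"].str[0]

print(df)
```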
Example 4: We can also use str.extract() for this task, which returns the text matched by a regular-expression capture group. In this example we’ll store the last name of each person in the “LastName” column.
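import pandas as pd

data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
df = pd.DataFrame.from_dict(data)

# \b(\w+)$ captures the last run of word characters in each value;
# expand=False returns a Series because there is a single capture group
df['LastName'] = df['Name'].str.extract(r'\b(\w+)$',
                                        expand=False)
df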
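Output:
              Name    LastName
0       John Smith       Smith
1  Mark Wellington  Wellington
2      Rosie Bates       Bates
3     Emily Edward      Edward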
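str.extract() can also return several substrings at once. As a rough sketch, the pattern below uses two named capture groups so that the resulting columns get their names from the groups. The group names `FirstName` and `LastName` and the assumption that every value is exactly two words separated by whitespace are choices made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Emily Edward"]})

# Two named groups produce a DataFrame with one column per group
parts = df["Name"].str.extract(r"^(?P<FirstName>\w+)\s+(?P<LastName>\w+)$")

# Attach the extracted columns to the original frame (indexes align)
df = df.join(parts)

print(df)
```

Rows that do not match the pattern get missing values in both new columns, so it is worth checking for them when the input is less regular than this sample.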
Last Updated: 10 Jul, 2020