Prerequisite: Regular Expression in Python
In this article, we will see how to extract the punctuation used in a specified column of a DataFrame using regex.
First, we build a regular expression whose character class lists the punctuation marks to look for (note that < and > are not among them): [!"\$%&\'()*+,\-.\/:;=#@?\[\\\]^_`{|}~]* Then we pass each row of the chosen column to the re.findall() function to extract the punctuation, and assign the result to a new column in the DataFrame. Because of the trailing * quantifier, the pattern can also match the empty string; those empty matches disappear when the matches are joined together.
re.findall() function is used to extract all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found.
Syntax: re.findall(regex, string)
Return: All non-overlapping matches of pattern in string, as a list of strings.
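As a small illustration of how re.findall() behaves, the sketch below runs it on one of the sample comments used later in this article. The short character class [,.!?] is a simplified stand-in chosen for readability, not the full pattern from above.

```python
import re

# Sample text taken from the Comments column used in this article.
sample = 'Today, what we are going to do.'

# Without a quantifier, every match is exactly one punctuation character.
single = re.findall(r'[,.!?]', sample)
print(single)

# With a trailing *, the pattern can also match the empty string, so the
# result contains '' entries at positions where no punctuation is present.
padded = re.findall(r'[,.!?]*', sample)
print('' in padded)

# Joining the pieces discards the empty strings again.
print(''.join(padded))
```

This is why a pattern ending in * needs a join (or a filter) afterwards, while a pattern without the quantifier returns the punctuation marks directly.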
Now, let's create a DataFrame:
Python3
import pandas as pd
import re

# Sample data: one free-text column to scan for punctuation
df = pd.DataFrame({
    'Name': ['Akash', 'Ashish', 'Ayush',
             'Diksha', 'Radhika'],
    'Comments': ['Hey! Akash how r u',
                 'Why are you asking this to me?',
                 'Today, what we are going to do.',
                 'No plans for today why?',
                 'Wedding plans, what are you saying?']},
    columns=['Name', 'Comments']
)
df
Output:

Now, extract the punctuation from the Comments column:
Python3
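def check_find_punctuations(text):
    # Find every run of punctuation; the trailing * also yields empty
    # matches, which vanish when the pieces are joined below
    result = re.findall(r'[!"\$%&\'()*+,\-.\/:;=#@?\[\\\]^_`{|}~]*',
                        text)
    string = "".join(result)
    # Split the joined string into a list of single characters
    return list(string)

# Apply the function to every row of the Comments column
df['punctuation_used'] = df['Comments'].apply(check_find_punctuations)
df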
Output:

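As a possible variation, the character class can be built from Python's string.punctuation instead of being typed by hand. The sketch below assumes that "punctuation" means exactly the characters in string.punctuation, which also includes < and >; the names PUNCTUATION_RE and extract_punctuation are hypothetical helpers introduced only for this example.

```python
import re
import string

# Assumption: the punctuation set is string.punctuation, escaped so that
# characters such as ], \ and - are treated literally inside the class.
PUNCTUATION_RE = re.compile('[' + re.escape(string.punctuation) + ']')


def extract_punctuation(text):
    """Return the punctuation characters in text, in order of appearance."""
    # No quantifier is used, so findall never returns empty strings.
    return PUNCTUATION_RE.findall(text)


# The same comments as in the DataFrame built earlier in this article.
comments = [
    'Hey! Akash how r u',
    'Why are you asking this to me?',
    'Today, what we are going to do.',
    'No plans for today why?',
    'Wedding plans, what are you saying?',
]

for comment in comments:
    print(comment, '->', extract_punctuation(comment))
```

With the DataFrame from this article, the helper could be applied in the same way as before, for example df['Comments'].apply(extract_punctuation).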
Last Updated: 29 Dec, 2020