Pandas remove rows with special characters
Last Updated: 21 Mar, 2024
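In this article, we will learn how to remove rows that contain special characters from a pandas DataFrame: if any value in a row contains a character such as @, %, &, $, #, +, -, *, or /, that row is dropped. To do this, we first search each column for values containing special characters and then drop the matching rows. The search uses a regular expression, either [@#&$%+\-/*], which lists the unwanted characters explicitly, or [^0-9a-zA-Z], which matches any character that is not an ASCII letter or digit (including spaces and punctuation). The hyphen is escaped in the first pattern because an unescaped +-/ inside a character class is read as a range that also matches the comma and the period. Let's walk through the procedure with some examples: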
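To make the behaviour of these character classes concrete, here is a small sketch using Python's built-in re module. The sample strings and variable names are made up for illustration. It compares the explicit class with an unescaped hyphen, the same class with the hyphen escaped, and the negated alphanumeric class.

```python
import re

# Unescaped: '+-/' is a range covering '+', ',', '-', '.', '/'.
loose = re.compile(r'[@#&$%+-/*]')
# Escaped: the hyphen is a literal character, so ',' and '.' no longer match.
strict = re.compile(r'[@#&$%+\-/*]')
# Negated class: anything that is not an ASCII letter or digit, spaces included.
non_alnum = re.compile(r'[^0-9a-zA-Z]')

# Illustrative values only; they are not taken from the article's CSV files.
samples = ["Alice", "B.Tech", "C++", "John Smith"]
for s in samples:
    print(s,
          bool(loose.search(s)),
          bool(strict.search(s)),
          bool(non_alnum.search(s)))
```

Which pattern fits depends on the data: the negated class is the strictest and will also flag names that contain a space.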
Example 1:
Python3
import pandas as pd
df = pd.read_csv("data1.csv")
print(df)
Output:
Select rows whose column values contain special characters
Python3
# The hyphen is escaped so that +-/ is not read as a character range
print(df[df.Name.str.contains(r'[@#&$%+\-/*]')])
Output:
Python3
print(df[df.Grade.str.contains(r'[^0-9a-zA-Z]')])
Output:
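Since data1.csv is not reproduced here, the following is a minimal sketch of the same selection on a small in-memory DataFrame. The sample values are hypothetical, and the na=False argument is an addition not used in the article: it tells str.contains to treat missing values as non-matches so that the mask stays strictly boolean.

```python
import pandas as pd

# Hypothetical stand-in for data1.csv; the real file's contents are not shown.
df = pd.DataFrame({
    "Name": ["Asha", "Ra@hul", "Meena", None],
    "Grade": ["A", "B", "C+", "A"],
})

# na=False: a missing Name counts as "no special character found".
name_mask = df["Name"].str.contains(r"[@#&$%+\-/*]", na=False)
grade_mask = df["Grade"].str.contains(r"[^0-9a-zA-Z]", na=False)

print(df[name_mask])   # rows whose Name contains a listed special character
print(df[grade_mask])  # rows whose Grade contains a non-alphanumeric character
```

Each mask is a boolean Series aligned with the DataFrame's index, which is what makes it usable inside df[...].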
Combine the selected rows
Python3
print(df[df.Name.str.contains(r'[^0-9a-zA-Z]')
         | df.Grade.str.contains(r'[@#&$%+\-/*]')])
Output:
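The combination step can be sketched by naming each mask before joining them. The data and variable names below are hypothetical. The | operator performs an element-wise OR on boolean Series, and & performs an element-wise AND.

```python
import pandas as pd

# Hypothetical sample data standing in for data1.csv.
df = pd.DataFrame({
    "Name": ["Asha", "Ra@hul", "Meena"],
    "Grade": ["A", "B", "C+"],
})

pattern = r"[^0-9a-zA-Z]"
bad_name = df["Name"].str.contains(pattern, na=False)
bad_grade = df["Grade"].str.contains(pattern, na=False)

either = bad_name | bad_grade  # special character in Name OR in Grade
both = bad_name & bad_grade    # special character in Name AND in Grade

print(df[either])
```

Storing the masks in variables avoids repeating the str.contains calls when the same selection is reused later, for example when dropping the rows.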
Remove the selected rows
Python3
print(df.drop(df[df.Name.str.contains(r'[^0-9a-zA-Z]')
                 | df.Grade.str.contains(r'[^0-9a-zA-Z]')].index))
Output:
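As an alternative to drop, the same rows can be removed by inverting the mask with ~ and keeping only the rows where it is False. This is a sketch on hypothetical data, not the article's CSV. With a default, unique index both approaches should leave the same rows.

```python
import pandas as pd

# Hypothetical sample data standing in for data1.csv.
df = pd.DataFrame({
    "Name": ["Asha", "Ra@hul", "Meena"],
    "Grade": ["A", "B", "C+"],
})

pattern = r"[^0-9a-zA-Z]"
mask = (df["Name"].str.contains(pattern, na=False)
        | df["Grade"].str.contains(pattern, na=False))

dropped = df.drop(df[mask].index)  # drop by index label, as in the article
kept = df[~mask]                   # keep rows where the mask is False

print(kept)
```

Neither call modifies df in place. Assign the result back (for example, df = df[~mask]) to keep the cleaned data.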
Example 2: This example uses the dataframe shown below:
Python3
import pandas as pd
df = pd.read_csv("data2.csv")
print(df)
# Rows in which any of the four columns contains a non-alphanumeric character
mask = (df.ID.str.contains(r'[^0-9a-zA-Z]') |
        df.Name.str.contains(r'[^0-9a-zA-Z]') |
        df.Age.str.contains(r'[^0-9a-zA-Z]') |
        df.Country.str.contains(r'[^0-9a-zA-Z]'))
print(df[mask])
# Drop the selected rows
print(df.drop(df[mask].index))
Output :
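When there are many columns, writing one str.contains call per column becomes repetitive. The sketch below, on hypothetical data with the same column names, checks every column at once. Converting with astype(str) is an assumption made here so that the .str accessor also works on numeric columns such as Age. Note that it renders floats with a decimal point, which this pattern would flag.

```python
import pandas as pd

# Hypothetical stand-in for data2.csv; only the column names follow the article.
df = pd.DataFrame({
    "ID": ["A1", "B#2", "C3"],
    "Name": ["Asha", "Rahul", "Mee*na"],
    "Age": [21, 22, 23],
    "Country": ["India", "India", "Nepal"],
})

pattern = r"[^0-9a-zA-Z]"

# Test each column as strings, then flag a row if any column matched.
flagged = (df.astype(str)
             .apply(lambda col: col.str.contains(pattern))
             .any(axis=1))

cleaned = df[~flagged]
print(cleaned)
```

The result keeps the original dtypes, because the string conversion is applied only to the temporary copy used to build the mask.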