Pandas is one of the packages that make importing and analyzing data much easier. A CSV file sometimes has null values, which are displayed as NaN in a Pandas DataFrame. The Pandas dropna() method lets the user drop rows or columns that contain null values in different ways.
Pandas DataFrame.dropna() Syntax
Syntax: DataFrameName.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Parameters:
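- axis: axis takes an int or string value. 0 or 'index' drops rows that contain null values; 1 or 'columns' drops columns that contain null values.
- how: how takes one of two string values ('any' or 'all'). 'any' drops the row/column if ANY value is null and 'all' drops it only if ALL values are null.
- thresh: thresh takes an integer giving the minimum number of non-null values a row/column must have to be kept. Rows/columns with fewer non-null values are dropped. Recent pandas versions do not allow how and thresh to be passed together.
- subset: subset takes a label or list of labels along the other axis, which limits the null check to those rows/columns. For example, when dropping rows, subset lists the columns to check.
- inplace: inplace is a boolean. If True, the changes are made in the data frame itself and the method returns None.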
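The parameters above can be compared side by side on a small data frame. The following is a minimal sketch; the data frame df and its values are made up for illustration and are not taken from nba.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from nba.csv); None and np.nan both count as null
df = pd.DataFrame({
    "Name": ["Ann", "Bob", None, "Dee", "Eve"],
    "Age": [25.0, np.nan, np.nan, 31.0, np.nan],
    "College": ["Duke", "UCLA", None, None, None],
})

# how='any': keep only rows with no null values
any_rows = df.dropna(how="any")

# how='all': drop only rows in which every value is null
all_rows = df.dropna(how="all")

# thresh=2: keep rows that have at least 2 non-null values
thresh_rows = df.dropna(thresh=2)

# subset: only the Age column is checked for nulls
subset_rows = df.dropna(subset=["Age"])

# axis='columns' with thresh=4: keep columns with at least 4 non-null values
thresh_cols = df.dropna(axis="columns", thresh=4)

print(list(any_rows.index))      # [0]
print(list(all_rows.index))      # [0, 1, 3, 4]
print(list(thresh_rows.index))   # [0, 1, 3]
print(list(subset_rows.index))   # [0, 3]
print(list(thresh_cols.columns)) # ['Name']
```

Note that thresh counts the values that are present, not the values that are missing.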
Pandas DataFrame dropna() Example
Example 1: Dropping rows with at least 1 null value. Here we use read_csv() to read our CSV file into a data frame, and then drop all rows that contain any null value. The lengths of the old and new data frames are compared to see how many rows had at least 1 null value.
Python3
import pandas as pd

# Read the CSV file into a data frame
data = pd.read_csv("nba.csv")

# Drop every row that contains at least one null value
new_data = data.dropna(axis=0, how='any')

# Compare the lengths to count the rows that were dropped
print("Old data frame length:", len(data),
      "\nNew data frame length:", len(new_data),
      "\nNumber of rows with at least 1 NA value: ",
      (len(data) - len(new_data)))
Output:
Old data frame length: 458
New data frame length: 364
Number of rows with at least 1 NA value: 94
Since the difference is 94, there were 94 rows that had at least 1 null value in any column.
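The same count can be cross-checked without dropping anything, by flagging the rows that contain a null value and summing the flags. The following is a minimal sketch on a made-up data frame that stands in for nba.csv; the column values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for nba.csv
data = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy", "Dee"],
    "Salary": [100.0, np.nan, 300.0, np.nan],
    "College": ["Duke", "UCLA", None, None],
})

# True for each row that has at least one null value
rows_with_na = data.isna().any(axis=1)

# Count the flagged rows directly
count_direct = int(rows_with_na.sum())

# Count them as in Example 1, by comparing lengths before and after dropna()
count_by_diff = len(data) - len(data.dropna(axis=0, how="any"))

print(count_direct, count_by_diff)  # 3 3
```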
Example 2: Changing axis and using the how and inplace parameters. Two data frames are created from the same file. A column in which every value is None is added to the new data frame. The column names are printed to verify that the null column was inserted properly. Then the number of columns is compared before and after dropping the all-null column.
Python3
import pandas as pd

# Read the same file twice so that one copy stays unchanged for comparison
data = pd.read_csv("nba.csv")
new = pd.read_csv("nba.csv")

# Add a column in which every value is null
new["Null Column"] = None

# Verify that the null column was added
print(data.columns.values, "\n", new.columns.values)

print("\nColumn number before dropping Null column\n",
      len(data.dtypes), len(new.dtypes))

# Drop columns whose values are all null, modifying new in place
new.dropna(axis=1, how='all', inplace=True)

print("\nColumn number after dropping Null column\n",
      len(data.dtypes), len(new.dtypes))
Output:
['Name' 'Team' 'Number' 'Position' 'Age' 'Height' 'Weight' 'College'
'Salary']
['Name' 'Team' 'Number' 'Position' 'Age' 'Height' 'Weight' 'College'
'Salary' 'Null Column']
Column number before dropping Null column
9 10
Column number after dropping Null column
9 9
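The difference between the default behaviour and inplace=True can be seen on a small data frame. The following is a minimal sketch; the data frame df and its columns A and B are made up for illustration.

```python
import pandas as pd

# Hypothetical data frame with one column that is entirely null
df = pd.DataFrame({"A": [1, 2, 3], "B": [None, None, None]})

# Default (inplace=False): a new data frame is returned, df is left unchanged
cleaned = df.dropna(axis=1, how="all")
cols_after_default = list(df.columns)

# inplace=True: df itself is modified and the method returns None
result = df.dropna(axis=1, how="all", inplace=True)

print(list(cleaned.columns))  # ['A']
print(cols_after_default)     # ['A', 'B']
print(result)                 # None
print(list(df.columns))       # ['A']
```

Because the in-place call returns None, writing df = df.dropna(..., inplace=True) would replace the data frame with None.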
Last Updated: 31 Mar, 2023