How To Check If Cell Is Empty In Pandas Dataframe
Last Updated :
05 Feb, 2024
Pandas is an open-source library built on top of the NumPy library. It is a Python package that offers various data structures and operations for manipulating numerical data and time series. One common task when working with a Pandas data frame is checking it for empty cells.
Empty cell
An empty cell, or missing value, in a Pandas data frame is a cell that holds no actual value. Pandas marks such cells with a placeholder such as NaN or None.
Check If Cell Is Empty In Pandas Dataframe
Pandas has different ways of checking for empty cells within a data frame.
- Using isnull() method
- Using isna() method
- Checking for empty cells explicitly
Importing Pandas and Numpy
Python3
import pandas as pd
import numpy as np
Creating a sample dataset
Here, we are creating a data frame with two columns of numerical values, each with one missing entry. The None values are stored as NaN because the columns are numeric.
Python3
df = pd.DataFrame({'col1': [1, 2, None], 'col2': [3, None, 5]})
df
Output:
   col1  col2
0   1.0   3.0
1   2.0   NaN
2   NaN   5.0
1. Using the isnull() method:
isnull() is a method used to identify the cells in a data frame that contain missing values, which Pandas represents as NaN (not a number) or None. It reports both in the same way, so it can't tell the difference between an explicit NaN value and a cell that was simply left empty.
Applying isnull() to the data frame returns a data frame of the same shape filled with boolean values: True if the cell is empty or NaN, and False if the cell contains a value.
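The call itself is not shown here. A minimal sketch, assuming the sample data frame created earlier (the variable name null_df is only illustrative), would be:

```python
import pandas as pd

# Same sample frame as above: None becomes NaN in a numeric column
df = pd.DataFrame({'col1': [1, 2, None], 'col2': [3, None, 5]})

# True marks a missing cell, False marks a cell that holds a value
null_df = df.isnull()
print(null_df)
```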
Output:
    col1   col2
0  False  False
1  False   True
2   True  False
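One caveat is worth illustrating: isnull() only flags values that Pandas treats as missing, so a cell holding an empty string is reported as False. The sketch below uses a hypothetical text_df frame (not part of the example above) to show this, along with one possible way to count blank strings as empty too.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a name, an empty string, a whitespace-only string, a NaN
text_df = pd.DataFrame({'name': ['Ann', '', '  ', np.nan]})

# isnull() flags only the NaN; '' and '  ' are treated as ordinary values
is_missing = text_df['name'].isnull()
print(is_missing.tolist())

# Strip whitespace and compare with '' to find blank strings (NaN gives False)
is_blank = text_df['name'].str.strip().eq('')

# Treat a cell as empty if it is either missing or blank
is_empty = is_missing | is_blank
print(is_empty.tolist())
```

Whether a blank string should count as an empty cell depends on the dataset, so this combined mask is an assumption about what "empty" means rather than built-in Pandas behaviour.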
2. Using isna() method:
isna() is a Pandas method that detects missing or undefined values within the data frame and gives the same result as isnull().
The only difference between the isna() and isnull() methods is their naming: in Pandas, isnull() is an alias for isna(). Both methods can be used interchangeably to achieve the same outcome.
Here, we are creating a new variable for saving the values that we get by applying the isna() method to the data frame and printing them.
Python3
na_df = df.isna()
print(na_df)
Output:
    col1   col2
0  False  False
1  False   True
2   True  False
You can observe that the isnull() and isna() methods have given the same output.
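The agreement between the two methods can also be confirmed in code, and the same idea extends to checking one specific cell. This is a minimal sketch, assuming the sample data frame from above; the variable names same_result, cell_empty and cell_filled are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, None], 'col2': [3, None, 5]})

# equals() returns True only if both boolean frames match cell for cell
same_result = df.isna().equals(df.isnull())
print(same_result)

# Check one cell: .at[row_label, column_label] reads a single value,
# and pd.isna() reports whether that scalar is missing
cell_empty = pd.isna(df.at[1, 'col2'])   # row 1, col2 holds NaN
cell_filled = pd.isna(df.at[0, 'col1'])  # row 0, col1 holds 1.0
print(cell_empty, cell_filled)
```

Using pd.isna() on the scalar, rather than comparing with == np.nan, matters because NaN is never equal to itself.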
3. Checking for empty cells explicitly
We can identify the rows or columns that are entirely empty by using the all() function along with the isnull() function.
Python3
data = {'A': [1, 2, np.nan, 4],
        'B': [5, np.nan, np.nan, 8],
        'C': [np.nan, np.nan, np.nan, np.nan]}
df = pd.DataFrame(data)
df
Output:
     A    B   C
0  1.0  5.0 NaN
1  2.0  NaN NaN
2  NaN  NaN NaN
3  4.0  8.0 NaN
Check for empty cells using boolean indexing
Here we are checking whether our dataset contains any empty rows. ‘axis=1’ checks whether all values along axis 1 (i.e., across the columns of each row) are True. If all values in a row are True, it means that all cells in that row are null.
Python3
empty_rows = df.isnull().all(axis=1)
empty_rows
Output:
0 False
1 False
2 True
3 False
dtype: bool
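The boolean Series above can be used directly as a mask for boolean indexing. The following is a minimal sketch, assuming the same data frame; the names only_empty and non_empty are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, np.nan, 8],
                   'C': [np.nan, np.nan, np.nan, np.nan]})

# Mask that is True for rows in which every cell is missing
empty_rows = df.isnull().all(axis=1)

# Boolean indexing: keep only the rows where the mask is True
only_empty = df[empty_rows]

# Invert the mask with ~ to keep the rows that hold at least one value
non_empty = df[~empty_rows]

print(only_empty.index.tolist())
print(non_empty.index.tolist())
```

For this data, dropping the empty rows with the inverted mask gives the same result as df.dropna(how='all').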
Here we are checking whether our dataset contains any empty columns.
‘axis=0’ checks whether all values along axis 0 (i.e., down the rows of each column) are True. If all values in a column are True, it means that all cells in that column are null.
Python3
empty_columns = df.isnull().all(axis=0)
print(empty_columns)
Output:
A False
B False
C True
dtype: bool
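Building on this result, the sketch below lists the names of the empty columns, counts the missing cells in each column, and drops the empty columns. It assumes the same data frame; empty_names, missing_counts and trimmed are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan, 4],
                   'B': [5, np.nan, np.nan, 8],
                   'C': [np.nan, np.nan, np.nan, np.nan]})

# Mask that is True for columns in which every cell is missing
empty_columns = df.isnull().all(axis=0)

# Names of the columns in which every cell is missing
empty_names = empty_columns[empty_columns].index.tolist()

# Number of missing cells in each column (True counts as 1)
missing_counts = df.isnull().sum()

# Keep only the columns that hold at least one value
trimmed = df.loc[:, ~empty_columns]

print(empty_names)
print(missing_counts.tolist())
print(trimmed.columns.tolist())
```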
Conclusion
In conclusion, Pandas provides different methods for identifying empty cells in data frames. We can use the isnull() and isna() functions to find the empty cells in a data frame, and we can combine them with all() to explicitly check for empty rows and columns.