
How to Drop Rows with NaN Values in Pandas DataFrame?

NaN stands for Not a Number and is one of the most common ways to represent a missing value in data. It is a special floating-point value and cannot be converted to an integer, which is why a numeric column containing NaN is stored as float. Missing values are a common problem in data analysis, and they need to be handled before the results can be trusted. In this article, we will discuss how to drop rows with NaN values, along with two alternatives that replace them instead.

What are NaN values?

NaN (Not a Number) is a unique floating-point value that is frequently used to indicate missing, undefined or unrepresentable results in numerical computations.
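The properties described above can be checked directly. The following is a minimal sketch; the variable names are chosen for illustration only.

```python
import math
import numpy as np
import pandas as pd

value = np.nan

# NaN is a float and never compares equal to anything, including itself.
is_float = isinstance(value, float)      # True
equals_itself = (value == value)         # False

# Use dedicated checks instead of == to detect NaN.
detected_by_math = math.isnan(value)     # True
detected_by_pandas = pd.isna(value)      # True

# A column of whole numbers that contains a NaN is stored as float64.
series = pd.Series([1, 2, np.nan])
print(series.dtype)                      # float64
print(series.isna().tolist())            # [False, False, True]
```

Because NaN is not equal to itself, a comparison such as value == np.nan is always False, so isna() or math.isnan() should be used for detection.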



Why remove NaN values?

NaN values can distort an analysis: they propagate through arithmetic and can silently change the result of aggregations, so they need to be handled before any computation is trusted.
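As a small illustration of why NaN values matter, the sketch below compares a plain NumPy mean with the pandas mean on the same data. The readings series is made up for this example.

```python
import numpy as np
import pandas as pd

readings = pd.Series([10.0, np.nan, 30.0])

# Plain NumPy arithmetic propagates NaN: one missing value makes the result NaN.
numpy_mean = np.mean(readings.to_numpy())    # nan

# pandas skips NaN by default (skipna=True), so only 2 of the 3 rows are used.
pandas_mean = readings.mean()                # 20.0

# Counting the missing values first makes the situation explicit.
missing_count = int(readings.isna().sum())   # 1
```

Neither behaviour is wrong, but both can surprise you if you do not know that NaN values are present.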

How to remove NaN values in Python pandas?

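There are several ways to handle NaN values in a dataset using pandas. The most popular techniques are dropna(), which removes rows that contain NaN values, and fillna() and interpolate(), which replace NaN values instead of removing them.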



Using dropna()

We can drop rows that contain NaN values from a pandas DataFrame by using the dropna() function:

 df.dropna() 

It is also possible to drop rows with NaN values with regard to particular columns using the following statement:

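df.dropna(subset=['column_name'], inplace=True)

Here subset is a list of column names to check for NaN ('column_name' is a placeholder), and inplace=True modifies the DataFrame directly instead of returning a new one. Note that subset must be passed as a keyword argument, because the first positional parameter of dropna() is axis.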

Let’s create a DataFrame and remove the rows with NaN values to clean the data.




import pandas as pd
import numpy as np
 
data = pd.DataFrame({'A': [1, 2, np.nan, 4], 'B': [5, 6, 7, 8],
                     'C': [10, 11, 12, np.nan], 'D': [21, 22, 23, 24]})
print(data)

Output:

     A  B     C   D
0  1.0  5  10.0  21
1  2.0  6  11.0  22
2  NaN  7  12.0  23
3  4.0  8   NaN  24





data = data.dropna() # drop rows with nan values
print(data)

Output:

     A  B     C   D
0  1.0  5  10.0  21
1  2.0  6  11.0  22
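dropna() also accepts parameters that control which rows are removed. The sketch below rebuilds the same DataFrame and tries subset, how and thresh. The result variable names are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, 6, 7, 8],
    'C': [10, 11, 12, np.nan],
    'D': [21, 22, 23, 24],
})

# Only look at column 'A' when deciding which rows to drop.
by_subset = data.dropna(subset=['A'])
print(by_subset.index.tolist())   # [0, 1, 3]

# how='all' drops a row only if every value in it is NaN (no such row here).
by_all = data.dropna(how='all')
print(len(by_all))                # 4

# thresh=4 keeps only rows that have at least 4 non-NaN values.
by_thresh = data.dropna(thresh=4)
print(by_thresh.index.tolist())   # [0, 1]

# reset_index(drop=True) renumbers the remaining rows from 0.
clean = data.dropna().reset_index(drop=True)
print(clean.index.tolist())       # [0, 1]
```

Dropped rows leave gaps in the index, so resetting it is often convenient before further processing.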

Using fillna()

Instead of dropping rows, we can use the fillna() method to replace NaN values in a DataFrame with a value of our choice. A fill value must be supplied:

df = df.fillna(value)




import pandas as pd
import numpy as np
 
car = pd.DataFrame({'Year of Launch': [1999, np.nan, 1986, 2020, np.nan,
                          1991],
       'Engine Number': [np.nan, 15, 22, 43, 44, np.nan],
       'Chasis Unique Id': [4023, np.nan, 3115, 4522, 3643,
                            3774]})
car

Output:

   Year of Launch  Engine Number  Chasis Unique Id
0          1999.0            NaN            4023.0
1             NaN           15.0               NaN
2          1986.0           22.0            3115.0
3          2020.0           43.0            4522.0
4             NaN           44.0            3643.0
5          1991.0            NaN            3774.0




car_filled = car.fillna(0)
car_filled

Output:

   Year of Launch  Engine Number  Chasis Unique Id
0          1999.0            0.0            4023.0
1             0.0           15.0               0.0
2          1986.0           22.0            3115.0
3          2020.0           43.0            4522.0
4             0.0           44.0            3643.0
5          1991.0            0.0            3774.0

All NaN values have been replaced by 0.
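A single constant such as 0 is not always a sensible replacement (a launch year of 0, for instance). As a sketch of two alternatives, the example below fills each column with its own value and also forward-fills. The choice of the median and of -1 as a sentinel is an assumption made for illustration, not a recommendation.

```python
import numpy as np
import pandas as pd

car = pd.DataFrame({
    'Year of Launch': [1999, np.nan, 1986, 2020, np.nan, 1991],
    'Engine Number': [np.nan, 15, 22, 43, 44, np.nan],
})

# A dict maps each column name to its own fill value.
per_column = car.fillna({
    'Year of Launch': car['Year of Launch'].median(),  # 1995.0
    'Engine Number': -1,                               # sentinel for "unknown"
})
print(per_column['Year of Launch'].tolist())
# [1999.0, 1995.0, 1986.0, 2020.0, 1995.0, 1991.0]

# ffill() copies the last valid value forward; a leading NaN has nothing
# before it and therefore stays NaN.
forward = car.ffill()
print(forward['Year of Launch'].tolist())
# [1999.0, 1999.0, 1986.0, 2020.0, 2020.0, 1991.0]
```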

Using interpolate()

The interpolate() method estimates missing values from neighboring data points. By default it interpolates linearly down each column. It is particularly useful for time series data. df.interpolate() returns a new DataFrame with the NaN values replaced, so assign the result back to a variable to keep it.




import pandas as pd
import numpy as np
 
dit = pd.DataFrame({'August': [32, 34, 4.85, 71.2, 1.1],
       'September': [54, 68, 9.25, np.nan, 0.9],
       'October': [ 5.8, 8.52, np.nan, 1.6, 11],
       'November': [ 5.8, 50, 8.9, 77, 78]})
dit

Output:

   August  September  October  November
0   32.00      54.00     5.80       5.8
1   34.00      68.00     8.52      50.0
2    4.85       9.25      NaN       8.9
3   71.20        NaN     1.60      77.0
4    1.10       0.90    11.00      78.0




dit = dit.interpolate()
dit

Output:

   August  September  October  November
0   32.00     54.000     5.80       5.8
1   34.00     68.000     8.52      50.0
2    4.85      9.250     5.06       8.9
3   71.20      5.075     1.60      77.0
4    1.10      0.900    11.00      78.0
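Each filled value above is the midpoint of its two neighbors, for example (8.52 + 1.60) / 2 = 5.06. The sketch below shows two options that are often relevant for time series: limit, which caps how many consecutive NaN values are filled, and method='time', which takes the spacing of a datetime index into account. The series and dates are invented for this example.

```python
import numpy as np
import pandas as pd

temps = pd.Series([10.0, np.nan, np.nan, 40.0])

# Default linear interpolation treats the rows as equally spaced.
linear = temps.interpolate()
print(linear.tolist())            # [10.0, 20.0, 30.0, 40.0]

# limit=1 fills at most one consecutive NaN per gap; the rest stay NaN.
limited = temps.interpolate(limit=1)
print(limited.isna().tolist())    # [False, False, True, False]

# With a DatetimeIndex, method='time' weights by the actual time gaps.
dates = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-05'])
uneven = pd.Series([0.0, np.nan, 4.0], index=dates)

by_position = uneven.interpolate()             # middle value: 2.0
by_time = uneven.interpolate(method='time')    # middle value: about 1.0
```

The missing point lies one day into a four-day gap, so the time-based estimate is a quarter of the way between the neighbors, while the positional estimate is halfway.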

Conclusion

Dealing with NaN values is a crucial aspect of data analysis, as these values can significantly affect the integrity of analytical results. In this article, we discussed what NaN (Not a Number) values are, how to drop rows that contain them with dropna(), and how to replace them with fillna() or interpolate() when dropping rows would discard too much data.

Frequently Asked Questions (FAQs)

1. How to replace NaN with no value in pandas?

In pandas, use `df.fillna('No Value', inplace=True)` to replace NaN with the string 'No Value' in the DataFrame.

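2. How do I drop NaN values in a list?

In Python, after import math, use a list comprehension such as cleaned = [x for x in values if not math.isnan(x)] to drop NaN values from a list of numbers. Comparing str(x) != 'nan' also works but depends on string formatting, and naming a variable list shadows the built-in type.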
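A minimal sketch of filtering NaN out of a plain Python list follows. The lists are made-up sample data, and the second variant is one possible way to handle lists that also contain non-numeric items.

```python
import math

values = [1.5, float('nan'), 3.0, float('nan'), 7]

# math.isnan() accepts ints and floats, so this works for numeric lists.
cleaned = [x for x in values if not math.isnan(x)]
print(cleaned)          # [1.5, 3.0, 7]

# math.isnan() raises TypeError for strings, so check the type first
# when the list mixes numbers with other objects.
mixed = ['a', float('nan'), 2, None]
cleaned_mixed = [x for x in mixed
                 if not (isinstance(x, float) and math.isnan(x))]
print(cleaned_mixed)    # ['a', 2, None]
```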

3. How to replace NaN in a NumPy array?

For a NumPy array, use np.nan_to_num(array) to replace NaN values with zeros.
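As a short sketch, the example below replaces NaN with zero, replaces it with a custom value, and, as an alternative, removes the NaN entries with a boolean mask. The array contents are sample data.

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0])

# nan_to_num() returns a new array with NaN replaced by 0.0.
zeros = np.nan_to_num(arr)
print(zeros.tolist())      # [1.0, 0.0, 3.0]

# The nan= keyword (NumPy 1.17 and later) chooses a different replacement.
custom = np.nan_to_num(arr, nan=-1.0)
print(custom.tolist())     # [1.0, -1.0, 3.0]

# A boolean mask drops the NaN entries instead of replacing them.
dropped = arr[~np.isnan(arr)]
print(dropped.tolist())    # [1.0, 3.0]
```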

