Real-world datasets very often have missing values, which Pandas represents as NaN (Not a Number). Most machine learning models need complete data, so we use imputation techniques to replace the NaN values with plausible ones. Before doing that, we need a good understanding of how the NaN values are distributed in the dataset.

The **Missingno** library offers a convenient way to visualize the distribution of NaN values. It is a Python library that works with Pandas DataFrames.

**Install the library:**

```shell
pip install missingno
```

The examples below read the dataset from a file named `kamyr-digester.csv`.

## Matrix

The matrix draws one column per variable and one row per record, so you can quickly spot patterns of missingness in the dataset. In our example, the columns `AAWhiteSt-4` and `SulphidityL-4` have a similar pattern of missing values, while `UCZAA` shows a different pattern.

```python
# Program to visualize missing values in dataset

# Importing the libraries
import pandas as pd
import missingno as msno

# Loading the dataset
df = pd.read_csv("kamyr-digester.csv")

# Visualize missing values as a matrix
msno.matrix(df)
```
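The information the matrix visualizes can also be inspected numerically. The following is a minimal sketch using plain Pandas, assuming a small made-up DataFrame (the name `toy` and the columns `x`, `y`, `z` are hypothetical and are not part of the dataset above). It builds a boolean mask of missing cells and counts the non-missing values in each row.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the dataset used in the article
toy = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, 4.0],
    "y": [np.nan, np.nan, 6.0, 7.0],
    "z": [8.0, 9.0, 10.0, 11.0],
})

# True wherever a value is missing
missing_mask = toy.isna()

# Number of non-missing values in each row
row_completeness = toy.notna().sum(axis=1)

print(missing_mask)
print(row_completeness.tolist())
```

Columns whose `True` values fall on the same rows of the mask are the ones that would look alike in the matrix plot.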


## Bar Chart

The bar chart gives you an idea of how many values are missing in each column. Each bar shows how many values are present in a column, so shorter bars indicate more missing values. In our example, `AAWhiteSt-4` and `SulphidityL-4` contain the most missing values, followed by `UCZAA`.

```python
# Program to visualize missing values in dataset

# Importing the libraries
import pandas as pd
import missingno as msno

# Loading the dataset
df = pd.read_csv("kamyr-digester.csv")

# Visualize the number of missing
# values as a bar chart
msno.bar(df)
```
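If you want the exact numbers behind such a chart, plain Pandas can count them. This is a minimal sketch on a made-up DataFrame (the name `toy` and the columns `x`, `y`, `z` are hypothetical), not the output for the dataset above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the dataset used in the article
toy = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, 4.0],
    "y": [np.nan, np.nan, 6.0, 7.0],
    "z": [8.0, 9.0, 10.0, 11.0],
})

# Missing values per column, largest first
missing_counts = toy.isna().sum().sort_values(ascending=False)

# Non-missing values per column
present_counts = toy.notna().sum()

print(missing_counts)
print(present_counts)
```

Sorting the counts in descending order puts the columns that need the most attention during imputation at the top.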


## Heatmap

The heatmap shows the correlation of missingness between every pair of columns. In our example, the correlation between `AAWhiteSt-4` and `SulphidityL-4` is 1, which means that whenever one of them is present, the other is present as well.

- A value near **-1** means that if one variable appears, the other variable is very likely to be **missing**.
- A value near **0** means there is **no dependence** between the occurrence of missing values in the two variables.
- A value near **1** means that if one variable appears, the other variable is very likely to be **present**.

```python
# Program to visualize missing values in dataset

# Importing the libraries
import pandas as pd
import missingno as msno

# Loading the dataset
df = pd.read_csv("kamyr-digester.csv")

# Visualize the correlation between the number of
# missing values in different columns as a heatmap
msno.heatmap(df)
```
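The -1, 0 and 1 interpretation can be checked by hand with plain Pandas. The sketch below is an approximation of the idea rather than Missingno's own implementation, and it assumes a made-up DataFrame (the name `toy` and the columns `a`, `b`, `c` are hypothetical). It correlates the missing/present indicators of each pair of columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the dataset used in the article
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
    # "b" is missing in exactly the same rows as "a"
    "b": [10.0, np.nan, 30.0, np.nan, 50.0, 60.0],
    # "c" is present only in the rows where "a" is missing
    "c": [np.nan, 2.0, np.nan, 2.5, np.nan, np.nan],
})

# True wherever a value is missing
nullity = toy.isna()

# Columns that are always full or always empty have no variance,
# so their correlation is undefined; keep only the varying ones
varying = nullity.loc[:, nullity.nunique() > 1]

# Pearson correlation between the 0/1 missingness indicators
nullity_corr = varying.astype(int).corr()

print(nullity_corr)
```

Here `a` and `b` correlate at 1 because they are always missing together, while `a` and `c` correlate at -1 because one is present exactly when the other is missing.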


Reference: https://github.com/ResidentMario/missingno
