# KDE Plot Visualization with Pandas and Seaborn

A Kernel Density Estimate (KDE) plot is a visualization technique that offers a detailed view of the probability density of a continuous variable. In this article, we use the Iris dataset and KDE plots to explore how its measurements are distributed.

## What is KDE Plot?

A KDE plot, short for Kernel Density Estimate plot, is used to visualize the probability density of a continuous variable. It shows the estimated density at each value the variable can take, giving a smoothed representation of the underlying distribution of a dataset. Several samples can also be drawn on a single graph, which makes it easier to compare them.

The KDE plot visually represents the distribution of data, providing insights into its shape, central tendency, and spread. It is particularly useful when dealing with continuous data or when you want to explore the distribution without making assumptions about a specific parametric form (e.g., assuming the data follows a normal distribution). KDE plots are commonly used in statistical software packages and libraries for data visualization, such as Seaborn and Matplotlib in Python.
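To make the idea of a smoothed density concrete, here is a minimal sketch of a Gaussian KDE computed by hand with NumPy and compared against `scipy.stats.gaussian_kde`. The sample values, the helper name `gaussian_kde_manual`, and the use of Scott's rule for the bandwidth are assumptions made for illustration; they are not taken from this article.

```python
import numpy as np
from scipy.stats import gaussian_kde

def gaussian_kde_manual(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `sample` on `grid`."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance between every grid point and every observation
    z = (grid[:, None] - sample[None, :]) / bandwidth
    # Standard normal kernel evaluated at each scaled distance
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale by the bandwidth
    return kernel.sum(axis=1) / (sample.size * bandwidth)

# Hypothetical sample, chosen only for illustration
sample = np.array([4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.4, 6.9, 7.1, 7.7])
grid = np.linspace(1.0, 12.0, 1101)

# Scott's rule, which is also SciPy's default bandwidth for 1-D data
bandwidth = sample.std(ddof=1) * sample.size ** (-1 / 5)

density_manual = gaussian_kde_manual(sample, grid, bandwidth)
density_scipy = gaussian_kde(sample)(grid)
```

Each observation contributes one small bell curve, and the estimate is their average. A larger bandwidth gives a smoother curve, while a smaller one follows the individual observations more closely.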

## Implementation

Let's import pandas, Seaborn, and Matplotlib, which the KDE plot examples rely on.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
```

### Creating a Univariate Seaborn KDE Plot

We start with a univariate Seaborn KDE plot, which visualizes the probability distribution of a sample over a single continuous attribute.

```python
# Import the required libraries
from sklearn import datasets
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# In a Jupyter notebook you can also run: %matplotlib inline

# Set up the DataFrame
iris = datasets.load_iris()
iris_df = pd.DataFrame(iris.data, columns=['Sepal_Length', 'Sepal_Width',
                                           'Petal_Length', 'Petal_Width'])

# Map the numeric class labels to species names
iris_df['Target'] = pd.Series(iris.target).map(
    {0: 'Iris_Setosa', 1: 'Iris_Versicolor', 2: 'Iris_Virginica'})

# Plot the KDE of sepal length for one species
# (fill=True replaces the deprecated shade=True)
sns.kdeplot(x=iris_df.loc[iris_df['Target'] == 'Iris_Virginica', 'Sepal_Length'],
            color='b', fill=True, label='Iris_Virginica')

# Set the X and Y labels
plt.xlabel('Sepal Length')
plt.ylabel('Probability Density')
plt.legend()
plt.show()
```


We can also visualize the probability distribution of multiple samples in a single plot.

```python
# Plot the KDE of sepal length for two species on the same axes
sns.kdeplot(x=iris_df.loc[iris_df['Target'] == 'Iris_Setosa', 'Sepal_Length'],
            color='r', fill=True, label='Iris_Setosa')
sns.kdeplot(x=iris_df.loc[iris_df['Target'] == 'Iris_Virginica', 'Sepal_Length'],
            color='b', fill=True, label='Iris_Virginica')

plt.xlabel('Sepal Length')
plt.ylabel('Probability Density')
plt.legend()
plt.show()
```


### Creating a Bivariate Seaborn KDE Plot

Moving beyond univariate analysis, a bivariate Seaborn KDE plot shows the joint probability distribution of a sample over two continuous attributes, drawn as density contours.

```python
# Set up the samples
iris_setosa = iris_df.query("Target == 'Iris_Setosa'")
iris_virginica = iris_df.query("Target == 'Iris_Virginica'")

# Plot the bivariate KDE. Recent seaborn releases require x and y as
# keyword arguments, and fill/thresh replace shade/shade_lowest.
sns.kdeplot(x=iris_setosa['Sepal_Length'], y=iris_setosa['Sepal_Width'],
            fill=True, cmap="Reds", thresh=0.05)
plt.show()
```


The same approach extends to several samples: calling `kdeplot` once per species draws their bivariate densities on the same axes.

```python
from matplotlib.patches import Patch

# Draw both species on the same axes; pass x and y as keyword arguments
sns.kdeplot(x=iris_setosa['Sepal_Length'], y=iris_setosa['Sepal_Width'],
            fill=True, cmap="Reds", thresh=0.05)
sns.kdeplot(x=iris_virginica['Sepal_Length'], y=iris_virginica['Sepal_Width'],
            fill=True, cmap="Blues", thresh=0.05)

plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Bivariate Seaborn KDE Plot')
# Filled contours may not create legend entries, so build them by hand
plt.legend(handles=[Patch(color='r', label='Iris_Setosa'),
                    Patch(color='b', label='Iris_Virginica')])
plt.show()
```

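The contours in a bivariate KDE plot are level sets of an estimated joint density surface. As a rough sketch of what such a surface looks like numerically, the example below evaluates a two-dimensional Gaussian KDE on a grid using `scipy.stats.gaussian_kde` rather than Seaborn's own implementation. The grid limits and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn import datasets

iris = datasets.load_iris()
# Sepal length and sepal width of the Setosa class (target 0)
setosa = iris.data[iris.target == 0][:, :2]

# gaussian_kde expects an array of shape (n_dimensions, n_observations)
kde = gaussian_kde(setosa.T)

# Evaluate the joint density on a regular grid over both attributes
length_axis = np.linspace(3.5, 6.5, 121)
width_axis = np.linspace(1.5, 5.5, 161)
length_grid, width_grid = np.meshgrid(length_axis, width_axis)
density = kde(np.vstack([length_grid.ravel(), width_grid.ravel()]))
density = density.reshape(length_grid.shape)

# Location of the highest estimated density on the grid
row, col = np.unravel_index(density.argmax(), density.shape)
peak_length, peak_width = length_axis[col], width_axis[row]
print(peak_length, peak_width)
```

Passing `length_grid`, `width_grid`, and `density` to Matplotlib's `contourf` would draw filled contours comparable to those of the bivariate KDE plot.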

## Conclusion

In conclusion, the KDE plot is a practical tool for exploring how continuous data is distributed. By visualizing probability density for a single attribute or jointly for two, it helps data analysts and scientists compare groups and spot features such as peaks and skew. For both univariate and bivariate analysis, it is a versatile part of the data visualization toolkit.

### 1. What is the purpose of a KDE plot?

The KDE plot visually represents the probability density of a continuous variable, offering insights into the data’s distribution, shape, and central tendency.

### 2. What is the use of KDE in Python?

In Python, KDE (Kernel Density Estimation) is used to estimate and visualize probability density functions. Seaborn provides `kdeplot`, pandas offers `plot.kde`, and SciPy exposes `scipy.stats.gaussian_kde`, with Matplotlib handling the rendering.

### 3. What is the difference between a histogram and a KDE plot?

While histograms display data distribution through bins, KDE plots use a smooth curve to estimate probability density, providing a continuous and visually refined representation of the underlying distribution.

### 4. What does a kernel density plot show?

A kernel density plot shows the smoothed probability density of a dataset. It highlights peaks, modes, and trends, aiding in the visual exploration of continuous variable distributions.
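Peaks can also be located numerically. The sketch below is one possible approach, not a method from this article: it estimates the density of petal length pooled over all three species with `scipy.stats.gaussian_kde` and finds local maxima with `scipy.signal.find_peaks`. Petal length is chosen because Setosa petals are much shorter than those of the other two species; the grid limits are an assumption.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde
from sklearn import datasets

iris = datasets.load_iris()
petal_length = iris.data[:, 2]  # third column is petal length in cm

# Evaluate the estimated density on a fine grid
grid = np.linspace(0.0, 8.0, 801)
density = gaussian_kde(petal_length)(grid)

# Indices of local maxima of the estimated density
peak_indices, _ = find_peaks(density)
modes = grid[peak_indices]
print(modes)
```

The number of peaks found depends on the bandwidth, so treat the result as a property of the smoothed estimate rather than of the raw data.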
