Seaborn | Distribution Plots

Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. This article covers seaborn's distribution plots, which are used to examine univariate and bivariate distributions. We will discuss four types of distribution plots:

  1. jointplot
  2. distplot
  3. pairplot
  4. rugplot

Besides providing different kinds of plots, seaborn also ships with some built-in datasets. We will be using the tips dataset in this article. The “tips” dataset records restaurant bills: the total bill, the tip, the sex of the person paying, whether the party included smokers, the day and time of the meal, and the size of the party. Let's have a look at it.

Code:



# import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Jupyter/IPython magic to show plots inline; omit it in a plain Python script
%matplotlib inline

# ignore warnings
from warnings import filterwarnings
filterwarnings('ignore')

# load the dataset
df = sns.load_dataset('tips')

# the first five entries of the dataset
df.head()




Now, let's move on to the plots.
 

Distplot

It is used for a univariate set of observations, which it visualizes as a histogram. Because it takes a single variable, we pass one particular column of the dataset.
Syntax:

distplot(a[, bins, hist, kde, rug, fit, ...])

Example:

# set the background style of the plot
sns.set_style('whitegrid')
sns.distplot(df['total_bill'], kde=False, color='red', bins=30)



Output:

Explanation:

  • kde controls whether a Kernel Density Estimation (KDE) curve is drawn over the histogram; with kde = False only the histogram is shown. A KDE can also be drawn as a plot of its own in seaborn.
  • bins sets the number of bins in the histogram; a suitable value depends on your dataset.
  • color sets the color of the plot.
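
To make the bins and kde options more concrete, here is a minimal sketch in plain Python of what a histogram and a kernel density estimate compute. The sample values, the helper names histogram_counts and gaussian_kde, and the bandwidth of 3.0 are all assumptions made for illustration; they are not taken from the tips dataset or from seaborn's internals.

```python
import math

# Hypothetical sample of bill amounts, invented for illustration only
# (these are NOT values from the tips dataset).
bills = [8.5, 10.3, 12.0, 13.4, 15.0, 16.3, 18.2, 21.0, 24.6, 31.3]


def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def gaussian_kde(values, x, bandwidth):
    """Estimate the density at x as the average of one Gaussian bump per value."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
    )


# More bins give a finer, but noisier, picture of the same data.
counts_3 = histogram_counts(bills, bins=3)
counts_5 = histogram_counts(bills, bins=5)

# The estimated density is higher where the sample values are concentrated.
density_at_15 = gaussian_kde(bills, x=15.0, bandwidth=3.0)
density_at_40 = gaussian_kde(bills, x=40.0, bandwidth=3.0)
```

Changing bins only changes how the same observations are grouped, while the bandwidth plays a similar role for the KDE: a larger value gives a smoother curve.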

Looking at this plot, we can say that most total bills lie between 10 and 20.

Jointplot

It draws a plot of two variables that combines a bivariate graph in the center with a univariate graph of each variable on the margins.
Syntax:

jointplot(x, y[, data, kind, stat_func, ...])    

Example:

sns.jointplot(x='total_bill', y='tip', data=df)



Output:

sns.jointplot(x='total_bill', y='tip', data=df, kind='kde')
# kind='kde' draws density contours; the density is highest where the points cluster most




Explanation: