
Introduction to Seaborn – Python

Prerequisite: Matplotlib Library

Visualization is an important part of storytelling: we can learn a lot about a dataset simply by plotting its features. Python provides numerous libraries for data visualization. Having already covered the Matplotlib library, in this article we will look at the Seaborn library.



What is Seaborn

Seaborn is a visualization library for statistical graphics in Python. It provides attractive default styles and color palettes for statistical plots. It is built on top of the matplotlib library and is closely integrated with the data structures from pandas.
Seaborn aims to make visualization the central part of exploring and understanding data. It provides dataset-oriented APIs so that we can switch between different visual representations for the same variables for a better understanding of the dataset.

Different categories of plot in Seaborn 

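Plots are mainly used to visualize the relationship between variables. These variables can be purely numerical or categorical, such as a group, class, or division. Seaborn organizes its plotting functions into the following categories: relational plots, distribution plots, categorical plots, regression plots, matrix plots, and multi-plot grids.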
 



Installation of Seaborn Library 

For a pip-based Python environment:

pip install seaborn

For a conda environment:

conda install seaborn
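
After installing with either command, a quick way to confirm that the library can be imported is to print its version. This is only a sanity check; the version printed depends on your environment.

```python
import seaborn as sns

# If the import succeeds, Seaborn and its required dependencies are installed
print(sns.__version__)
```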

Dependencies for Seaborn Library 

Some libraries must be installed before using Seaborn. In recent versions the mandatory ones are NumPy, pandas, and Matplotlib, while SciPy and statsmodels are optional and enable some of the more advanced statistical features.

However, we must note that if try to use Seaborn 

Some basic plots using seaborn

Histplot: Seaborn's histplot visualizes the distribution of a single variable (univariate) as a histogram, and it can also draw a bivariate histogram for two variables. It can optionally overlay a kernel density estimate curve, like the one drawn by kdeplot. The histplot function takes several arguments, but the important ones are:

  • data: This is the array, series, or dataframe that you want to visualize. It is optional: you can also pass a vector of values directly as x.
  • x: This specifies the column in the data to use for the histogram. If your data is a dataframe, you can specify the column by name.
  • y: This specifies the column in the data to use for the histogram when you want to create a bivariate histogram. By default, it is set to None, meaning that a univariate histogram will be plotted.
  • bins: This specifies the number of bins to use when dividing the data into intervals for plotting. By default, it is set to “auto”, which uses an algorithm to determine the optimal number of bins.
  • kde: This parameter controls whether to display a kernel density estimate (KDE) of the data in addition to the histogram. By default, it is set to False, meaning that a KDE will not be plotted.




import numpy as np
import seaborn as sns
 
sns.set(style="white")
 
# Generate a random univariate dataset
rs = np.random.RandomState(10)
d = rs.normal(size=100)
 
# Plot a simple histogram and kde
sns.histplot(d, kde=True, color="m")

Output: 

Histogram with seaborn 

Distplot: Seaborn's distplot visualizes the distribution of a single variable (univariate) by plotting a histogram, optionally combined with a KDE curve (as drawn by kdeplot) and a rug of tick marks (as drawn by rugplot).

The function takes several parameters, but the most important ones are:

  • a: This is the array, series, or list of data that you want to visualize. It is a required parameter.
  • bins: This specifies the number of bins to use when dividing the data into intervals for plotting. By default, it is set to None, in which case Seaborn chooses the number of bins with a reference rule.
  • kde: This parameter controls whether to display a kernel density estimate (KDE) of the data in addition to the histogram. By default, it is set to True, meaning that a KDE will be plotted.
  • hist: This parameter controls whether to display the histogram of the data. By default, it is set to True, meaning that a histogram will be plotted.




import numpy as np
import seaborn as sns
 
sns.set(style="white")
 
# Generate a random univariate dataset
rs = np.random.RandomState(10)
d = rs.normal(size=100)
 
# Define the colors to use
colors = ["r", "g", "b"]
 
# Plot the histogram, KDE curve, and rug, each in its own color
sns.distplot(d, kde=True, hist=True, bins=10, rug=True,
             hist_kws={"alpha": 0.3, "color": colors[0]},
             kde_kws={"color": colors[1], "lw": 2},
             rug_kws={"color": colors[2]})

Output:

Distplot using seaborn 

Note: The distplot function has been deprecated since Seaborn 0.11. Use histplot (axes-level) or displot (figure-level) instead.

Lineplot: The line plot is one of the most basic plots in the seaborn library. It is mainly used to visualize data that varies continuously, such as a time series.




import seaborn as sns
 
 
sns.set(style="dark")
fmri = sns.load_dataset("fmri")
 
# Plot the responses for different
# events and regions
sns.lineplot(x="timepoint",
             y="signal",
             hue="region",
             style="event",
             data=fmri)

Output : 

Lineplot using seaborn 

Lmplot: The lmplot is another basic plot. It draws the data points as a scatter in 2D space together with a line representing a fitted linear regression model. The x and y arguments specify the variables shown on the horizontal and vertical axes, respectively.




import seaborn as sns
 
sns.set(style="ticks")
 
# Loading the dataset
df = sns.load_dataset("anscombe")
 
# Show the results of a linear regression
sns.lmplot(x="x", y="y", data=df)

Output : 

Lmplot using seaborn 
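
lmplot is a figure-level function, so it can also split the data into several panels. The sketch below uses synthetic data rather than a bundled dataset; the column names and the "group" variable are assumptions made for this example. Passing col="group" fits and draws one regression per group, each in its own panel.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: a noisy linear relationship, split into two groups
rs = np.random.RandomState(1)
x = rs.uniform(0, 10, size=60)
df = pd.DataFrame({
    "x": x,
    "y": 2.0 * x + 1.0 + rs.normal(scale=1.5, size=x.size),
    "group": np.repeat(["first", "second"], 30),
})

# lmplot returns a FacetGrid; col= creates one panel per group
grid = sns.lmplot(data=df, x="x", y="y", col="group", height=3)
```

The returned grid object gives access to the individual axes through grid.axes if further customization is needed.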

 

