Hierarchically-clustered Heatmap in Python with Seaborn Clustermap


Seaborn is a Python visualization library for statistical graphics. It provides attractive default styles and color palettes for statistical plots. It is built on top of the matplotlib library and is closely integrated with the data structures from pandas.

What is Clustering?

Clustering groups data points based on the relationships among the variables in the data. In unsupervised learning, clustering algorithms help uncover structure in unlabeled data. The most common types of clustering are shown below.

[Figure: Types of clustering]

Here we focus on hierarchical clustering, specifically agglomerative (bottom-up) hierarchical clustering. In agglomerative clustering, each data point starts as its own cluster, and the two nearest clusters are repeatedly merged into a larger cluster until a single cluster remains. The tree diagram that records these merges is called a dendrogram.
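The merging process described above can be sketched with SciPy's hierarchical clustering routines. This is a minimal illustration on synthetic points, not the article's own data; the linkage method ("average") and the Euclidean metric are assumptions chosen because they match the clustermap() defaults.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic data: two well-separated groups of 2-D points (illustrative only)
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(5, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(5, 2)),
])

# Bottom-up merging: each row of `merges` records one merge step as
# (cluster index A, cluster index B, merge distance, size of the new cluster)
merges = linkage(points, method="average", metric="euclidean")

# n points need n - 1 merges before a single cluster remains
print(merges.shape)

# Cut the tree so that at most two flat clusters are returned
labels = fcluster(merges, t=2, criterion="maxclust")
print(labels)
```

Passing `merges` to `scipy.cluster.hierarchy.dendrogram` draws the corresponding dendrogram when matplotlib is available.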

Plotting Hierarchically clustered Heatmaps

A heat map is a graphical representation of data in which values are represented by colors. Variation in color intensity shows how the data clusters or varies over space.

The clustermap() function of seaborn plots a hierarchically-clustered heat map of the given matrix dataset. It returns a ClusterGrid instance.
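As a rough sketch of what the returned object offers, the snippet below calls clustermap() on a small synthetic matrix and reads back the row and column order chosen by the clustering. The matrix values and the row/column labels are made up for illustration; the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic numeric matrix (illustrative values only)
rng = np.random.default_rng(1)
matrix = pd.DataFrame(
    rng.random((6, 4)),
    index=[f"row{i}" for i in range(6)],
    columns=[f"col{j}" for j in range(4)],
)

# clustermap() clusters both rows and columns by default
grid = sns.clustermap(matrix, figsize=(5, 5))

# The return value is a ClusterGrid, not a plain matplotlib Axes
print(type(grid).__name__)

# Positions of the original rows and columns after reordering
row_order = grid.dendrogram_row.reordered_ind
col_order = grid.dendrogram_col.reordered_ind
print(matrix.index[row_order].tolist())
print(matrix.columns[col_order].tolist())
```

The grid also exposes the underlying figure, so `grid.savefig(...)` can be used to write the plot to a file.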

Below are some examples which depict the hierarchically-clustered heat map from a dataset:

In the flights dataset, the data (number of passengers) is clustered based on month and year:

Example 1: 

Python3




# Importing the library
import seaborn as sns
from sunbird.categorical_encoding import frequency_encoding
  
# Load dataset
data = sns.load_dataset('flights')
  
# Categorical encoding
frequency_encoding(data, 'month')
  
# Plot the hierarchically-clustered heat map
# with the default colors.
sns.clustermap(data, figsize=(7, 7))


Output :

The color bar to the left of the cluster map shows how colors map to values, e.g., bright colors indicate more passengers and dark colors indicate fewer passengers.

Example 2:

Python3




# Importing the library
import seaborn as sns
from sunbird.categorical_encoding import frequency_encoding
  
# Load dataset
data = sns.load_dataset('flights')
  
# Categorical encoding
frequency_encoding(data, 'month')
  
# Plot the hierarchically-clustered heat map
# and change the colors of the map.
sns.clustermap(data, cmap='coolwarm', figsize=(7, 7))


Output:

Here the cmap parameter changes the colors of the cluster map to the 'coolwarm' palette.
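The examples above pass the encoded long-form table to clustermap(). A commonly used alternative, sketched here as an assumption rather than as the article's method, is to reshape the table into a month-by-year matrix first, so that rows and columns of the heat map correspond directly to months and years. The table below is a synthetic stand-in that only borrows the column names of the flights dataset; its values are random, not real passenger counts.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic long-form table with the same column names as the flights
# dataset (random values, not real passenger counts)
rng = np.random.default_rng(2)
years = [1949, 1950, 1951, 1952]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
long_form = pd.DataFrame(
    [(y, m, int(rng.integers(100, 300))) for y in years for m in months],
    columns=["year", "month", "passengers"],
)

# Reshape to wide form: one row per month, one column per year
wide = long_form.pivot(index="month", columns="year", values="passengers")

# Cluster rows (months) and columns (years) and draw the heat map
grid = sns.clustermap(wide, cmap="coolwarm", figsize=(7, 7))

# data2d holds the same matrix with rows and columns reordered by the clustering
print(grid.data2d.shape)
```

With the real data, `sns.load_dataset('flights')` would take the place of `long_form`.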



Last Updated : 02 Dec, 2020