
Treemaps in Python using Squarify

Last Updated : 28 Mar, 2022

Data visualization is a powerful way to analyze a large dataset through graphical representation. Python provides several modules for this; the most widely used are Matplotlib, Seaborn, and Plotly. Squarify is a smaller module used mainly to plot treemaps.

When To Use Squarify? 

The question here is when to use Squarify rather than why. Python already has several data visualization modules that handle most tasks, but Squarify is a good fit when you need to plot a treemap. A treemap displays hierarchical data as a set of nested rectangles, with the area of each rectangle proportional to the value it represents.

Squarify is a great choice:

  • To plot a large amount of data in a limited space.
  • To replace a bar chart that has too many categories to read comfortably.
  • To show the proportion between each part and the whole, with a label on each part.
  • To show the pattern of the distribution of the measure across each level of categories in the hierarchy.
  • To show attributes using size and color-coding.
  • To spot patterns, outliers, most-important contributors, and exceptions.

Plot Treemap Using Squarify

A treemap is an appropriate visualization when the dataset is structured hierarchically, as a tree with a root, branches, and nodes. It lets us show a large amount of data efficiently in a limited space.

We shall now plot a treemap using Squarify. Install the module with pip:

pip install squarify

Import the necessary modules. 

Python3




import squarify
import matplotlib.pyplot as plt


Plot

plot is the function used to create a treemap with Squarify. It takes sizes as its first argument and supports several other options, which we will look at one by one. By default, plot lays the rectangles out on a 100×100 canvas.

Python3




squarify.plot(sizes=[1, 2, 3, 4, 5],
              color="yellow")


Output:

Color

To make the plot more attractive, we can change its colors. There are two ways to do this:

  • A list of colors
  • A palette

Method 1: Pass a list of color names. The list does not have to match the length of the data; if it is shorter than the data, the colors are repeated.

Python3




data = [300, 400, 120, 590, 600, 760]
colors = ["red", "black", "green",
          "violet", "yellow", "blue"]
squarify.plot(sizes=data, color=colors)
plt.axis("off")


Output:

Method 2: Import the Seaborn module and generate the colors with its color_palette function.

Syntax: seaborn.color_palette(palette, n_colors)

n_colors should be an integer.

palette can be the name of a palette, including any from this list:

‘Accent’, ‘Accent_r’, ‘Blues’, ‘Blues_r’, ‘BrBG’, ‘BrBG_r’, ‘BuGn’, ‘BuGn_r’, ‘BuPu’, ‘BuPu_r’, ‘CMRmap’, ‘CMRmap_r’, ‘Dark2’, ‘Dark2_r’, ‘GnBu’, ‘GnBu_r’, ‘Greens’, ‘Greens_r’, ‘Greys’, ‘Greys_r’, ‘OrRd’, ‘OrRd_r’, ‘Oranges’, ‘Oranges_r’, ‘PRGn’, ‘PRGn_r’, ‘Paired’, ‘Paired_r’, ‘Pastel1’, ‘Pastel1_r’, ‘Pastel2’, ‘Pastel2_r’, ‘PiYG’, ‘PiYG_r’, ‘PuBu’, ‘PuBuGn’, ‘PuBuGn_r’, ‘PuBu_r’, ‘PuOr’, ‘PuOr_r’, ‘PuRd’, ‘PuRd_r’, ‘Purples’, ‘Purples_r’, ‘RdBu’, ‘RdBu_r’, ‘RdGy’, ‘RdGy_r’, ‘RdPu’, ‘RdPu_r’, ‘RdYlBu’, ‘RdYlBu_r’, ‘RdYlGn’, ‘RdYlGn_r’, ‘Reds’, ‘Reds_r’, ‘Set1’, ‘Set1_r’, ‘Set2’, ‘Set2_r’, ‘Set3’, ‘Set3_r’, ‘Spectral’, ‘Spectral_r’, ‘Wistia’, ‘Wistia_r’, ‘YlGn’, ‘YlGnBu’, ‘YlGnBu_r’, ‘YlGn_r’, ‘YlOrBr’, ‘YlOrBr_r’, ‘YlOrRd’, ‘YlOrRd_r’, ‘afmhot’, ‘afmhot_r’, ‘autumn’, ‘autumn_r’, ‘binary’, ‘binary_r’, ‘bone’, ‘bone_r’, ‘brg’, ‘brg_r’, ‘bwr’, ‘bwr_r’, ‘cividis’, ‘cividis_r’, ‘cool’, ‘cool_r’, ‘coolwarm’, ‘coolwarm_r’, ‘copper’, ‘copper_r’, ‘crest’, ‘crest_r’, ‘cubehelix’, ‘cubehelix_r’, ‘flag’, ‘flag_r’, ‘flare’, ‘flare_r’, ‘gist_earth’, ‘gist_earth_r’, ‘gist_gray’, ‘gist_gray_r’, ‘gist_heat’, ‘gist_heat_r’, ‘gist_ncar’, ‘gist_ncar_r’, ‘gist_rainbow’, ‘gist_rainbow_r’, ‘gist_stern’, ‘gist_stern_r’, ‘gist_yarg’, ‘gist_yarg_r’, ‘gnuplot’, ‘gnuplot2’, ‘gnuplot2_r’, ‘gnuplot_r’, ‘gray’, ‘gray_r’, ‘hot’, ‘hot_r’, ‘hsv’, ‘hsv_r’, ‘icefire’, ‘icefire_r’, ‘inferno’, ‘inferno_r’, ‘jet’, ‘jet_r’, ‘magma’, ‘magma_r’, ‘mako’, ‘mako_r’, ‘nipy_spectral’, ‘nipy_spectral_r’, ‘ocean’, ‘ocean_r’, ‘pink’, ‘pink_r’, ‘plasma’, ‘plasma_r’, ‘prism’, ‘prism_r’, ‘rainbow’, ‘rainbow_r’, ‘rocket’, ‘rocket_r’, ‘seismic’, ‘seismic_r’, ‘spring’, ‘spring_r’, ‘summer’, ‘summer_r’, ‘tab10’, ‘tab10_r’, ‘tab20’, ‘tab20_r’, ‘tab20b’, ‘tab20b_r’, ‘tab20c’, ‘tab20c_r’, ‘terrain’, ‘terrain_r’, ‘turbo’, ‘turbo_r’, ‘twilight’, ‘twilight_r’, ‘twilight_shifted’, ‘twilight_shifted_r’, ‘viridis’, ‘viridis_r’, ‘vlag’, ‘vlag_r’, ‘winter’, ‘winter_r’

Python3




import seaborn as sb

data = [300, 400, 120, 590, 600, 760]
# color_palette returns one color per data value
squarify.plot(sizes=data,
              color=sb.color_palette("Spectral",
                                     len(data)))
plt.axis("off")


Output:

Alpha

The alpha argument controls the opacity of the rectangles. It takes a number in the range 0 to 1: values near 1 are almost fully opaque, while values near 0 are almost transparent.

Python3




data = [300, 400, 720, 213]
colors = ["red", "black", "green", "violet"]
squarify.plot(sizes=data, color=colors, alpha=0.8)
plt.axis("off")


Output:

Here, we will see a lower value of alpha.

Python3




data = [300, 400, 720, 213]
colors = ["red", "black", "green", "violet"]
squarify.plot(sizes=data, color=colors, alpha=0.3)
plt.axis("off")


Output:

Scale the Chart

The norm_x and norm_y arguments change the range of the chart's axes. By default both are 100, so the plot spans 100×100. Use norm_x to scale the x-axis and norm_y to scale the y-axis.

Python3




data = [100, 20, 50, 1000]
colors = ["red", "yellow", "blue", "green"]
squarify.plot(sizes=data, color=colors)


Output:

Scaling with both axes.

Python3




data = [100, 20, 50, 1000]
colors = ["red", "yellow", "blue", "green"]
squarify.plot(sizes=data, norm_x=1000,
              norm_y=10, color=colors)


Output:

Labels

A treemap without labels is just a set of boxes with no meaning. Labels tell the reader what each rectangle represents. Pass them through the label argument; you can change their font size with the extra argument text_kwargs.

Python3




episode_data = [1004, 720, 366, 360, 80]
anime_names = ["One Piece", "Naruto", "Bleach",
               "Gintama", "Attack On Titan"]
  
squarify.plot(episode_data, label=anime_names)
  
plt.axis("off")


Output:

Padding

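The pad argument adds a small gap between the rectangles so they are easier to tell apart. Squarify treats it as an on/off flag, so any truthy value (True, or 2 as used here) turns padding on; the value itself does not set the width of the gap.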

Python3




squarify.plot(episode_data, label=anime_names, pad=2)
plt.axis("off")


Output:

Building A Treemap On Real-World Dataset Using Squarify

We shall now build a treemap on a real-world dataset. You can download the dataset from https://www.kaggle.com/hamdallak/the-world-of-pokemons. The code below selects the 20 Pokémon with the highest Total and plots a treemap of how many of them belong to each Primary Type.

Python3




# import required modules
import pandas as pd
import squarify
import matplotlib.pyplot as plt
import seaborn as sb

# read the dataset into a DataFrame (read_csv already returns one)
df = pd.read_csv("pokemons dataset.csv")

# keep three columns, sort by Total strength
# and take the top 20 rows
top20_pokemon = df.loc[:, ["Name", "Total",
                           "Primary Type"]].sort_values(
    by="Total", ascending=False)[:20]

# count how many of the top 20 belong to each primary type
type_counts = top20_pokemon["Primary Type"].value_counts()

# create a plot figure with figsize
plt.figure(figsize=(12, 6))
# we don't need the axis values, so remove them
plt.axis("off")
axis = squarify.plot(type_counts,
                     label=type_counts.index,
                     color=sb.color_palette("tab20", len(type_counts)),
                     pad=1,
                     text_kwargs={'fontsize': 18})
axis.set_title("Primary Types Of Top 20 Pokemons", fontsize=24)
plt.show()


Output:
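
The data preparation step can be tried without downloading the CSV. The sketch below uses a tiny hand-made table with the same three column names; the rows are invented purely for illustration and are not taken from the Kaggle dataset.

```python
import pandas as pd

# Made-up rows for illustration only; they do not come from the real dataset.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Total": [700, 680, 670, 600, 580, 300],
    "Primary Type": ["Dragon", "Psychic", "Dragon", "Water", "Dragon", "Bug"],
})

# Keep the five strongest rows, then count how often each type occurs among them.
top5 = df.sort_values(by="Total", ascending=False).head(5)
type_counts = top5["Primary Type"].value_counts()

# sizes and labels are the two sequences that squarify.plot needs.
sizes = type_counts.tolist()
labels = [f"{ptype} ({count})" for ptype, count in type_counts.items()]
print(sizes, labels)
```

The weakest row is dropped before counting, so the Bug type never reaches the treemap; sizes and labels can then be passed to squarify.plot as in the full example.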


