
How to Create Boxplots by Group in Matplotlib?

Matplotlib does not provide a dedicated function for grouped box plots, so the plot has to be assembled manually by drawing each group at its own positions. If you want more customization of a grouped box plot, the seaborn package provides a boxplot function that supports a wide variety of options. This article discusses how to create grouped boxplots in matplotlib and then in seaborn.

Create Boxplots by Group in Matplotlib

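matplotlib.pyplot.boxplot() and matplotlib.pyplot.setp() are the two functions used here to create grouped boxplots: boxplot() draws the boxes, and setp() sets properties such as color on the artists that boxplot() returns.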



Syntax: matplotlib.pyplot.boxplot(x, notch, positions, widths)

Syntax: matplotlib.pyplot.setp(obj, *args, **kwargs)
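
As a minimal sketch of how these two functions fit together, the snippet below draws two boxes and then recolors them. The sample values are made up for illustration, and the Agg backend is selected only on the assumption that the code may run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# two illustrative samples (made-up values)
samples = [[3, 5, 7, 9], [12, 15, 17, 21]]

# boxplot returns a dictionary that maps artist names such as
# 'boxes', 'whiskers', 'caps' and 'medians' to lists of Line2D objects
artists = ax.boxplot(samples, positions=[0, 1], widths=0.6)

# setp applies one property to every artist in a list at once
plt.setp(artists["boxes"], color="#D7191C")
plt.setp(artists["medians"], color="black")

print(len(artists["boxes"]), len(artists["whiskers"]))
```

Each dataset contributes one box and one median line but two whiskers and two caps, which is why the lists in the dictionary have different lengths.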






# import the matplotlib package
import matplotlib.pyplot as plt
 
# import the numpy package
import numpy as np
 
# create two sample datasets (lists of lists) that hold
# the summer and winter rainfall amounts for each group
summer_rain = [[3, 5, 7], [15, 17, 12, 12, 15],
               [26, 21, 15]]
winter_rain = [[16, 14, 12], [31, 20, 25, 23, 28],
               [29, 31, 35, 41]]
 
# the list named ticks, summarizes or groups
# the summer and winter rainfall as low, mid
# and high
ticks = ['Low', 'Mid', 'High']
 
# create a boxplot for each dataset separately;
# positions sets the x location of each box, so the
# summer boxes are shifted left and the winter boxes
# right of the group centers 0, 2 and 4. Use widths
# to set the width of each box
summer_rain_plot = plt.boxplot(summer_rain,
                               positions=np.arange(len(summer_rain)) * 2.0 - 0.35,
                               widths=0.6)
winter_rain_plot = plt.boxplot(winter_rain,
                               positions=np.arange(len(winter_rain)) * 2.0 + 0.35,
                               widths=0.6)
 
# each boxplot call returns a dictionary that maps artist
# names ('boxes', 'whiskers', 'caps', ...) to lists of lines;
# the function below uses plt.setp() to apply one color
# to every artist of a particular group
def define_box_properties(plot_name, color_code, label):
    for artists in plot_name.values():
        plt.setp(artists, color=color_code)

    # plot an empty line so the group gets a legend entry
    plt.plot([], c=color_code, label=label)
    plt.legend()
 
 
# setting colors for each groups
define_box_properties(summer_rain_plot, '#D7191C', 'Summer')
define_box_properties(winter_rain_plot, '#2C7BB6', 'Winter')
 
# set the x label values
plt.xticks(np.arange(0, len(ticks) * 2, 2), ticks)
 
# set the limit for x axis
plt.xlim(-2, len(ticks)*2)
 
# set the limit for y axis
plt.ylim(0, 50)
 
# set the title
plt.title('Grouped boxplot using matplotlib')

# display the figure
plt.show()

Output:

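Explanation:

The two plt.boxplot() calls draw the summer and winter boxes around shared group centers at x = 0, 2 and 4. The summer boxes are shifted left by 0.35 and the winter boxes right by 0.35, so each pair sits side by side. define_box_properties() colors every artist of one season with plt.setp() and plots an empty line so that the season appears in the legend. plt.xticks() then places the labels Low, Mid and High at the group centers.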
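
The offsets of -0.35 and +0.35 used in the code above only work for two series. A minimal sketch of how the positions could be computed for any number of series is shown below. The helper grouped_positions and its parameters are hypothetical names introduced for illustration; they are not part of matplotlib.

```python
import numpy as np


def grouped_positions(n_categories, n_series, spacing=2.0,
                      box_width=0.6, gap=0.1):
    """Return an (n_series, n_categories) array of x positions.

    Hypothetical helper: each category is centered at a multiple of
    `spacing`, and the series are spread symmetrically around it.
    """
    # group centers: 0, spacing, 2 * spacing, ...
    centers = np.arange(n_categories) * spacing
    # distance between neighbouring boxes inside one group
    step = box_width + gap
    # symmetric offsets, e.g. [-0.35, 0.35] for two series
    offsets = (np.arange(n_series) - (n_series - 1) / 2) * step
    return centers[None, :] + offsets[:, None]


positions = grouped_positions(n_categories=3, n_series=2)
print(positions)
```

With the defaults, two series reproduce the offsets from the article. Row i of the result can be passed as the positions argument of the i-th plt.boxplot() call. The groups stay separated only while n_series * (box_width + gap) is smaller than spacing.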

Create Boxplots by Group in seaborn

You can also create grouped box plots from long-form and wide-form data using seaborn, a library built on top of matplotlib.

Syntax: sns.boxplot(data, x, y)

Parameters:

  • data – specifies the dataframe to be used for the box plots
  • x –  specifies the column to be used in the x-axis
  • y – specifies the column to be used in y-axis
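
Besides the three parameters listed above, sns.boxplot() also accepts a hue parameter, which is not part of the syntax line shown in this article. As a hedged sketch, the snippet below reuses the rainfall values from the matplotlib example, reshapes them into long form (one row per observation), and uses hue to draw the summer and winter boxes side by side within each group. The column names season, level and rainfall_amount are chosen for illustration, and the Agg backend is selected on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import pandas as pd
import seaborn as sns

ticks = ['Low', 'Mid', 'High']
summer_rain = [[3, 5, 7], [15, 17, 12, 12, 15], [26, 21, 15]]
winter_rain = [[16, 14, 12], [31, 20, 25, 23, 28], [29, 31, 35, 41]]

# build long-form data: one row per rainfall measurement
rows = [
    {'season': season, 'level': level, 'rainfall_amount': value}
    for season, groups in (('Summer', summer_rain), ('Winter', winter_rain))
    for level, values in zip(ticks, groups)
    for value in values
]
data = pd.DataFrame(rows)

# x picks the group on the axis, hue splits each group by season
ax = sns.boxplot(data=data, x='level', y='rainfall_amount', hue='season')
ax.set_title('Grouped boxplot using seaborn')
```

Here seaborn handles the box positions, colors and legend that had to be set manually in the matplotlib version.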

Grouped Box plots for long-form data:




# import the necessary python packages
import pandas as pd
import numpy as np
import seaborn as sns
 
# create long-form data
data = pd.DataFrame({'season': np.repeat(['Summer', 'Winter',
                                          'Spring'], 5),
                     'rainfall_amount': [17, 18, 19, 21, 27,
                                         33, 37, 33, 36, 12,
                                         14, 15, 16, 21, 22],
                     })
# print the data
print(data)
 
# use seaborn plot and specify the x and y
# columns and specify the dataframe
sns.boxplot(x='season', y='rainfall_amount', data=data)

Output:

Grouped Box plots for wide form data:




# import the necessary python packages
import pandas as pd
import seaborn as sns
 
# create wide-form data
data = pd.DataFrame({'Summer': [17, 18, 19, 21, 27],
                     'Winter': [33, 37, 33, 36, 12],
                     'Spring': [14, 15, 16, 21, 22]})
# print the data
print(data)
# use melt to convert wide form to long form data
# use seaborn plot and specify the x and y columns
# and specify the dataframe
sns.boxplot(x='variable', y='value', data=pd.melt(data)).set(
    xlabel='Season',
    ylabel='Rainfall amount')

Output:

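Code Explanation:

pd.melt() reshapes the wide-form dataframe into long form with two columns: variable, which holds the original column names (the seasons), and value, which holds the rainfall amounts. These columns are passed to sns.boxplot() as x and y, and set() replaces the default axis labels with Season and Rainfall amount.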
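
To inspect what the melt step produces, the sketch below prints the reshaped frame for the same wide-form data. The var_name and value_name arguments are not used in the article's code; they are shown here as one possible way to name the columns up front, with season and rainfall_amount chosen for illustration.

```python
import pandas as pd

data = pd.DataFrame({'Summer': [17, 18, 19, 21, 27],
                     'Winter': [33, 37, 33, 36, 12],
                     'Spring': [14, 15, 16, 21, 22]})

# default melt: the old column names go into 'variable',
# the cell values go into 'value'
long_form = pd.melt(data)
print(long_form.head())

# optional: choose the column names while melting
named = pd.melt(data, var_name='season', value_name='rainfall_amount')
print(named.columns.tolist())
```

With named columns, the plot call could use x='season' and y='rainfall_amount' directly, and the axis labels would follow from the column names.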

