Seaborn | Categorical Plots

Last Updated : 08 Oct, 2021

Plots are used to visualize the relationship between variables. Those variables can be either completely numerical or categorical, such as a group, class or division. This article deals with categorical variables and how they can be visualized using the Seaborn library for Python.

Besides being a statistical plotting library, Seaborn also provides some built-in example datasets. We will be using one of them, called ‘tips’. The ‘tips’ dataset contains information about restaurant customers: the total bill, the tip they left, their gender, whether they smoke and so on.
Let us have a look at the tips dataset.

Code

Python3

# import the seaborn library
import seaborn as sns
 
# import and apply the filter used to suppress warnings
from warnings import filterwarnings
filterwarnings('ignore')
 
# read the dataset
df = sns.load_dataset('tips')
 
# first five entries of the dataset
df.head()

Now let us proceed to the plots and see how we can visualize these categorical variables.

Barplot

A barplot aggregates the data in each category using some function, which by default is the mean. It can also be understood as a visualization of a group-by operation. To use this plot, we choose a categorical column for the x axis and a numerical column for the y axis, and the plot shows the mean of the numerical column for each category.

Syntax:

barplot([x, y, hue, data, order, hue_order, …])

Example:

Python3

# set the background style of the plot
sns.set_style('darkgrid')
 
# plot the graph using the default estimator mean
sns.barplot(x='sex', y='total_bill', data=df, palette='plasma')
 
# or
import numpy as np
 
# change the estimator from mean to standard deviation
sns.barplot(x='sex', y='total_bill', data=df,
            palette='plasma', estimator=np.std)

Output:

Explanation/Analysis
Looking at the plot, we can say that the average total_bill for males is higher than for females.

palette sets the colors of the plot.
estimator is the statistical function used to aggregate the values within each categorical bin.
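
The group-by reading of a barplot can be sketched with pandas alone. This is a minimal illustration, assuming a small made-up table (the `toy` frame below is hypothetical and is not the real tips data): the per-category mean corresponds to the default bar height, and the per-category standard deviation shows what a different aggregation function would report. Which degrees-of-freedom convention a given seaborn version applies when `np.std` is passed as the estimator is not stated in the article, so the second result is only indicative.

```python
import pandas as pd

# Small made-up sample for illustration (not the real 'tips' data)
toy = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Female", "Male"],
    "total_bill": [20.0, 30.0, 10.0, 14.0, 25.0],
})

# Default aggregation of a barplot: the mean of y within each x category
mean_per_sex = toy.groupby("sex")["total_bill"].mean()

# A different aggregation: the sample standard deviation (pandas default)
std_per_sex = toy.groupby("sex")["total_bill"].std()

print(mean_per_sex)
print(std_per_sex)
```

Each value in `mean_per_sex` plays the role of one bar in the plot.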

Countplot

A countplot counts the observations in each category and shows the counts as bars. It is one of the simplest plots provided by the seaborn library.

Syntax:

countplot([x, y, hue, data, order, …])

Example:

Python3

sns.countplot(x='sex', data=df)

Output:

Explanation/Analysis
Looking at the plot, we can say that there are more males than females in the dataset. Since the plot only shows counts for a categorical column, we need to specify only the x parameter.
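
As a rough check of what a countplot draws, the sketch below compares the bar heights with the counts from pandas `value_counts`. It assumes a small made-up column (the `toy` frame is hypothetical) and that the bars appear in `ax.patches` in the order given by `order`; the Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed

import pandas as pd
import seaborn as sns

# Small made-up sample for illustration (not the real 'tips' data)
toy = pd.DataFrame({"sex": ["Male", "Female", "Male", "Male", "Female"]})

# Occurrences of each category, computed directly
counts = toy["sex"].value_counts()

# Fix the category order so bars and counts can be compared one by one
order = ["Female", "Male"]
ax = sns.countplot(x="sex", data=toy, order=order)

# Read the bar heights back from the drawn rectangles
bar_heights = [float(p.get_height()) for p in ax.patches[:len(order)]]

print(counts.to_dict())
print(bar_heights)
```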

Boxplot

A boxplot is also known as a box-and-whisker plot. It shows the distribution of quantitative data in a way that makes it easy to compare between variables or between the levels of a categorical variable. The box shows the quartiles of the dataset, while the whiskers extend to show the rest of the distribution, except for points that are classified as outliers, which are drawn as individual dots.

Syntax:

boxplot([x, y, hue, data, order, hue_order, …])

Example:

Python3

sns.boxplot(x='day', y='total_bill', data=df, hue='smoker')

Output:

Explanation/Analysis –
x takes the categorical column and y takes a numerical column, so we can see the distribution of the total bill for each day. The hue parameter adds a further categorical separation. Looking at the plot, we can say that people who do not smoke tended to have a higher bill on Friday than people who smoke.
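
The quantities behind a box and its whiskers can be worked out by hand. The sketch below assumes the common 1.5 × IQR rule for separating whisker points from outliers and uses made-up bill amounts (both are assumptions for illustration; the article does not state which rule the plot applies).

```python
import numpy as np

# Made-up bill amounts for illustration (not taken from 'tips')
bills = np.array([10.0, 12.0, 14.0, 15.0, 18.0, 20.0, 22.0, 50.0])

# The box: first quartile, median and third quartile
q1, median, q3 = np.percentile(bills, [25, 50, 75])
iqr = q3 - q1

# Assumed rule: points farther than 1.5 * IQR from the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme points that are still inside the fences
inside = bills[(bills >= lower_fence) & (bills <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything outside the fences would be drawn as an individual dot
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]

print(q1, median, q3)
print(whisker_low, whisker_high)
print(outliers)
```

Here the value 50 falls above the upper fence, so it would appear as a separate dot rather than stretching the whisker.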

Violinplot

It is similar to the boxplot, except that it uses kernel density estimation to show the shape of the data distribution, which gives a more detailed picture than the quartiles alone.

Syntax:

violinplot([x, y, hue, data, order, …])

Example:

Python3

sns.violinplot(x='day', y='total_bill', data=df, hue='sex', split=True)

Output:

Explanation/Analysis –

hue separates the data further using the sex category.
Setting split=True draws half of a violin for each level of hue, which can make it easier to compare the distributions directly.
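
The outline of a violin comes from a kernel density estimate. The following is a minimal hand-written sketch of a Gaussian KDE, not seaborn's own implementation; the function name `gaussian_kde_sketch`, the sample values and the bandwidth of 3.0 are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated on the grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Made-up bill amounts for illustration (not taken from 'tips')
bills = np.array([10.0, 12.0, 13.0, 20.0, 21.0, 35.0])
grid = np.linspace(-20.0, 70.0, 901)

density = gaussian_kde_sketch(bills, grid, bandwidth=3.0)

# A density should integrate to roughly 1 (simple Riemann sum)
area = float(np.sum(density) * (grid[1] - grid[0]))
print(round(area, 3))
```

Mirroring the `density` curve on both sides of a vertical axis gives the familiar violin shape; a smaller bandwidth would make the outline bumpier, a larger one smoother.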

Stripplot

It creates a scatter plot in which one of the variables is categorical.

Syntax:

stripplot([x, y, hue, data, order, …])

Example:

Python3

sns.stripplot(x='day', y='total_bill', data=df,
              jitter=True, hue='smoker', dodge=True)

Output:

Explanation/Analysis –

One problem with a strip plot is that you cannot tell which points are stacked on top of each other, so we use the jitter parameter to add some random noise.
jitter adds an amount of random displacement (only along the categorical axis), which is useful when many points overlap and makes the distribution easier to see.
hue provides an additional categorical separation.
Setting dodge=True draws separate strips for each level of the hue parameter.
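
The idea behind jitter can be sketched with NumPy alone: each point keeps its numerical value and only its position along the categorical axis is shifted by a small random amount. The width of 0.1 and the sample values below are assumptions for illustration, not values taken from seaborn or from the tips data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Category positions on the x axis (0 and 1 stand for two categories)
category_positions = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Made-up bills; the repeated values would overlap exactly without jitter
total_bill = np.array([15.0, 15.0, 15.2, 22.0, 22.0, 30.0])

# Assumed jitter width: shift each point by at most 0.1 to either side
jitter_width = 0.1
noise = rng.uniform(-jitter_width, jitter_width, size=category_positions.size)
x_jittered = category_positions + noise

# The y values are untouched; only x is perturbed
for x, y in zip(x_jittered, total_bill):
    print(f"x={x:.3f}  y={y}")
```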

Swarmplot

It is very similar to the stripplot, except that the points are adjusted so that they do not overlap. It can be thought of as combining the ideas of a violin plot and a strip plot. One drawback of the swarmplot is that it does not scale well to very large numbers of observations, since arranging the points takes a lot of computation. To make the shape of the distribution easier to read, a swarmplot can also be drawn on top of a violinplot.

Syntax:

swarmplot([x, y, hue, data, order, …])

Example:

Python3

sns.swarmplot(x='day', y='total_bill', data=df)

Output:

Example:

Python3

sns.violinplot(x='day', y='total_bill', data=df)
sns.swarmplot(x='day', y='total_bill', data=df, color='black')

Output:

Factorplot (catplot)

It is the most general of all these plots and provides a parameter called kind to choose the kind of plot we want, which saves us the trouble of calling each plotting function separately. The kind parameter can be bar, violin, swarm etc. Note that factorplot was renamed to catplot in seaborn 0.9 and the old name is no longer available in recent releases, so the example below uses catplot.

Syntax:

sns.catplot([x, y, hue, data, row, col, …])

Example:

Python3

sns.catplot(x='day', y='total_bill', data=df, kind='bar')

Output:
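
The row and col parameters listed in the syntax split the figure into one panel per category. The sketch below assumes a seaborn version in which this figure-level function is available as `catplot` (the name that replaced `factorplot` in seaborn 0.9) and uses a small made-up table (`toy` is hypothetical, not the real tips data); the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed

import pandas as pd
import seaborn as sns

# Small made-up sample for illustration (not the real 'tips' data)
toy = pd.DataFrame({
    "day": ["Sat", "Sat", "Sun", "Sun", "Sat", "Sun", "Sat", "Sun"],
    "time": ["Lunch", "Dinner"] * 4,
    "total_bill": [12.0, 25.0, 14.0, 30.0, 10.0, 28.0, 16.0, 22.0],
})

# One bar plot per value of 'time', placed side by side as columns
g = sns.catplot(x="day", y="total_bill", col="time", data=toy, kind="bar")

# The returned grid holds one Axes per panel
print(g.axes.shape)
```

Changing `kind` to another supported value such as `'box'` or `'violin'` would switch every panel to that plot type without any other change to the call.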
