
Python Seaborn – Catplot

Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Seaborn also eases two common difficulties with Matplotlib: its default plot styling and working with pandas DataFrames.

Because Seaborn complements and extends Matplotlib, the learning curve is quite gradual: if you know Matplotlib, you are already halfway to knowing Seaborn. This article covers catplot(), Seaborn's figure-level function for drawing categorical plots onto a grid of facets.



Syntax: seaborn.catplot(*, x=None, y=None, hue=None, data=None, row=None, col=None, kind='strip', color=None, palette=None, **kwargs)
 

Parameters



  • x, y, hue:  names of variables in data
    Inputs for plotting long-form data. See examples for interpretation.
  • data:  DataFrame
    Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation.
  • row, col:   names of variables in data, optional
    Categorical variables that will determine the faceting of the grid.
  • kind:  str, optional
    The kind of plot to draw, corresponds to the name of a categorical axes-level plotting function. Options are: “strip”, “swarm”, “box”, “violin”, “boxen”, “point”, “bar”, or “count”.
  • color:  matplotlib color, optional
    Color for all of the elements, or seed for a gradient palette.
  • palette:  palette name, list, or dict
    Colors to use for the different levels of the hue variable. Should be something that can be interpreted by color_palette(), or a dictionary mapping hue levels to matplotlib colors.
  • kwargs:  key, value pairings
    Other keyword arguments are passed through to the underlying plotting function.
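
To make the parameters above concrete, here is a minimal sketch that builds a small long-form DataFrame by hand and passes column names to x, y, hue, and col. The survey data and its column names are invented for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd
import seaborn as sns

# Hypothetical long-form data: one row per observation, one column per variable.
survey = pd.DataFrame({
    "group": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "score": [3.0, 4.0, 5.0, 6.0, 2.0, 3.5, 4.5, 7.0],
    "gender": ["f", "m", "f", "m", "f", "m", "f", "m"],
    "site": ["north", "north", "north", "north",
             "south", "south", "south", "south"],
})

# x, y and hue name columns of the data; col creates one facet per site.
g = sns.catplot(data=survey, x="group", y="score",
                hue="gender", col="site", kind="strip")

# catplot returns a FacetGrid; its axes array has one column per facet here.
grid_type = type(g).__name__
grid_shape = g.axes.shape
print(grid_type, grid_shape)
```

Because every argument is passed by keyword, the call does not depend on where data sits in the positional order of the signature.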

Examples:

If you are working with data that involves categorical variables, such as survey responses, categorical plots are among the best tools for visualizing and comparing features of your data. Drawing categorical plots is easy in seaborn. In this example, x, y, and hue take the names of columns in your data. The hue parameter colors the points according to the levels of a third variable, here the kind column.




import seaborn as sns
  
exercise = sns.load_dataset("exercise")
g = sns.catplot(x="time", y="pulse",
                hue="kind",
                data=exercise)

Output:

For a count plot, set the kind parameter to “count” and pass the dataset through the data parameter. Let’s start by exploring the time feature: call the catplot() function and use the x argument to specify the variable whose categories should appear along the axis.




import seaborn as sns
  
sns.set_theme(style="ticks")
exercise = sns.load_dataset("exercise")
  
g = sns.catplot(x="time",
                kind="count",
                data=exercise)

Output:
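
A count plot draws one bar per category, and the height of each bar is the number of rows that fall in that category. As a rough sketch of that tally, using made-up values rather than the exercise dataset:

```python
from collections import Counter

# Hypothetical observations of a categorical variable (not the real dataset).
time_values = ["1 min", "15 min", "30 min", "1 min",
               "15 min", "30 min", "1 min"]

# Each distinct value becomes a bar; its count becomes the bar height.
counts = Counter(time_values)

for category, n_rows in counts.items():
    print(category, n_rows)
```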

Another popular choice for plotting categorical data is a bar plot. The count plot needed only a single variable; a bar plot usually takes one categorical variable and one quantitative variable. Let’s see how pulse compares across the time categories.




import seaborn as sns
  
exercise = sns.load_dataset("exercise")
g = sns.catplot(x="time",
                y="pulse",
                kind="bar",
                data=exercise)

Output:
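
By default, the height of each bar is the mean of the quantitative variable within that category, and the error bars show a bootstrapped confidence interval. The sketch below reproduces only the bar heights, using invented (time, pulse) pairs rather than the exercise dataset:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (time, pulse) observations, not the real exercise data.
rows = [
    ("1 min", 90.0), ("1 min", 94.0),
    ("15 min", 100.0), ("15 min", 104.0),
    ("30 min", 110.0), ("30 min", 120.0),
]

# Group the quantitative values by category.
grouped = defaultdict(list)
for time, pulse in rows:
    grouped[time].append(pulse)

# One bar per category; its height is the group mean.
bar_heights = {time: mean(values) for time, values in grouped.items()}
print(bar_heights)
```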

To create a horizontal bar plot, swap the x and y variables. When you have many categories or long category names, changing the orientation is a good idea.




import seaborn as sns
  
exercise = sns.load_dataset("exercise")
g = sns.catplot(x="pulse",
                y="time",
                kind="bar",
                data=exercise)

Output:

Use a different plot kind to visualize the same data:




import seaborn as sns
  
  
exercise = sns.load_dataset("exercise")
  
g = sns.catplot(x="time",
                y="pulse",
                hue="kind",
                data=exercise, 
                kind="violin")

Output:

Facet along the columns with the col parameter to draw one panel per level of another categorical variable:




import seaborn as sns
  
exercise = sns.load_dataset("exercise")
  
g = sns.catplot(x="time",
                y="pulse",
                hue="kind",
                col="diet",
                data=exercise)

Output:

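Create many column facets and wrap them into multiple rows of the grid with col_wrap. The height parameter sets the height of each facet in inches, and aspect sets its width as aspect * height, so changing aspect alters the width while the height stays constant.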




import seaborn as sns

titanic = sns.load_dataset("titanic")
# Keep only rows with a known deck, then draw one small count plot per deck.
g = sns.catplot(x="alive", col="deck", col_wrap=4,
                data=titanic[titanic.deck.notnull()],
                kind="count", height=2.5, aspect=.8)

Output:
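
The grid layout that results from col_wrap, height, and aspect is simple arithmetic. The helper below is a hypothetical illustration, not part of seaborn, and it assumes the deck column has seven levels:

```python
import math


def facet_layout(n_facets, col_wrap, height, aspect):
    """Hypothetical helper: grid shape and facet size (in inches)."""
    n_cols = min(n_facets, col_wrap)         # facets per row before wrapping
    n_rows = math.ceil(n_facets / col_wrap)  # rows needed to fit every facet
    facet_width = height * aspect            # width follows from the aspect ratio
    return n_rows, n_cols, facet_width, height


# Assumed: 7 deck levels, with the same settings as the example above.
layout = facet_layout(n_facets=7, col_wrap=4, height=2.5, aspect=0.8)
print(layout)
```

With these settings the facets fill two rows of at most four columns, and each facet is narrower than it is tall because aspect is below 1.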

Plot horizontally and pass other keyword arguments to the plot function:




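# Reuses sns and the titanic DataFrame loaded in the previous example.
# Extra keyword arguments (dodge, cut, bw) are passed to the violin plot function.
# Note: seaborn 0.13+ deprecates bw in favor of bw_method.
g = sns.catplot(x="age", y="embark_town",
                hue="sex", row="class",
                data=titanic[titanic.embark_town.notnull()],
                orient="h", height=2, aspect=3, palette="Set3",
                kind="violin", dodge=True, cut=0, bw=.2)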

Output:

Box plots can be a little difficult to read at first, but they summarize the distribution of data very compactly. They are easiest to explain with an example, so the following uses tips, one of the built-in datasets in Seaborn:




import seaborn as sns

tips = sns.load_dataset('tips')
# One box per day, summarizing the distribution of total_bill.
sns.catplot(x='day',
            y='total_bill',
            data=tips,
            kind='box')

Output:

Outlier Detection Using Box Plot:

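The edges of the blue box are the 25th and 75th percentiles of the distribution of all bills for that day. For Thursday, this means that 75% of the bills were lower than about 20 dollars and 75% were higher than about 13 dollars, so the middle half of the bills falls between those two values. The horizontal line in the box shows the median of the distribution. By default, the whiskers extend to the most extreme bills within 1.5 times the interquartile range of the box edges, and any points drawn beyond them are flagged as outliers.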
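
The quantities a box plot displays can also be computed directly. The sketch below uses a made-up list of bills (not the tips dataset) and the common 1.5 × IQR rule to flag outliers; the quantile method chosen here is an assumption and may differ slightly from what the plotting library uses internally.

```python
from statistics import quantiles

# Hypothetical bill amounts in dollars, not the real tips data.
bills = [10, 12, 13, 15, 16, 18, 20, 22, 50]

# Quartiles: the box edges (q1, q3) and the line inside the box (median).
q1, median, q3 = quantiles(bills, n=4, method="inclusive")

# Interquartile range and the 1.5 * IQR fences that bound the whiskers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences would be drawn as individual outlier points.
outliers = [b for b in bills if b < lower_fence or b > upper_fence]
print(q1, median, q3, outliers)
```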

