Plot the Size of each Group in a Groupby object in Pandas

  • Last Updated : 19 Aug, 2020

The pandas DataFrame.groupby() function is one of the most useful functions in the library. It splits the data into groups based on columns or conditions and then applies an operation to each group, e.g. size(), which counts the number of rows in each group. groupby() can also be applied to a Series.
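As a minimal sketch of the split-then-size idea, the snippet below groups a small hand-made DataFrame (hypothetical values, not the penguins data used later) and also applies groupby() to a single Series. Note that size() counts every row in a group, including rows with missing values.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "island": ["Biscoe", "Dream", "Biscoe", "Torgersen", "Dream", "Biscoe"],
    "body_mass_g": [3750.0, 3800.0, None, 3450.0, 3650.0, 4675.0],
})

# DataFrame.groupby(): number of rows in each island group
sizes = df.groupby("island").size()
print(sizes)

# Series.groupby(): group one column by the values of another
series_sizes = df["body_mass_g"].groupby(df["island"]).size()
print(series_sizes)
```

Both calls return a Series indexed by the group key, which is the shape the plotting calls later in the article work with.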

Syntax: DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Parameters :
by : mapping, function, str, or iterable
axis : int, default 0
level : If the axis is a MultiIndex (hierarchical), group by a particular level or levels
as_index : For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output
sort : Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. groupby preserves the order of rows within each group.
group_keys : When calling apply, add group keys to index to identify pieces
squeeze : Reduce the dimensionality of the return type if possible, otherwise return a consistent type. Note that this parameter was deprecated in pandas 1.1 and removed in pandas 2.0

Returns : GroupBy object
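To illustrate how the sort and as_index parameters affect the result of size(), here is a short sketch on a hypothetical one-column DataFrame. It assumes a reasonably recent pandas version (1.1 or later), where as_index=False combined with size() returns a DataFrame with a "size" column.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame(
    {"island": ["Dream", "Biscoe", "Dream", "Torgersen", "Biscoe", "Dream"]}
)

# Default (sort=True): group keys come back in sorted order
sorted_sizes = df.groupby("island").size()

# sort=False: group keys keep the order in which they first appear
unsorted_sizes = df.groupby("island", sort=False).size()

# as_index=False: "SQL-style" output, group labels stay in a regular column
flat = df.groupby("island", as_index=False).size()

print(sorted_sizes)
print(unsorted_sizes)
print(flat)
```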

In the following example we make use of two libraries, seaborn and pandas: seaborn is used for plotting and pandas for handling the data. We use the load_dataset() function from seaborn to load the penguins data set, which it returns as a pandas DataFrame.



Python3




# import the module
import seaborn as sns

# load the penguins data set as a pandas DataFrame
dataset = sns.load_dataset('penguins')
   
# displaying the data
print(dataset.head())

Output :

Top five rows of the dataset

The info() method gives more information about the data set:

Python3




# display the number of columns and their data types
dataset.info()

Output :

Info about the dataset

We will group the data by the ‘island’ column using the groupby() method and plot the size of each group.

Plotting using Pandas :

Python3




# apply groupby on the island column
# plotting
dataset.groupby(['island']).size().plot(kind = "bar")

Plot of groupby() size using Pandas 

Plotting using Seaborn :

Python3




# use the groupby() function to group by the island column
# and apply the size() function
# size() counts the rows in each group (missing values included)
result = dataset.groupby(['island']).size()
  
# plot the result
sns.barplot(x = result.index, y = result.values)

Plot of groupby() size using Seaborn
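Because size() counts rows rather than values, it can differ from count(), which skips missing entries. The sketch below shows the difference on hypothetical toy data and also turns the sizes into a two-column DataFrame; the column name "count" is an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical toy data with missing values, made up for illustration only
df = pd.DataFrame({
    "island": ["Biscoe", "Biscoe", "Dream", "Dream", "Dream"],
    "sex": ["Male", None, "Female", "Female", None],
})

grouped = df.groupby("island")

# size(): all rows in each group, missing values included
sizes = grouped.size()

# count(): only the non-missing 'sex' values in each group
counts = grouped["sex"].count()

# Convert the sizes to a tidy DataFrame with named columns
tidy = sizes.reset_index(name="count")

print(sizes)
print(counts)
print(tidy)
```

A frame like tidy can then be passed to seaborn by column name, for example sns.barplot(data=tidy, x="island", y="count"), as an alternative to passing result.index and result.values.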
