Pandas GroupBy – Count the occurrences of each combination

  • Last Updated : 16 Jun, 2021

In this article, we will group a DataFrame by two columns and count the occurrences of each combination in Pandas.

The DataFrame.groupby() method splits a DataFrame into groups based on the values in one or more columns. Combined with a counting method such as size() or count(), it reports how many rows share each combination of values.

Syntax: DataFrame.groupby(by=None, axis=0, level=None)

Parameters:

  • by: mapping, function, label, or list of labels used to determine the groups.
  • axis: split along rows (axis=0) or columns (axis=1). This keyword is deprecated since pandas 2.1.
  • level: int, level name, or sequence of such. If the axis is a MultiIndex, group by a particular level or levels.

To illustrate the concept, we will use the simple dataset given below:


# Import library
import pandas as pd
import numpy as np
# initialise data of lists.
Data = {'Products':['Box','Color','Pencil','Eraser','Color',
# Create DataFrame
df = pd.DataFrame(Data, columns=['Products','States','Sale'])
# Display the Output
print(df)
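
The data dictionary in the listing above is cut off in the source, so its original values are not recoverable. As a stand-in, here is a minimal sketch that builds a DataFrame with the same three columns. The rows are hypothetical and chosen only so that some (States, Products) combinations repeat.

```python
import pandas as pd

# Hypothetical sample data: these values are illustrative only and are
# NOT the article's original dataset, which is truncated in the source.
data = {
    "Products": ["Box", "Color", "Pencil", "Eraser", "Color", "Pencil"],
    "States": ["Jammu", "Kolkata", "Bihar", "Gujarat", "Kolkata", "Bihar"],
    "Sale": [14, 24, 31, 12, 13, 7],
}

# Build the DataFrame with an explicit column order
df = pd.DataFrame(data, columns=["Products", "States", "Sale"])

# Display the DataFrame
print(df)
```

With this data, the combinations (Kolkata, Color) and (Bihar, Pencil) each occur twice, which makes the counts easy to check by eye.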


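Method 1: Using Pandas size()

On a plain DataFrame, the size attribute returns the total number of elements, which equals the number of rows multiplied by the number of columns reported by shape. On a grouped object, the size() method instead returns the number of rows in each group, including rows that contain missing values.

Syntax: DataFrame.groupby(by).size()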


new = df.groupby(['States','Products']).size()
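
To make the difference between the two uses of size concrete, here is a short sketch. It assumes a small hypothetical dataset (not the article's original data) with the same column names.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Products": ["Box", "Color", "Pencil", "Eraser", "Color", "Pencil"],
    "States": ["Jammu", "Kolkata", "Bihar", "Gujarat", "Kolkata", "Bihar"],
    "Sale": [14, 24, 31, 12, 13, 7],
})

# DataFrame.size is an attribute: rows * columns
total_elements = df.size

# GroupBy.size() is a method: number of rows per (States, Products) group,
# returned as a Series indexed by the two grouping columns
counts = df.groupby(["States", "Products"]).size()

print(total_elements)
print(counts)
```

The result of size() is a Series whose MultiIndex holds the group keys, so an individual count can be looked up with a tuple such as counts[("Bihar", "Pencil")].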


Method 2: Using Pandas dataframe.count()

It counts the number of non-NA/null observations along the given axis. It works with non-floating-point data as well. Applied to a column of a grouped object, it returns the number of non-null values in each group.

Syntax: DataFrame.count(axis=0, level=None, numeric_only=False)

Parameters:

  • axis: 0 or 'index' to count down each column, 1 or 'columns' to count across each row.
  • level: if the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a DataFrame. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
  • numeric_only: include only float, int, and boolean data.

Returns: count : Series (or DataFrame if level specified)


new = df.groupby(['States','Products'])['Sale'].count()
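
Because count() skips missing values while size() does not, the two methods can disagree. The sketch below shows this on a hypothetical dataset (not the article's original data) in which one Sale value is deliberately set to NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing Sale value, for illustration only
df = pd.DataFrame({
    "Products": ["Box", "Color", "Pencil", "Eraser", "Color", "Pencil"],
    "States": ["Jammu", "Kolkata", "Bihar", "Gujarat", "Kolkata", "Bihar"],
    "Sale": [14, 24, 31, 12, np.nan, 7],
})

grouped = df.groupby(["States", "Products"])

# count() on the Sale column ignores the NaN entry
non_null = grouped["Sale"].count()

# size() counts every row in the group, NaN or not
rows = grouped.size()

print(non_null)
print(rows)
```

If the goal is to count how often each combination occurs, size() is usually the safer choice; count() is the better fit when only rows with a recorded value in a specific column should be counted.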


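Method 3: Using Pandas reset_index()

The reset_index() method resets the index of a DataFrame. It replaces the current index with the default integer index, ranging from 0 to the length of the data minus 1. Applied to a grouped count, it turns the group keys back into ordinary columns.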

Syntax: DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Parameters:

  • level: int, str, or list selecting which index levels to remove. All levels are removed by default.
  • drop: Boolean value. If False, the old index is added to the data as a column; if True, it is discarded.
  • inplace: Boolean value. If True, the changes are made in the original data frame itself.
  • col_level: if the columns have multiple levels, selects the level into which the labels are inserted.
  • col_fill: Object, determines how the other levels are named.

Return type: DataFrame


new = df.groupby(['States','Products'])['Sale'].agg('count').reset_index()
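
As a sketch of what reset_index() does to a grouped count, the example below uses a hypothetical dataset (not the article's original data). It also shows the name argument of Series.reset_index(), which is not used in the article but gives the count column a clearer label.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Products": ["Box", "Color", "Pencil", "Eraser", "Color", "Pencil"],
    "States": ["Jammu", "Kolkata", "Bihar", "Gujarat", "Kolkata", "Bihar"],
    "Sale": [14, 24, 31, 12, 13, 7],
})

# Group keys become regular columns; the count keeps the name "Sale"
result = (
    df.groupby(["States", "Products"])["Sale"]
    .agg("count")
    .reset_index()
)

# Alternative: count rows with size() and name the new column explicitly
named = (
    df.groupby(["States", "Products"])
    .size()
    .reset_index(name="Count")
)

print(result)
print(named)
```

Both results are flat DataFrames with a 0-based integer index, which is often easier to filter, merge, or export than a Series with a MultiIndex.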


Method 4: Using pandas.pivot() function

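It reshapes a DataFrame based on three of its columns: the unique values of one column become the new index, the unique values of another become the new columns, and a third supplies the cell values. pivot() does not aggregate, so the counts are computed with groupby() first.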

Syntax: pandas.pivot(data, index, columns, values)

Parameters:

  • data: the DataFrame to reshape
  • index: column(s) to use to make the new frame's index
  • columns: column(s) to use to make the new frame's columns
  • values: column(s) to use for populating the new frame's values

Returns: Reshaped DataFrame
Exception: ValueError is raised if any index/columns combination appears more than once.


new = df.groupby(['States','Products'],as_index = False
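
The line above is cut off in the source, so the rest of the original statement is unknown. The following is a sketch of one possible way to combine groupby() with pivot(), not necessarily the author's version. The dataset is hypothetical, and the fillna(0) and astype(int) steps are assumptions added to make the table easier to read.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Products": ["Box", "Color", "Pencil", "Eraser", "Color", "Pencil"],
    "States": ["Jammu", "Kolkata", "Bihar", "Gujarat", "Kolkata", "Bihar"],
    "Sale": [14, 24, 31, 12, 13, 7],
})

# as_index=False keeps States and Products as columns in the result,
# so each (States, Products) pair appears exactly once
counts = df.groupby(["States", "Products"], as_index=False)["Sale"].count()

# Reshape: one row per state, one column per product, counts as values.
# Combinations that never occur come out as NaN, replaced here with 0
# (an assumption for readability, not part of the original article).
table = (
    counts.pivot(index="States", columns="Products", values="Sale")
    .fillna(0)
    .astype(int)
)

print(table)
```

Because the grouped result has a single row per combination, pivot() does not hit the duplicate-entry ValueError here. Calling pivot() on the ungrouped df with the same arguments would raise it, since (Bihar, Pencil) and (Kolkata, Color) each occur twice.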

