Open In App

Creating Pivot Table with Multiple Columns using Python Pandas

Last Updated : 12 Feb, 2024
Improve
Improve
Like Article
Like
Save
Share
Report

PythonPandas make data manipulation, representation and analysis easier. Pandas Pivot Tables are used to create spreadsheet-style pivot tables as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame.

Pivot Table with Multiple Columns using Pandas

A pivot table in pandas is a way of summarizing and aggregating data in a DataFrame, especially when you have multiple dimensions or categorical variables. It allows you to reshape and transform your data, making it easier to analyze and gain insights. In a pivot table, you can specify which columns of the original DataFrame should become the new index, which columns should become new columns, and which columns should be used for aggregating data.

Syntax:

pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc=’mean’, fill_value=None, margins=False, dropna=True, margins_name=’All’, observed=_NoDefault.no_default, sort=True)

Parameters:

  • data: The DataFrame to be used for creating the pivot table.
  • values: Column(s) to aggregate. This can be a list, a single column name, or a scalar. It specifies the values that will be aggregated in the resulting pivot table.
  • index: Column, Grouper, array, or list of the previous. The column or columns whose unique values will become the index of the pivot table.
  • columns: Column, Grouper, array, or list of the previous. The column or columns whose unique values will become the columns of the pivot table.
  • aggfunc: Function, list of functions, or a dictionary. Specifies how to aggregate the values. Common options include ‘mean’, ‘sum’, ‘count’, ‘min’, ‘max’, etc. It can be a single function, a list of functions, or a dictionary where keys are column names, and values are aggregation functions.
  • fill_value: Scalar, default None. The value to use for filling missing values in the resulting pivot table.
  • margins: Bool, default False. If True, it adds row/column margins (subtotals) to the pivot table.
  • dropna: Bool, default True. If True, it excludes NA/null values from the result.
  • margins_name: Str, default 'All'. Name to be used for the row/column that will contain the totals when margins=True.
  • observed: Bool, default False (Deprecated since version 2.2.0). This parameter is deprecated and no longer used.
  • sort: Bool, default True. Sort the result DataFrame by the names of the index and column levels if True.

Creating Pivot Table with Multiple Columns using Pandas

Pivot Table for Students Report

Let’s create a DataFrame (df) with columns Name, Subject, Score, and Grade. Subsequently, a pivot table is generated using the pivot_table method, and the ‘Name’ column is designated as the index. The aggregation functions for score and grade columns are defined using the aggfunc parameter. For ‘Score’, the mean value is computed and ‘Grade’, the ‘first’ value encountered is chosen as the representative grade.

Python3




import pandas as pd
 
data = {
    'Name': ['Shravan', 'Jeetu', 'Ram', 'Jeetu', 'Shravan', 'Ram'],
    'Subject': ['DSA', 'DSA', 'DSA', 'DBMS', 'DBMS', 'DBMS'],
    'Score': [85, 90, 75, 80, 88, 82],
    'Grade': ['A', 'A', 'B', 'B', 'A', 'B']
}
 
df = pd.DataFrame(data)
 
# Creating a pivot table with multiple columns
pivot_table = df.pivot_table(index='Name',
                             columns='Subject',
                             values=['Score', 'Grade'],
                             aggfunc={'Score': 'mean', 'Grade': 'first'})
 
print(pivot_table)


Output:

        Grade     Score    
Subject  DBMS DSA  DBMS DSA
Name                       
Jeetu       B   A    80  90
Ram         B   B    82  75
Shravan     A   A    88  85

Pivot Table for Shop Items Report

A structured DataFrame is created to capture information related to dates Date, different categories Category, and corresponding numerical values Value1 and Value2. Setting margins=True adds subtotals along both the rows and columns, creating a more detailed summary of the data.

Python3




import pandas as pd
data = {
    'Date': ['2022-01-01', '2022-01-01', '2022-01-02', '2022-01-02'],
    'Category': ['A', 'B', 'A', 'B'],
    'Value1': [10, 15, 20, 25],
    'Value2': [30, 35, 40, 45]
}
 
df = pd.DataFrame(data)
 
# Create a pivot table with multiple columns with additional parameters
pivot_table = df.pivot_table(
    index='Date',
    columns='Category',
    values=['Value1', 'Value2'],
    aggfunc='sum',
    margins=True,
    margins_name='Total',
    sort=True
)
print(pivot_table)


Output:

           Value1           Value2          
Category A B Total A B Total
Date
2022-01-01 10 15 25 30 35 65
2022-01-02 20 25 45 40 45 85
Total 30 40 70 70 80 150



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads