Combining multiple columns in Pandas groupby with dictionary

Last Updated : 27 Oct, 2021

Let’s see how to combine multiple columns in Pandas using groupby with a dictionary, with the help of a few examples.

Example #1:

# importing pandas as pd
import pandas as pd

# Creating a dictionary
d = {'id': ['1', '2', '3'],
     'Column 1.1': [14, 15, 16],
     'Column 1.2': [10, 10, 10],
     'Column 1.3': [1, 4, 5],
     'Column 2.1': [1, 2, 3],
     'Column 2.2': [10, 10, 10]}

# Converting dictionary into a data-frame
df = pd.DataFrame(d)
print(df)

Output:

# Creating the groupby dictionary: original column -> combined column
groupby_dict = {'Column 1.1': 'Column 1',
                'Column 1.2': 'Column 1',
                'Column 1.3': 'Column 1',
                'Column 2.1': 'Column 2',
                'Column 2.2': 'Column 2'}

# Set the index of df as Column 'id'
df = df.set_index('id')

# Group the columns using groupby_dict and take the row-wise minimum
# of each group (note: axis=1 in groupby is deprecated since pandas 2.1)
df = df.groupby(groupby_dict, axis=1).min()
print(df)

Output:

Explanation:

Here we have grouped Column 1.1, Column 1.2 and Column 1.3 into Column 1, and Column 2.1 and Column 2.2 into Column 2.
Notice that each value in the output is the minimum, taken row by row, of the columns grouped together. For example, the first row of Column 1 is the minimum of the first-row values of Column 1.1, Column 1.2 and Column 1.3, i.e. min(14, 10, 1) = 1.
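On pandas versions where groupby no longer accepts axis=1 (the argument is deprecated from pandas 2.1), the same grouping can be sketched by transposing the frame, grouping on its index, and transposing back. This is only one possible approach, not the method used in the example above, and the variable name result is introduced here purely for illustration.

```python
import pandas as pd

# Same data as Example #1, with 'id' as the index
df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'Column 1.1': [14, 15, 16],
    'Column 1.2': [10, 10, 10],
    'Column 1.3': [1, 4, 5],
    'Column 2.1': [1, 2, 3],
    'Column 2.2': [10, 10, 10],
}).set_index('id')

# Mapping from original column name to combined column name
groupby_dict = {'Column 1.1': 'Column 1',
                'Column 1.2': 'Column 1',
                'Column 1.3': 'Column 1',
                'Column 2.1': 'Column 2',
                'Column 2.2': 'Column 2'}

# Transpose so the original columns become index labels, group those
# labels with the dictionary, take the minimum, then transpose back.
# 'result' is a hypothetical name used only in this sketch.
result = df.T.groupby(groupby_dict).min().T
print(result)
```

Because every column here holds integers, the transpose does not change the data type of the values.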

Example #2:

# importing pandas as pd
import pandas as pd

# Create dictionary with data
# (named 'data' so the built-in dict is not shadowed)
data = {
    "ID": [1, 2, 3],
    "Movies": ["The Godfather", "Fight Club", "Casablanca"],
    "Week_1_Viewers": [30, 30, 40],
    "Week_2_Viewers": [60, 40, 80],
    "Week_3_Viewers": [40, 20, 20]}

# Convert dictionary to dataframe
df = pd.DataFrame(data)
print(df)

Output:

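# Create the groupby_dict: original column -> combined column
groupby_dict = {"Week_1_Viewers": "Total_Viewers",
                "Week_2_Viewers": "Total_Viewers",
                "Week_3_Viewers": "Total_Viewers",
                "Movies": "Movies"}

# Set the index of df as column 'ID'
df = df.set_index('ID')

# Group the columns using groupby_dict and sum each group row by row
# (note: axis=1 in groupby is deprecated since pandas 2.1)
df = df.groupby(groupby_dict, axis=1).sum()
print(df)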

Output:

Explanation:

Here, notice that although ‘Movies’ is not merged with any other column, it must still appear in groupby_dict; a column that is missing from the dictionary is dropped from the final dataframe.
To calculate Total_Viewers we have used the .sum() function, which adds up, for each row, the values of the columns grouped together.
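As a rough sketch of the same total without axis=1, the dictionary can be applied to the numeric week columns only, through a transpose, and the result joined back onto the ‘Movies’ column. The names week_map, totals and result are assumptions made for this illustration and do not appear in the example above.

```python
import pandas as pd

# Same data as Example #2, with 'ID' as the index
data = {
    "ID": [1, 2, 3],
    "Movies": ["The Godfather", "Fight Club", "Casablanca"],
    "Week_1_Viewers": [30, 30, 40],
    "Week_2_Viewers": [60, 40, 80],
    "Week_3_Viewers": [40, 20, 20],
}
df = pd.DataFrame(data).set_index("ID")

# Hypothetical mapping that covers only the columns to be summed
week_map = {"Week_1_Viewers": "Total_Viewers",
            "Week_2_Viewers": "Total_Viewers",
            "Week_3_Viewers": "Total_Viewers"}

# Select the week columns, transpose so they become index labels,
# group them with the dictionary, sum, and transpose back
totals = df[list(week_map)].T.groupby(week_map).sum().T

# Keep 'Movies' as it is and attach the totals by index
result = df[["Movies"]].join(totals)
print(result)
```

Keeping the text column out of the transpose means only integer values are summed, so ‘Movies’ does not need an entry in the mapping in this variant.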
