Pandas Groupby – Sort within groups

Pandas groupby is used when we want to split a dataset into groups so that we can perform operations on those groups, such as aggregating data, transforming it through group-wise computations, or filtering it according to conditions applied to the groups.

In a similar way, we can sort within these groups.

Example 1: Let’s take an example of a dataframe:

import pandas as pd

df = pd.DataFrame({'X': ['B', 'B', 'A', 'A'],
                   'Y': [1, 2, 3, 4]})

# group by column X and sum Y within each group;
# by default the group keys are sorted (A, then B)
df.groupby('X').sum()

Output:

Let’s pass the sort parameter as False.

# using groupby function
# with sort=False (group keys are not sorted)
df.groupby('X', sort=False).sum()

Output:

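Here, the groups appear in the order in which their keys first occur in the original dataframe (B, then A), because sort=False turns off the default sorting of the group keys. Note that the sort parameter controls only the order of the group keys in the result, not the order of the rows within each group.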
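To make the effect of the sort parameter concrete, the short sketch below rebuilds the same dataframe and compares the order of the group keys in both cases. The variable names sorted_keys and unsorted_keys are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': ['B', 'B', 'A', 'A'],
                   'Y': [1, 2, 3, 4]})

# default (sort=True): group keys come out in sorted order
sorted_keys = list(df.groupby('X').sum().index)

# sort=False: group keys keep the order of first appearance
unsorted_keys = list(df.groupby('X', sort=False).sum().index)

print(sorted_keys)    # ['A', 'B']
print(unsorted_keys)  # ['B', 'A']
```

In both cases the sums themselves are identical; only the row order of the result differs.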

Example 2:
Now, let’s take an example of a dataframe with the ages of different people. By default, groupby arranges the resulting dataframe in sorted order of the keys passed. Passing sort=False skips this step, for potential speedups.

data = {'Name': ['Elle', 'Chloe', 'Noah', 'Marco',
                 'Lee', 'Elle', 'Rachel', 'Noah'],
        'Age': [17, 19, 18, 17,
                22, 18, 21, 20]}

df = pd.DataFrame(data)
df

Output:

Let’s group the above dataframe by name:

# using groupby with the default sort=True
# (group keys are sorted alphabetically)
df.groupby(['Name']).sum()

Output:

Passing the sort parameter as False:

# using groupby function
# with sort=False (group keys keep their order of first appearance)
df.groupby(['Name'], sort=False).sum()

Output:
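As a quick check on the two results above, the following sketch shows that sort changes only the order of the group keys, not the aggregated values: sorting the index of the sort=False result gives back the default result. The names by_sorted and by_first_seen are illustrative only.

```python
import pandas as pd

data = {'Name': ['Elle', 'Chloe', 'Noah', 'Marco',
                 'Lee', 'Elle', 'Rachel', 'Noah'],
        'Age': [17, 19, 18, 17,
                22, 18, 21, 20]}
df = pd.DataFrame(data)

# default: names in alphabetical order
by_sorted = df.groupby('Name').sum()

# sort=False: names in order of first appearance
by_first_seen = df.groupby('Name', sort=False).sum()

print(list(by_sorted.index))
print(list(by_first_seen.index))

# sorting the unsorted result by its index reproduces the default result
print(by_first_seen.sort_index().equals(by_sorted))
```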

Example 3:
Let’s take another example of a dataframe that consists of the top speeds of various cars and bikes.
We’ll try to get the top speeds sorted within the groups of vehicle type.

import pandas as pd

df = pd.DataFrame([('Bike', 'Kawasaki', 186),
                   ('Bike', 'Ducati Panigale', 202),
                   ('Car', 'Bugatti Chiron', 304),
                   ('Car', 'Jaguar XJ220', 210),
                   ('Bike', 'Lightning LS-218', 218),
                   ('Car', 'Hennessey Venom GT', 270),
                   ('Bike', 'BMW S1000RR', 188)],
                  columns=('Type', 'Name', 'top_speed(mph)'))

df

Output:

After using the groupby function:

# Using groupby function:
# nlargest() returns the largest values of top_speed(mph)
# within each group, in descending order
# (by default, up to 5 values per group)
grouped = df.groupby(['Type'])['top_speed(mph)'].nlargest()

print(grouped)

Output:
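nlargest() returns only the speed column. If the full rows are wanted in sorted order within each group, one common alternative is to sort the dataframe first and then group it. The sketch below assumes descending speed within each vehicle type; the names sorted_df and top2 are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([('Bike', 'Kawasaki', 186),
                   ('Bike', 'Ducati Panigale', 202),
                   ('Car', 'Bugatti Chiron', 304),
                   ('Car', 'Jaguar XJ220', 210),
                   ('Bike', 'Lightning LS-218', 218),
                   ('Car', 'Hennessey Venom GT', 270),
                   ('Bike', 'BMW S1000RR', 188)],
                  columns=('Type', 'Name', 'top_speed(mph)'))

# sort by vehicle type, then by speed (descending) within each type
sorted_df = df.sort_values(['Type', 'top_speed(mph)'],
                           ascending=[True, False])
print(sorted_df)

# groupby keeps the row order inside each group, so head(2)
# picks the two fastest vehicles of each type
top2 = sorted_df.groupby('Type').head(2)
print(top2)
```

Sorting first keeps every column available, which is convenient when the vehicle names are needed alongside the speeds.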
