Ranking rows in a Pandas DataFrame assigns a numerical rank to each row based on the values in a column. Higher or lower values can be ranked first depending on the requirement.
For example, given a small dataset of students’ scores, we can rank them to see who scored highest and lowest.
import pandas as pd
df = pd.DataFrame({'Student': ['Emily', 'Ava', 'Jack'],
'Marks': [85, 92, 78] })
df['Rank'] = df['Marks'].rank(ascending=0)
print(df)
Output
Student Marks Rank 0 Emily 85 2.0 1 Ava 92 1.0 2 Jack 78 3.0
Explanation:
- df['Marks'].rank(ascending=0) ranks the students from highest to lowest marks.
- The student with the highest mark (Ava) gets rank 1.
Syntax
DataFrame.rank(axis=0, method='average', ascending=True)
Parameters:
- axis: 0 for row-wise (default), 1 for column-wise.
- method: How to assign ranks to equal values ('average', 'min', 'max', 'first').
- ascending: True for lowest values first, False for highest first.
Return Value: Returns a Series or DataFrame containing ranks.
Examples
Example 1: In this example, we rank movies from lowest to highest rating using the rank() method.
import pandas as pd
# Sample movie DataFrame
df = pd.DataFrame({ 'Movie': ['The Godfather', 'Bird Box', 'Fight Club', 'Inception', 'Titanic'],
'Year': [1972, 2018, 1999, 2010, 1997],
'Rating': [9.2, 6.8, 8.8, 8.7, 7.8] })
df['Rank_Asc'] = df['Rating'].rank(ascending=True)
print(df.sort_values('Rank_Asc'))
Output
Movie Year Rating Rank_Asc
1 Bird Box 2018 6.8 1.0
4 Titanic 1997 7.8 2.0
3 Inception 2010 8.7 3.0
2 Fight Club 1999 8.8 4.0
0 The Godfather 1972 9.2 5.0
Explanation:
- df['Rating'].rank(ascending=True) computes rank based on the Rating column, lowest value first.
- sort_values('Rank_Asc') orders the DataFrame by the new rank column.
- Bird Box has rank 1 as it has the lowest rating.
Example 2: This code ranks movies from highest to lowest rating to find the top-rated movie.
df['Rank_Desc'] = df['Rating'].rank(ascending=False)
print(df.sort_values('Rank_Desc'))
Output
Movie Year Rating Rank_Desc
0 The Godfather 1972 9.2 1.0
2 Fight Club 1999 8.8 2.0
3 Inception 2010 8.7 3.0
4 Titanic 1997 7.8 4.0
1 Bird Box 2018 6.8 5.0
Explanation:
- rank(ascending=False) assigns rank 1 to the highest rating.
- The highest-rated movie, The Godfather, now has rank 1.
- Sorting by Rank_Desc shows movies in order from top-rated to lowest-rated.
Example 3: Here we rank movies using the 'min' method, which gives tied ratings the minimum possible rank.
df.loc[5] = ['Movie X', 2020, 8.8]
df['Rank_Min'] = df['Rating'].rank(ascending=False, method='min')
print(df.sort_values('Rank_Min'))
Output
Movie Year Rating Rank_Min
0 The Godfather 1972 9.2 1.0
2 Fight Club 1999 8.8 2.0
5 Movie X 2020 8.8 2.0
3 Inception 2010 8.7 4.0
4 Titanic 1997 7.8 5.0
1 Bird Box 2018 6.8 6.0
Explanation:
- Added a new movie with rating 8.8 to demonstrate ties.
- method='min' assigns the lowest possible rank to tied values.
- Both Fight Club and Movie X share rank 2.