Skip to content
Related Articles

Related Articles

How to Sort a Pandas DataFrame by Date?
  • Last Updated : 02 Feb, 2021

In the real world, we can come across datasets of any form that may include the date inside them too. These datasets can be present in any file format like .CSV, .xlsx, .txt, etc. To load this data inside Python, we use a library named Pandas which provides us a plethora of functions and methods to play around with this data. Pandas read these datasets in the form of DataFrames.

Sometimes, there might be a situation where the dataset may contain attributes related to date, and we want to sort the records inside the dataframe as per the date values in a specific order.

In this article, we will learn about how can we sort Pandas DataFrame by the Date. I’ll be creating a custom dataframe object imitating a real-world problem and this method will work universally for any DataFrame.

Sorting by Single Column

To sort a DataFrame as per the column containing date we’ll be following a series of steps, so let’s learn along.

Step 1: Load or create dataframe having a date column



Python




# importing package
import pandas as pd
  
# Creating a dataframe that stores records of students taking admission in a college
data = pd.DataFrame({'AdmissionDate': ['2021-01-25','2021-01-22','2021-01-20',
                        '2021-01-18','2021-01-22','2021-01-17','2021-01-21'],
                     'StudentID': [7,5,3,2,6,1,4],
                     'Name': ['Ram','Shyam','Mohan','Sohan','Lucky','Abhinav','Danny'],
                     'Stream':['CSE','ECE','Civil','Mechanical','CSE','IT','EEE']
                   })
# Checking dataframe
print(data)

Output:

Here, it can be clearly seen that our DataFrame contains a column named ‘AdmissionDate’ which contains date values.

Step 2: Converting string data type into datetime object.

When we read the dataset the values stored inside the  ‘AdmissionDate’  column are treated as string data type by default. So, we need to convert this string object to datetime object, for that we will be using the .to_datetime() method provided by Pandas that can be done as:

Python




# checking datatype
print(type(data.AdmissionDate[0]))
  
# convert to date
data['AdmissionDate'] = pd.to_datetime(data['AdmissionDate'])
  
# verify datatype
print(type(data.AdmissionDate[0]))

Output:



<class 'str'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>

Step 3: Sorting the DataFrame as per date

We will be using the sort_values() method to sort our dataset and the attribute that we will pass inside the function is the column name using which we want to sort our DatFrame.

Python




data.sort_values(by='AdmissionDate')
print(data)

Output:

One thing to notice here is our DataFrame gets sorted in ascending order of dates, to sort the DataFrame in descending order we can pass an additional parameter inside the sort_values() function that will set ascending value to False and will return the DataFrame in descending order.

Python




data.sort_values(by='AdmissionDate',ascending=False)
print(data)

Output:

Sorting by Multiple Columns as per date

We can further extend our understanding for sorting multiple datetime columns as well, in this, we maintain a priority order to sort our DataFrame. Let’s have a look.

Step 1: Load or create dataframe having multiple date columns

Python




# importing package
import pandas as pd
  
# Creating a DataFrame
data_1 = pd.DataFrame({'Mfg. Date': ['2021-01-25','2021-01-22','2021-01-20','2021-01-18',
                       '2021-01-22','2021-01-17','2021-01-21'],
                       'ProductID': [7,5,3,2,6,1,4],
                       'Product Name': ['Paracetamol','Moov','Volini','Crocin',
                                        'Aciloc','Iodex','Combiflam'],
                       'Expiry Date':['2022-01-25','2023-01-22','2021-05-20','2022-03-18',
                                      '2022-01-22','2021-05-17','2022-01-30']
                       })
  
# Checking dataframe
print(data_1)

Output:

Here, it can be clearly seen that our DataFrame contains two columns having dates namely ‘Mfg. Date’ and ‘Expiry Date ‘.

Step 2: Converting string data type into datetime object.

Python




data_1[['Mfg. Date','Expiry Date']] = data_1[['Mfg. Date','Expiry Date']].apply(pd.to_datetime)

Step 3: Sorting the DataFrame as per date

Python




# sorting DateFrame by giving first priority to Expiry Date and then Mfg. Date 
data_1.sort_values(by=['Expiry Date', 'Mfg. Date'])

Output:

Here, we got a sorted list in ascending order of Expiry Date as per our priority.

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning – Basic Level Course

My Personal Notes arrow_drop_up
Recommended Articles
Page :