
Select Pandas dataframe rows between two dates

Prerequisites: pandas

Pandas is an open-source Python library built on top of NumPy. It provides data structures and operations for manipulating numerical data and time series, and it is popular mainly because it makes importing and analyzing data much easier. It is also designed for high performance and productivity.



This article focuses on selecting the rows of a pandas DataFrame that fall between two dates. We can do this by filtering the DataFrame with a boolean mask.

Dates can initially be represented in several ways, for example as strings, as np.datetime64 values, or as datetime.datetime objects.



To manipulate dates in pandas, we use the pd.to_datetime() function, which converts different date representations to pandas datetime values (dtype datetime64[ns]).
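As a minimal sketch of this conversion, the snippet below passes three common representations of the same date (an ISO string, a np.datetime64 value, and a datetime.datetime object) to pd.to_datetime(). The variable names are illustrative only and do not come from the article.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Three illustrative representations of the same calendar date
as_string = '2020-08-04'
as_numpy = np.datetime64('2020-08-04')
as_datetime = datetime(2020, 8, 4)

# Each scalar input is converted to a pandas Timestamp
converted = [pd.to_datetime(value)
             for value in (as_string, as_numpy, as_datetime)]
print(converted)

# A Series of strings is converted to a datetime64 Series
dates = pd.to_datetime(pd.Series(['2020-08-04', '2020-08-07']))
print(dates.dtype)
```

Once the values share a datetime type, comparisons between them are chronological rather than textual.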

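Syntax: pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=False, format=None, exact=True, unit=None, origin='unix', cache=True)

Note: The exact signature and default values depend on the pandas version. The box parameter listed in older documentation was removed in pandas 1.0, and infer_datetime_format is deprecated as of pandas 2.0.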

Parameters:

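  • arg: An integer, string, float, list or dict object to convert to a datetime object.
  • dayfirst: Boolean value. If True, ambiguous dates are parsed with the day first (e.g. 10/11/12 becomes 2012-11-10).
  • yearfirst: Boolean value. If True, ambiguous dates are parsed with the year first (e.g. 10/11/12 becomes 2010-11-12).
  • utc: Boolean value. If True, the result is timezone-aware and expressed in UTC.
  • format: strftime-style string describing the positions of day, month and year in the input (e.g. '%d/%m/%Y').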
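The effect of these parameters is easiest to see on an ambiguous date string. The following is a small sketch, assuming a recent pandas version; the sample strings and variable names are made up for illustration.

```python
import pandas as pd

# '04/08/2020' is ambiguous: April 8 or August 4?
month_first = pd.to_datetime('04/08/2020')               # default: month first
day_first = pd.to_datetime('04/08/2020', dayfirst=True)  # day first

# An explicit format string removes the ambiguity
explicit = pd.to_datetime('04-08-2020', format='%d-%m-%Y')

# utc=True yields a timezone-aware timestamp in UTC
aware = pd.to_datetime('2020-08-04 12:00', utc=True)

print(month_first.date(), day_first.date(), explicit.date(), aware.tzinfo)
```

When the layout of the input is known, passing format is usually the safer choice because it does not rely on guessing.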

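Approach

  • Create or load the dataframe.
  • Convert the date column to datetime with pd.to_datetime().
  • Define the start and end dates.
  • Build a boolean mask that compares the date column with both dates.
  • Select the matching rows with df.loc[mask] and display them.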

Example: Original dataframe




import pandas as pd

# Sample data; the dates are stored as ISO-formatted strings
data = {'Name': ['Tani', 'Saumya',
                 'Ganesh', 'Kirti'],

        'Articles': [5, 3, 4, 3],

        'Location': ['Kanpur', 'Kolkata',
                     'Kolkata', 'Bombay'],
        'Dates': ['2020-08-04', '2020-08-07', '2020-08-08', '2020-06-08']}

# Create DataFrame
df = pd.DataFrame(data)

# display() exists only in IPython/Jupyter; print() works everywhere
print(df)

Output:

Example: Selecting data frame rows between two dates




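import pandas as pd

data = {'Name': ['Tani', 'Saumya',
                 'Ganesh', 'Kirti'],

        'Articles': [5, 3, 4, 3],

        'Location': ['Kanpur', 'Kolkata',
                     'Kolkata', 'Bombay'],
        'Dates': ['2020-08-04', '2020-08-07', '2020-08-08', '2020-06-08']}

# Create DataFrame
df = pd.DataFrame(data)

# Convert the string column to datetime so comparisons are chronological
df['Dates'] = pd.to_datetime(df['Dates'])

start_date = '2020-08-05'
end_date = '2020-08-08'

# Boolean mask: after start_date (exclusive) up to end_date (inclusive)
mask = (df['Dates'] > start_date) & (df['Dates'] <= end_date)

# Keep only the rows where the mask is True
df = df.loc[mask]
print(df)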

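Output: Only the rows for Saumya (2020-08-07) and Ganesh (2020-08-08) remain, because these are the only dates later than 2020-08-05 and not later than 2020-08-08.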
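The same selection can also be written without an explicit mask. The sketch below shows two alternatives that are not part of the original example: Series.between(), and label slicing on a sorted DatetimeIndex. The data is a trimmed copy of the dataframe used in this article.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tani', 'Saumya', 'Ganesh', 'Kirti'],
    'Dates': pd.to_datetime(['2020-08-04', '2020-08-07',
                             '2020-08-08', '2020-06-08']),
})

start_date = pd.Timestamp('2020-08-05')
end_date = pd.Timestamp('2020-08-08')

# Option 1: Series.between() includes both endpoints by default
in_range = df[df['Dates'].between(start_date, end_date)]

# Option 2: use the dates as a sorted index and slice by label
by_index = df.set_index('Dates').sort_index().loc[start_date:end_date]

print(in_range)
print(by_index)
```

Both options include the start date, whereas the mask in the example uses a strict > comparison. For this data the results are the same, since no row falls exactly on 2020-08-05.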

