Extract date from a specified column of a given Pandas DataFrame using Regex

  • Last Updated : 29 Aug, 2020

In this article, we will discuss how to extract only the valid dates from a specified column of a given DataFrame. A date counts as valid here when it is written in the form 'mm/dd/yyyy'.


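We use the regular expression \b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/([0-9]{4})\b together with the re.findall() method. The pattern matches a two-digit month from 01 to 12, a two-digit day from 01 to 31, and a four-digit year, separated by slashes. It checks the format only, so a string such as 02/31/2000 still matches. Because the pattern contains three capturing groups, re.findall() returns a list of (month, day, year) tuples. Now let us implement this in Python: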
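To see what the pattern does before applying it to a DataFrame, here is a minimal sketch that runs re.findall() on a few standalone strings. The names DATE_PATTERN, samples and matches are illustrative and do not appear in the article's own code.

```python
import re

# Pattern from the article: month 01-12, day 01-31, four-digit year, "/" separators
DATE_PATTERN = r'\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/([0-9]{4})\b'

# Illustrative inputs: the second one starts with 15, which is not a valid month
samples = ['12/21/1998', '15/12/1998', '06/11/2000']

# Map each input string to the list of matches found in it
matches = {s: re.findall(DATE_PATTERN, s) for s in samples}

for s, m in matches.items():
    print(s, '->', m)
```

The first and third strings each yield one tuple of month, day and year, while the second yields an empty list because no part of it fits the mm/dd/yyyy shape.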

Step 1: Creating Dataframe


# importing the pandas and re libraries
import pandas as pd
import re

# creating a data frame with the columns
# Name, date_of_birth and Age
df = pd.DataFrame({'Name': ['Akash', 'Shyam', 'Ayush',
                            'Diksha', 'Radhika'],
                   # the fifth date is missing in the source;
                   # '16/02/1999' is a placeholder added so that
                   # all columns have the same length
                   'date_of_birth': ['12/21/1998', '15/12/1998',
                                     '06/11/2000', '05/10/1998',
                                     '16/02/1999'],
                   'Age': [21, 12, 20, 21, 10]})

# printing the original data frame
print("Printing the original dataframe")
print(df)


Step 2: Extracting valid dates from the data frame in the format 'mm/dd/yyyy'


# creating a function that finds dates
# in the format mm/dd/yyyy
def checking_valid_dates(dt):
    # re.findall returns a list of (month, day, year)
    # tuples, one per match; the list is empty when
    # the string contains no date in this format
    result = re.findall(
        r'\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/([0-9]{4})\b', dt)
    return result

# creating the new column valid_date_of_birth
df['valid_date_of_birth'] = df['date_of_birth'].apply(
    checking_valid_dates)
print("\nPrinting the data frame with valid dates in the format mm/dd/yyyy:")
print(df)
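The column produced above holds lists of tuples. If a single date string per row is more convenient, one possible variation (not the article's method) is pandas' Series.str.extract with one outer capturing group, optionally followed by pd.to_datetime to reject dates that have the right shape but do not exist on the calendar. The sample values and the names dates, pattern, extracted and parsed are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical sample column; the last value has the right shape but is not a real date
dates = pd.Series(['12/21/1998', '15/12/1998', '06/11/2000', '02/30/2001'])

# Same pattern as in the article, but with non-capturing inner groups and a
# single outer group, so the whole date comes back as one string
pattern = r'\b((?:1[0-2]|0[1-9])/(?:3[01]|[12][0-9]|0[1-9])/[0-9]{4})\b'

# expand=False returns a Series; rows without a match become NaN
extracted = dates.str.extract(pattern, expand=False)

# Optional calendar check: missing or impossible dates become NaT
parsed = pd.to_datetime(extracted, format='%m/%d/%Y', errors='coerce')

print(pd.DataFrame({'raw': dates, 'extracted': extracted, 'parsed': parsed}))
```

The regex alone accepts 02/30/2001 because it only checks the digit ranges, whereas the parsing step turns it into NaT. Whether that second step is wanted depends on how strictly "valid" should be read.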

