Extracting all present dates in any given String using Regular Expressions
Given a string Str, the task is to extract every date that appears in it. Dates may be written in any of the formats listed below (the day and month may have one or two digits):
- DD-MM-YYYY
- YYYY-MM-DD
- DD Month YYYY
Examples:
Input: Str = “The First Version was released on 12-07-2008. The next Release will come on 12 July 2009. The due date for payment is 2023-09-1. India gained its freedom on 15 August 1947 which was a Friday. Republic Day is a public holiday in India where the country marks and celebrates the date on which the Constitution of India came into effect on 26-1-1950.”
Output: 12-07-2008
12 July 2009
2023-09-1
15 August 1947
26-1-1950
Approach: The problem can be solved based on the following idea:
Create one regex pattern for each supported date format, as written below:
regex = "\\d{2}-\\d{2}-\\d{4}",
"[0-9]{2}[/]{1}[0-9]{2}[/]{1}[0-9]{4}",
"\\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\\d{4}",
"\\d{4}-\\d{1,2}-\\d{1,2}",
"[0-9]{1,2}\\s(January|February|March|April|May|June|July|August|September|October|November|December)\\s\\d{4}",
"\\d{1,2}-\\d{1,2}-\\d{4}"
Where,
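- \\d{2}: Matches exactly two consecutive digits. Likewise, \\d{1,2} matches one or two digits and \\d{4} matches exactly four.
- \\s: Matches a single whitespace character.
- |: Alternation. Exactly one of the listed alternatives (here, one of the month names) must be present.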
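The building blocks used in these patterns can be tried in isolation. The following is a small Python sketch; the helper `matches`, the shortened month list, and the sample strings are illustrative assumptions rather than part of the approach itself.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# \d{2} requires exactly two digits.
print(matches(r"\d{2}", "12"))     # True
print(matches(r"\d{2}", "1"))      # False
print(matches(r"\d{2}", "123"))    # False

# \d{1,2} accepts one or two digits. The quantifier must not contain a
# space: Python's re module reads "{1, 2}" as literal text instead.
print(matches(r"\d{1,2}", "7"))    # True
print(matches(r"\d{1, 2}", "7"))   # False

# | selects one of the alternatives inside the group.
month_date = r"\d{1,2}\s(?:July|August)\s\d{4}"
print(matches(month_date, "15 August 1947"))  # True
print(matches(month_date, "15 Augst 1947"))   # False
```

`re.fullmatch` is used here only to test a pattern against a whole sample; extracting dates from running text needs a scanning function such as `re.finditer`.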
Follow the below steps to implement the idea:
- Create a regex pattern for each supported date format.
- Compile each pattern (for example, with the Pattern class in Java).
- Use the matcher's find function to scan the string for every occurrence of the pattern.
- Print each match that is found.
Below is the code implementation of the above-discussed approach:
C++
#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    // Input text containing dates in several formats
    string str = "The First Version was released on 12-07-2008. "
                 "The next Release will come on 12 July 2009. "
                 "The due date for payment is 2023-09-1. "
                 "India gained its freedom on 15 August 1947 which was a Friday. "
                 "Republic Day is a public holiday in India where the country marks and celebrates "
                 "the date on which the Constitution of India came into effect on 26-1-1950.";

    // One pattern per supported date format
    string strPattern[] = {
        "\\d{2}-\\d{2}-\\d{4}",
        "[0-9]{2}/{1}[0-9]{2}/{1}[0-9]{4}",
        "\\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\\d{4}",
        "\\d{4}-\\d{1,2}-\\d{1,2}",
        "[0-9]{1,2}\\s(January|February|March|April|May|June|July|August|September|October|November|December)\\s\\d{4}",
        "\\d{1,2}-\\d{1,2}-\\d{4}"
    };

    // Scan the text once per pattern and print every match
    for (const string& p : strPattern) {
        regex pattern(p);
        sregex_iterator matcher(str.begin(), str.end(), pattern);
        sregex_iterator end;
        while (matcher != end) {
            cout << matcher->str() << endl;
            ++matcher;
        }
    }
    return 0;
}
Java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GFG {
    public static void main(String[] args)
    {
        // Input text containing dates in several formats
        String str
            = "The First Version was released on 12-07-2008. "
              + "The next Release will come on 12 July 2009. "
              + "The due date for payment is 2023-09-1. "
              + "India gained its freedom on 15 August 1947 which was a Friday. "
              + "Republic Day is a public holiday in India where the country marks and celebrates "
              + "the date on which the Constitution of India came into effect on 26-1-1950.";

        // One pattern per supported date format.
        // Note: a space inside the braces, as in {1, 2}, is not a valid quantifier.
        String[] strPattern = {
            "\\d{2}-\\d{2}-\\d{4}",
            "[0-9]{2}[/]{1}[0-9]{2}[/]{1}[0-9]{4}",
            "\\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\\d{4}",
            "\\d{4}-\\d{1,2}-\\d{1,2}",
            "[0-9]{1,2}\\s(January|February|March|April|May|June|July|August|September|October|November|December)\\s\\d{4}",
            "\\d{1,2}-\\d{1,2}-\\d{4}"
        };

        // Scan the text once per pattern and print every match
        for (String p : strPattern) {
            Pattern pattern = Pattern.compile(p);
            Matcher matcher = pattern.matcher(str);
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        }
    }
}
Python3
import re

# Input text containing dates in several formats
# (named text rather than str to avoid shadowing the built-in type)
text = "The First Version was released on 12-07-2008. " \
       "The next Release will come on 12 July 2009. " \
       "The due date for payment is 2023-09-1. " \
       "India gained its freedom on 15 August 1947 which was a Friday. " \
       "Republic Day is a public holiday in India where the country marks and celebrates " \
       "the date on which the Constitution of India came into effect on 26-1-1950."

# One pattern per supported date format
str_pattern = [
    r"\d{2}-\d{2}-\d{4}",
    r"[0-9]{2}/{1}[0-9]{2}/{1}[0-9]{4}",
    r"\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\d{4}",
    r"\d{4}-\d{1,2}-\d{1,2}",
    r"[0-9]{1,2}\s(January|February|March|April|May|June|July|August|September|October|November|December)\s\d{4}",
    r"\d{1,2}-\d{1,2}-\d{4}"
]

# Scan the text once per pattern and print every match
for pattern in str_pattern:
    for match in re.finditer(pattern, text):
        print(match.group())
C#
using System;
using System.Text.RegularExpressions;

class GFG {
    static void Main(string[] args) {
        // Input text containing dates in several formats
        string str = "The First Version was released on 12-07-2008. "
            + "The next Release will come on 12 July 2009. "
            + "The due date for payment is 2023-09-1. "
            + "India gained its freedom on 15 August 1947 which was a Friday. "
            + "Republic Day is a public holiday in India where the country marks and celebrates "
            + "the date on which the Constitution of India came into effect on 26-1-1950.";

        // One pattern per supported date format.
        // Note: a space inside the braces, as in {1, 2}, is not a valid quantifier.
        string[] strPattern = new string[] {
            "\\d{2}-\\d{2}-\\d{4}",
            "[0-9]{2}[/]{1}[0-9]{2}[/]{1}[0-9]{4}",
            "\\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\\d{4}",
            "\\d{4}-\\d{1,2}-\\d{1,2}",
            "[0-9]{1,2}\\s(January|February|March|April|May|June|July|August|September|October|November|December)\\s\\d{4}",
            "\\d{1,2}-\\d{1,2}-\\d{4}"
        };

        // Scan the text once per pattern and print every match
        foreach (string pattern in strPattern) {
            MatchCollection matches = Regex.Matches(str, pattern);
            foreach (Match match in matches) {
                Console.WriteLine(match.Value);
            }
        }
    }
}
Javascript
// Input text containing dates in several formats
const str = "The First Version was released on 12-07-2008. " +
    "The next Release will come on 12 July 2009. " +
    "The due date for payment is 2023-09-1. " +
    "India gained its freedom on 15 August 1947 which was a Friday. " +
    "Republic Day is a public holiday in India where the country marks and celebrates " +
    "the date on which the Constitution of India came into effect on 26-1-1950.";

// One pattern per supported date format
const strPattern = [
    "\\d{2}-\\d{2}-\\d{4}",
    "[0-9]{2}/{1}[0-9]{2}/{1}[0-9]{4}",
    "\\d{1,2}-(January|February|March|April|May|June|July|August|September|October|November|December)-\\d{4}",
    "\\d{4}-\\d{1,2}-\\d{1,2}",
    "[0-9]{1,2}\\s(January|February|March|April|May|June|July|August|September|October|November|December)\\s\\d{4}",
    "\\d{1,2}-\\d{1,2}-\\d{4}"
];

// Scan the text once per pattern and print every match
// (matchAll requires the global 'g' flag)
for (const p of strPattern) {
    const pattern = new RegExp(p, 'g');
    for (const match of str.matchAll(pattern)) {
        console.log(match[0]);
    }
}
Output
12-07-2008
2023-09-1
12 July 2009
15 August 1947
12-07-2008
26-1-1950
Note: The matches are printed pattern by pattern, not in the order they occur in the text. 12-07-2008 is printed twice because it matches both the first pattern and the last one.
Time Complexity: O(N*M), where N is the length of the input string and M is the number of date formats in the strPattern array, since the string is scanned once for each pattern.
Space Complexity: O(K), where K is the total number of matched dates.
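To report each date only once, in the order it appears in the text, one possible variation is to merge the formats into a single alternation and scan the string once. The sketch below is an illustration under stated assumptions, not the implementation above: the names `MONTHS`, `DATE_RE`, `extract_dates`, and `SAMPLE` are hypothetical, the digit lookarounds are an added safeguard, and the month alternative accepts either a hyphen or whitespace as separator.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Single alternation covering the formats used in this article.
# Assumption: (?<!\d) and (?!\d) are added so that a match cannot start
# or end in the middle of a longer run of digits.
DATE_RE = re.compile(
    r"(?<!\d)(?:"
    r"\d{4}-\d{1,2}-\d{1,2}"      # YYYY-MM-DD
    r"|\d{1,2}-\d{1,2}-\d{4}"     # DD-MM-YYYY
    r"|\d{2}/\d{2}/\d{4}"         # DD/MM/YYYY
    r"|\d{1,2}[-\s](?:" + MONTHS + r")[-\s]\d{4}"  # DD Month YYYY
    r")(?!\d)"
)

def extract_dates(text: str) -> list[str]:
    """Return every date-like substring of text, in order of appearance."""
    return [m.group() for m in DATE_RE.finditer(text)]

SAMPLE = ("The First Version was released on 12-07-2008. "
          "The next Release will come on 12 July 2009. "
          "The due date for payment is 2023-09-1. "
          "India gained its freedom on 15 August 1947 which was a Friday. "
          "The Constitution of India came into effect on 26-1-1950.")

for date in extract_dates(SAMPLE):
    print(date)
```

Because the regex engine tries the alternatives from left to right at each position, every date is consumed by exactly one alternative, so no date is reported twice. Like the per-pattern version, this only checks the shape of a date; a string such as 99-99-2008 would still be matched.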
Last Updated :
23 Mar, 2023