
Python – Group dates in K ranges

Last Updated : 01 May, 2023

Given a list of dates, group the dates into successive K-day ranges, starting from the earliest date in the list. Each group holds the dates that fall within one K-day window.

Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 10

Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0)]), (1, [datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]

Explanation : 27 Dec to 4 Jan fall in the same group because each of these dates is fewer than 10 days after the earliest date (27 Dec). Each following group covers the next 10-day window.

Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 14

Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0)]), (1, [datetime.datetime(2020, 1, 10, 0, 0), datetime.datetime(2020, 1, 20, 0, 0)])]

Explanation : 27 Dec to 7 Jan fall in the same group because each of these dates is fewer than 14 days after the earliest date (27 Dec). Each following group covers the next 14-day window.
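
The arithmetic behind both examples can be checked with a short sketch. It assumes the grouping rule is to integer-divide each date's offset from the earliest date by K; the helper name window_index is illustrative only and does not come from any library.

```python
from datetime import datetime

dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]


def window_index(date, start, k):
    # hypothetical helper: number of whole k-day windows between start and date
    return (date - start).days // k


start = min(dates)
for k in (10, 14):
    # one window index per date, in chronological order
    indices = [window_index(d, start, k) for d in sorted(dates)]
    print(k, indices)
# 10 [0, 0, 0, 1, 1, 2]
# 14 [0, 0, 0, 0, 1, 1]
```

Dates that share an index end up in the same group, which matches the two outputs listed above.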

Method #1: Using groupby() + sort()

Here we sort the dates and then group them with groupby(), using a key function that maps each date to the index of the K-day window it falls in, counted from the earliest date.

Python3




# Python3 code to demonstrate working of
# Group dates in K ranges
# Using groupby() + sort()
from itertools import groupby
from datetime import datetime
 
# initializing list
test_list = [datetime(2020, 1, 4),
             datetime(2019, 12, 30),
             datetime(2020, 1, 7),
             datetime(2019, 12, 27),
             datetime(2020, 1, 20),
             datetime(2020, 1, 10)]
              
# printing original list
print("The original list is : " + str(test_list))
 
# initializing K
K = 7
 
# initializing start date
min_date = min(test_list)
 
# utility function: index of the K-day window (counted from
# min_date) that a date falls into
def group_util(date):
    return (date - min_date).days // K

# sorting before grouping, since groupby() only merges adjacent items
test_list.sort()

# grouping by the window index
temp = []
for key, val in groupby(test_list, key=group_util):
    temp.append((key, list(val)))

# using strftime to convert to a user-friendly format
res = []
for key, group in temp:
    res.append((key, [ele.strftime("%Y/%m/%d") for ele in group]))

# printing result
print("Grouped dates : " + str(res))


Output:

The original list is : [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2020, 1, 20, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]

Grouped dates : [(0, ['2019/12/27', '2019/12/30']), (1, ['2020/01/04', '2020/01/07']), (2, ['2020/01/10']), (3, ['2020/01/20'])]
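
The sort step matters because itertools.groupby() only merges adjacent items that share a key. A small sketch, applying the same window rule to a shortened list made up for this illustration, shows what happens when the sort is skipped:

```python
from itertools import groupby
from datetime import datetime

K = 7
dates = [datetime(2020, 1, 4), datetime(2019, 12, 27), datetime(2020, 1, 7)]
min_date = min(dates)


def window(date):
    # index of the K-day window, counted from the earliest date
    return (date - min_date).days // K


# without sorting, window 1 is split into two separate runs
unsorted_keys = [key for key, _ in groupby(dates, key=window)]

# after sorting, each window appears exactly once
sorted_keys = [key for key, _ in groupby(sorted(dates), key=window)]

print(unsorted_keys)  # [1, 0, 1]
print(sorted_keys)    # [0, 1]
```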

Method #2: Using Sort and iterate

Approach

1. Sort the list of dates in ascending order.
2. Initialize a list of tuples to store the groups.
3. Initialize variables to keep track of the current group number and the start date of the current group.
4. Iterate through the sorted list of dates, comparing the current date with the start date of the current group.
5. If the difference between the current date and the start date is less than or equal to K days, add the current date to the current group.
6. If the difference between the current date and the start date is greater than K days, create a new group with the current date as the start date and add the current date to the new group.
7. Return the list of tuples.

Algorithm

1. Sort the given list of dates in ascending order.
2. Initialize an empty dictionary to store the groups of dates.
3. For each date in the sorted list, calculate the number of days since the start date of the current group by subtracting the two datetime objects and reading the days attribute of the resulting timedelta.
4. If the number of days is greater than K, start a new group with this date as its start date. Otherwise, add the date to the current group.
5. Convert the dictionary into a list of tuples and return the result.

Python3




from datetime import datetime
from collections import defaultdict


def group_dates(dates, K):
    # maps group number -> list of dates in that group
    groups = defaultdict(list)
    group_num = 0
    start_date = None
    # iterate over a sorted copy so the caller's list is left untouched
    for date in sorted(dates):
        if start_date is None:
            # the earliest date opens the first group
            start_date = date
        elif (date - start_date).days > K:
            # more than K days after the group's start: open a new group
            group_num += 1
            start_date = date
        groups[group_num].append(date)
    return list(groups.items())


dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))


Output

[(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), (1, [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]

Note that this result differs from the Method 1 output for the same K = 7. Method 1 uses fixed 7-day windows counted from the earliest date, whereas this method starts each new group at the first date that lies more than K days after the current group's start date, so 10 Jan stays in the group that begins on 4 Jan.

Time complexity: O(n log n) – sorting the list of dates takes O(n log n) time, where n is the number of dates. The loop that iterates through the sorted list of dates takes O(n) time.

Auxiliary Space: O(n) – we store the groups of dates in a dictionary that can potentially contain n elements.
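
To see the fixed-window rule of Method 1 and the start-date rule of this method side by side, the sketch below runs both on the article's data. The function names fixed_windows and anchored_windows are labels chosen for this illustration, and both return plain lists of groups rather than (number, dates) tuples.

```python
from datetime import datetime
from itertools import groupby


def fixed_windows(dates, k):
    # Method 1 rule: windows of k days counted from the earliest date
    start = min(dates)
    keyfunc = lambda d: (d - start).days // k
    return [list(g) for _, g in groupby(sorted(dates), key=keyfunc)]


def anchored_windows(dates, k):
    # Method 2 rule: a new group starts at the first date that lies
    # more than k days after the current group's first date
    groups = []
    for d in sorted(dates):
        if groups and (d - groups[-1][0]).days <= k:
            groups[-1].append(d)
        else:
            groups.append([d])
    return groups


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]

print([len(g) for g in fixed_windows(dates, 7)])     # [2, 2, 1, 1]
print([len(g) for g in anchored_windows(dates, 7)])  # [2, 3, 1]
```

A date exactly k days after a group's first date is also treated differently: fixed_windows places it in the next window, while anchored_windows keeps it in the current group because the test is "more than k days".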

Method #3: Using a while loop

Approach:

  1. Sort the dates in ascending order.
  2. Initialize an empty collection called "groups".
  3. Set a variable called "current_group" to 0.
  4. Set a variable called "group_start_date" to the first date in the sorted list.
  5. While there are still dates left in the list:
     a. Get the next date in the list.
     b. If the difference between this date and the group start date is greater than K days, increment the current group number and set the group start date to this date.
     c. Add the date to the current group.
  6. Return "groups" as a list of (group number, dates) tuples.

Python3




from collections import defaultdict
from datetime import datetime


def group_dates(dates, K):
    # maps group number (as a string) -> list of dates in that group
    groups = defaultdict(list)
    dates.sort()

    group_num = 0
    start_date = None

    # walk the sorted list by index with a while loop
    i = 0
    while i < len(dates):
        date = dates[i]
        if start_date is None:
            # the earliest date opens the first group
            start_date = date
        elif (date - start_date).days > K:
            # more than K days after the group's start: open a new group
            group_num += 1
            start_date = date

        groups[str(group_num)].append(date)
        i += 1

    # show the intermediate dictionary before converting it
    print(groups)

    return list(groups.items())

# input
dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]

K = 7

print(group_dates(dates, K))


Output

defaultdict(<class 'list'>, {'0': [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)], '1': [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)], '2': [datetime.datetime(2020, 1, 20, 0, 0)]})
[('0', [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), ('1', [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), ('2', [datetime.datetime(2020, 1, 20, 0, 0)])]

Time complexity: O(n log n), where n is the number of dates in the input list. Sorting dominates; the grouping loop itself is O(n).
Auxiliary space: O(n), since every date is stored in the groups dictionary.


