Python – Summation Grouping in Dictionary List

Sometimes, while working with Python dictionaries, we need to group a list of dictionaries by the value of a specific key and, within each group, sum the values of certain other keys. This is a peculiar problem, but it can have applications in domains such as web development. Let's discuss some ways in which this task can be performed.

Input : test_list = [{'geeks': 10, 'best': 8, 'Gfg': 6}, {'geeks': 12, 'best': 11, 'Gfg': 4}]
Output : {4: {'geeks': 12, 'best': 11}, 6: {'geeks': 10, 'best': 8}}

Input : test_list = [{'CS': 14, 'best': 9, 'geeks': 7, 'Gfg': 14}]
Output : {14: {'best': 9, 'geeks': 7}}

Method #1: Using loop

This is the brute-force way to perform this task. We loop through each dictionary in the list, look up its group key, create a zero-initialized entry the first time a group is seen, and add the values of the sum keys to that group's running totals.

# Python3 code to demonstrate working of
# Summation Grouping in Dictionary List
# Using loop
 
# initializing list
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
 
# printing original list
print("The original list is : " + str(test_list))
 
# initializing group key
grp_key = 'Gfg'
 
# initializing sum keys
sum_keys = ['best', 'geeks']
 
# Summation Grouping in Dictionary List
# Using loop
res = {}
for sub in test_list:
    ele = sub[grp_key]
    if ele not in res:
        res[ele] = {x: 0 for x in sum_keys}
    for y in sum_keys:
        res[ele][y] += int(sub[y])
 
# printing result
print("The grouped list : " + str(res))

Output : 

The original list is : [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10}, {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12}, {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
The grouped list : {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

Time complexity: O(n*m), where n is the length of the input list and m is the number of sum keys. 
Auxiliary space: O(g*m), where g is the number of distinct group-key values, because the result holds one dictionary of size m per group. In the worst case every dictionary has its own group (g = n), giving O(n*m). 
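
The loop above can also be wrapped in a reusable function. The following is a minimal sketch, not part of the original method: the function name group_and_sum is hypothetical, and it assumes that a dictionary missing one of the sum keys should contribute 0 instead of raising a KeyError.

```python
def group_and_sum(records, grp_key, sum_keys):
    """Group dictionaries by grp_key and sum the values stored under sum_keys."""
    res = {}
    for sub in records:
        # setdefault returns the existing totals for this group, or stores
        # and returns a zeroed accumulator the first time the group appears
        totals = res.setdefault(sub[grp_key], {k: 0 for k in sum_keys})
        for k in sum_keys:
            # assumption: a missing sum key counts as 0
            totals[k] += sub.get(k, 0)
    return res


test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

grouped = group_and_sum(test_list, 'Gfg', ['best', 'geeks'])
print("The grouped list : " + str(grouped))
```

Using setdefault keeps the "create on first sight" step and the lookup in a single call, at the cost of building a throwaway default dictionary on every iteration.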

Method #2: Using defaultdict

This approach uses a defaultdict whose factory is a lambda function returning a dictionary that maps each key in sum_keys to 0. The first time a group key is seen, this default dictionary is created automatically, and the values under the sum keys are then added to it. Finally, res is converted to a regular dictionary with dict() before being printed.




from collections import defaultdict
 
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
 
grp_key = 'Gfg'
sum_keys = ['best', 'geeks']
 
res = defaultdict(lambda: {x: 0 for x in sum_keys})
for sub in test_list:
    ele = sub[grp_key]
    for y in sum_keys:
        res[ele][y] += int(sub[y])
 
print("The grouped list: ", dict(res))

Output
The grouped list:  {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

Time complexity: O(N*M), where N is the number of dictionaries in test_list and M is the number of keys that need to be summed (in this case 2: 'best' and 'geeks'). 
Auxiliary space: O(G*M), where G is the number of distinct group-key values and M is the number of sum keys, because the result stores one dictionary of size M per group. This is at most O(N*M). 
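
As a variation on the same idea, the per-group accumulator can be a collections.Counter instead of a hand-built dictionary of zeros. This is a sketch of an alternative and not the method described above; it relies on Counter.update adding the supplied values to the existing totals, and the name plain is only illustrative.

```python
from collections import Counter, defaultdict

test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

grp_key = 'Gfg'
sum_keys = ['best', 'geeks']

# each new group starts with an empty Counter
res = defaultdict(Counter)
for sub in test_list:
    # Counter.update adds these values to the group's running totals
    res[sub[grp_key]].update({k: sub[k] for k in sum_keys})

# convert the nested Counters back to plain dictionaries for printing
plain = {group: dict(totals) for group, totals in res.items()}
print("The grouped list: ", plain)
```

With this variant no lambda is needed, since Counter treats missing keys as 0 by itself.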

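Method #3: Using list comprehension and the built-in function sum()

Here a list comprehension iterates over each distinct group-key value and each sum key, and sum() adds up the matching values. Note that the result is a flat list with one dictionary per (group, sum key) pair, not the nested dictionary produced by the other methods.
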
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
grp_key = 'Gfg'
sum_keys = ['best', 'geeks']
 
result_list = [{grp_key: key, **{skey: sum(sub[skey] for sub in test_list if sub[grp_key] == key)}}
               for key in set(sub[grp_key] for sub in test_list)
               for skey in sum_keys]
 
print("The grouped list: ", result_list)

Output
The grouped list:  [{'Gfg': 1, 'best': 8}, {'Gfg': 1, 'geeks': 10}, {'Gfg': 4, 'best': 21}, {'Gfg': 4, 'geeks': 27}]

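Time complexity: O(g*m*n), where n is the number of dictionaries in the input list, m is the number of keys in the sum_keys list, and g is the number of distinct group-key values, because each of the g*m sums scans the whole input list. 
Auxiliary space: O(g*m), since the result list holds one dictionary per (group, sum key) pair.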
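
The comprehension in this method yields one single-key dictionary per (group, sum key) pair. If the nested dictionary shape from the loop-based method is wanted instead, a nested dictionary comprehension is one possible sketch. The sorted() call is an addition made here so that the groups appear in a predictable order; it assumes the group-key values are mutually comparable.

```python
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
grp_key = 'Gfg'
sum_keys = ['best', 'geeks']

# collect the distinct group-key values
groups = {sub[grp_key] for sub in test_list}

# outer comprehension: one entry per group
# inner comprehension: one total per sum key within that group
res = {
    key: {skey: sum(sub[skey] for sub in test_list if sub[grp_key] == key)
          for skey in sum_keys}
    for key in sorted(groups)
}

print("The grouped list: ", res)
```

Like the original comprehension, this rescans the whole list for every group and sum key, so it trades efficiency for brevity.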

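Method #4: Using the pandas library

Step-by-step approach:

1. Create a DataFrame from the list of dictionaries.
2. Group the DataFrame by the group key and sum the columns listed in sum_keys.
3. Convert the resulting DataFrame back to a dictionary with to_dict('index').
4. Print the result.

Below is the implementation of the above approach: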




# importing pandas library
import pandas as pd
 
# initializing list
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
 
# creating a DataFrame from the list
df = pd.DataFrame(test_list)
 
# initializing group key
grp_key = 'Gfg'
 
# initializing sum keys
sum_keys = ['best', 'geeks']
 
# grouping the DataFrame by the group key and summing the sum keys
res_df = df.groupby(grp_key)[sum_keys].sum()
 
# converting the resulting DataFrame back to a dictionary
res = res_df.to_dict('index')
 
# printing result
print("The grouped list: " + str(res))

Output

The grouped list: {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

Time complexity: O(n), where n is the number of dictionaries in the given list test_list. 
Auxiliary space: O(n), as we are creating a DataFrame and a dict with the same number of entries as test_list.

