Python – Summation Grouping in Dictionary List

Sometimes, while working with Python dictionaries, we need to group a list of dictionaries by the value of a specific key and sum the values of certain other keys within each group. This is a peculiar problem, but it has applications in domains such as web development. Let's discuss several ways in which this task can be performed.

Input : test_list = [{'geeks': 10, 'best': 8, 'Gfg': 6}, {'geeks': 12, 'best': 11, 'Gfg': 4}]
Output : {4: {'geeks': 12, 'best': 11}, 6: {'geeks': 10, 'best': 8}}

Input : test_list = [{'CS': 14, 'best': 9, 'geeks': 7, 'Gfg': 14}]
Output : {14: {'best': 9, 'geeks': 7}}

Method #1: Using loop

This is the brute-force way to perform the task. We loop over each dictionary in the list, read the value of its group key, and add the values of the sum keys to the running totals for that group, building the grouped and summed dictionary as we go.

Python3

# Python3 code to demonstrate working of
# Summation Grouping in Dictionary List
# Using loop

# initializing list
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

# printing original list
print("The original list is : " + str(test_list))

# initializing group key
grp_key = 'Gfg'

# initializing sum keys
sum_keys = ['best', 'geeks']

# Summation Grouping in Dictionary List
# Using loop
res = {}
for sub in test_list:
    ele = sub[grp_key]
    # first time this group value is seen: start every sum at 0
    if ele not in res:
        res[ele] = {x: 0 for x in sum_keys}
    # add this dictionary's values to the group's running totals
    for y in sum_keys:
        res[ele][y] += int(sub[y])

# printing result
print("The grouped list : " + str(res))

Output :

The original list is : [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10}, {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12}, {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]
The grouped list : {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

Time complexity: O(nm), where n is the length of the input list and m is the number of sum keys.
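Auxiliary space: O(gm), where g is the number of distinct group key values (at most n), because the result holds one dictionary of size m per group. In the worst case, when every dictionary has a different group value, this is O(nm).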
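
The loop above can also be wrapped in a small function and checked against the two Input/Output examples at the top of the article. This is only a minimal sketch: the name group_sum is hypothetical, and it assumes the examples use 'Gfg' as the group key and 'best' and 'geeks' as the sum keys, which the article implies but does not state.

```python
def group_sum(records, grp_key, sum_keys):
    """Group records by grp_key and sum the values of sum_keys per group.

    Hypothetical helper for illustration; it mirrors the loop of Method #1.
    """
    res = {}
    for sub in records:
        # setdefault returns the existing totals for this group, or stores
        # and returns a fresh dictionary of zeros the first time it is seen
        totals = res.setdefault(sub[grp_key], {k: 0 for k in sum_keys})
        for k in sum_keys:
            totals[k] += int(sub[k])
    return res


# First example (assumed group key 'Gfg', assumed sum keys 'geeks' and 'best')
first = group_sum(
    [{'geeks': 10, 'best': 8, 'Gfg': 6}, {'geeks': 12, 'best': 11, 'Gfg': 4}],
    'Gfg', ['geeks', 'best'])
print(first)

# Second example: 'CS' is neither the group key nor a sum key, so it is ignored
second = group_sum(
    [{'CS': 14, 'best': 9, 'geeks': 7, 'Gfg': 14}],
    'Gfg', ['best', 'geeks'])
print(second)
```

The groups appear in the order in which they are first seen in the input, so the printed order of the first result may differ from the example while the contents are the same.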

Method #2: Using defaultdict

This method uses a defaultdict whose default factory is a lambda function that returns a dictionary with the keys from the sum_keys list, each initialized to 0. When a new group key is encountered, this default value is created automatically, and the values of the corresponding keys are then added to it. Finally, the res dictionary is converted to a regular dictionary using the dict() function before being printed.
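
To see the default factory at work in isolation, the short sketch below looks up group keys that do not exist yet. The numbers added here are made up for illustration and are not taken from test_list.

```python
from collections import defaultdict

sum_keys = ['best', 'geeks']
res = defaultdict(lambda: {x: 0 for x in sum_keys})

# Nothing is stored until a key is actually looked up.
print(4 in res)          # False

# Indexing a missing key calls the lambda and stores its result first,
# so the += below starts from 0 instead of raising KeyError.
res[4]['best'] += 10
print(dict(res))         # {4: {'best': 10, 'geeks': 0}}

# Every new key gets its own fresh dictionary, not a shared one.
res[1]['geeks'] += 7
print(res[4] is res[1])  # False
```

This is why the explicit "if ele not in res" check from the loop version is not needed here.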

Python3

from collections import defaultdict

# initializing list
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

# initializing group key and sum keys
grp_key = 'Gfg'
sum_keys = ['best', 'geeks']

# a group seen for the first time starts as {'best': 0, 'geeks': 0}
res = defaultdict(lambda: {x: 0 for x in sum_keys})
for sub in test_list:
    ele = sub[grp_key]
    for y in sum_keys:
        res[ele][y] += int(sub[y])

# converting to a regular dictionary before printing
print("The grouped list: ", dict(res))

Output

The grouped list:  {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

Time complexity: O(N*M), where N is the number of dictionaries in test_list and M is the number of keys to be summed (in this case 2: 'best' and 'geeks').
Auxiliary space: O(G*M), where G is the number of distinct group key values (at most N) and M is the number of sum keys, since the result stores one dictionary of M sums per group.

Method #3: Using list comprehension and the built-in function sum()

Step-by-step approach:

Define the input data, the group key, and the sum keys.
Use a list comprehension to create a new list of dictionaries, where each dictionary contains the group key and the sums of the values for the sum keys.
Print the result list.

Python3

test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

grp_key = 'Gfg'
sum_keys = ['best', 'geeks']

# one dictionary per group: the group key plus the sum of every sum key
# (sorted() makes the order of the groups predictable)
result_list = [{grp_key: key,
                **{skey: sum(sub[skey] for sub in test_list if sub[grp_key] == key)
                   for skey in sum_keys}}
               for key in sorted(set(sub[grp_key] for sub in test_list))]

print("The grouped list: ", result_list)

Output

The grouped list:  [{'Gfg': 1, 'best': 8, 'geeks': 10}, {'Gfg': 4, 'best': 21, 'geeks': 27}]

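The time complexity of this method is O(g*n*m), where n is the number of dictionaries in the input list, m is the number of keys in the sum_keys list, and g is the number of distinct groups, because the whole list is rescanned for every group and every sum key.
The auxiliary space is O(g*m), since the result stores m sums for each of the g groups.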
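
For comparison with the nested dictionary that Methods #1 and #2 produce, the same comprehension-and-sum() idea can be written as a dictionary comprehension instead of a list comprehension. This is a sketch of a variant, not the article's own code; the names groups and res are chosen here for illustration.

```python
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

grp_key = 'Gfg'
sum_keys = ['best', 'geeks']

# Distinct group values, sorted so the output order is predictable.
groups = sorted({sub[grp_key] for sub in test_list})

# Outer comprehension: one entry per group.
# Inner comprehension: one sum() per sum key, scanning the whole list each time.
res = {
    key: {skey: sum(sub[skey] for sub in test_list if sub[grp_key] == key)
          for skey in sum_keys}
    for key in groups
}

print("The grouped dictionary: ", res)
```

Like the list version, this rescans test_list once per group and sum key, so it trades speed for brevity and suits small inputs best.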

Method #4: Using the pandas library

Step-by-step approach:

Import the pandas library.
Create a DataFrame from the given list test_list.
Group the DataFrame by the group key grp_key.
Sum the values of the sum keys sum_keys for each group.
Convert the resulting DataFrame back to a dictionary.

Below is the implementation of the above approach:

Python3

# importing pandas library
import pandas as pd

# initializing list
test_list = [{'Gfg': 1, 'id': 2, 'best': 8, 'geeks': 10},
             {'Gfg': 4, 'id': 4, 'best': 10, 'geeks': 12},
             {'Gfg': 4, 'id': 8, 'best': 11, 'geeks': 15}]

# creating a DataFrame from the list
df = pd.DataFrame(test_list)

# initializing group key
grp_key = 'Gfg'

# initializing sum keys
sum_keys = ['best', 'geeks']

# grouping the DataFrame by the group key and summing the sum keys
res_df = df.groupby(grp_key)[sum_keys].sum()

# converting the resulting DataFrame back to a dictionary
res = res_df.to_dict('index')

# printing result
print("The grouped list: " + str(res))

Output

The grouped list: {1: {'best': 8, 'geeks': 10}, 4: {'best': 21, 'geeks': 27}}

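Time complexity: roughly O(n*k), where n is the number of dictionaries in test_list and k is the number of keys per dictionary, since building the DataFrame reads every value once; the grouping and summation then touch only the group and sum columns. Sorting the group keys, which groupby does by default, adds a cost that depends on the number of groups.
Auxiliary space: O(n*k), as the DataFrame holds a copy of every value in test_list; the resulting dictionary adds one entry per group.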
