
# Python – Multiple Keys Grouped Summation

• Last Updated : 17 Mar, 2023

Sometimes, when working with Python records (lists of tuples), we need to group elements that share the same values at several key positions and sum another field within each group. This kind of problem is common in data-processing applications. Let's discuss several ways in which this task can be performed.

Input :
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best')]
grp_indx = [1, 2] [ Indices to group ]
sum_idx = [0] [ Index to sum ]
Output : [('M', 'Gfg', 12), ('H', 'Gfg', 23), ('M', 'Best', 13)]

Input :
test_list = [(12, 'M', 'Gfg'), (23, 'M', 'Gfg'), (13, 'M', 'Best')]
grp_indx = [1, 2] [ Indices to group ]
sum_idx = [0] [ Index to sum ]
Output : [('M', 'Gfg', 35), ('M', 'Best', 13)]

Method 1: Using loop + defaultdict() + list comprehension

The combination of the above functionalities can be used to solve this problem. We group the records in a loop, using a defaultdict(int) keyed by the tuple of grouping values to accumulate the sums, and then use a list comprehension to build the result tuples.

## Python3

```python
# Python3 code to demonstrate working of
# Multiple Keys Grouped Summation
# Using loop + defaultdict() + list comprehension
from collections import defaultdict

# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]

# printing original list
print("The original list is : " + str(test_list))

# initializing grouping indices
grp_indx = [1, 2]

# initializing sum index
sum_idx = [0]

# Multiple Keys Grouped Summation
# Using loop + defaultdict() + list comprehension
temp = defaultdict(int)
for sub in test_list:
    temp[(sub[grp_indx[0]], sub[grp_indx[1]])] += sub[sum_idx[0]]
res = [key + (val, ) for key, val in temp.items()]

# printing result
print("The grouped summation : " + str(res))
```

Output

```
The original list is : [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'), (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
The grouped summation : [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
```

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(m), where m is the number of distinct combinations of grouping indices.
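The code in Method 1 hard-codes exactly two grouping indices. As a minimal sketch, the same loop can be wrapped in a function that accepts any number of grouping indices; the name `grouped_sum` and its parameters are illustrative assumptions, not part of the original code.

```python
from collections import defaultdict


def grouped_sum(records, group_indices, sum_index):
    """Sum record[sum_index] for each distinct combination of group_indices.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    totals = defaultdict(int)
    for record in records:
        # Build the key from however many grouping indices were given.
        key = tuple(record[i] for i in group_indices)
        totals[key] += record[sum_index]
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return [key + (total,) for key, total in totals.items()]


test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]

print(grouped_sum(test_list, [1, 2], 0))
# [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
print(grouped_sum(test_list, [1], 0))
# [('M', 66), ('H', 25)]
```

Building the key with `tuple(record[i] for i in group_indices)` keeps the time complexity at O(n) for a fixed number of grouping indices.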

Method 2: Using itertools.groupby() and a lambda function

This method first sorts the input list using the sorted() function with a lambda that extracts the values at the grouping indices. It then uses itertools.groupby() with the same key to group the sorted list; the sort is required because groupby() only merges consecutive elements with equal keys. Finally, a list comprehension iterates over each group, sums the values at the sum_idx index, and builds a tuple of the grouping values followed by the sum.

## Python3

```python
from itertools import groupby

# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]

# printing original list
print("The original list is : " + str(test_list))

# initializing grouping indices
grp_indx = [1, 2]

# initializing sum index
sum_idx = [0]

# Multiple Keys Grouped Summation
# Using itertools.groupby() and a lambda function
res = [(key[0], key[1], sum(sub[sum_idx[0]] for sub in group))
       for key, group in groupby(sorted(test_list, key=lambda x: (x[grp_indx[0]], x[grp_indx[1]])),
                                 key=lambda x: (x[grp_indx[0]], x[grp_indx[1]]))]

# printing result
print("The grouped summation : " + str(res))
```

Output

```
The original list is : [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'), (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
```

Time complexity: O(n log n) because of the sorting operation. The groupby function itself has a time complexity of O(n).
Auxiliary space: O(n).
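To see why the sort matters for Method 2, the following sketch runs the same groupby() logic on the unsorted and the sorted list. The helper names `key_func` and `summed_groups` are assumptions introduced only for this illustration.

```python
from itertools import groupby

test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]


def key_func(record):
    # Same grouping key as in Method 2: the values at indices 1 and 2.
    return (record[1], record[2])


def summed_groups(records):
    # groupby() only merges *adjacent* records that share a key.
    return [key + (sum(r[0] for r in group),)
            for key, group in groupby(records, key=key_func)]


unsorted_result = summed_groups(test_list)
sorted_result = summed_groups(sorted(test_list, key=key_func))

print(len(unsorted_result))  # 6: no two neighbours share a key, so nothing merges
print(sorted_result)
# [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
```

Without the sort, every record in this particular list ends up in its own group, so the sums are never combined.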

Method 3: Using pandas library

Pandas is a powerful Python library for data manipulation and analysis. Its DataFrame.groupby() method can group data by one or more keys and perform operations, such as summation, on the grouped data.

## Python3

```python
import pandas as pd

# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]

# creating a pandas DataFrame from the list
df = pd.DataFrame(test_list, columns=['value', 'key1', 'key2'])

# grouping by key1 and key2 and summing the values
grouped = df.groupby(['key1', 'key2'])['value'].sum()

# converting the result back to a list of tuples
res = [(key[0], key[1], value) for key, value in grouped.items()]

# printing result
print("The grouped summation : " + str(res))
```

Output

```
The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
```

Time complexity: O(n log n) in the worst case, because pandas sorts the group keys by default (sort=True) when grouping the data.
Auxiliary space: O(n) because pandas needs to create a DataFrame object to store the input data and perform the grouping operation.

Method 4: Using itertools.groupby() and operator.itemgetter()

This method sorts the list in place with list.sort() and then calls itertools.groupby(), using operator.itemgetter(*grp_indx) as the key for both, to group the elements by their keys and sum the values.

## Python3

```python
import itertools
import operator

# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]

# initializing grouping indices
grp_indx = [1, 2]

# initializing sum index
sum_idx = [0]

# Multiple Keys Grouped Summation
# Using itertools.groupby() and operator.itemgetter()
test_list.sort(key=operator.itemgetter(*grp_indx))
res = []
for k, g in itertools.groupby(test_list, key=operator.itemgetter(*grp_indx)):
    vals = [sub[sum_idx[0]] for sub in g]
    res.append(k + (sum(vals),))

# printing result
print("The grouped summation : " + str(res))
```

Output

`The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]`

Time complexity: O(n log n) due to sorting the input list in place with list.sort().
Auxiliary space: O(n) because the result list res and the temporary list vals both have a maximum size of n, where n is the number of elements in the input list.
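One detail of Method 4 is worth noting: operator.itemgetter() returns a tuple when given several indices but a bare value when given only one, so `k + (sum(vals),)` would fail for a single grouping index. The sketch below, with the hypothetical helper name `grouped_sum_sorted`, handles both cases and uses sorted() so the input list is not modified.

```python
from itertools import groupby
from operator import itemgetter

test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]


def grouped_sum_sorted(records, group_indices, sum_index):
    """Hypothetical helper: sort, group with itemgetter, and sum one field."""
    key_func = itemgetter(*group_indices)
    result = []
    # sorted() returns a new list, leaving the caller's list untouched.
    for key, group in groupby(sorted(records, key=key_func), key=key_func):
        # itemgetter yields a bare value for one index and a tuple for several,
        # so wrap the single-index case before concatenating.
        if len(group_indices) == 1:
            key = (key,)
        result.append(key + (sum(r[sum_index] for r in group),))
    return result


print(grouped_sum_sorted(test_list, [1, 2], 0))
# [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
print(grouped_sum_sorted(test_list, [1], 0))
# [('H', 25), ('M', 66)]
```

Summing with a generator expression also avoids building the temporary `vals` list for each group.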

Method 5: Using dictionary comprehension

1. Initialize the input list, grouping indices and sum index.
2. Use a dictionary comprehension to initialize a dictionary whose keys are tuples of the values at the grouping indices and whose values are 0.
3. Traverse each tuple in the input list and add the value at the sum index to the dictionary entry for its key.
4. Convert the dictionary to a list of tuples, each containing the grouping values followed by the sum.
5. Print the result.

## Python3

```python
# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]

# initializing grouping indices
grp_indx = [1, 2]

# initializing sum index
sum_idx = [0]

# Multiple Keys Grouped Summation
# Using dictionary comprehension
temp = {(sub[grp_indx[0]], sub[grp_indx[1]]): 0 for sub in test_list}
for sub in test_list:
    temp[(sub[grp_indx[0]], sub[grp_indx[1]])] += sub[sum_idx[0]]
res = [key + (val,) for key, val in temp.items()]

# printing result
print("The grouped summation: " + str(res))
```

Output

`The grouped summation: [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]`

Time complexity: O(n), where n is the length of the input list.
Auxiliary Space: O(m), where m is the number of unique combinations of grouping indices.
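Method 5 walks the list twice: once to pre-fill the dictionary with zeros and once to accumulate. As a possible variation (not part of the original method), a plain dict with dict.get() can do the same work in a single pass.

```python
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
grp_indx = [1, 2]
sum_idx = [0]

temp = {}
for sub in test_list:
    key = tuple(sub[i] for i in grp_indx)
    # get() supplies 0 the first time a key is seen, so no pre-filled dict is needed.
    temp[key] = temp.get(key, 0) + sub[sum_idx[0]]

res = [key + (val,) for key, val in temp.items()]
print("The grouped summation: " + str(res))
# The grouped summation: [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
```

The asymptotic cost is unchanged, O(n) time and O(m) space, but the list is traversed only once.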
