Python – Value nested grouping on List

Last Updated : 05 Apr, 2023

Sometimes, while working with data, we have flat data in the form of a list of dictionaries and need to group those dictionaries into a nested structure. Here each dictionary holds a ‘value’, and an optional ‘tag’ names the parent group that the value belongs to. This has applications in domains that involve data, such as web development and data science. Let’s discuss several ways in which this task can be performed.

Method 1: Using defaultdict() + setdefault() + loop

This is a straightforward loop-based approach. We initialize a defaultdict() whose default value is an empty dictionary, so that every name gets its own dictionary for holding nested records. For each item, setdefault() links the item’s dictionary into its parent’s dictionary when a ‘tag’ is present; otherwise the item is registered as a top-level group.

Python3

# Python3 code to demonstrate working of
# Value nested grouping on List
# Using loop + setdefault() + defaultdict()
 
from collections import defaultdict
 
# initializing list
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]
 
 
# printing original list
print("The original list is : " + str(test_list))
 
# Value nested grouping on List
# Using loop + setdefault() + defaultdict()
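# temp maps every name to one shared dict, created on first access
temp = defaultdict(dict)
res = {}
for sub in test_list:
    val = sub['value']  # avoid shadowing the built-in type()
    if 'tag' in sub:
        # nest the value's dict inside its parent's dict
        temp[sub['tag']].setdefault(val, temp[val])
    else:
        # untagged values become top-level groups
        res[val] = temp[val]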
 
# printing result
print("The dictionary after grouping : " + str(res))

Output :

The original list is : [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'}, {'value': 'Car'}, {'tag': 'Car', 'value': 'maruti'}, {'tag': 'Fruit', 'value': 'orange'}, {'tag': 'Car', 'value': 'city'}]
The dictionary after grouping : {'Fruit': {'mango': {}, 'orange': {}}, 'Car': {'maruti': {}, 'city': {}}}

Time complexity: O(n), where n is the number of elements in the input list.
Auxiliary space: O(n), where n is the number of elements in the input list
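
Because defaultdict(dict) hands out one shared dictionary per name, the same loop can also build deeper nesting. The following is a minimal sketch under the assumption that a value may itself be used as a ‘tag’ by later records; the three-level sample data and the names `records`, `nodes` and `tree` are illustrative and not part of the original example.

```python
from collections import defaultdict

# Hypothetical three-level data: 'mango' is itself the parent of 'alphonso'.
records = [{'value': 'Fruit'},
           {'tag': 'Fruit', 'value': 'mango'},
           {'tag': 'mango', 'value': 'alphonso'}]

nodes = defaultdict(dict)  # one shared dict object per name
tree = {}
for rec in records:
    name = rec['value']
    if 'tag' in rec:
        # link the child's dict into the parent's dict (same object, not a copy)
        nodes[rec['tag']].setdefault(name, nodes[name])
    else:
        # untagged records become top-level groups
        tree[name] = nodes[name]

print(tree)  # {'Fruit': {'mango': {'alphonso': {}}}}
```

The nested dictionaries are shared references, so a child added to `nodes['mango']` later shows up inside `tree['Fruit']['mango']` without any extra bookkeeping.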

Method 2: Using itertools.groupby()

Use the groupby() function to group the list of dictionaries by the ‘tag’ key. For every tag, the ‘value’ entries of its group are collected into a set. Dictionaries without a ‘tag’ key are top-level categories; each one is added to the result with its ‘value’ as the key and an empty dictionary as the value, which is replaced once a tagged group for that key is found.

Python3

from itertools import groupby

# initializing list
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

# printing original list
print("The original list is : " + str(test_list))

# Value nested grouping on List
# dictionaries without 'tag' get '' as key, so no KeyError is raised
get_tag = lambda d: d.get('tag', '')
res = {}
# groupby() only groups consecutive items, so sort by the same key first
for k, g in groupby(sorted(test_list, key=get_tag), key=get_tag):
    if k:
        res[k] = {i['value'] for i in g}
    else:
        res.update({i['value']: {} for i in g})
 
# printing result
print("The dictionary after grouping : " + str(res))

Output:

The original list is : [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'}, {'value': 'Car'}, {'tag': 'Car', 'value': 'maruti'}, {'tag': 'Fruit', 'value': 'orange'}, {'tag': 'Car', 'value': 'city'}]
The dictionary after grouping : {'Fruit': {'mango', 'orange'}, 'Car': {'city', 'maruti'}}

Time complexity: O(n log n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
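
The O(n log n) cost comes from sorting: itertools.groupby() only merges consecutive items with equal keys, so unsorted input yields one group per run rather than one group per key. A small sketch of that behaviour, using a plain list of tag names chosen for illustration:

```python
from itertools import groupby

tags = ['Fruit', 'Car', 'Fruit', 'Car']

# Unsorted input: every change of key starts a new group.
unsorted_groups = [(k, list(g)) for k, g in groupby(tags)]

# Sorted input: equal keys are adjacent, so each key appears once.
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(tags))]

print(unsorted_groups)
# [('Fruit', ['Fruit']), ('Car', ['Car']), ('Fruit', ['Fruit']), ('Car', ['Car'])]
print(sorted_groups)
# [('Car', ['Car', 'Car']), ('Fruit', ['Fruit', 'Fruit'])]
```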

Method 3: Using setdefault() while iterating over the list

Python3

test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]
 
# Grouping the list based on 'tag' and 'value' keys
res = {}
for d in test_list:
    if 'tag' in d:
        res.setdefault(d['tag'], {}).setdefault('value', set()).add(d['value'])
    else:
        res.setdefault(d['value'], {})
 
# printing result
print("The dictionary after grouping : " + str(res))

Output

The dictionary after grouping : {'Fruit': {'value': {'mango', 'orange'}}, 'Car': {'value': {'maruti', 'city'}}}

Note: In this method, a dictionary without a ‘tag’ key only creates an empty dictionary for its ‘value’ in the result. Dictionaries with a ‘tag’ key then add their values to a set stored under the ‘value’ key of the parent’s dictionary.

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(m), where m is the number of unique ‘tag’ and ‘value’ keys in the input list.

Method 4: Using pandas library

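Pandas is a popular library for data manipulation and analysis, which includes powerful tools for grouping and aggregating data. Building a DataFrame from the list leaves NaN in the ‘tag’ column for dictionaries that have no tag, and groupby() drops NaN keys by default, so only the tagged values end up in the groups.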

Python3

import pandas as pd
 
# initializing list
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]
 
# creating a DataFrame from the list
df = pd.DataFrame(test_list)
 
# grouping and aggregating data
res = df.groupby('tag')['value'].apply(set).to_dict()
 
# printing result
print("The dictionary after grouping : " + str(res))

Output:

The dictionary after grouping : {'Car': {'city', 'maruti'}, 'Fruit': {'orange', 'mango'}}

Time complexity: O(n log n), where n is the length of the input list.
Auxiliary Space: O(n + m), where m is the number of distinct tags.

Method 5: Using defaultdict() from the collections module and a for loop

The program groups the values in the input list based on the ‘tag’ key using a defaultdict with set as the default value. It then converts the defaultdict to a regular dictionary and prints the result.

Python3

from collections import defaultdict
 
# initializing list
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]
 
# creating a defaultdict with set as the default value
res = defaultdict(set)
 
# iterating over the list and grouping the values
for item in test_list:
    if 'tag' in item:
        res[item['tag']].add(item['value'])
 
# converting the defaultdict to a regular dictionary
res = dict(res)
 
# printing result
print("The dictionary after grouping : " + str(res))

Output

The dictionary after grouping : {'Fruit': {'mango', 'orange'}, 'Car': {'city', 'maruti'}}

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
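
Since the loop above only handles items that have a ‘tag’, a category with no tagged children would not appear in the result at all. If such empty categories should be kept, one possible variation is sketched below; the ‘Bike’ entry and the names `records` and `groups` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: 'Bike' is a category without any tagged children.
records = [{'value': 'Fruit'},
           {'tag': 'Fruit', 'value': 'mango'},
           {'value': 'Bike'}]

groups = defaultdict(set)
for rec in records:
    if 'tag' in rec:
        groups[rec['tag']].add(rec['value'])
    else:
        # merely accessing the key makes defaultdict create an empty set
        _ = groups[rec['value']]

groups = dict(groups)
print(groups)  # {'Fruit': {'mango'}, 'Bike': set()}
```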

Method 6: Using a dictionary comprehension over a list comprehension

For every dictionary that has a ‘tag’ key, an inner list comprehension collects the ‘value’ of each dictionary sharing that tag, giving a list of (tag, values) pairs.
A dictionary comprehension turns these pairs into the result, storing each group as a set under a ‘value’ key; repeated tags simply overwrite the same entry.
A final loop adds every ‘value’ that is not yet a key in the result with an empty dictionary, so the grouped values themselves also appear as empty top-level entries.

Python3

test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]
 
res = {k: {'value': set(v)} for k, v in
       [(d['tag'], [i['value'] for i in test_list if i.get('tag') == d['tag']])
        for d in test_list if 'tag' in d]}
 
for item in test_list:
    if 'value' in item and item['value'] not in res:
        res[item['value']] = {}
 
print("The dictionary after grouping: " + str(res))

Output

The dictionary after grouping: {'Fruit': {'value': {'mango', 'orange'}}, 'Car': {'value': {'city', 'maruti'}}, 'mango': {}, 'maruti': {}, 'orange': {}, 'city': {}}

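Time complexity: O(n^2), because the inner list comprehension rescans the whole list for every dictionary that has a ‘tag’ key.
Auxiliary space: O(n^2) in the worst case, since the intermediate list stores one copy of a group’s values for each dictionary in that group.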
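
For comparison, a lambda that extracts (tag, value) tuples can be combined with itertools.groupby() to build the groups in O(n log n). This is only a sketch of that idea, not the code shown above: it keeps the values as lists, ignores untagged dictionaries, and the names `records`, `to_pair`, `pairs` and `grouped` are illustrative.

```python
from itertools import groupby

records = [{'value': 'Fruit'},
           {'tag': 'Fruit', 'value': 'mango'},
           {'value': 'Car'},
           {'tag': 'Car', 'value': 'maruti'},
           {'tag': 'Fruit', 'value': 'orange'},
           {'tag': 'Car', 'value': 'city'}]

# lambda returning a (tag, value) tuple; tag is None when the key is absent
to_pair = lambda d: (d.get('tag'), d['value'])

# drop untagged records, then sort by tag so that equal tags are adjacent
pairs = sorted((p for p in map(to_pair, records) if p[0] is not None),
               key=lambda p: p[0])

# one entry per tag, holding the list of its values
grouped = {tag: [value for _, value in group]
           for tag, group in groupby(pairs, key=lambda p: p[0])}

print(grouped)  # {'Car': ['maruti', 'city'], 'Fruit': ['mango', 'orange']}
```

Python's sort is stable, so the values within each tag keep the order in which they appeared in the input.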
