Python – Value nested grouping on List
Last Updated :
05 Apr, 2023
Sometimes, while working with data, we have flat data in the form of a list of dictionaries and need to group those dictionaries into categories according to their ids (here, the 'tag' key). This has applications in data-heavy domains such as web development and data science. Let's discuss several ways in which this task can be performed.
Method 1: Using defaultdict() + setdefault() + loop
The combination of the above functionalities can be used to perform this task. It is a brute-force approach. We initialize a defaultdict() whose default values are dictionaries, so that nested records can be formed, and populate it with setdefault(): a record with a 'tag' key is nested under that tag, while a record without one becomes a top-level key of the result.
Python3
from collections import defaultdict

# Records without 'tag' are categories; records with 'tag' belong to one
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

print("The original list is : " + str(test_list))

# temp maps every name seen so far to its (possibly empty) dict of children
temp = defaultdict(dict)
res = {}
for sub in test_list:
    name = sub['value']
    if 'tag' in sub:
        # tagged record: nest its value under the tag
        temp[sub['tag']].setdefault(name, temp[name])
    else:
        # untagged record: expose it as a top-level category
        res[name] = temp[name]

print("The dictionary after grouping : " + str(res))
Output :
The original list is : [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'}, {'value': 'Car'}, {'tag': 'Car', 'value': 'maruti'}, {'tag': 'Fruit', 'value': 'orange'}, {'tag': 'Car', 'value': 'city'}]
The dictionary after grouping : {'Fruit': {'mango': {}, 'orange': {}}, 'Car': {'maruti': {}, 'city': {}}}
Time complexity: O(n), where n is the number of elements in the input list.
Auxiliary space: O(n), where n is the number of elements in the input list.
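The method above works because `res` and `temp` hold references to the same dictionary objects, so a child added later through `temp` is also visible in `res`. The following is a minimal sketch of that behaviour, not part of the original program; the helper name `nest_by_tag` and the three-level sample records (including 'alphonso') are assumptions made for illustration.

```python
from collections import defaultdict


def nest_by_tag(records):
    """Hypothetical helper wrapping the loop from Method 1."""
    temp = defaultdict(dict)  # name -> dict of children, created on first access
    res = {}
    for sub in records:
        name = sub['value']
        if 'tag' in sub:
            # Store the child's own dict under its parent; both sides share it
            temp[sub['tag']].setdefault(name, temp[name])
        else:
            # Top-level category: res keeps a reference, not a copy
            res[name] = temp[name]
    return res


# Illustrative data: 'alphonso' is tagged with 'mango', giving a third level
records = [{'value': 'Fruit'},
           {'tag': 'Fruit', 'value': 'mango'},
           {'tag': 'mango', 'value': 'alphonso'}]

result = nest_by_tag(records)
print(result)  # {'Fruit': {'mango': {'alphonso': {}}}}
```

Since only references are stored, the nesting can grow to any depth without extra bookkeeping.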
Method 2: Using itertools.groupby()
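Use the groupby() function to group the list of dictionaries by the 'tag' key. Because groupby() only merges adjacent items with equal keys, the list is first sorted by tag; dictionaries without a 'tag' key are given an empty string as their key, so they sort first and never raise a KeyError. For each group that has a tag, the 'value' entries are collected into a set stored under that tag. For the group without a tag, each 'value' becomes a key mapped to an empty dictionary, which is replaced when the matching tag group is processed.
Python3
from itertools import groupby

test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

print("The original list is : " + str(test_list))


def get_tag(record):
    # Records without 'tag' map to '' so sorting and grouping never fail
    return record.get('tag', '')


res = {}
# groupby() only merges adjacent records, so sort by tag first
for k, g in groupby(sorted(test_list, key=get_tag), key=get_tag):
    if k:
        # tagged records: collect their values into a set under the tag
        res[k] = {i['value'] for i in g}
    else:
        # untagged records: register each category with an empty placeholder
        res.update({i['value']: {} for i in g})

print("The dictionary after grouping : " + str(res))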
Output:
The original list is : [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'}, {'value': 'Car'}, {'tag': 'Car', 'value': 'maruti'}, {'tag': 'Fruit', 'value': 'orange'}, {'tag': 'Car', 'value': 'city'}]
The dictionary after grouping : {'Fruit': {'mango', 'orange'}, 'Car': {'city', 'maruti'}}
Time complexity: O(n log n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
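itertools.groupby() starts a new group every time the key changes, so equal keys are merged only when they are adjacent. The short sketch below, using a plain list of tags as illustrative data, shows the difference between grouping unsorted and sorted input.

```python
from itertools import groupby

# Illustrative data: the tags in the order they appear in the sample list
tags = ['Fruit', 'Car', 'Fruit', 'Car']

# Unsorted input: every change of key starts a new group
unsorted_groups = [(k, list(g)) for k, g in groupby(tags)]

# Sorted input: equal keys are adjacent, so each key yields a single group
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(tags))]

print(unsorted_groups)
# [('Fruit', ['Fruit']), ('Car', ['Car']), ('Fruit', ['Fruit']), ('Car', ['Car'])]
print(sorted_groups)
# [('Car', ['Car', 'Car']), ('Fruit', ['Fruit', 'Fruit'])]
```

This is also where the O(n log n) time bound comes from: the grouping pass itself is linear, and the sort dominates.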
Method 3: Using dict.setdefault() and a loop over the list
Python3
test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

res = {}
for d in test_list:
    if 'tag' in d:
        # tagged record: add its value to the set stored at res[tag]['value']
        res.setdefault(d['tag'], {}).setdefault('value', set()).add(d['value'])
    else:
        # untagged record: make sure the category exists as a top-level key
        res.setdefault(d['value'], {})

print("The dictionary after grouping : " + str(res))
Output
The dictionary after grouping : {'Fruit': {'value': {'mango', 'orange'}}, 'Car': {'value': {'maruti', 'city'}}}
Note: In this method, a dictionary without a 'tag' key only registers its 'value' as a top-level key mapped to an empty dictionary. The nested 'value' set is created when the first tagged item for that key is processed.
Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(m), where m is the number of unique 'tag' and 'value' entries in the input list.
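The chained call in this method relies on dict.setdefault() returning the value stored under the key, whether it was just inserted or already present. A minimal sketch that splits the chain into separate steps (the intermediate names `inner` and `members` are introduced here only for illustration):

```python
res = {}

# First call: 'Fruit' is missing, so {} is inserted and returned
inner = res.setdefault('Fruit', {})

# Same idea one level down: set() is inserted under 'value' and returned
members = inner.setdefault('value', set())
members.add('mango')

# Second pass: both keys exist, so the defaults are ignored and the
# existing objects are returned and updated in place
res.setdefault('Fruit', {}).setdefault('value', set()).add('orange')

print(res == {'Fruit': {'value': {'mango', 'orange'}}})  # True
```

Note that the default argument is still evaluated on every call, so a new empty dict and set are created and discarded whenever the key already exists.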
Method 4: Using pandas library
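Pandas is a popular library for data manipulation and analysis, which includes powerful tools for grouping and aggregating data. Here the list is loaded into a DataFrame, grouped by the 'tag' column, and each group's 'value' entries are collected into a set. Rows without a tag hold NaN in that column and are excluded by groupby() by default, so the untagged category records do not appear as separate keys.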
Python3
import pandas as pd

test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

df = pd.DataFrame(test_list)
res = df.groupby('tag')['value'].apply(set).to_dict()

print("The dictionary after grouping : " + str(res))
Output:
The dictionary after grouping : {'Car': {'city', 'maruti'}, 'Fruit': {'orange', 'mango'}}
Time complexity: O(n log n)
Auxiliary space: O(n + m), where n is the length of the input list and m is the number of unique tags.
Method 5: Using the collections module's defaultdict() and a for loop to group the values based on the 'tag' key.
The program groups the values in the input list based on the ‘tag’ key using a defaultdict with set as the default value. It then converts the defaultdict to a regular dictionary and prints the result.
Python3
from collections import defaultdict

test_list = [{'value': 'Fruit'},
             {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

# Missing tags get an empty set automatically on first access
res = defaultdict(set)
for item in test_list:
    if 'tag' in item:
        res[item['tag']].add(item['value'])

# Convert to a regular dictionary before printing
res = dict(res)

print("The dictionary after grouping : " + str(res))
Output
The dictionary after grouping : {'Fruit': {'mango', 'orange'}, 'Car': {'city', 'maruti'}}
Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
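Because the loop in this method skips records without a 'tag' key, a category that has no tagged members does not appear in the result at all. If such categories should be kept, one possible variation is sketched below; the 'Toy' record is made-up sample data, and this is an illustration under that assumption rather than part of the original program.

```python
from collections import defaultdict

# Illustrative data: 'Toy' is a category with no tagged members
records = [{'value': 'Fruit'},
           {'tag': 'Fruit', 'value': 'mango'},
           {'value': 'Toy'}]

res = defaultdict(set)
for item in records:
    if 'tag' in item:
        res[item['tag']].add(item['value'])
    else:
        # Accessing the key is enough for defaultdict to create an empty set
        res[item['value']]

res = dict(res)
print(res)  # {'Fruit': {'mango'}, 'Toy': set()}
```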
Method 6: Using a dictionary comprehension and a nested list comprehension
- For every dictionary that has a 'tag' key, use a list comprehension to collect the 'value' entries of all dictionaries that share that tag, giving a (tag, values) tuple.
- Use a dictionary comprehension to turn those tuples into a dictionary whose keys are the tags and whose values have the form {'value': set_of_values}. Repeated tags simply overwrite the same key.
- Loop over the input list once more and add every 'value' that is not yet a key of the result, mapped to an empty dictionary. Besides untagged categories, this also registers the tagged values themselves, such as 'mango', as top-level keys.
Python3
test_list = [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'}, {'value': 'Car'},
             {'tag': 'Car', 'value': 'maruti'}, {'tag': 'Fruit', 'value': 'orange'},
             {'tag': 'Car', 'value': 'city'}]

# For each tagged record, pair its tag with the values of all records sharing that tag
res = {k: {'value': set(v)} for k, v in
       [(d['tag'], [i['value'] for i in test_list if i.get('tag') == d['tag']])
        for d in test_list if 'tag' in d]}

# Add every value that is not yet a key, mapped to an empty dictionary
for item in test_list:
    if 'value' in item and item['value'] not in res:
        res[item['value']] = {}

print("The dictionary after grouping: " + str(res))
Output
The dictionary after grouping: {'Fruit': {'value': {'mango', 'orange'}}, 'Car': {'value': {'city', 'maruti'}}, 'mango': {}, 'maruti': {}, 'orange': {}, 'city': {}}
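Time complexity: O(n^2), since the inner list comprehension scans the whole input list once for every tagged dictionary.
Auxiliary space: O(n^2) in the worst case for the intermediate list of (tag, values) tuples, because every tagged dictionary gets its own copy of its group's values, and O(n) for the final dictionary.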
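A variant that extracts (tag, value) tuples with a lambda function and groups them with itertools.groupby() could look like the sketch below. It is an illustration under stated assumptions, not the program shown above: the names `to_pair`, `pairs` and `tagged` are introduced here, untagged records are dropped before sorting because None cannot be compared with strings, and the values are kept as lists rather than sets.

```python
from itertools import groupby

test_list = [{'value': 'Fruit'}, {'tag': 'Fruit', 'value': 'mango'},
             {'value': 'Car'}, {'tag': 'Car', 'value': 'maruti'},
             {'tag': 'Fruit', 'value': 'orange'}, {'tag': 'Car', 'value': 'city'}]

# Map each record to (tag, value); the tag is None when the key is absent
to_pair = lambda d: (d.get('tag'), d['value'])
pairs = [to_pair(d) for d in test_list]

# Keep only tagged pairs and sort them so equal tags become adjacent
tagged = sorted((p for p in pairs if p[0] is not None), key=lambda p: p[0])

# Build {tag: [values]} from the consecutive groups
res = {tag: [value for _, value in group]
       for tag, group in groupby(tagged, key=lambda p: p[0])}

print(res)  # {'Car': ['maruti', 'city'], 'Fruit': ['mango', 'orange']}
```

Sorting makes this variant O(n log n), and because Python's sort is stable, the values within each tag keep their original order.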