
3 Level Nested Dictionary To Multiindex Pandas Dataframe

Last Updated : 28 Mar, 2024

Pandas is a powerful data manipulation and analysis library for Python. One of its key features is the ability to handle hierarchical indexing, also known as MultiIndexing. This allows you to work with more complex, multi-dimensional data in a structured way. In this article, we will explore how to convert a nested dictionary into a Pandas DataFrame with a 3-level MultiIndex.

A nested dictionary is a dictionary whose values are themselves dictionaries. In Pandas, a 3-level MultiIndex DataFrame is a DataFrame whose row index has three levels. This structure is useful when each value is identified by a combination of several categories, such as the three keys on the path to a value in a 3-level nested dictionary.
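To make the idea of a three-level row index concrete, here is a minimal sketch. The labels, the level names (outer, middle, inner) and the column name value are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical labels, used only to illustrate a 3-level row index
index = pd.MultiIndex.from_product(
    [["A", "B"], ["X", "Y"], ["i", "ii"]],
    names=["outer", "middle", "inner"],
)
example = pd.DataFrame({"value": range(10, 90, 10)}, index=index)

# The row index has three levels
n_levels = example.index.nlevels

# One row is addressed by giving a label for every level
single = example.loc[("A", "Y", "ii"), "value"]

# Giving only the outer label selects all rows beneath it
subset = example.loc["B"]

print(n_levels, single)
print(subset)
```

Each row is addressed by a tuple of three labels, which is the same shape as the key path to a value in a 3-level nested dictionary.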

Nested Dictionary To Multiindex Pandas Dataframe

Below are four ways to convert a 3-level nested dictionary to a MultiIndex Pandas DataFrame:

  • Using the from_dict Method
  • Using the concat Function
  • Using the Tuple Method
  • Using a List and pd.Series Method

Create a Nested Dictionary

Let’s consider a nested dictionary with three levels of keys:

Python3




import pandas as pd  # needed by all the examples that follow

data = {
    'A': {
        'X': {
            'i': 10,
            'ii': 20,
        },
        'Y': {
            'i': 30,
            'ii': 40,
        },
    },
    'B': {
        'X': {
            'i': 50,
            'ii': 60,
        },
        'Y': {
            'i': 70,
            'ii': 80,
        },
    },
}


Using the from_dict Method

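The code flattens the nested dictionary (data) into a dictionary keyed by (level1, level2, level3) tuples and passes it to DataFrame.from_dict with orient='index', so each tuple becomes one row label. The index is then set explicitly to a 3-level MultiIndex built from those tuples. Because no column name is given, the single value column is labeled 0.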

Python3




# Flatten the nested dictionary into {(level1, level2, level3): value}
# and create a DataFrame with one row per tuple key
df = pd.DataFrame.from_dict({(level1, level2, level3): value
                             for level1, inner_dict in data.items()
                             for level2, inner_inner_dict in inner_dict.items()
                             for level3, value in inner_inner_dict.items()}, orient='index')

# Make the 3-level MultiIndex explicit
df.index = pd.MultiIndex.from_tuples(df.index)

# Display the resulting DataFrame
print(df)


Output:

         0
A X i   10
    ii  20
  Y i   30
    ii  40
B X i   50
    ii  60
  Y i   70
    ii  80
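In the output above, the value column is labeled 0 and the index levels have no names. The following is a minimal sketch of one way to label both, assuming the same data as before. The level names (outer, middle, inner) and the column name value are hypothetical choices, not part of the original example.

```python
import pandas as pd

data = {
    "A": {"X": {"i": 10, "ii": 20}, "Y": {"i": 30, "ii": 40}},
    "B": {"X": {"i": 50, "ii": 60}, "Y": {"i": 70, "ii": 80}},
}

# Flatten to {(outer, middle, inner): value}
flat = {
    (k1, k2, k3): value
    for k1, d1 in data.items()
    for k2, d2 in d1.items()
    for k3, value in d2.items()
}

# columns= names the single value column instead of the default 0
df = pd.DataFrame.from_dict(flat, orient="index", columns=["value"])

# Build the MultiIndex from the tuple keys and name its levels (names are assumptions)
df.index = pd.MultiIndex.from_tuples(list(flat), names=["outer", "middle", "inner"])

print(df)
```

Named levels make later selections easier to read, because a level can be referred to by name rather than by position.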

Using the concat Function

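This code builds one DataFrame per outer key, stacks each into a Series, and concatenates the pieces with pd.concat, using the outer keys as a new first index level. Note that the result is a Pandas Series with a 3-level MultiIndex rather than a DataFrame, and that the levels are ordered (outer, innermost, middle) instead of following the nesting of the dictionary. Calling df.to_frame() on the result converts it to a single-column DataFrame.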

Python3




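# Stack each outer key's sub-dictionary into a Series indexed by (innermost, middle)
dfs = [pd.DataFrame(data[i]).stack() for i in data.keys()]
# Concatenate under the outer keys; the unstack/stack round trip leaves the values unchanged
df = pd.concat(dfs, keys=data.keys()).unstack().stack(level=[0])
print(df)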


Output:

A  i   X    10
       Y    30
   ii  X    20
       Y    40
B  i   X    50
       Y    70
   ii  X    60
       Y    80
dtype: int64
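In the output above, the innermost keys (i, ii) come before the middle keys (X, Y). The following is a hedged sketch of how the levels could be put back in nesting order and the result turned into a DataFrame, assuming the same data. The column name value is an assumption, and sort_index orders rows by label, which may differ from the dictionary's insertion order for other data.

```python
import pandas as pd

data = {
    "A": {"X": {"i": 10, "ii": 20}, "Y": {"i": 30, "ii": 40}},
    "B": {"X": {"i": 50, "ii": 60}, "Y": {"i": 70, "ii": 80}},
}

# Stack each outer key's sub-dictionary and concatenate under the outer keys
series = pd.concat(
    [pd.DataFrame(inner).stack() for inner in data.values()],
    keys=list(data),
)

# Levels are (outer, innermost, middle); swap the last two, sort by label,
# and convert the Series to a one-column DataFrame
df = series.swaplevel(1, 2).sort_index().to_frame("value")

print(df)
```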

Using the Tuple Method

This code flattens the nested dictionary (data) with a dictionary comprehension, producing a dictionary (flat) whose keys are (outer, middle, innermost) tuples. It then builds a MultiIndex with named levels from those keys and uses it as the index of a DataFrame (df) that holds the values in a single column.

Python3




# Map each (outer, middle, innermost) key path to its value
flat = {(outerKey, innerKey, innermostKey): values
        for outerKey, innerDict in data.items()
        for innerKey, innerDict2 in innerDict.items()
        for innermostKey, values in innerDict2.items()}
# Build a named 3-level MultiIndex from the tuple keys
multi = pd.MultiIndex.from_tuples(list(flat.keys()), names=['first', 'second', 'third'])
df = pd.DataFrame(list(flat.values()), index=multi, columns=['value'])
print(df)


Output:

                    value
first second third
A     X      i         10
             ii        20
      Y      i         30
             ii        40
B     X      i         50
             ii        60
      Y      i         70
             ii        80
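Naming the index levels, as this method does, lets later code refer to a level by name. Here is a minimal sketch of two such lookups, assuming the same data and the level names first, second and third used above. The choice of xs and groupby is illustrative, not part of the original example.

```python
import pandas as pd

data = {
    "A": {"X": {"i": 10, "ii": 20}, "Y": {"i": 30, "ii": 40}},
    "B": {"X": {"i": 50, "ii": 60}, "Y": {"i": 70, "ii": 80}},
}

flat = {
    (k1, k2, k3): value
    for k1, d1 in data.items()
    for k2, d2 in d1.items()
    for k3, value in d2.items()
}
multi = pd.MultiIndex.from_tuples(list(flat), names=["first", "second", "third"])
df = pd.DataFrame({"value": list(flat.values())}, index=multi)

# Cross-section: every row whose innermost label is "ii"
ii_rows = df.xs("ii", level="third")

# Aggregate the values over one named level
totals = df.groupby(level="first")["value"].sum()

print(ii_rows)
print(totals)
```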

Using a List and pd.Series Method

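This code flattens the nested dictionary (data) with a dictionary comprehension (flat), puts the key tuples into a DataFrame (multi) with one column per level, and converts that DataFrame to a MultiIndex with pd.MultiIndex.from_frame. The values are stored in a Series indexed by this MultiIndex. The final reset_index() call moves the three levels back into ordinary columns, which is why the output below is a flat table; to keep the MultiIndex, use values.to_frame() instead.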

Python3




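# Map each (outer, middle, innermost) key path to its value
flat = {(outerKey, innerKey, innermostKey): values
        for outerKey, innerDict in data.items()
        for innerKey, innerDict2 in innerDict.items()
        for innermostKey, values in innerDict2.items()}
# One column per index level
multi = pd.DataFrame(list(flat.keys()), columns=['first', 'second', 'third'])
multi_index = pd.MultiIndex.from_frame(multi)
# Index the values by the MultiIndex (not by the DataFrame it was built from)
values = pd.Series(list(flat.values()),
                   index=multi_index,
                   name='value')
# Move the index levels into ordinary columns
df = values.reset_index()
print(df)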


Output:

  first second third  value
0     A      X     i     10
1     A      X    ii     20
2     A      Y     i     30
3     A      Y    ii     40
4     B      X     i     50
5     B      X    ii     60
6     B      Y     i     70
7     B      Y    ii     80
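The flat table above and a MultiIndex DataFrame hold the same information, and it is possible to move between the two. The following is a minimal sketch under the same data, using Series.to_frame to keep the index levels and DataFrame.set_index to rebuild them from columns. The variable names indexed, flat_df and restored are hypothetical.

```python
import pandas as pd

data = {
    "A": {"X": {"i": 10, "ii": 20}, "Y": {"i": 30, "ii": 40}},
    "B": {"X": {"i": 50, "ii": 60}, "Y": {"i": 70, "ii": 80}},
}

flat = {
    (k1, k2, k3): value
    for k1, d1 in data.items()
    for k2, d2 in d1.items()
    for k3, value in d2.items()
}

levels = pd.DataFrame(list(flat), columns=["first", "second", "third"])
multi_index = pd.MultiIndex.from_frame(levels)
values = pd.Series(list(flat.values()), index=multi_index, name="value")

# to_frame() keeps the three index levels
indexed = values.to_frame()

# reset_index() flattens the levels into columns; set_index() rebuilds them
flat_df = values.reset_index()
restored = flat_df.set_index(["first", "second", "third"])

print(indexed)
print(restored.index.nlevels)
```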

