
Normalizing Nested JSON Object Into Pandas DataFrame

Last Updated : 08 Feb, 2024

A nested JSON object is a JSON object that contains one or more other JSON objects (or arrays of objects) as values. Data often arrives in this form, yet most analysis needs it in tabular form so that we can apply functions to the dataset. Pandas, the Python library for working with tabular data, handles this well, but a nested JSON object must first be normalized before it fits cleanly into a Pandas DataFrame. This article shows how to do that.

Normalizing Nested JSON Objects

Normalizing nested JSON objects refers to restructuring the data into a flat format, typically with key-value pairs, to simplify analysis or storage. This process involves expanding nested structures, such as arrays or objects within objects, into separate entities. Normalization aids in easier querying, indexing, and processing of JSON data.
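To make the idea of a flat format concrete, here is a minimal sketch in plain Python. The flatten helper and the sample record are hypothetical and written only for illustration; this is not how Pandas implements normalization internally.

```python
def flatten(obj, parent_key="", sep="."):
    """Recursively flatten nested dicts into a single-level dict.

    Hypothetical helper for illustration only.
    """
    flat = {}
    for key, value in obj.items():
        # Build the flattened key, e.g. "author" + "." + "firstname"
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects and merge their flattened keys
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept as values
            flat[full_key] = value
    return flat


record = {"id": "123", "author": {"firstname": "Jane", "lastname": "Doe"}}
flat_record = flatten(record)
print(flat_record)
# {'id': '123', 'author.firstname': 'Jane', 'author.lastname': 'Doe'}
```

Each nested key becomes part of a single flat key, which is the shape a table column needs.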

Normalizing Nested JSON Object Into Pandas DataFrame

Importing Pandas

Python3




import pandas as pd


Using json_normalize

Normalizing a nested JSON object into a Pandas DataFrame means converting its hierarchical structure into a table. The json_normalize() function in Pandas does this by flattening nested dictionaries into columns whose names join the nested keys with a dot (for example, author.firstname). Lists are left as single cell values by default. The resulting DataFrame is easier to manipulate, analyze, and visualize with the rest of Pandas.

Syntax:

df = pandas.json_normalize(json_object)

Here, json_object is the nested JSON object (a dictionary or a list of dictionaries) to convert to a Pandas DataFrame.
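Beyond the basic call, json_normalize also accepts the optional sep and max_level parameters. The sketch below uses a small made-up record (not from this article) to show how they change the resulting column names.

```python
import pandas as pd

# Hypothetical record with two levels of nesting, for illustration only
record = {"id": "123", "author": {"name": {"first": "Jane", "last": "Doe"}}}

# Default: nested keys are joined with "."
df_default = pd.json_normalize(record)

# sep changes the separator used in the flattened column names
df_underscore = pd.json_normalize(record, sep="_")

# max_level limits how deep the flattening goes; deeper dicts stay as values
df_shallow = pd.json_normalize(record, max_level=1)

print(list(df_default.columns))
print(list(df_underscore.columns))
print(list(df_shallow.columns))
```

With max_level=1, the author.name column holds the remaining dictionary instead of being split further.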

Example 1

We define a list of book records, each with the keys id, author, editor, title, and category. The author and editor values are themselves objects with firstname and lastname keys. json_normalize converts this list of nested JSON objects into a Pandas DataFrame, with one column per flattened key.

Python3




# Create a list of nested JSON objects
data = [{"id": "123",
         "author":
         {
             "firstname": "Jane",
             "lastname": "Doe"
         },
         "editor":
         {
             "firstname": "Jane",
             "lastname": "Smith"
         },
         "title": "The Ultimate Database Study Guide",
         "category": ["Non-Fiction", "Technology"]
         },
        {"id": "120",
         "author":
         {
             "lastname": "Gunjan",
             "firstname": "Pawan"
         },
         "editor":
         {
             "lastname": "Gunjan",
             "firstname": "Nitya"
         },
         "title": "Om",
         "category": ["Spiritual", "Meditations"]
         }
        ]

# Normalize the list of JSON objects into a Pandas dataframe
df = pd.json_normalize(data)
print(df)


Output:

    id                              title                   category  \
0  123  The Ultimate Database Study Guide  [Non-Fiction, Technology]
1  120                                 Om   [Spiritual, Meditations]

  author.firstname author.lastname editor.firstname editor.lastname
0             Jane             Doe             Jane           Smith
1            Pawan          Gunjan            Nitya          Gunjan
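In the output above, category is still a list inside a single cell, because json_normalize flattens nested dictionaries but leaves lists alone. If one row per category is wanted, one possible follow-up (an addition to this example, sketched on a trimmed copy of the data) is DataFrame.explode:

```python
import pandas as pd

# Trimmed copy of the book records, keeping only the keys needed here
data = [
    {"id": "123", "category": ["Non-Fiction", "Technology"]},
    {"id": "120", "category": ["Spiritual", "Meditations"]},
]

df = pd.json_normalize(data)

# explode() turns each list element into its own row,
# repeating the other column values for that record
exploded = df.explode("category", ignore_index=True)
print(exploded)
```

Whether to explode depends on the analysis: it makes filtering by category simple, but each book then appears on more than one row.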

Example 2

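Now we load JSON from a file with json.load and pass the result to json_normalize. The contents of complex_data.json are not shown here; as the output indicates, the file holds a record with a contacts key (a list of objects) and an interests key (a list of strings).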

Python3




import json
 
# Load data from 'complex_data.json'
with open('complex_data.json') as f:
    data = json.load(f)
 
# Use pd.json_normalize to convert the JSON to a DataFrame
df = pd.json_normalize(data)
 
# Display the DataFrame
print(df)


Output:

   name  age      city                                           contacts  \
0  John   30  New York  [{'type': 'email', 'value': 'john@example.com'...

                           interests
0  [programming, reading, traveling]
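The contacts column above still holds a list of dictionaries. The record_path and meta parameters of json_normalize can expand such a list into one row per element. The sketch below assumes a structure for the record that matches the visible output; the real complex_data.json is not shown, so the second contact entry in particular is a made-up placeholder.

```python
import pandas as pd

# Assumed structure, inferred from the printed output above;
# the "phone" entry is hypothetical
data = {
    "name": "John",
    "age": 30,
    "city": "New York",
    "contacts": [
        {"type": "email", "value": "john@example.com"},
        {"type": "phone", "value": "555-0100"},
    ],
    "interests": ["programming", "reading", "traveling"],
}

# record_path selects the list to expand into rows;
# meta lists top-level fields to repeat on every row
contacts_df = pd.json_normalize(
    data,
    record_path="contacts",
    meta=["name", "age", "city"],
)
print(contacts_df)
```

Each contact becomes its own row with type and value columns, and the name, age, and city values are copied alongside.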


