Convert multiple JSON files to CSV Python

In this article, we will learn how to convert multiple JSON files to CSV file in Python. Before that just recall some terms :

  • JSON File:  A JSON file may be a file that stores simple data structures and objects in JavaScript Object Notation (JSON) format, which may be a standard data interchange format. It is primarily used for transmitting data between an internet application and a server.
  • CSV File:  A CSV may be a comma-separated values file, which allows data to be saved during a tabular format. CSVs appear as if a garden-variety spreadsheet but with a .CSV extension. CSV files are often used with almost any spreadsheet program, like Microsoft Excel or Google Spreadsheets.

To form a CSV file from multiple JSON files, we have to use nested json file, flatten the dataframe or to load the json files into the form of dataframe, concatenate/merge/join these to form one dataframe (at least one column should be same in all json files) and at last convert this dataframe into CSV file. This full procedure of the given task can be understood with the help of examples which are given below :

Example 1:  If all columns match

In this example, we will load two json files, concatenate one to another and convert to a CSV file. The json files used for this are :

file1.json



{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":13,
        "4":67,
        "5":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Aman",
        "4":"Arjun",
        "5":"Aditya"
        },
    "Marks":{
        "0":89,
        "1":97,
        "2":45,
        "3":78,
        "4":56,
        "5":76
        },
    "Grade":{
        "0":"B",
        "1":"A",
        "2":"F",
        "3":"C",
        "4":"E",
        "5":"C"
        }
}


file2.json

{
    "ID":{
        "0":90,
        "1":56,
        "2":34,
        "3":96,
        "4":45
         },
    "Name":{
        "0":"Akash",
        "1":"Chalsea",
        "2":"Divya",
        "3":"Sajal",
        "4":"Shubham"
        },
    "Marks":{
        "0":81,
        "1":87,
        "2":100,
        "3":89,
        "4":78
        },
    "Grade":{
        "0":"B",
        "1":"B",
        "2":"A",
        "3":"B",
        "4":"C"
        }
}


Step 1: Load the json files with the help of pandas dataframe.
Step 2 : Concatenate the dataframes into one dataframe.
Step 3: Convert the concatenated dataframe into CSV file.

The complete code with the result is shown below :

Code: 

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

# importing packages
import pandas as pd
  
# load json file using pandas
df1 = pd.read_json('file1.json')
  
# view data
print(df1)
  
# load json file using pandas
df2 = pd.read_json('file2.json')
  
# view data
print(df2)
  
# use pandas.concat method 
df = pd.concat([df1,df2])
  
# view the concatenated dataframe
print(df)
  
# convert dataframe to csv file
df.to_csv("CSV.csv",index=False)
  
# load the resultant csv file
result = pd.read_csv("CSV.csv")
  
# and view the data
print(result)

chevron_right


Output:

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78     C
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID     Name  Marks Grade
0  90    Akash     81     B
1  56  Chalsea     87     B
2  34    Divya    100     A
3  96    Sajal     89     B
4  45  Shubham     78     C

   ID     Name  Marks Grade
0  23      Ram     89     B
1  43     Deep     97     A
2  12     Yash     45     F
3  13     Aman     78     C
4  67    Arjun     56     E
5  89   Aditya     76     C
0  90    Akash     81     B
1  56  Chalsea     87     B
2  34    Divya    100     A
3  96    Sajal     89     B
4  45  Shubham     78     C

    ID     Name  Marks Grade
0   23      Ram     89     B
1   43     Deep     97     A
2   12     Yash     45     F
3   13     Aman     78     C
4   67    Arjun     56     E
5   89   Aditya     76     C
6   90    Akash     81     B
7   56  Chalsea     87     B
8   34    Divya    100     A
9   96    Sajal     89     B
10  45  Shubham     78     C

Example 2: If some columns match 

In this example, we will load two json files, merge these and convert to a CSV file. The json files used for this are :

file3.json



{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":13,
        "4":67,
        "5":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Aman",
        "4":"Arjun",
        "5":"Aditya"
        },
    "Marks":{
        "0":89,
        "1":97,
        "2":45,
        "3":78,
        "4":56,
        "5":76
        }
}

file4.json

{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":67,
        "4":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Arjun",
        "4":"Aditya"
        },
    "Grade":{
        "0":"B",
        "1":"A",
        "2":"F",
        "3":"E",
        "4":"C"
        }
}

Step 1: Load the json files with the help of pandas dataframe.

Step 2: Merge the dataframes by different methods as inner/outer/left/right joins.

Step 3: Convert the merged dataframe into CSV file.

The complete code with the result is shown below :

Code:

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

# importing packages
import pandas as pd
  
# load json file using pandas
df1 = pd.read_json('file3.json')
  
# view data
print(df1)
  
# load json file using pandas
df2 = pd.read_json('file4.json')
  
# view data
print(df2)
  
# use pandas.merge method
df_inner = pd.merge(df1, df2, how='inner', left_on=[
                    'ID', 'Name'], right_on=['ID', 'Name'])
df_outer = pd.merge(df1, df2, how='outer', left_on=[
                    'ID', 'Name'], right_on=['ID', 'Name'])
df_left = pd.merge(df1, df2, how='left', left_on=[
                   'ID', 'Name'], right_on=['ID', 'Name'])
df_right = pd.merge(df1, df2, how='right', left_on=[
                    'ID', 'Name'], right_on=['ID', 'Name'])
  
# convert dataframe to csv file
df_inner.to_csv("CSV_inner.csv", index=False)
df_outer.to_csv("CSV_outer.csv", index=False)
df_left.to_csv("CSV_left.csv", index=False)
df_right.to_csv("CSV_right.csv", index=False)
  
# load the resultant csv file
result_inner = pd.read_csv("CSV_inner.csv")
result_outer = pd.read_csv("CSV_outer.csv")
result_left = pd.read_csv("CSV_left.csv")
result_right = pd.read_csv("CSV_right.csv")
  
# and view the data
print(result_outer)
print(result_inner)
print(result_left)
print(result_right)

chevron_right


Output:

   ID    Name  Marks
0  23     Ram     89
1  43    Deep     97
2  12    Yash     45
3  13    Aman     78
4  67   Arjun     56
5  89  Aditya     76

   ID    Name Grade
0  23     Ram     B
1  43    Deep     A
2  12    Yash     F
3  67   Arjun     E
4  89  Aditya     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78   NaN
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  67   Arjun     56     E
4  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78   NaN
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  67   Arjun     56     E
4  89  Aditya     76     C

Example 3:  If nested json file is given 

In this example, we will load nested json file, flatten it and then convert into CSV file. The json file used for this is :



file5.json

{
  "tickets":[
    {
      "Name": "Liam",
      "Location": {
        "City": "Los Angeles",
        "State": "CA"
      },
      "hobbies": [
        "Piano",
        "Sports"
      ],
      "year" : 1985,
      "teamId" : "ATL",
      "playerId" : "barkele01",
      "salary" : 870000
    },
    {
      "Name": "John",
      "Location": {
        "City": "Los Angeles",
        "State": "CA"
      },
      "hobbies": [
        "Music",
        "Running"
      ],
      "year" : 1985,
      "teamId" : "ATL",
      "playerId" : "bedrost01",
      "salary" : 550000
    }
  ],
  "count": 2
}

Step 1: Load the nested json file with the help of json.load() method.

Step 2: Flatten the different column values using pandas methods.

Step 3:  Convert the flattened dataframe into CSV file.

Repeat the above steps for both the nested files and then follow either example 1 or example 2 for conversion. To convert a single nested json file follow the method given below.

The complete code with the result is shown below :

Code:

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

# importing packages
import pandas as pd
import json
  
# load json file using json.load
with open('file5.json') as file:
    data = json.load(file)
  
# view data
print(data)
  
# form the dataframe
df = pd.DataFrame(data['tickets'])
  
# view dataframe
print(df)
  
# flattern the dataframe and remove unnecessary columns
for i, item in enumerate(df['Location']):
    df['location_city'] = dict(df['Location'])[i]['City']
    df['location_state'] = dict(df['Location'])[i]['State']
  
for i, item in enumerate(df['hobbies']):
    df['hobbies_{}'.format(i)] = dict(df['hobbies'])[i]
  
df = df.drop({'Location', 'hobbies'}, axis=1)
  
# view dataframe
print(df)
  
# convert dataframe to csv file
df.to_csv("CSV.csv", index=False)
  
# load the resultant csv file
result = pd.read_csv("CSV.csv")
  
# and view the data
print(result)

chevron_right


Output:

{‘tickets’: [{‘Name’: ‘Liam’, ‘Location’: {‘City’: ‘Los Angeles’, ‘State’: ‘CA’}, ‘hobbies’: [‘Piano’, ‘Sports’], ‘year’: 1985, ‘teamId’: ‘ATL’, ‘playerId’: ‘barkele01’, ‘salary’: 870000}, {‘Name’: ‘John’, ‘Location’: {‘City’: ‘Los Angeles’, ‘State’: ‘CA’}, ‘hobbies’: [‘Music’, ‘Running’], ‘year’: 1985, ‘teamId’: ‘ATL’, ‘playerId’: ‘bedrost01’, ‘salary’: 550000}], ‘count’: 2}



                                Location  Name                  hobbies   playerId  \

0  {‘City’: ‘Los Angeles’, ‘State’: ‘CA’}  Liam   [Piano, Sports]  barkele01   

1  {‘City’: ‘Los Angeles’, ‘State’: ‘CA’}  John  [Music, Running]  bedrost01  

 

    salary teamId  year  

0  870000    ATL  1985  

1  550000    ATL  1985  
 

    Name   playerId  salary teamId  year location_city location_state  \

0  Liam  barkele01  870000    ATL  1985   Los Angeles             CA   

1  John  bedrost01  550000    ATL  1985   Los Angeles             CA   



      hobbies_0 hobbies_1  

0     Piano     Music  

1    Sports   Running  
  

    Name   playerId  salary teamId  year location_city location_state  \

0  Liam  barkele01  870000    ATL  1985   Los Angeles             CA   

1  John  bedrost01  550000    ATL  1985   Los Angeles             CA  

 

     hobbies_0 hobbies_1  

0     Piano     Music  

1    Sports   Running  

Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course.




My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.


Article Tags :

1


Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.