How to convert pandas DataFrame into JSON in Python?

Last Updated: 21 Apr, 2020

Data analysis is an extremely important tool in today’s world, and a key aspect of it is an organized representation of data. There are numerous ways to represent data in computer science. In this article, we talk about two of them: the pandas DataFrame, an in-memory data structure, and JSON, a text-based data format. We then see how to convert DataFrames to JSON format.

A pandas DataFrame is a tabular representation of data in which each row is a single data entry and each column holds one attribute of that entry. JSON (JavaScript Object Notation), by contrast, is a text format that represents data as nested objects (key-value pairs) and arrays.

Note: For more information, refer to Python | Pandas DataFrame

Convert pandas DataFrame into JSON

To convert a pandas DataFrame to JSON format, we use the DataFrame.to_json() method from the pandas library in Python. The method accepts several parameters that customize the format of the resulting JSON. Let’s look at the parameters it accepts and then explore the customization.


path_or_buf (string or file handle, optional): File path or object. If not specified, the result is returned as a string.
orient (‘split’, ‘records’, ‘index’, ‘columns’, ‘values’, ‘table’; default ‘columns’ for a DataFrame): Indication of the expected JSON string format. For a Series, the default is ‘index’.
date_format (None, ‘epoch’, ‘iso’): Type of date conversion. ‘epoch’ = epoch milliseconds, ‘iso’ = ISO8601. The default depends on the orient: for orient=’table’ the default is ‘iso’, and for all other orients the default is ‘epoch’.
double_precision (integer, default 10): The number of decimal places to use when encoding floating point values.
force_ascii (boolean, default True): Force the encoded string to be ASCII.
date_unit (‘s’, ‘ms’, ‘us’, ‘ns’; default ‘ms’): The time unit to encode to; governs timestamp and ISO8601 precision. The values represent second, millisecond, microsecond, and nanosecond respectively.
default_handler (callable): Handler to call if an object cannot otherwise be converted to a suitable format for JSON. It should receive a single argument, the object to convert, and return a serializable object.
lines (boolean, default False): If ‘orient’ is ‘records’, write out line-delimited JSON. Raises ValueError for any other ‘orient’, since the others are not list-like.
compression (‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, None; default ‘infer’): A string representing the compression to use in the output file; only used when the first argument is a filename. By default, the compression is inferred from the filename.
index (boolean, default True): Whether to include the index values in the JSON string. Not including the index (index=False) is only supported when orient is ‘split’ or ‘table’.
indent (integer, optional): Length of whitespace used to indent each record.
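As a rough illustration of a few of these parameters, the sketch below uses a small made-up DataFrame (the column names, values, and the file name frame.json are assumptions chosen for this example, not part of the article's data). It shows line-delimited output, reduced float precision, indentation, and writing to a file through path_or_buf.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({"name": ["a", "b"], "value": [1.23456, 2.5]})

# lines=True writes one JSON object per line; it requires orient="records".
json_lines = df.to_json(orient="records", lines=True)

# double_precision limits the number of decimal places for floats.
rounded = df.to_json(orient="records", double_precision=2)

# indent pretty-prints the output (available in pandas 1.0 and later).
pretty = df.to_json(orient="records", indent=2)

# Passing a path as path_or_buf writes to disk; the call then returns None.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "frame.json")
    result = df.to_json(path, orient="records")
    with open(path) as handle:
        from_file = handle.read()

print(json_lines)
print(rounded)
print(pretty)
```

Each string can be checked with json.loads, which is a convenient way to confirm that an option changed only the formatting and not the data.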

We now look at a few examples to understand the usage of the function DataFrame.to_json.

Example 1: Basic usage

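import numpy as np
import pandas as pd
# Build a 2x2 DataFrame of strings with named columns.
data = np.array([['1', '2'], ['3', '4']])
dataFrame = pd.DataFrame(data, columns=['col1', 'col2'])
# With no arguments, to_json() returns a JSON string in the 'columns' format.
json_str = dataFrame.to_json()
print(json_str)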

Output:

{"col1":{"0":"1","1":"3"},"col2":{"0":"2","1":"4"}}
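One way to confirm what this output means is to parse it back with Python's standard json module. The sketch below rebuilds the same DataFrame (the variable names df, json_str, and parsed are chosen for this illustration) and shows that to_json() returns a plain string and that the integer index labels become string keys in the JSON.

```python
import json

import numpy as np
import pandas as pd

data = np.array([['1', '2'], ['3', '4']])
df = pd.DataFrame(data, columns=['col1', 'col2'])

# to_json() with no path returns a str, not a dict.
json_str = df.to_json()
print(type(json_str).__name__)

# Parsing the string gives nested dicts: column name -> index label -> value.
parsed = json.loads(json_str)

# JSON object keys are always strings, so index label 0 becomes "0".
print(parsed["col1"]["0"])
```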

Example 2: Exploring the ‘orient’ parameter of the DataFrame.to_json function

import numpy as np
import pandas as pd
# Build the same 2x2 DataFrame of strings as in Example 1.
data = np.array([['1', '2'], ['3', '4']])
dataFrame = pd.DataFrame(data, columns=['col1', 'col2'])
# 'split': separate "columns", "index", and "data" entries.
json_split = dataFrame.to_json(orient='split')
print("json_split =", json_split, "\n")
# 'records': a list with one object per row.
json_records = dataFrame.to_json(orient='records')
print("json_records =", json_records, "\n")
# 'index': index label -> {column -> value}.
json_index = dataFrame.to_json(orient='index')
print("json_index =", json_index, "\n")
# 'columns' (the DataFrame default): column -> {index label -> value}.
json_columns = dataFrame.to_json(orient='columns')
print("json_columns =", json_columns, "\n")
# 'values': only the values, as a list of rows.
json_values = dataFrame.to_json(orient='values')
print("json_values =", json_values, "\n")
# 'table': the data together with a schema describing it.
json_table = dataFrame.to_json(orient='table')
print("json_table =", json_table, "\n")

Output:

json_split = {"columns":["col1","col2"],"index":[0,1],"data":[["1","2"],["3","4"]]}

json_records = [{"col1":"1","col2":"2"},{"col1":"3","col2":"4"}]

json_index = {"0":{"col1":"1","col2":"2"},"1":{"col1":"3","col2":"4"}}

json_columns = {"col1":{"0":"1","1":"3"},"col2":{"0":"2","1":"4"}}

json_values = [["1","2"],["3","4"]]

json_table = {"schema":{"fields":[{"name":"index","type":"integer"},{"name":"col1","type":"string"},{"name":"col2","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":0,"col1":"1","col2":"2"},{"index":1,"col1":"3","col2":"4"}]}
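The orient matters again when the JSON is read back. As a minimal sketch, assuming pandas.read_json is given the same orient that was used for writing, the loop below round-trips a small integer DataFrame through each format. The data is a hypothetical numeric variant of the example above, chosen because read_json infers dtypes and would turn the string "1" into the integer 1. The ‘values’ orient is left out because it does not store column names.

```python
from io import StringIO

import pandas as pd

# Hypothetical integer version of the example data.
df = pd.DataFrame({"col1": [1, 3], "col2": [2, 4]})

restored = {}
for orient in ("split", "records", "index", "columns", "table"):
    text = df.to_json(orient=orient)
    # Wrap the string in StringIO so read_json treats it as a file-like object.
    restored[orient] = pd.read_json(StringIO(text), orient=orient)

# Every orient should give back the same column contents.
for orient, frame in restored.items():
    print(orient, frame.to_dict(orient="list"))
```

Reading with a different orient than the one used for writing can either fail or silently produce a differently shaped frame, so it is safest to record the orient alongside the data.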
