Create PySpark dataframe from dictionary
Last Updated: 30 May, 2021
In this article, we discuss how to create a PySpark dataframe from a dictionary. The spark.createDataFrame() method is used for this. Its two main arguments are data and schema. The data argument holds the records to load (here, a list of dictionaries, one per row), and the optional schema argument gives the column names, for example as a list. When schema is omitted, as in the examples below, the column names and types are inferred from the dictionaries.
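When the column order matters, one option is to pass the rows and the column names explicitly instead of leaving everything to schema inference. The following is a minimal sketch of that idea, not part of the original examples: the helper names dicts_to_rows and build_dataframe are hypothetical, and build_dataframe assumes that spark is an already created SparkSession.

```python
def dicts_to_rows(records, columns):
    """Return one tuple per dictionary, with values ordered as in `columns`.

    Keys missing from a dictionary are filled with None.
    """
    return [tuple(record.get(name) for name in columns) for record in records]


def build_dataframe(spark, records, columns):
    """Create a dataframe whose columns follow the order of `columns`.

    Assumption: `spark` is an existing SparkSession.
    """
    rows = dicts_to_rows(records, columns)
    # A list of column names is accepted as the schema argument
    return spark.createDataFrame(rows, columns)


# Sample records reused from the examples in this article
records = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'},
           {'student_id': 14, 'name': 'jyothika', 'address': 'tenali'}]
columns = ['student_id', 'name', 'address']

# The pure-Python step runs without Spark
rows = dicts_to_rows(records, columns)
print(rows)  # [(12, 'sravan', 'kakumanu'), (14, 'jyothika', 'tenali')]
```

With a session available, build_dataframe(spark, records, columns).show() would display the columns in the order student_id, name, address.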
Example 1: Create a dictionary of student address details and convert it to a dataframe
Python3
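# SparkSession is the entry point for working with dataframes
from pyspark.sql import SparkSession

# Create (or reuse) a Spark session named 'sparkdf'
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# A list with one dictionary; each dictionary becomes one row
data = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'}]

# Column names and types are inferred from the dictionary
dataframe = spark.createDataFrame(data)

# Display the dataframe
dataframe.show()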
Output:
Example 2: Create three dictionaries and pass them to a PySpark dataframe
Python3
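# SparkSession is the entry point for working with dataframes
from pyspark.sql import SparkSession

# Create (or reuse) a Spark session named 'sparkdf'
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# A list of three dictionaries; each dictionary becomes one row
data = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'},
        {'student_id': 14, 'name': 'jyothika', 'address': 'tenali'},
        {'student_id': 11, 'name': 'deepika', 'address': 'repalle'}]

# Column names and types are inferred from the dictionaries
dataframe = spark.createDataFrame(data)

# Display the dataframe
dataframe.show()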
Output: