How to display a PySpark DataFrame in table format?
Last Updated :
29 Aug, 2022
In this article, we display the data of a PySpark dataframe in table format. We use the show() function and the toPandas() function to display the dataframe in the required format.
show(): Prints the rows of the dataframe to the console as a table.
Syntax: dataframe.show(n = 20, truncate = True, vertical = False)
where,
- dataframe is the input dataframe
- n is the number of rows to display from the top; if n is not specified, the first 20 rows are printed
- vertical specifies the layout: if it is True, each row is printed vertically, one column value per line; otherwise (the default) rows are printed horizontally as a table
- truncate controls how values are trimmed: True (the default) cuts strings longer than 20 characters, False prints values in full, and a number cuts values to that many characters
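To make the effect of n and truncate concrete, here is a pure-Python sketch that approximates the horizontal table layout show() prints. It is not Spark's implementation: the names truncate_cell and render_table are hypothetical, the rules (minimum column width of 3, a plain cut for limits below 4, "..." for longer limits, header names trimmed like values) are written from observed behaviour, and the trailing "only showing top n rows" notice is left out.

```python
def truncate_cell(text, truncate):
    """Shorten one cell: plain cut for limits below 4, else end with '...'."""
    if truncate > 0 and len(text) > truncate:
        if truncate < 4:
            return text[:truncate]
        return text[:truncate - 3] + "..."
    return text


def render_table(columns, rows, n=20, truncate=20):
    """Return a show()-style text table for the first n rows.

    `truncate` is a character limit; 0 disables truncation.
    """
    # Assumption: the header row is trimmed with the same rule as the values.
    shown = [list(columns)] + [list(r) for r in rows[:n]]
    cells = [[truncate_cell(str(v), truncate) for v in row] for row in shown]
    # Each column is as wide as its longest cell, with a minimum width of 3.
    widths = [max(3, *(len(row[i]) for row in cells))
              for i in range(len(columns))]
    border = "+" + "+".join("-" * w for w in widths) + "+"
    # Cells are right-aligned when truncation is on, left-aligned otherwise.
    pad = str.rjust if truncate > 0 else str.ljust
    lines = [border]
    for index, row in enumerate(cells):
        lines.append("|" + "|".join(pad(c, w) for c, w in zip(row, widths)) + "|")
        if index == 0:
            lines.append(border)  # separator under the header
    lines.append(border)
    return "\n".join(lines)


columns = ["Employee ID", "Employee NAME", "Company Name"]
rows = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 2"],
        ["3", "bobby", "company 3"],
        ["4", "rohith", "company 2"],
        ["5", "gnanesh", "company 1"]]

# First two rows with the default 20-character limit
print(render_table(columns, rows, n=2))
```

Calling render_table(columns, rows, truncate=1) shows how a limit of 1 reduces every cell to a single character while the columns keep their minimum width.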
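toPandas(): Converts the PySpark dataframe to a pandas DataFrame, a two-dimensional data structure that is displayed like a table. All rows are collected to the driver, so it should be used only when the data fits in the driver's memory.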
Syntax: dataframe.toPandas()
where, dataframe is the input dataframe
Let’s create a sample dataframe.
Python3
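from pyspark.sql import SparkSession

# Create (or reuse) a Spark session for this example
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# Sample rows: employee ID, employee name, company name
data = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 2"],
        ["3", "bobby", "company 3"],
        ["4", "rohith", "company 2"],
        ["5", "gnanesh", "company 1"]]

# Column names for the dataframe
columns = ['Employee ID', 'Employee NAME', 'Company Name']

# Build the dataframe from the rows and the column names
dataframe = spark.createDataFrame(data, columns)

# Printing the object shows only the schema, not the rows
print(dataframe)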
Output:
DataFrame[Employee ID: string, Employee NAME: string, Company Name: string]
Example 1: Using the show() function without parameters. It prints the first 20 rows; the sample dataframe has only five rows, so all of them are displayed.
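The listing for this example is missing from the page; the call it describes is simply dataframe.show(). Because that call stops after 20 rows, a larger dataframe needs an explicit row count to be printed in full. A minimal sketch, assuming dataframe is a PySpark DataFrame small enough to print completely; the helper name show_all_rows is hypothetical:

```python
def show_all_rows(dataframe):
    """Print every row of a PySpark DataFrame without shortening values."""
    # count() runs a Spark job to find the number of rows.
    total = dataframe.count()
    # n=total lifts the 20-row default; truncate=False keeps long values whole.
    dataframe.show(n=total, truncate=False)
```

For the five-row sample dataframe, show_all_rows(dataframe) and dataframe.show() display the same rows; the difference only matters once there are more than 20 rows or values longer than 20 characters.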
Output:
Example 2: Using the show() function with n as a parameter, which displays the top n rows.
Syntax: dataframe.show(n)
where n is the number of rows to display
Code:
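The listing for this example is missing from the page, so the value of n it used is unknown. A minimal sketch, assuming dataframe is the PySpark DataFrame created above; the helper name show_top_rows is hypothetical and only wraps the show(n) call:

```python
def show_top_rows(dataframe, n):
    """Print only the first n rows of a PySpark DataFrame as a table."""
    # Passing n positionally is the same as dataframe.show(n=n).
    dataframe.show(n)
```

With the sample data, show_top_rows(dataframe, 2) would print the rows for sravan and ojaswi, followed by a notice that only the top 2 rows are shown.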
Output:
Example 3: Using the show() function with vertical = True as a parameter, which displays the records in the dataframe vertically.
Syntax: dataframe.show(vertical = True)
vertical can be either True or False.
Code:
Python3
dataframe.show(vertical=True)
Output:
Example 4: Using the show() function with truncate as a parameter. With truncate = 1, only the first character of each value in every column is displayed.
Python3
dataframe.show(truncate=1)
Output:
Example 5: Using show() with all parameters.
Python3
dataframe.show(n=3, vertical=True, truncate=2)
Output:
Example 6: Using the toPandas() method, which converts the dataframe to a pandas DataFrame that is displayed as a table.
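The listing for this example is missing from the page; the call it describes is dataframe.toPandas(), which requires pandas to be installed. Since the conversion brings every row to the driver, it can help to cap the row count first. A minimal sketch, assuming dataframe is a PySpark DataFrame; the helper name to_pandas_preview and the default of 1000 rows are hypothetical choices:

```python
def to_pandas_preview(dataframe, max_rows=1000):
    """Convert at most max_rows rows of a PySpark DataFrame to pandas."""
    # limit() keeps the result small; toPandas() then pulls it to the driver.
    return dataframe.limit(max_rows).toPandas()
```

For the five-row sample dataframe, to_pandas_preview(dataframe) returns the same pandas DataFrame as dataframe.toPandas(); the cap only takes effect on larger data.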
Output: