How to drop multiple columns given in a list from a PySpark DataFrame?
Last Updated: 17 Jun, 2021
In this article, we are going to drop multiple columns, given as a list of column names, from a PySpark DataFrame in Python.
For this, we will use the drop() function. It removes the named columns and returns a new DataFrame; the original DataFrame is not modified.
Syntax: dataframe.drop(*['column 1', 'column 2', 'column n'])
Where,
- dataframe is the input dataframe
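- 'column 1', 'column 2', ..., 'column n' are the names of the columns to drop, held in a list. The * operator unpacks the list so that each name is passed to drop() as a separate argument.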
Python code to create student dataframe with three columns:
Python3
# SparkSession is the entry point for working with DataFrames
from pyspark.sql import SparkSession

# Create (or reuse) a Spark session
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# Student records: ID, name and college
data = [["1", "sravan", "vignan"],
        ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"],
        ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"],
        ["5", "gnanesh", "iit"]]

# Column names for the DataFrame
columns = ['student ID', 'student NAME', 'college']

# Build the DataFrame from the rows and column names
dataframe = spark.createDataFrame(data, columns)

print("Actual data in dataframe")
dataframe.show()
Output:
Actual data in dataframe
+----------+------------+-------+
|student ID|student NAME|college|
+----------+------------+-------+
|         1|      sravan| vignan|
|         2|      ojaswi|   vvit|
|         3|      rohith|   vvit|
|         4|     sridevi| vignan|
|         1|      sravan| vignan|
|         5|     gnanesh|    iit|
+----------+------------+-------+
Example 1: Program to drop multiple columns whose names are given in a list.
Python3
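# Names of the columns to drop (avoid naming the variable `list`,
# which would shadow Python's built-in type)
columns_to_drop = ['student NAME', 'college']

# drop() returns a new DataFrame, so `dataframe` itself stays intact
dataframe.drop(*columns_to_drop).show()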
Output:
+----------+
|student ID|
+----------+
|         1|
|         2|
|         3|
|         4|
|         1|
|         5|
+----------+
Example 2: Program to drop a single column whose name is given in a list.
Python3
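# A list holding the one column name to drop
columns_to_drop = ['college']

# Show the result without overwriting the original `dataframe`
dataframe.drop(*columns_to_drop).show()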
Output:
+----------+------------+
|student ID|student NAME|
+----------+------------+
|         1|      sravan|
|         2|      ojaswi|
|         3|      rohith|
|         4|     sridevi|
|         1|      sravan|
|         5|     gnanesh|
+----------+------------+
Example 3: Program to drop all columns, with every column name given in a list.
Python3
# Every column name in the DataFrame
columns_to_drop = ['student ID', 'student NAME', 'college']

# The result keeps its six rows but has no columns left
dataframe.drop(*columns_to_drop).show()
Output:
++
||
++
||
||
||
||
||
||
++