In this article, we will filter the rows of a PySpark DataFrame based on column values.
Creating a DataFrame for demonstration:
Python3
# import the pyspark module
import pyspark
from pyspark.sql import SparkSession

# create (or reuse) a Spark session
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# sample rows: ID, name, and company (IDs are stored as strings)
data = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 1"],
        ["3", "rohith", "company 2"],
        ["4", "sridevi", "company 1"],
        ["1", "sravan", "company 1"],
        ["4", "sridevi", "company 1"]]

# column names
columns = ['ID', 'NAME', 'Company']

# build the DataFrame and display it
dataframe = spark.createDataFrame(data, columns)
dataframe.show()
Output:

Method 1: Using where() function
The where() function evaluates a condition on each row and returns a new DataFrame containing only the rows that satisfy it.
Syntax: dataframe.where(condition)
Here, condition is a boolean expression built from the DataFrame's columns, for example a comparison between a column and a value.
Example 1: Filter rows in the dataframe where ID = 1
Python3
# keep only the rows whose ID equals '1'
dataframe.where(dataframe.ID == '1').show()
Output:

Example 2: Filter rows where NAME is not sravan
Python3
# keep only the rows whose NAME differs from 'sravan'
dataframe.where(dataframe.NAME != 'sravan').show()
Output:

Example 3: Filtering on multiple column values with where().
Python program to filter rows where ID is greater than 2 and Company is company 1
Python3
# combine two conditions with &; each condition needs its own parentheses
dataframe.where((dataframe.ID > '2') & (dataframe.Company == 'company 1')).show()
Output:

Method 2: Using filter() function
The filter() function works the same way: it evaluates the condition on each row and returns only the matching rows. In PySpark, where() is an alias for filter().
Syntax: dataframe.filter(condition)
Example 1: Python code to get the rows where Company is company 1
Python3
# keep only the rows whose Company equals 'company 1'
dataframe.filter(dataframe.Company == 'company 1').show()
Output:

Example 2: Filter the data where ID > 3.
Python3
# ID is a string column, so it is compared with the string '3'
dataframe.filter(dataframe.ID > '3').show()
Output:

Example 3: Filtering on multiple column values with filter().
Python program to filter rows where ID is greater than 2 and Company is company 2
Python3
# combine two conditions with &; each condition needs its own parentheses
dataframe.filter((dataframe.ID > '2') &
                 (dataframe.Company == 'company 2')).show()
Output:

Last Updated :
29 Jun, 2021