Filtering a PySpark DataFrame using isin by exclusion
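In this article, we will discuss how to filter a PySpark DataFrame using isin(), both to keep matching rows and to exclude them.
isin(): This Column method checks whether each value in a column is contained in a given list of values and returns a boolean Column expression. Negating that expression with the ~ operator selects the rows whose value is not in the list, which is filtering by exclusion.
Syntax: isin([element1, element2, ..., element n])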
Creating a DataFrame for demonstration:
Method 1: Using filter()
filter(): This method returns the rows that satisfy a given condition. filter() and where() are similar; where() is an alias for filter().
Example 1: Get particular IDs with the filter() clause
Example 2: Get rows with particular names from the DataFrame.
Method 2: Using where()
where(): This method also returns the rows that satisfy a given condition; it behaves the same way as filter().
Example 1: Get the particular colleges with where() clause.
Example 2: Get all IDs except 5 from the DataFrame.