Python | Pandas DataFrame.where()

Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

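The Pandas where() method checks a data frame against one or more conditions and returns a result of the same shape. Values for which the condition is True are kept; by default, values for which it is False are replaced with NaN. When the condition is a boolean Series built from a single column, whole rows are kept or replaced.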
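As a minimal sketch of this behaviour, the snippet below applies where() to a small in-memory data frame. The column names and values are made up for illustration and are not taken from the article's CSV file.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate where()
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# A boolean Series as the condition: rows where it is False become all-NaN
row_wise = df.where(df["a"] > 1)

# A boolean DataFrame as the condition: each cell is checked on its own
element_wise = df.where(df > 2)

print(row_wise)
print(element_wise)
```

In both cases the result has the same shape as the input; only the values that fail the condition are replaced.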

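Syntax:
DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False, raise_on_error=None)

Note: errors, try_cast and raise_on_error belong to older pandas releases. They were later deprecated and removed, so recent versions accept only cond, other, inplace, axis and level.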
 

Parameters:

cond: One or more boolean conditions to check the data frame against.
other: Value used to replace entries that don't satisfy the condition. Default is NaN.
inplace: Boolean value. If True, the changes are made in the data frame itself.
axis: Axis (rows or columns) along which to align other, if needed.
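The effect of the other parameter can be sketched with a small example. The data frame and the pass mark of 60 below are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical scores, used only to illustrate the `other` parameter
df = pd.DataFrame({"score": [55, 72, 90]})

# Default behaviour: entries failing the condition become NaN
with_nan = df.where(df["score"] >= 60)

# With other=0, the failing entries are replaced by 0 instead
with_zero = df.where(df["score"] >= 60, other=0)

print(with_nan)
print(with_zero)
```

Because inplace is left at its default of False here, df itself is unchanged and each call returns a new data frame.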

The examples below read their data from a CSV file named nba.csv.

Example #1: Single Condition operation

In this example, rows with a particular Team name are kept, and all other rows are replaced with NaN using the .where() method.

# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv")

# sorting data frame by team name
data.sort_values("Team", inplace=True)

# making boolean series for a team name
# (named team_filter so it does not shadow the built-in filter())
team_filter = data["Team"] == "Atlanta Hawks"

# replacing every non-matching row with NaN
data.where(team_filter, inplace=True)

# display
print(data)



Output:

In the output, every row whose Team is not “Atlanta Hawks” is replaced with NaN.
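The same idea can be tried without the CSV file. The sketch below uses a small hypothetical stand-in for nba.csv (the player names, teams and ages are made up) and also shows one possible follow-up: dropping the all-NaN rows, which gives the same rows as plain boolean indexing.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv; names and values are made up
data = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Team": ["Atlanta Hawks", "Boston Celtics", "Atlanta Hawks"],
    "Age": [23, 27, 30],
})

team_mask = data["Team"] == "Atlanta Hawks"

# where() keeps the original shape; non-matching rows become NaN
masked = data.where(team_mask)

# Dropping rows that are entirely NaN leaves only the matching rows
hawks = masked.dropna(how="all")

# Boolean indexing selects the same rows directly
selected = data[team_mask]

print(masked)
print(hawks)
```

where() is the better fit when the result must keep the shape and index of the original data frame; boolean indexing is simpler when only the matching rows are needed.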

 

Example #2: Multi-condition Operations

The data is filtered on both Team and Age. Only rows where the Team name is “Atlanta Hawks” and the player's Age is above 24 keep their values; all other rows are replaced with NaN.

# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv")

# sorting data frame by team name
data.sort_values("Team", inplace=True)

# making boolean series for a team name
filter1 = data["Team"] == "Atlanta Hawks"

# making boolean series for age
filter2 = data["Age"] > 24

# filtering data on basis of both filters
data.where(filter1 & filter2, inplace=True)

# display
print(data)



Output:
In the output, only the rows where the Team name is “Atlanta Hawks” and the Age is above 24 keep their values; every other row is replaced with NaN.
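Combining conditions can be sketched on a small hypothetical data frame (the names, teams and ages below are made up). When the comparisons are written inline rather than stored in variables first, each one needs its own parentheses, because & and | bind more tightly than == and >.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv; names and values are made up
data = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Atlanta Hawks", "Atlanta Hawks", "Boston Celtics", "Atlanta Hawks"],
    "Age": [22, 28, 30, 25],
})

# Both conditions must hold (element-wise AND)
both = (data["Team"] == "Atlanta Hawks") & (data["Age"] > 24)

# At least one condition must hold (element-wise OR)
either = (data["Team"] == "Atlanta Hawks") | (data["Age"] > 24)

# Rows failing the combined condition are replaced with NaN
result = data.where(both)

print(result)
```

Here only Player B and Player D satisfy both conditions, so the other two rows are replaced with NaN.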




