Filter Pandas dataframe in Python using ‘in’ and ‘not in’
Last Updated: 05 Feb, 2023
The in and not in operators can be used with Pandas DataFrames to check whether the values in a column belong to a given set of values. Inside a .query() expression, the in operator produces a boolean value for each row indicating whether that row's value is present in the given set, while the not in form produces the opposite boolean value.
These operators can be used with the .query() method of a Pandas DataFrame to filter the DataFrame based on a given set of values. The .query() method takes a string containing a Boolean expression as input and returns a new DataFrame containing only the rows that satisfy the given expression.
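Outside of .query(), Python's plain in operator behaves differently on pandas objects: it tests labels rather than cell values. The following is a minimal sketch of that difference, assuming the same small DataFrame used in the examples of this article; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# On a DataFrame, `in` tests the column labels
has_column_a = "A" in df            # True
has_label_5 = 5 in df               # False: 5 is a cell value, not a column label

# On a Series, `in` tests the index labels, not the values
label_in_series = 5 in df["B"]      # False: the index is 0..3

# To test the values themselves, use the underlying array or isin()
value_in_column = 5 in df["B"].values        # True
value_anywhere = df.isin([5]).any().any()    # True

print(has_column_a, has_label_5, label_in_series, value_in_column, value_anywhere)
```

This is why row filtering by membership is usually done through .query() or isin() rather than a bare in check.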
Filter Pandas Dataframe in Python using ‘in’ keyword
The in keyword has two purposes: to check whether a value is present in a list, tuple, range, string, etc., and to iterate through a sequence in a for loop. Inside a .query() expression it is used for the first purpose, as a membership check applied to each row.
Example 1
Here is an example of how the in operator can be used with the .query() method to filter a DataFrame:
Python3
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "B": [5, 6, 7, 8]})
# Keep only the rows whose value in column A is 1 or 2
df = df.query("A in [1, 2]")
print(df)
Output:
   A  B
0  1  5
1  2  6
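For comparison, the same rows can be selected without a query string. This is a sketch, assuming the same example data, that builds a boolean mask with Series.isin() and checks that it matches the .query() result; via_query, mask, and via_mask are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Membership filter written as a query string
via_query = df.query("A in [1, 2]")

# Equivalent filter written as a boolean mask
mask = df["A"].isin([1, 2])
via_mask = df[mask]

# Both approaches select the same rows
print(via_query.equals(via_mask))
```

The query string tends to read better for compound conditions, while the mask form is convenient when the condition is built up programmatically.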
Example 2
The in operator can also be used in more complex expressions with the .query() method, combining multiple conditions with logical operators such as and and or. This example continues with the df from Example 1.
Python3
# Keep rows where A is 1 or 2 and, at the same time, B is 6 or 7
df = df.query("A in [1, 2] and B in [6, 7]")
print(df)
Output:
   A  B
1  2  6
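The lists in a query string do not have to be written inline. A minimal sketch, assuming the same example data: .query() can refer to Python variables from the surrounding scope by prefixing their names with @. The names allowed_a and allowed_b are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Lists of accepted values kept in ordinary Python variables
allowed_a = [1, 2]
allowed_b = [6, 7]

# The @ prefix tells .query() to look the name up in the calling scope
filtered = df.query("A in @allowed_a and B in @allowed_b")
print(filtered)
```

This keeps the filter values separate from the expression, which helps when they come from user input or another computation.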
Filter Pandas Dataframe in Python using the ‘not in’ keyword
Python's not keyword is a logical operator that returns the negation, or opposite boolean value, of its operand.
Example 1
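To filter out a set of values with the .query() method of a Pandas DataFrame, you can simply negate the in expression using the not keyword. The expression not A in [1, 2] is parsed as not (A in [1, 2]), so it keeps the rows whose value in column A is neither 1 nor 2.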
Python3
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
# Keep only the rows whose value in column A is not 1 or 2
df = df.query("not A in [1, 2]")
print(df)
Output:
   A  B
2  3  7
3  4  8
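The same exclusion filter can be spelled in other ways. This sketch, assuming the same example data, compares the prefix not form with the infix not in form accepted by .query() and with a negated Series.isin() mask; the result variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Prefix form: negate the whole membership test
prefix_form = df.query("not A in [1, 2]")

# Infix form: the `not in` operator written directly
infix_form = df.query("A not in [1, 2]")

# Boolean-mask form: ~ inverts the mask produced by isin()
mask_form = df[~df["A"].isin([1, 2])]

# All three select the same rows
print(prefix_form.equals(infix_form) and infix_form.equals(mask_form))
```

The infix form is usually the easiest to read, since it mirrors ordinary Python.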
Example 2
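The not keyword can also negate a compound condition. Because df was overwritten in Example 1, this example rebuilds the original DataFrame and then keeps every row except those where A is in [1, 2] and B is in [6, 7].
Python3
# Rebuild the original DataFrame, since Example 1 overwrote df
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
df = df.query("not (A in [1, 2] and B in [6, 7])")
print(df)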
Output:
   A  B
0  1  5
2  3  7
3  4  8
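A negated compound condition can also be rewritten with De Morgan's law: not (X and Y) is the same as (not X) or (not Y). The sketch below, assuming the same four-row example data, checks that both spellings select the same rows; negated_whole and negated_parts are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Negate the whole compound condition
negated_whole = df.query("not (A in [1, 2] and B in [6, 7])")

# Negate each part and join with `or` (De Morgan's law)
negated_parts = df.query("A not in [1, 2] or B not in [6, 7]")

# Both expressions drop only the row where A is 2 and B is 6
print(negated_whole.equals(negated_parts))
```

Which form to prefer is a matter of readability; the parentheses in the first form make the scope of the negation explicit.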