
How to Use “NOT IN” Filter in Pandas?

Last Updated : 18 Jan, 2024

A “NOT IN” filter keeps only the rows of a DataFrame whose values do not appear in a given list of values.

Pandas has no dedicated NOT IN method, but the same result is obtained by negating the boolean mask returned by the isin() method with the ~ (bitwise NOT) operator.

In this tutorial, we will provide a step-by-step guide to perform the NOT IN filter in Pandas DataFrame.
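Before moving to a full DataFrame, the idea can be sketched on a single Series. This is a minimal illustration, not part of the original examples; the variable names `names`, `excluded`, `in_mask` and `not_in_mask` are chosen here for clarity only.

```python
import pandas as pd

# Illustrative data (assumed for this sketch)
names = pd.Series(['sravan', 'harsha', 'jyothika'])
excluded = ['harsha', 'jyothika']

# isin() returns a boolean mask: True where the value is in the list
in_mask = names.isin(excluded)

# ~ flips every boolean, turning "in" into "not in"
not_in_mask = ~in_mask

print(in_mask.tolist())             # [False, True, True]
print(not_in_mask.tolist())         # [True, False, False]
print(names[not_in_mask].tolist())  # ['sravan']
```

Indexing with the negated mask is all that the NOT IN filter does; the methods below apply the same pattern to DataFrame columns.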

Create a Sample DataFrame

Python3




# import pandas module
import pandas as pd
 
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
 
# display
data1


Output:

Sample DataFrame

Method 1: Use NOT IN Filter in One Column

The isin() method returns a boolean mask that is True for every row whose column value appears in a given list.

Negating that mask with ~ keeps only the rows whose values are not present in the list, which is the NOT IN filter.

Syntax: dataframe[~dataframe[column_name].isin(list)]

where,

  • dataframe is the input dataframe
  • column_name is the column that is filtered
  • list is the list of values to be removed in that column

Example: Using NOT IN filter in one column of a DataFrame.

Python3




# import pandas module
import pandas as pd
 
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
 
# values to exclude from the name column
list1 = ['harsha', 'jyothika']

# keep rows whose name is not in list1
print(data1[~data1['name'].isin(list1)])
print("============")

# values to exclude from the subject1 column
list2 = ['R']

# keep rows whose subject1 is not in list2
print(data1[~data1['subject1'].isin(list2)])
print("============")

# values to exclude from the marks column
list3 = [96, 89]

# keep rows whose marks value is not in list3
print(data1[~data1['marks'].isin(list3)])


Output:

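NOT IN filter with one column: the first filter keeps only the sravan row, the second keeps the sravan and jyothika rows, and the third keeps only the jyothika row.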
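The same single-column filter can also be written with DataFrame.query(), which accepts a `not in` expression. This is an alternative sketch rather than one of the tutorial's methods; the `@` prefix refers to a local Python variable, and the name `result` is used here only for illustration.

```python
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

# values to exclude from the name column
list1 = ['harsha', 'jyothika']

# "not in" inside query() is equivalent to ~data1['name'].isin(list1)
result = data1.query("name not in @list1")
print(result)
```

Whether query() or the ~isin() form reads better is largely a matter of taste; both select the same rows here.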

Method 2: Use NOT IN Filter in Multiple Columns

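To filter on more than one column, select the columns by passing a list of names inside the indexing brackets (for example dataframe[['col1', 'col2']]), call isin() on the result, and reduce the boolean DataFrame with any(axis=1). A row is marked True if any of the selected columns holds a value from the list, and ~ then removes those rows. Note that the same list is checked against every selected column.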

Syntax: dataframe[~dataframe[[columns]].isin(list).any(axis=1)]

Example: Using NOT IN filter in multiple columns of the DataFrame.

Python3




# import pandas module
import pandas as pd
 
# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])
 
# consider a list
list1 = ['harsha', 'jyothika', 96]
 
# filter in name and marks column
print(data1[~data1[['name', 'marks']].isin(list1).any(axis=1)])
print("============")
 
# consider a list
list2 = ['R', 'sravan']
 
# filter in name and subject1 column
print(data1[~data1[['subject1', 'name']].isin(list2).any(axis=1)])


Output:

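NOT IN filter with multiple columns: the first filter returns an empty DataFrame because every row matches the list in either name or marks, and the second keeps only the jyothika row.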
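When each column needs its own exclusion list, DataFrame.isin() also accepts a dictionary that maps column names to values. The sketch below is an illustrative variation on the example above; the dictionary `excluded` and its values are assumptions made for this example.

```python
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

# per-column exclusion lists (illustrative values)
excluded = {'name': ['harsha'], 'marks': [96]}

# with a dict, each column is compared only against its own list
mask = data1[['name', 'marks']].isin(excluded)

# drop rows where any selected column matched its list
result = data1[~mask.any(axis=1)]
print(result)
```

Here the sravan row is dropped because of its marks and the harsha row because of its name, so only the jyothika row remains. Using a dictionary avoids accidental matches between a value and a column it was never meant for.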

Method 3: Use Numpy with NOT IN filter

NumPy's isin() function builds the same kind of boolean mask as the pandas isin() method, so it can be negated with ~ in the same way.

Syntax: dataframe[~numpy.isin(dataframe['column'], list)]

Example: Using the NOT IN filter with NumPy's isin() function

Python3




# import numpy and pandas modules
import numpy as np
import pandas as pd

# create dataframe
data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

# values to exclude from the name column
list1 = ['harsha', 'jyothika']

# keep rows whose name is not in list1
print(data1[~np.isin(data1['name'], list1)])


Output:

NumPy with NOT IN filter: only the sravan row remains.
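np.isin() also has an `invert` parameter that returns the negated mask directly, which removes the need for ~. The following is a small sketch of that variant; the variable name `keep` is chosen here for illustration.

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame({'name': ['sravan', 'harsha', 'jyothika'],
                      'subject1': ['python', 'R', 'php'],
                      'marks': [96, 89, 90]}, index=[0, 1, 2])

# values to exclude from the name column
list1 = ['harsha', 'jyothika']

# invert=True yields True where the value is NOT in list1
keep = np.isin(data1['name'], list1, invert=True)
print(data1[keep])
```

The result is the same as negating the mask with ~; `invert=True` simply states the intent in the function call.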

Conclusion

The NOT IN filter keeps the rows of a DataFrame whose values do not appear in a given list. It is useful for finding records that are missing from a reference list, conditional data handling, data cleaning, etc.

In this tutorial, we have covered how to use the NOT IN filter in a Pandas DataFrame. We have seen how to negate isin() with the ~ operator on a single column, how to combine isin() with any() to filter on multiple columns, and how to build the same filter with NumPy's isin() function.
