Python | Pandas Index.get_duplicates()

Last Updated : 17 Dec, 2018

Python is well suited to data analysis, largely because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

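The Pandas Index.get_duplicates() function extracts the duplicated index elements. It returns a sorted list of the index elements that appear more than once in the Index. Note that this method was deprecated in pandas 0.23.0 and removed in pandas 1.0; the deprecation notice recommends idx[idx.duplicated()].unique() as the replacement.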

Syntax: Index.get_duplicates()

Returns : Sorted list of the duplicated index elements.

Example #1: Use Index.get_duplicates() function to find all the duplicate values in the Index.

# importing pandas as pd
import pandas as pd

# Creating the Index
idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])

# Print the Index
print(idx)

Output :

Let’s find all the duplicate values in the Index.

# Print the duplicated values (get_duplicates() requires pandas older than 1.0)
print(idx.get_duplicates())

Output :

As the output shows, the Index.get_duplicates() function returns every value that occurs more than once in the Index.
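On pandas 1.0 and later, where Index.get_duplicates() is no longer available, the same result can be obtained from Index.duplicated(). The following is a minimal sketch rather than a drop-in replacement: the variable names dups and dups_sorted are illustrative, and the explicit sorted() call is an assumption made here to mimic the sorted list described above.

```python
import pandas as pd

# Same Index as in Example #1
idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])

# duplicated() marks every repeat after the first occurrence;
# unique() collapses values that repeat more than twice
dups = idx[idx.duplicated()].unique()

# Sort into a plain Python list, like the old get_duplicates() result
dups_sorted = sorted(dups)
print(dups_sorted)
```

Without the sorted() call, the values come back as an Index in the order in which their repeats appear, not in sorted order.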

Example #2: Use Index.get_duplicates() function to find all the duplicate values in an Index that also contains missing values (None).

# importing pandas as pd
import pandas as pd

# Creating the Index
idx = pd.Index(['Labrador', 'Beagle', None, 'Labrador',
                'Lhasa', 'Husky', 'Beagle', None, 'Koala'])

# Print the Index
print(idx)

Output :

As the output shows, the Index contains some missing values. Let's see how the Index.get_duplicates() function treats them.

# Print the duplicate values in the Index (get_duplicates() requires pandas older than 1.0)
print(idx.get_duplicates())

Output :

Because the missing value occurs more than once, it is also treated as a duplicate.
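The same behaviour can be checked on current pandas versions with Index.duplicated(), which also counts repeated missing values as duplicates. This is a hedged sketch: the names dups and named_dups are illustrative, and the printed form of the missing value (None or NaN) depends on the pandas version and the inferred dtype, so no output is shown.

```python
import pandas as pd

# Same Index as in Example #2, with two missing values
idx = pd.Index(['Labrador', 'Beagle', None, 'Labrador',
                'Lhasa', 'Husky', 'Beagle', None, 'Koala'])

# The second None is flagged as a duplicate of the first
dups = idx[idx.duplicated()].unique()
print(dups)

# Drop the missing entry before sorting, since a missing value
# cannot be ordered against strings
named_dups = sorted(dups.dropna())
print(named_dups)
```

Dropping the missing entry first is a design choice for this sketch; keep it in dups if the presence of repeated missing values matters to the analysis.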
