Pandas – Find unique values from multiple columns
  • Last Updated : 16 Mar, 2021

Prerequisite: Pandas 

In this article, we will discuss various methods to obtain unique values from multiple columns of a Pandas DataFrame.

Method 1: Using pandas unique() and concat() methods

A pandas Series (a single DataFrame column) has a unique() method that returns the distinct values of that column in order of first appearance. To extend this to several columns, we stack the desired columns into one Series with the pandas concat() function and then call unique() on the result. In the example, the first print statement shows only the unique first names, and the second shows the unique values across all three columns.

Python3

import pandas as pd
import numpy as np
  
# Creating a custom dataframe.
df = pd.DataFrame({'FirstName': ['Arun', 'Navneet', 'Shilpa',
                                 'Prateek', 'Pyare', 'Prateek'],
                     
                   'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                     
                   'Age': [26, 25, 25, 27, 28, 30]})
  
# To get unique values in one Series/column
print(f"Unique FN: {df['FirstName'].unique()}")
  
# Extending the idea from one column to multiple columns
print(f"Unique Values from 3 Columns:\
{pd.concat([df['FirstName'], df['LastName'], df['Age']]).unique()}")

Output:

Unique FN: ['Arun' 'Navneet' 'Shilpa' 'Prateek' 'Pyare']
Unique Values from 3 Columns:['Arun' 'Navneet' 'Shilpa' 'Prateek' 'Pyare' 'Singh' 'Yadav' 'Shukla'
 'Lal' 'Mishra' 26 25 27 28 30]
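
The concat-then-unique idea generalizes to any list of columns. The following is a minimal sketch, assuming a hypothetical helper named unique_across (it is not part of pandas) that stacks the chosen columns and keeps the values in order of first appearance:

```python
import pandas as pd


def unique_across(frame, columns):
    """Hypothetical helper: unique values across the given columns."""
    # Stack the selected columns into one Series, then deduplicate it.
    stacked = pd.concat([frame[col] for col in columns], ignore_index=True)
    return stacked.unique()


df = pd.DataFrame({'FirstName': ['Arun', 'Navneet', 'Shilpa',
                                 'Prateek', 'Pyare', 'Prateek'],
                   'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                   'Age': [26, 25, 25, 27, 28, 30]})

# Unique values across the two name columns only.
names = unique_across(df, ['FirstName', 'LastName'])
print(names)

# Unique values across all three columns (str and int mixed).
everything = unique_across(df, ['FirstName', 'LastName', 'Age'])
print(everything)
```

Passing the column names as a list avoids repeating df[...] for every column and makes it easy to change which columns are combined.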

Method 2: Using the NumPy unique() function

The np.unique() function returns the sorted unique values of the array passed to it. To use it on several columns, we select those columns and pass their underlying values to np.unique().

Note: This approach has one limitation. Because np.unique() sorts the values, it fails on an array that mixes str and numeric values, since Python cannot order a str against an int. If you need to combine columns of different datatypes, use Method 1 instead.

Python3

import pandas as pd
import numpy as np
  
# Creating a custom dataframe.
df = pd.DataFrame({'FirstName': ['Arun', 'Navneet', 'Shilpa',
                                 'Prateek', 'Pyare', 'Prateek'],
                     
                   'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                     
                   'Age': [26, 25, 25, 27, 28, 30]})
  
print(np.unique(df[['LastName', 'FirstName']].values))
  
# Raises TypeError: Age is numeric and LastName is str,
# and np.unique() cannot sort mixed types
# print(np.unique(df[['LastName','Age']].values))

Output:

['Arun' 'Lal' 'Mishra' 'Navneet' 'Prateek' 'Pyare' 'Shilpa' 'Shukla'
 'Singh' 'Yadav']
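
The limitation noted above can be seen directly, and one possible workaround (an assumption of this sketch, not something the method requires) is to cast every value to str before calling np.unique(). The ages then come back as strings rather than numbers:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                   'Age': [26, 25, 25, 27, 28, 30]})

# Object array holding both str and int values.
mixed = df[['LastName', 'Age']].values

try:
    np.unique(mixed)
except TypeError as exc:
    # Sorting a str against an int is not supported.
    print(f"np.unique failed: {exc}")

# Assumed workaround: convert every value to str first,
# so that all elements can be compared with each other.
as_text = np.unique(df[['LastName', 'Age']].astype(str).values)
print(as_text)
```

Whether losing the numeric type is acceptable depends on what the unique values are used for; if the original types matter, Method 1 is the safer choice.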

Method 3: Using Sets in Python

A Python set holds only unique values, so we can convert each Series into a set and then take the union of those sets. Unlike Method 2, this works for any combination of datatypes. Note that a set is unordered, so the values may be printed in a different order on each run.

Python3

import pandas as pd
import numpy as np
  
  
# Creating a custom dataframe.
df = pd.DataFrame({'FirstName': ['Arun', 'Navneet', 'Shilpa',
                                 'Prateek', 'Pyare', 'Prateek'],
                     
                   'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                     
                   'Age': [26, 25, 25, 27, 28, 30]})
  
# Typecasting pandas series into set and then 
# taking set union (|)
print(set(df.FirstName) | set(df.LastName) | set(df.Age))

Output:

{'Singh', 'Pyare', 'Mishra', 27, 'Navneet', 'Arun', 'Lal', 'Shukla', 30, 25, 26, 'Yadav', 28, 'Shilpa', 'Prateek'}
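
For more than a few columns, chaining the | operator becomes repetitive. A minimal sketch, assuming the column names are held in a list called columns (a name chosen here for illustration), uses set().union() instead and sorts the result by its string form to get a repeatable order:

```python
import pandas as pd

df = pd.DataFrame({'FirstName': ['Arun', 'Navneet', 'Shilpa',
                                 'Prateek', 'Pyare', 'Prateek'],
                   'LastName': ['Singh', 'Yadav', 'Yadav', 'Shukla',
                                'Lal', 'Mishra'],
                   'Age': [26, 25, 25, 27, 28, 30]})

columns = ['FirstName', 'LastName', 'Age']

# set().union() accepts any number of iterables, so each column
# can be passed in directly without converting it to a set first.
unique_values = set().union(*(df[col] for col in columns))

# A set has no defined order. Sorting with key=str compares the
# string form of each value, so str and int can be ordered together.
ordered = sorted(unique_values, key=str)
print(ordered)
```

Sorting by key=str is only one way to get a stable order; it places the ages before the names because digits sort before letters.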
