How to Count Distinct Values of a Pandas Dataframe Column?

Last Updated : 01 Dec, 2023

Pandas offers several ways to count the distinct values in a DataFrame column. This article walks through four of them using a small example DataFrame.

Creating the Pandas Dataframe for a Reference

Consider the table below, which will be created as a DataFrame. The columns are height, weight, and age, and each of the 8 rows is the record of one student.

	height	weight	age
Steve	165	63.5	20
Ria	165	64	22
Nivi	164	63.5	22
Jane	158	54	21
Kate	167	63.5	23
Lucy	160	62	22
Ram	158	64	20
Niki	165	64	21

The first step is to create the DataFrame for the table above:

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({
    'height': [165, 165, 164,
               158, 167, 160,
               158, 165],
 
    'weight': [63.5, 64, 63.5,
               54, 63.5, 62,
               64, 64],
 
    'age': [20, 22, 22,
            21, 23, 22,
            20, 21]},
 
    index=['Steve', 'Ria', 'Nivi',
           'Jane', 'Kate', 'Lucy',
           'Ram', 'Niki'])
 
# show the Dataframe
print(df)

Output

       height  weight  age
Steve     165    63.5   20
Ria       165    64.0   22
Nivi      164    63.5   22
Jane      158    54.0   21
Kate      167    63.5   23
Lucy      160    62.0   22
Ram       158    64.0   20
Niki      165    64.0   21

Count Distinct Values of a Pandas Dataframe Column

Below are the ways by which we can count distinct values of a Pandas Dataframe column:

Using pandas.unique()
Using Dataframe.nunique()
Using Series.value_counts()
Using a loop

Count Distinct Values of a Column Using unique()

In this example, we take the ‘height’ column of df and pass it to pd.unique(), which returns an array of the column's distinct values. Applying len() to that array gives the number of unique height values, which is then printed.

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({ 
  'height' : [165, 165, 164, 
              158, 167, 160,
              158, 165],
   
  'weight' : [63.5, 64, 63.5,
              54, 63.5, 62,
              64, 64],
   
  'age' : [20, 22, 22, 
           21, 23, 22,
           20, 21]},
   
   index = ['Steve', 'Ria', 'Nivi', 
            'Jane', 'Kate', 'Lucy',
            'Ram', 'Niki'])
 
# counting unique values
n = len(pd.unique(df['height']))
 
print("No.of.unique values :", 
      n)

Output

No.of.unique values : 5
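
As a small variation, the same count can be obtained with the Series.unique() method instead of the top-level pd.unique() function. The sketch below assumes the ‘height’ data from the table above, held in a standalone Series for brevity; the names height, distinct_heights and n_unique are only illustrative.

```python
import pandas as pd

# Same 'height' data as in the table above, held in a Series for brevity
height = pd.Series([165, 165, 164, 158, 167, 160, 158, 165], name="height")

# Series.unique() returns the distinct values in order of first appearance
distinct_heights = height.unique()

# The length of that array is the number of distinct values
n_unique = len(distinct_heights)

print("unique values :", distinct_heights.tolist())
print("No.of.unique values :", n_unique)
```

This form is handy when the distinct values themselves are needed in addition to their count.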

Pandas Count Distinct Values Using Dataframe.nunique()

In this example, we call nunique() on the whole DataFrame with axis=0 (the default). This counts the distinct values in each column and returns the counts as a Series indexed by column name.

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({
    'height': [165, 165, 164,
               158, 167, 160,
               158, 165],
 
    'weight': [63.5, 64, 63.5,
               54, 63.5, 62,
               64, 64],
 
    'age': [20, 22, 22,
            21, 23, 22,
            20, 21]},
 
    index=['Steve', 'Ria', 'Nivi',
           'Jane', 'Kate', 'Lucy',
           'Ram', 'Niki'])
 
# count the distinct values
# in each column
n = df.nunique(axis=0)
 
print("No.of.unique values in each column :\n",
      n)

Output

No.of.unique values in each column :
height    5
weight    4
age       4
dtype: int64

To count the distinct values of a single column, select that column and call nunique() on it. In this example, df.height.nunique() returns the number of unique values in the ‘height’ column as a single integer.

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({
    'height': [165, 165, 164,
               158, 167, 160,
               158, 165],
 
    'weight': [63.5, 64, 63.5,
               54, 63.5, 62,
               64, 64],
 
    'age': [20, 22, 22,
            21, 23, 22,
            20, 21]},
 
    index=['Steve', 'Ria', 'Nivi',
           'Jane', 'Kate', 'Lucy',
           'Ram', 'Niki'])
 
# count no. of unique
# values in height column
n = df.height.nunique()
 
print("No.of.unique values in height column :",
      n)

Output

No.of.unique values in height column : 5
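
One detail worth knowing when choosing between these methods is how missing values are treated. By default nunique() ignores NaN, whereas pd.unique() keeps NaN as one of the returned values. The sketch below uses a hypothetical variant of the ‘height’ data with one value replaced by NaN; it is not part of the example DataFrame above.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the 'height' data with one missing value
height = pd.Series([165, 165, 164, np.nan, 167, 160, 158, 165])

# Default behaviour: NaN is not counted
n_default = height.nunique()

# dropna=False counts NaN as one additional distinct value
n_with_nan = height.nunique(dropna=False)

# pd.unique() keeps NaN in its result, so len() includes it
n_via_unique = len(pd.unique(height))

print("nunique()             :", n_default)
print("nunique(dropna=False) :", n_with_nan)
print("len(pd.unique())      :", n_via_unique)
```

If a column may contain missing values, it helps to decide explicitly whether NaN should count as a distinct value before comparing results from different methods.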

Count Distinct Values of Column Using Series.value_counts()

In this example, we call value_counts() on the ‘height’ column. It returns a Series with one entry per distinct value, holding the number of times that value occurs. Converting the result to a list and taking its length therefore gives the number of unique values.

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({
    'height': [165, 165, 164,
               158, 167, 160,
               158, 165],
 
    'weight': [63.5, 64, 63.5,
               54, 63.5, 62,
               64, 64],
 
    'age': [20, 22, 22,
            21, 23, 22,
            20, 21]},
 
    index=['Steve', 'Ria', 'Nivi',
           'Jane', 'Kate', 'Lucy',
           'Ram', 'Niki'])
 
 
# getting the list of occurrence counts,
# one entry per distinct value
li = list(df.height.value_counts())
 
# the length of the list is the
# number of unique values
print("No.of.unique values :",
      len(li))

Output

No.of.unique values : 5
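
value_counts() is most useful when the frequency of each distinct value matters as well as the number of distinct values. The following is a minimal sketch on the same ‘height’ data, held in a standalone Series; the variable names are illustrative.

```python
import pandas as pd

# Same 'height' data as in the table above
height = pd.Series([165, 165, 164, 158, 167, 160, 158, 165], name="height")

# One entry per distinct value; the value is how often it occurs
counts = height.value_counts()

# The number of entries equals the number of distinct values
n_unique = len(counts)

print(counts)
print("No.of.unique values :", n_unique)

# Look up the frequency of a particular value by label
print("Occurrences of 165 :", counts.loc[165])
```

Like nunique(), value_counts() leaves out NaN by default, so the two agree on columns with missing values unless dropna=False is passed.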

Pandas Count Unique Values Using for loop

Once the DataFrame has been created, the number of unique values in a column can also be counted by hand with a for loop. For example, to count the unique values in the ‘height’ column, use a variable cnt to store the count and a list visited to hold the values seen so far. The for loop iterates through the ‘height’ column and, for each value, checks whether it is already in visited. If it is not, the value is appended to visited and cnt is incremented by 1.

Python3

# import library
import pandas as pd
 
# create a Dataframe
df = pd.DataFrame({
    'height': [165, 165, 164,
               158, 167, 160,
               158, 165],
 
    'weight': [63.5, 64, 63.5,
               54, 63.5, 62,
               64, 64],
 
    'age': [20, 22, 22,
            21, 23, 22,
            20, 21]},
 
    index=['Steve', 'Ria', 'Nivi',
           'Jane', 'Kate', 'Lucy',
           'Ram', 'Niki'])
 
# variable to hold the count
cnt = 0
 
# list to hold visited values
visited = []
 
# loop over the values in the height
# column and count each value the
# first time it appears
for value in df['height']:
 
    if value not in visited:
 
        visited.append(value)
 
        cnt += 1
 
print("No.of.unique values :",
      cnt)
 
print("unique values :",
      visited)

Output

No.of.unique values : 5
unique values : [165, 164, 158, 167, 160]
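
Checking membership in a list gets slower as the list grows, so for larger columns a set is a common choice for tracking the values already seen. The sketch below is one possible variation on the loop approach, not a replacement prescribed by pandas; the names seen and ordered_unique are illustrative.

```python
import pandas as pd

# Same 'height' data and index as in the table above
height = pd.Series(
    [165, 165, 164, 158, 167, 160, 158, 165],
    index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'],
    name="height",
)

seen = set()          # fast membership checks
ordered_unique = []   # distinct values in order of first appearance

# iterate over the values directly rather than by position
for value in height:
    if value not in seen:
        seen.add(value)
        ordered_unique.append(value)

cnt = len(ordered_unique)

print("No.of.unique values :", cnt)
print("unique values :", ordered_unique)
```

In practice the built-in nunique() is usually simpler and faster; a manual loop is mainly useful for understanding what the count means.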
