Count Unique Values by Group in R

• Last Updated : 05 Apr, 2021

In this article, we discuss how to count the number of unique values by group in the R programming language. Let's work through the following example.

Suppose you have a dataset with multiple columns like this:

In this dummy dataset, class, age, and age_group are the column names, and our task is to count the number of unique age values within each age_group.

The resulting dataset should look like this:

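Method 1: Using the aggregate() function

The aggregate() function splits the data into groups, applies an operation to the rows of each group, and returns a single summary value per group. In the formula age ~ age_group, the column on the left is the one being summarised and the column on the right defines the groups.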

Example:

```r
# Count unique values by group

# Creating the dataset
# creating class column
x <- c("A", "B", "C", "B", "A", "A", "C", "A", "B", "C", "A")

# creating age column
y <- c(20, 15, 45, 14, 21, 22, 47, 18, 16, 50, 23)

# creating age_group column
z <- c("YOUNG", "KID", "OLD", "KID", "YOUNG", "YOUNG",
       "OLD", "YOUNG", "KID", "OLD", "YOUNG")

# creating dataframe
df <- data.frame(class = x, age = y, age_group = z)
df

# applying aggregate function: count the distinct ages in each age_group
aggregate(age ~ age_group, df, function(x) length(unique(x)))
```

Output:

Output 1.
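As a cross-check on the grouping logic, the same count can be sketched with base R's tapply(), which splits one vector by another and applies a function to each piece. This is only an illustrative alternative, not part of the method above; the variable name unique_ages is an assumption introduced here.

```r
# Same dummy data as in the example above
df <- data.frame(
  class = c("A", "B", "C", "B", "A", "A", "C", "A", "B", "C", "A"),
  age = c(20, 15, 45, 14, 21, 22, 47, 18, 16, 50, 23),
  age_group = c("YOUNG", "KID", "OLD", "KID", "YOUNG", "YOUNG",
                "OLD", "YOUNG", "KID", "OLD", "YOUNG")
)

# Split df$age by df$age_group and count the distinct values in each piece.
# unique_ages is a hypothetical name used only for this sketch.
unique_ages <- tapply(df$age, df$age_group, function(x) length(unique(x)))
print(unique_ages)
# Counts per group: KID = 3, OLD = 3, YOUNG = 5
```

Unlike aggregate(), which returns a data frame, tapply() returns a named array, which can be handy when the counts are needed for a quick lookup rather than further data-frame operations.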

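Method 2: Using the dplyr package and the group_by() function

dplyr is one of the most widely used R packages for data wrangling and provides a set of tools for data manipulation. Here, group_by() defines the groups, and summarise() with n_distinct() counts the distinct values within each group.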

Example:

```r
# Count unique values by group

# loading dplyr
library("dplyr")

# Creating the dataset
# creating class column
x <- c("A", "B", "C", "B", "A", "A", "C", "A", "B", "C", "A")

# creating age column
y <- c(20, 15, 45, 14, 21, 22, 47, 18, 16, 50, 23)

# creating age_group column
z <- c("YOUNG", "KID", "OLD", "KID", "YOUNG", "YOUNG",
       "OLD", "YOUNG", "KID", "OLD", "YOUNG")

# creating dataframe
df <- data.frame(class = x, age = y, age_group = z)

# grouping by the age_group column and
# counting the unique age values in each group
df %>%
  group_by(age_group) %>%
  summarise(n_distinct(age))
```

Output:

Output 2.
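In the dummy dataset every age happens to be different, so the unique count equals the row count in each group. The sketch below makes the difference visible by appending one hypothetical row that repeats an existing age; that extra row, the name df_dup, and the output column names rows and unique_ages are assumptions added for illustration and are not part of the original data.

```r
library(dplyr)

# Same dummy data as in the example above
df <- data.frame(
  class = c("A", "B", "C", "B", "A", "A", "C", "A", "B", "C", "A"),
  age = c(20, 15, 45, 14, 21, 22, 47, 18, 16, 50, 23),
  age_group = c("YOUNG", "KID", "OLD", "KID", "YOUNG", "YOUNG",
                "OLD", "YOUNG", "KID", "OLD", "YOUNG")
)

# Hypothetical extra row (not in the original data) that repeats age 20
df_dup <- rbind(df, data.frame(class = "B", age = 20, age_group = "YOUNG"))

# n() counts rows per group; n_distinct() counts distinct ages per group
result <- df_dup %>%
  group_by(age_group) %>%
  summarise(
    rows = n(),
    unique_ages = n_distinct(age)
  )
print(result)
# YOUNG now has 6 rows but only 5 distinct ages; KID and OLD stay at 3 and 3
```

Naming the summary columns inside summarise() also gives the result readable headers instead of the default n_distinct(age).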
