How to Calculate the Mean by Group in an R DataFrame?

  • Last Updated : 01 Apr, 2021

In this article, we are going to see how to calculate the mean by group in a DataFrame in the R programming language.

It can be done with two approaches:

  • Using the aggregate() function
  • Using the dplyr package

Dataset creation: First, we create a dataset so that we can later apply both approaches and find the mean by group.


# Create the GFG dataset
GFG <- data.frame(
   Category  = c("A","B","C","B","C","A","C","A","B"),
   Frequency = c(9,5,0,2,7,8,1,3,7)
)
# Print the dataset
print(GFG)

The code above creates a data frame named “GFG” with two columns, Category and Frequency. Running it in R shows the following table (Table 1):

  Category Frequency
1        A         9
2        B         5
3        C         0
4        B         2
5        C         7
6        A         8
7        C         1
8        A         3
9        B         7

After applying either approach, we expect the mean of Frequency for each category (rounded to two decimals):

Category  Mean
A         6.67
B         4.67
C         2.67

Before we discuss the approaches, let us first see how these output values are obtained:

  • In Table 1, we have two columns named Category and Frequency.
  • The Category column contains the repeating values A, B and C.
  • Taken from the Frequency column, group A has the values (9, 8, 3), group B has (5, 2, 7) and group C has (0, 7, 1).
  • To find the mean, we use the formula

MEAN = Sum of terms / Number of terms

  • Hence, the mean of each group (A, B, C) is computed as follows.

Sum of terms:

  • A = 9 + 8 + 3 = 20
  • B = 5 + 2 + 7 = 14
  • C = 0 + 7 + 1 = 8

Number of terms:

  • A appears 3 times
  • B appears 3 times
  • C appears 3 times

Mean by group (A, B, C):

  • A(mean) = Sum/Number of terms = 20/3 = 6.67
  • B(mean) = Sum/Number of terms = 14/3 = 4.67
  • C(mean) = Sum/Number of terms = 8/3 = 2.67
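
The hand calculation above can be checked in R. The following is a minimal sketch that rebuilds the example dataset and uses base R's tapply() to compute the sum, the count and the mean per category. The variable names sums, counts and means are chosen here for illustration only.

```r
# Rebuild the example dataset
GFG <- data.frame(
  Category  = c("A","B","C","B","C","A","C","A","B"),
  Frequency = c(9,5,0,2,7,8,1,3,7)
)

# Sum of terms per group
sums <- tapply(GFG$Frequency, GFG$Category, sum)

# Number of terms per group
counts <- tapply(GFG$Frequency, GFG$Category, length)

# Mean = sum of terms / number of terms
means <- sums / counts

# Show the means rounded to two decimals
print(round(means, 2))
```

The rounded means should agree with the values worked out by hand for A, B and C.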

Method 1: Using aggregate function

Aggregate function: Splits the data into subsets, computes summary statistics for each, and returns the result in a convenient form.

Syntax: aggregate(x = dataset_Name , by = group_list, FUN = any_function) 

Now, let’s compute the mean of our data by group using the aggregate() function:


GFG <- data.frame(
   Category  = c("A","B","C","B","C","A","C","A","B"),
   Frequency = c(9,5,0,2,7,8,1,3,7)
)
# Specify data column
aggregate(x = GFG$Frequency,
          # Specify group indicator
          by = list(GFG$Category),
          # Specify function (i.e. mean)
          FUN = mean)


The aggregate() call above takes three arguments:

  • x is the data to summarise, in our case the Frequency column of “GFG”.
  • by is a list of grouping elements, in our case the Category column, which splits the data into three groups (A, B, C).
  • FUN is the function (e.g. mean, sum) to apply to each group (A, B, C).
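
As an alternative, aggregate() also accepts a formula. The sketch below is not part of the method described above; it shows the same computation written as Frequency ~ Category, which keeps the original column names in the result, whereas the x/by form labels its output columns Group.1 and x. The name result is chosen here for illustration.

```r
# Rebuild the example dataset
GFG <- data.frame(
  Category  = c("A","B","C","B","C","A","C","A","B"),
  Frequency = c(9,5,0,2,7,8,1,3,7)
)

# Formula interface: summarise Frequency, grouped by Category
result <- aggregate(Frequency ~ Category, data = GFG, FUN = mean)

print(result)
```

The formula form tends to be easier to read when the grouping and value columns live in the same data frame.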

Method 2: Using dplyr Package

dplyr is a package that provides a set of tools for efficiently manipulating datasets in R.

Commonly used functions in the dplyr package:

  • mutate() adds new variables that are functions of existing variables.
  • select() picks variables based on their names.
  • filter() picks cases based on their values.
  • summarise() reduces multiple values down to a single summary.
  • arrange() changes the ordering of the rows.

Install this library:

install.packages("dplyr")

Load this library:

library(dplyr)


# Load the dplyr library
library(dplyr)
GFG <- data.frame(
   Category  = c("A","B","C","B","C","A","C","A","B"),
   Frequency = c(9,5,0,2,7,8,1,3,7)
)
# Specify data frame
GFG %>%
  # Specify group indicator, column, function
  group_by(Category) %>%
  summarise_at(vars(Frequency), list(name = mean))


In the above code, we first take our dataset named “GFG”. With group_by() we form the groups, in our case (A, B, C). summarise_at() then takes two arguments: the first is the column to operate on, and the second is the function to apply to that column.
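
For a single column, the same result can also be obtained with plain summarise(), which may be preferable because recent dplyr releases (1.0.0 and later) mark summarise_at() as superseded. The following is a minimal sketch, assuming dplyr is installed; the output column name mean_frequency is chosen here for illustration.

```r
# Load the dplyr library (assumes it is already installed)
library(dplyr)

# Rebuild the example dataset
GFG <- data.frame(
  Category  = c("A","B","C","B","C","A","C","A","B"),
  Frequency = c(9,5,0,2,7,8,1,3,7)
)

# Group by Category, then reduce each group to its mean Frequency
result <- GFG %>%
  group_by(Category) %>%
  summarise(mean_frequency = mean(Frequency))

print(result)
```

Naming the summary column explicitly avoids the generic column name produced by list(name = mean).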
