How to Calculate the Sum by Group in R?
This article shows how to calculate the sum of a column by group in the R programming language, using base R, dplyr, and data.table.
Data for Demonstration
R
# creating data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy', 'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# view dataframe
df
Output:
Sub   Marks  Add_on
Math      8       3
Math      2       1
Phy       4       9
Phy       9       4
Phy       9       7
Che       7       8
Che       1       2
Method 1: Using aggregate() method in Base R
The aggregate() function computes summary statistics of the data by group, such as mean, min, max, and sum.
Syntax: aggregate(dataframe$aggregate_column, list(dataframe$group_column), FUN)
where
- dataframe is the input dataframe.
- aggregate_column is the column to be aggregated in the dataframe.
- group_column is the column whose values define the groups.
- FUN is the function applied to each group, such as sum, mean, min, or max.
R
# creating data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy', 'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# sum of Marks for each Sub
aggregate(df$Marks, list(df$Sub), FUN = sum)

# sum of Add_on for each Sub
aggregate(df$Add_on, list(df$Sub), FUN = sum)
Output:
Group.1   x
Che       8
Math     10
Phy      22

Group.1   x
Che      10
Math      4
Phy      20
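The two calls above can also be written as a single call with the formula interface of aggregate(). The following is a minimal sketch, assuming the same demonstration data; the variable name totals is only illustrative. With this form, the result keeps the original column names instead of Group.1 and x.

```r
# same demonstration data as above
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy', 'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# sum Marks and Add_on for each Sub in one call;
# cbind() on the left-hand side lists the columns to aggregate
totals <- aggregate(cbind(Marks, Add_on) ~ Sub, data = df, FUN = sum)
totals
```

The left-hand side of the formula names the columns to aggregate and the right-hand side names the grouping column, so adding another column only means adding it inside cbind().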
Method 2: Using the dplyr package
Group the data with group_by(), then call summarise() (here its scoped variant summarise_at()) with the function to apply to each group.
R
library(dplyr)

df %>%
  group_by(Sub) %>%
  summarise_at(vars(Marks), list(name = sum))
Output:
Sub   name
Che      8
Math    10
Phy     22
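In dplyr 1.0.0 and later, summarise_at() is superseded by across(). The following sketch shows the same grouping written with across(), assuming dplyr 1.0.0 or newer is installed. Summing both Marks and Add_on is a choice made here for illustration, and totals is an illustrative name.

```r
library(dplyr)

# same demonstration data as above
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy', 'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# group by Sub, then sum each selected column within every group
totals <- df %>%
  group_by(Sub) %>%
  summarise(across(c(Marks, Add_on), sum))
totals
```

The result has one row per subject, with the summed columns keeping their original names.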
Method 3: Using data.table
The data.table package can also calculate the sum of marks for each subject.
R
library(data.table)

# convert data frame to data table
setDT(df)

# find sum of marks by subject
df[, list(sum = sum(Marks)), by = Sub]
Output:
Sub   sum
Math   10
Phy    22
Che     8
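To sum several columns by group in one step, data.table provides the .SD symbol together with the .SDcols argument. The following is a minimal sketch, assuming the data.table package is installed; the names dt and totals are illustrative, and as.data.table() is used instead of setDT() so that the original data frame is left unchanged.

```r
library(data.table)

# same demonstration data as above
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy', 'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# copy into a data.table without modifying df
dt <- as.data.table(df)

# .SD holds the columns named in .SDcols for each group;
# lapply() applies sum() to each of them
totals <- dt[, lapply(.SD, sum), by = Sub, .SDcols = c("Marks", "Add_on")]
totals
```

As in the example above, the groups appear in order of first appearance in the data (Math, Phy, Che) rather than alphabetically.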