How to Aggregate Multiple Columns in R?
In this article, we will discuss how to aggregate multiple columns in the R programming language.
Aggregation means combining multiple rows of data into summary values. Here we use the aggregate() function to compute summary statistics for one or more variables in a data frame, split by one or more grouping variables.
Syntax:
aggregate(sum_column ~ group_column, data, FUN)
where,
- data is the input data frame
- sum_column is the column to be summarized
- group_column is the column to group by
- FUN refers to functions like sum, mean, min, max, etc.
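As a minimal sketch of how the pieces of this syntax fit together, the snippet below applies mean instead of sum. The data frame scores and its columns team and points are hypothetical names made up for illustration; they are not part of the article's data.

```r
# Hypothetical sample data, used only to illustrate the syntax
scores <- data.frame(team = c("a", "a", "b", "b"),
                     points = c(10, 20, 30, 50))

# points is the column being summarized, team is the grouping column,
# and FUN = mean computes the average within each group
avg_points <- aggregate(points ~ team, data = scores, FUN = mean)
print(avg_points)
```

Replacing mean with min, max, or sum changes only the statistic, not the shape of the result: one row per group.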
Example:
Let’s create a dataframe
R
# Create a sample data frame of student marks
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Display the data frame
data
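Output:
  subjects id   names marks
1     java  1   manoj    89
2   python  2     sai    89
3     java  3 mounika    76
4     java  4   durga    89
5      php  5 deepika    90
6      php  6  roshan    67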
Example 1: Summarize One Variable & Group by One Variable
Here, we summarize one variable, grouped by a single variable.
Syntax:
aggregate(sum_column ~ group_column, data, FUN=sum)
In this example, we use the sum function to get the sum of marks for each subject.
R
# Create a sample data frame of student marks
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Sum of marks for each subject
aggregate(marks ~ subjects, data, FUN = sum)
Output:
  subjects marks
1     java   254
2      php   157
3   python    89
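One way to sanity-check a grouped sum is to compute it a second way. The sketch below compares aggregate() with base R's tapply() on the same marks and subjects columns; the variable names by_subject and check are assumptions chosen for this illustration.

```r
# Same subjects and marks as in the article's data frame
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Grouped sum as a data frame
by_subject <- aggregate(marks ~ subjects, data = data, FUN = sum)

# The same grouped sum as a named vector, for cross-checking
check <- tapply(data$marks, data$subjects, sum)

print(by_subject)
print(check)
```

Both calls should report the same totals per subject; aggregate() returns them as a data frame, which is usually easier to merge or plot.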
Example 2: Summarize One Variable & Group by Multiple Variables
Here we summarize one variable, grouped by two or more variables. The + operator on the right-hand side of the formula combines the grouping columns, and the result has one row for each combination of group values that occurs in the data.
Syntax:
aggregate(sum_column ~ group_column1 + group_column2 + … + group_columnN, data, FUN=sum)
In this example, we group by subjects and names to get the sum of marks.
R
# Create a sample data frame of student marks
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Sum of marks for each combination of subject and name
aggregate(marks ~ subjects + names, data, FUN = sum)
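Output:
  subjects   names marks
1      php deepika    90
2     java   durga    89
3     java   manoj    89
4     java mounika    76
5      php  roshan    67
6   python     sai    89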
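Because every name in this data frame is unique, each group above contains a single row. The sketch below adds a hypothetical semester column (an assumption made for illustration, not part of the article's data) so that some subject and semester combinations contain more than one row and the sums actually combine values.

```r
# Article's data plus a hypothetical 'semester' column
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   semester = c(1, 1, 2, 1, 2, 2),
                   marks = c(89, 89, 76, 89, 90, 67))

# One output row per observed (subjects, semester) combination
by_sem <- aggregate(marks ~ subjects + semester, data = data, FUN = sum)
print(by_sem)
```

Only combinations that appear in the data are returned, so a pair such as python in semester 2 produces no row here.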
Example 3: Summarize Multiple Variables & Group by One Variable
Here we summarize two or more variables, grouped by a single variable. The cbind() function (column bind) on the left-hand side of the formula combines the columns to be summarized.
Syntax:
aggregate(cbind(sum_column1, sum_column2, …, sum_columnN) ~ group_column, data, FUN=sum)
In this example, we get the sum of marks and id for each subject.
R
# Create a sample data frame of student marks
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Sum of marks and id for each subject
aggregate(cbind(marks, id) ~ subjects, data, FUN = sum)
Output:
  subjects marks id
1     java   254  8
2      php   157 11
3   python    89  2
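When every remaining column should be summarized, the formula interface also accepts a dot on the left-hand side, which stands for all columns not named on the right. The sketch below compares that form with cbind(); the helper names numeric_cols, by_cbind, and by_dot are assumptions for this illustration. The names column is dropped first because sum cannot be applied to character data.

```r
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Explicit list of columns to summarize
by_cbind <- aggregate(cbind(marks, id) ~ subjects, data = data, FUN = sum)

# Keep only the grouping column and the numeric columns,
# then let '.' stand for "every other column"
numeric_cols <- data[, c("subjects", "marks", "id")]
by_dot <- aggregate(. ~ subjects, data = numeric_cols, FUN = sum)

print(by_cbind)
print(by_dot)
```

The cbind() form is more explicit; the dot form can be convenient for wide data frames where listing every column would be tedious.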
Example 4: Summarize Multiple Variables & Group by Multiple Variables
Here, we summarize two or more variables, grouped by two or more variables. We use cbind() to combine the columns to be summarized and the + operator to combine the grouping columns.
Syntax:
aggregate(cbind(sum_column1, …, sum_columnN) ~ group_column1 + … + group_columnN, data, FUN=sum)
In this example, we get the sum of marks and id, grouped by subjects and names.
R
# Create a sample data frame of student marks
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# Sum of marks and id for each combination of subject and name
aggregate(cbind(marks, id) ~ subjects + names, data, FUN = sum)
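Output:
  subjects   names marks id
1      php deepika    90  5
2     java   durga    89  4
3     java   manoj    89  1
4     java mounika    76  3
5      php  roshan    67  6
6   python     sai    89  2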
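FUN is not limited to built-in functions such as sum. As a rough sketch, the snippet below passes a user-defined function together with cbind() and +. The spread function and the semester column are hypothetical additions made for illustration; they do not appear in the article's data.

```r
# Article's data plus a hypothetical 'semester' column
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   semester = c(1, 1, 2, 1, 2, 2),
                   id = c(1, 2, 3, 4, 5, 6),
                   marks = c(89, 89, 76, 89, 90, 67))

# Hypothetical helper: difference between the largest and smallest value
spread <- function(x) max(x) - min(x)

# Apply the helper to marks and id within each (subjects, semester) group
by_spread <- aggregate(cbind(marks, id) ~ subjects + semester,
                       data = data, FUN = spread)
print(by_spread)
```

Groups that contain a single row get a spread of 0, since the maximum and minimum are the same value.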
Last Updated: 19 Dec, 2021