Mutating column in dplyr using rowSums
Last Updated: 24 Oct, 2021
In this article, we are going to discuss how to add a row-sum column to a dataframe by combining mutate() from the dplyr package with rowSums() in the R Programming Language.
Installation
The package can be downloaded, installed, and loaded into the R workspace using the following commands:
Install Command: install.packages("dplyr")
Load Command: library("dplyr")
Functions Used
- mutate(): The mutate() function in this package adds new variables and preserves existing ones. It does not change the number of rows of the dataframe. Existing groups are kept, although they are recomputed if a grouping variable itself is modified. Dataframe attributes are also preserved.
Syntax:
mutate(new_col_name = rowSums(...))
- rowSums(): The rowSums() function calculates the sum of each row of a numeric array, matrix, or dataframe. To sum only some of the columns, we first select them and pass the result to rowSums(). When a dataframe is built from a matrix without column names, the columns are labeled X1, X2, and so on by default, so we can refer to them by these names. The columns can be selected using the select() function; the older select_() variant is deprecated in current dplyr releases.
Syntax:
select(., col-names...)
Parameters:
- . : the dataframe passed in by the %>% pipe
- col-names : column names in the dataframe
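The two functions above can be combined as in the following minimal sketch. The data frame `scores` and its values are hypothetical, chosen only so that the sums are easy to check by hand; only X1 and X3 are summed, and X2 is left out on purpose.

```r
library(dplyr)

# Hypothetical toy data, chosen so the row sums can be verified by hand
scores <- data.frame(X1 = c(1, 2, 3),
                     X2 = c(10, 20, 30),
                     X3 = c(100, 200, 300))

# `.` is the magrittr placeholder for the data frame piped in by %>%;
# it is not available with the native |> pipe
result <- scores %>%
  mutate(row_sum = rowSums(select(., X1, X3)))

# row_sum holds 1 + 100, 2 + 200 and 3 + 300
print(result)
```

All three original columns are still present in `result`, and the number of rows is unchanged; mutate() only appends the new column.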
Example 1:
In this example, we create a dataframe from a matrix of random values with 3 columns, X1, X2, and X3. All three columns are selected and their row sums are computed. An additional column "row_sum" is appended to the end of the dataframe.
R
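library(dplyr)

# Build a 10 x 3 dataframe of random normals; columns are auto-named X1, X2, X3.
# No seed is set, so the values differ on every run.
data_frame <- data.frame(matrix(rnorm(30), 10, 3),
                         stringsAsFactors = FALSE)
print("Original DataFrame")
print(data_frame)

# Sum X1, X2 and X3 across each row; `.` is the dataframe piped in by %>%.
# select() replaces the deprecated select_().
data_mod <- data_frame %>%
  mutate(row_sum = rowSums(select(., "X1", "X2", "X3")))
print("Modified DataFrame")
print(data_mod)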
Output
[1] "Original DataFrame"
X1 X2 X3
1 -2.1548694 -1.1243811 -1.3944730
2 1.1023396 -2.0153914 -1.6321950
3 -0.2959568 -0.6511423 -0.2601204
4 -0.1503434 -0.3802135 0.5651982
5 0.7330868 1.8792182 0.1205579
6 0.5351399 -0.1250861 -0.4986981
7 -0.4058386 -0.0359763 -0.8261032
8 -1.3560053 -0.2901260 -1.1033241
9 -0.6176755 -0.8223494 0.8507067
10 0.7307755 -1.2664778 1.2097483
[1] "Modified DataFrame"
X1 X2 X3 row_sum
1 -2.1548694 -1.1243811 -1.3944730 -4.67372344
2 1.1023396 -2.0153914 -1.6321950 -2.54524683
3 -0.2959568 -0.6511423 -0.2601204 -1.20721946
4 -0.1503434 -0.3802135 0.5651982 0.03464132
5 0.7330868 1.8792182 0.1205579 2.73286285
6 0.5351399 -0.1250861 -0.4986981 -0.08864431
7 -0.4058386 -0.0359763 -0.8261032 -1.26791811
8 -1.3560053 -0.2901260 -1.1033241 -2.74945549
9 -0.6176755 -0.8223494 0.8507067 -0.58931825
10 0.7307755 -1.2664778 1.2097483 0.67404601
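Because rnorm() draws fresh values on every run, the numbers above will not reproduce exactly. The following sketch fixes a seed (42 is an arbitrary choice, not taken from the original example) and cross-checks the row_sum column against explicit column addition.

```r
library(dplyr)

set.seed(42)  # arbitrary seed, assumed here only to make the run repeatable
data_frame <- data.frame(matrix(rnorm(30), 10, 3))

data_mod <- data_frame %>%
  mutate(row_sum = rowSums(select(., X1, X2, X3)))

# Cross-check against plain column arithmetic
manual_sum <- data_frame$X1 + data_frame$X2 + data_frame$X3
sums_match <- isTRUE(all.equal(unname(data_mod$row_sum), manual_sum))
print(sums_match)
```

unname() is used because rowSums() returns a vector named by row, and all.equal() would otherwise also compare the names.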
Example 2:
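In this example, the row sums of only X1 and X2 are computed. The column names are passed as a character vector through all_of(). All original columns are still returned in the final output, along with the new "row_sum" column.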
R
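library(dplyr)

# Build a 10 x 3 dataframe of random normals; columns are auto-named X1, X2, X3.
# No seed is set, so the values differ on every run.
data_frame <- data.frame(matrix(rnorm(30), 10, 3),
                         stringsAsFactors = FALSE)
print("Original DataFrame")
print(data_frame)

# Sum only X1 and X2 across each row; all_of() takes the names as strings.
# select() has no .dots argument, so the names are passed directly.
data_mod <- data_frame %>%
  mutate(row_sum = rowSums(select(., all_of(c("X1", "X2")))))
print("Modified DataFrame")
print(data_mod)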
Output
[1] "Original DataFrame"
X1 X2 X3
1 -0.01475802 -2.0928792 0.6990158
2 0.09758214 0.9327706 -0.7551849
3 1.73099513 -2.0445329 0.7353809
4 -0.98991323 -0.8638640 0.7545635
5 -0.10079777 -1.0169922 -2.2176920
6 -0.32026943 -0.2890030 1.0493662
7 0.13442533 -2.3674214 0.4975756
8 -1.47351401 -1.1391841 -1.0987409
9 1.05674759 -0.7550495 1.0312730
10 -0.14471879 0.7089866 0.1736686
[1] "Modified DataFrame"
X1 X2 X3 row_sum
1 -0.01475802 -2.0928792 0.6990158 -2.1076372
2 0.09758214 0.9327706 -0.7551849 1.0303527
3 1.73099513 -2.0445329 0.7353809 -0.3135378
4 -0.98991323 -0.8638640 0.7545635 -1.8537772
5 -0.10079777 -1.0169922 -2.2176920 -1.1177900
6 -0.32026943 -0.2890030 1.0493662 -0.6092725
7 0.13442533 -2.3674214 0.4975756 -2.2329960
8 -1.47351401 -1.1391841 -1.0987409 -2.6126981
9 1.05674759 -0.7550495 1.0312730 0.3016981
10 -0.14471879 0.7089866 0.1736686 0.5642678
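Two variations follow naturally from this example: handling missing values and keeping only the columns that were summed. The sketch below is illustrative only; the data frame `df`, the vector `cols`, and the column name `row_sum_no_na` are hypothetical names introduced here.

```r
library(dplyr)

# Hypothetical data with one missing value in X2
df <- data.frame(X1 = c(1, 2, 3),
                 X2 = c(10, NA, 30),
                 X3 = c(100, 200, 300))
cols <- c("X1", "X2")

summed <- df %>%
  mutate(
    # Default behaviour: a row containing NA gets an NA sum
    row_sum = rowSums(select(., all_of(cols))),
    # na.rm = TRUE skips missing values instead
    row_sum_no_na = rowSums(select(., all_of(cols)), na.rm = TRUE)
  ) %>%
  # Keep only the summed columns and the results, dropping X3
  select(all_of(cols), starts_with("row_sum"))

print(summed)
```

The final select() call is what restricts the output to the summed columns; mutate() on its own always keeps every existing column.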