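In this article, we will discuss how to use na.rm in the R programming language. na.rm is a logical argument accepted by many R functions; when it is set to TRUE, NA (missing) values are removed before the result is computed.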
na.rm in vector
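When a vector contains NA values, summary functions such as mean and sum return NA unless those values are excluded. Setting na.rm = TRUE excludes them; the default for these functions is na.rm = FALSE.
Syntax: function(vector, na.rm = TRUE)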
where
- vector is input vector
- na.rm is a logical value: TRUE removes NA values before the calculation, FALSE keeps them
- function is the operation to perform on the vector, such as sum, mean, min, max, etc.
Example 1: In this example, we calculate the mean, sum, minimum, maximum, and standard deviation with the NA values included (na.rm = FALSE).
# create a vector
data <- c(1, 2, 3, NA, 45, 34, NA, NA, 23)
# display
print(data)
# calculate mean including NA values
print(mean(data, na.rm = FALSE))
# calculate sum including NA values
print(sum(data, na.rm = FALSE))
# get minimum including NA values
print(min(data, na.rm = FALSE))
# get maximum including NA values
print(max(data, na.rm = FALSE))
# calculate standard deviation including NA values
print(sd(data, na.rm = FALSE))
Output:
[1]  1  2  3 NA 45 34 NA NA 23
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA
Example 2: In this example, we calculate the mean, sum, minimum, maximum, and standard deviation with the NA values removed (na.rm = TRUE).
# create a vector
data <- c(1, 2, 3, NA, 45, 34, NA, NA, 23)
# display
print(data)
# calculate mean excluding NA values
print(mean(data, na.rm = TRUE))
# calculate sum excluding NA values
print(sum(data, na.rm = TRUE))
# get minimum excluding NA values
print(min(data, na.rm = TRUE))
# get maximum excluding NA values
print(max(data, na.rm = TRUE))
# calculate standard deviation excluding NA values
print(sd(data, na.rm = TRUE))
Output:
[1]  1  2  3 NA 45 34 NA NA 23
[1] 18
[1] 108
[1] 1
[1] 45
[1] 18.86796
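To make clear what na.rm = TRUE does, the short sketch below compares it with dropping the missing values by hand using is.na(). It reuses the vector from the examples above; the variable name clean is only illustrative.

```r
# Same vector as in the examples above
data <- c(1, 2, 3, NA, 45, 34, NA, NA, 23)

# Drop the NA values by hand ("clean" is an illustrative name)
clean <- data[!is.na(data)]

# na.rm = TRUE gives the same result as working on the cleaned vector
print(identical(mean(data, na.rm = TRUE), mean(clean)))
print(identical(sum(data, na.rm = TRUE), sum(clean)))

# The original vector is not modified: it still holds its NA values
print(sum(is.na(data)))
```

Note that na.rm only affects the calculation inside the function call. The vector itself keeps its NA values unless it is reassigned.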
na.rm in dataframe
For a dataframe, we can use the apply function to run an operation on every column and pass the na.rm argument along to it.
Syntax: apply(dataframe, 2, function, na.rm = TRUE)
where
- dataframe is the input dataframe
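- function is the operation to perform, such as mean, min, max, etc.
- 2 is the margin, which means the function is applied to each column
- na.rm is a logical value: TRUE removes NA values before the calculation, FALSE keeps them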
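The na.rm argument is not handled by apply itself: any extra arguments given after the function are passed on to that function for each column. A minimal sketch of this, using the same values as the examples in this section (the names df, forwarded, and explicit are illustrative):

```r
# Dataframe with missing values in every column
df <- data.frame(column1 = c(1, 2, NA, 34),
                 column2 = c(NA, 34, 56, NA),
                 column3 = c(NA, NA, 32, 56))

# na.rm = TRUE is forwarded by apply() to mean() for every column
forwarded <- apply(df, 2, mean, na.rm = TRUE)

# Equivalent call with the argument written out in an anonymous function
explicit <- apply(df, 2, function(col) mean(col, na.rm = TRUE))

print(forwarded)
print(identical(forwarded, explicit))
```

The anonymous-function form is longer, but it can be easier to read when the function needs several extra arguments.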
Example 1: In this example, we calculate the mean, sum, minimum, maximum, and standard deviation of every column with the NA values included (na.rm = FALSE).
# create a dataframe with 3 columns
data <- data.frame(column1 = c(1, 2, NA, 34),
                   column2 = c(NA, 34, 56, NA),
                   column3 = c(NA, NA, 32, 56))
# display
print(data)
# calculate mean including NA values
apply(data, 2, mean, na.rm = FALSE)
# calculate sum including NA values
apply(data, 2, sum, na.rm = FALSE)
# calculate min including NA values
apply(data, 2, min, na.rm = FALSE)
# calculate max including NA values
apply(data, 2, max, na.rm = FALSE)
# calculate standard deviation including NA values
apply(data, 2, sd, na.rm = FALSE)
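Output: print(data) displays the dataframe, and because every column contains at least one NA, each apply call returns NA for column1, column2, and column3.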
Example 2: Excluding NA values
# create a dataframe with 3 columns
data <- data.frame(column1 = c(1, 2, NA, 34),
                   column2 = c(NA, 34, 56, NA),
                   column3 = c(NA, NA, 32, 56))
# display
print(data)
# calculate mean excluding NA values
apply(data, 2, mean, na.rm = TRUE)
# calculate sum excluding NA values
apply(data, 2, sum, na.rm = TRUE)
# calculate min excluding NA values
apply(data, 2, min, na.rm = TRUE)
# calculate max excluding NA values
apply(data, 2, max, na.rm = TRUE)
# calculate standard deviation excluding NA values
apply(data, 2, sd, na.rm = TRUE)
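Output: with the NA values removed, the results for column1, column2, and column3 are: mean 12.33333, 45, 44; sum 37, 90, 88; minimum 1, 34, 32; maximum 34, 56, 56; standard deviation 18.77054, 15.55635, 16.97056.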
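For column means and sums specifically, base R also provides colMeans() and colSums(), which accept na.rm directly. The sketch below is an alternative to the apply calls above, not a replacement for them, and checks that both approaches agree on the same dataframe (the names df, col_means, and col_sums are illustrative).

```r
# Same dataframe values as in the examples above
df <- data.frame(column1 = c(1, 2, NA, 34),
                 column2 = c(NA, 34, 56, NA),
                 column3 = c(NA, NA, 32, 56))

# Column-wise mean and sum with NA values removed
col_means <- colMeans(df, na.rm = TRUE)
col_sums <- colSums(df, na.rm = TRUE)

print(col_means)
print(col_sums)

# Both approaches should agree with the apply() versions
print(isTRUE(all.equal(col_means, apply(df, 2, mean, na.rm = TRUE))))
print(isTRUE(all.equal(col_sums, apply(df, 2, sum, na.rm = TRUE))))
```

There is no built-in column-wise counterpart for min, max, or sd in base R, so apply remains the natural choice for those.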