Root-Mean-Square Error in R Programming

Root mean square error (RMSE) is the square root of the mean of the squared errors. RMSE is widely used as a general-purpose error metric for numerical predictions. It is a good measure of accuracy, but only for comparing the prediction errors of different models or model configurations for a particular variable, not across variables, because it is scale-dependent. For a regression model, it measures how well the fitted line matches the data points. The formula for calculating RMSE is:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{predicted}_i - \mathrm{actual}_i\right)^2}$$

where,
predicted_i = The predicted value for the i-th observation.
actual_i = The observed (actual) value for the i-th observation.
N = Total number of observations.

Note: The differences between the actual values and the predicted values are known as residuals.
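
The formula translates almost directly into base R. The following is a minimal sketch, not the implementation used by any package; the function name rmse_manual and the sample vectors are made up for illustration.

```r
# Hypothetical helper (not part of any package): RMSE computed from the formula
rmse_manual <- function(actual, predicted) {
  # Both vectors must line up element by element
  stopifnot(length(actual) == length(predicted))
  residuals <- actual - predicted   # one residual per observation
  sqrt(mean(residuals^2))           # square, average, then take the root
}

# Illustrative values only
obs  <- c(3, 5, 7)
pred <- c(2, 5, 9)
print(rmse_manual(obs, pred))
```

Because each residual is squared, the sign of the difference does not matter, and large residuals contribute more to the result than small ones.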

Implementation of RMSE

The rmse() function, available in the Metrics package in R, calculates the root mean square error between actual values and predicted values.

Syntax:
rmse(actual, predicted)



Parameters:
actual: The ground truth numeric vector.
predicted: The predicted numeric vector, where each element in the vector is a prediction for the corresponding element in actual.

Example 1:
Let’s define two vectors: an actual vector holding the ground truth numeric values and a predicted vector holding the predicted numeric values, where each element is a prediction for the corresponding element in actual.

# R program to illustrate RMSE

# Importing the required package
library(Metrics)

# Taking two vectors
actual <- c(1.5, 1.0, 2.0, 7.4, 5.8, 6.6)
predicted <- c(1.0, 1.1, 2.5, 7.3, 6.0, 6.2)

# Calculating RMSE using rmse()
result <- rmse(actual, predicted)

# Printing the value
print(result)

Output:

[1] 0.3464102
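
The scale dependence of RMSE can be seen with the same two vectors. The sketch below uses base R only and a hypothetical helper named rmse_manual (an assumption for illustration, not a function from the Metrics package). Multiplying both vectors by 100, as if the values had been converted to a unit 100 times smaller, multiplies the RMSE by the same factor.

```r
# Hypothetical helper: RMSE from its definition, using base R only
rmse_manual <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

actual    <- c(1.5, 1.0, 2.0, 7.4, 5.8, 6.6)
predicted <- c(1.0, 1.1, 2.5, 7.3, 6.0, 6.2)

# RMSE on the original scale
rmse_original <- rmse_manual(actual, predicted)

# The same data after rescaling both vectors by a factor of 100
rmse_rescaled <- rmse_manual(actual * 100, predicted * 100)

print(rmse_original)
print(rmse_rescaled)
print(rmse_rescaled / rmse_original)  # ratio equals the rescaling factor
```

This is why RMSE values should only be compared for the same variable measured in the same units.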

Example 2:
In this example, let’s use the trees data set from the datasets package, which contains measurements from a study of black cherry trees.

# Importing required packages
# (dplyr provides %>% and select(), which are used further below)
library(datasets)
library(dplyr)

# Access the data from R’s datasets package
data(trees)

# Display the data in the trees dataset
trees

Output:

    Girth Height Volume
1    8.3     70   10.3
2    8.6     65   10.3
3    8.8     63   10.2
4   10.5     72   16.4
5   10.7     81   18.8
6   10.8     83   19.7
7   11.0     66   15.6
8   11.0     75   18.2
9   11.1     80   22.6
10  11.2     75   19.9
11  11.3     79   24.2
12  11.4     76   21.0
13  11.4     76   21.4
14  11.7     69   21.3
15  12.0     75   19.1
16  12.9     74   22.2
17  12.9     85   33.8
18  13.3     86   27.4
19  13.7     71   25.7
20  13.8     64   24.9
21  14.0     78   34.5
22  14.2     80   31.7
23  14.5     74   36.3
24  16.0     72   38.3
25  16.3     77   42.6
26  17.3     81   55.4
27  17.5     82   55.7
28  17.9     80   58.3
29  18.0     80   51.5
30  18.0     80   51.0
31  20.6     87   77.0

# Look at the structure
# Of the variables
str(trees)     

Output:

'data.frame':   31 obs. of  3 variables:
 $ Girth : num  8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
 $ Height: num  70 65 63 72 81 83 66 75 80 75 ...
 $ Volume: num  10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...

This data set consists of 31 observations of 3 numeric variables describing black cherry trees: trunk girth, height and volume. Now, fit a linear regression model to predict the volume of a trunk from its girth. A simple linear regression model is suitable here, and R makes it straightforward with the base function lm(). To see how well the model predicts a tree’s volume from its girth, use predict(), a generic R function for making predictions from the results of model-fitting functions. predict() takes as arguments the fitted linear regression model and a data frame holding the values of the predictor variable for which response values are wanted.

# Building a linear model
# relating tree volume to girth
fit_1 <- lm(Volume ~ Girth, data = trees)
trees.Girth <- trees %>% select(Girth)

# Use predict() to predict the volume for each observed girth
data.predicted <- c(predict(fit_1, data.frame(Girth = trees.Girth)))
data.predicted

Output:

        1         2         3         4         5         6         7         8         9 
 5.103149  6.622906  7.636077 16.248033 17.261205 17.767790 18.780962 18.780962 19.287547 
       10        11        12        13        14        15        16        17        18 
19.794133 20.300718 20.807304 20.807304 22.327061 23.846818 28.406089 28.406089 30.432431 
       19        20        21        22        23        24        25        26        27 
32.458774 32.965360 33.978531 34.991702 36.511459 44.110244 45.630001 50.695857 51.709028 
       28        29        30        31 
53.735371 54.241956 54.241956 67.413183 
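
predict() is not limited to the girths already in the data set. The sketch below, using base R only, refits the same model and asks for predictions at two girth values that are invented for illustration. It then reproduces the predictions by hand from the fitted intercept and slope. The names fit, new_trees, predicted_volume and by_hand are assumptions of this sketch.

```r
# Fit the same simple linear model on the built-in trees data
fit <- lm(Volume ~ Girth, data = trees)

# Hypothetical girth values, chosen only for illustration
new_trees <- data.frame(Girth = c(10, 15))

# predict() looks up the column named Girth in newdata
predicted_volume <- predict(fit, newdata = new_trees)

# Same result by hand: intercept + slope * girth
by_hand <- coef(fit)[["(Intercept)"]] + coef(fit)[["Girth"]] * new_trees$Girth

print(predicted_volume)
print(by_hand)
```

The column name in the new data frame has to match the predictor name used in the model formula, otherwise predict() cannot find the values to use.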

Now we have the actual volumes of the cherry tree trunks and the predicted volumes derived from the linear regression model. Finally, use the rmse() function to get the root mean square error between the actual and the predicted values.

# Load the Metrics package 
library(Metrics)
  
# Applying rmse() function 
rmse(trees$Volume, predict(fit_1, data.frame(Girth = trees.Girth)))

Output:

[1] 4.11254

The RMSE is 4.11254, expressed in the same units as Volume. Given that the observed volumes range from 10.2 to 77.0, this is a reasonable score for a simple linear model. It can be reduced further by adding more predictors (a multiple regression model). In summary, finding the root mean square error in R is easy: the rmse() function from the Metrics package does it in a single call.
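
As a rough illustration of the multiple regression idea, the sketch below adds Height as a second predictor and compares the in-sample RMSE of the two models. It uses base R only; the helper rmse_manual and the variable names are assumptions of this sketch, not part of the Metrics package.

```r
# Hypothetical helper: RMSE from its definition, using base R only
rmse_manual <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Simple model (girth only) and a multiple regression that also uses height
fit_girth        <- lm(Volume ~ Girth, data = trees)
fit_girth_height <- lm(Volume ~ Girth + Height, data = trees)

# In-sample RMSE: fitted() returns the predictions for the rows used in fitting
rmse_girth        <- rmse_manual(trees$Volume, fitted(fit_girth))
rmse_girth_height <- rmse_manual(trees$Volume, fitted(fit_girth_height))

print(rmse_girth)
print(rmse_girth_height)
```

Note that the RMSE measured on the same data used for fitting cannot increase when a predictor is added to a least-squares model, so a fairer comparison would evaluate both models on data held out from fitting.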



