# R-squared Regression Analysis in R Programming

Regression models predict the value of one variable (the dependent variable) from other variables (the independent variables). To measure how well such a model fits the data, another mathematical tool is used: R-squared, also called the coefficient of determination. For a least-squares linear model with an intercept, the value of R-squared lies between 0 and 1. A coefficient of determination of 1 (or 100%) means that the model reproduces the dependent variable perfectly.

R-squared compares the residual sum of squares (SSres) with the total sum of squares (SStot). The residual sum of squares is the sum of the squared vertical distances between the data points and the best-fitted line. The total sum of squares is the sum of the squared vertical distances between the data points and the horizontal line at their average.

#### Formula for R-squared Regression Analysis

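The formula for R-squared Regression Analysis is given as follows:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where,

• $y_i$: experimental values of the dependent variable
• $\bar{y}$: the average/mean of the experimental values
• $\hat{y}_i$: the fitted value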
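As a minimal sketch of how the formula translates into code, the snippet below computes SSres and SStot by hand and compares the result with the value R reports. It reuses the exam marks from the example further down in this article; the variable names `ss_res`, `ss_tot`, and `r2_manual` are illustrative choices, not part of any R API.

```r
# Marks from the exam example used later in this article
math      <- c(87, 98, 67, 90)    # observed values y_i
estimated <- c(65, 87, 56, 100)   # predictor

# Fit a simple linear regression of math on estimated
model <- lm(math ~ estimated)

y_hat <- fitted(model)            # fitted values from the model
y_bar <- mean(math)               # mean of the observed values

ss_res <- sum((math - y_hat)^2)   # residual sum of squares
ss_tot <- sum((math - y_bar)^2)   # total sum of squares

r2_manual <- 1 - ss_res / ss_tot

# The hand-computed value should agree with the one from summary()
r2_manual
summary(model)$r.squared
all.equal(r2_manual, summary(model)$r.squared)
```

Both expressions should print the same number, which suggests that `summary(model)$r.squared` is reporting the quantity defined by the formula.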

#### Find the Coefficient of Determination (R-squared) in R

Finding the coefficient of determination (R-squared) is straightforward in the R language. The steps to follow are:

• Make a data frame in R.
• Fit the linear regression model and save it in a new variable.
• Call summary() on that variable and extract its r.squared component, which holds the coefficient of determination.

```r
# Creating a data frame of exam marks
exam <- data.frame(name = c("ravi", "shaily",
                            "arsh", "monu"),
                   math = c(87, 98, 67, 90),
                   estimated = c(65, 87, 56, 100))

# Printing data frame
exam

# Calculating the linear regression model
model <- lm(math ~ estimated, data = exam)

# Extracting R-squared parameter from summary
summary(model)$r.squared
```

Output:

```
    name math estimated
1   ravi   87        65
2 shaily   98        87
3   arsh   67        56
4   monu   90       100
[1] 0.5672797
```

Note: If the fitted line passes through every data point, the R-squared value is 1. In the example below the estimated marks equal the actual marks, so the fit is perfect. R may also emit an "essentially perfect fit" warning from summary() in this case.

```r
# Creating a data frame of exam marks
exam <- data.frame(name = c("ravi", "shaily",
                            "arsh", "monu"),
                   math = c(87, 98, 67, 90),
                   estimated = c(87, 98, 67, 90))

# Printing data frame
exam

# Calculating the linear regression model
model <- lm(math ~ estimated, data = exam)

# Extracting R-squared parameter from summary
summary(model)$r.squared
```

Output:

```
    name math estimated
1   ravi   87        87
2 shaily   98        98
3   arsh   67        67
4   monu   90        90
[1] 1
```


#### Limitations of Using the R-squared Method

• The value of R-squared never decreases when a new variable is added to the model: it either increases or stays the same, regardless of whether the new variable is significant. As a result, non-significant attributes can be added to the model and still raise the R-squared value.
• This happens because SStot depends only on the observed values and therefore stays constant, while least squares can only lower SSres (or leave it unchanged) when given an additional attribute, even one that correlates with the dependent variable purely by chance. The R-squared value therefore rises, which can lead to a poor regression model.
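The behaviour described above can be illustrated with a short sketch. It assumes R's built-in `mtcars` data set and adds a hypothetical `noise` column of random numbers that has no real relationship to the response; the seed and the choice of `mpg ~ wt` as the base model are arbitrary.

```r
set.seed(42)  # arbitrary seed so the random column is reproducible

# Copy the built-in mtcars data and add a predictor of pure random noise
cars <- mtcars
cars$noise <- rnorm(nrow(cars))   # hypothetical, unrelated to mpg

# Base model, and the same model with the noise predictor added
base_model  <- lm(mpg ~ wt, data = cars)
noisy_model <- lm(mpg ~ wt + noise, data = cars)

r2_base  <- summary(base_model)$r.squared
r2_noisy <- summary(noisy_model)$r.squared

# Compare the two values side by side
c(base = r2_base, with_noise = r2_noisy)

# R-squared of the larger model is not lower than that of the base model
r2_noisy >= r2_base
```

The increase is usually small, but it shows that a higher R-squared on its own does not indicate that the added variable is meaningful.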
