Regression models predict the value of one variable (the dependent variable) from other variables (the independent variables). To measure how well such a model fits the data, another mathematical tool is used: R-squared, also called the coefficient of determination. For a least-squares linear model with an intercept, the value of R-squared lies between 0 and 1. A coefficient of determination of 1 (or 100%) means the model's fitted values match the dependent variable exactly.

R-squared compares the residual sum of squares ($SS_{res}$) with the total sum of squares ($SS_{tot}$). The residual sum of squares is the sum of the squared vertical distances between the data points and the best-fit line.

The total sum of squares is the sum of the squared vertical distances between the data points and the horizontal line at the mean of the dependent variable.

#### Formula for R-squared Regression Analysis

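The formula for R-squared is:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where,

$y_i$: experimental (observed) values of the dependent variable

$\bar{y}$: the average/mean of the observed values

$\hat{y}_i$: the fitted value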
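As a rough illustration of the definition, the sketch below computes the residual and total sums of squares by hand and compares the result with the value reported by `summary()`. The variable names `ss_res`, `ss_tot`, and `r_squared` are chosen here for illustration, and the data are the exam marks used in the example later in this article.

```r
# Exam marks used as sample data (same values as the article's example)
exam <- data.frame(math = c(87, 98, 67, 90),
                   estimated = c(65, 87, 56, 100))

# Fit a simple linear regression and collect its fitted values
model <- lm(math ~ estimated, data = exam)
fitted_values <- fitted(model)

# Residual sum of squares: squared gaps between observed and fitted values
ss_res <- sum((exam$math - fitted_values)^2)

# Total sum of squares: squared gaps between observed values and their mean
ss_tot <- sum((exam$math - mean(exam$math))^2)

# R-squared from the definition
r_squared <- 1 - ss_res / ss_tot
r_squared

# R-squared as reported by the model summary, for comparison
summary(model)$r.squared
```

Both values should agree up to floating-point rounding, since `summary()` applies the same definition to a model that includes an intercept.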

#### Find the Coefficient of Determination (R-squared) in R

Finding the coefficient of determination (R-squared) in the R language is straightforward. The steps to follow are:

- Make a data frame in R.
- Fit the linear regression model and save it in a new variable.
- Extract the coefficient of determination (the `r.squared` element) from the summary of the fitted model.

```r
# Creating a data frame of exam marks
exam <- data.frame(name = c("ravi", "shaily", "arsh", "monu"),
                   math = c(87, 98, 67, 90),
                   estimated = c(65, 87, 56, 100))

# Printing data frame
exam

# Calculating the linear regression model
model <- lm(math ~ estimated, data = exam)

# Extracting R-squared parameter from summary
summary(model)$r.squared
```

**Output:**

```
    name math estimated
1   ravi   87        65
2 shaily   98        87
3   arsh   67        56
4   monu   90       100
[1] 0.5672797
```

Note: If the fitted line passes through every data point, so that the prediction is exact, the R-squared value is 1.

```r
# Creating a data frame of exam marks
exam <- data.frame(name = c("ravi", "shaily", "arsh", "monu"),
                   math = c(87, 98, 67, 90),
                   estimated = c(87, 98, 67, 90))

# Printing data frame
exam

# Calculating the linear regression model
model <- lm(math ~ estimated, data = exam)

# Extracting R-squared parameter from summary
summary(model)$r.squared
```

**Output:**

```
    name math estimated
1   ravi   87        87
2 shaily   98        98
3   arsh   67        67
4   monu   90        90
[1] 1
```

#### Limitation of Using the R-squared Method

- The value of R-squared always increases or stays the same as new variables are added to the model, regardless of whether the added variable is significant (i.e. R-squared never decreases when new attributes are added). As a result, non-significant attributes can be added to the model and still raise its R-squared value.
- This is because $SS_{tot}$ is constant for a given data set, while least-squares fitting can only keep $SS_{res}$ the same or lower it by exploiting whatever correlation the new attribute happens to have with the dependent variable. The overall R-squared therefore goes up, which can lead to a poor regression model.
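The behaviour described above can be seen in a small sketch. It assumes the exam data from this article and adds a hypothetical `noise` column of random numbers that has no real relationship to the marks; the seed and the column name are arbitrary choices made for illustration.

```r
set.seed(42)  # arbitrary seed so the random column is reproducible

# Exam marks from the article, plus a hypothetical unrelated predictor
exam <- data.frame(math = c(87, 98, 67, 90),
                   estimated = c(65, 87, 56, 100))
exam$noise <- rnorm(nrow(exam))

# Model with the original predictor only
model_small <- lm(math ~ estimated, data = exam)

# Model with the extra, meaningless predictor added
model_large <- lm(math ~ estimated + noise, data = exam)

r2_small <- summary(model_small)$r.squared
r2_large <- summary(model_large)$r.squared

# R-squared of the larger model is never below that of the smaller one
r2_small
r2_large
```

The larger model reports an R-squared at least as high as the smaller one even though `noise` carries no information, which is why a rise in R-squared alone is not evidence that a new attribute belongs in the model.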
