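R-squared (also written R-square, and known as the coefficient of determination) is a statistical measure of the goodness of fit of a regression model: the proportion of the variance in the dependent variable that the model explains. The ideal value of R-square is 1. The closer the value is to 1, the better the model fits the data.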
R-square compares the residual sum of squares (SSres) with the total sum of squares (SStot). The total sum of squares is the sum of the squared vertical distances between the data points and the average line, i.e. the horizontal line at the mean of the observed values.
The residual sum of squares is the sum of the squared vertical distances between the data points and the best-fitted line.
R-square is calculated using the following formula:
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
where SSres is the residual sum of squares and SStot is the total sum of squares.
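As a rough illustration, the relationship R-square = 1 - SSres / SStot can be sketched in plain Python. The data points, the helper name `r_squared`, and the closed-form simple linear regression used to obtain the best-fitted line are assumptions made for this example; they do not come from the article.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SSres / SStot for observed and predicted values."""
    y_mean = sum(y_true) / len(y_true)
    # SSres: squared vertical distances to the fitted line.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # SStot: squared vertical distances to the average line.
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Least-squares line for a single predictor (closed form).
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    / sum((xi - x_mean) ** 2 for xi in x)
)
intercept = y_mean - slope * x_mean
y_hat = [slope * xi + intercept for xi in x]

r2 = r_squared(y, y_hat)
print(round(r2, 4))  # 0.6
```

For this toy data SSres is 2.4 and SStot is 6, so the fitted line explains 60% of the variation around the mean.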
The goodness of fit of regression models can be compared on the basis of R-square: the nearer the value of R-square is to 1, the better the model fits.
Note: The value of R-square can also be negative when the fitted model is worse than the average line, that is, when SSres is greater than SStot.
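A small sketch of how a negative value can arise, using made-up numbers: predictions that run in the opposite direction to the data give an SSres larger than SStot. The helper `r_squared` and both sets of predictions are hypothetical and serve only to illustrate the note above.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SSres / SStot for observed and predicted values."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0, 5.0]

# Baseline: always predict the mean of y (the "average line").
mean_prediction = [3.0] * len(y)

# A deliberately bad model: its predictions fall while the data rise.
bad_prediction = [5.0, 4.0, 3.0, 2.0, 1.0]

r2_mean = r_squared(y, mean_prediction)
r2_bad = r_squared(y, bad_prediction)

print(r2_mean)  # 0.0
print(r2_bad)   # -3.0
```

Predicting the mean gives exactly 0, so any model with a negative value is doing worse than the average line.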
Limitation of using the R-square method:
- The value of R-square, computed on the data used to fit a least-squares model, always increases or remains the same as new variables are added to the model, regardless of whether the newly added variable is significant (i.e. the value of R-square never decreases when new attributes are added to the model). As a result, non-significant attributes can be added to the model and still increase the R-square value.
- This is because SStot stays constant for a given data set, while the regression model tries to decrease SSres by finding some correlation with the new attribute. The overall value of R-square therefore increases or stays the same, which can lead to a poor regression model.
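The limitation above can be checked with a small, dependency-free sketch: fit ordinary least squares once with a real predictor, then again after adding a column of pure random noise, and compare the two R-square values. The synthetic data, the seed, and the helper names `solve_linear_system` and `fit_r_squared` are assumptions made for this illustration. In practice a numerical library would normally do the fitting.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations touch both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_r_squared(columns, y):
    """Fit least squares with an intercept on the given predictor
    columns (normal equations) and return the in-sample R-square."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    y_hat = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_mean = sum(y) / n
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


random.seed(0)  # fixed seed so the run is repeatable
n = 50
x1 = [random.uniform(0, 10) for _ in range(n)]
y = [3.0 + 2.0 * xi + random.gauss(0, 2) for xi in x1]
noise = [random.gauss(0, 1) for _ in range(n)]  # unrelated to y by construction

r2_base = fit_r_squared([x1], y)
r2_with_noise = fit_r_squared([x1, noise], y)

print(r2_base, r2_with_noise)
print(r2_with_noise >= r2_base - 1e-9)  # True
```

The noise column carries no information about y, yet the in-sample R-square does not drop. This is why R-square alone cannot tell whether an added attribute is significant.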