Estimated Simple Regression Equation in R
Last Updated: 14 Feb, 2023
In the R programming language, the lm() function can be used to estimate a simple linear regression equation. The function takes a formula of the form y ~ x, where y is the dependent variable and x is the independent variable.
Difference between Estimated Simple Linear Regression and Simple Linear Regression:
Simple linear regression is a method used to model the relationship between a single dependent variable (also known as the outcome or response) and a single independent variable (also known as the predictor or explanatory variable). The model is represented by an equation of the form:
y = a + b x
Estimated simple linear regression is the same model, except that its coefficients are estimated from sample data. The estimated coefficients are the values of a and b that minimize the sum of squared errors on the sample data. The estimated simple linear regression model is represented by an equation of the form:
y = a' + b' x
where a’ and b’ are the estimated values of a and b. These values define the best-fit line, the one that minimizes the sum of squared errors.
Formulae:
Sum of squared errors = Σ(yi – (a + b*xi))^2
where yi and xi are the observed values of y and x, respectively, and a + b*xi is the predicted value of y for observation i.
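Minimizing the sum of squared errors yields the familiar closed-form least-squares estimates, b’ = Σ(xi – x̄)(yi – ȳ) / Σ(xi – x̄)^2 and a’ = ȳ – b’·x̄. The sketch below computes them by hand (using the same sample data as the example further down) and checks that they match what lm() returns:

```r
# Sample data (same vectors as in the lm() example below)
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)

# Closed-form least-squares estimates of slope and intercept
b_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_hat <- mean(y) - b_hat * mean(x)

a_hat  # intercept, approximately 3.6545
b_hat  # slope, approximately 0.9091

# They agree with the coefficients estimated by lm()
all.equal(unname(coef(lm(y ~ x))), c(a_hat, b_hat))
```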
Syntax:
lm(y ~ x)
y = Data vector containing the dependent (response) values.
x = Data vector containing the independent (predictor) values.
R
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)
model
Output:
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
3.6545 0.9091
Output form:
The output of the code is the estimated simple linear regression equation, which is of the form:
y = b0 + b1 * x
y = Predicted value of the dependent variable.
b0 = Intercept.
b1 = Slope coefficient for x.
Plotting 2D Plane for Linear Regression Line
Now let’s plot the points on the 2D plane and then overlay the regression line on the same plane. This will help us analyze to what extent the estimated linear regression line is a good fit for our data.
R
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
fit <- lm(y ~ x)
plot(x, y)
abline(fit, col = "red")
Output:
Scatterplot along with the regression line
Now let’s look at the coefficients of the model, which are determined during the fitting process.
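One way to retrieve them (a minimal sketch, refitting the same model as above) is the coef() function, which returns the estimates as a named numeric vector:

```r
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)

# Extract the estimated intercept and slope as a named vector
coef(model)
```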
Output:
(Intercept) x
3.6545455 0.9090909
Now let’s predict the value of the dependent (target) variable for a new data point using the estimated simple linear regression equation.
R
new_data <- 7
prediction <- predict(model,
                      newdata = data.frame(x = new_data))
print(prediction)
Output:
10.01818
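As a sanity check, the same prediction can be reproduced by plugging x = 7 into the estimated equation y = b0 + b1 * x directly (a sketch, refitting the same model as above):

```r
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)

b0 <- coef(model)[["(Intercept)"]]  # estimated intercept
b1 <- coef(model)[["x"]]            # estimated slope

# Plug x = 7 into the estimated equation
b0 + b1 * 7  # approximately 10.01818, matching predict()
```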