Estimated Simple Regression Equation in R

Last Updated : 14 Feb, 2023

In R Programming Language, the lm() function can be used to estimate a simple linear regression equation. It takes a model formula of the form y ~ x, where y is the dependent variable and x is the independent variable.

Difference between Estimated Simple Linear Regression and Simple Linear Regression:

Simple linear regression is a method used to model the relationship between a single dependent variable (also known as the outcome) and a single independent variable (also known as the predictor or explanatory variable). The model is represented by an equation of the form:

y = a + b x

Estimated simple linear regression is the same model, but with coefficients estimated from data. We use sample data to estimate the parameters of the model: the estimated coefficients are the values of a and b that minimize the sum of squared errors on the sample data. The estimated simple linear regression model is represented by an equation of the form:

y = a' + b' x

where a' and b' are the estimated values. These values define the best-fitting line, i.e. the line that minimizes the sum of squared errors.
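
The estimates a' and b' can also be computed by hand from the closed-form least-squares formulas: the slope b' is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept a' makes the line pass through the point of means. A quick sketch using the same sample data as the example in this article:

```r
# Closed-form least-squares estimates, computed by hand
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)

# slope: sum of cross-deviations over sum of squared x-deviations
b_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
# intercept: the fitted line passes through (mean(x), mean(y))
a_hat <- mean(y) - b_hat * mean(x)

a_hat  # 3.654545
b_hat  # 0.9090909
```

These values match the coefficients that lm() reports for the same data.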

Formulae:

Sum of squared errors = Σ(yi – (a + b*xi))^2

where yi and xi are the observed values of y and x for the i-th data point, and yi' = a + b*xi is the corresponding predicted value of y.
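
To make this concrete, here is a small sketch that evaluates the sum of squared errors for candidate coefficients (the helper name sse is our own); the least-squares estimates give a smaller SSE than any other line through the same data:

```r
# Sum of squared errors for a candidate line y = a + b*x
sse <- function(a, b, x, y) sum((y - (a + b * x))^2)

x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)

sse(3.6545, 0.9091, x, y)  # near the least-squares minimum (about 12.25)
sse(3, 1, x, y)            # a nearby line gives a larger SSE (13)
```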

Syntax:

lm(y ~ x)

y = Data vector containing the dependent (response) values.

x = Data vector containing the independent (predictor) values.

R
# create data
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
 
# estimate linear regression equation
model <- lm(y ~ x)
model


Output:

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
     3.6545       0.9091  

Output form: 

The output is the estimated simple linear regression equation, which is of the form:

y = b0 + b1 * x

y = Predicted value of the response.

b0 = Intercept.

b1 = Slope coefficient.
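
Beyond the bare coefficients, summary() reports standard errors, t-statistics, and R-squared for the fitted model. A minimal sketch on the same data:

```r
# Fit the model and inspect the full summary
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)

summary(model)  # coefficients with standard errors, R-squared, etc.
```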

Plotting 2D Plane for Linear Regression Line

Now let's plot the data points on a 2D plane and then draw the regression line on the same plane. This helps us judge how well the estimated linear regression fits our data.

R
# create data
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
 
# fit a linear regression model
fit <- lm(y ~ x)
 
# create a scatter plot of the data
plot(x, y)
 
# add the regression line to the plot
abline(fit, col = "red")


Output:

Scatterplot along with the regression line

Now let's look at the coefficients of the model, which are determined during fitting.

R
# printing coefficient.
coef(model)


Output:

(Intercept)           x 
  3.6545455   0.9090909 
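
The model's fitted values and residuals can be inspected in the same way; for each training point, the residual is the observed y minus the value predicted by the line:

```r
# Fit the model and inspect fitted values and residuals
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)

fitted(model)     # predicted y at each training x
residuals(model)  # observed y minus fitted y
```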

Now let's predict the value of the dependent (target) feature for a new data point using the estimated simple linear regression equation.

R
# new data point
new_data <- 7
 
# wrapping the datapoint and calling predict function.
prediction <- predict(model,
                      newdata = data.frame(x = new_data))
 
# printing prediction
print(prediction)


Output:

       1 
10.01818 
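
predict() also accepts several new points at once, and can return a confidence interval for the mean response via interval = "confidence". A brief sketch:

```r
# Predict several new points, with and without a confidence interval
x <- c(1, 2, 4, 7, 11)
y <- c(2, 7, 9, 10, 13)
model <- lm(y ~ x)

new_x <- data.frame(x = c(3, 7, 9))
predict(model, newdata = new_x)                           # point predictions
predict(model, newdata = new_x, interval = "confidence")  # fit, lwr, upr
```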

