Significance Test for Linear Regression in R

Last Updated : 01 Aug, 2023

Linear regression is a statistical method for modeling the relationship between one or more independent variables and a dependent variable. It is frequently used to predict the value of a dependent variable from the values of one or more independent variables. In R, the lm() function fits a linear regression model. Once a model has been fitted, we usually want to evaluate the significance of its regression coefficients, and a significance test does exactly this. In this tutorial, we will look at how to run a linear regression significance test in R.

What is the significance test for linear regression?

Significance tests for linear regression determine whether the relationship between the dependent variable and one or more independent variables is statistically significant. In other words, they tell us whether the data provide evidence that an independent variable is associated with the dependent variable, or whether the apparent relationship could plausibly be due to chance.

Several tests can be used to determine the significance of a linear regression model, but the most common is the t-test. The t-test checks whether each slope coefficient in the linear regression model is significantly different from zero.
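As a minimal sketch of what the t-test computes, the snippet below rebuilds the t-statistic and two-sided p-value for one slope by hand and compares them with the values reported by summary(). The mtcars data and the mpg ~ wt model are assumptions chosen only for illustration.

```r
# Illustrative sketch: reproduce the slope t-test by hand
# (mtcars and the mpg ~ wt model are assumed examples)
fit <- lm(mpg ~ wt, data = mtcars)
coefs <- coef(summary(fit))   # columns: Estimate, Std. Error, t value, Pr(>|t|)

estimate  <- coefs["wt", "Estimate"]
std_error <- coefs["wt", "Std. Error"]

# t-statistic for the null hypothesis that the slope is zero
t_stat <- estimate / std_error

# Two-sided p-value from the t distribution with the residual degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

# Compare the hand-computed values with the summary() table
print(all.equal(t_stat, coefs["wt", "t value"]))
print(all.equal(p_value, coefs["wt", "Pr(>|t|)"]))
```

A small p-value means an estimate this far from zero would be unlikely if the true slope were zero.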

Types of linear regression in R

Linear regression models the relationship between one or more independent variables and a dependent variable. There are two types of linear regression:

a. Simple linear regression:

Simple linear regression involves only one independent variable. The goal is to find the best-fit line that describes the relationship between the independent and dependent variables. The simple linear regression model has the following form:

    y = β0 + β1x + ε

Here, y is the dependent variable, x is the independent variable, β0 is the intercept, β1 is the slope, and ε is the error term.
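To make the intercept and slope concrete, here is a small sketch that computes the least-squares estimates of β0 and β1 from their closed-form expressions and checks them against lm(). The choice of mtcars, with wt as x and mpg as y, is an assumption for illustration.

```r
# Illustrative sketch: least-squares estimates for y = b0 + b1*x + e
# (mtcars with x = wt and y = mpg is an assumed example)
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least-squares estimates
b1 <- cov(x, y) / var(x)       # slope
b0 <- mean(y) - b1 * mean(x)   # intercept

# lm() should return the same intercept and slope
fit_simple <- lm(mpg ~ wt, data = mtcars)
print(all.equal(unname(coef(fit_simple)), c(b0, b1)))
```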

b. Multiple linear regression:

Multiple linear regression involves two or more independent variables. The goal is to find the plane or hyperplane that best captures the relationship between the independent variables and the dependent variable. The multiple linear regression model has the following form:

    y = β0 + β1x1 + β2x2 + … + βnxn + ε

Here, y is the dependent variable, x1, x2, …, xn are the independent variables, β0 is the intercept, β1, β2, …, βn are the coefficients of the independent variables, and ε is the error term.
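The coefficients of a multiple regression can be sketched with the normal equations, (XᵀX)β = Xᵀy. The snippet below solves them directly and compares the result with lm(). This is only an illustration: lm() itself uses a QR decomposition, not this formula, and the mtcars variables are an assumed example.

```r
# Illustrative sketch: multiple regression coefficients via the normal equations
# (mtcars with predictors wt and hp is an assumed example)
X <- model.matrix(~ wt + hp, data = mtcars)   # design matrix with an intercept column
y <- mtcars$mpg

# Solve (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# lm() should agree up to numerical tolerance
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)
print(all.equal(drop(beta_hat), coef(fit_multi)))
```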

Significance Test for Linear Regression in R

The summary() function in R can be used to perform the significance test on a fitted linear regression model. It reports detailed information about the model, including, for each predictor variable, the estimated coefficient, standard error, t-statistic, and p-value.

Here’s an example:

R




# Load the dataset
data(mtcars)
 
# Fit the linear regression model
model <- lm(mpg ~ wt + hp, data = mtcars)
 
# Perform the significance test
summary(model)


In this example, we use the mtcars dataset to predict miles per gallon (mpg) from the car’s weight (wt) and horsepower (hp). The lm() function fits the linear regression model, and the summary() function reports the significance tests.

Output of the above code
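The printed summary can also be inspected programmatically. The sketch below pulls the p-values out of the coefficient table and flags the coefficients that are significant, then recomputes the p-value of the overall F-test. The 0.05 threshold (alpha) is an assumed, conventional choice, not something fixed by R.

```r
# Illustrative sketch: read significance results out of summary()
model <- lm(mpg ~ wt + hp, data = mtcars)
model_summary <- summary(model)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(model_summary)
p_values <- coef_table[, "Pr(>|t|)"]

alpha <- 0.05                      # assumed significance level
significant <- p_values < alpha    # TRUE where the coefficient differs from zero
print(significant)

# Overall F-test: are all slopes jointly zero?
f <- model_summary$fstatistic      # named vector: value, numdf, dendf
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
print(overall_p < alpha)
```

The t-tests assess each coefficient individually, while the F-test assesses the model as a whole.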

Example 2:

R




# Load the dataset
data(iris)
 
# Split the dataset into training and testing sets
set.seed(123)
train_index <- sample(1:nrow(iris), 0.7 * nrow(iris))
train_data <- iris[train_index, ]
test_data <- iris[-train_index, ]
 
# Fit the linear regression model on the training data
model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = train_data)
 
# Perform the significance test
summary(model)
 
# Make predictions on the testing data
predictions <- predict(model, newdata = test_data)
 
# Calculate the root mean squared error (RMSE)
RMSE <- sqrt(mean((test_data$Sepal.Length - predictions)^2))
 
# Print the RMSE
cat("RMSE:", RMSE, "\n")
 
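# Visualize the relationship between Sepal.Length and Sepal.Width
plot(train_data$Sepal.Width, train_data$Sepal.Length,
     main = "Sepal.Width vs Sepal.Length",
     xlab = "Sepal.Width", ylab = "Sepal.Length")
# abline() draws only an intercept and one slope, so fit a one-predictor model for the line
abline(lm(Sepal.Length ~ Sepal.Width, data = train_data), col = "red")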


Explanation & Output

First, the iris dataset is loaded.
Next, the data is randomly split into a training set (70% of the data) and a testing set (30% of the data), with set.seed() making the split reproducible.
A linear regression model is then fitted on the training data, with Sepal.Length as the response variable and Sepal.Width and Petal.Length as the predictor variables.
The summary() function runs the significance tests on the model. The summary reports the model’s coefficient estimates, standard errors, t-values, and p-values.
The fitted model is passed to predict() to predict Sepal.Length for the testing data.
The root mean squared error (RMSE) between the predicted and actual Sepal.Length values is computed and printed.
Finally, a scatter plot of Sepal.Length against Sepal.Width is drawn with a regression line added in red.

 

In this case, we predicted Sepal.Length using the Sepal.Width and Petal.Length variables. We split the iris dataset into training and testing sets, fitted the linear regression model on the training data, and then made predictions on the testing data. To assess the accuracy of the model on the testing data, we computed the RMSE. We also drew a regression line on the scatter plot to visualize the relationship between Sepal.Width and Sepal.Length.
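Confidence intervals give a complementary view of the same significance tests: a 95% interval that excludes zero corresponds to a p-value below 0.05. The following sketch refits the model from this example and checks that the two criteria agree. The 95% level is an assumed choice.

```r
# Illustrative sketch: confidence intervals as a view on coefficient significance
data(iris)
set.seed(123)
train_index <- sample(1:nrow(iris), 0.7 * nrow(iris))
train_data <- iris[train_index, ]

model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = train_data)

# 95% confidence intervals for the coefficients (level is an assumed choice)
ci <- confint(model, level = 0.95)
print(ci)

# An interval that excludes zero matches a p-value below 0.05
excludes_zero <- ci[, 1] > 0 | ci[, 2] < 0
p_values <- coef(summary(model))[, "Pr(>|t|)"]
print(all(excludes_zero == (p_values < 0.05)))
```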

Conclusion

In this article, we covered how to run a linear regression significance test in R. We demonstrated the process on the mtcars dataset, modeling mpg on wt and hp, and on the iris dataset. The significance test tells us whether each regression coefficient is statistically significant.


