
Relative Importance Analysis in R

Last Updated : 26 Jun, 2023

Relative importance analysis (RIA) is a statistical method for assessing how much each independent variable contributes to the explained variance of a dependent variable. It lets us rank the predictors in a model by how much of the variation in the response they account for. Because importance describes a predictor's role in a fitted model and not a causal effect, a high score flags a variable worth investigating but does not by itself establish a cause. The method is used in many disciplines, including marketing, finance, and the social sciences. This article covers several approaches to relative importance analysis in the R Programming Language.

The relative importance of independent variables can be measured in several ways, such as:

  • Variable importance: A model-specific score for how much each predictor contributes to the model's predictions of the outcome.
  • Relative importance: Each predictor's share of the explained variance, compared with the other predictors in the model.
  • Decomposition methods: Techniques that split the variance explained by the model (R²) into parts attributed to each predictor.

Variable Importance Plot

A variable importance plot is a graphical tool for comparing the importance scores of the predictors in a model. The caret package's varImp() function computes the scores for a fitted model, and plot() draws them. Here is a sample of the code:

R




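library(caret)  # train() with method = "rf" also needs the randomForest package
data(iris)
set.seed(123)   # resampling and the forest are random

# Fit a random forest classifier, then extract and plot importance scores
fit <- train(Species ~ ., data = iris, method = "rf")
vi <- varImp(fit)
plot(vi)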


Output:

Variable Importance Plot


This code fits a random forest model to the iris dataset with train(), then computes and plots the importance of each predictor with the varImp() and plot() functions.

Permutation Importance

Another way of assessing variable importance is permutation importance. It works by randomly shuffling the values of one predictor at a time and measuring how much the model's performance drops. The vip() function from the vip package can plot permutation importance, but only when it is called with method = "permute"; by default it plots the model's own importance scores. Here is a sample of the code:

R




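# Install the packages once if needed
# install.packages(c("randomForest", "vip"))

library(randomForest)
library(vip)

data(iris)
set.seed(123)  # the forest and the permutations are random
fit <- randomForest(Species ~ ., data = iris)

# Permute each predictor and measure the drop in accuracy.
# vip(fit) alone would plot the forest's model-specific scores instead.
vip(fit, method = "permute", train = iris, target = "Species",
    metric = "accuracy",
    pred_wrapper = function(object, newdata) predict(object, newdata))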


Output:

Permutation Importance


This code fits a random forest model to the iris dataset and plots the importance of each predictor with the vip() function.
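To make the shuffling idea concrete, here is a minimal base R sketch of permutation importance computed by hand. It is not how vip implements it internally; the model (a linear regression on mtcars), the error measure (training MSE), and the names mse, predictors, n_rep, and perm_importance are all assumptions chosen for illustration.

```r
# Hand-rolled permutation importance for a linear model (base R only).
set.seed(42)

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Mean squared error of the model's predictions on a data frame
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}
baseline <- mse(fit, mtcars)

predictors <- c("wt", "hp", "qsec")
n_rep <- 50  # shuffles per predictor (an arbitrary illustrative choice)

perm_importance <- sapply(predictors, function(v) {
  increases <- replicate(n_rep, {
    shuffled <- mtcars
    shuffled[[v]] <- sample(shuffled[[v]])  # break the link between v and mpg
    mse(fit, shuffled) - baseline           # how much worse the model gets
  })
  mean(increases)
})

# Larger values mean the model relies more on that predictor
print(sort(perm_importance, decreasing = TRUE))
```

Averaging over several shuffles reduces the noise that a single random permutation would introduce. Measuring the error on held-out data instead of the training data is usually preferable when enough data is available.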

Relative Weight Analysis

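Relative weight analysis divides the variance explained by a regression model (R²) among its predictors, which gives a fairer picture than standardized regression coefficients when the predictors are correlated. The relaimpo package offers the function calc.relimp() for this kind of decomposition. Its "lmg" metric averages the increase in R² that each predictor produces over all possible orders in which the predictors can enter the model. Here's an illustration of the code: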

R




# Install the package once if needed
# install.packages("relaimpo")

# Load the relaimpo package
library(relaimpo)

# Fit a linear regression model
fit <- lm(mpg ~ ., data = mtcars)

# Calculate relative importance of predictors;
# rela = TRUE rescales the shares so that they sum to 1
relimp <- calc.relimp(fit, type = "lmg", rela = TRUE)

# View the structure of the relimp object
str(relimp)

# Plot the relative importance of predictors
plot(relimp)


Output:

Relative Weight Analysis


This program fits a linear regression model to the mtcars dataset and decomposes its R² among the predictors with the calc.relimp() function of the relaimpo package. A bar plot showing each predictor's share is then produced.
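The averaging behind the "lmg" metric can be sketched by hand for the simplest case of two predictors, where there are only two possible entry orders. This is a base R illustration under assumed choices (the predictors wt and hp, and the helper name r2), not a replacement for calc.relimp().

```r
# LMG shares by hand for a two-predictor model (base R only).
# Helper: R-squared of a linear model fitted to mtcars
r2 <- function(formula) summary(lm(formula, data = mtcars))$r.squared

r2_wt   <- r2(mpg ~ wt)        # wt alone
r2_hp   <- r2(mpg ~ hp)        # hp alone
r2_full <- r2(mpg ~ wt + hp)   # both predictors

# Average each predictor's R-squared gain over the two entry orders:
# entering first (its own R-squared) and entering second (the increment)
lmg_wt <- mean(c(r2_wt, r2_full - r2_hp))
lmg_hp <- mean(c(r2_hp, r2_full - r2_wt))

shares <- c(wt = lmg_wt, hp = lmg_hp)
print(shares)
print(r2_full)  # the shares add up to the full model's R-squared
```

With more predictors the same idea applies, but the number of orderings grows factorially, which is why a dedicated implementation such as relaimpo is used in practice.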

Bootstrap Relative Importance

The relaimpo functions boot.relimp() and booteval.relimp() use bootstrap resampling to show how uncertain the relative importance estimates of the predictors in a linear regression model are.

Here, the type parameter is set to "lmg", which stands for "Lindeman, Merenda, and Gold", a commonly recommended metric for relative importance. The number of bootstrap samples is specified by the b argument.

  • boot.relimp() returns an object (boot_rel in the code below) that contains the bootstrapped estimates of the relative importance of each predictor variable.
  • booteval.relimp() takes that object, computes confidence intervals for the relative importance estimates, and displays the results.

The object returned by booteval.relimp() (boot_eval in the code below) holds the bootstrap confidence intervals. Passing it to plot() draws the confidence interval for each predictor variable.

R




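library(relaimpo)

# Create some example data
set.seed(123)  # make the simulated data and the bootstrap reproducible
x1 <- rnorm(100)
x2 <- rnorm(100)
x3 <- rnorm(100)
y <- x1 + 2*x2 + 3*x3 + rnorm(100)

# Combine the data into a data frame
mydata <- data.frame(x1, x2, x3, y)

# Fit the model and bootstrap the LMG shares (b = number of resamples)
model <- lm(y ~ x1 + x2 + x3, data = mydata)
boot_rel <- boot.relimp(model, type = "lmg",
                        b = 1000)

# Compute the confidence intervals and plot them
boot_eval <- booteval.relimp(boot_rel)
plot(boot_eval)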


Output:


Bootstrap Relative Importance
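The resampling step itself can be sketched in base R. The example below is a simplified, hand-rolled illustration and not the procedure relaimpo runs internally: it resamples the rows of mtcars with replacement, recomputes two-predictor LMG shares each time, and takes percentile intervals. The helper lmg_two and the values n_boot = 500 and the 95% level are assumptions made for this sketch.

```r
# Percentile bootstrap intervals for two-predictor LMG shares (base R only).
set.seed(123)

# LMG shares of wt and hp for mpg, computed on any data frame d
lmg_two <- function(d) {
  r2 <- function(f) summary(lm(f, data = d))$r.squared
  r2_wt   <- r2(mpg ~ wt)
  r2_hp   <- r2(mpg ~ hp)
  r2_full <- r2(mpg ~ wt + hp)
  c(wt = mean(c(r2_wt, r2_full - r2_hp)),
    hp = mean(c(r2_hp, r2_full - r2_wt)))
}

n_boot <- 500  # number of bootstrap resamples (illustrative choice)

# Each row of boot_shares holds the shares from one resampled data set
boot_shares <- t(replicate(n_boot, {
  idx <- sample(nrow(mtcars), replace = TRUE)  # draw rows with replacement
  lmg_two(mtcars[idx, ])
}))

# 95% percentile interval for each predictor's share
ci <- apply(boot_shares, 2, quantile, probs = c(0.025, 0.975))
print(ci)
```

Wide or overlapping intervals suggest that the ranking of the predictors is not well determined by the data.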

Conclusion

In this post, we covered several approaches to relative importance analysis in R: variable importance plots, permutation importance, relative weight analysis, and bootstrap confidence intervals for relative importance. Understanding and applying these techniques helps us judge how much each predictor contributes to our models and supports data-driven decisions.


