
Robustbase Package in R

The Robustbase package in R is a collection of functions and methods designed for robust statistics. It provides tools for robust regression, robust multivariate analysis, and outlier detection. On CRAN the package name is written in lowercase, robustbase.

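Robust Statistics:

It is a branch of statistics whose methods stay reliable when the data contain outliers or depart from the assumed model (for example, normality). Classical estimators such as the mean or least squares can be badly distorted in those situations.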
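As a small illustration of that idea, the sketch below compares classical and robust estimates of location and scale on a made-up vector with one gross error. The data values are hypothetical and only base R functions are used.

```r
# Ten well-behaved measurements plus one gross error (hypothetical values)
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 100)

# Classical estimates are pulled towards the outlier
classical <- c(location = mean(x), scale = sd(x))

# Median and MAD are barely affected by a single extreme value
robust <- c(location = median(x), scale = mad(x))

print(round(rbind(classical, robust), 3))
```

The single value of 100 moves the mean far away from the bulk of the data, while the median stays at the center of the ten regular measurements.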

Robust Regression:

It is a form of regression analysis that is less sensitive to outliers than ordinary least squares. The Robustbase package includes robust regression functions such as lmrob() for MM- and S-estimators and ltsReg() for least trimmed squares (LTS).
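A minimal sketch of the difference, assuming the built-in stackloss data as an example: it fits the same model with ordinary least squares and with least trimmed squares via ltsReg(), then prints the two coefficient vectors side by side. The seed is set only to make the subsampling in ltsReg() repeatable.

```r
library(robustbase)

set.seed(1)

# Ordinary least squares fit on the built-in stackloss data
ols_fit <- lm(stack.loss ~ ., data = stackloss)

# Least trimmed squares fit, which ignores the largest squared residuals
lts_fit <- ltsReg(stack.loss ~ ., data = stackloss)

# Coefficients side by side (intercept first, then the three predictors)
coef_table <- rbind(OLS = coef(ols_fit), LTS = coef(lts_fit))
print(round(coef_table, 3))
```

If the two rows differ noticeably, a few observations are probably steering the least squares fit.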

Multivariate Analysis:

The Robustbase package includes robust multivariate analysis utilities, such as robust estimators of multivariate location and scatter. These methods are useful for datasets with many variables, where outliers or non-normal data in a single variable can have a large impact on the overall analysis.

Outlier Detection:

The Robustbase package includes outlier detection tools that help find unusual observations in a dataset that may be distorting the conclusions of an analysis.

Installation of Robustbase Package in R

To use the Robustbase package in R, you first need to install it using the following command:

install.packages("robustbase")

Load the Robustbase package using the following command:

library(robustbase)

The Robustbase tools are now available in your R session.

Functions in Robustbase Package in R

The Robustbase package in R provides a number of functions for robust statistical analysis. A few of them are listed below:

No. Function: Description
1) lmrob(): Computes MM-type estimators for linear regression
2) covMcd(): Computes robust location and scatter estimates via the Minimum Covariance Determinant (MCD)
3) colMedians(): Computes the median of each column of a matrix (rowMedians() does the same for rows)
4) outlierStats(): Computes robust regression outlier statistics
5) sigma(): Extracts the estimated residual standard deviation (scale) from a fitted robust model
6) lmrob.control(): Sets tuning parameters for lmrob() and its auxiliaries
7) weights.lmrob(): Extracts robustness and prior (model) weights
8) plot.mcd(): Draws diagnostic plots for mcd objects
9) lmrob.lar(): Computes the least absolute residuals (L1) regression estimate
10) predict.lmrob(): Predicts values from a robust linear model
11) ltsReg(): Carries out least trimmed squares robust regression
12) lmrob.S(): Computes the S-estimator for linear regression
13) smoothWgt(): Computes a smooth weight function
14) lmrob.fit(): Computes MM-type estimators for regression from a design matrix and response
15) lmrob..M..fit(): Computes M-estimators of regression by iteratively reweighted least squares (RWLS)
16) plot.lmrob(): Draws diagnostic plots for lmrob objects
17) summary.lmrob(): Summary method for lmrob objects
18) summary.mcd(): Summary method for mcd objects
19) nlrob(): Computes robust fits of non-linear regression models
20) plot.lts(): Draws diagnostic plots for lts objects
21) rrcov.control(): Controls settings for covMcd() and ltsReg()
22) summary.lts(): Summary method for lts objects
23) summary.nlrob(): Summary method for robust non-linear regression objects
24) covComed(): Computes the co-median multivariate location and scatter estimator
25) estimethod(): Extracts the estimation method as a character string from a fitted model
26) nlrob.control(): Controls the non-linear robust regression algorithm
27) Sn(): Computes the robust scale estimator Sn, an efficient alternative to the MAD
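To give a feel for one of the listed functions, the sketch below compares Sn() with the classical standard deviation and the MAD on simulated data. The contaminated sample is hypothetical, and Qn(), another scale estimator exported by the package, is added for comparison.

```r
library(robustbase)

set.seed(1)
# 45 standard-normal values plus 5 gross outliers (hypothetical contamination)
x <- c(rnorm(45), rep(50, 5))

# Classical and robust scale estimates on the same sample
scales <- c(sd = sd(x), mad = mad(x), Sn = Sn(x), Qn = Qn(x))
print(round(scales, 3))
```

The standard deviation is inflated by the five extreme values, whereas the three robust estimates stay close to the scale of the uncontaminated part of the sample.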

Examples of Robustbase Package in R

Let us see a few examples of different functions and methods used in the Robustbase package in R.

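Example 1: Robust regression with the MM-estimator using the Robustbase Package in R

Syntax:
mmfit <- lmrob(stack.loss ~ ., data = stackloss, method = "MM")

Parameters:

formula (stack.loss ~ .): models stack.loss as a function of all other columns in the data
data: the data frame that contains the variables, here the built-in stackloss dataset
method: the estimator to compute; "MM" is the default
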
library(robustbase)
  
data(stackloss)
mmfit <- lmrob(stack.loss ~ ., data = stackloss, method = "MM")
summary(mmfit)

Output :

Robust regression using the MM-estimator

Example 2: Outlier Detection using the covMcd function of Robustbase Package in R

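The covMcd() function computes the Minimum Covariance Determinant (MCD) estimate of multivariate location and scatter. Observations that lie far from this robust fit receive a weight of 0 in the mcd.wt component of the result, so the indices of the flagged outliers can be read from those weights.

Syntax (main arguments only):
covMcd(x, cor = FALSE, alpha = control$alpha, nsamp = control$nsamp, control = rrcov.control())

Parameters:

x: a numeric matrix or data frame
cor: whether to also return the robust correlation matrix
alpha: the fraction of observations (between 0.5 and 1) over which the covariance determinant is minimized
nsamp: the number of subsets used for the initial estimates
control: a list of settings created by rrcov.control()

library(robustbase)

data(stackloss)
covmat <- covMcd(stackloss)
# mcd.wt is 1 for regular observations and 0 for flagged outliers
outliers <- which(covmat$mcd.wt == 0)
outliers

Output :

The result is an integer vector with the row numbers of the flagged observations. An empty vector, integer(0), means that no observation was flagged. The fitted object has no component named outliers, so covmat$outliers always returns NULL and cannot be used to detect outliers.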
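The distance-based idea behind MCD outlier detection can be sketched by hand. The example below computes squared Mahalanobis distances from the robust center and covariance returned by covMcd() and compares them with a chi-squared quantile. The 97.5% cutoff is an assumption chosen for illustration, and the resulting flags are not guaranteed to match the weights the package computes internally.

```r
library(robustbase)

set.seed(1)
X <- as.matrix(stackloss)

# Robust location and scatter via the Minimum Covariance Determinant
mcd <- covMcd(X)

# Squared robust distances of every row from the robust center
rd2 <- mahalanobis(X, center = mcd$center, cov = mcd$cov)

# Assumed cutoff: 97.5% quantile of a chi-squared distribution with p degrees of freedom
cutoff <- qchisq(0.975, df = ncol(X))

# Rows whose squared distance exceeds the cutoff
flagged <- which(rd2 > cutoff)
print(flagged)
```

Because the center and covariance are estimated robustly, the outliers cannot mask themselves by inflating the covariance, which is the main weakness of distances based on the classical mean and covariance.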

Example 3: The lmrob() function in Robustbase Package for Robust Linear Regression in R

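The lmrob() function in the Robustbase package fits a robust linear regression model to data.

Syntax (main arguments only):
lmrob(formula, data, subset, weights, method = "MM", control = NULL, ...)

Parameters:

formula: a model formula such as y ~ x
data: an optional data frame that contains the variables in the formula
subset: an optional vector selecting the observations to use
weights: an optional vector of prior weights
method: a string naming the estimator chain; the default is "MM"
control: a list of tuning parameters, usually created with lmrob.control()
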
# load the library
library(robustbase)

# simulate a linear relationship; the seed makes the example reproducible
set.seed(1)
x <- seq(1, 10, length.out = 100)
y <- 2 * x + rnorm(100, mean = 0, sd = 1)

# overwrite three observations with outlying values
y[c(10, 30, 60)] <- c(20, 15, 10)

# fit a robust linear regression model
model <- lmrob(y ~ x)

# plot the data and the robust regression line
plot(x, y)
abline(model, col = "red")

Output:

 robust linear regression
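To see what the robust fit buys over ordinary least squares, the following sketch fits both models to the same kind of simulated data and prints the coefficients. The seed value is arbitrary, and the true slope of 2 comes from the simulation formula.

```r
library(robustbase)

set.seed(42)
x <- seq(1, 10, length.out = 100)
y <- 2 * x + rnorm(100)

# Overwrite three observations with outlying values
y[c(10, 30, 60)] <- c(20, 15, 10)

ols <- lm(y ~ x)       # classical least squares
rob <- lmrob(y ~ x)    # robust MM-estimator

# Intercept and slope of both fits; the data were generated with slope 2
coefs <- rbind(OLS = coef(ols), robust = coef(rob))
print(round(coefs, 3))
```

The robust slope should sit close to the value used in the simulation, while the least squares fit is shifted by the three contaminated points.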

Example 4: The colMedians() function in the Robustbase Package in R

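The colMedians() function in the Robustbase package computes the median of each column of a matrix. It takes a numeric matrix as input and returns a numeric vector with one median per column. By default (na.rm = FALSE) a column that contains a missing value yields NA; set na.rm = TRUE to drop missing values before the medians are computed.

Syntax:
colMedians(x, na.rm = FALSE, hasNA = TRUE, keep.names = TRUE)

Parameters:

x: a numeric matrix
na.rm: if TRUE, missing values are removed before computing the medians
hasNA: whether x may contain missing values; FALSE skips the NA handling for speed
keep.names: whether the column names are kept on the result
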
library(robustbase)

# create a 20 x 5 matrix and plant three outliers
# (linear indices 2, 5 and 10 all fall in the first column)
set.seed(123)
mat <- matrix(rnorm(100, mean = 5), ncol = 5)
mat[c(2, 5, 10)] <- c(20, -15, 30)

# compute the median of each column
colMedians(mat)

Output:

5.38029263882571 4.86005668738249 4.96429139367999 4.78816781525497 5.35853118252101
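The handling of missing values is easy to check on a tiny matrix. The sketch below uses hypothetical values and calls colMedians() with and without na.rm = TRUE so that the two results can be compared.

```r
library(robustbase)

# Small matrix whose first column contains one missing value
m <- matrix(c(1, 2, NA, 4,
              10, 20, 30, 40), ncol = 2)

# Default call: the missing value propagates to the first column's median
med_default <- colMedians(m)

# With na.rm = TRUE the NA is dropped before the median is taken
med_narm <- colMedians(m, na.rm = TRUE)

print(med_default)
print(med_narm)
```

Passing na.rm = TRUE explicitly is the safer habit whenever the input may contain missing values.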

Example 5: The outlierStats() function in the Robustbase Package in R

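The outlierStats() function computes simple outlier statistics for robust regression models that include factor terms. For each level of each factor it reports the number of observations whose robustness weight is almost zero (rejected observations) and the mean robustness weight. lmrob() computes these statistics for the estimators named in the compute.outlier.stats control setting, as in the example below.

Syntax (main arguments only):
outlierStats(object, x = object$x, control = object$control, epsw = 1e-12, epsx = 1e-12)

Parameters:

object: a fitted robust regression object, such as the result of lmrob()
x: the design matrix of the model
control: the list of control settings, as created by lmrob.control()
epsw: the limit below which a robustness weight is treated as zero
epsx: the limit below which an entry of the design matrix is treated as zero
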
# Load the library
library(robustbase)

# Create a data frame with two factors and a heavy-tailed response
data <- expand.grid(grp1 = letters[1:5], grp2 = letters[1:5], rep = 1:3)
set.seed(101)
data$y <- c(rt(nrow(data), 1))

# Set the control parameters for lmrob:
# MM-estimator, with outlier statistics computed for the S-estimate only
control <- lmrob.control(method = "MM", compute.outlier.stats = "S")

# Fit the lmrob model with these settings
fit2 <- lmrob(y ~ grp1*grp2, data, control = control)

# Draw the diagnostic plots
plot(fit2)

Output:

Standardized residuals vs. Robust Distances

Normal Q-Q vs. Residuals

Response vs. Fitted Values

Residuals vs. Fitted Values

Sqrt of abs(residuals) vs. Fitted Values

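Example 6: lmrob() and weights() functions in the Robustbase Package in R

The lmrob() function is a robust regression method provided by the Robustbase package in R. It can handle outliers and non-normal errors in the data and produces more reliable estimates of the regression coefficients. The weights() function extracts the weights from the fitted model: the robustness weights lie between 0 and 1, and values near 0 mark observations that were downweighted as outliers.

Syntax:
lmrob(formula, data, weights, ...)
weights(object, type = c("prior", "robustness"), ...)

Parameters:

formula: a model formula such as mpg ~ wt + drat
data: the data frame that contains the variables
weights: an optional vector of prior weights used in the fit
object: a fitted lmrob model
type: "prior" returns the prior weights, "robustness" returns the robustness weights
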
library(robustbase)
  
# Load the mtcars data
data(mtcars)
  
# Fit a linear model with robust estimation using lmrob()
model <- lmrob(mpg ~ wt + drat, data = mtcars)
  
# View the robustness weights
robustness_weights <- weights(model, type = "robustness")
print(robustness_weights)

Output:

 

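Example 7: The sigma() function in the Robustbase Package in R

The sigma() function extracts the estimated residual scale (the standard deviation of the errors) from a fitted model. It is a generic function, so it works for lm() fits as well as for robust lmrob() fits.

Syntax:
sigma(object, ...)

Parameters:

object: a fitted model, such as the result of lm() or lmrob()
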
# Load the required package
library(robustbase)
# Load the Coleman data
data(coleman)
  
#Example 1:  Fit a linear model using lm()
m_cl <- lm(Y ~ ., data = coleman)
sigma_m_cl <- sigma(m_cl)
sigma_m_cl
  
#Example 2:  Fit a robust linear model using lmrob() and calculate sigma
m1 <- lmrob(Y ~ ., data = coleman)
sigma_m1 <- sigma(m1)
sigma_m1
  
#Example 3:  Fit a robust linear model with KS2014 setting and calculate sigma
m2 <- lmrob(Y ~ ., data = coleman, setting = "KS2014")
sigma_m2 <- sigma(m2)
sigma_m2

Output: 

2.07429596766463
1.13383554133121
0.974570103850789
