Redundancy Analysis using R

Last Updated : 23 Aug, 2023

Redundancy Analysis (RDA) is a multivariate statistical technique used to explore the relationship between two sets of variables: response variables and predictor variables. It is commonly used in ecological and environmental research to quantify how much of the variation in the response variables can be explained by the predictor variables.

The main purpose of RDA is to determine the linear combinations of the predictor variables that explain the maximum amount of variation in the response variables. It helps to identify which predictor variables are most strongly associated with the response variables and to visualize the relationships between the two sets of variables.

In the R programming language, several packages provide functions to perform RDA, including vegan, ade4, and BiodiversityR. These packages offer built-in functions to fit an RDA model and provide various options for visualizing and interpreting the results. The examples in this article use vegan.

R




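# Load the vegan package
library(vegan)
# Template only: response_matrix and mydata are placeholders
# for your own objects, not built-in data sets.
# Fit the RDA model; "." uses every column of mydata as a predictor
myrda <- rda(response_matrix ~ .,
             data = mydata)
# Summarize the RDA result
summary(myrda)
# Plot the RDA result
plot(myrda)


  • response_matrix is a placeholder for the matrix or data frame that contains the response variables, with one row per observation (for example, one row per site).
  • mydata is a placeholder for the data frame that contains the explanatory variables for the same observations, in the same row order. The "." on the right-hand side of the formula uses all of its columns; individual columns can be named instead, such as response_matrix ~ var1 + var2.
  • The rda() function fits the RDA model, while the summary() function summarizes the results. Finally, the plot() function can be used to plot the results.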

How Does RDA Work?

RDA combines multiple linear regression with principal component analysis (PCA). First, each response variable is regressed on the predictor variables, which yields a matrix of fitted values. A PCA of these fitted values then produces the constrained (canonical) axes: linear combinations of the predictor variables that explain as much of the variation in the response variables as possible. A second PCA on the regression residuals gives the unconstrained axes, which describe the variation that the predictors do not account for.
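The mechanics can be checked by hand. The sketch below assumes the varespec and varechem data sets that ship with vegan; it regresses the centred responses on the centred predictors, runs a PCA (via singular value decomposition) on the fitted values and on the residuals, and compares the resulting eigenvalues with those reported by rda(). The variable names are illustrative only.

```r
library(vegan)
data(varespec)
data(varechem)

n <- nrow(varespec)

# Centre responses (Y) and predictors (X); RDA works on centred data
Y <- scale(as.matrix(varespec), center = TRUE, scale = FALSE)
X <- scale(as.matrix(varechem), center = TRUE, scale = FALSE)

# Step 1: multivariate linear regression of Y on X
Y_fit <- qr.fitted(qr(X), Y)   # fitted values
Y_res <- Y - Y_fit             # residuals

# Step 2: PCA of the fitted values gives the constrained eigenvalues
eig_constrained <- svd(Y_fit)$d^2 / (n - 1)

# Step 3: PCA of the residuals gives the unconstrained eigenvalues
eig_unconstrained <- svd(Y_res)$d^2 / (n - 1)

# Compare with vegan's result
rda_fit <- rda(varespec ~ ., data = varechem)
k <- length(rda_fit$CCA$eig)
q <- length(rda_fit$CA$eig)
all.equal(as.numeric(rda_fit$CCA$eig), eig_constrained[1:k])
all.equal(as.numeric(rda_fit$CA$eig), eig_unconstrained[1:q])
```

If the two comparisons agree, the constrained axes are the principal components of the fitted values and the unconstrained axes are those of the residuals.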

RDA can provide valuable insights into the relationship between the predictor variables and the response variables. Significance is usually assessed with permutation tests, which can be applied to the overall model, to individual predictor terms, and to individual constrained axes.

Improving the Analysis

After fitting the RDA model, further analysis can be performed to gain deeper insights into the results. One such analysis is a permutation-based analysis of variance (ANOVA) on the RDA model using vegan's anova.cca() function, which is called through the generic anova(). It tests whether the model as a whole, or its individual terms or axes, explains more of the variation in the response variables than would be expected by chance.

By running this test on the RDA model, we can assess the significance of each term, test any interaction terms that were included in the model formula, and compare the relative contributions of different predictor variables to the response. This additional analysis enhances our understanding of the RDA results and helps us uncover more nuanced patterns in the data.
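As a minimal sketch of these tests, the example below assumes vegan's dune and dune.env data sets and a model with two predictors chosen only for illustration. The number of permutations and the seed are arbitrary choices.

```r
library(vegan)
data(dune)
data(dune.env)

# Illustrative model: two predictors from dune.env
rda_model <- rda(dune ~ Moisture + Management, data = dune.env)

# Permutation tests are random, so fix the seed for reproducibility
set.seed(42)

# Test of the overall model
overall <- anova(rda_model, permutations = 999)

# Sequential test of each term, in the order given in the formula
by_term <- anova(rda_model, by = "terms", permutations = 999)

# Marginal test of each term, given all other terms in the model
by_margin <- anova(rda_model, by = "margin", permutations = 999)

print(overall)
print(by_term)
print(by_margin)
```

Sequential tests depend on the order of terms in the formula, whereas marginal tests do not, so the two tables can differ when predictors are correlated.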

Difference between PCA and RDA

 

| Aspect | PCA | RDA |
|---|---|---|
| Purpose | Dimensionality reduction, data visualization. | Regression analysis, relationship exploration. |
| Variables | Considers a single set of variables, with no distinction between responses and predictors. | Considers both predictor and response variables. |
| Technique | Finds principal components capturing maximum variation. | Performs constrained ordination analysis. |
| Focus | Capturing overall data variation. | Explaining variation in the response variables. |
| Supervision | Unsupervised learning. | Supervised learning. |
| Output | Transformed data, principal components. | Constrained and unconstrained axes, model coefficients, significance tests. |
| Additional Analysis | Mostly limited to visualizing data in a reduced space. | Permutation-based ANOVA on the RDA model to assess the significance of the model, its terms, and any interactions included in it. |
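The contrast can be seen directly in vegan, where rda() called without predictors performs a PCA. The sketch below assumes the varespec and varechem data sets and an arbitrary choice of three predictors (N, P, and K).

```r
library(vegan)
data(varespec)
data(varechem)

# PCA: rda() with no constraints, all variation is unconstrained
pca_fit <- rda(varespec)

# RDA: the same responses, constrained by three soil variables
rda_fit <- rda(varespec ~ N + P + K, data = varechem)

# Total variance is identical; RDA splits it into two parts
pca_fit$tot.chi
rda_fit$tot.chi
rda_fit$CCA$tot.chi   # variance explained by the predictors
rda_fit$CA$tot.chi    # residual (unconstrained) variance
```

In the PCA every axis is an unconstrained principal component, while the RDA reports constrained axes (RDA1, RDA2, ...) first and leaves the remaining variation to unconstrained axes (PC1, PC2, ...).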

Data Preparation

  • Organize your data into a suitable format, such as a data frame or matrix.
  • Ensure that your data is properly formatted with response variables and explanatory variables.
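A few quick checks can catch common formatting problems before fitting. The sketch below assumes vegan's varespec (responses) and varechem (predictors) data sets; the variable names are illustrative.

```r
library(vegan)
data(varespec)
data(varechem)

# Both tables should describe the same observations, in the same order
same_n <- nrow(varespec) == nrow(varechem)
same_sites <- identical(rownames(varespec), rownames(varechem))

# Missing values should be handled before the model is fitted
n_missing <- sum(is.na(varespec)) + sum(is.na(varechem))

# Inspect the predictor column types (numeric vs. factor)
pred_types <- sapply(varechem, class)

same_n
same_sites
n_missing
pred_types
```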

Model Specification

  • Load the necessary R packages for conducting RDA, such as vegan.
  • Define the RDA model using the rda() function.
  • Specify the formula for the RDA model, indicating the response and explanatory variables.
  • You can also include additional settings in the formula, such as interaction terms or, in vegan, covariates to be partialled out with Condition() (partial RDA).
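The formula options above can be sketched as follows, assuming vegan's dune and dune.env data sets. The choice of variables is purely illustrative.

```r
library(vegan)
data(dune)
data(dune.env)

# Main effects only
m_main <- rda(dune ~ A1 + Management, data = dune.env)

# Main effects plus their interaction (A1 * Management expands to
# A1 + Management + A1:Management)
m_int <- rda(dune ~ A1 * Management, data = dune.env)

# Partial RDA: effect of Management after removing the effect of Moisture
m_partial <- rda(dune ~ Management + Condition(Moisture), data = dune.env)

print(m_partial)
```

In the printed output of a partial RDA, the variation removed by the conditioning variables is reported on its own "Conditional" row, separate from the constrained and unconstrained parts.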

Model Fitting

  • Fit the RDA model to your data using the defined formula and dataset.
  • Store the result of the RDA model fitting in a variable for further analysis.

Model Assessment

  • Evaluate the significance and strength of the relationships between the response and explanatory variables.
  • Conduct appropriate statistical tests to assess the significance of the relationships, such as permutation tests on the full model or on a partial redundancy analysis.
  • Use the summary statistics and p-values to determine the importance and significance of explanatory variables.
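A minimal assessment sketch, assuming vegan's dune and dune.env data sets and an illustrative two-predictor model: RsquareAdj() reports how much variation the model explains, and a permutation test by axis indicates which constrained axes are worth interpreting. The seed and permutation count are arbitrary.

```r
library(vegan)
data(dune)
data(dune.env)

rda_model <- rda(dune ~ Moisture + Management, data = dune.env)

# Strength: raw and adjusted R-squared of the constrained model
r2 <- RsquareAdj(rda_model)
r2$r.squared
r2$adj.r.squared

# Significance: permutation test for each constrained axis
set.seed(1)
axis_test <- anova(rda_model, by = "axis", permutations = 499)
print(axis_test)
```

The adjusted value corrects for the number of predictors, so it is usually the safer figure to report when models with different numbers of terms are compared.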

Interpretation and Visualization

  • Interpret the RDA results by examining the contribution of each explanatory variable to explaining the variation in the response variables.
  • Visualize the results using ordination plots, biplots, or other relevant visualizations.
  • Explore the relationships between the response and explanatory variables and identify patterns or trends.
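The interpretation steps above can be sketched with vegan's scores() and plot() functions. The example assumes the varespec and varechem data sets and three arbitrarily chosen predictors.

```r
library(vegan)
data(varespec)
data(varechem)

rda_model <- rda(varespec ~ N + P + K, data = varechem)

# Proportion of total variance explained by each constrained axis
prop_explained <- rda_model$CCA$eig / rda_model$tot.chi
round(prop_explained, 3)

# Site scores and predictor (biplot arrow) scores on the first two axes
site_scores <- scores(rda_model, display = "sites", choices = 1:2)
arrow_scores <- scores(rda_model, display = "bp", choices = 1:2)
head(site_scores)
arrow_scores

# Ordination plot with sites, species, and predictor arrows
plot(rda_model, scaling = 2)
```

In such a plot, arrows pointing in similar directions suggest positively correlated predictors, and sites lying far along an arrow tend to have high values of that predictor.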

R




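# Load vegan and its lichen pasture data sets
library(vegan)
data(varespec)  # responses: cover of 44 species at 24 sites
data(varechem)  # predictors: soil chemistry at the same sites
 
# Constrain the species data by all soil variables
rda_model <- rda(varespec ~ .,
                 data = varechem)
# Draw the ordination plot
plot(rda_model)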


Output:

 

Now let’s try to plot the same using the dune dataset.

R




library(vegan)
data(dune)
data(dune.env)
 
rda_model <- rda(dune ~ .,
                 data = dune.env)
plot(rda_model)


Output:

 

We can also display a summary of the fitted redundancy analysis by printing the model object.

R




library(vegan)
data(dune)
data(dune.env)
 
rda_model <- rda(dune ~ .,
                 data = dune.env)
print(rda_model)


Output:

Call: rda(formula = dune ~ A1 + Moisture + Management + Use + Manure, data =
dune.env)
              Inertia Proportion Rank
Total         84.1237     1.0000     
Constrained   63.2062     0.7513   12
Unconstrained 20.9175     0.2487    7
Inertia is variance 
Some constraints or conditions were aliased because they were redundant
Eigenvalues for constrained axes:
  RDA1   RDA2   RDA3   RDA4   RDA5   RDA6   RDA7   RDA8   RDA9  RDA10  RDA11  RDA12 
22.396 16.208  7.039  4.038  3.760  2.609  2.167  1.803  1.404  0.917  0.582  0.284 
Eigenvalues for unconstrained axes:
  PC1   PC2   PC3   PC4   PC5   PC6   PC7 
6.627 4.309 3.549 2.546 2.340 0.934 0.612 
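The figures in this output can be recovered from the model object. The sketch below refits the same model and recomputes the proportions; it also calls vegan's alias() method to list the constraints that were dropped as redundant, which is what the "aliased" message refers to.

```r
library(vegan)
data(dune)
data(dune.env)

rda_model <- rda(dune ~ ., data = dune.env)

# Inertia (variance) components shown in the printed table
total <- rda_model$tot.chi
constrained <- rda_model$CCA$tot.chi
unconstrained <- rda_model$CA$tot.chi

# Proportions of constrained and unconstrained variance
round(c(Constrained = constrained, Unconstrained = unconstrained) / total, 4)

# Names of the aliased (redundant) constraints
alias(rda_model, names.only = TRUE)
```

Here about three quarters of the total variance is constrained, but with 12 constrained axes fitted to only 20 sites this proportion is optimistic; an adjusted R-squared from RsquareAdj() gives a more conservative estimate.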

