
How Linear Mixed Model Works in R

Last Updated : 11 Jul, 2023

Linear mixed models (LMMs) are statistical models for data with both fixed and random effects. They are particularly useful for data with hierarchical or nested structures, such as longitudinal or clustered data. In R, the lme4 package provides a comprehensive framework for fitting and interpreting linear mixed models.

Difference between Linear Mixed Models and Classic Linear Models

Linear Mixed Models (LMMs) and Classic Linear Models (CLMs) are both used to analyze data with a continuous dependent variable. However, they differ in their assumptions and in the types of data they can handle.

  • Assumptions: CLMs assume that the observations are independent of each other, whereas LMMs relax this assumption and allow correlated observations. LMMs account for the correlation structure by including random effects in the model.
  • Handling of correlated data: CLMs are suitable for independent data, or for data where the correlation structure is not of interest. LMMs are specifically designed for correlated data, such as longitudinal or repeated-measures data, where observations within the same group or subject are likely to be correlated.
  • Incorporation of random effects: LMMs include random effects in addition to fixed effects. Random effects capture the variability between groups or subjects and allow group-level or subject-level effects to be estimated. CLMs include only fixed effects, which represent the average effects across all groups or subjects.
  • Flexibility in modeling: LMMs can model complex data structures, including nested or crossed random effects. They separate within-group (or within-subject) variability from between-group (or between-subject) variability. CLMs are simpler: they estimate a single set of coefficients that applies to every group or subject.
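
To make the contrast concrete, the following minimal sketch fits both model types to the sleepstudy data set that ships with lme4. It assumes lme4 is already installed (installation is covered in a later section), and the object names clm, lmm, and se are only illustrative.

```r
library(lme4)

# Classic linear model: treats all 180 observations as independent
clm <- lm(Reaction ~ Days, data = sleepstudy)

# Linear mixed model: adds a random intercept for each subject
lmm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Point estimates of the fixed effects, side by side
cbind(CLM = coef(clm), LMM = fixef(lmm))

# Standard errors, side by side; they differ because the LMM
# accounts for the correlation among repeated measurements
se <- cbind(CLM = sqrt(diag(vcov(clm))),
            LMM = sqrt(diag(as.matrix(vcov(lmm)))))
se
```

Because this data set is balanced, the two models agree on the coefficient estimates. What changes is the uncertainty attached to them, which is where ignoring the grouping tends to mislead.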

Mathematical Intuition Behind Linear Mixed Models (LMMs)

Both LMMs and CLMs can be represented mathematically using the following equations:

CLM: Y = Xβ + ε

LMM: Y = Xβ + Zγ + ε

In these equations:

  • Y represents the dependent variable.
  • X represents the design matrix for fixed effects.
  • β represents the vector of the fixed effect coefficients.
  • Z represents the design matrix for random effects.
  • γ represents the vector of the random effect coefficients.
  • ε represents the vector of the residual errors.

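The main difference between LMMs and CLMs lies in the random effects term (Zγ) in the LMM equation. The elements of γ are typically assumed to follow a normal distribution with mean zero, so observations that share a random effect are correlated. This term is what lets the model represent the correlation structure and estimate the random effect coefficients.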
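
As a rough illustration of the LMM equation, the sketch below builds X and Z for a small made-up design and simulates Y from them using base R only. The group sizes, coefficient values, and standard deviations are all assumptions chosen for the example.

```r
set.seed(1)

n_groups  <- 4   # hypothetical number of groups
n_per_grp <- 3   # hypothetical observations per group
group <- factor(rep(seq_len(n_groups), each = n_per_grp))
x     <- rep(0:(n_per_grp - 1), times = n_groups)

# Fixed-effects design matrix X: intercept column plus the covariate x
X <- model.matrix(~ x)

# Random-effects design matrix Z: one indicator column per group
Z <- model.matrix(~ 0 + group)

beta  <- c(10, 2)                  # fixed effect coefficients (assumed values)
gamma <- rnorm(n_groups, sd = 3)   # random intercepts, one per group
eps   <- rnorm(nrow(X), sd = 1)    # residual errors

# LMM: Y = X beta + Z gamma + eps
Y <- as.vector(X %*% beta + Z %*% gamma + eps)

dim(X)  # 12 rows, 2 columns
dim(Z)  # 12 rows, 4 columns
```

Each row of Z contains a single 1 that selects the random intercept of that observation's group, so all observations in a group share the same draw from γ.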

Fixed and Random Effects in Linear Mixed Models

In Linear Mixed Models (LMMs), fixed and random effects are used to model different sources of variability in the data.

  • Fixed Effects: Fixed effects represent the average effects across all groups or subjects in the data. They are called “fixed” because their values are assumed to be constant and non-random. Fixed effects typically represent the main factors or variables of interest in the study. For example, in a study comparing the effect of different treatments on patient outcomes, the treatment variable would be a fixed effect.
  • Random Effects: Random effects capture the variability between different groups or subjects in the data. They are called “random” because their values are assumed to be drawn from a random distribution. Random effects allow group-level or subject-level effects to be estimated and account for the correlation among observations within the same group or subject.

Installing and Loading the Required Packages

Before we start working with linear mixed models in R, we need to install and load the necessary packages. Open R or RStudio and execute the following commands:

R




install.packages("lme4")
library(lme4)


Understanding the Structure of Linear Mixed Models

Linear mixed models combine fixed effects, which are shared by all observations, with random effects, which vary across groups or levels. The model equation is the same as the LMM equation given earlier, with the random effect vector written as u instead of γ:

Y = Xβ + Zu + ε

Where:

  • Y is the response variable.
  • X is the design matrix for the fixed effects.
  • β is the vector of fixed effect coefficients.
  • Z is the design matrix for the random effects.
  • u is the vector of random effect coefficients.
  • ε is the vector of residual errors.
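
These pieces can be inspected on a fitted model. The sketch below uses getME() from lme4 to pull out X and Z for a random-intercept model on the sleepstudy data that ships with the package. The object name fit is only illustrative.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

X <- getME(fit, "X")   # fixed-effects design matrix
Z <- getME(fit, "Z")   # random-effects design matrix (stored as a sparse matrix)

dim(X)   # 180 x 2: intercept and Days
dim(Z)   # 180 x 18: one column per subject

fixef(fit)        # estimate of beta
head(ranef(fit)$Subject)   # predicted values of u for the first subjects
```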

Preparing the Data

To fit a linear mixed model, the data must be properly structured. It should be in long format: each row is a single observation, and the columns hold the response variable, the fixed effect variables, and the grouping variables used for the random effects. Each grouping variable (for example, a subject ID) should identify which group an observation belongs to, and is usually stored as a factor.
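
If the data arrives in wide format, with one column per time point, it can be converted with base R's reshape(). The sketch below uses a small made-up data frame; the column names and values are hypothetical.

```r
# Hypothetical wide-format data: one row per subject, one column per time point
wide <- data.frame(
  subject = c("s1", "s2", "s3"),
  score_1 = c(5.1, 6.3, 4.8),
  score_2 = c(5.9, 6.8, 5.5),
  score_3 = c(6.4, 7.7, 5.9)
)

# Stack the three score columns into a single "score" column
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score_1", "score_2", "score_3"),
  v.names   = "score",
  timevar   = "time",
  times     = 1:3,
  idvar     = "subject"
)

# Store the grouping variable as a factor and sort by subject, then time
long$subject <- factor(long$subject)
long <- long[order(long$subject, long$time), ]
rownames(long) <- NULL
long
```

The result has one row per subject and time point, which is the layout a formula such as score ~ time + (1 | subject) expects.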

Fitting a Linear Mixed Model

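Once the data is properly prepared, we can fit a linear mixed model using the lmer() function from the lme4 package. In lme4, random effects are written inside the model formula rather than passed as a separate argument. The general syntax of the lmer() function is as follows:

model <- lmer(response ~ fixed_effects + (random_effects | group), data = data)
  • The formula specifies the response variable, the fixed effects, and the random effects. A term such as (1 | group) adds a random intercept for each level of group.
  • data is the data frame containing the variables used in the model.
  • lmer() has no random argument. A separate random = ~ ... argument belongs to the lme() function in the nlme package.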
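
In lme4, the random part of a model is written inside the formula as (terms | group). As a minimal sketch, the two most common forms are shown below on the sleepstudy data that ships with lme4. The object names m_int and m_slope are only illustrative.

```r
library(lme4)

# Random intercept only: subjects differ in baseline reaction time
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random intercept and random slope: subjects also differ in the effect of Days
m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Likelihood-ratio comparison; anova() refits both models with ML first
anova(m_int, m_slope)
```

Whether the extra random slope is worth its additional parameters depends on the data. The comparison above is one common way to check, not the only one.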

Interpreting the Model Output

After fitting the linear mixed model, we can examine the model output to understand the estimated fixed and random effects. The summary() function returns a summary of the model results, including the variance of each random effect and the estimates, standard errors, and t values for the fixed effects. lme4 does not report p-values by default; the lmerTest package can add them if needed.

R




library(lme4)
 
data(sleepstudy)
model <- lmer(Reaction ~ Days + (1 | Subject),
              data = sleepstudy)
summary(model)


Output:

Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (1 | Subject)
   Data: sleepstudy

REML criterion at convergence: 1786.5

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.2257 -0.5529  0.0109  0.5188  4.2506

Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 1378.2   37.12
 Residual              960.5   30.99
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept) 251.4051     9.7467   25.79
Days         10.4673     0.8042   13.02

Correlation of Fixed Effects:
     (Intr)
Days -0.371
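
The individual pieces of this summary can also be extracted programmatically. The sketch below pulls out the fixed effects, the per-subject random intercepts, and the variance components, and then computes the intraclass correlation (ICC). The names vc, var_subject, var_residual, and icc are only illustrative.

```r
library(lme4)

model <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

fixef(model)                 # fixed-effect estimates (intercept and Days slope)
head(ranef(model)$Subject)   # per-subject deviations from the overall intercept

# Variance components as a data frame
vc <- as.data.frame(VarCorr(model))
var_subject  <- vc$vcov[vc$grp == "Subject"]
var_residual <- vc$vcov[vc$grp == "Residual"]

# Intraclass correlation: share of variance attributable to subjects
icc <- var_subject / (var_subject + var_residual)
icc
```

With the variances reported in the summary (1378.2 for Subject and 960.5 for Residual), the ICC is roughly 0.59. After accounting for Days, more than half of the remaining variation in reaction time is due to stable differences between subjects.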

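To check the fit visually, we can plot the model's predictions against the observed reaction times. Points close to the dashed identity line (intercept 0, slope 1) are predicted well.

R

library(lme4)
library(ggplot2)  # install.packages("ggplot2") if it is not yet installed

data(sleepstudy)

# Random-intercept model: one baseline reaction time per subject
model <- lmer(Reaction ~ Days + (1 | Subject),
              data = sleepstudy)

# Fitted values include each subject's random intercept
predicted <- predict(model)
df <- data.frame(Observed = sleepstudy$Reaction,
                 Predicted = predicted)

# Identity line (intercept 0, slope 1) marks perfect prediction
ggplot(df, aes(x = Observed, y = Predicted)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1,
              linetype = "dashed") +
  xlab("Observed reaction time (ms)") +
  ylab("Predicted reaction time (ms)")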


Output:

[Figure: scatter plot of predicted versus observed reaction times for the sleepstudy model, with a dashed reference line]
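
By default, predict() on an lmer fit includes each subject's random intercept. As a minimal sketch, the re.form argument can switch this off to get population-level predictions, which is what you would use for a subject not in the data. The names pred_subject, pred_population, and new_days are only illustrative.

```r
library(lme4)

model <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Subject-specific predictions: fixed effects plus each subject's random intercept
pred_subject <- predict(model)

# Population-level predictions: fixed effects only (random effects set to zero)
pred_population <- predict(model, re.form = NA)

# Predictions for a hypothetical new subject on days 0, 5 and 9
new_days <- data.frame(Days = c(0, 5, 9))
predict(model, newdata = new_days, re.form = NA)
```

The population-level predictions all lie on a single line defined by the fixed intercept and the Days slope, while the subject-specific ones are shifted up or down for each subject.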


