Confirmatory Factor Analysis (CFA) is a statistical technique for testing whether a hypothesized factor structure accounts for the relationships among a set of observed variables. Whether we are studying why people behave the way they do or what drives customer choices, Confirmatory Factor Analysis works like a detective, checking whether the clues fit a proposed hidden structure. In this article, we will discuss how to perform Confirmatory Factor Analysis in the R Programming Language.
What is Confirmatory Factor Analysis?
Confirmatory Factor Analysis (CFA) is a statistical method for testing a model, specified in advance, of how observed variables relate to a smaller number of unobserved (latent) factors. It's like a puzzle solver - it checks whether the pieces (observed variables) fit together into the bigger patterns (factors) that a theory predicts. Unlike exploratory factor analysis, the researcher decides beforehand which variables measure which factor. Confirmatory Factor Analysis is often used in fields like psychology, education, and marketing to test theories and to validate measurement instruments.
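In symbols, the idea can be sketched with the standard CFA measurement model. The notation below is an assumption introduced here for illustration and is not taken from the rest of this article: x is the vector of (mean-centered) observed variables, ξ the vector of latent factors, Λ the matrix of factor loadings, δ the residuals (assumed uncorrelated with ξ), Φ the covariance matrix of the factors, and Θ the covariance matrix of the residuals.

```latex
x = \Lambda \xi + \delta,
\qquad
\Sigma(\theta) = \Lambda \Phi \Lambda^{\top} + \Theta
```

Fitting a CFA amounts to choosing the free elements of Λ, Φ, and Θ so that the model-implied covariance matrix Σ(θ) is as close as possible to the sample covariance matrix of the observed variables.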
Features of Confirmatory Factor Analysis
- Understanding Relationships: Confirmatory Factor Analysis helps understand how variables relate to each other.
- Testing Theories: It tests whether observed data are consistent with theoretical expectations.
- Identifying Hidden Factors: It uncovers underlying constructs not directly observable.
- Validating Measures: CFA helps assess whether measurement scales capture the concepts they are intended to measure.
- Model Evaluation: It provides fit statistics to assess model adequacy.
Implementation of Confirmatory Factor Analysis in R
We will use the HolzingerSwineford1939 dataset, which ships with the lavaan package and contains cognitive test scores of 301 schoolchildren, to demonstrate Confirmatory Factor Analysis.
Step 1: Load the required packages
# Install the package once if it is not already installed
# install.packages("lavaan")
# Load the required package
library(lavaan)
Step 2: Load the Dataset and Preview the First Rows
# Load the dataset and preview the first six rows
data(HolzingerSwineford1939)
head(HolzingerSwineford1939)
Output:
  id sex ageyr agemo  school grade       x1   x2    x3       x4   x5        x6       x7   x8       x9
1  1   1    13     1 Pasteur     7 3.333333 7.75 0.375 2.333333 5.75 1.2857143 3.391304 5.75 6.361111
2  2   2    13     7 Pasteur     7 5.333333 5.25 2.125 1.666667 3.00 1.2857143 3.782609 6.25 7.916667
3  3   2    13     1 Pasteur     7 4.500000 5.25 1.875 1.000000 1.75 0.4285714 3.260870 3.90 4.416667
4  4   1    13     2 Pasteur     7 5.333333 7.75 3.000 2.666667 4.50 2.4285714 3.000000 5.30 4.861111
5  5   2    12     2 Pasteur     7 4.833333 4.75 0.875 2.666667 4.00 2.5714286 3.695652 6.30 5.916667
6  6   2    14     1 Pasteur     7 5.333333 5.00 2.250 1.000000 3.00 0.8571429 4.347826 6.65 7.500000
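Since head() only previews rows, a quick structural check of the nine indicator columns can be useful before fitting the model. The following is a minimal sketch; the object names indicators and scores are introduced here for illustration only.

```r
library(lavaan)
data(HolzingerSwineford1939)

# The nine test scores used as indicators in this article (x1 to x9)
indicators <- paste0("x", 1:9)
scores <- HolzingerSwineford1939[, indicators]

str(scores)                 # column types: each indicator should be numeric
colSums(is.na(scores))      # number of missing values per indicator
round(cor(scores), 2)       # correlations among the indicators
```

Indicators that are meant to measure the same factor would be expected to correlate more strongly with each other than with the remaining indicators, so the correlation matrix gives a first informal look at the structure the CFA will test.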
Step 3: Specify the CFA Model
# Specify the CFA model
model <- '
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
'
This model specifies which observed indicators (x1 to x9) measure each latent construct (visual, textual, and speed): each line names a factor on the left of '=~' and its three indicators on the right. Writing the structure down in advance is what makes the analysis confirmatory. It lets researchers test a hypothesis about the underlying structure and evaluate how well the proposed model fits the observed data.
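It can help to count what this model will estimate before fitting it. The sketch below assumes the defaults of lavaan's cfa() function: the first loading of each factor is fixed to 1, and all residual variances, factor variances, and factor covariances are estimated. The variable names are introduced here for illustration.

```r
# Information available: unique variances and covariances of 9 indicators
p <- 9
n_moments <- p * (p + 1) / 2          # 45

# Free parameters under the cfa() defaults
free_loadings      <- 6   # 9 loadings minus 3 fixed to 1 (one per factor)
residual_variances <- 9   # one per indicator
factor_variances   <- 3   # visual, textual, speed
factor_covariances <- 3   # one per pair of factors
n_free <- free_loadings + residual_variances +
  factor_variances + factor_covariances   # 21

# Degrees of freedom of the model
df <- n_moments - n_free                  # 24
c(moments = n_moments, free = n_free, df = df)
```

These counts agree with the "Number of model parameters 21" and "Degrees of freedom 24" lines in the summary output of Step 4.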
Step 4: Run and Check the summary of CFA
# Run CFA
cfa_result <- cfa(model, data = HolzingerSwineford1939)
# Print a summary of the fitted model
summary(cfa_result)
Output:
lavaan 0.6.17 ended normally after 35 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        21

  Number of observations                           301

Model Test User Model:

  Test statistic                                85.306
  Degrees of freedom                                24
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual =~
    x1                1.000
    x2                0.554    0.100    5.554    0.000
    x3                0.729    0.109    6.685    0.000
  textual =~
    x4                1.000
    x5                1.113    0.065   17.014    0.000
    x6                0.926    0.055   16.703    0.000
  speed =~
    x7                1.000
    x8                1.180    0.165    7.152    0.000
    x9                1.082    0.151    7.155    0.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual ~~
    textual           0.408    0.074    5.552    0.000
    speed             0.262    0.056    4.660    0.000
  textual ~~
    speed             0.173    0.049    3.518    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x1                0.549    0.114    4.833    0.000
   .x2                1.134    0.102   11.146    0.000
   .x3                0.844    0.091    9.317    0.000
   .x4                0.371    0.048    7.779    0.000
   .x5                0.446    0.058    7.642    0.000
   .x6                0.356    0.043    8.277    0.000
   .x7                0.799    0.081    9.823    0.000
   .x8                0.488    0.074    6.573    0.000
   .x9                0.566    0.071    8.003    0.000
    visual            0.809    0.145    5.564    0.000
    textual           0.979    0.112    8.737    0.000
    speed             0.384    0.086    4.451    0.000
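In the above code:
- The CFA model is defined in lavaan model syntax, stating which observed variables measure each latent factor.
- 'visual', 'textual', and 'speed' are latent factors.
- 'x1' to 'x9' are observed variables representing the cognitive test scores.
- The '=~' operator means "is measured by": the latent factor on its left is measured by the observed variables on its right. (The '~~' operator in the output denotes a variance or covariance.)
- Each latent factor is measured by three observed variables.
- The cfa() function fits the Confirmatory Factor Analysis using the specified model and the dataset. By default it fixes the first loading of each factor to 1 to set the factor's scale, which is why x1, x4, and x7 show an estimate of 1.000 with no standard error.
- The summary() function prints the results: the chi-square test of model fit, the factor loadings, the factor covariances, and the variances, each with its standard error, z-value, and p-value. By default only the chi-square test is shown; calling summary() with fit.measures = TRUE adds further fit indices such as CFI, TLI, and RMSEA.
- Here the chi-square statistic is 85.306 on 24 degrees of freedom with a p-value below 0.001, so the hypothesis that the model fits the data exactly is rejected, while all estimated loadings are statistically significant.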
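Because the chi-square test is only one view of model fit, it is common to also look at additional fit indices and at standardized loadings. The following is a minimal sketch using lavaan's fitMeasures() and standardizedSolution() functions; the object names fit_idx, std, and std_loadings are introduced here for illustration, and the model is refitted so that the block runs on its own.

```r
library(lavaan)

# Same three-factor model as in Step 3
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
cfa_result <- cfa(model, data = HolzingerSwineford1939)

# Selected global fit indices
fit_idx <- fitMeasures(cfa_result,
                       c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr"))
print(round(fit_idx, 3))

# Standardized estimates; keep only the factor loadings (operator "=~")
std <- standardizedSolution(cfa_result)
std_loadings <- std[std$op == "=~", c("lhs", "rhs", "est.std", "se", "pvalue")]
print(std_loadings)
```

The same information can be printed in one report with summary(cfa_result, fit.measures = TRUE, standardized = TRUE). Standardized loadings are on a common scale, which makes indicators easier to compare than the unstandardized estimates shown above.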
Conclusion
Confirmatory Factor Analysis (CFA) tests whether a hypothesized latent structure is consistent with observed data. Here we reviewed what CFA is used for and implemented a three-factor model in R using the 'lavaan' package and the 'HolzingerSwineford1939' dataset: we specified the model, fitted it with cfa(), and read the chi-square test, loadings, covariances, and variances from summary(). The same workflow applies to other measurement models in fields such as psychology, education, and marketing.