In this tutorial, we are going to use lavaan
for
measured variable path analysis.
library(lavaan)
library(lavaanPlot)
In this example, there are 4 measured variables: ReadSC1, MathSC1, Goals2, and SATmath.
In this example, our data is a correlation matrix. For more information about the data, please refer back to the course slides.
We can read the lower half of the correlation matrix into R and make
it symmetric using the getCov()
function. But we need to
further convert it to a variance-covariance matrix before
lavaan
can use it as the input data. With the correlation
matrix and sample standard deviations, you can easily compute the
variance-covariance matrix. We then assign variable names to the
covariance matrix, which is required for lavaan
to work
properly.
cormat <- '
1.000
-.173 1.000
.122 .163 1.000
-.049 .556 .385 1.000'
# convert to a symmetric correaltion matrix
Cmat <- getCov(cormat)
# sample standard deviations
dat.sdev <- c(1.274, 1.397, 1.276, 37.087)
Dmat <- diag(dat.sdev)
covmat <- Dmat %*% Cmat %*% Dmat # compute the covariance matrix
# assign row and column names to the covariance matrix
rownames(covmat) <- c("ReadSC1", "MathSC1", "Goals2", "SATmath")
colnames(covmat) <- rownames(covmat)
covmat
## ReadSC1 MathSC1 Goals2 SATmath
## ReadSC1 1.6230760 -0.3079016 0.1983261 -2.315193
## MathSC1 -0.3079016 1.9516090 0.2905592 28.806660
## Goals2 0.1983261 0.2905592 1.6281760 18.219360
## SATmath -2.3151931 28.8066597 18.2193596 1375.445569
Now we tell lavaan
what our model looks like. Like the
MLR example, we use the tilde ~
operator to tell
lavaan
that we wish to regress Goals2 on ReadSC1 and
MathSC1. We also want to regress SATmath on all the other three
variables. We use the double tilde ~~
to tell
lavaan
that we wish to estimate the variances of the
independent variables. We also allow the two exogenous variables to
covary.
mvpa.model1 <- '
# regressions
Goals2 ~ ReadSC1 + MathSC1
SATmath ~ Goals2 + ReadSC1 + MathSC1
# variances and covariances
ReadSC1 ~~ ReadSC1 + MathSC1
MathSC1 ~~ MathSC1
'
After you specify the model syntax, you can fit the model to the
variance-covariance matrix using the sem()
function. In
this function, we first specify the model syntax. Then we tell
lavaan
that our data is a variance-covariance matrix using
the sample.cov =
argument. We also tell lavaan
that our sample size is 1000.
mvpa.fit.1 <- sem(mvpa.model1, sample.cov = covmat, sample.nobs = 1000)
Next you can summarize the results using the
summary()
function. To get more info on the model fit
measures, you can use the optional argument
fit.measures = TRUE
. To take a look at the standardzied
parameter estimates, you can use the optional argument
standardized = TRUE
. The standardized solutions are listed
in the column labeled “Std.all”.
summary(mvpa.fit.1, fit.measures = T, standardized = T)
## lavaan 0.6.15 ended normally after 16 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 10
##
## Number of observations 1000
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Model Test Baseline Model:
##
## Test statistic 589.257
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -9812.610
## Loglikelihood unrestricted model (H1) -9812.610
##
## Akaike (AIC) 19645.219
## Bayesian (BIC) 19694.297
## Sample-size adjusted Bayesian (SABIC) 19662.536
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.000
## P-value H_0: RMSEA <= 0.050 NA
## P-value H_0: RMSEA >= 0.080 NA
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Goals2 ~
## ReadSC1 0.155 0.031 4.947 0.000 0.155 0.155
## MathSC1 0.173 0.029 6.064 0.000 0.173 0.190
## SATmath ~
## Goals2 8.781 0.731 12.004 0.000 8.781 0.302
## ReadSC1 0.054 0.734 0.074 0.941 0.054 0.002
## MathSC1 13.462 0.673 19.994 0.000 13.462 0.507
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ReadSC1 ~~
## MathSC1 -0.308 0.057 -5.391 0.000 -0.308 -0.173
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ReadSC1 1.621 0.073 22.361 0.000 1.621 1.000
## MathSC1 1.950 0.087 22.361 0.000 1.950 1.000
## .Goals2 1.546 0.069 22.361 0.000 1.546 0.950
## .SATmath 826.971 36.983 22.361 0.000 826.971 0.602
In the second example, there are 3 measured variables of interest: ReadSC1, MathSC1, and SATmath.
In this example, our data is still a correlation matrix. Similarly, we first read in the lower half of the correlation matrix and convert it to a full symmetrix covariance matrix.
# read in lower half of the correlation matrix
cormat <- '
1.000
-.173 1.000
-.049 .556 1.000'
Cmat <- getCov(cormat)
# convert it to variance-covariance matrix
dat.sdev <- c(1.274,1.397,37.087)
Dmat <- diag(dat.sdev)
covmat <- Dmat %*% Cmat %*% Dmat
# assign row and column names to the covariance matrix
rownames(covmat) <- c("ReadSC1", "MathSC1","SATmath")
colnames(covmat) <- rownames(covmat)
covmat
## ReadSC1 MathSC1 SATmath
## ReadSC1 1.6230760 -0.3079016 -2.315193
## MathSC1 -0.3079016 1.9516090 28.806660
## SATmath -2.3151931 28.8066597 1375.445569
Now we tell lavaan
what our model looks like using the
model syntax.
mvpa.model2 <- '
MathSC1 ~ ReadSC1
SATmath ~ MathSC1
'
After you specify the model syntax, you can fit the model to the
variance-covariance matrix using the sem()
function. In
this function, we first specify the model syntax. Then we tell
lavaan
that our data is a variance-covariance matrix using
the sample.cov =
argument. We also tell lavaan
that our sample size is 1000.
mvpa.fit.2 <- sem(mvpa.model2, sample.cov=covmat, sample.nobs = 1000)
Next you can summarize the results using the
summary()
function. To get more info on the model fit
measures, you can use the optional argument
fit.measures = TRUE
. To take a look at the standardzied
parameter estimates, you can use the optional argument
standardized = TRUE
. The standardized solutions are listed
in the column labeled “Std.all”.
summary(mvpa.fit.2, fit.measures = T, standardized = T)
## lavaan 0.6.15 ended normally after 1 iteration
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 4
##
## Number of observations 1000
##
## Model Test User Model:
##
## Test statistic 3.328
## Degrees of freedom 1
## P-value (Chi-square) 0.068
##
## Model Test Baseline Model:
##
## Test statistic 403.526
## Degrees of freedom 3
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.994
## Tucker-Lewis Index (TLI) 0.983
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -6584.371
## Loglikelihood unrestricted model (H1) -6582.707
##
## Akaike (AIC) 13176.742
## Bayesian (BIC) 13196.373
## Sample-size adjusted Bayesian (SABIC) 13183.669
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.048
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.110
## P-value H_0: RMSEA <= 0.050 0.404
## P-value H_0: RMSEA >= 0.080 0.240
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.019
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## MathSC1 ~
## ReadSC1 -0.190 0.034 -5.554 0.000 -0.190 -0.173
## SATmath ~
## MathSC1 14.760 0.698 21.153 0.000 14.760 0.556
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .MathSC1 1.891 0.085 22.361 0.000 1.891 0.970
## .SATmath 949.296 42.454 22.361 0.000 949.296 0.691