In this tutorial, we are going to use lavaan
and
simsem
for SEM power analysis based on Monte Carlo
methods.
library(lavaan)
library(simsem)
library(semPlot)
In this example, we would like to conduct power analysis for a latent growth model.
We first use the lavaan
model syntax to define our
population model and supply all the parameter values. To supply the
population value of a parameter, we just need to multiply the parameter
with the numeric value.
popmodel <- '
INTERC =~ 1*V1 + 1*V2 + 1*V3 + 1*V4
SLOPE =~ 0*V1 + 1*V2 + 2*V3 + 3*V4
INTERC ~~ 1*INTERC + -.10*SLOPE
SLOPE ~~ .04*SLOPE
INTERC ~ 5*1
SLOPE ~ .5*1
# error variance
V1 ~~ .25*V1
V2 ~~ .25*V2
V3 ~~ .25*V3
V4 ~~ .25*V4
# intercepts
V1 ~ 0*1
V2 ~ 0*1
V3 ~ 0*1
V4 ~ 0*1
'
Next we use the lavaan
model syntax to define the
analysis model. This is the model syntax you would write as if you have
a data set at hand (we are very familiar with it already). The analysis
model does not necessarily have to be the same as the population
model.
Analyze.model <- '
INTERC =~ 1*V1 + 1*V2 + 1*V3 + 1*V4
SLOPE =~ 0*V1 + 1*V2 + 2*V3 + 3*V4
INTERC ~~ INTERC + SLOPE
SLOPE ~~ SLOPE
INTERC ~ 1
SLOPE ~ 1
V1 ~~ a*V1
V2 ~~ a*V2
V3 ~~ a*V3
V4 ~~ a*V4
V1 ~ 0*1
V2 ~ 0*1
V3 ~ 0*1
V4 ~ 0*1
'
Last, we call up the sim()
function available in the
simsem
R package. This function will automatically generate
multiple copies of data based on the population model and fit the
analysis model to the simulated data. It will output the results
summarized over replications.
For this example, we request sim()
to generate 1000
copies of data, each has a sample size of 303. We also request
sim()
to use lavaan
’s sem()
function when fitting the analysis model to the simulated data. For more
detailed information about the simsem
package, please refer
to its webpage.
Output <- sim(1000, Analyze.model, n = 200, generate = popmodel, lavaanfun = "sem")
summary(Output)
## RESULT OBJECT
## Model Type
## [1] "lavaan"
## ========= Fit Indices Cutoffs ============
## Alpha
## Fit Indices 0.1 0.05 0.01 0.001 Mean SD
## chisq 9.116 11.144 14.342 18.761 4.916 3.108
## aic 1218.903 1229.157 1248.090 1268.953 1176.910 32.507
## bic 1244.711 1254.965 1273.898 1294.760 1202.717 32.507
## rmsea 0.080 0.097 0.120 0.146 0.024 0.035
## cfi 0.988 0.983 0.973 0.956 0.997 0.006
## tli 0.986 0.980 0.967 0.947 1.000 0.011
## srmr 0.052 0.060 0.073 0.086 0.034 0.014
## ========= Parameter Estimates and Standard Errors ============
## Estimate Average Estimate SD Average SE
## INTERC~~INTERC 0.985 0.152 0.147
## INTERC~~SLOPE -0.099 0.036 0.035
## SLOPE~~SLOPE 0.040 0.015 0.014
## INTERC~1 4.998 0.097 0.094
## SLOPE~1 0.500 0.026 0.026
## V1~~V1 0.249 0.068 0.066
## V2~~V2 0.251 0.043 0.043
## V3~~V3 0.249 0.042 0.042
## V4~~V4 0.253 0.066 0.063
## Power (Not equal 0) Std Est Std Est SD Std Ave SE
## INTERC~~INTERC 1.000 1.000 0.000 0.000
## INTERC~~SLOPE 0.833 -0.500 0.119 0.122
## SLOPE~~SLOPE 0.817 1.000 0.000 0.000
## INTERC~1 1.000 5.083 0.416 0.395
## SLOPE~1 1.000 2.677 0.664 0.654
## V1~~V1 0.978 0.203 0.057 0.054
## V2~~V2 1.000 0.234 0.040 0.039
## V3~~V3 1.000 0.251 0.041 0.041
## V4~~V4 0.990 0.253 0.063 0.062
## Average Param Average Bias Coverage
## INTERC~~INTERC 1.00 -0.015 0.939
## INTERC~~SLOPE -0.10 0.001 0.939
## SLOPE~~SLOPE 0.04 0.000 0.950
## INTERC~1 5.00 -0.002 0.932
## SLOPE~1 0.50 0.000 0.948
## V1~~V1 0.25 -0.001 0.943
## V2~~V2 0.25 0.001 0.955
## V3~~V3 0.25 -0.001 0.940
## V4~~V4 0.25 0.003 0.943
## ========= Correlation between Fit Indices ============
## chisq aic bic rmsea cfi tli srmr
## chisq 1.000 0.077 0.077 0.946 -0.913 -0.993 0.788
## aic 0.077 1.000 1.000 0.063 -0.048 -0.079 0.074
## bic 0.077 1.000 1.000 0.063 -0.048 -0.079 0.074
## rmsea 0.946 0.063 0.063 1.000 -0.932 -0.938 0.729
## cfi -0.913 -0.048 -0.048 -0.932 1.000 0.918 -0.668
## tli -0.993 -0.079 -0.079 -0.938 0.918 1.000 -0.787
## srmr 0.788 0.074 0.074 0.729 -0.668 -0.787 1.000
## ================== Replications =====================
## Number of replications = 1000
## Number of converged replications = 998
## Number of nonconverged replications:
## 1. Nonconvergent Results = 0
## 2. Nonconvergent results from multiple imputation = 0
## 3. At least one SE were negative or NA = 0
## 4. Nonpositive-definite latent or observed (residual) covariance matrix
## (e.g., Heywood case or linear dependency) = 2
We are using the same population model and analysis model as above.
Missing data template can be defined using the miss
function in simsem
. The pmMCAR
argument
indicates the proportion of values in each variable that will be missing
completely at random.
missmodel <- miss(pmMCAR = 0.2)
The missing data template can be incorporated in the simulation
process with the miss =
argument.
Output <- sim(1000, Analyze.model, n = 200, generate = popmodel, lavaanfun = "sem", miss = missmodel)
summary(Output)
## RESULT OBJECT
## Model Type
## [1] "lavaan"
## ========= Fit Indices Cutoffs ============
## Alpha
## Fit Indices 0.1 0.05 0.01 0.001 Mean SD
## chisq 9.197 11.061 16.038 19.954 5.128 3.248
## aic 1857.108 1871.257 1895.818 1933.618 1806.467 40.690
## bic 1886.793 1900.942 1925.503 1963.303 1836.152 40.690
## rmsea 0.065 0.078 0.105 0.122 0.021 0.029
## cfi 0.992 0.989 0.979 0.976 0.998 0.004
## tli 0.990 0.987 0.974 0.971 1.000 0.007
## srmr 0.044 0.048 0.056 0.067 0.028 0.011
## ========= Parameter Estimates and Standard Errors ============
## Estimate Average Estimate SD Average SE
## INTERC~~INTERC 0.989 0.123 0.119
## INTERC~~SLOPE -0.099 0.030 0.029
## SLOPE~~SLOPE 0.040 0.012 0.012
## INTERC~1 5.000 0.077 0.076
## SLOPE~1 0.500 0.022 0.021
## V1~~V1 0.249 0.054 0.054
## V2~~V2 0.251 0.036 0.035
## V3~~V3 0.252 0.034 0.034
## V4~~V4 0.248 0.053 0.050
## Power (Not equal 0) Std Est Std Est SD Std Ave SE
## INTERC~~INTERC 1.000 1.000 0.000 0.000
## INTERC~~SLOPE 0.959 -0.498 0.093 0.091
## SLOPE~~SLOPE 0.949 1.000 0.000 0.000
## INTERC~1 1.000 5.057 0.325 0.316
## SLOPE~1 1.000 2.599 0.482 0.454
## V1~~V1 0.999 0.202 0.044 0.043
## V2~~V2 1.000 0.233 0.032 0.032
## V3~~V3 1.000 0.252 0.034 0.033
## V4~~V4 1.000 0.248 0.050 0.050
## Average Param Average Bias Coverage
## INTERC~~INTERC 1.00 -0.011 0.930
## INTERC~~SLOPE -0.10 0.001 0.943
## SLOPE~~SLOPE 0.04 0.000 0.947
## INTERC~1 5.00 0.000 0.941
## SLOPE~1 0.50 0.000 0.935
## V1~~V1 0.25 -0.001 0.944
## V2~~V2 0.25 0.001 0.946
## V3~~V3 0.25 0.002 0.951
## V4~~V4 0.25 -0.002 0.935
## ========= Correlation between Fit Indices ============
## chisq aic bic rmsea cfi tli srmr
## chisq 1.000 0.032 0.032 0.944 -0.925 -0.995 0.791
## aic 0.032 1.000 1.000 0.033 -0.035 -0.029 0.028
## bic 0.032 1.000 1.000 0.033 -0.035 -0.029 0.028
## rmsea 0.944 0.033 0.033 1.000 -0.935 -0.941 0.730
## cfi -0.925 -0.035 -0.035 -0.935 1.000 0.923 -0.675
## tli -0.995 -0.029 -0.029 -0.941 0.923 1.000 -0.791
## srmr 0.791 0.028 0.028 0.730 -0.675 -0.791 1.000
## ================== Replications =====================
## Number of replications = 1000
## Number of converged replications = 1000
## Number of nonconverged replications:
## 1. Nonconvergent Results = 0
## 2. Nonconvergent results from multiple imputation = 0
## 3. At least one SE were negative or NA = 0
## 4. Nonpositive-definite latent or observed (residual) covariance matrix
## (e.g., Heywood case or linear dependency) = 0
We are using the same population model and analysis model as above.
Missing data template can be defined using the miss
function in simsem
. The prAttr
argument
indicates the probability of an entire case being removed due to
attrition at each time point. Note that the probability is computed out
of the remaining case at the each time point.
missmodel <- miss(prAttr = c(0, 0.2, 1/4, 1/3), timePoints = 4)
Output <- sim(10000, Analyze.model, n = 130, generate = popmodel, lavaanfun = "sem", miss = missmodel, seed = 10011, multicore = T)
summary(Output)
## RESULT OBJECT
## Model Type
## [1] "lavaan"
## ========= Fit Indices Cutoffs ============
## Alpha
## Fit Indices 0.1 0.05 0.01 0.001 Mean SD
## chisq 13.895 16.160 20.842 27.012 8.359 4.189
## aic 917.215 931.075 956.364 985.774 867.983 38.273
## bic 934.421 948.281 973.569 1002.979 885.189 38.273
## rmsea 0.075 0.089 0.111 0.135 0.026 0.033
## cfi 0.972 0.961 0.940 0.904 0.992 0.014
## tli 0.979 0.971 0.955 0.928 0.999 0.015
## srmr 0.085 0.097 0.122 0.156 0.056 0.022
## ========= Parameter Estimates and Standard Errors ============
## Estimate Average Estimate SD Average SE
## INTERC~~INTERC 0.990 0.150 0.149
## INTERC~~SLOPE -0.099 0.042 0.042
## SLOPE~~SLOPE 0.039 0.018 0.017
## INTERC~1 5.000 0.096 0.095
## SLOPE~1 0.500 0.035 0.034
## a <- (V1~~V1) 0.250 0.029 0.029
## a <- (V2~~V2) 0.250 0.029 0.029
## a <- (V3~~V3) 0.250 0.029 0.029
## a <- (V4~~V4) 0.250 0.029 0.029
## [ a ] - [ a ] # 1 0.000 0.000 NA
## [ a ] - [ a ] # 2 0.000 0.000 NA
## [ a ] - [ a ] # 3 0.000 0.000 NA
## Power (Not equal 0) Std Est Std Est SD Std Ave SE
## INTERC~~INTERC 1.000 1.000 0.000 0.000
## INTERC~~SLOPE 0.680 -0.512 0.208 0.407
## SLOPE~~SLOPE 0.635 1.000 0.000 0.000
## INTERC~1 1.000 5.069 0.405 0.397
## SLOPE~1 1.000 2.827 1.344 1.899
## a <- (V1~~V1) 1.000 0.205 0.033 0.033
## a <- (V2~~V2) 1.000 0.234 0.034 0.034
## a <- (V3~~V3) 1.000 0.253 0.040 0.039
## a <- (V4~~V4) 1.000 0.256 0.050 0.049
## [ a ] - [ a ] # 1 1.000 -0.029 0.013 0.013
## [ a ] - [ a ] # 2 1.000 -0.048 0.028 0.027
## [ a ] - [ a ] # 3 1.000 -0.052 0.043 0.042
## Average FMI1 SD FMI1
## INTERC~~INTERC 0.038 0.180
## INTERC~~SLOPE -0.283 24.592
## SLOPE~~SLOPE 0.423 4.643
## INTERC~1 0.018 0.017
## SLOPE~1 0.338 0.837
## a <- (V1~~V1) 0.405 0.057
## a <- (V2~~V2) 0.405 0.057
## a <- (V3~~V3) 0.405 0.057
## a <- (V4~~V4) 0.405 0.057
## [ a ] - [ a ] # 1 NaN NA
## [ a ] - [ a ] # 2 NaN NA
## [ a ] - [ a ] # 3 NaN NA
## ========= Correlation between Fit Indices ============
## chisq aic bic rmsea cfi tli srmr
## chisq 1.000 0.015 0.015 0.943 -0.911 -0.987 0.645
## aic 0.015 1.000 1.000 0.012 0.014 -0.014 -0.067
## bic 0.015 1.000 1.000 0.012 0.014 -0.014 -0.067
## rmsea 0.943 0.012 0.012 1.000 -0.929 -0.930 0.590
## cfi -0.911 0.014 0.014 -0.929 1.000 0.913 -0.582
## tli -0.987 -0.014 -0.014 -0.930 0.913 1.000 -0.650
## srmr 0.645 -0.067 -0.067 0.590 -0.582 -0.650 1.000
## ================== Replications =====================
## Number of replications = 10000
## Number of converged replications = 9867
## Number of nonconverged replications:
## 1. Nonconvergent Results = 0
## 2. Nonconvergent results from multiple imputation = 0
## 3. At least one SE were negative or NA = 0
## 4. Nonpositive-definite latent or observed (residual) covariance matrix
## (e.g., Heywood case or linear dependency) = 133
## NOTE: The data generation model is not the same as the analysis model. See the summary of the population underlying data generation by the summaryPopulation function.
# Define the missing data patterns using a logical matrix
miss.mat <- matrix(0, nrow = 200, ncol = 4)
miss.row1 <- sample(1:nrow(miss.mat), 100, replace = F)
miss.mat[miss.row1, 1] <- 1
miss.mat[setdiff(1:nrow(miss.mat), miss.row1), 4] <- 1
# create the missing data template
missmodel <- miss(logical = apply(miss.mat, 2, as.logical), m = 0)
Output <- sim(1000, Analyze.model, n = 200, generate = popmodel, lavaanfun = "sem", miss = missmodel, seed = 10291, multicore = T)
summary(Output)
## RESULT OBJECT
## Model Type
## [1] "lavaan"
## ========= Fit Indices Cutoffs ============
## Alpha
## Fit Indices 0.1 0.05 0.01 0.001 Mean SD
## chisq 11.778 13.597 17.102 22.139 6.968 3.621
## aic 1445.144 1459.544 1479.277 1496.805 1401.684 35.370
## bic 1464.934 1479.334 1499.067 1516.595 1421.474 35.370
## rmsea 0.049 0.059 0.075 0.094 0.014 0.021
## cfi 0.989 0.984 0.971 0.955 0.997 0.006
## tli 0.992 0.988 0.979 0.966 1.002 0.008
## srmr 0.069 0.080 0.095 0.124 0.047 0.017
## ========= Parameter Estimates and Standard Errors ============
## Estimate Average Estimate SD Average SE
## INTERC~~INTERC 0.992 0.139 0.138
## INTERC~~SLOPE -0.099 0.041 0.042
## SLOPE~~SLOPE 0.040 0.020 0.020
## INTERC~1 5.000 0.080 0.082
## SLOPE~1 0.500 0.028 0.028
## a <- (V1~~V1) 0.249 0.024 0.024
## a <- (V2~~V2) 0.249 0.024 0.024
## a <- (V3~~V3) 0.249 0.024 0.024
## a <- (V4~~V4) 0.249 0.024 0.024
## [ a ] - [ a ] # 1 0.000 0.000 NA
## [ a ] - [ a ] # 2 0.000 0.000 NA
## [ a ] - [ a ] # 3 0.000 0.000 NA
## Power (Not equal 0) Std Est Std Est SD Std Ave SE
## INTERC~~INTERC 1.000 1.000 0.000 0.000
## INTERC~~SLOPE 0.663 -0.516 0.236 0.463
## SLOPE~~SLOPE 0.537 1.000 0.000 0.000
## INTERC~1 1.000 5.058 0.370 0.366
## SLOPE~1 1.000 2.895 1.877 3.895
## a <- (V1~~V1) 1.000 0.203 0.031 0.031
## a <- (V2~~V2) 1.000 0.232 0.028 0.028
## a <- (V3~~V3) 1.000 0.250 0.031 0.030
## a <- (V4~~V4) 1.000 0.251 0.041 0.038
## [ a ] - [ a ] # 1 1.000 -0.028 0.011 0.010
## [ a ] - [ a ] # 2 1.000 -0.046 0.022 0.021
## [ a ] - [ a ] # 3 1.000 -0.048 0.035 0.033
## Average FMI1 SD FMI1
## INTERC~~INTERC 0.276 0.232
## INTERC~~SLOPE -0.794 22.915
## SLOPE~~SLOPE 0.417 10.031
## INTERC~1 0.138 0.022
## SLOPE~1 0.289 1.063
## a <- (V1~~V1) 0.396 0.064
## a <- (V2~~V2) 0.396 0.064
## a <- (V3~~V3) 0.396 0.064
## a <- (V4~~V4) 0.396 0.064
## [ a ] - [ a ] # 1 NaN NA
## [ a ] - [ a ] # 2 NaN NA
## [ a ] - [ a ] # 3 NaN NA
## ========= Correlation between Fit Indices ============
## chisq aic bic rmsea cfi tli srmr
## chisq 1.000 -0.004 -0.004 0.905 -0.848 -0.992 0.601
## aic -0.004 1.000 1.000 -0.020 0.026 0.005 -0.031
## bic -0.004 1.000 1.000 -0.020 0.026 0.005 -0.031
## rmsea 0.905 -0.020 -0.020 1.000 -0.927 -0.896 0.526
## cfi -0.848 0.026 0.026 -0.927 1.000 0.853 -0.519
## tli -0.992 0.005 0.005 -0.896 0.853 1.000 -0.596
## srmr 0.601 -0.031 -0.031 0.526 -0.519 -0.596 1.000
## ================== Replications =====================
## Number of replications = 1000
## Number of converged replications = 969
## Number of nonconverged replications:
## 1. Nonconvergent Results = 1
## 2. Nonconvergent results from multiple imputation = 0
## 3. At least one SE were negative or NA = 0
## 4. Nonpositive-definite latent or observed (residual) covariance matrix
## (e.g., Heywood case or linear dependency) = 30
## NOTE: The data generation model is not the same as the analysis model. See the summary of the population underlying data generation by the summaryPopulation function.