In this tutorial, we are going to use lavaan for repeated measure means models.

Load the pacakges

library(lavaan)
library(semPlot)


Example 1: Observed variable across multiple time points

Read the data

The data for this example is saved in a txt file named “repeated_data.txt”.

setwd(mypath)  # change it to the path of your own data folder

dat <- read.delim("repeated_data.txt", sep = "\t", header = F)

colnames(dat) <- paste0("read", 1:3)

# check the data
str(dat)
## 'data.frame':    96 obs. of  3 variables:
##  $ read1: int  25 23 27 25 23 12 15 26 22 27 ...
##  $ read2: int  26 28 27 25 26 10 13 24 26 27 ...
##  $ read3: int  26 24 30 26 21 17 19 21 26 29 ...
summary(dat)
##      read1          read2           read3      
##  Min.   : 7.0   Min.   : 5.00   Min.   : 7.00  
##  1st Qu.:14.0   1st Qu.:16.00   1st Qu.:17.00  
##  Median :20.0   Median :21.00   Median :21.00  
##  Mean   :19.2   Mean   :20.36   Mean   :20.68  
##  3rd Qu.:24.0   3rd Qu.:25.00   3rd Qu.:25.25  
##  Max.   :30.0   Max.   :30.00   Max.   :30.00


Fit the unconditional model to the data

We use the following model syntax to specify the model. The mean structure is specified using ~ 1 command in the model syntax.

model <- '

# covariances
read1 ~~ read2 + read3
read2 ~~ read3

# variance 
read1 ~~ read1 
read2 ~~ read2
read3 ~~ read3

# intercept 
read1 ~ 1
read2 ~ 1
read3 ~ 1

'

fit  <- sem(model, dat)
summary(fit, fit.measures = T, standardized = T)
## lavaan 0.6.15 ended normally after 45 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         9
## 
##   Number of observations                            96
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               233.779
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -811.420
##   Loglikelihood unrestricted model (H1)       -811.420
##                                                       
##   Akaike (AIC)                                1640.841
##   Bayesian (BIC)                              1663.920
##   Sample-size adjusted Bayesian (SABIC)       1635.503
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   read1 ~~                                                              
##     read2            31.792    5.109    6.223    0.000   31.792    0.822
##     read3            29.772    4.873    6.109    0.000   29.772    0.798
##   read2 ~~                                                              
##     read3            28.930    4.624    6.257    0.000   28.930    0.830
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     read1            19.198    0.657   29.231    0.000   19.198    2.983
##     read2            20.365    0.613   33.206    0.000   20.365    3.389
##     read3            20.677    0.592   34.921    0.000   20.677    3.564
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     read1            41.409    5.977    6.928    0.000   41.409    1.000
##     read2            36.107    5.212    6.928    0.000   36.107    1.000
##     read3            33.656    4.858    6.928    0.000   33.656    1.000

Example 2: Latent variable at two time points

Read the data

This example corresponds to the Duncan & Stoolmiller (1993) example in your course slides. We have the sample means, standard deviations, and correlation matrix of the data. We first convert the correlation matrix to variance-covariance matrix.

cormat <- "
1.000
.812 1.000
.819 .752 1.000
.672 .616 .621 1.000
.464 .620 .514 .680 1.000
.612 .640 .719 .819 .676 1.000
"

sdev <- c(2.46, 1.76, 2.74, 2.63, 1.89, 2.84)
smean <- c(10.96, 11.83, 9.90, 11.03, 12.14, 10.12)

Cmat <- getCov(cormat)
Dmat <- diag(sdev)
covmat <- Dmat %*% Cmat %*% Dmat

colnames(covmat) <- paste0("v", 1:6)


Fit the model to the data

Next, we fit a multi-group CFA model to the data.

We use the following model syntax to specify our model. We use the same parameter labels to contrain the corresponding factor loadings and intercepts to be equal across time points.

Because we are using the variance-covariance matrix as the input data, we need to add the sample.mean = argument to supply the sample means.

repeat.model <- '

# measurement model 

socsupp1 =~ 1*v1 + a*v2 + b*v3
socsupp2 =~ 1*v4 + a*v5 + b*v6

# residual covariances
 
v1 ~~ v4
v2 ~~ v5
v3 ~~ v6

# intercepts

v1 ~ c*1
v2 ~ d*1
v3 ~ e*1
v4 ~ c*1
v5 ~ d*1
v6 ~ e*1
 
# means of latent factors

socsupp1 ~ 0*1
socsupp2 ~ NA*1

'

repeat.fit <- sem(repeat.model, sample.cov = covmat, sample.nobs = 84, sample.mean = smean)
summary(repeat.fit, fit.measures=T)
## lavaan 0.6.15 ended normally after 63 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        23
##   Number of equality constraints                     5
## 
##   Number of observations                            84
## 
## Model Test User Model:
##                                                       
##   Test statistic                                11.967
##   Degrees of freedom                                 9
##   P-value (Chi-square)                           0.215
## 
## Model Test Baseline Model:
## 
##   Test statistic                               443.134
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.993
##   Tucker-Lewis Index (TLI)                       0.988
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -926.691
##   Loglikelihood unrestricted model (H1)       -920.707
##                                                       
##   Akaike (AIC)                                1889.382
##   Bayesian (BIC)                              1933.136
##   Sample-size adjusted Bayesian (SABIC)       1876.355
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.063
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.146
##   P-value H_0: RMSEA <= 0.050                    0.360
##   P-value H_0: RMSEA >= 0.080                    0.429
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.044
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   socsupp1 =~                                         
##     v1                1.000                           
##     v2         (a)    0.638    0.049   13.117    0.000
##     v3         (b)    1.039    0.073   14.235    0.000
##   socsupp2 =~                                         
##     v4                1.000                           
##     v5         (a)    0.638    0.049   13.117    0.000
##     v6         (b)    1.039    0.073   14.235    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .v1 ~~                                               
##    .v4                0.390    0.236    1.653    0.098
##  .v2 ~~                                               
##    .v5                0.393    0.157    2.506    0.012
##  .v3 ~~                                               
##    .v6                0.740    0.301    2.461    0.014
##   socsupp1 ~~                                         
##     socsupp2          4.136    0.849    4.874    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .v1         (c)   10.925    0.268   40.766    0.000
##    .v2         (d)   11.877    0.185   64.299    0.000
##    .v3         (e)    9.916    0.291   34.037    0.000
##    .v4         (c)   10.925    0.268   40.766    0.000
##    .v5         (d)   11.877    0.185   64.299    0.000
##    .v6         (e)    9.916    0.291   34.037    0.000
##     socsupp1          0.000                           
##     socsupp2          0.184    0.199    0.925    0.355
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .v1                0.759    0.271    2.801    0.005
##    .v2                0.771    0.157    4.923    0.000
##    .v3                1.725    0.378    4.562    0.000
##    .v4                1.204    0.370    3.253    0.001
##    .v5                1.574    0.283    5.571    0.000
##    .v6                1.527    0.419    3.646    0.000
##     socsupp1          5.331    0.947    5.631    0.000
##     socsupp2          5.667    1.036    5.469    0.000

Example 3: Latent variable at more than two time points

Read the data

The data for this example is saved in a txt file named “latent_means2_data.txt”.

setwd(mypath)  # change it to the path of your own data folder

data2 <- read.delim("latent_means2_data.txt", sep = "\t", header = F)

colnames(data2) <- c('read1', 'read2', 'read3', 'list1', 'list2', 'list3',
                     'speak1', 'speak2','speak3', 'write1', 'write2', 'write3')


# check the data
str(data2)
## 'data.frame':    170 obs. of  12 variables:
##  $ read1 : int  22 19 24 21 24 17 14 30 19 11 ...
##  $ read2 : int  24 19 26 24 26 21 19 29 23 6 ...
##  $ read3 : int  23 17 30 27 28 20 20 29 23 13 ...
##  $ list1 : int  28 18 27 18 29 18 14 27 13 15 ...
##  $ list2 : int  30 16 30 20 28 16 7 18 10 8 ...
##  $ list3 : int  29 17 27 18 26 17 10 24 17 19 ...
##  $ speak1: int  23 19 24 22 17 14 14 23 17 23 ...
##  $ speak2: int  23 18 24 17 18 17 10 22 17 22 ...
##  $ speak3: int  23 18 29 17 19 15 10 23 18 24 ...
##  $ write1: int  25 17 21 18 20 21 14 22 17 17 ...
##  $ write2: int  22 20 24 21 20 17 18 25 20 18 ...
##  $ write3: int  21 25 27 22 22 21 18 22 17 18 ...
summary(data2)
##      read1           read2           read3           list1      
##  Min.   : 0.00   Min.   : 3.00   Min.   : 4.00   Min.   : 1.00  
##  1st Qu.:13.00   1st Qu.:14.00   1st Qu.:16.00   1st Qu.:13.00  
##  Median :17.00   Median :19.00   Median :20.00   Median :17.00  
##  Mean   :17.38   Mean   :18.55   Mean   :19.55   Mean   :16.56  
##  3rd Qu.:22.00   3rd Qu.:23.00   3rd Qu.:23.00   3rd Qu.:21.00  
##  Max.   :30.00   Max.   :30.00   Max.   :30.00   Max.   :30.00  
##      list2           list3           speak1          speak2     
##  Min.   : 1.00   Min.   : 1.00   Min.   : 0.00   Min.   : 5.00  
##  1st Qu.:14.00   1st Qu.:15.00   1st Qu.:15.00   1st Qu.:17.00  
##  Median :18.00   Median :19.00   Median :18.00   Median :18.00  
##  Mean   :17.63   Mean   :18.58   Mean   :17.96   Mean   :18.73  
##  3rd Qu.:21.75   3rd Qu.:23.00   3rd Qu.:20.00   3rd Qu.:22.00  
##  Max.   :30.00   Max.   :30.00   Max.   :28.00   Max.   :27.00  
##      speak3          write1          write2          write3     
##  Min.   :10.00   Min.   : 7.00   Min.   : 8.00   Min.   : 7.00  
##  1st Qu.:17.00   1st Qu.:15.00   1st Qu.:17.00   1st Qu.:18.00  
##  Median :19.00   Median :20.00   Median :20.00   Median :20.50  
##  Mean   :19.24   Mean   :18.59   Mean   :19.58   Mean   :20.11  
##  3rd Qu.:22.00   3rd Qu.:21.00   3rd Qu.:22.00   3rd Qu.:22.00  
##  Max.   :29.00   Max.   :30.00   Max.   :28.00   Max.   :29.00


Fit the model to the data

We use the following model syntax to specify the model. As in the previous example, we specify the parameter labels to impose the equality constraints. Additionally, when we write the means formulas for the latent factors, we used a vector such as c(NA,h) as the multiplier, instead of a single character string. It is used because we want to give the parameter a label while allowing it to be freely estimated. Later, we used the parameter labels to create a new parameter that reflects the difference between the latent factor mean at time 2 vs. the latent factor mean at time 3 diff:=h-j.

model2 <- '

# measurement model 

english1 =~ 1*read1 + a*list1 + b*speak1 + c*write1
english2 =~ 1*read2 + a*list2 + b*speak2 + c*write2
english3 =~ 1*read3 + a*list3 + b*speak3 + c*write3

# covariances

read1 ~~ read2 + read3
read2 ~~ read3 
list1 ~~ list2 + list3
list2 ~~ list3 
speak1 ~~ speak2 + speak3
speak2 ~~ speak3
write1 ~~ write2 + write3
write2 ~~ write3

# intercepts

read1 ~ d*1
read2 ~ d*1
read3 ~ d*1
list1 ~ e*1
list2 ~ e*1
list3 ~ e*1
speak1 ~ f*1
speak2 ~ f*1
speak3 ~ f*1
write1 ~ g*1
write2 ~ g*1
write3 ~ g*1


# means

english1 ~ 0*1
english2 ~ c(NA,h)*1
english3 ~ c(NA,j)*1

# difference between means

diff := h - j

'

fit2 <- sem(model2, data2)
summary(fit2, fit.measures = T, standardized = T)
## lavaan 0.6.15 ended normally after 166 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        53
##   Number of equality constraints                    14
## 
##   Number of observations                           170
## 
## Model Test User Model:
##                                                       
##   Test statistic                               108.271
##   Degrees of freedom                                51
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              1711.627
##   Degrees of freedom                                66
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.965
##   Tucker-Lewis Index (TLI)                       0.955
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5352.428
##   Loglikelihood unrestricted model (H1)      -5298.293
##                                                       
##   Akaike (AIC)                               10782.856
##   Bayesian (BIC)                             10905.152
##   Sample-size adjusted Bayesian (SABIC)      10781.665
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.081
##   90 Percent confidence interval - lower         0.060
##   90 Percent confidence interval - upper         0.103
##   P-value H_0: RMSEA <= 0.050                    0.010
##   P-value H_0: RMSEA >= 0.080                    0.559
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.066
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   english1 =~                                                           
##     read1             1.000                               4.364    0.665
##     list1      (a)    1.127    0.095   11.818    0.000    4.921    0.795
##     speak1     (b)    0.659    0.062   10.699    0.000    2.875    0.708
##     write1     (c)    0.837    0.068   12.252    0.000    3.651    0.841
##   english2 =~                                                           
##     read2             1.000                               4.022    0.672
##     list2      (a)    1.127    0.095   11.818    0.000    4.535    0.708
##     speak2     (b)    0.659    0.062   10.699    0.000    2.649    0.700
##     write2     (c)    0.837    0.068   12.252    0.000    3.365    0.800
##   english3 =~                                                           
##     read3             1.000                               3.995    0.702
##     list3      (a)    1.127    0.095   11.818    0.000    4.505    0.783
##     speak3     (b)    0.659    0.062   10.699    0.000    2.632    0.705
##     write3     (c)    0.837    0.068   12.252    0.000    3.343    0.818
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .read1 ~~                                                              
##    .read2            12.016    2.151    5.585    0.000   12.016    0.554
##    .read3            11.181    1.989    5.620    0.000   11.181    0.563
##  .read2 ~~                                                              
##    .read3            10.090    1.831    5.509    0.000   10.090    0.561
##  .list1 ~~                                                              
##    .list2             5.658    1.786    3.168    0.002    5.658    0.333
##    .list3             3.794    1.470    2.582    0.010    3.794    0.282
##  .list2 ~~                                                              
##    .list3             5.713    1.705    3.350    0.001    5.713    0.353
##  .speak1 ~~                                                             
##    .speak2            4.519    0.800    5.648    0.000    4.519    0.583
##    .speak3            4.327    0.778    5.563    0.000    4.327    0.569
##  .speak2 ~~                                                             
##    .speak3            4.612    0.762    6.050    0.000    4.612    0.645
##  .write1 ~~                                                             
##    .write2            1.551    0.740    2.096    0.036    1.551    0.261
##    .write3            1.684    0.701    2.402    0.016    1.684    0.305
##  .write2 ~~                                                             
##    .write3            2.364    0.744    3.178    0.001    2.364    0.399
##   english1 ~~                                                           
##     english2         16.781    3.020    5.557    0.000    0.956    0.956
##     english3         15.992    2.900    5.515    0.000    0.917    0.917
##   english2 ~~                                                           
##     english3         15.821    2.801    5.648    0.000    0.985    0.985
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .read1      (d)   17.530    0.458   38.240    0.000   17.530    2.673
##    .read2      (d)   17.530    0.458   38.240    0.000   17.530    2.929
##    .read3      (d)   17.530    0.458   38.240    0.000   17.530    3.079
##    .list1      (e)   16.473    0.449   36.688    0.000   16.473    2.660
##    .list2      (e)   16.473    0.449   36.688    0.000   16.473    2.572
##    .list3      (e)   16.473    0.449   36.688    0.000   16.473    2.862
##    .speak1     (f)   17.980    0.294   61.189    0.000   17.980    4.426
##    .speak2     (f)   17.980    0.294   61.189    0.000   17.980    4.754
##    .speak3     (f)   17.980    0.294   61.189    0.000   17.980    4.815
##    .write1     (g)   18.580    0.320   58.076    0.000   18.580    4.280
##    .write2     (g)   18.580    0.320   58.076    0.000   18.580    4.417
##    .write3     (g)   18.580    0.320   58.076    0.000   18.580    4.549
##     english1          0.000                               0.000    0.000
##     english2   (h)    1.122    0.193    5.801    0.000    0.279    0.279
##     english3   (j)    1.906    0.228    8.377    0.000    0.477    0.477
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .read1            23.964    2.915    8.222    0.000   23.964    0.557
##    .list1            14.130    2.060    6.860    0.000   14.130    0.369
##    .speak1            8.235    1.037    7.944    0.000    8.235    0.499
##    .write1            5.519    0.943    5.852    0.000    5.519    0.293
##    .read2            19.648    2.385    8.238    0.000   19.648    0.548
##    .list2            20.460    2.607    7.847    0.000   20.460    0.499
##    .speak2            7.284    0.903    8.067    0.000    7.284    0.509
##    .write2            6.377    0.957    6.661    0.000    6.377    0.360
##    .read3            16.445    2.060    7.983    0.000   16.445    0.507
##    .list3            12.833    1.821    7.046    0.000   12.833    0.387
##    .speak3            7.017    0.880    7.969    0.000    7.017    0.503
##    .write3            5.506    0.870    6.332    0.000    5.506    0.330
##     english1         19.046    3.457    5.510    0.000    1.000    1.000
##     english2         16.174    2.940    5.501    0.000    1.000    1.000
##     english3         15.964    2.850    5.601    0.000    1.000    1.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     diff             -0.784    0.156   -5.019    0.000   -0.198   -0.198

© Copyright 2024 @Yi Feng and @Gregory R. Hancock.