Latent Growth Curve Models Patrick Sturgis, Department of Sociology, University of Surrey
Overview • Random effects as latent variables • Growth parameters • Specifying time in LGC models • Linear Growth • Non-linear growth • Explaining Growth • Fixed and time-varying predictors • Benefits of SEM framework
SEM for Repeated Measures • The SEM framework can be used on repeated measures data to model individual growth trajectories. • For cross-sectional data, latent variables are specified as a function of different items at the same time point. • For repeated measures data, latent variables are specified as a function of the same item at different time points.
A Single Latent Variable Model [Path diagram: one latent variable (LV) with four indicators X11, X12, X13, X14 and error terms E1, E2, E3, E4.] • Cross-sectional case: 4 different items; estimate factor loadings; estimate mean and variance of underlying factor. • Repeated measures case: same item at 4 time points; constrain factor loadings; estimate mean and variance of trajectory of change over time.
Repeated Measures & Random Effects • We have average (or ‘fixed’) effects for the population as a whole • And individual variability (or ‘random’) effects around these average coefficients
Random Effects as Latent Variables • In LGC: • The mean of the latent variable is the fixed part of the model. • It indicates the average for the parameter in the population. • The variance of the latent variable is the random part of the model. • It indicates individual heterogeneity around the average. • Or inter-individual difference in intra-individual change.
Growth Parameters • The earlier path diagram was an over-simplification. • In practice we require at least two latent variables to describe growth. • One to estimate the mean and variance of the intercept. • And one to estimate the mean and variance of the slope.
Specifying Time in LGC Models • In random effects models, time is included as an independent variable. • In LGC models, time is included via the factor loadings of the latent variables. • We constrain the factor loadings to take on particular values. • The number of latent variables and the values of the constrained loadings specify the shape of the trajectory.
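One common way to write the two specifications side by side is sketched here. The notation (person i, occasion t, coefficients \beta, factors \eta, loadings \lambda) is assumed for illustration and is not taken from the slides.

```latex
% Random effects model: time t enters as an independent variable
y_{it} = \beta_{0i} + \beta_{1i}\, t + \varepsilon_{it},
\qquad \beta_{0i} = \beta_{0} + u_{0i},
\qquad \beta_{1i} = \beta_{1} + u_{1i}

% LGC model: time enters through constrained factor loadings
y_{it} = \lambda_{0t}\, \eta_{0i} + \lambda_{1t}\, \eta_{1i} + \varepsilon_{it},
\qquad \lambda_{0t} = 1,
\qquad \lambda_{1t} = t \in \{0, 1, 2, 3\}
```

With the loadings fixed in this way, \eta_{0i} plays the role of \beta_{0i} and \eta_{1i} the role of \beta_{1i}, so the factor means correspond to the fixed effects and the factor variances to the random effects.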
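A Linear Growth Curve Model [Path diagram: an intercept factor with loadings 1, 1, 1, 1 and a slope factor with loadings 0, 1, 2, 3 on the four repeated measures.] • Constraining the loadings of the intercept to 1 makes this parameter indicate initial status. • Constraining the loadings of the slope to 0, 1, 2, 3 makes this parameter indicate linear change.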
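Quadratic Growth [Path diagram: four repeated measures X1, X2, X3, X4 with error terms E1, E2, E3, E4; ICEPT loadings 1, 1, 1, 1; SLOPE loadings 0, 1, 2, 3; QUAD loadings 0, 1, 4, 9.] • Add additional latent variables with factor loadings constrained to powers of the linear slope.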
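The constrained loadings for linear and quadratic growth can be generated mechanically from the time codes. The following is a minimal sketch, assuming equally spaced occasions coded 0, 1, 2, 3; the function names and the factor means in the last line are hypothetical and chosen only to illustrate the arithmetic.

```python
# Sketch only: build the constrained loading matrix for a growth model
# with occasions coded 0, 1, 2, ... (assumed equally spaced).

def growth_loadings(n_occasions, degree):
    """Return one row per occasion: [t**0, t**1, ..., t**degree]."""
    return [[t ** p for p in range(degree + 1)] for t in range(n_occasions)]


def implied_means(loadings, factor_means):
    """Model-implied mean at each occasion: sum of loading * factor mean."""
    return [sum(l * m for l, m in zip(row, factor_means)) for row in loadings]


linear = growth_loadings(4, degree=1)      # columns: ICEPT, SLOPE
quadratic = growth_loadings(4, degree=2)   # columns: ICEPT, SLOPE, QUAD

for row in quadratic:
    print(row)

# Hypothetical factor means (ICEPT, SLOPE, QUAD), for illustration only.
print(implied_means(quadratic, [10.0, 2.0, -0.5]))
```

The QUAD column is simply the square of the SLOPE column, which is what "powers of the linear slope" amounts to for these time codes.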
File structure for LGC • For random effect models, we use ‘long’ data file format. • There are as many rows as there are observations. • For LGC, we use ‘wide’ file formats. • Each case (e.g. respondent) has only one row in the data file.
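The difference between the two file formats can be illustrated with a small reshaping routine. This is a sketch using only the Python standard library; the records, the column names (`id`, `time`, `score`) and the helper `long_to_wide` are all hypothetical.

```python
# Sketch only: reshape hypothetical 'long' records (one row per person per
# occasion) into 'wide' records (one row per person).

long_rows = [
    {"id": 1, "time": 0, "score": 10},
    {"id": 1, "time": 1, "score": 12},
    {"id": 2, "time": 0, "score": 8},
    {"id": 2, "time": 1, "score": 9},
]


def long_to_wide(rows, id_key="id", time_key="time", value_key="score"):
    """Collect each case's repeated measures into a single row."""
    wide = {}
    for row in rows:
        case = wide.setdefault(row[id_key], {id_key: row[id_key]})
        # One column per occasion, e.g. score_t0, score_t1, ...
        case[f"{value_key}_t{row[time_key]}"] = row[value_key]
    return [wide[key] for key in sorted(wide)]


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

The long file has four rows (two people, two occasions each); the wide file has two rows, one per person, with one column per occasion.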
A (made up) Example • We are interested in the development of knowledge of longitudinal data analysis. • We have measures of knowledge on individual students taken at 4 time points. • Test scores have a minimum value of zero and a maximum value of 25. • We specify linear growth.
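Linear Growth Example [Path diagram: intercept loadings 1, 1, 1, 1; slope loadings 0, 1, 2, 3.] • Intercept: mean = 11.2 (1.4), p < 0.001; variance = 4.1 (0.8), p < 0.001 • Slope: mean = 1.3 (0.25), p < 0.001; variance = 0.6 (0.1), p < 0.001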
Interpretation • The average level of knowledge at time point one was 11.2. • There was significant variation across respondents in this initial status. • On average, students increased their knowledge score by 1.3 units at each time point. • There was significant variation across respondents in this rate of growth. • Having established this descriptive picture, we will want to explain this variation.
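The estimates from the example diagram (intercept mean 11.2, slope mean 1.3, slope variance 0.6) can be turned into a model-implied average trajectory. The sketch below also gives a rough range for individual slopes; that part rests on an added assumption, not stated in the slides, that individual slopes are normally distributed.

```python
import math

# Estimates reported in the linear growth example.
intercept_mean = 11.2
slope_mean = 1.3
slope_variance = 0.6
slope_loadings = [0, 1, 2, 3]

# Model-implied mean score at each of the four time points.
implied = [round(intercept_mean + slope_mean * t, 1) for t in slope_loadings]
print(implied)

# Assumption (not from the slides): slopes are normally distributed, so about
# 95% of individual slopes fall within 1.96 standard deviations of the mean.
slope_sd = math.sqrt(slope_variance)
low = slope_mean - 1.96 * slope_sd
high = slope_mean + 1.96 * slope_sd
print(round(low, 2), round(high, 2))
```

Under that normality assumption, the slope variance implies that most students improve over time but a minority show little or no growth.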
Explaining Growth • Up to this point the models have been concerned only with describing growth. • These are unconditional LGC models. • We can add predictors of growth to explain why some people grow more quickly than others. • These are conditional LGC models. • This is equivalent to fitting an interaction between time and predictor variables in random effects models.
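The equivalence with a time-by-predictor interaction can be seen by substitution. The derivation below is a sketch for a single time-invariant predictor w_i; all symbols (\eta, \lambda, \alpha, \gamma, \zeta) are assumed notation rather than notation from the slides.

```latex
% Measurement part: intercept factor eta_0, slope factor eta_1, time code lambda_t
y_{it} = \eta_{0i} + \lambda_{t}\, \eta_{1i} + \varepsilon_{it}

% Conditional part: both growth factors regressed on the predictor w_i
\eta_{0i} = \alpha_{0} + \gamma_{0}\, w_{i} + \zeta_{0i},
\qquad
\eta_{1i} = \alpha_{1} + \gamma_{1}\, w_{i} + \zeta_{1i}

% Substituting the conditional part into the measurement part
y_{it} = \alpha_{0} + \gamma_{0}\, w_{i} + \alpha_{1}\, \lambda_{t}
       + \gamma_{1}\, w_{i}\, \lambda_{t}
       + \zeta_{0i} + \lambda_{t}\, \zeta_{1i} + \varepsilon_{it}
```

The term \gamma_{1} w_{i} \lambda_{t} is the product of the predictor and time, so the effect of the predictor on the slope factor is the coefficient on the time-by-predictor interaction.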
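Time-Invariant Predictors [Path diagram: Gender (women = 0; men = 1) predicting the intercept factor (loadings 1, 1, 1, 1) and the slope factor (loadings 0, 1, 2, 3).] • Do men have a different initial status than women? • Do men grow at a different rate than women? • Does initial status influence rate of growth?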
Why SEM? • Most of this could be done using random or fixed effects models. • SEM has some specific advantages which might lead us to prefer it over potential alternatives: • SPSS linear mixed model • HLM • MLwiN • Stata (RE, FE)
Fixed Effects/Unit Heterogeneity • A fixed effects specification removes ‘unit effects’. • This controls for all observed and unobserved time-invariant unit characteristics. • Highly desirable when one’s interest is in the effect of time-varying variables on the outcome. • In SEM, this is done by allowing the random effect to be correlated with all observed covariates. • Downside: no information about the effect of time-invariant variables, and a possible efficiency loss. • SEM allows various hybrid models which fall between the classic random and fixed effect specifications.
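Outside SEM, the usual route to the same result is the within (demeaning) transformation, which subtracts each unit's own mean. The sketch below shows how that removes a time-invariant unit effect. The data are hypothetical and noise-free (y = unit effect + 2.0 * x), so the slope is recovered exactly; this illustrates the idea and is not the SEM estimator described above.

```python
# Sketch only: the within transformation on a tiny hypothetical panel.
TRUE_B = 2.0

panel = {
    "A": {"unit_effect": 5.0, "x": [1.0, 2.0, 4.0]},
    "B": {"unit_effect": -3.0, "x": [0.0, 3.0, 3.0]},
}


def demean(values):
    """Subtract the unit's own mean from each of its observations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


x_within, y_within = [], []
for unit in panel.values():
    # Outcome = time-invariant unit effect + effect of time-varying x.
    y = [unit["unit_effect"] + TRUE_B * x for x in unit["x"]]
    x_within += demean(unit["x"])
    y_within += demean(y)   # the unit effect cancels out here

# Least-squares slope through the origin on the demeaned data.
b_hat = sum(x * y for x, y in zip(x_within, y_within)) / sum(
    x * x for x in x_within
)
print(b_hat)
```

Because demeaning also wipes out any covariate that is constant within a unit, the same step shows why the effect of time-invariant variables cannot be estimated this way.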
Random effect model [Path diagram: four paths, each labeled b.]
Fixed effect model [Path diagram: four paths, each labeled b.]
Hybrid model [Path diagram: four paths labeled b, plus a covariate Z.] • Remove equality constraint on beta weights • Allow correlated errors • Introduce time-invariant covariate (Z) that has an indirect effect on X
Multiple Indicator LGC Models • Single indicators assume concepts measured without error • Multiple indicators allow correction for systematic and random error • Reduced likelihood of Type II errors (failing to reject false null) • Tests for longitudinal meaning invariance • Allows modeling of measurement error covariance structure
Other Benefits of SEM • Global tests and assessments of model fit • Full Information Maximum Likelihood for missing data • Decomposition of effects – total, direct and indirect • Probability weights • Complex sample data
Multiple Process Models [Path diagram: two parallel growth processes, X and Y, each measured at four time points (t1 to t4) with error terms E1 to E8; growth factors ICEPT1, SLOPE1, ICEPT2 and SLOPE2, with intercept loadings of 1 and slope loadings of 0, 1, 2, 3.] • Does rate of growth on one variable influence rate of growth on the other? • Does initial status on one variable influence development on the other?