Structural Equation Modeling



  1. Structural Equation Modeling: Intro to SEM

  2. Other Names • SEM – Structural Equation Modeling • CSA – Covariance Structure Analysis • Causal Models • Simultaneous Equation Modeling • Path Analysis (with Latent Variables) • Confirmatory Factor Analysis

  3. SEM in a nutshell • Combination of factor analysis and regression • Continuous and discrete predictors and outcomes • Relationships among measured or latent variables • Direct link between Path Diagrams and equations and fit statistics • Models contain both measurement and path models
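
To make the "measurement model plus path model" idea concrete, here is a minimal sketch of a hypothetical two-factor model written in the lavaan-style syntax accepted by packages such as lavaan (R) and semopy (Python); the factor and indicator names are invented for illustration and do not come from these slides.

    # Hypothetical SEM specification in lavaan-style syntax.
    # "=~" lines define the measurement model (factor =~ its indicators);
    # "~"  lines define the path (structural) model (outcome ~ predictors).
    model_spec = """
    risk_perception =~ rp1 + rp2 + rp3     # latent factor, three indicators
    safety_behavior =~ sb1 + sb2 + sb3     # latent factor, three indicators
    safety_behavior  ~ risk_perception     # structural (path) part
    """

Fitting such a specification to data estimates the loadings in the measurement part and the regression coefficient in the path part simultaneously.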

  4. An Example of a Path Diagram

  5. Vocabulary • Measured variable • Observed variables, indicators, or manifest variables in an SEM design • Predictors and outcomes in path analysis • Squares in the diagram • Latent variable • Unobserved variable in the model; also called a factor or construct • The construct driving the measured variables in the measurement model • Circles in the diagram

  6. More Vocabulary • Error or E • Variance left over after prediction of a measured variable • Disturbance or D • Variance left over after prediction of a factor • Exogenous variable • A variable that predicts other variables and is not itself predicted by anything in the model • Endogenous variable • A variable that is predicted by another variable in the model • A predicted variable is endogenous even if it in turn predicts another variable

  7. Still more Vocabulary • Measurement model • The part of the model that relates indicators to latent factors • The measurement model is the factor-analytic part of SEM • Path model • The part of the model that relates variables or factors to one another (prediction) • If there are no factors in the model, only a path model among the measured variables remains

  8. Even more Vocabulary • Direct effect • Regression coefficient for direct prediction • Indirect effect • Mediating effect of x1 on y through x2 • Confirmatory factor analysis • A measurement model estimated and tested on its own • Covariance structure • Relationships based on variances and covariances • Mean structure • Adds means (intercepts) to the model

  9. Back to Path Diagrams • Single-headed arrow → • This is prediction • Regression Coefficient or factor loading • Double headed arrow ↔ • This is correlation • Missing Paths • Hypothesized absence of relationship • Can also set path to zero

  10. The Previous Example

  11. Types of SEM questions • Does the model produce an estimated population covariance matrix that “fits” the sample data? • SEM calculates many indices of fit; close fit, absolute fit, etc. • Which model best fits the data? • What is the percent of variance in the variables explained by the factors? • What is the reliability of the indicators? • What are the parameter estimates from the model?

  12. SEM questions • Are there any indirect or mediating effects in the model? • Are there group differences? • Multi-group models • Can change in the variance (or mean) be tracked over time? • Growth Curve or Latent Growth Curve Analysis

  13. SEM questions • Can a model be estimated with individual and group level components? • Multilevel Models • Can latent categorical variables be estimated? • Mixture models • Can a latent group membership be estimated from continuous and discrete variables? • Latent Class Analysis

  14. SEM questions • Can we predict the rate at which people will drop out of a study or end treatment? • Discrete-time survival mixture analysis • Can these techniques be combined into a huge mess? • Multiple group multilevel growth curve latent class analysis???????

  15. SEM limitations • SEM is a confirmatory approach • You need established theory about the relationships • It cannot be used to explore possible relationships when you have more than a handful of variables • Exploratory methods (e.g. model modification) can be used on top of the original theory • SEM does not by itself establish causation; causal claims require experimental design

  16. SEM limitations • SEM is often thought of as strictly correlational, but (like regression) it can be used with experimental data if you know how to set it up • SEM is a sophisticated technique, but it cannot make up for a bad experiment, and the results generalize only to the population at hand

  17. SEM limitations • Biggest limitation is sample size • It needs to be large to get stable estimates of the covariances/correlations • 200 subjects for small to medium sized model • A minimum of 10 subjects per estimated parameter • Also affected by effect size and required power

  18. SEM limitations • Missing data • Can be dealt with in the typical ways (e.g. regression, EM algorithm, etc.) through SPSS and data screening • Most SEM programs will estimate missing data and run the model simultaneously • Multivariate Normality and no outliers • Screen for univariate and multivariate outliers • SEM programs have tests for multi-normality • SEM programs have corrected estimators when there’s a violation

  19. SEM limitations • Linearity • No multicollinearity/singularity • Residual covariances (R minus reproduced R) • Should be small • Centered around zero • Symmetric distribution of errors • If asymmetric, then some covariances are being estimated better than others

  20. Technical Stuff Follows

  21. Basic Structure
  Simple regression: y = γx + ζ
  Implied covariance matrix for (x, y), with φ = Var(x) and ψ = Var(ζ):
  Σ(θ) = | φ    γφ      |
         | γφ   γ²φ + ψ |
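
To see where the implied covariance matrix comes from, the following numpy sketch (all parameter values invented) builds Σ(θ) for the simple regression above and checks it against the covariance matrix of simulated data.

    import numpy as np

    # Invented parameter values for the simple regression y = gamma*x + zeta
    gamma, phi, psi = 0.5, 2.0, 1.0          # slope, Var(x), Var(zeta)

    # Implied covariance matrix Sigma(theta) for (x, y)
    Sigma = np.array([[phi,         gamma * phi],
                      [gamma * phi, gamma**2 * phi + psi]])

    # Compare with the sample covariance matrix of simulated data
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, np.sqrt(phi), 100_000)
    y = gamma * x + rng.normal(0.0, np.sqrt(psi), 100_000)
    print(Sigma)
    print(np.cov(np.vstack([x, y])))          # close to Sigma for large N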

  22. The univariate consequences of measurement error
  x = true score + error = ξ + δ
  Var(x) = Var(ξ) + Var(δ) = φ + θ
  Thus, Var(x) overestimates the variance of the true score.

  23. The bivariate consequences of measurement error
  • A simple regression model with measurement error: the true relation is y = βξ + ζ, but only the error-contaminated x = ξ + δ is observed
  • Regressing y on x attenuates the slope: the expected estimate is ρxx·β, where ρxx = φ/(φ + θ) is the measurement reliability of x
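
The attenuation effect is easy to verify by simulation: regressing y on an error-contaminated x recovers roughly ρxx·β rather than β. A minimal numpy sketch with invented values:

    import numpy as np

    rng = np.random.default_rng(1)
    n, beta = 200_000, 1.0
    phi, theta = 1.0, 0.5                 # Var(true score), Var(measurement error)
    rho_xx = phi / (phi + theta)          # reliability of x (= 2/3 here)

    xi = rng.normal(0.0, np.sqrt(phi), n)           # true score
    x = xi + rng.normal(0.0, np.sqrt(theta), n)     # observed, error-contaminated x
    y = beta * xi + rng.normal(0.0, 1.0, n)         # outcome depends on the true score

    c = np.cov(x, y)
    b_naive = c[0, 1] / c[0, 0]           # OLS slope of y on the observed x
    print(b_naive, rho_xx * beta)         # both close to 0.667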

  24. Introduction: The bivariate consequences of measurement error • Impact on goodness-of-fit • What's the impact on sample inference? • Generally, the distortions are not as systematic for multiple regression and simultaneous equation models

  25. Confirmatory Factor Analysis Model
  x = Λξ + δ
  where:
  x = (q × 1) vector of indicator/manifest variables
  ξ = (n × 1) vector of latent constructs (factors)
  δ = (q × 1) vector of errors of measurement
  Λ = (q × n) matrix of factor loadings

  26. Confirmatory Factor Analysis: Example
  Measures for positive emotions ξ1: x1 = Happiness, x2 = Pride
  Measures for negative emotions ξ2: x3 = Sadness, x4 = Fear
  Model:
  x1 = λ11 ξ1 + δ1
  x2 = λ21 ξ1 + δ2
  x3 = λ32 ξ2 + δ3
  x4 = λ42 ξ2 + δ4

  27. Confirmatory Factor Analysis: Example

  28. 1 2 11 21 34 32 x1 x2 x3 x4 1 2 3 4 Confirmatory Factor AnalysisGraphical Representation

  29. Confirmatory Factor Analysis: Model Assumptions
  E(ξ) = 0, E(δ) = 0
  Var(ξ) = Φ, Var(δ) = Θ
  Cov(ξ, δ) = 0
  Implied mean vector: E(x) = 0
  Implied covariance matrix: Σ(θ) = ΛΦΛ′ + Θ
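
As a concrete illustration of Σ(θ) = ΛΦΛ′ + Θ, the numpy sketch below assembles the implied covariance matrix for the two-factor, four-indicator emotions example; all parameter values are invented for the demonstration.

    import numpy as np

    # Factor loadings Lambda (4 indicators x 2 factors); zeros encode the
    # hypothesis that each indicator loads on only one factor.
    Lam = np.array([[1.0, 0.0],    # x1 (Happiness)  on xi1
                    [0.8, 0.0],    # x2 (Pride)      on xi1
                    [0.0, 1.0],    # x3 (Sadness)    on xi2
                    [0.0, 0.9]])   # x4 (Fear)       on xi2

    Phi = np.array([[1.0, -0.4],   # factor (co)variances; negative covariance
                    [-0.4, 1.0]])  # between positive and negative emotions

    Theta = np.diag([0.3, 0.4, 0.3, 0.5])   # measurement error variances

    Sigma = Lam @ Phi @ Lam.T + Theta       # implied covariance matrix
    print(Sigma)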

  30. Confirmatory Factor Analysis: Example

  31. Confirmatory Factor Analysis: Model Identification
  Definition: The set of parameters θ = {Λ, Φ, Θ} is not identified if there exist θ1 ≠ θ2 such that Σ(θ1) = Σ(θ2).

  32. Confirmatory Factor Analysis: Is the one-factor, two-indicator model identified?
  [Path diagram: one factor ξ with loadings λ11, λ21 to indicators x1, x2 and errors δ1, δ2]
  • Example: Measures for temperature ξ: x1 = Celsius, x2 = Fahrenheit
  • Measurement model: x1 = τ1 + λ11ξ + δ1, x2 = τ2 + λ21ξ + δ2, where τ1 and τ2 are measurement intercepts

  33. Confirmatory Factor Analysis: Scale Indeterminacy
  Recall the measurement model: x = Λξ + δ
  • Origin indeterminacy: resolved by setting E(ξ) = 0
  • Scale (unit) indeterminacy: resolved by fixing one loading per factor to 1 or by fixing each factor variance to 1
  • How should single-indicator factors be handled?
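
Scale indeterminacy can be demonstrated numerically: multiplying a factor's variance by c² while dividing its loadings by c leaves the implied covariance matrix unchanged, which is why a loading or the factor variance must be fixed. A small numpy sketch with invented values:

    import numpy as np

    Lam = np.array([[1.0], [0.8], [0.6]])      # one factor, three indicators
    Phi = np.array([[2.0]])                    # factor variance
    Theta = np.diag([0.3, 0.4, 0.5])           # error variances

    c = 5.0                                    # arbitrary rescaling of the factor
    Lam2 = Lam / c                             # loadings shrink ...
    Phi2 = Phi * c**2                          # ... while the factor variance grows

    Sigma1 = Lam @ Phi @ Lam.T + Theta
    Sigma2 = Lam2 @ Phi2 @ Lam2.T + Theta
    print(np.allclose(Sigma1, Sigma2))         # True: both parameter sets fit equally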

  34. Confirmatory Factor Analysis: The one-factor, two-indicator model is under-identified
  The population covariance matrix has only three distinct elements (Var(x1), Var(x2), Cov(x1, x2)), while the implied covariance matrix
  σ11 = λ11²φ + θ1, σ22 = λ21²φ + θ2, σ21 = λ11λ21φ
  still depends on four free parameters even with λ11 fixed to 1 (λ21, φ, θ1, θ2), so different parameter sets (Solution 1, Solution 2) reproduce the same covariance matrix.

  35. Confirmatory Factor Analysis: Is the one-factor, three-indicator model identified?
  [Path diagram: one factor ξ1 with loadings 1, λ21, λ31 to indicators x1, x2, x3 and errors δ1, δ2, δ3]

  36. Confirmatory Factor Analysis: The one-factor, three-indicator model is exactly identified
  With λ11 fixed to 1, the six free parameters (λ21, λ31, φ, θ1, θ2, θ3) exactly match the six distinct elements of the covariance matrix, so the solution is unique.
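
With λ11 fixed to 1, the remaining parameters can be solved in closed form from the covariance elements. The numpy sketch below (parameter values invented) builds Σ and recovers φ and the loadings.

    import numpy as np

    # Invented true parameters, with lambda_11 fixed to 1 to set the factor's scale
    lam2, lam3, phi = 0.8, 0.6, 2.0
    Lam = np.array([[1.0], [lam2], [lam3]])
    Sigma = Lam @ np.array([[phi]]) @ Lam.T + np.diag([0.3, 0.4, 0.5])

    # Solve back from the off-diagonal covariance elements
    s21, s31, s32 = Sigma[1, 0], Sigma[2, 0], Sigma[2, 1]
    phi_hat = s21 * s31 / s32        # (lam2*phi)(lam3*phi)/(lam2*lam3*phi) = phi
    lam2_hat = s21 / phi_hat
    lam3_hat = s31 / phi_hat
    print(phi_hat, lam2_hat, lam3_hat)    # recovers 2.0, 0.8, 0.6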

  37. Confirmatory Factor Analysis: Identification Rules
  • Number of free parameters ≤ ½ q(q + 1)
  • Three-Indicator Rule: n ≥ 1; one non-zero element per row of Λ; three or more indicators per factor; Θ diagonal
  • Two-Indicator Rule: n > 1; φij ≠ 0 for at least one pair i, j with i ≠ j; one non-zero element per row of Λ; two or more indicators per factor; Θ diagonal

  38. Confirmatory Factor Analysis: Maximum Likelihood Estimation
  xi ~ i.i.d. MVNq(0, Σ(θ)), i = 1, …, N
  The ML estimates minimize the fit function FML(θ) = ln|Σ(θ)| + tr(S Σ(θ)⁻¹) − ln|S| − q, where S is the sample covariance matrix.
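
For reference, a minimal numpy sketch of the ML fit function quoted above; when the sample covariance matrix S equals Σ(θ), the discrepancy is zero. The matrices are invented for illustration.

    import numpy as np

    def f_ml(S, Sigma):
        """ML discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - q."""
        q = S.shape[0]
        return (np.log(np.linalg.det(Sigma))
                + np.trace(S @ np.linalg.inv(Sigma))
                - np.log(np.linalg.det(S)) - q)

    # Invented matrices: an implied matrix and a slightly different "sample" matrix
    Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    S = np.array([[1.1, 0.45], [0.45, 0.95]])
    print(f_ml(Sigma, Sigma))   # 0.0 when S equals Sigma(theta)
    print(f_ml(S, Sigma))       # positive otherwise

At the minimum, (N − 1)·FML gives the familiar chi-square statistic for overall model fit.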

  39. Confirmatory Factor Analysis: Other Estimation Methods
  • Unweighted Least Squares (ULS)
  • Generalized Least Squares (GLS)

  40. Confirmatory Factor Analysis: The Asymptotic Covariance Matrix
  The asymptotic covariance matrix of the ML estimates is the inverse of the information matrix.

  41. Confirmatory Factor Analysis: Goodness-of-fit Measures
  • Root Mean-Square Residual
  • Correlation residuals
  • Goodness-of-Fit Index (GFI)
  • Communalities/Reliabilities
  • Coefficient of Determination

  42. Confirmatory Factor Analysis: Goodness-of-fit Measures

  43. Confirmatory Factor Analysis: Other Goodness-of-fit Indices
  • Root Mean Square Error of Approximation: RMSEA = √( max(χ² − df, 0) / (df · (N − 1)) ), where df = q(q + 1)/2 − t (degrees of freedom)
  • RMSEA ≤ 0.05: close fit; 0.05 < RMSEA ≤ 0.08: reasonable fit; RMSEA > 0.1: poor fit
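
A small helper implementing the RMSEA formula above (using the N − 1 convention shown; some programs divide by N instead), evaluated at invented values of χ², df, and N:

    import math

    def rmsea(chi2, df, n):
        """Root Mean Square Error of Approximation (N - 1 convention)."""
        return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

    # Invented example values
    print(rmsea(chi2=36.5, df=19, n=250))   # about 0.061 -> reasonable fit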

  44. Confirmatory Factor Analysis: Multitrait-Multimethod Example
  [Diagram: multitrait-multimethod covariance structure among the indicators x1–x4]

  45. Confirmatory Factor Analysis: Multitrait-Multimethod Example
  [Path diagram: trait and method factors with indicators x1–x4 and measurement errors δ1–δ4]

  46. Brand Halos and Brand Evaluations (Lynd Bacon, 1999)
  [Path diagram: attribute factors Performance and Quality and brand factors DirtyScooter and TrailBomber, with indicators Pd1, Pd2, Pt1, Pt2, Qd1, Qd2, Qt1, Qt2]

  47. Brand Halos and Brand Evaluations: Sources of Variance
  Brand          Attribute   Variance components
  DirtyScooter   Pd1         0.71   0.04
                 Pd2         0.74   0.02
  TrailBomber    Pt1         0.40   0.39
                 Pt2         0.41   0.30

  48. Convergent and Discriminant Validity: Bagozzi and Yi (1993)
  • Attitude towards coupons (ξ1) with three semantic differential measures: x1 = pleasant/unpleasant, x2 = good/bad, x3 = favorable/unfavorable
  • Subjective norms (ξ2) with two measures: x4 = "Most people who are important to me think I definitely should use coupons for shopping in the supermarket", x5 = "Most people who are important to me probably consider my use of coupons to be wise."

  49. .86 1 2 .75 .73 .90 .69 x1 x3 x4 x5 1 .43 3 .47 4 .52 5 .19 Convergent and Discriminant Validity Bagozzi and Yi (1993) .82 x2 2 .33

  50. Convergent and Discriminant Validity: Bagozzi and Yi (1993)
  • Convergent validity: acceptable goodness-of-fit; all loadings are high and significant
  • Discriminant validity: H0: φ12 = 1 (perfect correlation between the factors) is rejected
  • Measurement reliability: ρx1 = .56, ρx2 = .67, ρx3 = .53, ρx4 = .48, ρx5 = .81
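
With standardized factors and indicators, each indicator's reliability is simply its squared standardized loading. The sketch below reproduces the reliabilities reported on this slide from the loadings as reconstructed on slide 49.

    # Standardized loadings as reconstructed from the path diagram on slide 49
    loadings = {"x1": 0.75, "x2": 0.82, "x3": 0.73, "x4": 0.69, "x5": 0.90}

    # Indicator reliability = squared standardized loading
    reliabilities = {name: round(lam ** 2, 2) for name, lam in loadings.items()}
    print(reliabilities)   # {'x1': 0.56, 'x2': 0.67, 'x3': 0.53, 'x4': 0.48, 'x5': 0.81}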
