To access the slides for today go to: • www.danieldicksonphd.com • Supplementary Materials • Password: mplus2014 • Includes relevant literature cited and outputs by analysis

Mplus: The Who? The What? The Why? And the How? Workshop for Loyola University Chicago’s Department of Psychology June 3, 2014

Disclaimer • I have no personal investment in or affiliation with Mplus • This workshop is not meant to take the place of a true structural equation modeling course • Hopefully, I can give you exposure to what Mplus can do, but also what you can do with SEM approaches.

One more note! • I will be using SEM terms such as factor loadings, variances, covariances, disturbance terms, and fit statistics • I will do some orientation to these terms • Please feel free to interrupt me if you have any questions

Overview of Today • Brief history of Mplus • What is Mplus? • Important Data Issues in Mplus • This isn’t our good friend SPSS or Excel • Mplus Applications • Types of questions that can be answered • Step-by-step analysis for select research questions

Mplus • Created by Bengt Muthén and his wife Linda Muthén • Muthén was a disciple of Jöreskog, who developed LISREL • Swedish mathematicians know their stuff • He wanted to create a program that could “do it all”

Mplus: What is it? • A powerful statistical package for analysis of latent and observed variables • Exploratory factor analysis (related to, but not the same as, PCA) • Confirmatory factor analysis • Latent class analysis • Latent growth curve modeling • Multilevel modeling • Structural equation modeling • Path analysis • And more…

Mplus: What is it? • It can use categorical and continuous variables • Often permits missing data • Can be run on Windows or on a Mac • Can also use it for regression! • Logistic (multinomial, ordinal) • Poisson • Tobit • Survival analysis (continuous time and discrete time)

Fun Applications/Bells and Whistles • Bootstrapping • Used in mediation to test for significance of an indirect effect • Can be used to test significance of any parameter • Maximum likelihood estimation • Allows for missing data (provided MCAR/MAR) • Often described as a form of model-based imputation, though no missing values are actually filled in

Mplus: What is it? • Some bad news: • Requires syntax • Can’t simply do drop-down menus (like our friend SPSS) • Some good news: • The Muthens do a very good job of working to make Mplus user friendly • Don’t have to use matrices (a la LISREL) • Can use graphics to model your model • Similar to AMOS

Data Entry Issues • Preparing the data is probably the most frustrating aspect of using any software • Data are in a specific format • Deviates from the SPSS format • Can use SPSS to export the data • ASCII--It’s as simple as a “Save As…” • .csv • .dat Additional Resources: www.ats.ucla.edu/stat/mplus/seminars/introMplus_part1/entering_52.htm

Is this your data or “The Matrix”?

Data Entry Issues • Not allowed to have empty cells (what you might do in SPSS) • Must code your missing data as a number • E.g. -999 • You have to assign names to each of your variables • You can’t write them into your ASCII file • I’ll show you how this works! • Names can’t be longer than 8 characters (come back to this) • PAIN to keep track of! Additional Resources: www.ats.ucla.edu/stat/mplus/seminars/introMplus_part1/entering_52.htm
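If you would rather script the export than use “Save As…”, the rules above can be sketched in Python (standard library only). This is only an illustration: the variable names and the two example cases are made up, and -999 is simply the missing code suggested on this slide.

```python
import csv
import io

MISSING_CODE = -999  # numeric flag for missing values, as suggested above


def to_mplus_rows(records, names, missing=MISSING_CODE):
    """Return rows in a fixed variable order, with None and blanks recoded."""
    for name in names:
        if len(name) > 8:
            raise ValueError(f"variable name longer than 8 characters: {name}")
    rows = []
    for rec in records:
        rows.append([missing if rec.get(n) in (None, "") else rec[n] for n in names])
    return rows


def write_mplus_csv(records, names, handle):
    """Write the data only: no header row, because names go in the input file."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerows(to_mplus_rows(records, names))


# Hypothetical example data: two cases, one missing score.
names = ["id", "DepT1", "DepT2"]
records = [
    {"id": 1, "DepT1": 12, "DepT2": 15},
    {"id": 2, "DepT1": None, "DepT2": 9},
]
buffer = io.StringIO()
write_mplus_csv(records, names, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass an open file handle (for example `open("Data.csv", "w", newline="")`) instead of the in-memory buffer.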

Coding for missing data • Have to let the program know when you have “missing” data • It’s a control freak • Needs to know so that it can figure out what to do with the missing information • May dictate your estimator • Will have implications for including people who have missing data, those who don’t, and accuracy of your parameter estimates

Why is it a good idea to code for missing data? • Required to report the amount of missing data in manuscripts • Brings your attention to tests of missingness • It will allow some model imputation if the missing data are MAR/MCAR (more on this in a moment)

Types of Missingness • Missing Completely at Random (MCAR) • No association with unobserved variables (selective process) and no association with observed variables • Missing at Random (MAR) • No association with unobserved variables, but maybe related to observed variables • Missing Not at Random (MNAR) • Some association with unobserved variables and maybe with observed variables Baraldi, A.N., & Enders, C.K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48, 5-37. Enders, C.K. (2010). Applied missing data analysis. New York: Guilford Press.

Testing for Missingness: Distinguish between MCAR and MAR • Empirically evaluate relations between observed variables and missing values • Create a dummy variable with two values: missing and nonmissing • Use standard statistical procedures to test the relation between this variable and the other variables of interest in the data set • A) if the dummy variable is not related to any other variables, then the data are either MCAR or MNAR • B) if the dummy variable is associated with other variables, then the data are MAR or MNAR
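The dummy-variable check can be sketched in a few lines of Python (standard library only). Everything here is hypothetical: the scores are invented, `DepT2`-style depression scores use -999 as the missing code, and Welch's t statistic stands in for whatever “standard statistical procedure” you prefer. The snippet computes the statistic only; judging significance still requires a reference distribution or a stats package.

```python
from math import sqrt
from statistics import mean, variance

MISSING_CODE = -999  # assumed missing-data code


def missingness_dummy(values, missing=MISSING_CODE):
    """1 if the value is missing, 0 if it was observed."""
    return [1 if v == missing else 0 for v in values]


def welch_t(group_a, group_b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical data: depression at time 2 (with missing values) and a stress score.
dep_t2 = [10, -999, 14, -999, 9, 11, -999, 13]
stress = [3, 8, 4, 9, 2, 5, 7, 4]

dummy = missingness_dummy(dep_t2)
stress_missing = [s for s, d in zip(stress, dummy) if d == 1]
stress_observed = [s for s, d in zip(stress, dummy) if d == 0]

# A large |t| suggests missingness is related to stress (case B: MAR or MNAR).
t_stat = welch_t(stress_missing, stress_observed)
print(round(t_stat, 2))
```

In practice you would repeat the comparison for each variable of interest, which is what the whole-data-set tests mentioned in this workshop automate.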

Tests of Missingness • There are many tests for missingness. Some are available in SPSS (Little’s test for missingness) and some are what you can do on your own • See previous slide • There are whole courses on missingness! • For more information I recommend • Article intro to missingness: • Baraldi, A.N., & Enders, C.K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48, 5-37. • Digestible text: • Enders, C.K. (2010). Applied missing data analysis. New York: Guilford Press.

Why did you go there? • I went there because Mplus uses maximum likelihood estimation • MLE uses all of the available data, complete and incomplete, to identify the parameter values most likely to have produced the sample data • Previous work suggests that parameter estimates will be unbiased if the data satisfy the MAR assumption Schlomer, G.L., Bauman, S., & Card, N.A. (2010). Best practices for missing data management in counseling psychology. Journal of Counseling Psychology, 57, 1-10.

Steps and Procedure • Go to SPSS • Mplus Dataset.sav • Code your missing data (you can’t just leave it blank) • Save as a .csv (do not save variable names) • Variable List.xls • Create an Excel file with your variable names • Add this variable list to your Mplus syntax

How do I get my data there? Dataset: Mplus Dataset.sav to convert to Data.csv

Let’s start getting familiar…

Mplus Interface (Mac)

Mplus Interface (Mac Diagrammer)

But First: Some Conventions • Syntax-based • Commands can appear in any order • No line can be longer than 90 characters • Each statement must end with “;” • I’ll repeat this later • NOT case sensitive (Time1 = timE1) • To leave comments, anything after a ! is “dead text” and it will appear in green

The 10 Commands • TITLE • DATA (required) • VARIABLE (required) • DEFINE • ANALYSIS • MODEL • OUTPUT • SAVEDATA • PLOT • MONTECARLO

Title • Where you can state your analyses • E.g. TITLE: Path Analysis for Indirect Effect of M in Relationship Between X and Y; • Good way to keep track of what you’re doing • Cannot exceed 90 characters • NOTE: EVERY COMMAND MUST BE TERMINATED WITH ;

Data • Your data set should be in the same folder as your input file (otherwise you have to give the full path) • The program “looks” for the file you state • You tell the program the name of the file and the number of observations • “DATA: FILE IS FullData-Mplus.csv;” • “NOBSERVATIONS ARE 205;” • You can also enter covariance and correlation matrices if you are doing replication studies

Variable • Where you list the names of your variables • The program won’t know which variable is which, unless you properly name them • Provide names for each variable • FEW RULES: • Similar to SPSS; you can’t have redundant names • The names of each variable cannot be more than 8 characters
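Because long variable lists are a pain to keep track of, it can help to check them before pasting them into the input file. The Python sketch below (standard library only) enforces the two rules on this slide and wraps the resulting NAMES ARE statement at 90 characters per line. The item names (`cesd1` to `cesd20`, `stress`, `gender`) and the helper names are hypothetical.

```python
import textwrap


def check_names(names, max_len=8):
    """Raise ValueError if a name is too long or repeats (Mplus ignores case)."""
    seen = set()
    for name in names:
        if len(name) > max_len:
            raise ValueError(f"name longer than {max_len} characters: {name}")
        key = name.lower()
        if key in seen:
            raise ValueError(f"duplicate name: {name}")
        seen.add(key)


def names_statement(names, width=90):
    """Build a NAMES ARE statement wrapped so that no line exceeds the width."""
    check_names(names)
    text = "NAMES ARE " + " ".join(names) + ";"
    lines = textwrap.wrap(
        text,
        width=width,
        subsequent_indent="    ",
        break_long_words=False,
        break_on_hyphens=False,
    )
    return "\n".join(lines)


# Hypothetical variable list: an id, 20 CES-D items, and two covariates.
names = ["id"] + [f"cesd{i}" for i in range(1, 21)] + ["stress", "gender"]
statement = names_statement(names)
print(statement)
```

The printed statement can be pasted under the VARIABLE command; the order of the names must match the order of the columns in the data file.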

Variable List Example • Mplus Workshop Handouts Outputs > To Make Databases > Variable List.xls

Variable (cont.) • You also have to state which variables are categorical • Statement: CATEGORICAL ARE • State what your code is for “missing” • Statement: MISSING ARE ALL (-999); • It is also here where you tell the program which variables to use • Thankfully, this statement is simply USEVARIABLES ARE

Neato Trick • VARIABLE: NAMES ARE DepT1 DepT2 DepT3 DepT4 DepT5; • USEVARIABLES ARE DepT1-DepT5; • This is the same as: USEVARIABLES ARE DepT1 DepT2 DepT3 DepT4 DepT5;

Define • Used when you want to create variables • Can be useful when creating centered variables • Can be useful if you like creating your own interaction effects • DEFINE: cCESD = (CESD-4.08); • See Path Analysis Moderation Section for more information

Analysis (sample) • ANALYSIS: • TYPE IS GENERAL; • ITERATIONS = 3000; !Typically 1000 can be used • CONVERGENCE = .00005; • BOOTSTRAP = 1000;

Analysis • When running analyses we have many different forms of estimation • ML (normal data; missingness MCAR/MAR) • MLR (for non-normal data) • WLS (for ordinal data) • Etc. • Mplus picks a default estimator for each type of analysis, but it DOES NOT check whether that default suits your data, so choose deliberately • In most cases, for normal data, you can use ML

Analysis • Iterations: • The maximum number of iterations the estimator is allowed before it gives up on converging • I typically keep it at 1,000, but I’ve used as high as 3,000 • Convergence: • I keep the same: =.00005 • Bootstrap • OPTIONAL; if I’m running mediation, I will usually keep BOOTSTRAP in the ANALYSIS command • In many cases, if you need to change these values, Mplus will give you a warning to do so!

Model • Where you state the analyses you want to do • ON=For regression statements • DepVar ON Ind1; • BY=To define a latent variable • LateVar BY Ind1 Ind2 Ind3; • WITH=To correlate variables • Ind1 WITH Ind2;

Model • * = To free a parameter • LateVar BY Ind1* Ind2 Ind3; • * followed by a number = setting a “starting value” • This gives the estimator a good place to start its iterations, which helps the model converge • Typically not used unless models have difficulty converging • [Ind1*2.01] provides a “starting value” of 2.01 for the intercept of Ind1 • @ = To fix a parameter at a specific value • LateVar BY Ind1* Ind2@1 Ind3;

Output • You can select what statistics you would like to view: • Sample statistics: SAMPSTAT • Confidence intervals: CINTERVAL (request for bootstrapping intervals!) • Modification Indices: MODINDICES • Standardized results: STANDARDIZED • TECH1: Parameters estimated and Starting Values • TECH4: Estimates from model • Covariances for latent variables • Correlations for latent variables
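To see how the commands covered so far fit together, here is a sketch that assembles a small input file as text from Python and checks the 90-character rule along the way. The file name `Data.csv` comes from the earlier conversion slide; the title, the indicator names (`par1` to `par4`), the latent variable name `dep`, and the helper `build_input` are all assumptions made for illustration, not the workshop's actual handout.

```python
def build_input(title, data_file, names, use, latent, missing=-999):
    """Assemble a small Mplus input file for a one-factor model as text."""
    lines = [
        f"TITLE: {title};",
        f"DATA: FILE IS {data_file};",
        "VARIABLE: NAMES ARE " + " ".join(names) + ";",
        "  USEVARIABLES ARE " + " ".join(use) + ";",
        f"  MISSING ARE ALL ({missing});",
        "ANALYSIS: TYPE IS GENERAL;",
        "  ESTIMATOR IS ML;",
        f"MODEL: {latent} BY " + " ".join(use) + ";  ! latent variable, indicators",
        "OUTPUT: SAMPSTAT STANDARDIZED MODINDICES;",
    ]
    # Mplus input lines may not exceed 90 characters.
    too_long = [line for line in lines if len(line) > 90]
    if too_long:
        raise ValueError("line exceeds 90 characters: " + too_long[0])
    return "\n".join(lines) + "\n"


# Hypothetical one-factor model with four parcels as indicators.
input_text = build_input(
    title="CFA of CES-D parcels",
    data_file="Data.csv",
    names=["id", "par1", "par2", "par3", "par4"],
    use=["par1", "par2", "par3", "par4"],
    latent="dep",
)
print(input_text)
```

Saving `input_text` to a file with an `.inp` extension in the same folder as the data gives you something to open in the Mplus editor and adapt.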

One Final Note: What’s in the output? • Fit statistics (except in latent class analysis) • χ² (we want p > .05!) • CFI (Closer to 1.00 the better) • TLI (Closer to 1.00 the better) • RMSEA (.05 or below; CI less than .10) • SRMR (.05 or below) • For recommendations on fit statistics see Hu and Bentler (1999)

Sample Size Issues • N:q rule • Should have AT LEAST 10 people per parameter you estimate • This is simply a rule of thumb; if you want to determine sample size a priori, I recommend • Muthén, L.K., & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. • Report available at www.statmodel.com (Mplus website)
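The N:q rule is plain arithmetic, sketched below in Python. The sample size of 205 is borrowed from the DATA example earlier; the count of 21 estimated parameters is a hypothetical model, and the function names are made up for illustration.

```python
def minimum_n(n_parameters, people_per_parameter=10):
    """Smallest sample size that satisfies the N:q rule of thumb."""
    return n_parameters * people_per_parameter


def meets_nq_rule(n, n_parameters, people_per_parameter=10):
    """True if there are at least `people_per_parameter` cases per parameter."""
    return n >= minimum_n(n_parameters, people_per_parameter)


# Hypothetical model estimating 21 parameters with the 205 cases used earlier.
needed = minimum_n(21)
enough = meets_nq_rule(205, 21)
print(needed, enough)
```

Remember that this only checks the rule of thumb; a Monte Carlo study is the better way to decide on sample size and power.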

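Identified model? • You have to “buy” estimation • You have as many “tokens” as observations, meaning the unique variances and covariances among your measures: p(p+1)/2 for p measures • Example: 6 measures in your model gives 6(7)/2 = 21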

Identified Model • That means I am only allowed to estimate 21 parameters • It’s akin to going to the movies with only so much cash available (w/o a credit card) • If you’re short, you can’t ask a friend for money either! • If you ask the program to estimate more than 21 parameters, you will not have an identified model

Identified Model? • An example of an unidentified model: • I have two measures of depression: the Beck Depression Inventory and the MASQ-AD • I want to see if I have enough “buying power” to estimate if they are part of the same latent variable…

Identified Model? • In CFA, given two indicators (BDI & MASQ-AD), we estimate one factor loading (the other is fixed at 1), two error variances, and a factor variance • That means we are estimating: • 1+2+1=4 • BUT WE ONLY HAVE 3 observations (2 variances + 1 covariance) to work with! • Then we’re underidentified :(
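The “token” counting from the last few slides can be written out as a short Python sketch. It assumes a covariance-structure model (means are not counted), and the function names are hypothetical. Keep in mind that having enough tokens is necessary for identification but does not guarantee it.

```python
def unique_moments(n_measures):
    """Number of unique variances and covariances among the observed measures."""
    return n_measures * (n_measures + 1) // 2


def degrees_of_freedom(n_measures, n_free_parameters):
    """Tokens left over after 'buying' every freely estimated parameter."""
    return unique_moments(n_measures) - n_free_parameters


def counting_rule(n_measures, n_free_parameters):
    """Label the model by the counting rule alone."""
    df = degrees_of_freedom(n_measures, n_free_parameters)
    if df < 0:
        return "underidentified"
    if df == 0:
        return "just-identified"
    return "overidentified"


# Six measures give 21 tokens, as in the movie-ticket example.
tokens_six = unique_moments(6)

# Two indicators: 1 loading + 2 error variances + 1 factor variance = 4 parameters.
two_indicator_status = counting_rule(2, 1 + 2 + 1)

# Three indicators: 2 loadings + 3 error variances + 1 factor variance = 6 parameters.
three_indicator_status = counting_rule(3, 2 + 3 + 1)

print(tokens_six, two_indicator_status, three_indicator_status)
```

The three-indicator case shows why a single factor is usually given at least three indicators: the budget then exactly covers the parameters.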

On to analyses!

Analyses Discussed • Confirmatory Factor Analysis (CFA) • Path Analysis • Mediation • Cross-Lagged Panel • Moderation • Structural Models • Full SEM: Latent Variable Interactions • Latent Growth Models • Sequelae of change models • Latent Class Analysis • Growth Mixture Modeling • Multilevel Modeling

Confirmatory Factor Analysis

Confirmatory Factor Analysis • Evaluating the fit of a theoretical latent variable with observed indicators • Observed indicators: • Individual responses on a questionnaire • Scores across multiple questionnaires • Translation: “The shared-ness amongst these variables is due to an underlying, unobserved variable” • E.g. the questions of the Beck Depression Inventory are indicators of depression

Confirmatory Factor Analysis: An example • I used the Center for Epidemiologic Studies Depression Scale (CES-D) to measure depression in a non-treatment-seeking college student sample • I have 20 items that “tap into” depression • In this case, for model simplicity and only for teaching purposes, I have “parceled” the items • Item parceling is adding item scores together to decrease the number of indicators in the model (controversial, not recommended)
