1. The MHSIP: A Tale of Three Centers P. Antonio Olmos-Gallo, Ph.D.
Kathryn DeRoche, M.A.
Mental Health Center of Denver
Richard Swanson, Ph.D., J.D.
Aurora Research Institute
John Mahalik, Ph.D., M.P.A.
Jefferson Center for Mental Health
Presented at the Organization for Program Evaluation in Colorado Annual Meeting, May 15, 2008
2. Presentation Overview Accountability in mental health
Description and intended use of the MHSIP
Review of constructs of measurement
Purpose and Methods
Results of the psychometric investigation
Reliabilities
Measurement invariance
Differential item functioning
Discussion of results
Future directions for accountability in mental health
3. Accountability in Mental Health
4. Accountability in Mental Health
5. How does accountability work in MH? Accountability has shifted from formative- to more summative-oriented evaluation
Grant funding (Federal, Private) requires that outcomes be demonstrated (NOMS, GPRA)
State-based requirements (CCAR, MHSIP, YSSF)
Stakeholders are more attuned to accountability
6. Description and Intended Uses of the MHSIP What is the MHSIP?
What is it used for?
9. Domains of the MHSIP
12. Measurement Constructs
14. Reliability of the MHSIP
15. What are we comparing?
16. Rasch Modeling Perspective
17. Purpose and Methods: Participants, Procedures, and Data Analysis
18. Purpose of the Investigation
19. Participants
20. Procedures
21. Psychometric Examination of the MHSIP: Reliability, Measurement Invariance, and Differential Item Functioning
22. Comparing Subscales
23. Reliability Estimates in 2007 among Subscales and Centers
24. Reliability Summary
25. Invariance Testing Across Centers
26. Confirmatory Factor Analysis A model with all five domains could not be fit
Some of the parameters could not be estimated (Variance-Covariance matrix may not be identified)
Exploratory analyses using only Outcomes and Participation showed that Outcomes was the major culprit
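As a rough illustration of the kind of model involved, below is a minimal CFA sketch in Python using the semopy package; the item names (q1-q9) and the three-domain structure are placeholders standing in for the actual MHSIP items, and the data are simulated only so the snippet runs on its own.

```python
# Minimal CFA sketch with semopy (hypothetical items q1-q9; simulated data).
import numpy as np
import pandas as pd
from semopy import Model, calc_stats

# Simulated 1-5 Likert responses standing in for one center's MHSIP data.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.integers(1, 6, size=(500, 9)),
                    columns=[f"q{i}" for i in range(1, 10)])

# lavaan-style description: three correlated domains, three items each.
desc = """
Satisfaction =~ q1 + q2 + q3
Access =~ q4 + q5 + q6
Quality =~ q7 + q8 + q9
"""

model = Model(desc)
model.fit(data)
print(calc_stats(model).T)  # chi-square, df, RMSEA, CFI, GFI, ...
```

With real data, estimation failures of the kind described above (non-positive-definite variance-covariance matrices, inestimable parameters) typically surface as warnings or errors at the fit step.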
28. Invariance with 3 domains We tested invariance on three domains only: Satisfaction, Access and Quality
We ran separate models for each center to get an up-front idea of their similarities and differences
Based on these fits, trouble can be expected
Center 2 had the worst fit, Center 3 a not-so-bad fit, and Center 1 fell between the other two
32. Measurement Invariance Whether or not we can assert that we measured the same attribute under different conditions
If there is evidence that measurement varies across groups, any findings reporting differences between individuals and groups cannot be interpreted
Differences in average scores can just as easily be interpreted as indicating that different things were measured
Correlations with other variables will then reflect different attributes for different groups
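One common formal statement of measurement invariance (an assumption about how the idea is usually written, not a slide from the original deck) is that the distribution of an observed item response depends only on the latent attribute, not on group membership:

$$P(X = x \mid \eta, g) = P(X = x \mid \eta) \quad \text{for all groups } g$$

where X is the observed item response, η is the latent attribute (e.g., perceived quality of care), and g is the group (here, the mental health center).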
33. Factorial Invariance One way to test measurement invariance is FACTORIAL INVARIANCE
The main question it addresses: Do the items making up a particular measuring instrument work the same way across different populations (e.g., males and females)?
The measurement model is group-invariant
Tests for Factorial Invariance (in order of difficulty):
34. Steps in Factor Invariance Testing Equivalent Factor structure
Same number of factors, items associated with the same factors (Structural model invariance)
Equivalent Factor loading paths
Factor loadings are identical for every item and every factor
35. Steps in Factor Invariance Testing (cont.) Equivalent Factor variance/covariance
Variances and Covariances (correlations) among factors are the same across populations
Equivalent Item reliabilities
Residuals for every item are the same across populations
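Because each step simply adds equality constraints to the previous model, the steps form a nested sequence, and the usual way to compare adjacent steps is a chi-square difference test:

$$\Delta\chi^2 = \chi^2_{\text{constrained}} - \chi^2_{\text{unconstrained}}, \qquad \Delta df = df_{\text{constrained}} - df_{\text{unconstrained}}$$

A significant Δχ² (evaluated on Δdf degrees of freedom) indicates that imposing the constraints, for example forcing loadings to be equal across centers, significantly worsens fit, so that level of invariance does not hold.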
36. Results: Factorial Invariance
37. Conclusions: Factorial Invariance The model does not provide a good fit for the different centers
Most of the discrepancy is centered on loadings and how the domains interact with each other (variance-covariance)
Since the testing is incremental (later tests are more stringent than earlier ones), we did not run the equivalent item reliabilities test (the most stringent)
38. Differential Item Functioning (DIF)
39. Differential Item Functioning
45. Summary of DIF Analysis
46. Discussion
47. What did we learn about the MHSIP? Some items and subscales (domains) do not seem to measure equally across centers
Therefore, comparing centers using these items/domains may not reflect true differences in performance
It is more likely that observed differences reflect differences in measurement (including error, difficulty, and reliability)
48. Some domains are reliable, some are not
Satisfaction was OK from all three perspectives
Quality had some good characteristics, but some items performed poorly
Participation is not very reliable (only two items, though both items were good)
Outcomes is, overall, a poorly performing domain (weak items, extensive cross-loading, correlated errors)
Employment/education may not be a desired outcome for all consumers
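The reliability findings above are of the Cronbach's alpha type; below is a minimal sketch of how alpha could be computed per domain and per center (the two Participation items here are hypothetical placeholders, and the data are simulated).

```python
# Cronbach's alpha sketch (illustrative data; real item columns would replace p1, p2).
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
participation = pd.DataFrame(rng.integers(1, 6, size=(300, 2)), columns=["p1", "p2"])
print(round(cronbach_alpha(participation), 2))
# With only two items, alpha is driven entirely by the single inter-item correlation,
# which is why a short domain can look unreliable even when its items are good.
```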
49. Discussion Although the samples may not be ideal (biases, sampling frameworks that could be improved), the data at hand suggest that there are some intrinsic problems with the MHSIP
But the analyses also suggest some very specific ways to improve it
50. Suggestions Revise the Outcomes Scale (differentiate between recovery/resiliency)
Add items to the Participation scale
Some items in Access need to be reviewed (Q4 and Q6)
How do we deal with all these cross-loading factors?
Is it one domain (satisfaction) that we artificially broke into many domains (outcomes, access, …)?
How does the factor structure for the entire sample (the EFA included in the annual report) hold up for individual centers? (See the sketch below.)
More research is needed in this area
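One way to pursue that last question (sketched here under assumed item names and simulated data, using the factor_analyzer package) is to re-run the EFA on a single center's responses and compare the loadings against the pooled-sample solution from the annual report.

```python
# EFA sketch for a single center (hypothetical items q1-q12; simulated data).
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(2)
center_data = pd.DataFrame(rng.integers(1, 6, size=(400, 12)),
                           columns=[f"q{i}" for i in range(1, 13)])

# Oblique rotation, since the MHSIP domains are expected to correlate.
efa = FactorAnalyzer(n_factors=5, rotation="oblimin")
efa.fit(center_data)

loadings = pd.DataFrame(efa.loadings_, index=center_data.columns)
print(loadings.round(2))  # compare against the pooled-sample loadings
```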
51. More suggestions Sampling Suggestions:
Attempt to stratify the sample by consumers' needs level
At MHCD, we have developed a measure of consumers' recovery needs level (RNL)
Equating Suggestions:
Use equating procedures to place scores from different centers on a common scale (a simplified sketch follows this list)
Using Item Response Theory (IRT) techniques:
IRT could help us learn more about how the MHSIP measures satisfaction/performance within and among mental health centers
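As a very rough illustration of the equating idea (not the specific procedure the centers would adopt), a linear mean-sigma rescaling maps one center's scores onto the scale of a reference center; all numbers below are made up.

```python
# Simplified linear (mean-sigma) equating sketch; reference mean/SD are illustrative.
import numpy as np

def mean_sigma_equate(scores: np.ndarray, ref_mean: float, ref_sd: float) -> np.ndarray:
    """Rescale scores so their mean and SD match a reference center's scale."""
    z = (scores - scores.mean()) / scores.std(ddof=1)
    return ref_mean + ref_sd * z

center2_satisfaction = np.array([3.8, 4.2, 4.5, 3.1, 4.9])   # illustrative scores
print(mean_sigma_equate(center2_satisfaction, ref_mean=4.1, ref_sd=0.6).round(2))
```

A full IRT-based equating would instead link item parameters estimated separately for each center, which is closer to what the Rasch perspective mentioned earlier would support.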
52. More suggestions Mixed Method Design:
Conducting focus groups at each center would provide cross-validation of the quantitative measurement
This would also enhance the utilization of the results for quality improvement
Include in the annual reports the psychometrics (reliability) for every center
Helps to know how much confidence we should have in the scores
53. Questions???
χ² (Chi-Square): in this context, it tests the closeness of fit between the unrestricted sample covariance matrix and the restricted (model) covariance matrix. Very sensitive to sample size: the statistic will be significant when the model fits approximately in the population and the sample size is large.
RMSEA (Root Mean Square Error of Approximation): Analyzes the discrepancies between the observed and implied covariance matrices. Its lower bound of zero indicates perfect fit, with values increasing as fit deteriorates. It has been suggested that values below 0.1 indicate a good fit to the data and values below 0.05 a very good fit; it is recommended not to use models with RMSEA values larger than 0.1.
GFI (Goodness of Fit Index): Analogous to R² in that it indicates the proportion of variance explained by the model. Ranges between 0 and 1, with values exceeding 0.9 indicating a good fit to the data.
CFI (Comparative Fit Index): Indicates the proportional improvement in overall fit relative to a null (independence) model. It is relatively independent of sample size and penalizes model complexity. It uses a 0-1 scale, with 1 indicating perfect fit; values of about 0.9 or higher reflect a good fit.
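For reference, the two indices above that are defined from the chi-square are commonly written as:

$$\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2 - df,\ 0)}{df\,(N - 1)}}, \qquad \mathrm{CFI} = 1 - \frac{\max(\chi^2_{M} - df_{M},\ 0)}{\max(\chi^2_{M} - df_{M},\ \chi^2_{B} - df_{B},\ 0)}$$

where N is the sample size, M denotes the fitted model, and B the baseline (null) model.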