
Multilevel Data in Outcomes Research


osborn

Presentation Transcript


  1. Multilevel Data in Outcomes Research • Types of multilevel data common in outcomes research • Random versus fixed effects • Statistical Model Choices • “Shrinkage Estimates” versus Fixed Effects • Example of CA State CABG data

  2. What are multilevel data? • Gathering individual observations into larger groups does not create clustered data • Individual observations from a simple, random sample are never multilevel • Multilevels are a result of sampling/design • Usually from stages/levels in obtaining the individual units of observation • Repeated measures is a type of multilevel data

  3. Other Names for Multilevel Data • Hierarchical models • Clustered data (but different from cluster analysis) • Components of Variance models • Contextual Models • Micro and macro level data

  4. Multilevel Data in Outcomes Research • Two levels: • Hospitals and patients • Physicians and patients • Three levels: • Hospitals, physicians, and patients • Physicians, patients, and repeated measures • Four levels: • National Health Interview Survey

  5. National Health Interview Survey • Highest level: Select Primary Sampling Units (MSA’s, counties, groups of counties) • Next level: Stratify PSU’s by Census blocks and select Secondary Sampling Units (clusters of households) • Next level: Select Households within SSU’s • Lowest level: Interview individuals in the households (some all, others a sample)

  6. Characteristics of Multilevel Data • Measurements within level are correlated (eg, measures on same person are more alike than measurements across persons) • Variables can be measured at each level • Standard statistical models and tests are incorrect • The variance of the outcome can be attributed to each level

  7. Two Parts of Multilevel Data Variance • Outcome = Patient Satisfaction Score • Level 2: Physicians MD1: mean=81 MD2: mean=58 MD3: mean=74 • Level 1: Patients (scores shown in the figure: 55 61 68 74 75 79 81 85 77) • Variance in the patient score divides into two parts: (1) the variance between physicians = $\sigma_B^2$ (2) the variance within the physicians = $\sigma_W^2$ • So the total variance = $\sigma_B^2 + \sigma_W^2$
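The split of the total variance into a between-physician part and a within-physician part can be checked numerically. The sketch below is only an illustration: the scores are hypothetical, chosen so that the three physician means equal the 81, 58, and 74 on the slide, and the variances are simple descriptive (population) variances rather than model-based estimates.

```python
from statistics import fmean, pvariance

# Hypothetical satisfaction scores (not the slide's data); the physician
# means are 81, 58, and 74, matching the means quoted on the slide.
scores_by_md = {
    "MD1": [78, 81, 84],
    "MD2": [55, 58, 61],
    "MD3": [71, 74, 77],
}

all_scores = [s for scores in scores_by_md.values() for s in scores]
grand_mean = fmean(all_scores)
n_total = len(all_scores)

# Between-physician variance: size-weighted spread of the physician means
# around the grand mean.
var_between = sum(
    len(scores) * (fmean(scores) - grand_mean) ** 2
    for scores in scores_by_md.values()
) / n_total

# Within-physician variance: size-weighted average of each physician's
# own (population) variance.
var_within = sum(
    len(scores) * pvariance(scores) for scores in scores_by_md.values()
) / n_total

# Total variance of all patient scores, ignoring physicians.
var_total = pvariance(all_scores)

print(f"between={var_between:.3f} within={var_within:.3f} total={var_total:.3f}")
```

With descriptive variances weighted by cluster size, the two parts add up to the total exactly. Estimates from a fitted multilevel model will generally differ somewhat from these plug-in values.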

  8. Intraclass Correlation Coefficient (ICC) • The intraclass correlation coefficient (ICC) is a measure of the correlation among the individual observations within the clusters • It is calculated as the ratio of the between-cluster variance to the total variance: $\mathrm{ICC} = \sigma_B^2 / (\sigma_B^2 + \sigma_W^2)$

  9. Intraclass Correlation Coefficient (ICC) • MD1: mean=81 MD2: mean=58 MD3: mean=74 • Patient scores shown in the figure: 58 58 74 74 74 74 81 81 81 • Take the extreme case where each MD’s patients have the same score = no variance within the physicians. • So, $\mathrm{ICC} = \sigma_B^2 / (\sigma_B^2 + \sigma_W^2) = \sigma_B^2 / (\sigma_B^2 + 0) = 1$ = perfect correlation within the clusters.

  10. Intraclass Correlation Coefficient (ICC) MD1: mean=71 MD2: mean=68 MD3: mean=74 58 78 54 94 84 64 81 61 71 A different case where each MD’s patients have very different scores = most of the variance is within the physicians (ie, between patients, not physicians). ICC is close to 0.
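The two extreme cases can be reproduced with a small calculation. The sketch below assumes a simple descriptive ICC (between-cluster variance divided by between plus within variance, using population variances); the function name `icc` and the second data set are hypothetical, and model-based estimators (ANOVA, REML) would give somewhat different values on real data.

```python
from statistics import fmean, pvariance


def icc(groups):
    """Descriptive ICC: between-cluster variance / (between + within)."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    n_total = len(all_values)
    # Size-weighted variance of the cluster means around the grand mean.
    var_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups
    ) / n_total
    # Size-weighted average of the variance inside each cluster.
    var_within = sum(len(g) * pvariance(g) for g in groups) / n_total
    return var_between / (var_between + var_within)


# Every patient of a given physician reports the same score:
# no within-physician variance at all.
identical_within = [[81, 81, 81], [58, 58, 58], [74, 74, 74]]

# Hypothetical scores: physician means are nearly equal (71, 70, 72),
# while patients of the same physician differ widely.
spread_within = [[51, 71, 91], [50, 70, 90], [52, 72, 92]]

print(icc(identical_within))
print(icc(spread_within))
```

The first call returns an ICC of 1 because the within-physician variance is zero. The second returns a value close to 0 because almost all of the variance lies between patients of the same physician.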

  11. Implications of ICC for Analysis • When the ICC is close to 0, most of the variation is explained by patient level measures • Less difference between results from ordinary regression and multilevel models • May be less important to use a statistical model that allows variables for physician characteristics

  12. Implications of ICC for Analysis • When the ICC is close to 1, most of the variation is explained by physician level measures • Using a statistical model that removes physician effects leaves little variation to explain • Important to use a statistical model that allows variables for physician characteristics

  13. Methods of Analyzing Multilevel Data • Regression model ignoring higher level variables • Regression model with an indicator variable for each level 2 unit (minus one) • Conditional regression model • Regression model with generalized estimating equations (GEE model) • Random or mixed effects regression model

  14. Choice of Analysis Model: Three Main Considerations • What is the research question? • How many observations are there at each level of the data? • How important is controlling unmeasured confounding at the higher level?

  15. Fixed versus Random Effects • Effects are random when the units are a sample of a larger population • have variation because sampled; another sample would give different data • Effects are fixed if they represent all possible members of a population: • eg, male/female; treatment groups; all the regions of the U.S.

  16. Fixed versus Random Effects • Effects treated as fixed or random depending on the research question • Random effects: generalize from the sample to a larger population • Random effects: reduce variation due to small sample size by fitting a distribution • Fixed effects: Control for unmeasured confounding at the higher level

  17. Methods of Analyzing Multilevel Data Fixed Effects Models: - Regression model with an indicator variable for each level 2 unit (minus one) - Conditional regression model Random Effects Models: - Regression model with generalized estimating equations (GEE) - Random or mixed effects regression model
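To see why the fixed effects approach controls for unmeasured confounding at the higher level, it can help to compare it with an ordinary regression that ignores the clusters. The sketch below uses constructed, noise-free hypothetical data in which each hospital has an unmeasured effect correlated with its patients' typical x value. It relies on the fact that, for a linear model, subtracting each hospital's mean from x and y gives the same slope as including an indicator variable for each hospital. All names and numbers are illustrative assumptions.

```python
from statistics import fmean


def slope(xs, ys):
    """Least-squares slope of y on x (model with an intercept)."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


TRUE_SLOPE = 2.0

# Hypothetical unmeasured hospital-level effects. Hospitals with larger
# x values have more negative effects, so the effect confounds x.
hospital_effect = {0: 0.0, 1: -30.0, 2: -60.0, 3: -90.0}

rows = []  # each row is (hospital, x, y)
for h, effect in hospital_effect.items():
    for k in range(5):
        x = 10.0 * h + k
        rows.append((h, x, TRUE_SLOPE * x + effect))

# 1) Ordinary regression ignoring the hospital level.
pooled = slope([x for _, x, _ in rows], [y for _, _, y in rows])


# 2) Fixed effects: remove each hospital's mean from x and y, then regress.
def demean(rows, col):
    means = {
        h: fmean(r[col] for r in rows if r[0] == h) for h in hospital_effect
    }
    return [r[col] - means[r[0]] for r in rows]


within = slope(demean(rows, 1), demean(rows, 2))

print(f"pooled slope = {pooled:.3f}, fixed effects slope = {within:.3f}")
```

In this constructed example the pooled slope even has the wrong sign, while the within-hospital slope recovers the true value of 2. The price is that the fixed effects model cannot estimate the effect of any variable that is constant within a hospital.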

  18. What are “shrinkage estimates”? • Also called Bayesian or empiric Bayesian estimates (Iezzoni text) or Best linear unbiased prediction estimates (SAS) • Can only be obtained from a random effects (not GEE) regression model • Variance of the higher level variable is modeled as if from a specified distribution (usually normal, but other possible)

  19. A Simple Random Effects Model • A simple random effects model is: $y_{ij} = \mu + \alpha_j + e_{ij}$, where $\mu$ = overall mean, $\alpha_j$ = difference for MD $j$, and $e_{ij}$ = individual error • Model says there is random variation from the mean score at the level of MD’s plus variation at the level of patients • Bayesian estimates are the individual $\alpha_j$’s obtained from the overall distribution
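A simulation can make the model concrete: each score is an overall mean, plus a physician-level difference, plus an individual error. The sketch below generates data this way and recovers the two variance components with the one-way ANOVA method of moments. This estimator is a simple stand-in for illustration; mixed-model software typically uses maximum likelihood or REML instead. All parameter values are assumptions.

```python
import random
from statistics import fmean

random.seed(0)

MU = 70.0          # overall mean score (assumed)
SD_BETWEEN = 8.0   # SD of physician-level differences (assumed)
SD_WITHIN = 10.0   # SD of patient-level error (assumed)
N_MDS, N_PATIENTS = 200, 30

# Simulate: score = overall mean + physician difference + individual error.
data = []
for _ in range(N_MDS):
    md_effect = random.gauss(0.0, SD_BETWEEN)
    data.append(
        [MU + md_effect + random.gauss(0.0, SD_WITHIN) for _ in range(N_PATIENTS)]
    )

md_means = [fmean(scores) for scores in data]
# Balanced design, so the mean of the physician means is the overall mean.
grand_mean = fmean(md_means)

# One-way ANOVA mean squares.
ms_between = (
    N_PATIENTS * sum((m - grand_mean) ** 2 for m in md_means) / (N_MDS - 1)
)
ms_within = sum(
    (y - m) ** 2 for scores, m in zip(data, md_means) for y in scores
) / (N_MDS * (N_PATIENTS - 1))

# Method-of-moments estimates of the two variance components.
var_within_hat = ms_within
var_between_hat = (ms_between - ms_within) / N_PATIENTS

print(f"overall mean ~ {grand_mean:.1f}")
print(f"between-MD variance ~ {var_between_hat:.1f} (true {SD_BETWEEN ** 2:.0f})")
print(f"within-MD variance ~ {var_within_hat:.1f} (true {SD_WITHIN ** 2:.0f})")
```

With 200 physicians the estimates should land near the true values of 64 and 100. With only a handful of physicians the between-physician estimate becomes very noisy, which is one reason the number of observations at each level matters when choosing a model.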

  20. Example of Shrinkage Estimates • In Patient Outcomes Research Team study of patient satisfaction with MD treatment for diabetes, raw mean patient scores by MD ranged from 53.4 to 87.1 • The random effects shrinkage estimates of the mean patient scores by MD ranged from 60.4 to 78.6 • Random effects shrinkage estimates are closer to the overall mean
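The pull toward the overall mean can be illustrated with the standard shrinkage formula for a random-intercept model with a continuous outcome. The sketch below treats the variance components as known, and every number and name is hypothetical; it is not the calculation used in the study described above. Each physician's raw mean is weighted by between-variance / (between-variance + within-variance / n), so physicians with fewer patients are pulled more strongly toward the overall mean.

```python
def shrink(raw_mean, n, grand_mean, var_between, var_within):
    """Shrinkage estimate of one physician's mean score.

    The weight approaches 1 for large n (little shrinkage) and
    falls toward 0 for small n (strong shrinkage).
    """
    weight = var_between / (var_between + var_within / n)
    return grand_mean + weight * (raw_mean - grand_mean)


# Assumed (hypothetical) overall mean and variance components.
GRAND_MEAN = 70.0
VAR_BETWEEN = 64.0
VAR_WITHIN = 100.0

# Hypothetical physicians: (raw mean score, number of patients).
physicians = {
    "MD_A": (87.0, 5),
    "MD_B": (87.0, 50),
    "MD_C": (53.0, 5),
    "MD_D": (72.0, 20),
}

shrunk = {
    md: shrink(raw, n, GRAND_MEAN, VAR_BETWEEN, VAR_WITHIN)
    for md, (raw, n) in physicians.items()
}

for md, (raw, n) in physicians.items():
    print(f"{md}: n={n:2d} raw={raw:.1f} shrunk={shrunk[md]:.1f}")
```

MD_A and MD_B have the same raw mean, but MD_A has only 5 patients and so ends up closer to the overall mean than MD_B. This is the same mechanism that narrows the range of estimates relative to the raw means.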

  21. Controversy in Outcomes Research • Report Cards rank hospitals or physicians • Data used has at least two levels (hospitals or physicians and their patients) • Controversy is over the choice of statistical model for evaluating variation at the hospital or physician level

  22. Methods of Analyzing Hospital (or MD) Mortality Variance • Ignore hospital, run ordinary regression then predict average for each hospital • Remove hospital effect with indicator variables for hospitals (fixed effects model) then predict average for each hospital • Run random effects regression and obtain the Bayesian/shrinkage estimates for each hospital

  23. Shrinkage estimates and CA State CABG Data • The unadjusted estimate for each hospital is modeled as coming from a normal distribution • More weight is given to hospitals with more CABG patients • Hospitals with smaller numbers of patients move closer to the overall mean when the normal distribution is fitted • Estimates are somewhat software dependent

  24. Shrinkage Estimates: Software • Obtaining shrinkage estimates involves some software choices • Not all software provides them • STATA by itself doesn’t provide them • Different likelihood methods of fitting models • STATA add-on GLLAMM (free download) • SAS • For linear outcome, PROC MIXED • For non-linear, PROC NLMIXED and GLIMMIX • Some other software for multilevel data
