
Using Propensity Score Matching in Observational Services Research

Neal Wallace, Ph.D., Portland State University, February 2014. Overview: what Propensity Score Matching (PSM) is, when PSM is used, how PSM works, a simple recipe for generalized PSM, and some MH services research examples.


Presentation Transcript


  1. Using Propensity Score Matching in Observational Services Research Neal Wallace, Ph.D. Portland State University February 2014

  2. Overview • What is Propensity Score Matching (PSM) • When is PSM used • How PSM works • A Simple Recipe for Generalized PSM • Some MH Services Research Examples • Summary Overview of PSM

  3. What is Propensity Score Matching • Propensity Score Matching (PSM) is a technique used to create commonality among groups of observations/subjects across observed characteristics • PSM is used in observational research where random assignment is not feasible • Reduces or eliminates sample selection bias in making comparisons across groups (e.g. treatment/control) on observable characteristics

  4. When is PSM used? • Suppose you are interested in whether two clinics provide equivalent amounts of treatment to individuals • But age, gender and mental health status also determine utilization, independent of clinic practice • A review of utilization, age, gender and mental health status by clinic might look like...

  5. When is PSM used? • You can’t tell whether the difference in utilization is due to clinic practice, age/gender/MH status, or both

  6. When is PSM used? • You could use multivariate regression to “balance” the effects of potential confounding measures (age, gender, & MH status here) • The regression equation would look like: Visits = B0 + B1*Clinic + B2*Age + B3*Gender + B4*MH Status + e • And (hopefully) B1 would represent any difference in average individual utilization across clinics after accounting for differences in age, gender & MH status…

  7. When is PSM used? • But what if a closer look at your samples revealed this: • You can’t balance what doesn’t exist (no young females in clinic A) • You won’t get balance on MH status using linear regression if its relationship with utilization is non-linear

  8. When is PSM used? • You could try to “hand pick” matching subjects across clinics (e.g. basic case-control matching) • But this won’t be easy/feasible if your study sample has: • A large number of observations • A large number of matching variables (“unique” case types increase exponentially) • Continuous matching variables (e.g. fine grained status scale) • This is where PSM steps in with a statistical means of doing complex case matching on observed sample characteristics

  9. How it works • A propensity score is the estimated probability that an observation belongs to one sample group rather than the other, given a set of observed characteristics • Propensity Score = Pr(D=1|X) • D indicates sample group (one of our clinics in the example) • X are observed characteristics (age, gender, MH status) • Typically estimated with logistic regression: logit[Pr(Clinic=1)] = B0 + B1*Age + B2*Gender + B3*MH Status

  10. How it works • A propensity score (PS) is the estimated probability that an observation belongs to one sample group rather than the other, given a set of observed characteristics • Propensity Score = P(X) = Pr(D=1|X) • D indicates sample group (one of our clinics in the example) • X are observed characteristics (age, gender, MH status) • Typically estimated with logistic regression: logit[Pr(Clinic=1)] = B0 + B1*Age + B2*Gender + B3*MH Status
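The estimation step above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the presenter's code: the data are simulated, the covariates are standardized, and the helper names (`fit_logit`, `propensity`) are hypothetical. It fits the logistic model by plain gradient ascent so that it runs with the standard library alone; in practice a statistics package would normally do the fitting.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logit(X, d, lr=0.5, epochs=1000):
    """Fit a logistic regression by batch gradient ascent.

    X: list of covariate rows, d: list of 0/1 group indicators.
    Returns coefficients [b0, b1, ..., bk] (b0 is the intercept).
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = di - p                      # score contribution of this row
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, xi):
    """Predicted Pr(D=1 | X=xi) from fitted coefficients."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))

# Simulated (hypothetical) sample: D=1 means "Clinic B".
random.seed(0)
X, d = [], []
for _ in range(300):
    age = random.uniform(-1, 1)       # standardized age
    female = random.randint(0, 1)     # gender indicator
    mh = random.uniform(-1, 1)        # standardized MH status
    p_true = sigmoid(-0.5 + 1.5 * age + 0.8 * female)
    X.append([age, female, mh])
    d.append(1 if random.random() < p_true else 0)

beta = fit_logit(X, d)
scores = [propensity(beta, xi) for xi in X]   # one propensity score per person
```

Each entry of `scores` is the estimated probability of being in Clinic B given that person's age, gender and MH status; these are the values that later steps match or stratify on.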

  11. How it works • Propensity score matching: match groups of observations on the estimated probability of being in a group (the propensity score) • Key assumption: being in a group is independent of outcomes, conditional on observed characteristics • This fails if there are unobserved characteristics associated with both group membership and outcomes • PSM enables matching at the mean and in distribution of observed characteristics (a single score represents multiple dimensions)
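The key assumption can be written compactly in potential-outcomes notation. The symbols Y(0) and Y(1), the outcomes a subject would have in each group, do not appear in the slides and are introduced here as an assumption of the usual convention; D, X and P(X) are as defined above.

```latex
\begin{align*}
\text{Conditional independence:}\quad
  & \bigl(Y(0),\, Y(1)\bigr) \perp D \mid X \\
\text{Common support:}\quad
  & 0 < \Pr(D = 1 \mid X) < 1 \\
\text{Consequence (Rosenbaum and Rubin):}\quad
  & \bigl(Y(0),\, Y(1)\bigr) \perp D \mid P(X),
  \qquad P(X) = \Pr(D = 1 \mid X)
\end{align*}
```

The last line is why matching on the single score can stand in for matching on every characteristic in X: if the first two conditions hold given X, they also hold given P(X).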

  12. How it Works • There are different ways to apply PSM • Exact matching on propensity score • If all characteristics are measured as binary indicators, each combination of characteristics maps to a single PS value, so exact matches can be found • “Fuzzy” matching on propensity score • Find the closest individual match, e.g. nearest neighbor or caliper matching • General matching on propensity score • Match the propensity score distribution, e.g. stratify observations by PS decile (0-.1, .1-.2, .2-.3, etc.)
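As an illustration of the "fuzzy" option, here is a minimal sketch of greedy 1:1 nearest-neighbor matching with a caliper. The function name, the caliper width of 0.05, matching without replacement, and the example scores are all assumptions made for illustration; the slides do not specify any of them.

```python
def caliper_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: lists of (unit_id, propensity_score).
    Returns (treated_id, control_id) pairs whose scores differ by
    no more than `caliper`; units with no close match stay unmatched.
    """
    available = dict(control)   # control_id -> score; shrinks as units are used
    pairs = []
    for t_id, t_ps in sorted(treated, key=lambda u: u[1]):
        if not available:
            break
        # Closest remaining control unit on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical scores for a few people in each clinic.
clinic_a = [("a1", 0.20), ("a2", 0.45), ("a3", 0.90)]
clinic_b = [("b1", 0.22), ("b2", 0.50), ("b3", 0.48), ("b4", 0.60)]

pairs = caliper_match(clinic_a, clinic_b, caliper=0.05)
print(pairs)   # [('a1', 'b1'), ('a2', 'b3')]
```

Person `a3` is left unmatched because no one in the other clinic has a score within the caliper. Greedy matching depends on processing order, so this is one simple variant rather than an optimal matching.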

  13. How it Works • A key element of any type of PSM is support: • Support means that every observation has some reasonable match in the other group • Observations without support will bias results if they are not eliminated

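  14. [Figure: overlapping densities of propensity scores for Clinic A and Clinic B. Horizontal axis: propensity score, 0 to 1; vertical axis: density. The overlap of the two curves is the region of common support; scores near 1 indicate a high probability of being in Clinic B given X.]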
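One simple way to find the region of common support is a min/max rule: keep only scores that fall inside the range covered by both groups. The sketch below assumes that rule and uses made-up scores; other definitions of support (for example, based on estimated densities) are also used in practice.

```python
def common_support(ps_a, ps_b):
    """Overlap of the two groups' score ranges as (low, high), or None."""
    low = max(min(ps_a), min(ps_b))
    high = min(max(ps_a), max(ps_b))
    return (low, high) if low <= high else None

def trim(scores, bounds):
    """Keep only the scores that lie inside the common-support bounds."""
    low, high = bounds
    return [s for s in scores if low <= s <= high]

# Hypothetical propensity scores (probability of being in Clinic B).
ps_a = [0.05, 0.15, 0.30, 0.45, 0.60]
ps_b = [0.25, 0.40, 0.55, 0.80, 0.95]

bounds = common_support(ps_a, ps_b)
kept_a = trim(ps_a, bounds)
kept_b = trim(ps_b, bounds)
print(bounds, kept_a, kept_b)   # (0.25, 0.6) [0.3, 0.45, 0.6] [0.25, 0.4, 0.55]
```

The dropped observations (very low scores in Clinic A, very high scores in Clinic B) are the ones with no reasonable counterpart in the other clinic.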

  15. How it Works • For general/distributional PSM you also need to test for balance: • Balance means that there are no statistical (or meaningful) differences in characteristics between groups among observations within the same PS stratum (e.g. decile) • If you cannot find balance, you need to break the PS into finer strata and test again • If you are still having problems, you may need to add cross-products of measures to your PS logistic regression and try again
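A balance check of this kind can be sketched as a two-sample test run separately within each stratum. The version below computes Welch's t statistic with the standard library and flags a covariate as balanced when |t| is below 2, which is only a rough stand-in for a formal p-value. The row layout, function names, threshold and data are assumptions for illustration.

```python
import math
import statistics

def welch_t(x, y):
    """Welch's t statistic for a difference in means (unequal variances)."""
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def balance_by_stratum(rows, covariate, threshold=2.0):
    """rows: dicts with keys 'stratum', 'group' (0/1) and the covariate.

    Returns {stratum: (t_statistic, balanced)} for every stratum that has
    at least two observations in each group.
    """
    out = {}
    for s in sorted({r["stratum"] for r in rows}):
        g0 = [r[covariate] for r in rows if r["stratum"] == s and r["group"] == 0]
        g1 = [r[covariate] for r in rows if r["stratum"] == s and r["group"] == 1]
        if len(g0) < 2 or len(g1) < 2:
            continue   # too few observations to test; treat as a support problem
        t = welch_t(g1, g0)
        out[s] = (t, abs(t) < threshold)
    return out

# Hypothetical ages by PS stratum and clinic (group 0 = A, group 1 = B).
rows = []
for age in (30, 32, 34, 36):
    rows.append({"stratum": 1, "group": 0, "age": age})
for age in (31, 33, 35, 37):
    rows.append({"stratum": 1, "group": 1, "age": age})
for age in (20, 21, 22, 23):
    rows.append({"stratum": 2, "group": 0, "age": age})
for age in (40, 41, 42, 43):
    rows.append({"stratum": 2, "group": 1, "age": age})

result = balance_by_stratum(rows, "age")
```

In this made-up sample, stratum 1 is balanced on age while stratum 2 is not, which would be the signal to split the strata more finely or revisit the propensity model.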

  16. A Recipe for General PSM • Run logistic regression (remember clinic equation) • Generate predicted probabilities (PS) of being in one group or another • Recode (group) raw PS into strata (at least quintiles) • Run a frequency of PS strata by observation group (e.g. frequency of PS quintile groups by clinic) • Check for any groups without support
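The recode and frequency steps of the recipe can be sketched as follows. The code assumes equal-width strata over the 0 to 1 range (fifths, by default), in line with the 0-.1, .1-.2 example earlier; quantile-based strata are an equally plausible reading. Function names and data are hypothetical.

```python
from collections import Counter

def stratum_of(ps, n_strata=5):
    """Map a score in [0, 1] to an equal-width stratum numbered 1..n_strata."""
    return min(int(ps * n_strata), n_strata - 1) + 1

def support_table(scores, groups, n_strata=5):
    """Frequency of observations by PS stratum and group."""
    counts = Counter((stratum_of(ps, n_strata), g) for ps, g in zip(scores, groups))
    labels = sorted(set(groups))
    return {s: {g: counts[(s, g)] for g in labels} for s in range(1, n_strata + 1)}

def unsupported(table):
    """Strata where at least one group has no observations."""
    return [s for s, row in table.items() if min(row.values()) == 0]

# Hypothetical propensity scores and clinic labels.
scores = [0.05, 0.12, 0.25, 0.31, 0.38, 0.47, 0.52, 0.66, 0.71, 0.93]
groups = ["A", "A", "A", "B", "A", "B", "A", "B", "B", "B"]

table = support_table(scores, groups)
for s, row in table.items():
    print(s, row)
print("no support in strata:", unsupported(table))   # [1, 4, 5]
```

Here only strata 2 and 3 contain people from both clinics, so comparisons would be restricted to those strata.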

  17. A Recipe for General PSM • Do a t-test on characteristic measures between observation groups (e.g. clinic) within each PS stratum (with support) to test for balance • If you don’t have balance, use finer strata or add cross-product terms to the PS model (as noted above) and repeat • If you have balance and support you can choose to: • Use PS or original “confounder” variables in regression to obtain comparison of outcomes (but remember the linearity assumption issue) • Continue on to create a sample that is matched in distribution across PS groups

  18. A Recipe for General PSM • To “fully” match samples, randomly select observations from the larger observation group within each PS stratum to match the size of the smaller group • Re-do your “sample characteristics” table to show sample equivalence at the mean • You can now directly compare means of outcomes using any statistical test you want: • without reference to or inclusion of “confounder” variables • without concern for non-linear relations between confounders and outcomes
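The "full" matching step can be sketched as random downsampling within each stratum. This version assumes that whichever group is larger in a given stratum is cut down to the size of the smaller one, and that strata missing a group are dropped; the function name, row layout and seed are hypothetical.

```python
import random
from collections import defaultdict

def match_in_distribution(rows, seed=0):
    """Downsample so both groups have equal counts within each PS stratum.

    rows: list of dicts with keys 'stratum' and 'group'.
    Strata that lack one of the groups (no support) are dropped.
    """
    rng = random.Random(seed)
    cells = defaultdict(lambda: defaultdict(list))
    for r in rows:
        cells[r["stratum"]][r["group"]].append(r)
    all_groups = {r["group"] for r in rows}
    matched = []
    for s in sorted(cells):
        by_group = cells[s]
        if set(by_group) != all_groups:
            continue                       # no support in this stratum
        n = min(len(members) for members in by_group.values())
        for g in sorted(by_group):
            # Random draw without replacement from each group.
            matched.extend(rng.sample(by_group[g], n))
    return matched

# Hypothetical sample: (stratum, clinic, count).
rows = []
for stratum, group, count in [(1, "A", 5), (1, "B", 2), (2, "A", 1), (2, "B", 4), (3, "A", 3)]:
    for i in range(count):
        rows.append({"stratum": stratum, "group": group, "id": f"{group}{stratum}-{i}"})

matched = match_in_distribution(rows)
print(len(matched))   # 6: two per clinic in stratum 1, one per clinic in stratum 2
```

After this step the two clinics have the same distribution across PS strata, so outcome means can be compared directly on the `matched` rows.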

  19. A Recipe for General PSM • Results from doing PSM in our clinic example might look like this:

  20. MH Services Examples • The Use of Propensity Scores to Evaluate Outcomes for Community Clinics (Hodges & Grunwald, 2005): • Compare outcomes for youth in “exemplary” clinic to “typical care” • Use general PSM to find supported and balanced comparison group from other clinics (note: not “fully” matched but use PS as covariate) • Propensity Score Matching: An Illustrative Analysis of Dose Response (Foster, 2003): • Compare outcomes for youths grouped by amount (“dose”) of outpatient services received • Use general PSM to find supported and balanced groups by “dose” (note: again not “fully” matched but use PS as covariate)

  21. Summary Overview of PSM • Next best approach after randomization to account for sample selection bias: • But only good for observable characteristics • Increasingly expected as part of high quality observational research • At a minimum, checking for support can save you from making bad comparisons • But if you have to drop observations, you need to be clear about whom your sample represents (external validity) • “Fully” matching (by individual or strata) has additional side benefits: • Don’t need to include additional covariates in analysis • Don’t need to worry about non-linearity between covariates and outcomes • Any characteristic-based sub-sample will also be “matched”

  22. Questions? Thank You nwallace@pdx.edu Mark O. Hatfield School of Government Portland State University
