Using Propensity Score Matching in Observational Services Research

Using Propensity Score Matching in Observational Services Research Neal Wallace, Ph.D. Portland State University February 2014

Overview • What is Propensity Score Matching (PSM) • When is PSM used • How PSM works • A Simple Recipe for Generalized PSM • Some MH Services Research Examples • Summary Overview of PSM

What is Propensity Score Matching • Propensity Score Matching (PSM) is a technique used to create commonality among groups of observations/subjects across observed characteristics • PSM is used in observational research where random assignment is not feasible • Reduces or eliminates sample selection bias in making comparisons across groups (e.g. treatment/control) on observable characteristics

When is PSM used? • Suppose you are interested in whether two clinics provide equivalent amounts of treatment to individuals • But age, gender and mental health status also determine utilization independent of clinic practice • A review of utilization, age, gender and mental health status by clinic might look like…

When is PSM used? • You can’t tell whether the difference in utilization is due to clinic practice, age/gender/MH status, or both

When is PSM used? • You could use multivariate regression to “balance” the effects of potential confounding measures (age, gender, & MH status here) • The regression equation would look like: Visits = B0 + B1*Clinic + B2*Age + B3*Gender + B4*MH Status + e • And (hopefully) B1 would represent any difference in average individual utilization across clinics after accounting for differences in age, gender & MH status…

When is PSM used? • But what if a closer look at your samples revealed this: • You can’t balance what doesn’t exist (no young females in Clinic A) • You won’t get balance on MH status using linear regression if its relationship with utilization is non-linear

When is PSM used? • You could try to “hand pick” matching subjects across clinics (e.g. basic case-control matching) • But this won’t be easy/feasible if your study sample has: • A large number of observations • A large number of matching variables (“unique” case types increase exponentially) • Continuous matching variables (e.g. fine grained status scale) • This is where PSM steps in with a statistical means of doing complex case matching on observed sample characteristics

How it works • A propensity score (PS) is an estimated probability of an observation being in one sample group or another given a set of observed characteristics • Propensity Score = P(X) = Pr(D=1|X) • D indicates sample group (one of our clinics in the example) • X are observed characteristics (age, gender, MH status) • Typically estimated with logistic regression: logit[Pr(Clinic=1|X)] = B0 + B1*Age + B2*Gender + B3*MH Status
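As a rough illustration of the estimation step, the sketch below fits the clinic logit by plain gradient ascent and turns the fitted coefficients into propensity scores. Everything here is an assumption made for the example: the synthetic data, the rescaling of age and MH status to the range -1 to 1, the helper names `fit_logit` and `propensity_scores`, and the learning rate. In practice a statistics package would normally do the fitting.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logit(X, d, lr=0.5, epochs=500):
    """Fit logit coefficients by gradient ascent on the mean log-likelihood.
    X: feature rows (no intercept column); d: 0/1 group labels.
    Returns [B0, B1, ..., Bk]."""
    n, k = len(X), len(X[0])
    b = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, di in zip(X, d):
            p = sigmoid(b[0] + sum(bj * xj for bj, xj in zip(b[1:], row)))
            err = di - p
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        b = [bj + lr * g / n for bj, g in zip(b, grad)]
    return b

def propensity_scores(X, b):
    # PS = estimated Pr(D=1 | X) for every observation.
    return [sigmoid(b[0] + sum(bj * xj for bj, xj in zip(b[1:], row))) for row in X]

# Hypothetical data: older clients and clients with lower MH scores
# are more likely to attend clinic 1.
rng = random.Random(0)
X, d = [], []
for _ in range(400):
    age = rng.uniform(-1, 1)       # rescaled age (assumption)
    female = rng.choice([0, 1])    # gender indicator
    mh = rng.uniform(-1, 1)        # rescaled MH status (assumption)
    p_true = sigmoid(-0.2 + 1.5 * age + 0.5 * female - 1.0 * mh)
    X.append([age, female, mh])
    d.append(1 if rng.random() < p_true else 0)

b = fit_logit(X, d)
ps = propensity_scores(X, b)
```

Each entry of `ps` is the estimated probability that the corresponding observation belongs to clinic 1, and it is these values that the matching steps operate on.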

How it works • Propensity score matching: match groups of observations on the estimated probability of being in a group (the propensity score) • Key assumption: group membership is independent of outcomes conditional on the observed characteristics • This assumption fails if unobserved characteristics are associated with both group membership and outcomes • PSM enables matching at the mean and in the distribution of observed characteristics (a single score represents multiple dimensions)
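The key assumption can be written compactly in potential-outcomes notation. The symbols Y(0) and Y(1), the outcomes an observation would have in group 0 and in group 1, are introduced here for illustration and do not appear on the slides. The statement is the standard result that, if group membership is independent of outcomes given X, it is also independent given the single score P(X):

```latex
\[
  \bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp D \mid X
  \quad\Longrightarrow\quad
  \bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp D \mid P(X),
  \qquad P(X) = \Pr(D = 1 \mid X), \quad 0 < P(X) < 1 .
\]
```

The condition 0 < P(X) < 1 says that observations with characteristics X can be found in both groups.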

How it Works • There are different ways to apply PSM • Exact matching on propensity score • If all characteristics are measured as binary indicators, each combination of characteristics yields its own PS value, so observations can be matched on identical scores • “Fuzzy” matching on propensity score • Find the closest individual match, e.g. nearest neighbor, caliper matching • General matching on propensity score • Match the propensity score distribution, e.g. stratify observations per decile of PS (0-0.1, 0.1-0.2, 0.2-0.3, etc.)
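To make the “fuzzy” option concrete, here is a minimal sketch of greedy one-to-one nearest-neighbor matching with a caliper. The function name `caliper_match`, the caliper width of 0.05 and the example scores are all hypothetical. Greedy matching without replacement is only one of several possible variants, and its result depends on the order in which observations are processed.

```python
def caliper_match(ps_a, ps_b, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.
    For each score in ps_a, pick the closest unused score in ps_b and
    accept the pair only if the two scores differ by at most `caliper`.
    Returns a list of (index_in_a, index_in_b) pairs."""
    available = set(range(len(ps_b)))
    pairs = []
    for i, p in enumerate(ps_a):
        if not available:
            break
        # Closest remaining candidate; ties broken by lowest index.
        j = min(available, key=lambda c: (abs(ps_b[c] - p), c))
        if abs(ps_b[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical propensity scores for observations in two clinics.
ps_a = [0.20, 0.47, 0.52, 0.90]
ps_b = [0.18, 0.50, 0.55, 0.30, 0.10]
pairs = caliper_match(ps_a, ps_b, caliper=0.05)
print(pairs)
```

In this example the observation with score 0.90 has no candidate within the caliper, so it is left unmatched.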

How it Works • A key element of any type of PSM is support: • Support means there is some reasonable match for any observation(s) in both groups • Observations without support will create biased results if they are not eliminated

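[Figure: density of propensity scores for Clinic A and for Clinic B, plotted over the propensity score range 0 to 1. The overlap of the two densities is labeled “Region of common support”; a further annotation reads “High probability of being in Clinic B given X”.]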
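One simple way to operationalize support is to keep only observations whose score falls inside the range covered by both groups. The sketch below uses that min/max rule, which is an assumption for illustration; the slides do not prescribe a particular rule. The helper names and scores are hypothetical.

```python
def common_support(ps_a, ps_b):
    """Return (low, high), the overlap of the two groups' PS ranges."""
    low = max(min(ps_a), min(ps_b))
    high = min(max(ps_a), max(ps_b))
    return low, high

def trim(ps, low, high):
    """Indices of observations whose PS lies inside [low, high]."""
    return [i for i, p in enumerate(ps) if low <= p <= high]

# Hypothetical propensity scores by clinic.
ps_a = [0.05, 0.20, 0.35, 0.60]
ps_b = [0.30, 0.55, 0.70, 0.95]

low, high = common_support(ps_a, ps_b)
keep_a = trim(ps_a, low, high)
keep_b = trim(ps_b, low, high)
print(low, high, keep_a, keep_b)
```

Observations outside the overlap (here the two lowest scores in clinic A and the two highest in clinic B) have no reasonable counterpart in the other group and would be dropped.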

How it Works • For general/distributional PSM you also need to test for balance: • Balance means that there are no statistical (or meaningful) differences in characteristics between groups among observations within the same PS stratum (e.g. decile) • If you cannot find balance you need to break the PS into finer strata and test again • If you are still having problems you may need to add cross-products of measures to your PS logit regression and try again

A Recipe for General PSM • Run logistic regression (remember clinic equation) • Generate predicted probabilities (PS) of being in one group or another • Recode (group) raw PS into strata (at least quintiles) • Run a frequency of PS strata by observation group (e.g. frequency of PS quintile groups by clinic) • Check for any groups without support
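The grouping and frequency steps of the recipe might be sketched as follows. Equal-width bins over the 0 to 1 range are assumed, in the spirit of the decile example, and the names `stratum` and `support_table` and the data are hypothetical.

```python
from collections import Counter

def stratum(p, n_strata=5):
    """Map a PS in [0, 1] to an equal-width stratum index 0..n_strata-1."""
    return min(int(p * n_strata), n_strata - 1)

def support_table(ps, group, n_strata=5):
    """Cross-tabulate PS stratum by group (0/1).
    Returns ({stratum: (n_group0, n_group1)}, strata_without_support)."""
    counts = Counter((stratum(p, n_strata), g) for p, g in zip(ps, group))
    table = {s: (counts[(s, 0)], counts[(s, 1)]) for s in range(n_strata)}
    # A non-empty stratum lacks support when one group has no observations.
    unsupported = [s for s, (n0, n1) in table.items()
                   if min(n0, n1) == 0 and n0 + n1 > 0]
    return table, unsupported

# Hypothetical scores and clinic indicators (0 = clinic A, 1 = clinic B).
ps = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.85, 0.95]
group = [0, 0, 0, 1, 0, 1, 1, 1, 1]

table, unsupported = support_table(ps, group)
for s, (n0, n1) in table.items():
    print(f"stratum {s}: clinic A = {n0}, clinic B = {n1}")
print("strata without support:", unsupported)
```

Strata listed in `unsupported` contain observations from only one group, so those observations have no comparison cases.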

A Recipe for General PSM • Do a t-test on characteristic measures between observation groups (e.g. clinic) within each PS stratum (with support) to test for balance • If you don’t have balance, break the PS into finer strata or add cross-products to the PS regression, then repeat • If you have balance and support you can choose to: • Use PS or original “confounder” variables in regression to obtain comparison of outcomes (but remember linearity assumption issue) • Continue on to create a sample that is matched in distribution across PS groups
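A within-stratum balance check could look like the sketch below. It computes Welch's t statistic for one characteristic (age) in each stratum and flags strata where the absolute value exceeds 2.0. That cutoff is a rough rule of thumb assumed for the example, not a formal test with a p-value, and the function names and data are hypothetical. The sketch also assumes each group has some variation in the characteristic within a stratum.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def balance_by_stratum(x, group, strata, threshold=2.0):
    """t statistic for characteristic x between groups within each stratum.
    Returns ({stratum: t}, strata where |t| exceeds the threshold)."""
    t_stats = {}
    for s in sorted(set(strata)):
        a = [xi for xi, g, st in zip(x, group, strata) if st == s and g == 0]
        b = [xi for xi, g, st in zip(x, group, strata) if st == s and g == 1]
        if len(a) < 2 or len(b) < 2:
            continue  # too few observations in one group to test
        t_stats[s] = welch_t(a, b)
    flagged = [s for s, t in t_stats.items() if abs(t) > threshold]
    return t_stats, flagged

# Hypothetical ages: stratum 0 is balanced, stratum 1 is not.
age    = [30, 34, 38, 42, 31, 35, 37, 41, 20, 22, 24, 26, 50, 52, 54, 56]
group  = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
strata = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

t_stats, flagged = balance_by_stratum(age, group, strata)
print(t_stats, flagged)
```

The same check would be repeated for every characteristic in the PS model (gender, MH status, and so on).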

A Recipe for General PSM • To “fully” match samples, randomly select observations from the larger observation group within each PS stratum • Re-do your “sample characteristics” table to show sample equivalence at the mean • You can now directly compare means of outcome using any statistical test you want: • without reference/inclusion of “confounder” variables • without concern for non-linear relations between confounders and outcomes
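One reading of the “fully” matched step is sketched below: within each stratum, the larger group is randomly downsampled to the size of the smaller one, so both groups end up with the same number of observations in every stratum. This is an interpretation offered for illustration. The function name `match_in_distribution`, the fixed seed and the data are hypothetical.

```python
import random
from collections import defaultdict

def match_in_distribution(group, strata, seed=0):
    """Within each PS stratum, randomly downsample the larger group (0/1)
    to the size of the smaller one. Strata where one group is absent
    (no support) are dropped entirely. Returns sorted kept indices."""
    rng = random.Random(seed)
    cells = defaultdict(lambda: {0: [], 1: []})
    for i, (g, s) in enumerate(zip(group, strata)):
        cells[s][g].append(i)
    keep = []
    for cell in cells.values():
        n = min(len(cell[0]), len(cell[1]))  # 0 when a group is missing
        for g in (0, 1):
            keep.extend(rng.sample(cell[g], n))
    return sorted(keep)

# Hypothetical group indicators and PS strata for ten observations.
group  = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
strata = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]

keep = match_in_distribution(group, strata)
print(keep)
```

Outcome means can then be compared directly on the kept observations, after re-running the sample characteristics table to confirm equivalence.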

A Recipe for General PSM • Results from doing PSM in our clinic example might look like this:

MH Services Examples • The Use of Propensity Scores to Evaluate Outcomes for Community Clinics (Hodges & Grunwald, 2005): • Compare outcomes for youth in “exemplary” clinic to “typical care” • Use general PSM to find supported and balanced comparison group from other clinics (note: not “fully” matched but use PS as covariate) • Propensity Score Matching: An Illustrative Analysis of Dose Response (Foster, 2003): • Compare outcomes for youths grouped by amount (“dose”) of outpatient services received • Use general PSM to find supported and balanced groups by “dose” (note: again not “fully” matched but use PS as covariate)

Summary Overview of PSM • Next best approach after randomization to account for sample selection bias: • But it only accounts for observable characteristics • Increasingly expected as part of high quality observational research • At least checking for support can potentially save you from making bad comparisons • But if you have to drop observations you need to be clear about whom your sample represents (external validity) • “Fully” matching (by individual or strata) has additional side benefits: • Don’t need to include additional covariates in analysis • Don’t need to worry about non-linearity between covariates and outcomes • Any characteristic-based sub-sample will also be “matched”

Questions? Thank You nwallace@pdx.edu Mark O. Hatfield School of Government Portland State University
