
Propensity Score


bernad

Presentation Transcript


  1. Propensity Score • Overview: • Motivation: what do we use a propensity score for? • Constructing the propensity score • Implementing the propensity score in Stata • I’ll post these slides on the web

  2. Motivation: a Case in which the Propensity Score is Useful • This is just an illustrative example: • Suppose that you want to evaluate the effect of a scholarship program for secondary school on years of schooling. • Suppose every primary school graduate is eligible. • You have data on every child, including test scores, family income, age, gender, etc. • Scholarships are awarded based on some combination of test scores, family income, gender, etc., but you don’t know the exact formula.

  3. Motivation (cont.) • If scholarships were assigned completely at random, we could simply compare the treatment and control groups (a randomized experiment). • In our example, assume you know that children with higher test scores are more likely to get the scholarship, but you don’t know how important this and other factors are. You only know that the decision is based on information you have plus some randomness. • What can you do with this information?

  4. Motivation (cont.) • In principle, you could just run the following regression: $\text{years of schooling}_i = \beta_0 + \beta_1 I_i^{\text{scholarship}} + \beta_2 X_i + \varepsilon_i$, where $I_i^{\text{scholarship}}$ is a dummy variable for receiving the scholarship, $X_i$ contains all the variables that you think affect the probability of receiving a scholarship ($X_i$ can be a whole vector of things), and $\varepsilon_i$ is the error term. • This works only if you know that the probability of getting a scholarship is just a linear function of $X$. • But we may have no idea how the selection depends on $X$. For instance, the selection rule may use discrete cutoffs rather than a linear function.

  5. Motivation (cont.) • Suppose your variables are not continuous but are grouped (somewhat arbitrarily) into categories, e.g. family income above or below $50 per week, scores above or below the mean, sex, age, etc. • Now you could put in a dummy variable for each category and interactions between all the dummies. This would distinguish every group formed by the categories. • Or you could run a separate regression for each group. • This is more flexible, since it allows the effect of the scholarship to differ by group. • These methods are in principle correct, but they are only feasible if you have a lot of data and few categories.
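
To see why fully interacted dummies need "a lot of data and few categories", it can help to count the groups. The sketch below is only illustrative: the numbers of levels per variable and the sample size of 1,000 are assumptions, not figures from the slides.

```python
def count_cells(levels_per_variable):
    """Number of distinct groups when every categorical variable is
    interacted with every other one (product of the numbers of levels)."""
    cells = 1
    for levels in levels_per_variable:
        cells *= levels
    return cells

# Hypothetical categorisation: income (2), test score (2), sex (2), age band (4).
few = count_cells([2, 2, 2, 4])   # 32 groups
# Ten binary covariates instead (also hypothetical).
many = count_cells([2] * 10)      # 1024 groups

sample_size = 1000  # assumed sample size
print(few, sample_size / few)     # average observations per group
print(many, sample_size / many)   # fewer than one observation per group
```

Each group also needs both scholarship recipients and non-recipients for a within-group comparison, so the usable number of groups is smaller still.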

  6. Constructing the Propensity Score • Estimation based on the propensity score can deal with these kinds of situations. • For it to work, you need to have a selection into the treatment (in our case the scholarship) that is based on observables (observed covariates). • The following gives a brief overview of how the propensity score is constructed. • In practice, you can download a canned Stata command that will do all of this for you.

  7. Definition and General Idea • Definition: The propensity score is the probability of receiving treatment (in our case a scholarship), conditional on the covariates (in our case Xi). • The idea is to compare individuals who based on observables have a very similar probability of receiving treatment (similar propensity score), but one of them received treatment and the other didn’t. • We then think that all the difference in the outcome variable is due to the treatment.

  8. First stage • In the first stage of a propensity score estimation, you find the propensity score. • To do this, you run a regression (probit or logit) that has the treatment dummy on the left-hand side, so that it models the probability of receiving treatment, and the covariates that determine selection into the treatment on the right-hand side: $I_i^{\text{treat}} = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots \,(+ \gamma_1 X_{i1}^2 + \gamma_2 X_{i2}^2 + \dots) + \varepsilon_i$ • The propensity score is just the predicted value $\hat{I}_i^{\text{treat}}$ that you get from this regression. • You start with a simple specification (e.g. just linear terms). Then you apply the algorithm on the next slide to decide whether this specification is good enough.
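
As a rough illustration of the first stage, the sketch below fits a logit by plain gradient ascent and takes the fitted probabilities as propensity scores. Everything in it is an assumption made for illustration: the simulated data, the two standardised covariates (test score and family income), the coefficients 0.8 and -0.5, and the helper names `fit_logit` and `propensity`. In practice you would use a statistics package's logit or probit routine.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def propensity(beta, x):
    """Fitted probability of treatment for covariate vector x."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def fit_logit(X, treat, steps=500, lr=0.5):
    """Fit P(treat = 1 | x) = sigmoid(b0 + b.x) by gradient ascent
    on the mean log-likelihood. beta[0] is the intercept."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, t in zip(X, treat):
            err = t - propensity(beta, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Simulated (hypothetical) data: higher test scores raise the chance of a
# scholarship, higher family income lowers it, and the rest is randomness.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
treat = [1 if random.random() < sigmoid(0.8 * s - 0.5 * inc) else 0
         for s, inc in X]

beta = fit_logit(X, treat)
scores = [propensity(beta, x) for x in X]  # estimated propensity scores
```

Adding squared terms or interactions, as the specification in parentheses suggests, only means appending extra columns to `X` before fitting.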

  9. Algorithm • Sort your data by the propensity score and divide it into blocks (groups) of observations with similar propensity scores. • Within each block, test (using a t-test) whether the means of the covariates are equal in the treatment and control groups. If so, stop: you’re done with the first stage. • If a particular block has one or more unbalanced covariates, divide that block into finer blocks and re-evaluate. • If a particular covariate is unbalanced in multiple blocks, modify the initial logit or probit equation by including higher-order terms and/or interactions with that covariate, and start again.
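
The balance check inside each block can be sketched as follows. This is a simplified illustration, not the canned Stata routine: it assumes equal-width blocks, a Welch t statistic, and a critical value of 1.96 in place of an exact t-distribution p-value. The function names and the tiny dataset are made up for the example.

```python
import math
from statistics import mean, variance

def make_blocks(scores, n_blocks=5):
    """Assign each observation to one of n_blocks equal-width score intervals."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_blocks
    if width == 0:
        return [0] * len(scores)
    return [min(int((s - lo) / width), n_blocks - 1) for s in scores]

def welch_t(a, b):
    """Welch t statistic for the difference in means, or None if undefined."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return None if se == 0 else (mean(a) - mean(b)) / se

def unbalanced_blocks(scores, treat, covariate, n_blocks=5, crit=1.96):
    """Blocks where the covariate mean differs between treated and controls."""
    block = make_blocks(scores, n_blocks)
    flagged = []
    for b in range(n_blocks):
        tr = [x for x, t, k in zip(covariate, treat, block) if k == b and t == 1]
        co = [x for x, t, k in zip(covariate, treat, block) if k == b and t == 0]
        if len(tr) < 2 or len(co) < 2:
            continue  # too few observations to run the test
        t_stat = welch_t(tr, co)
        if t_stat is not None and abs(t_stat) > crit:
            flagged.append(b)
    return flagged

# Tiny made-up dataset: the covariate is unbalanced in the low-score block only.
scores = [0.10, 0.12, 0.14, 0.16, 0.90, 0.88, 0.86, 0.84]
treat = [1, 1, 0, 0, 1, 1, 0, 0]
covariate = [10, 11, 1, 2, 5, 6, 5, 6]
flagged = unbalanced_blocks(scores, treat, covariate, n_blocks=2)
print(flagged)  # block indices that fail the balance test
```

A flagged block would then be split into finer blocks, or, if the same covariate keeps failing, the first-stage specification would be extended and the whole check repeated.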

  10. Second Stage • In the second stage, we use the propensity score to look at the effect of treatment on the outcome (in our example, the effect of getting the scholarship on years of schooling). • Once you have determined your propensity score with the procedure above, there are several ways to use it. I’ll present two of them (there is a canned version in Stata for both): • Stratifying on the propensity score • Divide the data into blocks based on the propensity score (the blocks are determined by the algorithm). Run the second-stage regression within each block. Calculate the weighted mean of the within-block estimates to get the average treatment effect. • Matching on the propensity score • Match each treatment observation with one or more control observations, based on similar propensity scores. You then include a dummy for each matched group, which controls for everything that is common within that group.
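
Both second-stage approaches can be sketched in a few lines. The version below makes simplifying assumptions that the slides do not specify: the within-block "regression" is just a difference in mean outcomes, blocks are weighted by their number of treated observations, and matching uses the single nearest control (with replacement) and averages the matched differences rather than running a regression with matched-group dummies. The data are invented for the example.

```python
from statistics import mean

def stratified_effect(outcome, treat, block):
    """Weighted mean of within-block treated-minus-control differences.
    Weights are the number of treated units per block (an assumption)."""
    weighted_sum, total_treated = 0.0, 0
    for b in sorted(set(block)):
        y1 = [y for y, t, k in zip(outcome, treat, block) if k == b and t == 1]
        y0 = [y for y, t, k in zip(outcome, treat, block) if k == b and t == 0]
        if not y1 or not y0:
            continue  # no treated-control comparison possible in this block
        weighted_sum += len(y1) * (mean(y1) - mean(y0))
        total_treated += len(y1)
    return weighted_sum / total_treated

def nearest_neighbour_effect(outcome, treat, scores):
    """Match each treated unit to the control with the closest propensity
    score (with replacement) and average the outcome differences."""
    controls = [(s, y) for s, y, t in zip(scores, outcome, treat) if t == 0]
    diffs = []
    for s, y, t in zip(scores, outcome, treat):
        if t == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - s))
            diffs.append(y - y_match)
    return mean(diffs)

# Made-up data: years of schooling, scholarship dummy, block, propensity score.
years = [12, 11, 9, 10, 14, 13, 12, 11]
treat = [1, 1, 0, 0, 1, 1, 0, 0]
block = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.30, 0.35, 0.32, 0.36, 0.70, 0.75, 0.72, 0.74]

print(stratified_effect(years, treat, block))          # 2.0
print(nearest_neighbour_effect(years, treat, scores))  # 2
```

In this toy dataset both estimators attribute two extra years of schooling to the scholarship. With real data they generally differ somewhat.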

  11. Implementation in Stata • Search for pscore in Stata’s help • Click on the word “pscore” • Download the command • See the Stata help for how to implement it • First stage: pscore treat X1 X2 X3…, pscore(scorename) • Second stage: attr (for matching) or atts (for stratifying): attr outcome treat, pscore(scorename)

  12. General Remarks • The propensity score approach becomes more appropriate the more randomness there is in who gets treatment (the closer we are to a randomized experiment). • The propensity score doesn’t work very well if almost everyone with a high propensity score gets treatment and almost everyone with a low score doesn’t: we need to be able to compare people with similar propensities who did and did not get treatment. • The propensity score approach doesn’t correct for unobservable variables that affect whether observations receive treatment.
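
One way to check the second remark in practice is to look at how much the propensity scores of treated and control observations overlap. The sketch below uses a simple min/max rule for the overlap region, which is an assumption rather than something prescribed in the slides. The data and the name `common_support` are illustrative.

```python
def common_support(scores, treat):
    """Return the score range covered by both groups and a flag per
    observation saying whether its score lies inside that range."""
    treated = [s for s, t in zip(scores, treat) if t == 1]
    control = [s for s, t in zip(scores, treat) if t == 0]
    lo = max(min(treated), min(control))
    hi = min(max(treated), max(control))
    inside = [lo <= s <= hi for s in scores]
    return lo, hi, inside

# Made-up example where treatment follows the score almost deterministically.
scores = [0.05, 0.10, 0.40, 0.45, 0.50, 0.55, 0.90, 0.95]
treat = [0, 0, 0, 1, 0, 1, 1, 1]

lo, hi, inside = common_support(scores, treat)
print(lo, hi, sum(inside))  # only 2 of 8 observations are comparable
```

When only a small share of observations falls inside the overlap, there are few treated and untreated people with similar propensities to compare, which is the situation the remark warns about.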
