## Module 3: Impact Evaluation for TTLs


Paul J. Gertler, Chief Economist, HDN

Sebastian Martinez, Impact Evaluation Cluster, AFTRL

HD Learning Week, Washington DC, November 2006

Slides by Paul Gertler and Sebastian Martinez

**Measuring Impact**

What makes a good impact evaluation?

**Motivation**

- “Traditional” M&E:
  - Is the program being implemented as designed?
  - Could the operations be more efficient?
  - Are the benefits getting to those intended?
  - Monitoring trends
    - Are indicators moving in the right direction?
  - NO inherent causality
- Impact Evaluation:
  - What was the effect of the program on outcomes?
  - Because of the program, are people better off?
  - What would happen if we changed the program?
  - Causality

**Motivation**

- Objective in evaluation is to estimate the CAUSAL effect of intervention X on outcome Y
  - What is the effect of a cash transfer on household consumption?
- For causal inference we must understand the data generation process
  - For impact evaluation, this means understanding the behavioral process that generates the data
    - how benefits are assigned

**Causation versus Correlation**

- Recall: correlation is NOT causation
  - Necessary but not sufficient condition
- Correlation: X and Y are related
  - A change in X is related to a change in Y
  - And a change in Y is related to a change in X
- Causation: if we change X, how much does Y change?
  - A change in X is related to a change in Y
  - Not necessarily the other way around

**Causation versus Correlation**

- Three criteria for causation:
  - Independent variable precedes the dependent variable.
  - Independent variable is related to the dependent variable.
  - There are no third variables that could explain why the independent variable is related to the dependent variable.
- External validity
  - Generalizability: causal inference to generalize outside the sample population or setting

**Motivation**

- The word cause is not in the vocabulary of standard probability theory.
- Probability theory: two events are mutually correlated, or dependent: if we find one, we can expect to encounter the other.
  - Example: age and income
- For impact evaluation, we supplement the language of probability with a vocabulary for causality.

**Statistical Analysis & Impact Evaluation**

- Statistical analysis: typically involves inferring the causal relationship between X and Y from observational data
  - Many challenges & complex statistics
- Impact Evaluation:
  - Retrospectively:
    - same challenges as statistical analysis
  - Prospectively:
    - we generate the data ourselves through the program’s design and the evaluation design
    - makes things much easier!

**How to assess impact**

- What is the effect of a cash transfer on household consumption?
- Formally, program impact is: $\alpha = (Y \mid P=1) - (Y \mid P=0)$
- Compare the same individual with & without the program at the same point in time
- So what’s the problem?

**Solving the evaluation problem**

- Problem: we never observe the same individual with and without the program at the same point in time
- Need to estimate what would have happened to the beneficiary if he or she had not received benefits
- Counterfactual: what would have happened without the program
- Difference between treated observation and counterfactual is the estimated impact

**Finding a good counterfactual**

- The treated observation and the counterfactual:
  - have identical factors/characteristics, except for benefiting from the intervention
  - No other explanations for differences in outcomes between the treated observation and counterfactual
- The only reason for the difference in outcomes is the intervention

**Measuring Impact**

Tool belt of Impact Evaluation Design Options:

- Randomized Experiments
- Quasi-experiments
  - Regression Discontinuity
  - Difference in difference (panel data)
  - Other (using Instrumental Variables, matching, etc.)
- In all cases, these will involve knowing the rule for assigning treatment

**Choosing your design**

- For impact evaluation, we will identify the “best” possible design given the operational context
- Best possible design is the one that has the fewest risks for contamination
  - Omitted Variables (biased estimates)
  - Selection (results not generalizable)

**Case Study**

- Effect of cash transfers on consumption
- Estimate impact of cash transfer on consumption per capita
- Make sure:
  - Cash transfer comes before change in consumption
  - Cash transfer is correlated with consumption
  - Cash transfer is the only thing changing consumption
- Example based on Oportunidades

**Oportunidades**

- National anti-poverty program in Mexico (1997)
- Cash transfers and in-kind benefits conditional on school attendance and health care visits.
- Transfer given preferably to mother of beneficiary children.
- Large program with large transfers:
  - 5 million beneficiary households in 2004
  - Large transfers, capped at:
    - $95 USD for HH with children through junior high
    - $159 USD for HH with children in high school

**Oportunidades Evaluation**

- Phasing in of intervention
  - 50,000 eligible rural communities
  - Random sample of 506 eligible communities in 7 states: the evaluation sample
- Random assignment of benefits by community:
  - 320 treatment communities (14,446 households)
    - First transfers distributed April 1998
  - 186 control communities (9,630 households)
    - First transfers November 1999

**“Counterfeit” Counterfactual Number 1**

- Before and after:
  - Assume we have data on
    - Treatment households before the cash transfer
    - Treatment households after the cash transfer
  - Estimate “impact” of cash transfer on household consumption:
    - Compare consumption per capita before the intervention to consumption per capita after the intervention
    - Difference in consumption per capita between the two periods is “treatment”

**Case 1: Before and After**

- Compare Y before and after intervention: $\alpha_i = (CPC_{i,t} \mid T=1) - (CPC_{i,t-1} \mid T=0)$
- Estimate of counterfactual: $(CPC_{i,t} \mid T=0) = (CPC_{i,t-1} \mid T=0)$
- “Impact” = A-B
- Does not control for time-varying factors
  - Recession: Impact = A-C
  - Boom: Impact = A-D

[Figure: CPC against time. B is observed before (t-1) and A after (t); C? and D? mark possible counterfactual values at t.]

**“Counterfeit” Counterfactual Number 2**

- Enrolled/Not Enrolled
  - Voluntary enrollment in the program
  - Assume we have a cross-section of post-intervention data on:
    - Households that did not enroll
    - Households that enrolled
  - Estimate “impact” of cash transfer on household consumption:
    - Compare consumption per capita of those who did not enroll to consumption per capita of those who enrolled
    - Difference in consumption per capita between the two groups is “treatment”

**Those who did not enroll….**

- Impact estimate: $\alpha_i = (Y_{i,t} \mid P=1) - (Y_{j,t} \mid P=0)$
- Counterfactual: $(Y_{j,t} \mid P=0) \neq (Y_{i,t} \mid P=0)$
- Examples:
  - Those who choose not to enroll in program
  - Those who were not offered the program
    - Conditional Cash Transfer
    - Job Training program
- Cannot control for all reasons why some choose to sign up & others didn’t
- Reasons could be correlated with outcomes
  - We can control for observables…
  - But are still left with the unobservables

**Impact Evaluation Example: Two counterfeit counterfactuals**

- What is going on??
- Which of these do we believe?
- Problem with Before-After:
  - Cannot control for other time-varying factors
- Problem with Enrolled-Not Enrolled:
  - Do not know why the treated are treated and the others not

**Possible Solutions…**

- We need to understand the data generation process
  - How beneficiaries are selected and how benefits are assigned
- Guarantee comparability of treatment and control groups, so the ONLY difference is the intervention

**Measuring Impact**

- Experimental design/randomization
- Quasi-experiments
  - Regression Discontinuity
  - Double differences (diff in diff)
  - Other options

**Choosing the methodology…..**

- Choose the most robust strategy that fits the operational context
- Use program budget and capacity constraints to choose a design, i.e. pipeline:
  - Universe of eligible individuals typically larger than available resources at a single point in time
  - Fairest and most transparent way to assign benefits may be to give all an equal chance of participating: randomization

**Randomization**

- The “gold standard” in impact evaluation
- Give each eligible unit the same chance of receiving treatment
  - Lottery for who receives benefit
  - Lottery for who receives benefit first

[Diagram: Population, then Randomization, then Sample, then Randomization, then Treatment Group and Control Group.]

**External & Internal Validity**

- The purpose of the first stage is to ensure that the results in the sample will represent the results in the population within a defined level of sampling error (external validity).
- The purpose of the second stage is to ensure that the observed effect on the dependent variable is due to some aspect of the treatment rather than other confounding factors (internal validity).

**Case 3: Randomization**

- Randomized treatment/controls
- Community-level randomization
  - 320 treatment communities
  - 186 control communities
- Pre-intervention characteristics well balanced

**Measuring Impact**

- Experimental design/randomization
- Quasi-experiments
  - Regression Discontinuity
  - Double differences (diff in diff)
  - Other options

**Case 4: Regression Discontinuity**

- Assignment to treatment is based on a clearly defined index or parameter with a known cutoff for eligibility
- RD is possible when units can be ordered along a quantifiable dimension which is systematically related to the assignment of treatment
- The effect is measured at the discontinuity: estimated impact around the cutoff may not generalize to the entire population

**Indexes are common in targeting of social programs**

- Anti-poverty programs targeted to households below a given poverty index
- Pension programs targeted to population above a certain age
- Scholarships targeted to students with high scores on standardized tests
- CDD programs awarded to NGOs that achieve the highest scores

**Example: effect of cash transfer on consumption**

- Target transfer to poorest households
- Construct poverty index from 1 to 100 with pre-intervention characteristics
  - Households with a score <=50 are poor
  - Households with a score >50 are non-poor
- Cash transfer to poor households
- Measure outcomes (i.e. consumption) before and after transfer

[Figure: regions labeled Non-Poor and Poor.]

**Case 4: Regression Discontinuity**

- Oportunidades assigned benefits based on a poverty index
- Where
  - Treatment = 1 if score <=750
  - Treatment = 0 if score >750

**Case 4: Regression Discontinuity**

[Figure: Baseline, no treatment.]

**Case 4: Regression Discontinuity**

[Figure: Treatment period.]

**Potential Disadvantages of RD**

- Local average treatment effects: not always generalizable
- Power: effect is estimated at the discontinuity, so we generally have fewer observations than in a randomized experiment with the same sample size
- Specification can be sensitive to functional form: make sure the relationship between the assignment variable and the outcome variable is correctly modeled, including:
  - Nonlinear relationships
  - Interactions

**Advantages of RD for Evaluation**

- RD yields an unbiased estimate of the treatment effect at the discontinuity
- Can often take advantage of a known rule for assigning the benefit; such rules are common in the design of social policy
- No need to “exclude” a group of eligible households/individuals from treatment

**Measuring Impact**

- Experimental design/randomization
- Quasi-experiments
  - Regression Discontinuity
  - Double differences (diff in diff)
  - Other options

**Case 5: Diff in diff**

- Compare change in outcomes between treatment and non-treatment groups
- Impact is the difference in the change in outcomes
- Impact $= (Y_{t1} - Y_{t0}) - (Y_{c1} - Y_{c0})$
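
The double-difference formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `diff_in_diff` and all consumption figures are hypothetical and are not taken from the Oportunidades data. The sketch computes the change in mean outcome for the treatment group, subtracts the change for the comparison group, and prints the naive before-and-after estimate alongside for contrast.

```python
from statistics import mean


def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Return (Yt1 - Yt0) - (Yc1 - Yc0), using group means as Y."""
    change_treated = mean(treat_after) - mean(treat_before)
    change_control = mean(control_after) - mean(control_before)
    return change_treated - change_control


# Hypothetical consumption per capita (illustrative numbers only).
treat_before = [100.0, 110.0, 90.0]
treat_after = [130.0, 140.0, 120.0]
control_before = [105.0, 95.0, 100.0]
control_after = [115.0, 105.0, 110.0]

# Naive before-and-after estimate: attributes the full change to the program.
before_after = mean(treat_after) - mean(treat_before)

# Double difference: nets out the change the comparison group also experienced.
impact = diff_in_diff(treat_before, treat_after, control_before, control_after)

print("Before-after estimate:", before_after)
print("Diff-in-diff estimate:", impact)
```

In this made-up example the comparison group's consumption also rises over the period, so the before-and-after estimate is larger than the double-difference estimate. The double difference is only credible if the two groups would have followed the same trend without the program.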