New Frontiers in Poverty Measurement Part II

Peter Lanjouw, DECPI, March 2011

Outline Three Topics • Small area estimation of poverty: “Poverty Maps” • Comparing poverty with non-comparable data • Using repeated cross-sections to explore movements in and out of poverty

1. Poverty Mapping • Introduction • What is a poverty map? • Why is there demand? • Poverty mapping methodology • Examining the underlying assumptions

Introduction • Project within World Bank’s Research Department • In collaboration with academics • Methodological papers • Elbers, Lanjouw and Lanjouw (2003, Econometrica) • Hentschel et al. (2000) and ELL (2000, 2002) • PovMap Software (Qinghua Zhao) • Goal is to produce accurate estimates of welfare at small area level – “poverty maps”

What are Poverty Maps? • Not necessarily “Maps”; rather, • highly disaggregated databases of welfare • Poverty • Inequality • Average income/consumption • Calorie intake • Under-nutrition • Other indicators (health outcomes? life-expectancy?) • disaggregation may, but need not, be spatial • Poverty of “statistically invisible” groups

Example: Yunnan Province (China)

Example: Yunnan Province (China): Township-level poverty

Why is there demand? Growing world-wide interest in having access to local-level information 1. Process of decentralization: • Sub-national governments (state, municipality, …) are expected to devise and implement policies • It is important to know which localities should be prioritized • Need to compare poverty across localities

Why is there demand? 2. Geographic targeting of resources: • Fine geographic targeting typically results in less leakage than coarse targeting. • Simulations from Ecuador, Cambodia and Madagascar: • The poverty reduction attainable by a uniform lump-sum transfer can be achieved with less than one third of the funds if they are targeted to the poorest communities (on average fewer than 2,000 households). • Of course, implementation of fine targeting can also be more costly

Basic problem • The main source of information on consumption welfare (household expenditure surveys) permits only limited disaggregation • Very large data sources (e.g. the census) typically collect very limited information on welfare outcomes

Previous “solutions” 1. Collect larger samples • Expensive • Some kind of data compromise 2. Combine the limited information available in large data sources, such as the census, into some welfare index (e.g. “basic needs index” or “asset index”) • Ad hoc, easily leads to multiple maps • How to interpret? • Measures of welfare do not line up with official numbers at the national/regional level

Combine Census and Survey • Impute a measure of welfare from household survey into census, using statistical prediction methods • Produces readily interpretable estimates: Works with exactly the same concept of welfare as traditional survey-based analysis. • Statistical precision can be gauged • Encouraging results to date • But, non-negligible data requirements

Data requirements • Survey and census have variables in common (questionnaires have to correspond) • Common variables are sufficiently correlated with consumption • Survey and census can be linked at the cluster level • Census includes variables that capture location-specific effects (or these come from a third data set) • Census enjoys large coverage

Methodology ELL (2002, 2003) • Estimate a model of, for example, per-capita consumption $y_h$ using sample survey data • Restrict explanatory variables to those that can be linked to households in survey and census • Estimate expected level of poverty or inequality for a target population using its census-based characteristics and the estimates from the model of $y$

Three Basic Stages • Zero stage: establish comparability of data sources; identify/merge common variables; understand sampling design; GIS info(?) • First stage: estimate model of consumption/income • Second stage: take parameter estimates to census, predict consumption, and estimate poverty and inequality.

Methodology • Let $W(m, y)$ be a welfare measure based on a vector of household per-capita expenditures, $y$, and household sizes, $m$. • We want to estimate $W$ for a target population (say a municipality, $v$) where $y$ is unknown. • We estimate a model of consumption/income per capita: $\ln y_{ch} = x_{ch}'\beta + \eta_c + \varepsilon_{ch}$ • where $\eta_c$ is a cluster random effect allowing for a locational influence on consumption and $\varepsilon_{ch}$ is a household-specific disturbance.
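
As a concrete instance of a welfare measure W(m, y), the sketch below computes a population-weighted Foster-Greer-Thorbecke index, which is one common choice and not necessarily the one used in the slides. The function name `fgt`, the poverty line `z`, and the four-household data set are hypothetical and purely illustrative.

```python
import numpy as np

def fgt(y, m, z, alpha=0):
    """Population-weighted FGT poverty index.

    y: household per-capita expenditures, m: household sizes,
    z: poverty line (assumed), alpha: 0 = headcount, 1 = poverty gap.
    """
    y = np.asarray(y, dtype=float)
    m = np.asarray(m, dtype=float)
    poor = y < z
    # Relative shortfall from the poverty line, zero for the non-poor.
    gap = np.clip((z - y) / z, 0.0, None)
    return float(np.sum(m * poor * gap**alpha) / np.sum(m))

# Hypothetical target population of four households.
y = [50.0, 120.0, 80.0, 200.0]
m = [6, 3, 5, 2]
z = 100.0

headcount = fgt(y, m, z, alpha=0)    # share of individuals below z
poverty_gap = fgt(y, m, z, alpha=1)  # average relative shortfall
```

Weighting by household size makes the index a statement about individuals rather than households, which is why m enters W alongside y.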

Estimation Details First Stage: • Estimate separate regressions per stratum • Use cluster weights where significant • Allow for non-normality of disturbances (parametric/non-parametric), and • Heteroskedasticity in individual-specific component of disturbances. • Model is estimated by GLS • Modelling criteria: explanatory power, significance of parameters, parsimony (overfitting), size of location effect.

Estimation Details Second Stage: Simulation into Census • For each household, slope coefficients, $\tilde\beta$, and simulated disturbance terms, $\tilde\eta_c$ and $\tilde\varepsilon_{ch}$, are drawn from their corresponding distributions (parametric or semi-parametric) • Simulate per capita expenditure/income per household: $\hat y_{ch} = \exp(x_{ch}'\tilde\beta + \tilde\eta_c + \tilde\varepsilon_{ch})$ • Apply $r$ simulations • Calculate poverty in target population for each simulation. • Welfare estimate is the mean estimate across the $r$ simulations • Standard error is the standard deviation across simulations.
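
The second stage can be sketched in a few lines. This is a simplified illustration, not the PovMap implementation: it assumes normal draws for the coefficients and for both error components, homoskedastic errors, and a headcount welfare measure. The first-stage estimates (`beta_hat`, `V_beta`, `sig_eta`, `sig_eps`), the poverty line `z`, and the synthetic "census" are all hypothetical.

```python
import numpy as np

def simulate_poverty(X, cluster, m, beta_hat, V_beta, sig_eta, sig_eps,
                     z, r=100, seed=0):
    """Mean and standard deviation of the headcount across r simulations."""
    rng = np.random.default_rng(seed)
    n_clusters = int(cluster.max()) + 1
    rates = np.empty(r)
    for s in range(r):
        # Draw slope coefficients from their (assumed normal) sampling distribution.
        beta = rng.multivariate_normal(beta_hat, V_beta)
        # Draw a cluster effect per cluster and an idiosyncratic term per household.
        eta = rng.normal(0.0, sig_eta, size=n_clusters)
        eps = rng.normal(0.0, sig_eps, size=X.shape[0])
        # Simulated per-capita expenditure (the model is in logs).
        y = np.exp(X @ beta + eta[cluster] + eps)
        # Population-weighted headcount for this simulation.
        rates[s] = np.sum(m * (y < z)) / np.sum(m)
    return float(rates.mean()), float(rates.std(ddof=1))

# Hypothetical census records for one target population.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
cluster = rng.integers(0, 20, size=n)   # cluster id per household
m = rng.integers(1, 8, size=n)          # household sizes

estimate, std_error = simulate_poverty(
    X, cluster, m,
    beta_hat=np.array([4.5, 0.3]), V_beta=0.001 * np.eye(2),
    sig_eta=0.1, sig_eps=0.4, z=80.0,
)
```

Redrawing the coefficients in every simulation is what lets the spread across simulations reflect model error as well as idiosyncratic error.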

Prediction Error The error in the estimator can be decomposed as: • Idiosyncratic error – increases with smaller populations. • Model error – not related to size of target population. • Other elements can include: • Computation error – part of model error, can be negligible with sufficient number of simulations
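
One way to write the decomposition, in notation assumed here and modelled on ELL (2003): let $W$ be the true welfare measure for the target population, $\mu$ its expectation given census characteristics and the true model parameters, $\hat\mu$ the same expectation evaluated at the estimated parameters, and $\tilde\mu$ the simulation-based approximation of $\hat\mu$.

```latex
\[
W - \tilde{\mu}
  = \underbrace{(W - \mu)}_{\text{idiosyncratic error}}
  + \underbrace{(\mu - \hat{\mu})}_{\text{model error}}
  + \underbrace{(\hat{\mu} - \tilde{\mu})}_{\text{computation error}}
\]
```

The first term averages out as the target population grows, the second depends only on the precision of the first-stage estimates, and the third shrinks with the number of simulations.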

Key assumptions • Model accurately describes consumption for each level to which it is applied • Conditional distribution of y given x in small area A is the same as in larger region R • Tarozzi and Deaton (2007) refer to this as the “Area homogeneity assumption” • A shared cluster error is able to provide an accurate account of the spatial correlation between households • Presence of spatial correlation will diminish precision of estimates • Validation studies are needed to check on these assumptions • Elbers, Lanjouw and Leite (2008) provide validation study for Brazil

Testing the Poverty Mapping Methodology: A Brazil Case Study • Elbers, Lanjouw and Leite (2008) consider Minas Gerais, Brazil • Brazil Census collects income data • Thin round (87.5%) collects single-question measure of household income • Thick round (12.5%) collects more detailed info. • Neither is judged reliable for an ‘official’ poverty map. • We focus on Minas Gerais (for computational ease) • 606,000 households in 12.5% sample (out of 4.8m) • 12.5% sample covers all 853 municipalities in Minas Gerais

Minas Gerais: Brazil within Brazil • Per Capita Income

Minas Gerais: Brazil within Brazil • Infant Mortality Rate

Minas Gerais: Brazil within Brazil • Life Expectancy

Testing the Poverty Mapping Methodology: Brazil • We draw 41 synthetic surveys from Census sample • 21 mimic sample design of POF – 2,800 households • 13 households per cluster/EA • 241 EAs in about 151 municipalities • 20 mimic sample design of PNAD – 12,000 households • 16 households per cluster/EA • 779 EAs in 123 municipalities • We produce 41 poverty maps for Minas Gerais • We estimate location effect at EA level • We apply location effect at municipality level • Tarozzi and Deaton’s conservative approach

Testing the Methodology in Brazil • Estimate 41 models

Testing the Methodology in Brazil • Exercise 1: Differences in returns • Apply one model in full census sample (specified in one PNAD sample) • Re-estimate model separately in each municipality (again in PNAD sample) • Compare predicted municipality-level income

Testing the Methodology in Brazil • Municipal level Poverty Estimates versus “Truth”

Testing the Methodology in MG • Overly Precise estimates?

Testing the Methodology in MG • Are poverty estimates usable?

Conclusion • Our evidence is quite supportive of ELL methodology and underlying assumptions. • BUT, evidence for one place need not imply assumptions hold everywhere. • Validation efforts like these must be undertaken wherever possible. • Sometimes survey data do allow one to probe ELL assumptions explicitly. Clearly that should be done, whenever possible. • Proper validation can be built into planning and design of future poverty mapping activities. • Involvement of census bureau is likely to be central.

2. Comparing Poverty Across Non-Comparable Data • Household surveys fielded at different times are rarely identical in every respect: • Timing of fieldwork • Refining/modification of questionnaire • Aggregation/disaggregation of consumption components • Shift from recall to diary, or changes in recall periods • Can poverty still be readily compared? • “Great Indian Poverty Debate” focused on the 1999/00 round of the NSS survey, which introduced a slightly altered recall period. • Comparability of estimates was seriously compromised • Deaton & Tarozzi, Himanshu and Sen, Kijima and Lanjouw

Comparing Poverty • Can we use ELL method to impute consumption from one survey into another? • i.e., impute consumption into survey of time t+1 based on a model estimated in survey of time t • Are predicted poverty rates for t+1 similar to actual rates? • If so, this implies parameter estimates from consumption model are stable over time • All action is in changing X’s • Christiaensen, Lanjouw, Luoto and Stifel (2010) experiment with a variety of different consumption models in Kenya, Vietnam, and Russia • They “test” this idea with surveys that are actually comparable, but pretend that the surveys are not.

Comparing Poverty • Example of Vietnam • Consider two household surveys: 1992/93 and 1997/98 • These surveys are generally regarded as high quality and fully comparable • Poverty in Vietnam declined significantly in this period

            1992/93   1997/98
National    60.6%     37.4%
Rural       68.5%     44.9%
Urban       28.6%     9.0%

• Indicative of major structural changes in Vietnam • A priori expectation that “returns” are changing, i.e. stable parameter assumption would not hold.

Comparing Poverty • “poverty map” style models

Caveats • Replicating this exercise in Russia, between 1994 and 1998, doesn’t work so well. • Major financial crisis: poverty rose from 15.5% to 43.8% • However, model for 1994 predicts poverty in 2003 reasonably well • But little change in poverty over this time period • Broad Conclusion: assumption of stable parameters in roughly adjacent years seems reasonable. • If there is a major crisis (earthquake, macroeconomic, etc.) then caution is warranted.

3. Using repeated cross-sections to explore movements in and out of poverty (Lanjouw, Luoto and McKenzie, 2011) • Goal: • Explore whether repeated cross-sections which are widely available can be used to provide some reasonable, basic, descriptives of transitions in and out of poverty. • Set out methods which we claim will give upper and lower bounds on mobility. • Validate these methods by using genuine panel data from Vietnam and Indonesia, generating repeated cross-sections from these panels, and comparing the results of our method to what one would estimate based on the genuine panel.

Our proposed approach • Combines ideas of poverty-mapping with pseudo-panel ideas. • Will set out for case of 2 rounds, can be extended easily to multiple rounds. • Let $x_{i1}$ be characteristics of household $i$ in time period 1, which are observed in both the round 1 and round 2 surveys: • All time-invariant characteristics (language, religion, ethnicity) • Characteristics of household head if the head doesn’t change across rounds (sex, place of birth, parental education, etc.) • Can include time-varying characteristics that can easily be recalled for round 1 in round 2 • E.g. whether household head was employed in round 1, place of residence in round 1, whether household has a TV in round 1, etc.

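Projections • Project round 1 consumption or income onto $x_{i1}$: $y_{i1} = \beta_1' x_{i1} + \varepsilon_{i1}$ • Project round 2 consumption or income onto the same set of characteristics as they appear at the time of the second round: $y_{i2} = \beta_2' x_{i2} + \varepsilon_{i2}$ • Then we are interested in knowing quantities such as: $P(y_{i1} < p \text{ and } y_{i2} > p)$, where $p$ is the poverty line • Don’t observe $y_{i1}$ and $y_{i2}$ for the same household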

Proposed method • Step one: Use the sample of households observed in round 1, and regress $y_{i1}$ on $x_{i1}$ • Obtain the OLS estimator $\hat\beta_1$ and the residuals: $\hat\varepsilon_{i1} = y_{i1} - \hat\beta_1' x_{i1}$ • Step two: For each household observed in round 2, take a random draw with replacement, $\tilde\varepsilon_{i1}$, from the empirical distribution of residuals, then combine with parameter estimate and known $x$ to estimate round 1 income or consumption: $\hat y_{i1} = \hat\beta_1' x_{i2} + \tilde\varepsilon_{i1}$

Proposed method • Step Three: calculate movements into and out of poverty using $\hat y_{i1}$ in place of the unobserved round 1 variable $y_{i1}$, e.g. $P(\hat y_{i1} < p \text{ and } y_{i2} > p)$ • Step Four: Repeat steps 1-3 R times, and take average of the quantity of interest over the R replications.
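
The four steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: incomes are in logs, the quantity of interest is the share of households that were poor in round 1 and non-poor in round 2, and the two cross-sections are synthetic. The names `upper_bound_exit_rate`, `ols`, and the poverty line `p` are hypothetical.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def upper_bound_exit_rate(y1_a, X_a, y2_b, X_b, p, R=200, seed=0):
    """Share imputed as poor in round 1 and observed non-poor in round 2."""
    rng = np.random.default_rng(seed)
    # Step one: regress round 1 outcome on characteristics, keep residuals.
    # The fit is deterministic, so only the residual draws are repeated below.
    beta1 = ols(X_a, y1_a)
    resid1 = y1_a - X_a @ beta1
    shares = np.empty(R)
    for r in range(R):
        # Step two: draw residuals with replacement for round 2 households.
        eps_draw = rng.choice(resid1, size=len(y2_b), replace=True)
        y1_hat = X_b @ beta1 + eps_draw
        # Step three: poverty transition using the imputed round 1 value.
        shares[r] = np.mean((y1_hat < p) & (y2_b > p))
    # Step four: average over the R replications.
    return float(shares.mean())

# Hypothetical repeated cross-sections (different households in each round).
rng = np.random.default_rng(42)
n = 1000
X_a = np.column_stack([np.ones(n), rng.normal(size=n)])
X_b = np.column_stack([np.ones(n), rng.normal(size=n)])
y1_a = X_a @ np.array([4.0, 0.5]) + rng.normal(scale=0.5, size=n)
y2_b = X_b @ np.array([4.2, 0.5]) + rng.normal(scale=0.5, size=n)

exit_rate = upper_bound_exit_rate(y1_a, X_a, y2_b, X_b, p=4.0)
```

Because the residual is drawn independently of the round 2 outcome, the imputed round 1 value ignores any persistence in the error term, which is the source of the tendency to overstate mobility.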

Under what conditions will this be consistent? • Condition 1: the underlying population sampled is the same in round 1 and round 2 • Requires measure of consumption to be same from round to round, no (non-random) changes in underlying population from births, deaths, migration out of sample…as with pseudo-panels in general, household analysis works best when restricted to households headed by prime age adults

Under what conditions will this be consistent? • Condition 2: $\varepsilon_{i1}$ is independent of $y_{i2}$. This requires $\varepsilon_{i1}$ to be independent of $\varepsilon_{i2}$ (otherwise the distribution of $\varepsilon_{i1} \mid y_{i2} > p$ is not the same as the unconditional distribution of $\varepsilon_{i1}$) • Won’t hold if: • Error term contains individual fixed effect • Shocks to consumption or income are non-transitory. • Expect in many cases this condition to be violated. So long as errors are positively correlated (which seems likely in most cases), this will overstate mobility, providing an upper bound on movements into and out of poverty.

Lower bound method • Instead assume the prediction error for household $i$ in round 1 is the same as it is for round 2 (perfect positive autocorrelation). • Step One: for sample of households surveyed in round 2, obtain OLS residuals: $\hat\varepsilon_{i2} = y_{i2} - \hat\beta_2' x_{i2}$ • Step Two: then estimate round 1 income or consumption as $\hat y_{i1} = \hat\beta_1' x_{i2} + \hat\varepsilon_{i2}$ • Step Three: Use the estimated $y$ from step 2 to calculate poverty dynamic of interest.
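
A sketch of the lower bound follows. It assumes that the round 1 value is imputed by applying round 1 coefficients to round 2 characteristics and adding the household's own round 2 residual; that exact form, the function names, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def lower_bound_exit_rate(y1_a, X_a, y2_b, X_b, p):
    """Share imputed as poor in round 1 and observed non-poor in round 2."""
    beta1 = ols(X_a, y1_a)          # round 1 coefficients (round 1 sample)
    beta2 = ols(X_b, y2_b)          # round 2 coefficients (round 2 sample)
    # Step one: round 2 OLS residuals.
    resid2 = y2_b - X_b @ beta2
    # Step two: reuse each household's own round 2 residual for round 1
    # (assumes perfect positive autocorrelation of the errors).
    y1_hat = X_b @ beta1 + resid2
    # Step three: poverty transition using the imputed round 1 value.
    return float(np.mean((y1_hat < p) & (y2_b > p)))

# Hypothetical repeated cross-sections.
rng = np.random.default_rng(42)
n = 1000
X_a = np.column_stack([np.ones(n), rng.normal(size=n)])
X_b = np.column_stack([np.ones(n), rng.normal(size=n)])
y1_a = X_a @ np.array([4.0, 0.5]) + rng.normal(scale=0.5, size=n)
y2_b = X_b @ np.array([4.2, 0.5]) + rng.normal(scale=0.5, size=n)

exit_rate_lower = lower_bound_exit_rate(y1_a, X_a, y2_b, X_b, p=4.0)

# With identical rounds the imputed value reproduces the observed one,
# so no household changes status.
no_change = lower_bound_exit_rate(y2_b, X_b, y2_b, X_b, p=4.0)
```

With no randomness in the imputation there is nothing to average over, so no replications are needed here.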

Measurement error • Methods here aim to estimate same level of movements into and out of poverty as one would observe in genuine panel data. • Some of this mobility will be due to measurement error. A variety of fixes in literature (e.g. Glewwe, 2005; Antman and McKenzie, 2007; Fields et al. 2007) • Basic idea of these is to study mobility which is related to mobility in some underlying variable (e.g. health, cohort characteristics, assets) • Not the goal here: we want to just see if we can match panel.

Datasets • Choose two genuine panels from Vietnam and Indonesia: • VLSS 1992/93 and 1997/98 • Period over which poverty fell from 58% to 37%, more households exiting poverty than entering • Panel of approximately 4,800 households • Indonesian Family Life Survey 1997 and 2000 (IFLS2 and 3) • Static in terms of overall poverty levels, households moving into and out of poverty at similar rates • Panel of 7,500

Validation of method • Randomly split each genuine panel into two sub-samples, A and B. • Use sub-sample A from round 1 and sub-sample B from round 2 as two repeated cross-sections. • Then carry out our method by using sub-sample A to impute round 1 values for sub-sample B, and compare to results we would get using genuine panel for sub-sample B.
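
The split itself is simple to sketch. The example below builds a hypothetical synthetic panel (standing in for the real data, which are not reproduced here), splits it at random into sub-samples A and B, forms the two artificial cross-sections, and computes the benchmark transition rate from the genuine panel for sub-sample B. All names and parameter values are illustrative.

```python
import numpy as np

def split_panel(n_households, seed=0):
    """Randomly split household indices into two disjoint halves A and B."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_households)
    half = n_households // 2
    return order[:half], order[half:]

def panel_exit_rate(y1, y2, p):
    """Share poor in round 1 and non-poor in round 2, from true panel data."""
    return float(np.mean((y1 < p) & (y2 > p)))

# Hypothetical panel with a persistent household effect in both rounds.
rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)
persistent = rng.normal(scale=0.3, size=n)
y1 = 4.0 + 0.5 * x + persistent + rng.normal(scale=0.3, size=n)
y2 = 4.2 + 0.5 * x + persistent + rng.normal(scale=0.3, size=n)

idx_a, idx_b = split_panel(n)
cross_section_1 = (y1[idx_a], x[idx_a])   # round 1, sub-sample A only
cross_section_2 = (y2[idx_b], x[idx_b])   # round 2, sub-sample B only

# Benchmark the bounds should bracket: true transitions within sub-sample B.
benchmark = panel_exit_rate(y1[idx_b], y2[idx_b], p=4.0)
```

Keeping A and B disjoint matters: no household's round 1 outcome is used in its own imputation, which mimics what genuine repeated cross-sections would provide.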

Choosing variables • Consider a hierarchy of models which progressively employ more and more data that is sometimes, but not always, collected retrospectively. • Since we have the actual panel data to work with, we can force variables to be time-invariant by using round 1 variables. • Start with a basic “traditional model”, and add more regressors.
