This presentation highlights significant policy issues in small area estimation and addresses methodological challenges. Key topics include quasi-Bayesian estimation, the impact of jurisdictional classes, and the intricacies of voting rights tabulation. Attention is given to the importance of transparency in methodology and the value of developing heuristic explanations of estimation components. The presentation explores the role of sampling variance, the use of multilevel structures, and the need for generic methods that adapt to various data types. Microdata synthesis techniques may yield consistency across estimates, better informing policy decisions.
Comments: The Big Picture for Small Areas. Alan M. Zaslavsky, Harvard Medical School
Thanks to presenters • 3 interesting talks • Raise significant policy issues
Voting rights tabulation • Generic approach for beta-binomial modeling • Shrinkage calculations (R. Little) • Approach to quasi-Bayesian estimation for clustered survey data (D. Malec) • Why jurisdictional classes rather than prior centered on prediction? • Use of classes predictably biases up or down just above or below class boundary. • Problem of discreteness/thresholds
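A minimal sketch of the kind of beta-binomial shrinkage calculation mentioned above, assuming a common Beta(a, b) prior fit crudely by the method of moments across jurisdictions. The function name, the moment-matching step, and the example counts are illustrative assumptions, not the presenters' method:

```python
import numpy as np

def beta_binomial_shrinkage(y, n):
    """Shrink direct proportions y/n toward a common beta prior fit by a
    crude method of moments (illustrative sketch, not the presenters' code)."""
    y, n = np.asarray(y, float), np.asarray(n, float)
    p = y / n
    m, v = p.mean(), p.var(ddof=1)            # moments of the direct estimates
    # crude moment match for Beta(a, b); guard against a degenerate variance
    common = max(m * (1 - m) / v - 1, 1e-6)
    a, b = m * common, (1 - m) * common
    # posterior mean for each jurisdiction: (a + y_i) / (a + b + n_i)
    return (a + y) / (a + b + n)

# example: small jurisdictions with noisy direct percentages
print(beta_binomial_shrinkage([3, 40, 0], [50, 400, 20]))
```

Small jurisdictions (small n_i) are pulled most strongly toward the prior mean, which is the shrinkage behavior at issue in the classed-prior discussion.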
Voting rights tabulation • How ‘general purpose’ is the product? • Inference for a point estimate of the % vs. inference for P(>5%) • Presentation of results • Bayes methods → posterior distributions • Present results for multiple inferences? • SAE of aggregates ≠ aggregate of SAEs • Perils of thresholds/discreteness
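The contrast between a point estimate of the percentage and an inference for P(>5%) is easy to illustrate with a posterior distribution; the sketch below assumes an illustrative Beta posterior with hypothetical parameters, not the estimates from the talk:

```python
from scipy.stats import beta

# hypothetical Beta posterior for one jurisdiction's language-minority share
a_post, b_post = 26.0, 480.0

point_estimate = a_post / (a_post + b_post)     # posterior mean, about 5.1%
prob_over_5 = beta.sf(0.05, a_post, b_post)     # P(share > 5%)

print(f"posterior mean = {point_estimate:.3f}, P(>5%) = {prob_over_5:.2f}")
# Two jurisdictions with nearly identical posterior means can land on opposite
# sides of the 5% threshold in P(>5%) terms: the discreteness/threshold peril.
```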
“Context specificity” • What does it add beyond predictive variance? • Is model error worse than sampling error – why? • Might be better understood as a measure of model-robustness. • Might not have an unambiguous definition • In the lead example, should the precision of NHIS or BRFSS data define ‘specificity’? (The NHIS-BRFSS association is a model estimate.) • Depends on which inference: estimates of absolute levels are sensitive to calibration; estimates of differences/rankings among areas are unaffected by calibration
“Context specificity” • Highlights value of transparency of methodology • Develop heuristic explanations of components contributing to estimation and their ‘weights’ • “For estimation of XXX … • “Total (predictive) SE is … • “XX% from sampling in BRFSS … • “YY% from estimation of NHIS calibration model… • “ZZ% from model error of covariate model…”
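The reporting format sketched above amounts to decomposing the total predictive variance into named components and presenting each component's share. A minimal bookkeeping sketch, assuming the component variances have already been estimated on a common scale (the component names and numbers below are hypothetical):

```python
# hypothetical variance components on the same scale (illustrative numbers)
components = {
    "sampling in BRFSS": 0.0009,
    "estimation of NHIS calibration model": 0.0004,
    "model error of covariate model": 0.0002,
}

total_var = sum(components.values())
total_se = total_var ** 0.5
print(f"Total (predictive) SE: {total_se:.4f}")
for name, var in components.items():
    print(f"  {100 * var / total_var:.0f}% from {name}")
```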
Outcome screening • Prioritizing a more global SAE program • Technical concerns • Do methods properly account for sampling variance of domain proportions? • In this 2-level model, why use ad hoc methods for level-2 variance estimation? • Strategic concerns • Consider costs &amp; benefits as well as variances • Posterior ranking ∈ {overkill}? • Consider families of outcomes, not just individual outcomes • e.g. 12 binomial variables, likely related, for the same Asian population
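The level-2 variance concern can be made concrete with an area-level two-level model of the Fay–Herriot type, where the between-area variance must be separated from known sampling variances. The sketch below is a simple method-of-moments version under that assumed model, not necessarily the estimator used in the talk; the data are toy numbers:

```python
import numpy as np

def level2_variance_mom(theta_hat, v_i):
    """Method-of-moments estimate of the level-2 (between-area) variance in a
    Fay-Herriot-style model theta_hat_i = mu + u_i + e_i with known sampling
    variances v_i (illustrative sketch, not the talk's method)."""
    theta_hat, v_i = np.asarray(theta_hat, float), np.asarray(v_i, float)
    mu = theta_hat.mean()
    # subtract the average sampling variance from the total between-area spread
    sigma2_u = np.var(theta_hat, ddof=1) - v_i.mean()
    return max(sigma2_u, 0.0), mu    # truncate at zero, as simple MoM estimators do

theta_hat = [0.12, 0.08, 0.15, 0.05, 0.10]      # direct domain proportions
v_i = [0.0006, 0.0010, 0.0008, 0.0015, 0.0005]  # their sampling variances
print(level2_variance_mom(theta_hat, v_i))
```

Ignoring the v_i term here would attribute all between-area spread to the level-2 variance, which is exactly the "properly account for sampling variance" issue raised above.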
Current state of SAE • Typically one variable or a few closely related • Relationships only as explicitly selected for models • Not higher-order interactions • Each major SAE a major project • High-level statistical expertise involved • Takes a long time • Lack of fully generic methods • (… although principles fairly well established) • Depends on amount & structure of available data, distributions & relationships, etc. • Often new methods required for each project
Path that extends current methods • More estimation projects • Elaborate more generic methods • Adapt to various data structures • More use of multilevel structure • Still univariate or low-dimensional • OK for many… • single-purpose surveys • health care applications (“profiling”)
Some goals for general-purpose surveys • Generate SAE for all current products • Detailed cross-tabulations • Microdata • Plausible (not “correct”) for all relationships • Valid presentation of uncertainty • Consistency of all products • Margins and aggregation of estimates
What might this look like? • Almost certainly requires some form of microdata synthesis • Yields consistency • Units that look ‘enough’ like real units • Two approaches • “Bottom up” synthesis of units (persons, households) • “Top down” imposition of constraints on synthetic samples of real units
Advantages of ‘top-down’ approach • Building from observed units makes high-order interactions realistic • Otherwise these are the most difficult to model • Impose constraints via weighting or constrained resampling • Weighting is like predictive mean estimation; its properties are more readily controllable • Constraints may come from direct estimates, SAE, or purely predictive estimates • Uncertainty via stochastic prediction of constraints and multiple imputation (MI)
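One standard way to impose marginal constraints through weighting is raking (iterative proportional fitting) of the observed units' weights toward control totals, which could themselves be direct, SAE, or purely predictive estimates. The sketch below is a toy illustration of that idea, with hypothetical variable names and totals, not the method used in the cited applications:

```python
import numpy as np

def rake(weights, groups_a, groups_b, targets_a, targets_b, iters=50):
    """Adjust unit weights so weighted totals match two sets of marginal
    controls (simple raking sketch; real applications add convergence checks
    and may use constrained resampling instead of reweighting)."""
    w = np.asarray(weights, float).copy()
    for _ in range(iters):
        for groups, targets in ((groups_a, targets_a), (groups_b, targets_b)):
            for g, target in targets.items():
                mask = (groups == g)
                w[mask] *= target / w[mask].sum()
    return w

# toy example: 6 sample units, controls on age group and housing tenure
age = np.array(["young", "young", "old", "old", "old", "young"])
tenure = np.array(["own", "rent", "own", "own", "rent", "rent"])
w0 = np.ones(6) * 100.0
w = rake(w0, age, tenure,
         {"young": 320.0, "old": 280.0},
         {"own": 350.0, "rent": 250.0})
print(w.round(1), w.sum())
```

Because the reweighted records are real observed units, their high-order interactions are carried along for free, which is the advantage claimed for the top-down approach.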
Previous applications • Reweighting/Imputation of households for census undercount (Zaslavsky 1988, 1989) • Reweighting for food stamp microsimulations • “Large numbers of estimates for small areas” (Schirm & Zaslavsky 1997-2002) • High-order interactions crucial to simulation of program provisions • Reweight national CPS data to simulate each state in turn (direct and SAE controls)
Synthesis • Work will proceed on many fronts • Develop and integrate new data sources • Targeted SAE projects responsive to needs • Advances in dissemination & explication • Integrate improvements in SAE for marginal (single-variable) estimates into overall synthetic framework.