
Do We Still Need Probability Sampling in Surveys?


Presentation Transcript


  1. Do We Still Need Probability Sampling in Surveys? Robert M. Groves University of Michigan and Joint Program in Survey Methodology, USA

  2. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  3. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  4. The Ingredients of Scientific Surveys • A target population • A sampling frame • A sample design and selection • A set of target constructs • A measurement process • Statistical estimation

  5. Deming (1944) “On Errors in Surveys” • American Sociological Review! • First listing of sources of problems, beyond sampling, facing surveys

  6. Comments on Deming (1944) • Includes nonresponse, sampling, interviewer effects, mode effects, various other measurement errors, and processing errors • Includes nonstatistical notions (auspices) • Includes estimation step errors (wrong weighting) • Omits coverage errors • “total survey error” not used as a term

  7. Sampling Text Treatment of Total Survey Error • Kish, Survey Sampling, 1965 • 65 of 643 pages on various errors, with specified relationship among errors • Graphic on biases

  8. [Kish’s graphic on biases, reconstructed from the flattened figure: sampling biases (frame biases, “consistent” sampling bias, constant statistical bias) vs. nonsampling biases, with nonsampling biases split into nonobservation (noncoverage, nonresponse) and observation (field: data collection; office: processing)]

  9. Total Survey Error (1979), Anderson, Kasper, Frankel, and Associates • Empirical studies on nonresponse, measurement, and processing errors for health survey data • An initial total survey error framework with a more elaborated nested structure

  10. [Anderson et al.’s nested error structure, reconstructed from the flattened figure: Total Error splits into Variable Error and Bias; Variable Error into sampling and nonsampling (field, processing); Bias into sampling (frame, consistent) and nonsampling, the latter split into nonobservation (noncoverage, nonresponse) and observation (field, processing)]

  11. Survey Errors and Survey Costs (1989), Groves • Attempts conceptual linkages between the total survey error framework and • psychometric true score theories • econometric measurement error and selection bias notions • Ignores processing error • Highest conceptual break on variance vs. bias • Second conceptual break on errors of nonobservation vs. errors of observation

  12. [Groves’s mean square error tree, reconstructed from the flattened figure: Mean Square Error splits into Variance and Bias; each splits into errors of nonobservation (coverage, nonresponse, sampling) and observational errors (interviewer, respondent, instrument, mode); psychometric notions overlay the tree: construct validity (theoretical and empirical validity), reliability, and criterion validity (predictive and concurrent validity)]
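
As a reference point, the decomposition this tree organizes is the standard one (not spelled out on the slide):

```latex
\mathrm{MSE}(\hat{\theta}) \;=\; \mathrm{Var}(\hat{\theta}) + \bigl[\mathrm{Bias}(\hat{\theta})\bigr]^{2}
```

Each error source in the tree contributes to the variance term, to the squared-bias term, or to both.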

  13. Nonsampling Error in Surveys (1992), Lessler and Kalsbeek • Evokes “total survey design” more than total survey error • Omits processing error

  14. Introduction to Survey Quality (2003), Biemer and Lyberg • Major division of sampling and nonsampling error • Adds “specification error” (à la “construct validity”) • Formally discusses process quality • Discusses “fitness for use” as quality definition

  15. Survey Methodology (2004), Groves, Fowler, Couper, Lepkowski, Singer, Tourangeau • Notes twin inferential processes in surveys • from a datum reported to the given construct of a sampled unit • from an estimate based on respondents to the target population parameter • Links inferential steps to error sources

  16. [The total survey error paradigm diagram, reconstructed from the flattened figure. Measurement side: construct → (validity) → measurement → (measurement error) → response → (processing error) → edited data. Representation side: inferential/target population → (coverage error) → sampling frame → (sampling error) → sample → (nonresponse error) → respondents. The two sides combine in the survey statistic.]

  17. Summary of the Evolution of “Total Survey Error” • Roots in cautioning against sole attention to sampling error • Framework contains statistical and nonstatistical notions • Most statistical attention on variance components, chiefly measurement error variance • Late 1970s attention to “total survey design” • 1980s-1990s attempt to import psychometric notions • Key omissions in research

  18. 5 Myths of Survey Practice that TSE Debunks • “Nonresponse rates are everything” • “Nonresponse rates don’t matter” • “Give as many cases to the good interviewers as they can work” • “Postsurvey adjustments eliminate nonresponse error” • “Usual standard errors reflect all sources of instability in estimates” (measurement error variance, interviewer variance, etc.; see the illustration below)
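
On the last myth, a textbook illustration (Kish’s interviewer design effect, not taken from the slide) of instability that the usual standard errors ignore:

```latex
\mathrm{deff}_{\mathrm{int}} \;=\; 1 + \rho_{\mathrm{int}}\,(\bar{m} - 1)
```

Here \rho_{\mathrm{int}} is the intra-interviewer correlation and \bar{m} the average interviewer workload; even \rho_{\mathrm{int}} = 0.01 with workloads of 51 inflates the variance of a mean by about 50%.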

  19. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  20. Response Rates • In most rich countries, response rates on household and organizational surveys are declining • de Leeuw and de Heer (2002) model a 2 percentage point decline per year • Probability-sampling inference escapes nonresponse bias only at a 100% response rate

  21. Recent studies challenge a simple link between response rates and nonresponse error • Reading Keeter et al. (2000), Curtin et al. (2000), and Merkle and Edelman (2002) suggests response rates don’t matter • Standard practice urges maximizing response rates • What’s a practitioner to do?

  22. Mismatches between Statistical Expressions for Nonresponse Error and Practice
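
The expressions themselves did not survive the transcript. Under the deterministic view that standard treatments begin with, the nonresponse bias of an unadjusted respondent mean is:

```latex
\bar{y}_r - \bar{y}_n \;=\; \frac{m}{n}\,\bigl(\bar{y}_r - \bar{y}_m\bigr)
```

where, of the n sampled cases, r respond with mean \bar{y}_r and m = n - r do not, with unobserved mean \bar{y}_m. The mismatch with practice: only the nonresponse rate m/n is routinely reported, yet the bias depends equally on the respondent-nonrespondent gap.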

  23. What does the Stochastic View of Response Propensity Imply? • Key issue is whether the influences on survey participation are shared with the influences on the survey variables • Increased nonresponse rates do not necessarily imply increased nonresponse error • Hence, investigations are necessary to discover whether the estimates of interest might be subject to nonresponse errors
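
A compact statement of this point, in the form due to Bethlehem (1988) and used by Groves (2006): if each sampled person i has response propensity \rho_i, then approximately

```latex
\mathrm{Bias}(\bar{y}_r) \;\approx\; \frac{\mathrm{Cov}(\rho_i,\, y_i)}{\bar{\rho}}
```

so the bias is driven by the covariance between propensities and the survey variable y, not by the mean propensity \bar{\rho} alone; a low response rate with zero covariance produces no bias.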

  24. Assembly of Prior Studies of Nonresponse Bias • Search of peer-reviewed and other publications • 47 articles reporting 59 studies • 959 separate estimates (566 percentages) • mean nonresponse rate is 36% • mean bias is 8% of the full sample estimate • We treat these as 959 observations, weighted by sample sizes, multiply imputed for item-missing data, with standard errors reflecting clustering into the 59 studies and imputation variance
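
Assuming the conventional definition, the outcome metric in the next two slides is, for each estimate,

```latex
\%\ \text{absolute relative bias} \;=\; 100 \times \frac{\lvert \bar{y}_r - \bar{y}_n \rvert}{\lvert \bar{y}_n \rvert}
```

where \bar{y}_r is the respondent-based estimate and \bar{y}_n the estimate from the full sample.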

  25. Percentage Absolute Relative Bias

  26. Percentage Absolute Relative Nonresponse Bias by Nonresponse Rate for 959 Estimates from 59 Studies

  27. 1. Nonresponse Bias Happens

  28. 2. Large Variation in Nonresponse Bias Across Estimates Within the Same Survey, or equivalently:

  29. 3. The Nonresponse Rate of a Survey is a Poor Predictor of the Bias of its Various Estimates (naïve OLS, R² = .04)

  30. Conclusions • It’s not that nonresponse error doesn’t exist • It’s that nonresponse rates aren’t good predictors of nonresponse error • We need auxiliary variables to help us gauge nonresponse error

  31. A Practical Question “What attraction does a probability sample have for representing a target population if its nonresponse rate is very high and its respondent count is lower than equally costly nonprobability surveys?”

  32. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  33. A “Solution” to Response Rate Woes • Web surveys offer a very different cost structure than telephone and face-to-face surveys • Almost all fixed costs • Very fast data collection • But there is no sampling frame • Often probability sampling from large volunteer groups • Internet access varies across and within countries

  34. Access/Volunteer Internet Panels • Massive change in US commercial survey practice, moving from telephone and mail paper questionnaires to web surveys • Survey Sampling, a major supplier of telephone samples over the past two decades, now reports that 80% of its business is web panel samples • Some businesses do only web survey measurement

  35. The Method • Recruitment of email IDs from internet users • At the survey organization’s web site • Through pop-ups or banners on others’ sites • Through third-party vendors • A June 15, 2008, Google search of “make money doing surveys” yields 19,300 hits • “make $10 in 5 minutes” www.SurveyMonster.com

  36. There is a new industry • Greenfield Online • Survey Sampling • e-Rewards • Lightspeed • ePocrates • Knowledge Networks • Private company panels • Proprietary panels (Baker, 2008; Inside Research, 2007)

  37. Reward Systems Vary • Payment per survey • Points per survey, yielding eligibility for rewards • Points for sweepstakes

  38. Adjustment in Estimation • Estimation usually involves adjustment to some population totals • Some firms have propensity model-based adjustments • “proprietary estimation systems” abound
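
A minimal sketch of what one propensity model-based adjustment can look like: weight volunteer-panel cases by their inverse odds of panel membership, estimated against a probability-based reference sample. The data, covariate names, and simple logistic model below are hypothetical stand-ins, not any firm’s proprietary system.

```python
# Propensity model-based adjustment for a volunteer web panel (sketch).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def fake_sample(n, web_mean):
    # Synthetic stand-in for real survey data.
    return pd.DataFrame({
        "age": rng.integers(18, 80, n),
        "educ_years": rng.integers(8, 21, n),
        "daily_web_hours": rng.normal(web_mean, 1.0, n).clip(min=0),
        "intends_to_buy": rng.integers(0, 2, n),
    })

# Probability-based reference sample vs. a heavier-web-use volunteer panel.
reference = fake_sample(1000, web_mean=2.0).assign(in_panel=0)
panel = fake_sample(1000, web_mean=4.0).assign(in_panel=1)
covariates = ["age", "educ_years", "daily_web_hours"]

# Model the propensity to appear in the panel rather than the reference.
combined = pd.concat([reference, panel], ignore_index=True)
model = LogisticRegression(max_iter=1000)
model.fit(combined[covariates], combined["in_panel"])

# Weight panel cases by the inverse odds of membership, so panelists who
# resemble hard-to-recruit reference cases count more in estimation.
p = model.predict_proba(panel[covariates])[:, 1]
weights = (1.0 - p) / p

adjusted = np.average(panel["intends_to_buy"], weights=weights)
print(f"unadjusted {panel['intends_to_buy'].mean():.3f}, adjusted {adjusted:.3f}")
```

The adjustment can only correct for covariates observed on both samples; variables related to volunteering but unmeasured leave the estimate biased, which is why such systems remain contentious.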

  39. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  40. September 2007 Respondent Quality Summit • Head of Procter & Gamble market research • Cites comScore: 0.25% of internet users are responsible for 30% of responses to internet panels • Cites an average of 5-8 panel memberships per respondent • Presents examples of failure to predict behaviors

  41. The number of surveys taken matters (Coen et al., 2005, in Baker, 2008).

  42. The Practical Indicators of “Quality” • Cheating on qualifying questions • Internal inconsistencies • Overly fast completion • “Straightlining” in grids • Gibberish or duplicated open-end responses • Failure of “verification” items in grids • Selection of bogus or low-probability answers • Non-comparability of results with non-panel samples (Baker, 2008; two of these checks are sketched below)
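
Two of these indicators are easy to make concrete. A minimal sketch on a hypothetical respondent-by-item data frame; the column names and the speed cutoff are invented for illustration:

```python
# Flag speeders and grid straightliners in panel survey responses (sketch).
import pandas as pd

grid_items = ["q1", "q2", "q3", "q4", "q5"]  # hypothetical grid columns
responses = pd.DataFrame({
    "resp_id": [1, 2, 3],
    "minutes": [12.0, 1.5, 9.0],             # completion times
    "q1": [3, 4, 2], "q2": [1, 4, 2], "q3": [5, 4, 2],
    "q4": [2, 4, 3], "q5": [4, 4, 2],
})

# Overly fast completion: flag the fastest tail (the cutoff is a judgment call).
speed_cutoff = responses["minutes"].quantile(0.05)
responses["speeder"] = responses["minutes"] <= speed_cutoff

# Straightlining: identical answers across every item in a grid.
responses["straightliner"] = responses[grid_items].nunique(axis=1) == 1

print(responses[["resp_id", "speeder", "straightliner"]])
```

Real panels combine many such flags, since any single check is easy for a determined respondent to evade.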

  43. Panel response rates are declining as panelists take more surveys (MSI, 2005, in Baker, 2008).

  44. Where are we now? • An industry in turmoil • Active study of correlates of low quality conducted by sophisticated clients • Professional associations attempting to define quality indicators

  45. Outline • The total survey error paradigm in scientific surveys • The decline in survey participation • The rise of internet panels • The “second era” of internet panels • So... do we need probability sampling?

  46. Access Panels and Inference • Access panels have conjoined frame development and sample selection • Without documentation of the frame development, assessment of coverage properties is not tractable • Many use probability sampling from the volunteer set, but ignore this in estimation
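
Why the assessment is intractable can be seen from the standard coverage-bias identity (a textbook result, not on the slide):

```latex
\bar{Y}_C - \bar{Y} \;=\; \frac{N_U}{N}\,\bigl(\bar{Y}_C - \bar{Y}_U\bigr)
```

where C indexes the N_C frame-covered population units and U the N_U = N - N_C noncovered units. Without documentation of how the frame was assembled, neither the noncoverage rate N_U/N nor the covered-noncovered gap can be bounded.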
