The Probability Sampling Tradition in a period of crisis. Q2010 Keynote speech Carl-Erik Särndal Université de Montréal. The Probability Sampling Tradition has governed surveys at National Statistical Institutes (NSI:s) for decades
The Probability Sampling Tradition in a period of crisis Q2010 Keynote speech Carl-Erik Särndal Université de Montréal
The Probability Sampling Tradition has governed surveys at National Statistical Institutes (NSI:s) for decades Breaking a tradition : Not easy …
Background The merits of probability sampling, also known as scientific sampling, are called into question by severe imperfections : non-sampling errors, economic pressures etc. The problem is not new – but it is more and more compelling
Background The probability sampling process • is expensive (through follow-ups); • its theoretical merits are compromised (by nonresponse, etc.) • “a few extra %” amount to very little • alternative data collection methods exist Yet probability sampling continues to be practiced. Wasteful ? Can we do without probability sampling?
My view is a (Canadian) theoretician’s view on (official) statistics production To what extent guided by (statistical science) theory ?
Theory as a basis for science (knowledge) Something we admire: Being able to predict facts about the world we live in by theoretical arguments and deduction This is the predictive power of science In statistics: We want precise statements, backed by convincing theory, of the level of unemployment, of industrial production, and so on
Theory as a basis for science Gérard Jorland : How is it possible that one can predict, merely by theoretical deductions, the existence of a new planet, or a new chemical element, or a new elementary particle? Based only on a calculus, on a set of mathematical equations ... remarkable achievement of the human mind.
Famous example: Planet Neptune was “found” by mathematical prediction by Le Verrier 1846, then empirically observed by Galle, at the position given by Le Verrier Many other examples come from physics, astronomy, chemistry
A hypothesis to test: The sciences are predictive to the extent that they are mathematically formulated. But that hypothesis is rejected : Today, Economics is highly mathematical and theoretical, but such arguments did not predict the current economic crisis, for example.
The contrast Physics: Predictive power of formal theory very high Economics: Predictive power of formal theory low
So “science formulated mathematically” does not guarantee “predictive power of theory” Why then are Physics and Economics different? Both are theoretical (mathematical) .
Contrasts Physics : the objects (planets, elementary particles, and so on) are inanimate ; predictive power very high Economics : the objects and the participants (human beings) are unpredictable, relationships highly complex; predictive power very low
Theory as a guide in statistics production Our ambition : Create knowledge (predictions) about our world through statistical surveys . To what extent is this activity supported by theory ? To what extent scientific ? Legitimate questions ! Some NSI:s take pride in “scientific principles”.
Sampling = Limiting attention to a small subset To what extent scientific ? We accept without hesitation that observing only n = 1,000 (or a few thousand) is enough - but provided the sample is “scientific”
What is a scientific sample ? Roper Centre, Univ. of Connecticut, says : A scientific sample is a process in which respondents are chosen randomly by one of several methods. The key component in the scientific sample is that everyone within the designated group (sample frame) has a chance of being selected. We may add : Such a sample is also known as a probability sample. It is not necessarily a representative sample in the sense “all have the same probability”.
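The point that a probability sample need not give everyone the same selection probability can be illustrated with a small sketch. It assumes the standard Horvitz-Thompson idea of weighting each sampled value by the inverse of its known inclusion probability; the function name, the firms and the figures are hypothetical and do not come from the talk.

```python
def horvitz_thompson_total(sample_values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is divided by its known, non-zero inclusion
    probability, so units that were unlikely to be selected count for more.
    """
    return sum(y / pi for y, pi in zip(sample_values, inclusion_probs))


# Hypothetical business survey: two large firms are always included
# (probability 1.0); small firms are sampled with probability 0.1.
turnover = [500.0, 300.0, 10.0, 20.0]
probs = [1.0, 1.0, 0.1, 0.1]

print(horvitz_thompson_total(turnover, probs))
```

The sample is unequal-probability, hence not "representative" in the equal-chance sense, yet the known probabilities are what make the estimate defensible.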
scientific sample, probability sample, representative sample : around these terms, unfortunate ambiguity and confusion reign in the literature and in conversation. Ask, and you get a variety of responses
Sampling = Limiting attention to a small subset Two contrasting examples: Sampling trees in a forest - to predict volume Sampling human beings in a country - to predict (assess) unemployment, or health conditions, or expenditures
Estimating volume of wood on a sample of trees With classical probability sampling theory, we get not only a figure for the total volume of wood in the forest, but also a statement of its margin of error, free of any assumptions. We can determine exactly the accuracy we want.
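As a minimal sketch of what "a figure plus its margin of error" can look like, the following assumes simple random sampling without replacement and the textbook variance formula for an estimated total. The forest, the tree volumes and the function name are invented for illustration; the interval also relies on a normal approximation (z = 1.96), which is an assumption added here.

```python
import math
import random


def srs_total_estimate(sample_values, population_size, z=1.96):
    """Estimate a population total from a simple random sample.

    Returns (estimate, margin_of_error). The margin uses the
    finite-population variance N^2 * (1 - n/N) * s^2 / n; z = 1.96
    gives an approximate 95% interval.
    """
    n = len(sample_values)
    mean = sum(sample_values) / n
    # Sample variance with the n - 1 divisor.
    s2 = sum((y - mean) ** 2 for y in sample_values) / (n - 1)
    var_total = population_size ** 2 * (1 - n / population_size) * s2 / n
    return population_size * mean, z * math.sqrt(var_total)


# Hypothetical forest: 10,000 trees with made-up volumes in cubic metres.
rng = random.Random(2010)
forest = [rng.uniform(0.2, 2.0) for _ in range(10_000)]
sample = rng.sample(forest, 400)

estimate, margin = srs_total_estimate(sample, len(forest))
print(f"estimated total volume: {estimate:.0f} m3, margin of error: {margin:.0f} m3")
```

Because the margin shrinks as n grows, the sample size can be chosen in advance to reach a desired accuracy, which is the sense in which the accuracy can be determined.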
Estimating the number of unemployed from a sample of people We get a figure from the Labour Force Survey (LFS), but we cannot quantify its margin of error. There is no objective declaration of numerical quality because several errors go unmeasured : nonresponse error, measurement error, frame error, recording and data handling error, and so on
The contrast Trees are inanimate objects, like planets Human beings, they are precisely that, human, inconsistent, emotional, prone to error
The contrast Trees : Predictive power of probability sampling theory very high – objects do not “cause trouble” People : Predictive power of sampling theory very low - the survey is complex; human beings are involved
A large scale statistical investigation (survey) : “Unpredictable people are involved at so many points of this incredibly complex process” so we will never have a theory that will allow precise measurement of total survey error (Stanley McCarthy 2001) Producing numbers is (relatively) easy ; by comparison, stating their accuracy is difficult
Article by Platek and Särndal : Can a statistician deliver ? J. Official Statistics vol. 17 (2001), pp. 1 – 127 with 16 discussions and a rejoinder by the authors Can a statistician fulfill the promise (to society) ? Upon rereading : Have we advanced any, in 10 years ?
The title : Can a statistician deliver ? “Statistician” may denote the head of a National Statistical Institute (NSI) or a person expert in the subject (labour market, or health issues, or manufacturing industry, etc.) or a person trained in statistical science (methodologist)
As expected, feelings conveyed were of two kinds: high ranking NSI officials: “Keep the ship sailing”, despite difficult times academics and researchers: Regret the absence of a more solid (theoretical) base for (national) statistics production
Three themes are prominent in the 16 discussions (summarized in the authors’ rejoinder) : The role of theory The scientific and professional credo of the NSI The concept of quality in regard to the NSI’s activity
The uncertain future of the NSI I. Fellegi (Statistics Canada) on survival of the NSI. “Survival beyond quality” depends on • Respect for respondents, and • Credibility of information; Accuracy is an important part, but so are Relevance, Transparency & others
The uncertain future of the NSI I. Fellegi : A life and death question for the NSI is credibility : Information that is not believed will not be used, and the NSI has no function any more. Can the NSI count on future high co-operation and truthful response ? - More and more doubtful.
Believing numerical information We have no objective measures of “margin of error” But what about the Total Survey Error model ? (US Bureau of the Census, around 1950) It recognizes total error as a sum of a number of components. Can we not use these equations, this theory ?
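One common way to write "total error as a sum of components" is sketched below. The notation is an assumption made here for illustration, not the Census Bureau's original formulation: the estimate is written as Ŷ, the true value as Y, and the component names follow the error sources listed earlier in the talk.

```latex
\hat{Y} - Y = e_{\text{sampling}} + e_{\text{nonresponse}} + e_{\text{measurement}}
            + e_{\text{frame}} + e_{\text{processing}},
\qquad
\mathrm{MSE}(\hat{Y}) = \mathrm{Var}(\hat{Y}) + \bigl[\mathrm{Bias}(\hat{Y})\bigr]^{2}
```

In such a decomposition only the sampling term has a routinely computable variance; the remaining terms are the ones that go unmeasured in practice.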
Believing information The Total Survey Error model • helped us to focus on specific components of total error • disappointed us by failing to provide routine measures for the numerical quality of published statistics.
Believing information Discussants of Can a statistician deliver ? deliver “a death sentence” on the TSE model : “Unattainable and unrealistic ideal” “Utopian project” “Unrealistic utopian dream” Theory is there, but it does not work Some say: We choose not to use it In question are the notions of “probability” and “probable error”
Statistics Canada Quality Guidelines (1998) describes Survey Methodology as : “A collection of practices, backed by some theory and empirical evaluation, among which practitioners have to make sensible choices in the context of a particular application” A patchwork of theories, one for questionnaire design, one for motivating response, one for data handling and editing, one for imputation, one for estimation in small areas, and so on Fragmentation …
European Statistics Code of Practice (2005) Sound methodology must underpin quality statistics. This requires adequate tools, procedures and expertise. The overall methodological framework of the statistical authority follows European and other international standards, guidelines, and good practices ... Survey designs, sample selections, and sample weights are well based and regularly reviewed, revised or updated … (Emphasis is mine.) A “be-good” encouragement; what about “scientific underpinnings” ?
The stark reality “Good practice” is the guide, not theory . Numerical quality is not assured . Large errors probably not infrequent; most go undetected . So what ? - Other important professions are also guided by a bunch of “good practices”
The NSI’s situation Its work is guided by “a collection of practices supported by some theory” plus a requirement to keep response burden low With this frail and fragmented base, the NSI must produce reliable Official Statistics, for the good of the nation, a solid basis for policy decisions Not an enviable situation and a threat to the NSI’s existence…
The Probability Sampling Tradition (born in 1930’s) created the concept of Nonresponse Rate : “the selected objects” (the probability sample) as compared with “the data delivering objects” (the respondents) We measure, steadfastly, sometimes misguidedly, the size ratio of those two sets
Our obsession with the Nonresponse Rate When the NR rate was 2%, nobody worried Now that the NR rate is around 50%, we worry • Intuitively because nonresponse may be systematically related to target variable values • Probabilistically because “making the observation” (getting the response) has an unknown probability; the theory capsizes
The believers in Probability Sampling regret that the theory cannot cope The non-believers : Why worry about the NR rate ? Just collect some reasonably good data from a reasonably representative set of objects.
Our obsession with Nonresponse Rates Why not (in the manner of some private survey institutes) just get data from “a reasonably representative set of co-operative objects”, and not bother with this stifling concept of the Nonresponse Rate ? It is time that NSI:s deliver a strong endorsement of the Probability Sampling Tradition – if this is what they really believe in; otherwise, act accordingly
Our obsession with Nonresponse Rates The NR rate itself is a poor indicator of NR bias, of “accuracy of estimates” See, for example, Groves (2006), Schouten (2009), Särndal and Lundström (2008)
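A small sketch can show why the NR rate alone says little about bias. It assumes a simple response-propensity model in which each unit responds with its own probability, and it approximates the expected respondent mean by a propensity-weighted mean. The population, the propensities and the function names are hypothetical.

```python
def response_rate(propensities):
    """Expected share of units that respond."""
    return sum(propensities) / len(propensities)


def expected_respondent_mean(values, propensities):
    """Approximate expected mean among respondents.

    Unit i responds with probability propensities[i], so respondents are,
    in expectation, a propensity-weighted version of the population.
    """
    weighted = sum(p * y for p, y in zip(propensities, values))
    return weighted / sum(propensities)


# Hypothetical population of 100: y = 1 for unemployed (10%), 0 for employed.
y = [1] * 10 + [0] * 90
true_rate = sum(y) / len(y)

# Scenario A: everyone responds with probability 0.5, unrelated to y.
unrelated = [0.5] * 100
# Scenario B: same overall response rate, but the unemployed rarely respond.
related = [0.1] * 10 + [49 / 90] * 90

for name, p in [("A", unrelated), ("B", related)]:
    bias = expected_respondent_mean(y, p) - true_rate
    print(f"scenario {name}: response rate {response_rate(p):.2f}, bias {bias:+.3f}")
```

Both scenarios have a 50% response rate, yet only the one where responding is related to the target variable distorts the unemployment figure.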
Conclusions What options remain for the NSI today, to show its superior capacity to produce “serious numbers” amidst a deluge of “junk information” ? The underpinnings may be just “a collection of practices”, but still, the NSI is the model of statistical competence in the nation - and it must demonstrate this ! Media criticism of the NSI is sometimes harsh.
Conclusions The NSI’s delicate balancing act vis-à-vis • The national government : fulfill the mandate • The world of theory and learning : show “scientific credibility” • The other (private) producers of statistics : tough competition • The supra-agency (EuroStat) : dictates
Conclusions A fact is that the quality component accuracy cannot be measured (probabilistically). Yet this is what users want desperately to have measured. When important numbers are proven wrong (by users), trust in the NSI suffers Other numbers may be wrong, but go unnoticed - and may not matter much .
Conclusions The Probability Sampling (Scientific Sampling) tradition is a reflection of an idyllic past - now we are in 2010 , not 1950 On what grounds is it still defensible, in our time? It is a challenge to the NSI, and to the academics (the theoreticians), to provide the answers
Conclusions The NSI vis-à-vis the scientific world : a sometimes hesitant relationship: Most NSI:s have a scientific (academic) advisory board NSI:s look to the learned world for support and acceptance NSI:s own investment in research may (understandably) be limited. Implementing new theory into the NSI's production has met with obstacles
Conclusions Relationship of the NSI to the world of learning; an empirical investigation, see Risto Lehtonen and Carl-Erik Särndal : Research and Development in Official Statistics and Scientific Co-operation with Universities: A Follow-Up Study , J. Official Statistics (2010)
Conclusions Debate article : S. Lundström and C.E. Särndal (2010): The devastating consequences of nonresponse : Probability sampling in question at Statistics Sweden . (In Swedish; internal report). Credit goes to Statistics Sweden for their courage to debate a sensitive issue.