This article explores the importance of distinguishing between signal and noise in statistical data, emphasizing the role of sample size in accurately interpreting results. It discusses the "Rule of Thumb," which indicates that to halve random noise, one must quadruple data collection. Examples, such as the coin flip experiment and the implications of p-values, illustrate how small effects can appear significant with large samples. The importance of effect size and numerical comparisons is highlighted to prevent misinterpretation of statistical significance in research.
Back-of-the-Envelope Statistics Jason Zimba Student Achievement Partners and Bennington College
Signal and Noise • Significance vs. Size • Regression Models
Call It: • I flip a coin a million times. • Results: 51% heads, 49% tails. • Do we think the coin is fair?
The Rule of Thumb: With N data points, expect random fluctuations to be of order 1/√N in percentage terms. • Apply rule of thumb to coin problem: • N ~ 1,000,000, so the square root of N is 1000 • Expect fluctuations of about 1/1000 = 0.1% • Actual fluctuations ~1% • Ten times larger than one intuitively expects • Don’t bet on the coin being fair…!
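The arithmetic for the coin problem can be checked with a short Python sketch. The function name `expected_noise` and the variable names are illustrative and do not come from the talk; the sketch only applies the 1/√N rule of thumb to the numbers given above.

```python
import math

def expected_noise(n):
    """Rule-of-thumb size of random fluctuations, as a fraction: 1/sqrt(N)."""
    return 1.0 / math.sqrt(n)

n_flips = 1_000_000
observed_heads = 0.51                    # 51% heads, as stated in the coin problem
deviation = abs(observed_heads - 0.5)    # distance from a fair coin
noise = expected_noise(n_flips)          # rule-of-thumb noise for N = 1,000,000
ratio = deviation / noise                # how many times larger than expected

print(f"expected noise ~ {noise:.1%}, observed deviation ~ {deviation:.0%}")
print(f"deviation is about {ratio:.0f} times the expected noise")
```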
Implications of the Rule • If you want to cut random noise in half, expect to gather four times as much data. • Suppose an astronomer wants to image a star. To make the image twice as clear, it will take 4 times as much telescope time. • Square root dependency makes good data expensive to collect.
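The cost of cleaner data follows from inverting the rule of thumb: if noise is about 1/√N, then N is about 1/noise². A minimal sketch, assuming that inversion; the helper name `data_needed` is hypothetical.

```python
def data_needed(target_noise):
    """Approximate N for a target noise fraction, inverting noise ~ 1/sqrt(N)."""
    return round(1.0 / target_noise ** 2)

n_10pct = data_needed(0.10)   # data points for roughly 10% noise
n_5pct = data_needed(0.05)    # data points for half that noise

print(f"10% noise: N ~ {n_10pct}")
print(f" 5% noise: N ~ {n_5pct}")
print(f"halving the noise multiplies N by {n_5pct / n_10pct:.0f}")
```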
Implications of the Rule • The larger the N, the smaller the noise in percentage terms (one hopes….) • N = 100 expect ~ 10% noise • N = 10,000 expect ~ 1% noise
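A small simulation can illustrate how the noise shrinks with N. This is a sketch under stated assumptions, not part of the talk: it simulates a fair coin, repeats each experiment 200 times with a fixed seed, and measures the spread of the heads fraction. The function name `heads_fraction_spread` and its parameters are illustrative.

```python
import random
import statistics

def heads_fraction_spread(n_flips, n_trials=200, seed=0):
    """Standard deviation of the heads fraction across repeated fair-coin experiments."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_trials):
        # Draw n_flips random bits at once; each 1 bit counts as a head.
        heads = bin(rng.getrandbits(n_flips)).count("1")
        fractions.append(heads / n_flips)
    return statistics.pstdev(fractions)

for n in (100, 10_000):
    spread = heads_fraction_spread(n)
    print(f"N = {n:>6}: simulated spread {spread:.4f}, rule of thumb {n ** -0.5:.4f}")
```

For a fair coin the exact spread of the heads fraction is 0.5/√N, so the simulated values should land near half the rule-of-thumb figure. That is the same order of magnitude, which is all a back-of-the-envelope rule promises.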
Implications of the Rule • When N is too small, noise swamps the signal. (H. Wainer, “The Most Dangerous Equation,” American Scientist, May-June 2007)
Signal and Noise • rule of thumb and its implications • Significance vs. Size • Regression Models
“Significant” means… • The effect being studied is real. • You’re looking at signal, not noise. • The effect being studied is unlikely to be due to chance. • p value: “p < 0.05.”
“Significant” does not mean that the effect is large. • If N is large enough, then the noise will be damped down, and very weak signals will emerge. • These weak signals are termed “significant” despite their small size (because they are real). • The p-value of an effect does not tell you how large (or important) the effect is. • For the coin flip problem, the 1% deviation is very significant (p < 0.000…001). The coin is almost certainly biased - just not very strongly.
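The gap between significance and size in the coin problem can be made concrete. The sketch below assumes the usual normal approximation to the binomial and a two-sided test; neither choice is stated in the talk, and the variable names are illustrative.

```python
import math

n_flips = 1_000_000
observed = 0.51   # observed heads fraction
fair = 0.50       # heads fraction for a fair coin

# Standard error of the heads fraction under a fair coin: sqrt(p(1-p)/N)
std_error = math.sqrt(fair * (1 - fair) / n_flips)
z = (observed - fair) / std_error          # deviation in standard errors

# Two-sided p-value under the normal approximation
p_value = math.erfc(z / math.sqrt(2))

effect = observed - fair                   # size of the bias itself
print(f"z ~ {z:.0f} standard errors, p ~ {p_value:.1e}")
print(f"effect size: {effect:.0%} away from a fair coin")
```

The p-value measures how implausible the result is under a fair coin; the effect is still only one percentage point.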
Describing Effect Sizes Sizes are always relative. • A year is a long time for a snowstorm to last, a short time for an empire to last. • A millimeter is large for a molecule, small for an engagement ring. • A ton is large for a hog, small for an asteroid. Comparisons must be like with like. • “A year is longer than an ounce.” (??)
Computing effect size Example: How strongly does gender affect weight? • US avg. adult male weight = 190 lb ± 50 lb • US avg. adult female weight = 160 lb ± 50 lb • Diff. between avg. M and avg. F = 30 lb • Natural variability in adult weights = 50 lb • Effect size: • d = (diff. between averages) / (natural variability) • = (30 lb) / (50 lb) • = 30/50 • d = 0.6.
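The same calculation as a short Python sketch, using the weights from the example. The function name `effect_size` is illustrative; it simply divides the difference between the averages by the natural variability, both in pounds, so the result is unitless.

```python
def effect_size(mean_a, mean_b, variability):
    """Difference between two averages, measured in units of natural variability."""
    return (mean_a - mean_b) / variability

male_avg_lb = 190.0
female_avg_lb = 160.0
variability_lb = 50.0   # natural variability in adult weights

d = effect_size(male_avg_lb, female_avg_lb, variability_lb)
print(f"d = {d}")
```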
“Packaging” Effect Sizes • Beware of authors describing effect sizes as being ‘large’ … ‘substantial’ … ‘small’ … ‘quite small’ … ‘slight’, …. • “There is evidence of slightly greater male variability in scores, although the causes remain unexplained.”* • Meaningful assessments of effect size go beyond everyday language to compare the effect numerically to some meaningful standard measured in the same units. • e.g. ‘The top quartile-bottom quartile difference in value-add among third-year teachers was X times as large as the first-year gain to experience.’ *Hyde et al., “Gender Similarities Characterize Math Performance,” Science, 25 July 2008, Vol. 321, pp. 494, 495.
Signal and Noise • rule of thumb and its implications • Significance vs. Size • Significant means real/not spurious • Effect could still be small and unimportant • Don’t let people “package” for you • Describe effect size numerically with like-to-like comparison • Regression Models
Birth Wt = 0.067 × (Wt of Mother) − (0.57 lb/wk) × (Wks premature)
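A regression equation like this one can be read as a small function of its inputs. The sketch below uses only the two coefficients shown above and assumes weights in pounds; the function name and the example inputs (a 130 lb mother, 0 or 1 weeks premature) are hypothetical and chosen only to show how each coefficient acts.

```python
def predicted_birth_weight(mother_weight_lb, weeks_premature):
    """Evaluate the regression equation as written: two inputs, one output (lb)."""
    return 0.067 * mother_weight_lb - 0.57 * weeks_premature

full_term = predicted_birth_weight(130, 0)
one_week_early = predicted_birth_weight(130, 1)
heavier_mother = predicted_birth_weight(140, 0)

# Each coefficient is the change in output per unit change in one input.
print(f"one more week premature: {one_week_early - full_term:+.2f} lb")
print(f"mother 10 lb heavier:    {heavier_mother - full_term:+.2f} lb")
```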
Signal and Noise • rule of thumb and its implications • Significance vs. Size • Significant means real/not spurious • Effect could still be small and unimportant • Don’t let people “package” for you • Describe effect size numerically with like-to-like comparison • Regression Models • Regression models tell you how much a change in the “inputs” affects the “output.” • What looks like noise might actually be an effect - an effect of an unmeasured variable. • But if you put in too many variables, you can explain anything.
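The last point, that enough variables can explain anything, can be illustrated with a deliberately extreme sketch. It is not a method from the talk: it uses polynomial interpolation rather than a fitted regression, and the data are pure random noise generated with a fixed seed. With as many adjustable terms as data points, the model reproduces the noise exactly. The function name `lagrange_fit` is illustrative.

```python
import random

def lagrange_fit(xs, ys):
    """Return the polynomial of degree len(xs) - 1 that passes through every (x, y)."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return model

rng = random.Random(1)
xs = [float(i) for i in range(8)]
ys = [rng.gauss(0, 1) for _ in xs]   # pure noise: there is no signal to find

model = lagrange_fit(xs, ys)
worst_miss = max(abs(model(x) - y) for x, y in zip(xs, ys))
print(f"largest residual on pure-noise data: {worst_miss:.2e}")
```

A perfect fit here says nothing about the world, since the data contain no effect at all.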