On Magic, Power, Extra Sensory Perception, Decline, and Death: The Ironic Effect of Multiple-Study Articles on Scientific Progress
Ulrich Schimmack, University of Toronto Mississauga
Recent Controversy in JPSP:ASC (2011)
• Bem (2011) “Feeling the Future”
  - 9-study article
  - Conclusion: “extrasensory perception for subliminal stimuli” (p < .00001)
• Wagenmakers et al. (2011)
  - Results are inconclusive.
  - The article demonstrates weaknesses of current scientific practices in experimental social psychology (ESP) research.
Bem’s Article May Elicit Feelings of Dissonance
• Unbalanced Triad (Heider)
  - I don’t believe in extrasensory perception.
  - I believe JPSP articles.
  - JPSP shows that extrasensory perception is possible.
• Dissonance motivates actions that reduce dissonance.
  - Start believing in extrasensory perception.
  - Stop believing ESP articles.
But now all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology.
For many scientists, the effect is especially troubling because of what it exposes about the scientific process. If replication is what separates the rigor of science from the squishiness of pseudoscience, where do we put all these rigorously validated findings that can no longer be proved? Which results should we believe? What is more trustworthy: a finding in a single-study article or a finding in a multiple-study article?
The Problem: Too Many Significant Results!
• Standard statistical theory: two causal factors
  - true effect
  - random factors
• Type I error: infer a true effect when random factors produced the observed effect.
• Type II error: infer no true effect when random factors mask a true effect.
• Ignores a third factor that influences results in scientific articles (bias).
The Problem: Too Many Significant Results
How many significant results should be found?
  - Power (Cohen, 1992)
  - A bigger effect size increases the chances.
  - A bigger sample size increases the chances.
ESP studies tend to have ~60% power (Sedlmeier & Gigerenzer, 1989; Rossi, 1990).
ESP journals publish 97% significant results (Sterling et al., 1995).
Magic produces 37% of significant effects in ESP journals (97% published versus ~60% expected from power). We are all magicians, but some more than others.
Power in multiple studies is a power function. It is hard to show significance once, but it is harder to do it again, and again, and again… Thus, it requires more magic for significant results in multiple study articles.
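The "power function" point can be made concrete with a minimal sketch. It assumes the studies are independent and share the same power; the function name `joint_power` and the study counts are illustrative, while the 60% figure is the typical power quoted earlier.

```python
def joint_power(power: float, k: int) -> float:
    """Probability that all k independent studies are significant,
    assuming every study has the same power (an assumption of this sketch)."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must lie between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return power ** k


if __name__ == "__main__":
    # With ~60% power per study, an unbroken run of significant
    # results becomes rapidly less likely as studies are added.
    for k in (1, 2, 5, 9):
        print(f"{k} studies: P(all significant) = {joint_power(0.60, k):.3f}")
```

Under these assumptions, nine studies at 60% power would all reach significance only about 1% of the time.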
“Daryl Bem is a Cornell University psychologist who says he's been doing magic as a hobby since he was 17. Now he has managed what some scientists may call his greatest trick: he's written a paper attempting to prove the power of ESP — extrasensory perception — and had it accepted for publication in a major scientific journal ... Did Bem really find evidence of extrasensory perception, or will his paper turn out to be an embarrassment? Already, there are doubts in the scientific world. http://sites.google.com/site/waldorfwatch/esp
Main Effect for ESP
Study 7 used supraliminal stimuli. The effect (d = .09) was not significant with N = 200 (POW = 97% for d = .25).
Bem does not explain what this means:
  - failed replication
  - moderator effect of condition
“I now wish I had simply continued to use subliminal exposures”
Ignore Study 7 and focus on 8 ESP studies with subliminal stimuli and 9 significant effects.
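The quoted power figure for Study 7 can be roughly checked with a normal approximation. This is a sketch, not the calculation behind the slide: it assumes a one-sample design, a one-tailed test at alpha = .05, and a z-test in place of a t-test. The function name `one_sample_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_power(d: float, n: int, alpha: float = 0.05,
                     one_tailed: bool = True) -> float:
    """Approximate power of a one-sample test for standardized effect d.

    Uses the normal approximation (z-test), which is an assumption of
    this sketch and slightly overstates power for small samples.
    """
    z = NormalDist()
    tail = alpha if one_tailed else alpha / 2
    crit = z.inv_cdf(1 - tail)          # critical z value
    shift = d * sqrt(n)                 # expected z under the alternative
    power = 1 - z.cdf(crit - shift)     # upper rejection region
    if not one_tailed:
        power += z.cdf(-crit - shift)   # lower rejection region
    return power


if __name__ == "__main__":
    # d = .25 and N = 200 are the values quoted for Study 7.
    print(f"power for d = .25, N = 200: {one_sample_power(0.25, 200):.2f}")
```

Under these assumptions the approximation lands close to the 97% quoted above; a two-tailed test gives a somewhat lower value.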
Conclusion
M-index for main effect: .10
M-index for moderator effect: .01
M-index total: .001
Only 1 in 1,000 sets of studies would produce this (or a better) set of results if the hypotheses are true.
More evidence needed? r(N – ES) = -.90
Future Replication Studies
Already one failed replication of Study 8 (N = 112, power = .80) (Galak & Nelson, 2010)
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1699970
If power = .80, the probability of getting three non-significant results is only p = .008.
If power = .90, the probability of getting two non-significant results is p = .01.
Thus, it takes only a few failed replications to provide more evidence that Bem is a magician.
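The two probabilities above follow from multiplying the per-study miss rates. The sketch below assumes independent replications with identical power and a true effect; `prob_all_nonsignificant` is an illustrative name.

```python
def prob_all_nonsignificant(power: float, k: int) -> float:
    """Probability that k independent replications all miss a true effect.

    Each replication fails with probability (1 - power); independence and
    equal power across replications are assumptions of this sketch.
    """
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must lie between 0 and 1")
    return (1.0 - power) ** k


if __name__ == "__main__":
    print(f"power .80, 3 failures: {prob_all_nonsignificant(0.80, 3):.3f}")
    print(f"power .90, 2 failures: {prob_all_nonsignificant(0.90, 2):.3f}")
```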
Gailliot, Baumeister et al. (2007) “Sugar High”
Dvorak & Simons (2009, PSPB) failed to replicate Studies 3-6, r = .15, n.s.
Kurzban (2010) showed that Studies 3-6 failed to replicate Study 1.
Decline Effect II: Terror Management is Dying
Greenberg (1994) original study
  - think about death
  - word completion as dependent variable (coff _ _ )
  - N = 25, d = 1.7, POW = .98
Meta-analysis by Hayes (2010, Psych. Bull.) lists 28 significant studies (0 failed replications): average d = 1.08, average N = 60, POW = .97, M-index = .43.
However, there is a strong decline effect, r = -.60.
First published non-significant effect by Niemiec (2010, JPSP): d = .30, N = 57, n.s.
Decline Effect III: Malleability of Race-IAT
Dasgupta & Greenwald (2001), JPSP
  - N = 32, 16 per cell
  - d = .7, POW = .50 [d = .2, POW = .09]
Joy-Gaba & Nosek (2010), Social Psychology
  - N = 4,628, d = .08, POW = .77
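The power values on this slide can be approximated for a two-group comparison. The sketch assumes a two-tailed test at alpha = .05, a normal approximation to the t-test, and equal cell sizes. The equal split of N = 4,628 into two cells of 2,314 is an assumption not stated on the slide, and `two_sample_power` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d: float, n_per_cell: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power for comparing two equal-sized groups.

    Normal approximation: the expected z statistic is d * sqrt(n / 2),
    where n is the size of each cell.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_cell / 2)
    # Sum of both rejection regions under the alternative hypothesis.
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


if __name__ == "__main__":
    print(f"d = .7,  n = 16 per cell:   {two_sample_power(0.7, 16):.2f}")
    print(f"d = .2,  n = 16 per cell:   {two_sample_power(0.2, 16):.2f}")
    # Equal cells of 2,314 are assumed here for N = 4,628.
    print(f"d = .08, n = 2314 per cell: {two_sample_power(0.08, 2314):.2f}")
```

Under these assumptions the approximation comes out close to the three power values quoted on the slide.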
Conclusion
• “Less is more except of course for sample size” (Cohen, 1990, p. 1304)
• Implications
  - Request a power analysis in the method section.
  - Get rid of null-hypothesis testing; report confidence intervals.