Spotting pseudoreplication. Inspect spatial (temporal) layout of the experiment Examine degrees of freedom in analysis. Degrees of freedom (df). Number of independent terms used to estimate the parameter = Total number of datapoints – number of parameters estimated from data.
Spotting pseudoreplication • Inspect spatial (temporal) layout of the experiment • Examine degrees of freedom in analysis
Degrees of freedom (df) Number of independent terms used to estimate the parameter = Total number of datapoints – number of parameters estimated from data
Example: Variance If we have 3 data points with a mean value of 10, what’s the df for the variance estimate? Independent term method: Can the first data point be any number? Yes, say 8. Can the second data point be any number? Yes, say 12. Can the third data point be any number? No, as the mean is fixed! (It must be 10 so that the three values sum to 30.) Variance is $\sum (y - \bar{y})^2 / (n - 1)$
Example: Variance If we have 3 data points with a mean value of 10, what’s the df for the variance estimate? Independent term method: Therefore 2 independent terms (df = 2)
Example: Variance If we have 3 data points with a mean value of 10, what’s the df for the variance estimate? Subtraction method Total number of data points? 3 Number of estimates from the data? 1 df= 3-1 = 2
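The two counting methods above can be checked numerically. This is a minimal sketch using the slide's own numbers (8 and 12 as the freely chosen points); the variable names are illustrative only.

```python
import statistics

n = 3
fixed_mean = 10.0
free_values = [8.0, 12.0]  # the first two points can be any numbers

# With the mean fixed, the last point is forced: the total must equal n * mean.
third_value = n * fixed_mean - sum(free_values)
data = free_values + [third_value]

# Subtraction method: 3 data points minus 1 parameter (the mean) estimated from them.
df = n - 1

sum_of_squares = sum((y - fixed_mean) ** 2 for y in data)
variance = sum_of_squares / df

print("third value:", third_value)
print("df:", df)
print("variance:", variance)
print("statistics.variance agrees:", statistics.variance(data))
```

Only two of the three values were free to vary, which is why the sum of squares is divided by 2 rather than 3.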
Example: Linear regression Y = mx + b Therefore 2 parameters estimated simultaneously (df = n-2)
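As a rough illustration of where n - 2 comes from, the sketch below fits a line by ordinary least squares and divides the residual sum of squares by n - 2. The data values and the helper name `fit_line` are made up for this example and are not from the slides.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b = fit_line(x, y)

# Two parameters (m and b) were estimated from the data, so df = n - 2.
residual_df = len(x) - 2
residual_ss = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
residual_ms = residual_ss / residual_df

print("slope:", m, "intercept:", b)
print("residual df:", residual_df)
print("residual mean square:", residual_ms)
```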
Example: Analysis of variance (ANOVA)
A   B   C
a1  b1  c1
a2  b2  c2
a3  b3  c3
a4  b4  c4
What is n for each level?
Example: Analysis of variance (ANOVA)
A   B   C
a1  b1  c1
a2  b2  c2
a3  b3  c3
a4  b4  c4
n = 4 for each level. How many df for each variance estimate? df = 3 for each of A, B and C.
Example: Analysis of variance (ANOVA)
A   B   C
a1  b1  c1
a2  b2  c2
a3  b3  c3
a4  b4  c4
df = 3 for each of A, B and C. What’s the within-treatment df for an ANOVA? Within-treatment df = 3 + 3 + 3 = 9
Example: Analysis of variance (ANOVA)
A   B   C
a1  b1  c1
a2  b2  c2
a3  b3  c3
a4  b4  c4
If an ANOVA has k levels and n data points per level, what’s a simple formula for within-treatment df? df = k(n-1)
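The k(n-1) formula can be sketched as a sum of the per-level df, which also gives the within-treatment mean square. The measurements below are invented placeholders for a1..c4, and the function names are illustrative.

```python
def within_treatment_df(groups):
    """Sum of (n - 1) over levels; equals k(n - 1) when every level has n points."""
    return sum(len(values) - 1 for values in groups.values())

def within_treatment_ms(groups):
    """Pooled within-treatment sum of squares divided by within-treatment df."""
    total_ss = 0.0
    for values in groups.values():
        level_mean = sum(values) / len(values)
        total_ss += sum((v - level_mean) ** 2 for v in values)
    return total_ss / within_treatment_df(groups)

# Hypothetical measurements standing in for a1..a4, b1..b4, c1..c4.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0],
    "B": [7.0, 8.0, 6.0, 7.0],
    "C": [3.0, 2.0, 4.0, 3.0],
}

k = len(groups)
n = 4
df_within = within_treatment_df(groups)
ms_within = within_treatment_ms(groups)

print("within-treatment df:", df_within, "and k(n-1) =", k * (n - 1))
print("within-treatment MS:", ms_within)
```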
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA (within-treatment MS). Is there pseudoreplication?
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA. Yes! As k=2, n=10, then df = 2(10-1) = 18
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA. What mistake did the researcher make?
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA. Assumed n=50: 2(50-1)=98
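The df check for this fertilizer example can be written out as a small sketch. It assumes, as the slides do, that the plot is the experimental unit; the function and variable names are illustrative.

```python
def within_df(levels, replicates_per_level):
    """Within-treatment df for a balanced one-way ANOVA: k(n - 1)."""
    return levels * (replicates_per_level - 1)

treatments = 2            # fertilized, unfertilized
plots_per_treatment = 10  # plots are the true replicates
plants_per_plot = 5       # plants within a plot are subsamples

# Correct: n is the number of plots per treatment.
correct_df = within_df(treatments, plots_per_treatment)

# Mistake: every plant counted as an independent replicate (n = 50).
pseudoreplicated_df = within_df(treatments, plots_per_treatment * plants_per_plot)

reported_df = 98
is_suspect = reported_df > correct_df

print("correct df:", correct_df)
print("df if plants are treated as replicates:", pseudoreplicated_df)
print("reported df exceeds the design's df:", is_suspect)
```

A reported df that matches the plant count rather than the plot count is the warning sign the slides describe.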
Why is pseudoreplication a problem? Hint: think about what we use df for!
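One way to explore the hint is a small simulation of the fertilizer design with no true treatment effect, counting how often each analysis declares a significant difference at the 5% level. Everything here is an assumption made for illustration: the plot and plant standard deviations, the use of a pooled two-sample t-test in place of the two-level ANOVA, and the hardcoded critical values (standard two-sided 5% t-table values for 18 and 98 df).

```python
import random

T_CRIT_DF18 = 2.101  # two-sided 5% critical t for 18 df (table value)
T_CRIT_DF98 = 1.984  # two-sided 5% critical t for 98 df (table value)

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

def simulate_treatment(rng, plots=10, plants=5, plot_sd=1.0, plant_sd=1.0):
    """Plant values grouped by plot; plants in a plot share that plot's effect."""
    result = []
    for _ in range(plots):
        plot_effect = rng.gauss(0.0, plot_sd)
        result.append([plot_effect + rng.gauss(0.0, plant_sd) for _ in range(plants)])
    return result

def false_positive_rates(trials=1000, seed=1):
    rng = random.Random(seed)
    by_plot = 0
    by_plant = 0
    for _ in range(trials):
        fert = simulate_treatment(rng)
        unfert = simulate_treatment(rng)
        # Plots as replicates: one mean per plot, df = 18.
        t_plot = two_sample_t([mean(p) for p in fert], [mean(p) for p in unfert])
        # Plants as replicates (pseudoreplication): df = 98.
        t_plant = two_sample_t([v for p in fert for v in p], [v for p in unfert for v in p])
        if abs(t_plot) > T_CRIT_DF18:
            by_plot += 1
        if abs(t_plant) > T_CRIT_DF98:
            by_plant += 1
    return by_plot / trials, by_plant / trials

plot_rate, plant_rate = false_positive_rates()
print("false-positive rate, plots as replicates:", plot_rate)
print("false-positive rate, plants as replicates:", plant_rate)
```

Under these assumptions the plot-level analysis should stay near the nominal 5%, while the plant-level analysis rejects a true null far more often, because the inflated df both lowers the critical value and understates the standard error.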
How prevalent? Hurlbert (1984): 48% of papers Heffner et al. (1996): 12 to 14% of papers
Statistics review Basic concepts: • Variability measures • Distributions • Hypotheses • Types of error • Common analyses • T-tests • One-way ANOVA • Two-way ANOVA • Randomized block
Variance • Ecological rule # 1: Everything varies • …but how much does it vary?
Variance • $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ (Figure: the “sum-of-squares cake”, with $\bar{x}$ marked)
Variance • $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ What is the variance of 4, 3, 3, 2? What are the units?
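The exercise can be worked step by step with the formula above. This is a minimal sketch; the variable names are illustrative.

```python
data = [4, 3, 3, 2]
n = len(data)

mean = sum(data) / n                                # x-bar
deviations = [x - mean for x in data]               # x_i - x-bar
sum_of_squares = sum(d ** 2 for d in deviations)    # numerator of s^2
variance = sum_of_squares / (n - 1)                 # divide by df, not by n

print("mean:", mean)
print("sum of squares:", sum_of_squares)
print("variance:", variance)
```

Because the deviations are squared, the variance is in squared units of the original measurement.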
Variance variants 1. Standard deviation (s, or SD) = Square root (variance) Advantage: same units as the data
Variance variants 2. Standard error (S.E.) = $s / \sqrt{n}$ Advantage: indicates precision
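Both variants can be computed from the same four values used in the variance exercise. This sketch relies on Python's standard `statistics` module, whose `stdev` uses the n - 1 denominator.

```python
import math
import statistics

data = [4, 3, 3, 2]
n = len(data)

sd = statistics.stdev(data)   # square root of the sample variance
se = sd / math.sqrt(n)        # standard error of the mean

print("mean:", statistics.mean(data))
print("SD:", round(sd, 3))
print("SE:", round(se, 3))
```

The SD describes the spread of the data, while the SE shrinks as n grows and so describes how precisely the mean is estimated.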
How to report • We observed 29.7 (± 5.3) grizzly bears per month (mean ± S.E.). • A mean (± SD) of 29.7 (± 7.4) grizzly bears were seen per month. (Figure: error bars showing + 1 SE or SD and – 1 SE or SD)
Distributions Normal • Quantitative data Poisson • Count (frequency) data
Normal distribution About 68% of data within 1 SD of mean About 95% of data within 2 SD of mean
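These percentages can be checked against the standard normal cumulative distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability mass between -k and +k standard deviations of the mean.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print("within 1 SD:", round(within_1_sd, 4))
print("within 2 SD:", round(within_2_sd, 4))
```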
Poisson distribution mean Mostly, nothing happens (lots of zeros)
Poisson distribution • Frequency data • Lots of zero (or minimum value) data • Variance increases with the mean
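The properties listed above can be illustrated from the Poisson probability formula. In this sketch the means 0.5 and 4.0 are arbitrary example values, and the sums are truncated at 60 counts, which is ample for means this small.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_mean_and_variance(mean, max_k=60):
    """Mean and variance computed from the (truncated) probability table."""
    probs = [poisson_pmf(k, mean) for k in range(max_k)]
    m = sum(k * p for k, p in enumerate(probs))
    var = sum((k - m) ** 2 * p for k, p in enumerate(probs))
    return m, var

p_zero_low = poisson_pmf(0, 0.5)   # share of zeros when the mean is small
low_mean, low_var = poisson_mean_and_variance(0.5)
high_mean, high_var = poisson_mean_and_variance(4.0)

print("P(0) at mean 0.5:", round(p_zero_low, 3))
print("mean, variance at 0.5:", round(low_mean, 3), round(low_var, 3))
print("mean, variance at 4.0:", round(high_mean, 3), round(high_var, 3))
```

For a Poisson variable the variance equals the mean, which is the strongest form of "variance increases with the mean".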
What do you do with Poisson data? • Correct for correlation between mean and variance by log-transforming y (but log (0) is undefined!!) • Use non-parametric statistics (but low power) • Use a “generalized linear model” specifying a Poisson distribution
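The log(0) problem in the first option is easy to reproduce. The sketch below also shows log(y + 1), a commonly used workaround that is not mentioned in the slides and is included here only as an assumption; the counts are invented.

```python
import math

counts = [0, 0, 1, 3, 0, 7]  # hypothetical count data with many zeros

# log(0) is undefined, so a plain log-transform fails on the zeros.
try:
    math.log(0)
    log_zero_failed = False
except ValueError:
    log_zero_failed = True

# Assumed workaround (not from the slides): transform y + 1 instead of y.
shifted_logs = [math.log1p(y) for y in counts]

print("log(0) raised an error:", log_zero_failed)
print("log(y + 1):", [round(v, 3) for v in shifted_logs])
```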
Hypotheses • Null (Ho): no effect of our experimental treatment, “status quo” • Alternative (Ha): there is an effect
Whose null hypothesis? Conditions very strict for rejecting Ho, whereas accepting Ho is easy (just a matter of not finding grounds to reject it). A criminal trial? Exotic plant species? WTO?
Hypotheses Null (Ho) and alternative (Ha): always mutually exclusive So if Ha is treatment>control…
Types of error
          | Reject Ho     | Accept Ho
Ho true   | Type 1 error  | Correct
Ho false  | Correct       | Type 2 error
Types of error • We usually allow only a 5% chance of a type 1 error (i.e., alpha = 0.05) • The ability to avoid a type 2 error is called power (power = 1 – probability of a type 2 error)
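The link between alpha and power can be sketched for a simple case. The one-sided z-test with a known standard deviation used here is an assumption chosen for illustration, not a test covered in the slides, and the effect size, SD and sample sizes are made up.

```python
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a one-sided z-test (assumes known sd and a normal sampling distribution)."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha)   # rejection threshold under Ho
    shift = effect / (sd / n ** 0.5)              # true effect in standard-error units
    return 1 - standard_normal.cdf(z_crit - shift)

# With no true effect, the rejection rate is just the type 1 error rate, alpha.
rate_under_null = z_test_power(effect=0.0, sd=1.0, n=10)

# With a true effect, power rises as the sample size grows.
power_small_n = z_test_power(effect=0.5, sd=1.0, n=10)
power_large_n = z_test_power(effect=0.5, sd=1.0, n=40)

print("rejection rate when Ho is true:", round(rate_under_null, 3))
print("power with n = 10:", round(power_small_n, 3))
print("power with n = 40:", round(power_large_n, 3))
```

The type 2 error rate in each case is 1 minus the power.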