Spotting pseudoreplication
• Inspect the spatial (or temporal) layout of the experiment
• Examine the degrees of freedom in the analysis

Degrees of freedom (df)
Number of independent terms used to estimate the parameter
= Total number of data points – number of parameters estimated from the data

Example: Variance
If we have 3 data points with a mean value of 10, what's the df for the variance estimate?

Independent term method:
• Can the first data point be any number? Yes, say 8.
• Can the second data point be any number? Yes, say 12.
• Can the third data point be any number? No, because the mean is fixed (it must be 10).
Therefore there are 2 independent terms (df = 2).

Subtraction method:
• Total number of data points? 3
• Number of estimates from the data? 1 (the mean)
df = 3 - 1 = 2

Variance is $\sum (y - \bar{y})^2 / (n - 1)$
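Both counting methods can be checked with a few lines of arithmetic. This is a minimal sketch using the slide's values (free choices 8 and 12, mean 10); the function name `forced_last_value` is hypothetical and exists only for illustration.

```python
def forced_last_value(free_values, mean, n):
    """Return the only value the last data point can take once the mean is fixed."""
    # The n values must sum to n * mean, so the last one is whatever is left over.
    return n * mean - sum(free_values)

n = 3
mean = 10
free_values = [8, 12]                      # the two points we were free to choose
last = forced_last_value(free_values, mean, n)

df_independent_terms = len(free_values)    # number of freely chosen points
df_subtraction = n - 1                     # data points minus parameters estimated (the mean)

print(last)                  # 10
print(df_independent_terms)  # 2
print(df_subtraction)        # 2
```

Changing the two free values changes the forced third value, but never the count of free choices.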

Example: Linear regression
Y = mx + b
Therefore 2 parameters (slope m and intercept b) are estimated simultaneously (df = n - 2)

Example: Analysis of variance (ANOVA)

A    B    C
a1   b1   c1
a2   b2   c2
a3   b3   c3
a4   b4   c4

What is n for each level? n = 4
How many df for each variance estimate? df = 3 for each of A, B and C
What's the within-treatment df for an ANOVA? Within-treatment df = 3 + 3 + 3 = 9
If an ANOVA has k levels and n data points per level, what's a simple formula for within-treatment df? df = k(n-1)
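The within-treatment df can be sketched as a sum over levels, which reduces to k(n-1) when every level has the same n. The function name `within_treatment_df` is an assumption made for this example.

```python
def within_treatment_df(group_sizes):
    """Sum (n_i - 1) over treatment levels; equals k(n-1) when all n_i are equal."""
    return sum(n_i - 1 for n_i in group_sizes)

group_sizes = [4, 4, 4]          # levels A, B, C with 4 data points each
k = len(group_sizes)
n = group_sizes[0]

df_within = within_treatment_df(group_sizes)

print(df_within)    # 9
print(k * (n - 1))  # 9
```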

Spotting pseudoreplication
An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df = 98 for the ANOVA (within-treatment MS).

Is there pseudoreplication? Yes! The plots are the replicates, so k = 2 and n = 10, and df = 2(10-1) = 18.
What mistake did the researcher make? The researcher treated each plant as a replicate and assumed n = 50: 2(50-1) = 98.
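The df check for the fertilizer example can be written out as a small calculation. This is only a sketch: the helper names `within_df` and `looks_pseudoreplicated` are hypothetical, and the check assumes a balanced one-way design in which the plot is the unit that received the treatment.

```python
def within_df(k, n):
    """Within-treatment df for k levels with n replicates per level."""
    return k * (n - 1)

def looks_pseudoreplicated(reported_df, k, replicates_per_level):
    """Flag a reported df that exceeds what the true replicates allow."""
    return reported_df > within_df(k, replicates_per_level)

treatments = 2               # fertilized, unfertilized
plots_per_treatment = 10     # the true replicates
plants_per_plot = 5          # subsamples within each plot

df_correct = within_df(treatments, plots_per_treatment)
df_pseudo = within_df(treatments, plots_per_treatment * plants_per_plot)

print(df_correct)                                                   # 18
print(df_pseudo)                                                    # 98
print(looks_pseudoreplicated(98, treatments, plots_per_treatment))  # True
```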

Why is pseudoreplication a problem? Hint: think about what we use df for!

How prevalent?
• Hurlbert (1984): 48% of papers
• Heffner et al. (1996): 12 to 14% of papers

Statistics review
Basic concepts:
• Variability measures
• Distributions
• Hypotheses
• Types of error
Common analyses:
• T-tests
• One-way ANOVA
• Two-way ANOVA
• Randomized block

Variance
• Ecological rule #1: Everything varies
• ...but how much does it vary?

Variance
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
[Figure: sum-of-squares cake]

What is the variance of 4, 3, 3, 2? What are the units?
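The variance question can be worked through step by step and compared against Python's standard library, whose `statistics.variance` uses the same n - 1 denominator. A minimal sketch; the variable names are illustrative only.

```python
from statistics import mean, variance

data = [4, 3, 3, 2]

x_bar = mean(data)                                     # sample mean
sum_of_squares = sum((x - x_bar) ** 2 for x in data)   # squared deviations from the mean
s2_by_hand = sum_of_squares / (len(data) - 1)          # divide by df = n - 1
s2_library = variance(data)                            # sample variance from the library

print(x_bar)                  # 3
print(sum_of_squares)         # 2
print(round(s2_by_hand, 3))   # 0.667
print(round(s2_library, 3))   # 0.667
```

Because the deviations are squared, the result is in squared units of whatever was measured.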

Variance variants
1. Standard deviation (s, or SD) = square root of the variance
Advantage: same units as the data

Variance variants
2. Standard error (S.E.) = $s / \sqrt{n}$
Advantage: indicates precision
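Both variants follow directly from the sample variance. The sketch below reuses the data set 4, 3, 3, 2 from the variance slide purely as an illustration.

```python
from math import sqrt
from statistics import stdev

data = [4, 3, 3, 2]

s = stdev(data)              # standard deviation: square root of the sample variance
se = s / sqrt(len(data))     # standard error of the mean: s / sqrt(n)

print(round(s, 3))    # 0.816
print(round(se, 3))   # 0.408
```

The SE shrinks as n grows even when the SD stays the same, which is why it describes the precision of the mean, not the spread of the data.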

How to report
• We observed 29.7 (± 5.3) grizzly bears per month (mean ± S.E.).
• A mean (± SD) of 29.7 (± 7.4) grizzly bears were seen per month.
[Figure: error bar showing +1 SE or SD and -1 SE or SD]

Distributions
Normal
• Quantitative data
Poisson
• Count (frequency) data

Normal distribution
About 68% of data within 1 SD of the mean
About 95% of data within 2 SD of the mean
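These coverage figures can be checked against the standard normal cumulative distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the exact values are roughly 68.3% and 95.4%.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

within_1_sd = z.cdf(1) - z.cdf(-1)   # probability mass between -1 SD and +1 SD
within_2_sd = z.cdf(2) - z.cdf(-2)   # probability mass between -2 SD and +2 SD

print(round(within_1_sd, 3))   # 0.683
print(round(within_2_sd, 3))   # 0.954
```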

Poisson distribution
Mostly, nothing happens (lots of zeros)
[Figure: Poisson distribution with the mean marked]

Poisson distribution
• Frequency data
• Lots of zero (or minimum value) data
• Variance increases with the mean
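The link between mean and variance, and the abundance of zeros at low means, can be seen by evaluating the Poisson probability mass function directly. This is a sketch only: the means 0.5, 2.0 and 8.0 are hypothetical values chosen for illustration, and the sums are truncated at k = 59, where the remaining probability is negligible for these means.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return exp(-lam) * lam ** k / factorial(k)

summary = {}
for lam in (0.5, 2.0, 8.0):                 # hypothetical means
    ks = range(60)                          # truncated support
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    summary[lam] = (probs[0], mean, var)    # P(zero), mean, variance

for lam, (p_zero, mean, var) in summary.items():
    print(lam, round(p_zero, 3), round(mean, 3), round(var, 3))
```

For a Poisson distribution the variance equals the mean, so the last two columns match, and the probability of a zero count falls as the mean rises.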

What do you do with Poisson data?
• Correct for the correlation between mean and variance by log-transforming y (but log(0) is undefined!)
• Use non-parametric statistics (but low power)
• Use a "generalized linear model" specifying a Poisson distribution

Hypotheses
• Null (Ho): no effect of our experimental treatment, "status quo"
• Alternative (Ha): there is an effect

Whose null hypothesis? Conditions very strict for rejecting Ho, whereas accepting Ho is easy (just a matter of not finding grounds to reject it). A criminal trial? Exotic plant species? WTO?

Hypotheses Null (Ho) and alternative (Ha): always mutually exclusive So if Ha is treatment>control…

Types of error

            Reject Ho        Accept Ho
Ho true     Type 1 error     Correct
Ho false    Correct          Type 2 error

Types of error
• Usually ensure only a 5% chance of type 1 error (i.e., alpha = 0.05)
• Ability to avoid type 2 error: called power
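The meaning of alpha = 0.05 can be illustrated by simulation: when Ho is true, a test run at that level should reject in roughly 5% of experiments. The sketch below is an assumption-laden illustration, not a prescribed method: it uses a pooled-variance two-sample t statistic, two groups of 10 drawn from the same normal distribution, and the tabulated two-tailed critical value 2.101 for df = 18. The function names and the seed are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Pooled-variance t statistic for two samples of equal size."""
    n = len(a)
    pooled_var = (variance(a) + variance(b)) / 2
    return (mean(a) - mean(b)) / sqrt(pooled_var * 2 / n)

def type1_error_rate(trials=4000, n=10, t_crit=2.101, seed=1):
    """Fraction of experiments that reject Ho even though Ho is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same population, so any rejection is a type 1 error.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if abs(two_sample_t(a, b)) > t_crit:
            rejections += 1
    return rejections / trials

rate = type1_error_rate()
print(rate)  # close to 0.05; the exact value depends on the seed
```

The critical value is tied to df = 2(10-1) = 18, so using the wrong df means comparing the test statistic against the wrong threshold.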
