Review of yesterday


Review of yesterday
• Nice biological thinking!
• Don't focus too much on error sources!
• Your study may be correct in finding no difference or no relationship…
• Use the introduction for hypotheses about biology, not statistical theory.
• English: run a spell check. Help each other.
• Web resources?

Review of yesterday
• Do read the basic manuals
• Open and save in R
• Alternative 1: xlsReadWrite → read.xls()
• Alternative 2: copy → read.delim()
• Save graphs
• Ctrl + C (in graph) → Ctrl + V (in Word): bitmap
• Ctrl + W (in graph) → Ctrl + V (in Word): Windows metafile

Continuous variables
• If possible → import from Excel with xlsReadWrite
• If not → beware of 1,5 versus 1.5
• Decimal comma → use read.delim2("clipboard")
• Decimal period → use read.delim("clipboard")
• Also: which is your response variable and which is your explanatory variable?
• Avoid column names like Length..in.mm. → write length
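A small sketch of the comma/period problem. It reads from a text string instead of the clipboard so that it runs anywhere; the column names and values are made up for illustration.

```r
# Hypothetical tab-separated data with decimal commas,
# as it might be copied from a Swedish-locale Excel sheet
txt <- "length\tsite\n1,5\tA\n2,25\tB\n3,0\tA\n"

# read.delim2() expects decimal commas; read.delim() expects decimal periods
good <- read.delim2(text = txt)
bad  <- read.delim(text = txt)

str(good$length)  # numeric: 1.5 2.25 3
str(bad$length)   # not numeric: the commas were not understood as decimals
```

If str() does not report your continuous variable as num, the decimal separator is the first thing to check.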

Copy-paste statistics
Only change the green, bold code!
plot(y~x,xlab="Seed size")

R tricks
• Ctrl + X → copy AND paste
• Arrow up → last line
• history(Inf) → all commands, without the > prompt
plot(y~x)

Word pricks
• Word turns col="red" into col=“red” (curly quotes, which R cannot read)
• Change this → see the Computer tricks manual
• Or try Notepad++

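[Figure: which test to use, by type of response variable and explanatory variable]
• Categoric response, continuous explanatory → Logistic regression (example plot: Prob. of choosing Melica rather than Luzula, against ant size)
• Categoric response, categoric explanatory → 2 × 2 tables
• Continuous response, continuous explanatory → Regression
• Continuous response, categoric explanatory → Anova (example plot: seed size for red ants and black ants)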

[Figure: continuous response variable only]
• Continuous explanatory → Regression
• Categoric explanatory → Anova (example plot: seed size for red ants and black ants)

Maple – modular plasticity

R tricks
plot(y~x,cex=2,cex.lab=1.5,cex.axis=2)
Change the size of the graph window.
Want an extra graph window? x11()

Next invasive Rumex

Birch – reproductive cost
Change the size of the graph window in R, not in Word.

Maple – dispersability

Lichen Outliers

Outliers
• Causes:
• Typing errors.
• Data points affected by unwanted factors.
• Biologically relevant data points
• But perhaps given a disproportionately large effect on the result…
• Tell the reader if you have removed any!

Lichens Standardised study design. Only lichens between 0.5 and 1.5 meters? Only on trunk? Only…

Today's stuff

Can you really trust your studies?
• Risk of chance effects.
• By chance, you may have happened to sample some unusual individuals…

Permutations
• Decouple the response AND the explanatory variable:
• Take all x-values.
• Put them in a box and shake.
• Pour them back into the x-column.
• Now there should be no relationship or no difference. Right?
• But how large a difference can you get by chance (with 95 % probability)?
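The box-shaking steps above can be sketched in R as follows. The data, the group names and the number of shuffles (5000) are made up for illustration; sample(x) does the shaking.

```r
set.seed(1)  # only so the sketch gives the same result every time

# Made-up data: y = response, x = two-level explanatory variable
y <- c(9, 10, 8, 11, 7, 13, 14, 12, 15, 11)
x <- rep(c("clean", "polluted"), each = 5)

# Difference between the two group means for a given labelling
mean_diff <- function(y, x) {
  m <- tapply(y, x, mean)
  unname(m["polluted"] - m["clean"])
}

# Shake the box: shuffle x many times and record the difference each time
perm_diffs <- replicate(5000, mean_diff(y, sample(x)))

# Interval that holds 95 % of the differences that arise by chance alone
chance_interval <- quantile(perm_diffs, c(0.025, 0.975))
chance_interval
```

A real difference that falls outside this interval is larger than what shuffling alone usually produces.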

Lego shrimps

Lego shrimps I
• Does shrimp size depend on water quality?
• Red piece = shrimp size (y, response)
• Blue or green piece = clean or polluted water (x, explanatory)

shrimp study

Lego shrimps I
• Does shrimp size depend on water quality?
• Red piece = shrimp size (y, response)
• Blue or green piece = clean or polluted water (x, explanatory)
• If we shuffle the x variable (blue or green pieces), what difference may we get by chance? How large?

shrimp permutation...

Probability interval

Parametric tests use a normal curve instead

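Area under the curve
[Figure: distribution of differences from random samples (x-axis: Difference, y-axis: No. random samples); the middle 95 % of the area, with 2.5 % in each tail]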

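Risk of arising by chance only < 5 %
[Figure: distribution of differences from random samples (x-axis: Difference, y-axis: No. random samples); the middle 95 % of the area, with 2.5 % in each tail]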

Risk by chance = 1.4 + 1.4 = 2.8 %
[Figure: distribution of differences from random samples (x-axis: Difference, y-axis: No. random samples); tails of 1.4 % on each side]

p-value
• 2.8 % is the probability (p) of getting a difference at least this large by chance alone, if there is NO real difference.
• p = 0.028 means: if the groups did not really differ, there would be only a 2.8 % chance of collecting data points with a difference this large.
• A 2.8 % probability is ridiculously small! We don't believe in that!
• If it is not just due to chance, it must depend on something else…
• … e.g., water quality. The habitats differ significantly.
• p < 0.05 counts as ridiculously small
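A p-value of this kind can be counted directly from shuffled data. The sketch below uses made-up shrimp sizes and an arbitrary 5000 shuffles; it is an illustration of the idea, not the data from the Lego exercise.

```r
set.seed(2)  # only so the sketch gives the same result every time

# Made-up shrimp sizes (y) in clean and polluted water (x)
size  <- c(12, 14, 13, 15, 16, 14, 10, 11, 9, 12, 10, 11)
water <- rep(c("clean", "polluted"), each = 6)

# Difference between group means (polluted minus clean)
group_diff <- function(y, x) diff(tapply(y, x, mean))

observed <- group_diff(size, water)
shuffled <- replicate(5000, group_diff(size, sample(water)))

# Two-tailed p-value: share of shuffles at least as far from zero as the real data
p_value <- mean(abs(shuffled) >= abs(observed))
p_value
```

Both tails are counted, in the same way as the 1.4 % + 1.4 % on the slide.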

What will affect the p-value?
• The difference between groups (… between their means)
• The variation within groups
• The sample size
• Variation and sample size → unreliability of the group means

t-test in R
• t.test(y~x,var.equal=TRUE)
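A runnable version of the call, with made-up shrimp data standing in for your own y and x. The parts of the result worth reporting are picked out by name.

```r
# Made-up example data: y = shrimp size, x = water quality
y <- c(12, 14, 13, 15, 16, 14, 10, 11, 9, 12, 10, 11)
x <- factor(rep(c("clean", "polluted"), each = 6))

# var.equal = TRUE gives the classic Student t-test with a pooled variance;
# R's default (var.equal = FALSE) is the Welch version
result <- t.test(y ~ x, var.equal = TRUE)

result$statistic  # t-value
result$parameter  # degrees of freedom: n1 + n2 - 2
result$p.value    # p-value
```

Writing TRUE in full is a little safer than T, because T is an ordinary variable that can be overwritten.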

Under the hood • Competent Drivers vs. Mechanics

t-test under the hood

How does a t-test work?

Are male frogs larger than female frogs?

Difference between means
• Female mean = 9
• Male mean = 13
• Difference = 13 - 9 = 4. Soft!
• But the unreliability?

Measure of variation?
• Variation ≈ red lines! (the distance from each data point to the mean)
• Mean red line length?
• Nope! → requires absolute values → hard
• Instead:
• ≈ Mean squared red lines!

Variance • ≈ Mean squared red lines!

Degrees of freedom
• For a group variance, df = n-1
• Why?
• To calculate a variance, the mean is required: (y-mean(y))^2
• But given a mean, only n-1 of the deviations (y-mean(y)) can change freely and be used to estimate the variance.
• If we independently change n-1 deviations, the last one can't be independent.
• It must sum with the rest to zero.
• Its "freedom" is locked, used up, by the mean!
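The variance and its n-1 can be checked by hand. The five measurements below are made up for illustration.

```r
y <- c(7, 9, 10, 11, 13)  # made-up measurements
n <- length(y)

deviations <- y - mean(y)  # the "red lines"
sum(deviations)            # always zero: the last deviation is fixed by the others

# Sum of squared deviations divided by the degrees of freedom, n - 1
variance_by_hand <- sum(deviations^2) / (n - 1)

variance_by_hand
var(y)  # R's built-in var() divides by n - 1 as well
```

Because the deviations must sum to zero, knowing any four of them here gives you the fifth.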

Variance

Variance & Standard deviation
• Standard deviation = SD = square root of the variance
• Sometimes used to show variability in graphs
• ±1 SD = 68% of data points (if normally distributed)
• ±1.96 SD = 95% of data points (if normally distributed)
var(y) = 3.3
sd(y) = 1.8
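The 68 % and 95 % figures can be checked on simulated data. The sketch assumes normally distributed values; the mean, SD and sample size are arbitrary choices.

```r
set.seed(3)  # only so the sketch gives the same result every time
y <- rnorm(10000, mean = 10, sd = 2)  # simulated, normally distributed values

sd(y)         # standard deviation
sqrt(var(y))  # the same number: SD is the square root of the variance

# Share of data points within 1 SD and within 1.96 SD of the mean
within_1sd   <- mean(abs(y - mean(y)) <= 1 * sd(y))
within_196sd <- mean(abs(y - mean(y)) <= 1.96 * sd(y))
c(within_1sd, within_196sd)  # close to 0.68 and 0.95 for normal data
```

For strongly skewed data the shares can be quite different, so the rule of thumb only holds when the data look roughly normal.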

How does the t-test work?
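One way to look under the hood is to compute t by hand: the difference between the means divided by its unreliability (standard error). This is a sketch of the standard equal-variance t-test; the frog sizes are made up so that the means are 9 and 13.

```r
female <- c(8, 9, 10, 9, 9)      # made-up sizes, mean 9
male   <- c(12, 13, 14, 13, 13)  # made-up sizes, mean 13

n1 <- length(female)
n2 <- length(male)

# Pooled variance: both groups' squared deviations over the combined df
pooled_var <- ((n1 - 1) * var(female) + (n2 - 1) * var(male)) / (n1 + n2 - 2)

# Unreliability (standard error) of the difference between the two means
se_diff <- sqrt(pooled_var * (1 / n1 + 1 / n2))

t_value <- (mean(male) - mean(female)) / se_diff
p_value <- 2 * pt(-abs(t_value), df = n1 + n2 - 2)  # two-tailed

c(t = t_value, p = p_value)
```

The same numbers come out of t.test(male, female, var.equal = TRUE), which is a useful check that the hand calculation is right.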

How much is 3.1?
[Figure: t-distribution with tail areas of 2.5 % and 1 % marked on each side]
