## Review of yesterday

**Review of yesterday**
[ ] perm book www.ekolog.se/mstat

**Reporting statistics**
• Birch twigs with flowers were shorter than birch twigs without flowers (t=2.7, p=0.014, N=30).
• Use two significant digits
• Or use p<0.0001

**Copy-Paste Statistics**
Open your data set in R:
library(xlsReadWrite)
yourdata<-read.xls(file.choose())
attach(yourdata)
names(yourdata)

**Check for Errors**
names(yourdata)
fix(yourdata)
levels(your.x)
is.numeric(your.y)

**Check for Errors**
levels(your.x)
"with " "Without"

**Addresses**
• x<-c(1,5,2,4)
• x
• 1 5 2 4
• x[2]
• 5

**Area under the curve**
[Figure: distribution of differences from random samples. x-axis: Difference; y-axis: No. random samples. Labels: 95 % in the middle, 2.5 % in each tail.]

**Risk by chance only < 5 %**
[Figure: the same distribution. Labels: 95 % in the middle, 2.5 % in each tail.]

**Risk by chance = 1.4 + 1.4 = 2.8 %**
[Figure: the same distribution. Labels: 95 %, 1.4 % in each tail.]

**p-value**
• 2.8 % is the probability (p) of getting a difference at least this large by chance alone, if there is NO real difference.
• p = 0.028 means that, if the groups do not really differ, there is a 2.8 % chance that we by chance get data points as extreme as the ones we collected.
• 2.8 % probability is ridiculously small! We don't believe in that!
• If it is not just due to chance, it must depend on something else…
• … e.g., water quality. The habitats differ significantly.
• p < 0.05 counts as ridiculously small

**p-value is NOT!**
• p = 0.23 does NOT mean that there is NO difference.
• There might be a difference, but we are not confident enough to state that.
• The risk that the result is due to chance alone is too high.
• p = 0.052 is very, very close to "ridiculously small", so there might indeed be a difference.

**Intervals**
• Not really confidence intervals…
• A confidence interval is around the estimated value.
• There is a 95 % chance that the true mean or the true slope is within the confidence interval.

**Intervals**
• We have calculated permutation critical values.
• This interval includes the 95 % most likely values that you can get by chance alone (if there is no relationship or no difference).
• If the actual value from a study is outside this critical interval, it means that the result is very unlikely to be an effect of chance alone.
• Then the p-value is less than 5 % and the result is significant.

**Critical values**
[Figure: distribution of differences from random samples. x-axis: Difference; y-axis: No. random samples.]

**Today:**
Categorical response variables

**Categorical response variables**
• Logistic regression
• 2×2-test

[Figure: overview of tests by variable type. Categoric response: logistic regression (continuous explanatory) or 2×2 tables (categoric explanatory). Continuous response: regression (continuous explanatory) or Anova / t-test (categoric explanatory). Example panels: probability of choosing Melica (vs. Luzula) against ant size; seed size for red ants and black ants.]

**Logistic regression**
• Binary response = Either… or… (i.e., categorical with 2 outcomes)
• Continuous explanatory

**Logistic regression**
• Examples:
• Does seed size (x) affect germination (y)?
• Do body fat reserves (x) affect survival (y) in hedgehogs?
• Does flower size (x) affect pollination (y) or seed predation (y)?

**Logistic regression**
• Easy to test in R!
• But the test "under the hood" is complicated.
• Hard to make a neat graph.
• stripchart(x~y)
• Hard to give an intuitive effect size (cf. regression slope).

[Figure: the overview of tests by variable type, shown again.]

**2×2 = proportions**
• Binary response = Either… or… (i.e., categorical with 2 alternatives)
• Binary explanatory = Either… or… (i.e., categorical with 2 alternatives)
• Contingency tables
• Fisher's exact test is easy in R (super easy in R commander!)
• Sometimes Chi-2 has been used
• BUT, it does not give an exact p-value

**2×2 = proportion test**
• Examples:
• Do men and women have different preferences?
• Do different ants prefer different seeds?
• …

**Lunches for students and lecturers**
lunchchoice ~ students.teachers

**More than two groups**
• Avoid!
• What is really the explanatory variable (x)?
• Could you do a logistic regression instead?
• But it is possible: a generalised linear model, something like a "logistic anova" (but it's usually called a 3×2 contingency table)

**Break?**
• No?

**Before your study**
• What is your hypothesis? What would the result look like if your idea is correct?
• What is your null hypothesis?
• Draw graphs for logistic regression!!
• (Fake data – test to test)

**Ant dispersal**
• Elaiosome

**Do ants prefer seeds with large elaiosomes?**
Response: Elaiosome size (categorical). No explanatory variable.

**Ant preferences**
• Green pearl = Small elaiosome (ant candy)
• Black pearl = Large elaiosome (ant candy)
• Blue bag = All ants that collect seeds
• Do ants select for larger elaiosomes?
• Experiment: 30 ants (= 30 pearls)

**Ant preferences**
• Blue bag = All ants that collect seeds
• "By chance" bag = equal numbers of red and white pearls
• Let's pretend: White pearl = Large; Red pearl = Small
• What experiment results can you get by chance?

**Is this because of chance?**
• Is there a risk of more than 5 % that the pattern in the experiment is due to chance?
• No! Aha! Significant! Then it must be explained by something else (than chance).
• E.g., that there is not 50 % ants with small elaiosomes and 50 % with large.
• And that ants prefer large elaiosomes.
• Well, yes... Well, then our experiment result may be explained by chance effects (random sampling). And perhaps ants don't care.

**What is the chance (in percent) of getting as clear a pattern as you got in your experiment, only by chance?**
= p-value
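
**The "by chance" bag in R (sketch)**
The pearl exercise can be imitated in R. This is only a sketch under assumptions that are not from the slides: the result of 22 large elaiosomes out of 30 ants is a made-up number, every pearl drawn is treated as a 50/50 draw (as if the bag were very large or each pearl were put back), and the variable names are invented for illustration. It counts how often the "by chance" bag gives a pattern at least as clear as the made-up result, and compares that share with `binom.test()`.

```r
# Hypothetical result: 22 of 30 ants picked a seed with a large elaiosome.
# (Made-up number for illustration, not data from the course.)
n.ants <- 30
observed.large <- 22

set.seed(1)  # make the random draws repeatable

# "By chance" bag: each pearl is large with probability 0.5 (assumption).
# Repeat the 30-pearl experiment 10000 times; count large pearls each time.
n.sim <- 10000
chance.large <- rbinom(n.sim, size = n.ants, prob = 0.5)

# How far is each result from the 15 large pearls expected by chance?
expected <- n.ants * 0.5
observed.dev <- abs(observed.large - expected)
chance.dev <- abs(chance.large - expected)

# p-value: share of chance experiments with a pattern at least as clear
p.sim <- mean(chance.dev >= observed.dev)
p.sim

# Exact two-sided binomial test for comparison
p.exact <- binom.test(observed.large, n.ants, p = 0.5)$p.value
p.exact
```

If `p.sim` is below 0.05, a pattern this clear would rarely come out of the "by chance" bag. Replace `observed.large` with your own count to test your own experiment.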