Review of last week

Presentation Transcript

  1. Review of last week

  2. Variables • Response variable – y • Explanatory variable – x • [ today one of each ] • Continuous variables • Categorical variables (binary…)

  3. [Figure: which method fits which combination of variables. Continuous response (seed size): regression when the explanatory variable is continuous (ant size), ANOVA when it is categorical (red ants vs. black ants). Categorical response (probability of choosing Melica rather than Luzula): logistic regression when the explanatory variable is continuous (ant size), 2 × 2 tables when it is categorical.]

  4. One continuous response variable and one or more explanatory variables: general linear model (regression + ANOVA). [Figure: seed size for red ants vs. black ants; grid of response variable (continuous) by explanatory variable (continuous / categoric).]

  5. General linear models with: • Many continuous explanatory variables are usually called multiple regression. • Many categorical explanatory variables are usually called multiway ANOVA. • One continuous explanatory variable and one (or sometimes many) categorical explanatory variables are usually called ANCOVA.

  6. Test tools t-test F-test (anova) Chi-2

  7. Test tools t-test F-test (anova) Chi-2

  8. General linear models with: • Many continuous explanatory variables are usually called multiple regression. • Many categorical explanatory variables are usually called multiway ANOVA. • One continuous explanatory variable and one (or sometimes many) categorical explanatory variables are usually called ANCOVA.

  9. Mail me the data, please.

  10. Anova table on pollination
      Anova(lm(Seed.number ~ colour * poll.treat))
                  Sum Sq Df F value    Pr(>F)
      colour      3122.3  1 19.8440 2.846e-05 ***
      poll.treat  1693.1  1 10.7604  0.001567 **
      col:pol      829.8  1  5.2737  0.024406 *
      Residuals  11958.0 76

  11. General linear models with: • Many continuous explanatory variables are usually called multiple regression. • Many categorical explanatory variables are usually called multiway ANOVA. • One continuous explanatory variable and one (or sometimes many) categorical explanatory variables are usually called ANCOVA.

  12. Assumptions for parametric tests with a continuous response, i.e., also linear models! • About the same variation in all groups, or along a continuous variable, or along the fitted values. • Pretty normal residuals (= noise).

  13. About the same variation? [Figure: Forest vs. Meadow.]

  14. Pretty normal residuals [Figure: two histograms for Forest and Meadow, y-axis no. of species. Histogram of the response variable seed size: x-axis seed size in mm. Histogram of residuals: x-axis distance in mm from the respective group mean.]

  15.      [,1] [,2] [,3] [,4] [,5] [,6]
      [1,]    2    3    4    5    6    7
      [2,]    3    4    5    6    7    8
      [3,]    4    5    6    7    8    9
      [4,]    5    6    7    8    9   10
      [5,]    6    7    8    9   10   11
      [6,]    7    8    9   10   11   12

  16.      [,1] [,2] [,3] [,4] [,5] [,6]
      [1,]    1    2    3    4    5    6
      [2,]    2    4    6    8   10   12
      [3,]    3    6    9   12   15   18
      [4,]    4    8   12   16   20   24
      [5,]    5   10   15   20   25   30
      [6,]    6   12   18   24   30   36
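
The two tables on slides 15 and 16 look like a plus table and a times table for the values 1 to 6. Reading them as "row value plus column value" and "row value times column value" is an assumption based on the numbers shown, not something the slides state. Under that reading, a small sketch with `outer()` reproduces them and shows that the times table becomes a plus table on a log scale:

```r
# Additive table: each cell is row value + column value (as on slide 15)
plus.tab <- outer(1:6, 1:6, "+")

# Multiplicative table: each cell is row value * column value (as on slide 16)
times.tab <- outer(1:6, 1:6, "*")

print(plus.tab)
print(times.tab)

# On a log scale the multiplicative table is additive again:
# log(row * column) = log(row) + log(column)
log.times.tab <- log(times.tab)
```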

  17. Plus effect or percent effect? [Figure: no. of aphids (y) against weeks (x, 0 to 10).] Plus effect: from week 0 to 5, 100 + 100 = 200; from week 5 to 10, 200 + 100 = 300.

  18. Plus effect or percent effect? [Figure: no. of aphids (y) against weeks (x, 0 to 10).] Plus effect: from week 0 to 5, 100 + 100 = 200; from week 5 to 10, 200 + 100 = 300. Percent effect: from week 0 to 5, 80 × 2.5 = 200; from week 5 to 10, 200 × 2.5 = 500.

  19. Non-transformed vs. log-transformed [Figure: two panels of no. of aphids against weeks (0 to 10); the left panel has a linear y-axis (0 to 500), the right panel a logarithmic y-axis (1 to 500).]
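
To make the difference between the two kinds of effect concrete, the sketch below uses only the aphid numbers quoted on the slides (100, 200, 300 for the plus effect; 80, 200, 500 for the percent effect). The variable names are made up for illustration. A plus effect gives equal steps on the raw scale, while a percent effect gives equal steps only after a log transformation:

```r
weeks          <- c(0, 5, 10)
aphids.plus    <- c(100, 200, 300)  # plus effect: +100 per 5 weeks
aphids.percent <- c(80, 200, 500)   # percent effect: x 2.5 per 5 weeks

# Steps on the raw scale
steps.plus    <- diff(aphids.plus)     # equal steps
steps.percent <- diff(aphids.percent)  # growing steps

# Steps on the log10 scale: equal for the percent effect,
# because log10(2.5 * y) - log10(y) = log10(2.5)
log.steps.percent <- diff(log10(aphids.percent))

print(steps.plus)
print(steps.percent)
print(log.steps.percent)
```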

  20. Plus effect or percent per percent? [Figure: seed weight in μg (y) against leaf length in cm (x, 0 to 10).] Plus effect: from 0 to 5, 100 + 100 = 200; from 5 to 10, 200 + 100 = 300. Percent per percent: from 2.5 to 5 = 200%: 100 × 200% = 200; from 5 to 10 = 200%: 200 × 200% = 400.

  21. Non-transformed vs. log-log transformed [Figure: two panels of seed weight in μg against leaf length in cm; the left panel has linear axes, the right panel has logarithmic axes on both x and y.]
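
The "percent per percent" case can be checked the same way. The sketch below takes the three points quoted on slide 20 (leaf lengths 2.5, 5 and 10 cm with seed weights 100, 200 and 400 μg) and fits a straight line after logging both axes. The variable names are illustrative only. A slope close to 1 on the log-log scale means that doubling x doubles y:

```r
leaf.length <- c(2.5, 5, 10)     # cm, from slide 20
seed.weight <- c(100, 200, 400)  # micrograms, from slide 20

# Straight-line fit on the log-log scale
loglog.mod <- lm(log10(seed.weight) ~ log10(leaf.length))

# The slope is the percent change in y per percent change in x
slope <- unname(coef(loglog.mod)[2])
print(slope)
```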

  22. A lichen size study

  23. 5 possible models • Lichen size only depends on the total mean. • Lichen size depends on the site where the lichen grows (city vs. university). • Lichen size depends on the tree size (≈ age?). • Lichen size depends both on site AND tree size. • Lichen size depends on tree size, but the relationship between tree size and lichen size differs between the sites (city / univ).

  24. Check your data import
      > names(d)
      [1] "tree.circum" "lich.diam" "tree.spec" "site"
      > is.numeric(lich.diam)
      [1] TRUE
      > is.numeric(tree.circum)
      [1] TRUE
      > levels(site)
      [1] "city" "uni"

  25. Check your data import
      > names(d)
      [1] "tree.circum" "lich.diam" "tree.spec" "site"
      > is.numeric(lich.diam)
      [1] TRUE
      > is.numeric(tree.circum)
      [1] FALSE
      > levels(site)
      [1] "city" "ciyt" "uni"
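
Slide 25 shows two typical import problems: a numeric column that was read as text and a misspelt factor level. The slides do not say what caused them or how they were fixed, so the following is only a sketch on a hypothetical toy data frame. It assumes the non-numeric column was caused by a decimal comma, which is a guess:

```r
# Hypothetical toy data reproducing the two problems from slide 25
d <- data.frame(
  tree.circum = c("102", "87,5", "140", "95"),  # assumed cause: a decimal comma
  lich.diam   = c(12.1, 8.4, 15.0, 9.9),
  site        = factor(c("city", "ciyt", "uni", "uni"))
)

is.numeric(d$tree.circum)  # FALSE, as on the slide
levels(d$site)             # includes the misspelt level "ciyt"

# Merge the misspelt level into the correct one, then drop the empty level
d$site[d$site == "ciyt"] <- "city"
d$site <- droplevels(d$site)

# Replace the decimal comma with a point and convert to numbers
d$tree.circum <- as.numeric(sub(",", ".", d$tree.circum, fixed = TRUE))

# Re-run the checks from the slide
is.numeric(d$tree.circum)
levels(d$site)
```

In practice it is usually cleaner to correct the source file and import it again, so that the fix does not live only in the analysis script.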

  26. Assumption plots

  27. Assumption plots

  28. Should you log your lichen sizes? • Does it look so bad that your test may be incorrect? • Does a log transformation improve the model assumptions? – Constant variation is most important. • Does it make biological sense that the explanatory variables affect the percent increase in lichen size rather than the increase in mm?

  29. Assumption plots on logged lichen sizes

  30. Should you log your lichen sizes? • Does it look so bad that your test may be incorrect? – Nah, probably not. • Does a log transformation improve the model assumptions? – YES! • Does it make biological sense with a percent increase? – Well, I guess so. • OK, let’s use the logged values!
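
The lichen data are not part of this transcript, so the question "does a log transformation improve the model assumptions?" can only be illustrated on simulated data. The sketch below makes up a response with a percent-type (multiplicative) error and compares how strongly the size of the residuals grows with the fitted values, before and after logging. All names and numbers are hypothetical, and the correlation used here is just one rough summary of "about the same variation", not the check used in the lecture:

```r
set.seed(1)
n <- 200
x <- runif(n, 1, 10)
# Simulated response with multiplicative noise (an assumption for illustration)
y <- exp(1 + 0.3 * x + rnorm(n, sd = 0.5))

raw.mod <- lm(y ~ x)         # model on the raw scale
log.mod <- lm(log10(y) ~ x)  # model on the log10 scale

# Rough measure of non-constant variation:
# do the absolute residuals grow with the fitted values?
spread.raw <- cor(abs(residuals(raw.mod)), fitted(raw.mod))
spread.log <- cor(abs(residuals(log.mod)), fitted(log.mod))

print(c(raw = spread.raw, log = spread.log))
```

The usual visual check is a plot of residuals against fitted values, for example `plot(raw.mod, which = 1)` and `plot(log.mod, which = 1)`.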

  31. Logged lichen size

  32. The lichen size study

  33. [Figure: labels A, B, C, D, E, F, G and Mainland; prompts "Most:" and "Fewest:".]

  34. Logged lichen size

  35. 5 possible models • Log lichen size only depends on the total mean. • Log lichen size depends on the site where the lichen grows (city vs. university). • Log lichen size depends on the tree size (≈ age?). • Log lichen size depends both on site AND tree size. • Log lichen size depends on tree size, but the relationship between tree size and log lichen size differs between the sites (city / univ).

  36. Models
      log.lich.diam <- log10(lich.diam)
      log.mod.int <- lm(log.lich.diam ~ tree.circum + site + tree.circum:site)
      log.mod.both <- lm(log.lich.diam ~ tree.circum + site)
      log.mod.tree.circum <- lm(log.lich.diam ~ tree.circum)
      log.mod.site <- lm(log.lich.diam ~ site)
      log.mod.null <- lm(log.lich.diam ~ 1)
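
To try the five models without the original data, the sketch below simulates a hypothetical data set with the same column names and the same number of observations implied by the ANOVA table (60, since the interaction model leaves 56 residual degrees of freedom). The simulated effect sizes are invented and say nothing about real lichens. The point is only to show how the models are nested: each added term uses one degree of freedom and can only lower the residual sum of squares.

```r
set.seed(42)
n <- 60
# Hypothetical data; column names follow the slides, values are made up
d <- data.frame(
  tree.circum = runif(n, 50, 200),
  site        = factor(rep(c("city", "uni"), each = n / 2))
)
d$lich.diam <- 10^(1 - 0.002 * d$tree.circum +
                   0.2 * (d$site == "uni") + rnorm(n, sd = 0.25))
d$log.lich.diam <- log10(d$lich.diam)

# The five models from slide 36; tree.circum * site is shorthand for
# tree.circum + site + tree.circum:site
mods <- list(
  null        = lm(log.lich.diam ~ 1, data = d),
  site        = lm(log.lich.diam ~ site, data = d),
  tree.circum = lm(log.lich.diam ~ tree.circum, data = d),
  both        = lm(log.lich.diam ~ tree.circum + site, data = d),
  int         = lm(log.lich.diam ~ tree.circum * site, data = d)
)

# Residual degrees of freedom and residual sum of squares per model
res.df <- sapply(mods, df.residual)
rss    <- sapply(mods, deviance)
print(data.frame(res.df, rss))
```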

  37. Logged lichen size

  38. Anova table on logged lichens
      Anova(lm(log.lich.diam ~ tree.circum + site + tree.circum:site))   # = Anova(log.mod.int)
      Response: log.lich.diam
                       Sum Sq Df F value   Pr(>F)
      tree.circum      0.5808  1  9.7584 0.002826 **
      site             0.5431  1  9.1238 0.003797 **
      tree.circum:site 0.0047  1  0.0784 0.780444
      Residuals        3.3332 56

  39. Model comparison

  40. Test interaction!
      anova(log.mod.int, log.mod.both)
      Model 1: log.lich.diam ~ tree.circum + site + tree.circum:site
      Model 2: log.lich.diam ~ tree.circum + site
        Res.Df    RSS Df Sum of Sq      F Pr(>F)
      1     56 3.3332
      2     57 3.3378 -1   -0.0047 0.0784 0.7804

  41. Test site!
      anova(log.mod.both, log.mod.tree.circum)
      Model 1: log.lich.diam ~ tree.circum + site
      Model 2: log.lich.diam ~ tree.circum
        Res.Df    RSS Df Sum of Sq      F   Pr(>F)
      1     57 3.3378
      2     58 3.8809 -1   -0.5431 9.2737 0.003516 **

  42. Test tree circumference!
      anova(log.mod.both, log.mod.site)
      Model 1: log.lich.diam ~ tree.circum + site
      Model 2: log.lich.diam ~ site
        Res.Df    RSS Df Sum of Sq      F   Pr(>F)
      1     57 3.3378
      2     58 3.9187 -1   -0.5808 9.9188 0.002605 **
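
The F values in these model comparisons can be recomputed by hand from the residual sums of squares. The sketch below does this for the site test on slide 41, using only the rounded numbers printed there, so the result differs slightly from the slide's 9.2737 in the later decimals. The variable names are illustrative:

```r
# Values copied from slide 41
rss.big   <- 3.3378; df.big   <- 57  # log.mod.both (with site)
rss.small <- 3.8809; df.small <- 58  # log.mod.tree.circum (without site)

# F = (extra sum of squares explained per extra parameter) /
#     (residual mean square of the larger model)
f.value <- ((rss.small - rss.big) / (df.small - df.big)) / (rss.big / df.big)

# Upper-tail probability of the F distribution
p.value <- pf(f.value, df.small - df.big, df.big, lower.tail = FALSE)

print(c(F = f.value, p = p.value))
```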

  43. Conclusion: • Log lichen size depends both on site AND tree size. • Lichens are larger at the University than in the city (p = 0.0035 given the effect of tree size). • Lichen size decreases with increasing tree size (p = 0.0026 given the effect of site)