BIOL 582 Lecture Set 9 Factor Interactions Factorial Models
BIOL 582 Factor interactions
• Many hypotheses in biological research are really about patterns of change:
  • How do plant/animal traits change across environments? (e.g., phenotypic plasticity)
  • How do traits change through development? (e.g., ontogeny)
  • Are patterns of variation constant across space or time? (e.g., spatial data)
  • How do physiological responses change for organisms subjected to different treatments before and after a stimulus?
BIOL 582 Factor interactions
[Figure: growth rate of Species 1 and Species 2 in Wet and Dry environments]
• Patterns of change become clear when factor interactions are examined
• Interactions measure the joint effect of factors A & B
• They identify whether the response to A depends on the level of B, or vice versa
• They are VERY common in biological research
• Example: 2 species in 2 environments (Factors A & B). Species 1 has the higher growth rate in the wet environment, while species 2 has the higher growth rate in the dry environment. This would be identified as an interaction between species & environment
Note: The study of trade-offs (reaction norms) in evolutionary ecology is based on the study of interactions
BIOL 582 Factor interactions
[Figure: four panels with axes labeled V (values 0 to 4) and E1, E2, plus a growth rate plot for Species 1 and Species 2 in Wet and Dry environments. Panel titles: divergence/convergence; effect-no effect; large effect/small effect; reversal of values. From Collyer and Adams. (2007). Ecology. 88:683-692.]
• Significant interactions identify a joint response of factors (the response to Factor B depends on the level of Factor A)
• Interpreting interactions for two factors, each with two levels, is straightforward (consider two species in two environments, for example)
BIOL 582 Factor interactions
[Figure: growth rate of Species 1 and Species 2 in Wet and Dry environments]
• Consider what a true null hypothesis of no factor interaction might look like
• This is correct, right?
BIOL 582 Factor interactions
[Figure: four panels of growth rate for Species 1 and Species 2 in Wet and Dry environments. Panel titles: No species effect, no environment effect; Large species effect, no environment effect; Small species effect, small environment effect; Small species effect, small environment effect]
• Consider what a true null hypothesis of no factor interaction might look like
• This is correct, right?
• So are these!
• No interaction means parallel patterns of change
BIOL 582 Factor interactions
[Figure: three panels of growth rate for Species 1 and Species 2 in Wet and Dry environments. Panel titles: Both species change, but they change in opposite directions; One species changes, the other does not; Both species change in a similar direction, one at a greater rate]
• Consider how a null hypothesis of no factor interaction would be rejected
• Important point: a significant interaction indicates differences in either the magnitude or direction of change (or both) between levels of one factor, among levels of the other factor
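The three rejection scenarios can be made concrete with a small sketch. For a 2 x 2 design, an interaction amounts to a nonzero "difference of differences" among the four cell means. The function name `interaction_contrast` and all cell means below are made up for illustration; they are not from the lecture data.

```r
# Hypothetical cell means (growth rate) for 2 species x 2 environments.
# Rows = species, columns = environment; all values are invented.
cells <- function(v) {
  matrix(v, nrow = 2, byrow = TRUE,
         dimnames = list(c("Sp1", "Sp2"), c("Wet", "Dry")))
}

interaction_contrast <- function(m) {
  # change from Wet to Dry within each species
  change <- m[, "Dry"] - m[, "Wet"]
  # difference between the two species' changes (0 means parallel lines)
  unname(change[1] - change[2])
}

parallel <- cells(c(3, 2,  2, 1))  # both species drop by 1: no interaction
opposite <- cells(c(3, 1,  1, 3))  # species change in opposite directions
one_only <- cells(c(3, 1,  2, 2))  # only species 1 changes

interaction_contrast(parallel)  # 0
interaction_contrast(opposite)  # -4
interaction_contrast(one_only)  # -2
```

A contrast of zero corresponds to the parallel patterns expected under the null hypothesis; any departure from zero is what the interaction term in a factorial model estimates.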
BIOL 582 Factor interactions
[Figure: two panels of growth rate for Species 1 and Species 2 in Wet, Moist, and Dry environments]
• Consider how a null hypothesis of no factor interaction would be rejected when there are more than two levels of change (more possibilities exist!)
BIOL 582 Factorial Model Set-up • First, consider this linear equation • Which has the model • That produces error
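For reference, one common way to write a two-factor factorial model and its error is given below. The notation is an assumption (a standard textbook form), not necessarily the notation used in the lecture.

```latex
% Two-factor factorial linear model (assumed notation):
% i indexes levels of factor A, j levels of factor B, k replicates within a cell
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

% Error sum of squares from the fitted values \hat{y}_{ijk}
SSE = \sum_{i}\sum_{j}\sum_{k} \left( y_{ijk} - \hat{y}_{ijk} \right)^2
```

Here $\alpha_i$ and $\beta_j$ are the effects of factors A and B, $(\alpha\beta)_{ij}$ is the interaction, and $\varepsilon_{ijk}$ is the residual.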
BIOL 582 Factorial Model Set-up
• Possible “sub-models” (reduced models) of the full model, shown here in order of decreasing complexity
• Imagine that for every model, the SSE can be obtained easily (from the residuals of predictions made by the estimated model parameters)
• There are five SSEs, one from each of five different models. From the model containing:
  • both factors & interaction
  • both factors only
  • A factor only
  • B factor only
  • intercept only
• All models contain an intercept
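As a rough sketch of how the five SSEs could be obtained in R, the following fits each model to simulated data. The data frame `d`, the factor names `A` and `B`, and the helper `sse` are all hypothetical.

```r
# Simulated balanced 2 x 2 design; names and values are hypothetical.
set.seed(1)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d), mean = 5) + ifelse(d$A == "a2" & d$B == "b2", 1, 0)

# SSE = sum of squared residuals of a fitted model
sse <- function(fit) sum(resid(fit)^2)

fits <- list(
  full      = lm(y ~ A * B, data = d),  # both factors & interaction
  additive  = lm(y ~ A + B, data = d),  # both factors only
  A_only    = lm(y ~ A,     data = d),  # A factor only
  B_only    = lm(y ~ B,     data = d),  # B factor only
  intercept = lm(y ~ 1,     data = d)   # intercept only
)

# One SSE per model; SSE can only grow as terms are removed
sapply(fits, sse)
```

Every ANOVA row discussed later is a comparison between two of these SSEs.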
BIOL 582 Factorial Model Hypotheses
Note: When using Type I SS, the order of factor introduction can be important (see examples in R)
Note: One can use Type I, II, or III SS. More on this in a moment.
BIOL 582 Factorial Model Uses and Assumptions
• There are MANY uses for factorial models in biological research
  • Randomized Block designs: Subjects are randomized to treatments, within blocks. Blocks are experimental replicates. Block effects and interactions can be evaluated to consider extraneous sources of variation.
  • Temporal considerations (time as a factor)
  • Spatial considerations (geography, altitude as a factor)
  • Sexual dimorphism (sex as a factor)
  • ETC.!
• Assumptions include
  • Normally distributed residuals (not data)
  • Homoscedasticity
  • Independent observations (i.e., samples don’t contain multiple measurements on the same subjects; different samples or treatments do not contain the same subjects)
• These are the assumptions of Linear Models!
BIOL 582 Factorial Model Evaluation
• Summary of ANOVA for two-factor factorial models
• Type I (Sequential): values in blue are only necessary for determining P-values from the F distribution
• Type III (Weighted)
• * This approach assumes that factor A is already in the model
• k is the number of parameters (coefficients) needed for the effect
BIOL 582 Factorial Model Evaluation
• Summary of ANOVA for two-factor factorial models
• Type II (Partially Sequential)
• Type II SS does not seem remarkably different from Type I SS here, but the difference is more pronounced with complex models.
• For example, consider a 3-factor factorial model, which has effects A, B, C, AB, AC, BC, ABC
• For Type III SS, the reduced model removes only the term of interest; for Type II SS, the reduced model removes the factor or interaction plus any higher-order interactions that contain it
• Factorials with more than two factors are hard to evaluate, but the 3-factor factorial is worth considering for the sake of comparing SS types.
BIOL 582 Factorial Model Evaluation • Summary of ANOVA for three-factor factorial models • Type III (Weighted) – F stat calculations removed for simplicity
BIOL 582 Factorial Model Evaluation • Summary of ANOVA for three-factor factorial models • Type II (Partially Sequential) – F stat calculations removed
BIOL 582 Factorial Model Evaluation
• Summary of ANOVA for three-factor factorial models
• Type I (Fully Sequential): F stat calculations removed
• Multi-factor factorial models can get kind of crazy
• There are various pros and cons to using different SS types, which are beyond our worries for the most part
• Just remember that ANOVA is nothing more than a comparison of errors of different models
• Being clear about which two models should be compared is ALL YOU NEED TO KNOW
• Canned ANOVA in R or other software will try to make tables like the one above, but not every row is always needed
• Biological intuition is a smart way to go
• Creating your own table with “relevant” sources of variation is sufficient
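To illustrate the point that ANOVA is a comparison of the errors of two models, here is a minimal sketch that computes the F statistic for an interaction by hand and checks it against `anova()`. The data are simulated and the names `d`, `A`, `B`, `full`, and `reduced` are hypothetical.

```r
# Simulated balanced 2 x 2 design; names and values are hypothetical.
set.seed(2)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d)) + ifelse(d$A == "a2" & d$B == "b2", 1.5, 0)

full    <- lm(y ~ A * B, data = d)  # contains the interaction
reduced <- lm(y ~ A + B, data = d)  # lacks the interaction

sse_full <- sum(resid(full)^2)
sse_red  <- sum(resid(reduced)^2)

# k = number of parameters needed for the effect being tested
k <- df.residual(reduced) - df.residual(full)

# F = (change in SSE per parameter) / (error mean square of the full model)
F_by_hand <- ((sse_red - sse_full) / k) / (sse_full / df.residual(full))
P_by_hand <- pf(F_by_hand, k, df.residual(full), lower.tail = FALSE)

# The same comparison done by R
tab <- anova(reduced, full)
c(by_hand = F_by_hand, anova = tab$F[2])
```

Any row of a custom ANOVA table can be built this way by choosing the appropriate pair of models.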
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> log.grubs<-log(GRUBS+1)
>
> # TYPE I SS, two ways
>
> lm.pop.sex<-lm(log.grubs~POPULATION*SEX)
> lm.sex.pop<-lm(log.grubs~SEX*POPULATION)
>
> anova(lm.pop.sex)
Analysis of Variance Table

Response: log.grubs
                Df  Sum Sq Mean Sq F value   Pr(>F)
POPULATION       1   0.088  0.0879  0.0553 0.814652
SEX              1  16.643 16.6425 10.4601 0.001658 **
POPULATION:SEX   1   6.606  6.6055  4.1517 0.044261 *
Residuals       99 157.513  1.5910
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(lm.sex.pop)
Analysis of Variance Table

Response: log.grubs
                Df  Sum Sq Mean Sq F value   Pr(>F)
SEX              1  15.554 15.5543  9.7762 0.002321 **
POPULATION       1   1.176  1.1762  0.7392 0.391980
SEX:POPULATION   1   6.606  6.6055  4.1517 0.044261 *
Residuals       99 157.513  1.5910
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> qqnorm(resid(lm.pop.sex))
> shapiro.test(resid(lm.pop.sex))

	Shapiro-Wilk normality test

data:  resid(lm.pop.sex)
W = 0.9869, p-value = 0.4061

> qqnorm(resid(lm.sex.pop))
> shapiro.test(resid(lm.sex.pop))

	Shapiro-Wilk normality test

data:  resid(lm.sex.pop)
W = 0.9869, p-value = 0.4061
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> par(mfrow=c(1,2))
> # standardize residuals as (residual - mean)/sd; note the placement of parentheses
> plot(predict(lm.pop.sex),(resid(lm.pop.sex)-mean(resid(lm.pop.sex)))/sd(resid(lm.pop.sex)),xlab="Predicted Values",ylab="Standardized Residuals",main="lm.pop.sex")
> plot(predict(lm.sex.pop),(resid(lm.sex.pop)-mean(resid(lm.sex.pop)))/sd(resid(lm.sex.pop)),xlab="Predicted Values",ylab="Standardized Residuals",main="lm.sex.pop")
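As an alternative sketch for the residual plot, base R's `rstandard()` returns standardized residuals that also account for leverage. The simulated data and the names `d`, `fit`, `z`, and `rs` are hypothetical; in a balanced design the two versions are proportional, so the plots look the same apart from scale.

```r
# Simulated balanced 2 x 2 design; names and values are hypothetical.
set.seed(3)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d))
fit <- lm(y ~ A * B, data = d)

# Simple z-scores of the residuals: (residual - mean) / sd
z <- (resid(fit) - mean(resid(fit))) / sd(resid(fit))

# Leverage-adjusted standardized residuals from base R
rs <- rstandard(fit)

# Residuals vs. predicted values, to eyeball homoscedasticity
plot(fitted(fit), rs,
     xlab = "Predicted Values", ylab = "Standardized Residuals")
abline(h = 0, lty = 2)
```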
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> # TYPE III SS, two ways
>
> lm.pop.sex<-lm(log.grubs~POPULATION*SEX)
> lm.sex.pop<-lm(log.grubs~SEX*POPULATION)
>
> drop1(lm.pop.sex,test="F")
Single term deletions

Model:
log.grubs ~ POPULATION * SEX
               Df Sum of Sq    RSS    AIC F value   Pr(F)
<none>                      157.51 51.752
POPULATION:SEX  1    6.6055 164.12 53.984  4.1517 0.04426 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> drop1(lm.sex.pop,test="F")
Single term deletions

Model:
log.grubs ~ SEX * POPULATION
               Df Sum of Sq    RSS    AIC F value   Pr(F)
<none>                      157.51 51.752
SEX:POPULATION  1    6.6055 164.12 53.984  4.1517 0.04426 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> # R will not "go all the way" so one can try the following
>
> lm.pop.by.sex<-lm(log.grubs~POPULATION*SEX)
> lm.pop.and.sex<-lm(log.grubs~POPULATION+SEX)
>
> drop1(lm.pop.by.sex,test="F")
Single term deletions

Model:
log.grubs ~ POPULATION * SEX
               Df Sum of Sq    RSS    AIC F value   Pr(F)
<none>                      157.51 51.752
POPULATION:SEX  1    6.6055 164.12 53.984  4.1517 0.04426 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> drop1(lm.pop.and.sex,test="F")
Single term deletions

Model:
log.grubs ~ POPULATION + SEX
           Df Sum of Sq    RSS    AIC F value    Pr(F)
<none>                  164.12 53.984
POPULATION  1    1.1762 165.29 52.719  0.7167 0.399264
SEX         1   16.6425 180.76 61.932 10.1405 0.001934 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
WRONG: this is an admixture of SS types. The interaction is tested against the error of the full model (RSS = 157.51), but the main effects are tested against the error of the additive model (RSS = 164.12).
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> # The following requires using the 'car' (companion to applied regression) package
> # you might have to install it.
> # Not needed for type 1
>
> library(car)
>
> anova(lm.pop.sex)
Analysis of Variance Table

Response: log.grubs
                Df  Sum Sq Mean Sq F value   Pr(>F)
POPULATION       1   0.088  0.0879  0.0553 0.814652
SEX              1  16.643 16.6425 10.4601 0.001658 **
POPULATION:SEX   1   6.606  6.6055  4.1517 0.044261 *
Residuals       99 157.513  1.5910
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(lm.sex.pop)
Analysis of Variance Table

Response: log.grubs
                Df  Sum Sq Mean Sq F value   Pr(>F)
SEX              1  15.554 15.5543  9.7762 0.002321 **
POPULATION       1   1.176  1.1762  0.7392 0.391980
SEX:POPULATION   1   6.606  6.6055  4.1517 0.044261 *
Residuals       99 157.513  1.5910
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> # The following requires using the 'car' (companion to applied regression) package
> # you might have to install it.
> # Not needed for type 1
>
> Anova(lm.pop.sex, type="II")
Anova Table (Type II tests)

Response: log.grubs
                Sum Sq Df F value   Pr(>F)
POPULATION       1.176  1  0.7392 0.391980
SEX             16.643  1 10.4601 0.001658 **
POPULATION:SEX   6.606  1  4.1517 0.044261 *
Residuals      157.513 99
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Anova(lm.sex.pop, type="II")
Anova Table (Type II tests)

Response: log.grubs
                Sum Sq Df F value   Pr(>F)
SEX             16.643  1 10.4601 0.001658 **
POPULATION       1.176  1  0.7392 0.391980
SEX:POPULATION   6.606  1  4.1517 0.044261 *
Residuals      157.513 99
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
BIOL 582 Factorial Model ANOVA Example
• Example from pupfish-parasite data in R
> # The following requires using the 'car' (companion to applied regression) package
> # you might have to install it.
> # Not needed for type 1
>
> Anova(lm.pop.sex, type="III")
Anova Table (Type III tests)

Response: log.grubs
                Sum Sq Df F value    Pr(>F)
(Intercept)     99.054  1 62.2572 4.119e-12 ***
POPULATION       0.632  1  0.3973   0.52995
SEX              1.307  1  0.8214   0.36696
POPULATION:SEX   6.606  1  4.1517   0.04426 *
Residuals      157.513 99
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Anova(lm.sex.pop, type="III")
Anova Table (Type III tests)

Response: log.grubs
                Sum Sq Df F value    Pr(>F)
(Intercept)     99.054  1 62.2572 4.119e-12 ***
SEX              1.307  1  0.8214   0.36696
POPULATION       0.632  1  0.3973   0.52995
SEX:POPULATION   6.606  1  4.1517   0.04426 *
Residuals      157.513 99
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
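One caveat: Type III tests of main effects depend on how the factors are coded. With R's default treatment contrasts, a "main effect" row tests that factor only at the reference level of the other factor; sum-to-zero contrasts test the effect averaged over the other factor. The sketch below shows this with base R's `drop1()` and a `scope` that forces every term to be dropped in turn. The data are simulated and all object names are hypothetical; this is not the pupfish analysis.

```r
# Simulated, slightly unbalanced 2 x 2 design; names and values are hypothetical.
set.seed(4)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d <- d[-(1:6), ]  # drop a few rows to unbalance the cells
d$y <- rnorm(nrow(d)) + ifelse(d$A == "a2" & d$B == "b2", 1.5, 0)

# Same model, two codings of the factors
fit_trt <- lm(y ~ A * B, data = d)  # default treatment contrasts
fit_sum <- lm(y ~ A * B, data = d,
              contrasts = list(A = "contr.sum", B = "contr.sum"))

# scope = . ~ . drops each term from the full model, main effects included
t_trt <- drop1(fit_trt, scope = . ~ ., test = "F")
t_sum <- drop1(fit_sum, scope = . ~ ., test = "F")

t_trt  # main-effect rows: effect at the reference level of the other factor
t_sum  # main-effect rows: effect averaged over levels of the other factor
```

The interaction row is identical under both codings; only the main-effect rows change. Under treatment coding, the F for `A` equals the squared t statistic of the `Aa2` coefficient in `summary(fit_trt)`.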
BIOL 582 Factorial Model ANOVA Example
• General Rule of Thumb
  • If the interaction is significant, do not evaluate “main” effects; just evaluate the interaction
  • If the interaction is not significant, evaluate the main effects (ignore the interaction or remove it from the model)
> interaction.plot(POPULATION,SEX,log.grubs)
BIOL 582 Factorial Model ANOVA Example
• General Rule of Thumb
  • If the interaction is significant, do not evaluate “main” effects; just evaluate the interaction
  • If the interaction is not significant, evaluate the main effects (ignore the interaction or remove it from the model)
> library(gplots) # requires that gplots package is installed
> group<-factor(paste(SEX,POPULATION,sep=".")) # the argument is lowercase 'sep'
> plotmeans(log.grubs~group)
BIOL 582 Multiple Comparisons
• SS Type is not important.
• Multiple comparison tests like Tukey’s HSD use the SSE of the full model to calculate standard error.
• Example from pupfish-parasite data in R
> sex<-factor(SEX); pop<-factor(POPULATION)
>
> TukeyHSD(aov(log.grubs~pop*sex))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = log.grubs ~ pop * sex)

$pop
          diff        lwr       upr     p adj
2-1 0.05856328 -0.4358007 0.5529273 0.8146523

$sex
        diff       lwr      upr     p adj
M-F 0.805493 0.3016897 1.309296 0.0020137

$`pop:sex`
              diff        lwr       upr     p adj
2:F-1:F -0.2072937 -1.0667296 0.6521422 0.9220590
1:M-1:F  0.3153772 -0.6361571 1.2669114 0.8222764
2:M-1:F  1.1630006  0.1180954 2.2079059 0.0228731
1:M-2:F  0.5226708 -0.3367651 1.3821067 0.3894327
2:M-2:F  1.3702943  0.4085045 2.3320841 0.0018319
2:M-1:M  0.8476235 -0.1972817 1.8925287 0.1539578
BIOL 582 Final Remarks
• Factorial ANOVA is essentially one-way ANOVA broken into subcomponents of variation.
• Consider this example
> lm.sex.pop<-lm(log.grubs~sex*pop)
> group<-factor(paste(SEX,POPULATION,sep=".")) # the argument is lowercase 'sep'
> lm.group<-lm(log.grubs~group)
>
> summary(lm.sex.pop)

Call:
lm(formula = log.grubs ~ sex * pop)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8590 -0.8546  0.1149  0.7944  2.8303

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.1649     0.2575  16.176   <2e-16 ***
sexM          0.3154     0.3641   0.866   0.3885
pop2         -0.2073     0.3289  -0.630   0.5300
sexM:pop2     1.0549     0.5177   2.038   0.0443 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.261 on 99 degrees of freedom
Multiple R-squared: 0.129, Adjusted R-squared: 0.1026
F-statistic: 4.889 on 3 and 99 DF, p-value: 0.003273
BIOL 582 Final Remarks
• Factorial ANOVA is essentially one-way ANOVA broken into subcomponents of variation.
• Consider this example
> lm.sex.pop<-lm(log.grubs~sex*pop)
> group<-factor(paste(SEX,POPULATION,sep=".")) # the argument is lowercase 'sep'
> lm.group<-lm(log.grubs~group)
>
> summary(lm.group)

Call:
lm(formula = log.grubs ~ group)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8590 -0.8546  0.1149  0.7944  2.8303

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.1649     0.2575  16.176  < 2e-16 ***
groupF.2     -0.2073     0.3289  -0.630  0.52995
groupM.1      0.3154     0.3641   0.866  0.38852
groupM.2      1.1630     0.3999   2.909  0.00448 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.261 on 99 degrees of freedom
Multiple R-squared: 0.129, Adjusted R-squared: 0.1026
F-statistic: 4.889 on 3 and 99 DF, p-value: 0.003273
BIOL 582 Final Remarks
• Factorial ANOVA is essentially one-way ANOVA broken into subcomponents of variation.
• Consider this example
> TukeyHSD(aov(log.grubs~sex*pop))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = log.grubs ~ sex * pop)

$sex
         diff       lwr      upr     p adj
M-F 0.7938817 0.2900784 1.297685 0.0023212

$pop
         diff        lwr       upr     p adj
2-1 0.2101225 -0.2842415 0.7044865 0.4010581

$`sex:pop`
              diff        lwr       upr     p adj
M:1-F:1  0.3153772 -0.6361571 1.2669114 0.8222764
F:2-F:1 -0.2072937 -1.0667296 0.6521422 0.9220590
M:2-F:1  1.1630006  0.1180954 2.2079059 0.0228731
F:2-M:1 -0.5226708 -1.3821067 0.3367651 0.3894327
M:2-M:1  0.8476235 -0.1972817 1.8925287 0.1539578
M:2-F:2  1.3702943  0.4085045 2.3320841 0.0018319

> TukeyHSD(aov(log.grubs~group))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = log.grubs ~ group)

$group
              diff        lwr       upr     p adj
F.2-F.1 -0.2072937 -1.0667296 0.6521422 0.9220590
M.1-F.1  0.3153772 -0.6361571 1.2669114 0.8222764
M.2-F.1  1.1630006  0.1180954 2.2079059 0.0228731
M.1-F.2  0.5226708 -0.3367651 1.3821067 0.3894327
M.2-F.2  1.3702943  0.4085045 2.3320841 0.0018319
M.2-M.1  0.8476235 -0.1972817 1.8925287 0.1539578
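The equivalence between the factorial model and the one-way "group" model can also be checked directly. The sketch below uses simulated data with hypothetical names (`d`, `fit_fact`, `fit_group`) and `interaction()` to build the group factor. It suggests that both models give the same fitted values and the same SSE, because they estimate the same four cell means.

```r
# Simulated 2 x 2 design; names and values are hypothetical.
set.seed(5)
d <- expand.grid(sex = factor(c("F", "M")),
                 pop = factor(c("1", "2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d), mean = 4)

# One-way grouping factor with one level per sex-by-population cell
d$group <- interaction(d$sex, d$pop)

fit_fact  <- lm(y ~ sex * pop, data = d)  # factorial parameterization
fit_group <- lm(y ~ group,     data = d)  # one-way parameterization

# Same predictions and same residual sum of squares
all.equal(fitted(fit_fact), fitted(fit_group))
all.equal(deviance(fit_fact), deviance(fit_group))
```

The factorial model simply repartitions the among-group variation into the two main effects and the interaction.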
BIOL 582 Final Remarks
• Next time using R, try this
> plot(lm.sex.pop)
• or
> par(mfcol=c(2,2))
> plot(lm.sex.pop)
• It is super cool!