Topics, Summer 2008

Day 1. Introduction
Day 2. Samples and populations
Day 3. Evaluating relationships
Day 4. Regression and Analysis of Variance (ANOVA)
  • ANOVA as a type of regression
  • Main effects and interactions
Day 5. Logistic regression and mixed effects models
  • Probability, odds, and log odds (logit)
  • The logistic regression model
  • Mixing fixed and random effects

Regression
• Linear model
  • designates one of the variables as the dependent variable, whose variation is being predicted
  • the other variables are then independent variables, whose weighted values are added together to predict that variation
• Components of multiple linear regression
  • intercept (if included) models the expected value of the dependent variable when all independent variables are set to 0
  • use a t-test to evaluate the hypothesis that the coefficient for each independent variable is different from 0
  • R² measures the proportion of variation accounted for by the model as a whole
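The components listed above can be sketched in R as follows. This is a minimal illustration on simulated data; the variable names (x1, x2, y) and the true coefficients are assumptions made for the example and do not come from the course datasets.

```r
# Simulated data: names and true coefficients are illustrative assumptions
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)   # y is the dependent variable
s   <- summary(fit)

coef(fit)                # intercept b0 and slopes b1, b2
s$coefficients           # estimate, std. error, t value and p-value per coefficient
s$r.squared              # proportion of variation accounted for by the model

# The intercept is the predicted value when all independent variables are 0
at_zero <- predict(fit, newdata = data.frame(x1 = 0, x2 = 0))

# R-squared computed by hand: 1 - residual SS / total SS
r2_manual <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
```

Here at_zero should agree with the fitted intercept, and r2_manual with the R² reported by summary().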

Regression and ANOVA
• Multiple linear regression
  • use the lm(y~x1+x2+…+xq) command to fit the model: ŷ = b0 + b1·x1 + b2·x2 + … + bq·xq
• Analysis of Variance (ANOVA) is a family of subtypes of linear regression where:
  • all of the independent (predictor) variables are nominal variables (factors)
  • an F-test is used to evaluate all levels of a factor together
  • R² measures the efficacy of the model as a whole (as in other linear regression models)
  • use anova(lm(y~x1+…+xq)) or aov(y~x1+…+xq)
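To illustrate that ANOVA is a linear regression with nominal predictors, the sketch below fits the same one-factor model through lm() and through aov(). The data are simulated; the factor name group and its level means are assumptions made for the example.

```r
# Simulated data: the factor 'group' and its level means are illustrative assumptions
set.seed(2)
group <- factor(rep(c("a", "b", "c"), each = 20))
y     <- c(rnorm(20, mean = 10), rnorm(20, mean = 12), rnorm(20, mean = 11))

fit_lm  <- lm(y ~ group)          # regression with a nominal predictor
tab_lm  <- anova(fit_lm)          # F-test for all levels of 'group' together
fit_aov <- aov(y ~ group)         # same model through the ANOVA interface
tab_aov <- summary(fit_aov)[[1]]  # ANOVA table as a data frame

f_lm  <- tab_lm[["F value"]][1]
f_aov <- tab_aov[["F value"]][1]

# R-squared from the regression, and from the ANOVA sums of squares
r2    <- summary(fit_lm)$r.squared
r2_ss <- tab_lm[["Sum Sq"]][1] / sum(tab_lm[["Sum Sq"]])
```

Both routes should give the same F statistic, and for this one-factor model R² equals the factor's sum of squares divided by the total sum of squares.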

Logistic regression
• dependent variable is a nominal variable that can take one of two values, such as:
  • [i:] or [I] in the Hillenbrand et al. (1995) dataset
  • NP or VP in the Bresnan et al. (2007) dataset
• Joint probability with only one independent variable: p(y, x1) = p(y|x1) × p(x1)
• can be made into a linear model using the logit
  • odds of an observed value: p / (1-p)
  • log odds: ln(p / (1-p))
• The logistic regression model: logit(y) = b0 + b1·x1 + b2·x2 + … + bq·xq
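The conversions between probability, odds, and log odds, and a logistic regression fit, can be sketched in R as follows. The slides do not name a fitting command for this model; glm() with family = binomial is the standard base-R choice and is used here as an assumption. The data are simulated, and x1 and the true coefficients are illustrative only.

```r
# Probability -> odds -> log odds, and back
p        <- c(0.1, 0.5, 0.9)
odds     <- p / (1 - p)
log_odds <- log(odds)          # same values as qlogis(p)
p_back   <- plogis(log_odds)   # inverse logit: exp(z) / (1 + exp(z))

# Simulated binary outcome: x1 and the true coefficients are illustrative assumptions
set.seed(3)
x1 <- rnorm(200)
y  <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * x1))

fit <- glm(y ~ x1, family = binomial)  # logit link is the default for binomial
b   <- coef(fit)                       # b0 and b1 on the log-odds scale

# Predicted probability at x1 = 1, computed from the linear predictor
p_hat     <- plogis(b[["(Intercept)"]] + b[["x1"]] * 1)
p_predict <- predict(fit, newdata = data.frame(x1 = 1), type = "response")
```

A probability of 0.5 corresponds to odds of 1 and log odds of 0. The coefficients are estimated on the log-odds scale, so a predicted probability is obtained by applying the inverse logit to b0 + b1·x1.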
