Fighting for fame, scrambling for fortune, where is the end? Great wealth and glorious honor, no more than a night dream. Lasting pleasure, worry-free forever, who can attain? Categorical Data Analysis. Chapter 4: Introduction to Generalized Linear Models. 3-Way Tables.
 
                
Categorical Data Analysis Chapter 4: Introduction to Generalized Linear Models
Marginal vs. Conditional • Marginal independence: marginal odds ratio = 1 • Conditional independence: conditional odds ratios = 1 at every level of the controlled covariate • The observed effect of X on Y might simply reflect the effects of other covariates on both X and Y
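The distinction between the two kinds of independence can be sketched numerically. The counts below are hypothetical (they do not come from the slides) and are chosen so that X and Y are conditionally independent within each level of a third variable Z, yet associated in the collapsed table. The helper name `odds_ratio` is an assumption made for this illustration.

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical partial tables: rows are X = 1, X = 0; columns are Y = 1, Y = 0.
partial_tables = {
    "Z=1": [[18, 12], [12, 8]],
    "Z=2": [[2, 8], [8, 32]],
}

# Conditional odds ratios: one per level of Z.
conditional_ors = {z: odds_ratio(t) for z, t in partial_tables.items()}

# Marginal table: add the partial tables cell by cell, ignoring Z.
marginal_table = [
    [sum(t[i][j] for t in partial_tables.values()) for j in range(2)]
    for i in range(2)
]
marginal_or = odds_ratio(marginal_table)

print("conditional odds ratios:", conditional_ors)
print("marginal table:", marginal_table)
print("marginal odds ratio:", marginal_or)
```

In this made-up example both conditional odds ratios equal 1 while the marginal odds ratio does not, so the marginal X-Y association is produced entirely by Z.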
Generalized Linear Models (GLM) 3 components: • Random component (Y1, Y2, …, Yn): n independent observations from a distribution in the exponential family (not necessarily i.i.d.): for i = 1, 2, …, n, where a, b, c are all positive functions. E.g. Poisson, binomial, exponential, normal
Systematic component: the right-hand side of the model equation; often a linear combination of explanatory variables, written in matrix form as X’B
Link component: the link between μ (= E(Y)) and X’B. In a model equation g(μ) = X’B, g(.) is called the link function, a monotonic differentiable function • Most common link: the canonical link, e.g. for normal, binomial, Poisson
Canonical Link • Normal data: identity link • Binary data: logit link • Count data: log link
Deviance • Deviance measures the loss of information from data reduction • Saturated model: the most general model which fits each observation • Could be each subject’s response or • Could be each cell frequency
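As a rough illustration of deviance, the sketch below uses the standard Poisson form D = 2 Σ [y log(y/μ) − (y − μ)], taking y log y as 0 when y = 0. The slides do not state a formula, so the Poisson choice is an assumption, and the counts and the name `poisson_deviance` are hypothetical.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance of fitted means mu against the saturated model."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # The y * log(y / mu) term is taken as 0 when y == 0.
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical counts, for illustration only.
counts = [0, 2, 3, 7]

# Saturated model: one parameter per observation, so fitted values equal the data.
saturated_fit = [float(c) for c in counts]

# Intercept-only model: every fitted value equals the sample mean.
mean_count = sum(counts) / len(counts)
null_fit = [mean_count] * len(counts)

deviance_saturated = poisson_deviance(counts, saturated_fit)
deviance_null = poisson_deviance(counts, null_fit)
print("saturated model deviance:", deviance_saturated)
print("intercept-only deviance:", deviance_null)
```

The saturated model reproduces the data exactly, so its deviance is zero; any simpler model has a positive deviance that reflects the information lost by summarizing the data with fewer parameters.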
GLMs for Binary Data • Linear probability model (identity link) • Logistic regression model (logit link) • Probit model (probit link)
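The three links differ in how a linear predictor η is mapped back to a probability. The sketch below writes out the inverse of each link with the standard library only. The function names and the η values are assumptions made for illustration.

```python
import math

def identity_inverse(eta):
    """Linear probability model: the probability is the linear predictor itself."""
    return eta

def logit_inverse(eta):
    """Logistic regression: inverse of log(p / (1 - p))."""
    return 1.0 / (1.0 + math.exp(-eta))

def probit_inverse(eta):
    """Probit model: standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Arbitrary linear-predictor values, for illustration only.
rows = []
for eta in (-2.0, 0.0, 1.5):
    rows.append((eta, identity_inverse(eta), logit_inverse(eta), probit_inverse(eta)))

for eta, p_identity, p_logit, p_probit in rows:
    print(f"eta={eta:5.1f}  identity={p_identity:6.3f}  "
          f"logit={p_logit:6.3f}  probit={p_probit:6.3f}")
```

The identity link can return values below 0 or above 1, which is the usual structural objection to the linear probability model; the logit and probit inverses always stay strictly between 0 and 1.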
Example: Snoring vs. Heart Disease Note: The fits will change if relative spacings between scores change.
Example: 2x2 Tables • Binary covariate X and response Y • Logit link GLM:
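The model equation is missing from this slide in the transcript. Assuming it is the usual logit[P(Y=1 | X=x)] = α + βx with x coded 0/1, the model is saturated for a 2x2 table, so α and β can be read off the sample proportions, and exp(β) equals the sample odds ratio. The counts below are hypothetical.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

# Hypothetical counts: first index is X, second is Y.
n11, n10 = 30, 70   # X = 1: Y = 1, Y = 0
n01, n00 = 10, 90   # X = 0: Y = 1, Y = 0

p1 = n11 / (n11 + n10)   # sample P(Y = 1 | X = 1)
p0 = n01 / (n01 + n00)   # sample P(Y = 1 | X = 0)

# With two free parameters and two proportions, the fit is exact.
alpha = logit(p0)              # log odds of Y = 1 when X = 0
beta = logit(p1) - logit(p0)   # log odds ratio

sample_odds_ratio = (n11 * n00) / (n10 * n01)
print("alpha:", alpha)
print("beta:", beta)
print("exp(beta):", math.exp(beta), "sample odds ratio:", sample_odds_ratio)
```

Under this assumed parameterization, testing β = 0 is the same as testing that the odds ratio of the 2x2 table equals 1.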
GLMs for Count Data • Poisson loglinear model • Count data: the number of times a certain event occurs over time, space, or the like, e.g. the number of car accidents for a random sample of 100 drivers in 2005 • Rate data: count/(time, space, or the like), e.g. the car accident rates for a random sample of 100 drivers in 2005 (Sec. 9.7.1, p. 385)
Example: Horseshoe Crabs Data: Table 4.3 (p. 127) • Y= # of satellites • X= carapace width • Poisson loglinear model • Poisson GLM with identity link
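A minimal sketch of fitting the Poisson loglinear model log(μ) = α + βx with one covariate, using Newton-Raphson on the log-likelihood (which coincides with Fisher scoring for the canonical log link). The widths and counts below are made up for illustration and are not the values in Table 4.3. The function name `fit_poisson_loglinear`, the starting values, and the fixed iteration count are assumptions. The identity-link variant mentioned on the slide is not covered here.

```python
import math

def fit_poisson_loglinear(x, y, iterations=25):
    """Maximum likelihood fit of log(mu_i) = alpha + beta * x_i."""
    alpha = math.log(sum(y) / len(y))   # start at the intercept-only fit
    beta = 0.0
    for _ in range(iterations):
        mu = [math.exp(alpha + beta * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix [[i00, i01], [i01, i11]].
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system information * step = score.
        alpha += (i11 * s0 - i01 * s1) / det
        beta += (i00 * s1 - i01 * s0) / det
    return alpha, beta

# Hypothetical data, NOT the horseshoe crab table.
widths = [22.0, 23.5, 25.0, 26.5, 28.0, 29.5]
satellites = [0, 2, 1, 4, 3, 6]

alpha_hat, beta_hat = fit_poisson_loglinear(widths, satellites)
fitted = [math.exp(alpha_hat + beta_hat * w) for w in widths]
print("alpha:", alpha_hat, "beta:", beta_hat)
print("fitted means:", fitted)
```

Under the log link, exp(β) is the multiplicative change in the expected count for a one-unit increase in x. In practice a GLM routine from a statistics package would normally be used instead of hand-written iterations.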
Inference for GLMs • Goodness of fit: • Measure: deviance • Test: likelihood-ratio (LR) tests • Model comparison: LR tests • Residuals: Pearson and standardized
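For nested models, the LR statistic is the difference between their deviances, referred to a chi-squared distribution with degrees of freedom equal to the number of extra parameters. The sketch below handles only the one-parameter case, where the chi-squared tail probability can be written with the complementary error function. The deviance values are hypothetical and the function name `lr_test_one_df` is an assumption.

```python
import math

def lr_test_one_df(deviance_reduced, deviance_full):
    """LR test for two nested models that differ by a single parameter.

    Returns the statistic and its p-value from the chi-squared
    distribution with 1 degree of freedom.
    """
    statistic = deviance_reduced - deviance_full
    # For 1 df: P(chi2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical deviances, for illustration only.
deviance_without_x = 30.0   # reduced model (predictor dropped)
deviance_with_x = 22.5      # full model (predictor included)

lr_statistic, lr_p_value = lr_test_one_df(deviance_without_x, deviance_with_x)
print("LR statistic:", lr_statistic)
print("p-value:", lr_p_value)
```

A small p-value suggests the reduced model fits significantly worse, so the extra parameter is worth keeping.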