This overview covers multiple regression techniques, emphasizing the inclusion of both continuous and categorical predictor variables. It examines the relative abundance of plant functional types across temperate North America, assessing the impact of climate variables on C3 and C4 grass distributions. Key concepts such as analysis of variance, collinearity, and model selection through forward and backward selection are discussed. The importance of transforming skewed data, addressing computational problems caused by collinearity, and understanding the matrix algebra behind ordinary least squares estimation is also highlighted.
Extensions of simple linear regression
• Multiple regression models: predictor variables are continuous
• Analysis of variance: predictor variables are categorical (grouping variables)
• But… general linear models can include both continuous and categorical predictors (see the sketch below)
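A minimal sketch of a general linear model that mixes predictor types, using Python's statsmodels formula interface. The data and column names (abundance, temperature, biome) are hypothetical placeholders, not the study data.

```python
# Sketch: a general linear model with one continuous and one categorical predictor.
# Column names and values are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "abundance":   [0.10, 0.25, 0.40, 0.05, 0.30, 0.55],
    "temperature": [8.2, 10.5, 14.1, 7.9, 12.3, 15.0],            # continuous predictor
    "biome":       ["grassland", "grassland", "shrubland",
                    "shrubland", "grassland", "shrubland"],        # categorical predictor
})

# C() codes 'biome' as a categorical (grouping) variable, so the single model
# combines regression-style and ANOVA-style terms.
model = smf.ols("abundance ~ temperature + C(biome)", data=df).fit()
print(model.summary())
```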
Relative abundance of C3 and C4 plants
• Paruelo & Lauenroth (1996)
• Geographic distribution and the effects of climate variables on the relative abundance of a number of plant functional types (PFTs): shrubs, forbs, succulents, C3 grasses and C4 grasses.
Data: 73 sites across temperate central North America
• Response variable: relative abundance of PFTs (based on cover, biomass, and primary production) for each site
• Predictor variables: longitude, latitude, mean annual temperature, mean annual precipitation, winter (%) precipitation, summer (%) precipitation, biome (grassland, shrubland)
Box 6.1: Relative abundances were transformed as ln(abundance + 1) because the raw data are positively skewed (illustrated below).
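A small illustration of that transform with made-up abundance values:

```python
# Sketch: ln(y + 1) transform for positively skewed relative abundances.
# Values are hypothetical, not the study data.
import numpy as np

rel_abundance = np.array([0.0, 0.02, 0.05, 0.10, 0.40, 0.85])
ln_abundance = np.log1p(rel_abundance)   # equivalent to np.log(rel_abundance + 1)
print(ln_abundance)
```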
Collinearity
• Causes computational problems: it drives the determinant of the matrix of predictor variables (X′X) close to zero, and matrix inversion essentially involves dividing by that determinant, so the result is very sensitive to small differences in the numbers (demonstrated below)
• Standard errors of the estimated regression slopes are inflated
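To illustrate the numerical point with purely synthetic data: as two predictors become nearly identical, the determinant of X′X collapses toward zero and the condition number explodes, which is what makes the inverse (and the slope standard errors) unstable.

```python
# Sketch: collinearity drives det(X'X) toward zero and the condition number up,
# making (X'X)^-1 unstable. Synthetic data only.
import numpy as np

rng = np.random.default_rng(1)
n = 73
x1 = rng.normal(size=n)

for noise in (1.0, 0.1, 0.001):            # less noise => stronger collinearity
    x2 = x1 + rng.normal(scale=noise, size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    XtX = X.T @ X
    print(f"noise={noise:6.3f}  det(X'X)={np.linalg.det(XtX):12.4e}  "
          f"cond(X'X)={np.linalg.cond(XtX):12.4e}")
```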
Detecting collinearity
• Check tolerance values
• Plot the variables
• Examine a matrix of correlation coefficients between predictor variables (both numerical checks are sketched below)
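A sketch of the two numerical checks, tolerance (the reciprocal of the variance inflation factor) and the correlation matrix of the predictors. The predictor values here are simulated stand-ins, not the Paruelo & Lauenroth data.

```python
# Sketch: tolerance values (1/VIF) and a correlation matrix for the predictors.
# Simulated predictors; 'temp' is deliberately made correlated with 'lat'.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
lat = rng.uniform(30, 50, size=73)
long = rng.uniform(-110, -90, size=73)
temp = 40 - 0.7 * lat + rng.normal(scale=1.0, size=73)

predictors = pd.DataFrame({"lat": lat, "long": long, "temp": temp})

# Correlation matrix between predictor variables
print(predictors.corr().round(2))

# Tolerance = 1 / VIF; values near zero flag collinearity
X = np.column_stack([np.ones(len(predictors)), predictors.values])
for i, name in enumerate(predictors.columns, start=1):
    vif = variance_inflation_factor(X, i)
    print(f"{name}: VIF={vif:6.2f}  tolerance={1/vif:6.3f}")
```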
Dealing with collinearity
• Omit predictor variables if they are highly correlated with other predictor variables that remain in the model
ln(C3) = β0 + β1(lat) + β2(long) + β3(lat × long), after centering both lat and long
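A sketch of fitting that centered interaction model. The data values are made up; only the model form follows the slide.

```python
# Sketch: regress ln(C3 + 1) on centered latitude, centered longitude and their
# interaction. Hypothetical data standing in for the 73 study sites.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "lat":  rng.uniform(30, 50, size=73),
    "long": rng.uniform(-110, -90, size=73),
    "C3":   rng.uniform(0, 1, size=73),
})

# Centre both predictors before forming the interaction term
df["clat"] = df["lat"] - df["lat"].mean()
df["clong"] = df["long"] - df["long"].mean()
df["lnC3"] = np.log1p(df["C3"])

model = smf.ols("lnC3 ~ clat + clong + clat:clong", data=df).fit()
print(model.params)
```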
Matrix algebra approach to OLS estimation of multiple regression models (numeric sketch below)
• Y = Xβ + ε
• X′Xb = X′Y
• b = (X′X)⁻¹X′Y
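A minimal numeric sketch of those normal equations on simulated data, solving (X′X)b = X′Y rather than explicitly inverting X′X, which gives the same estimates but is numerically safer.

```python
# Sketch: OLS estimates from the normal equations (X'X) b = X'Y. Simulated data.
import numpy as np

rng = np.random.default_rng(4)
n = 73
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 predictors
beta_true = np.array([1.0, 0.5, -0.3])
Y = X @ beta_true + rng.normal(scale=0.1, size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)        # solves (X'X) b = X'Y
print(b)                                      # close to beta_true
print(np.linalg.lstsq(X, Y, rcond=None)[0])   # same answer via least squares
```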
Criteria for “best” fitting in multiple regression with p predictors.
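As a hedged sketch of how such criteria can be compared across candidate predictor subsets, here adjusted r² and AIC (smaller AIC is better); the data and candidate formulas are placeholders, not the criteria table from the source.

```python
# Sketch: compare candidate models by adjusted R^2 and AIC. Placeholder data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "lat":  rng.uniform(30, 50, size=73),
    "long": rng.uniform(-110, -90, size=73),
})
df["lnC3"] = 0.05 * df["lat"] + rng.normal(scale=0.2, size=73)

candidates = ["lnC3 ~ lat", "lnC3 ~ long", "lnC3 ~ lat + long"]
for formula in candidates:
    fit = smf.ols(formula, data=df).fit()
    print(f"{formula:22s}  adj R2={fit.rsquared_adj:6.3f}  AIC={fit.aic:8.2f}")
```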