Find the combination of variables that explains the most variability in the simplest possible model, using automated procedures with caution. See how principal components can serve as explanatory variables in a regression model to predict ratings. Apply stepwise regression (mixed) and best subsets techniques for variable selection. Differentiate between the AIC and BIC criteria. Learn how to handle insignificant terms in a model, weighing them against parsimony and adjusted R². Practice problem and lab assignment included.
Stat 324 – Day 25 Penalized Regression
Last Time - Variable selection • Want to find the combination of variables that explains the most variability in the simplest possible model • Look for variables that explain a higher percentage of the remaining unexplained variation (partial correlation coefficients) • Can use automated procedures … with caution
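The "percentage of the remaining unexplained variation" idea can be illustrated with a first-order partial correlation. This is only a minimal sketch: the data are made up, and the helper names `pearson` and `partial_corr` are hypothetical, not taken from the course materials.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(y, x, z):
    """Correlation between y and x after removing the linear effect of z from both."""
    r_yx, r_yz, r_xz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

# Made-up illustration data (hypothetical, not from the slides)
z = [1, 2, 3, 4, 5, 6]                    # variable already in the model
x = [2, 1, 4, 3, 6, 5]                    # candidate variable
y = [3.1, 2.9, 7.2, 6.8, 11.1, 10.9]      # response

r_partial = partial_corr(y, x, z)
# Squared partial correlation: share of the variation left unexplained by z
# that x would explain if it were added next.
share_of_remaining = r_partial ** 2
```

A candidate with a larger `share_of_remaining` is the more attractive next variable, which is roughly what forward or mixed stepwise procedures look for at each step.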
Principal components • Example: Have ranked communities on 9 variables. What best distinguishes the communities? • Climate and Terrain (higher scores are better) • Housing (lower scores are better) • Health Care & the Environment (higher) • Crime (lower scores are better) • Transportation (higher) • Education (higher) • The Arts (higher) • Recreation (higher) • Economics (higher)
Example • The first principal component formula: • Could then be used as an explanatory variable in a regression model to predict rating • Second component can also be used with the bonus of being orthogonal to the first • *probably should standardize first
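A minimal sketch of using principal components as explanatory variables, assuming made-up data (three criteria and a rating, all hypothetical) rather than the communities data from the slide. It standardizes first, as the slide recommends, and checks that the first two components are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 50 "communities", 3 criteria (two of them correlated), and a rating
n = 50
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.3 * rng.normal(size=n),
    base + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
])
rating = 2.0 * base + rng.normal(scale=0.5, size=n)

# Standardize each column first so no variable dominates because of its units
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Principal components from the eigen-decomposition of the correlation matrix
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # reorder: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                         # column k holds the scores on PC(k+1)
pc1, pc2 = scores[:, 0], scores[:, 1]

# Regress the rating on the first two components (with an intercept)
design = np.column_stack([np.ones(n), pc1, pc2])
coefs, *_ = np.linalg.lstsq(design, rating, rcond=None)

# Orthogonality: the sample correlation between PC1 and PC2 scores is ~0
pc_corr = np.corrcoef(pc1, pc2)[0, 1]
```

Because the component scores are uncorrelated, adding PC2 to the model does not change the estimated coefficient on PC1, which is the "bonus" of orthogonality.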
Example • Here is how the original variables correlate with the first three principal components • Five variables have a strong correlation with PC1 (communities with better housing tend to have better health, etc.) • PC1 is really about quality of arts • PC2 is about health • PC3 suggests places with high crime tend to also have better recreation facilities
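A table of correlations between the original variables and the components can be computed along these lines. This is a sketch on made-up data (four hypothetical criteria), not a reproduction of the table from the slide.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: 40 observations on 4 criteria (hypothetical, not the communities data)
shared = rng.normal(size=40)
X = np.column_stack([
    shared + 0.5 * rng.normal(size=40),
    shared + 0.5 * rng.normal(size=40),
    -shared + 0.5 * rng.normal(size=40),
    rng.normal(size=40),
])

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]                  # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Z @ eigvecs

# Row i, column k: correlation between original variable i and component k+1
n_vars = X.shape[1]
var_pc_corr = np.array([
    [np.corrcoef(X[:, i], scores[:, k])[0, 1] for k in range(3)]
    for i in range(n_vars)
])

# For standardized data the same table equals eigenvector * sqrt(eigenvalue)
shortcut = eigvecs[:, :3] * np.sqrt(eigvals[:3])
```

Reading down a column of `var_pc_corr` shows which variables a component is "about". The sign of an entire column is arbitrary, so only the pattern within a column matters.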
Last Time: AIC vs. BIC • AIC: tyer: 311.1, tiyer: 311.9, typer: 312.7, tiyper: 313.9 • BIC: tyer: 322.4, te: 322.7, tye: 324.2, ter: 324.6 • The idea behind these measures is similar, but BIC has a larger penalty for the number of variables, so it tends to be a bit more conservative (often choosing smaller, less complex models)
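The difference in penalties can be seen in a small sketch. It assumes the common least-squares forms of the two criteria (up to an additive constant), which may differ from what the course software reports, and the three fits are hypothetical numbers, not the models from the slide.

```python
import math

def aic(sse, n, k):
    """AIC for a least-squares fit, up to an additive constant: n*ln(SSE/n) + 2k."""
    return n * math.log(sse / n) + 2 * k

def bic(sse, n, k):
    """BIC for the same fit: n*ln(SSE/n) + k*ln(n)."""
    return n * math.log(sse / n) + k * math.log(n)

# Hypothetical fits on n = 60 observations: (label, SSE, number of estimated coefficients)
n = 60
fits = [("small", 130.0, 3), ("medium", 122.0, 4), ("large", 115.0, 6)]

# Smaller is better for both criteria
by_aic = min(fits, key=lambda f: aic(f[1], n, f[2]))[0]
by_bic = min(fits, key=lambda f: bic(f[1], n, f[2]))[0]
```

With these made-up numbers AIC prefers the "medium" model while BIC prefers the "small" one: each extra coefficient costs 2 under AIC but ln(60), about 4.1, under BIC.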
Other notes • Insignificant terms • Doesn’t really hurt to leave them in the model as long as you clarify that they are not significant • But weigh this against parsimony and adjusted R² • Could keep them in at the request of a subject matter expert or for the sake of completeness (e.g., lower-order terms of a polynomial, a set of indicator variables, indicators in the presence of interactions)
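The trade-off with adjusted R² can be made concrete with a short sketch. The numbers are hypothetical and the function name `adjusted_r2` is illustrative only.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors (plus an intercept) fit to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical: a sixth, insignificant predictor nudges R^2 from 0.800 to 0.801
n = 40
before = adjusted_r2(0.800, n, 5)
after = adjusted_r2(0.801, n, 6)
```

Here `after` is smaller than `before`: R² itself never decreases when a term is added, but adjusted R² drops when the new term explains too little to justify the lost degree of freedom.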
Today • Another method, developed to deal with multicollinearity, is increasingly popular as a form of variable selection as well
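The slide does not name the method. Assuming it refers to a ridge-type penalty (one common form of penalized regression, originally motivated by multicollinearity), a minimal sketch on made-up, nearly collinear data might look like this. The function name `ridge_coefs` and the penalty value are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up data with two nearly collinear predictors (hypothetical)
n = 80
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # almost a copy of x1
y = 3.0 * x1 + rng.normal(size=n)

X = np.column_stack([x1, x2])
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize predictors
yc = y - y.mean()                                    # center response, so no intercept is needed

def ridge_coefs(Xs, yc, lam):
    """Minimize ||yc - Xs b||^2 + lam * ||b||^2; lam = 0 gives ordinary least squares."""
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)

ols = ridge_coefs(Xs, yc, 0.0)      # unpenalized fit, unstable under collinearity
ridge = ridge_coefs(Xs, yc, 10.0)   # penalized fit, coefficients shrunk toward zero
```

The penalty shrinks the coefficient vector, which stabilizes estimates when predictors are highly correlated. A ridge penalty does not set coefficients exactly to zero; penalties based on absolute values (the lasso) can, which is what makes penalized regression usable for variable selection.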
To Do • Practice problem • Wednesday/Thursday: Lab Assignment • Email Dr. Chance questions!