Correlation and Regression
Correlation and Regression It’s the Last Lecture Hooray!
Correlation • Analyze → Correlate → Bivariate… • Click over the variables you wish to correlate • Options… Can select descriptives and pairwise vs. listwise deletion • Pairwise deletion – each correlation uses all cases with data on both variables in that pair (the default) • Listwise deletion – only cases with data on all analyzed variables are included in any correlation
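The difference between the two deletion rules can be sketched in plain Python. This is a minimal illustration with hypothetical data (missing values coded as None), not SPSS's implementation:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists with no missing values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Three hypothetical variables; None marks a missing value.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 5.0, 4.0, None]
c = [1.0, None, 2.0, 3.0, 4.0]

# Pairwise deletion: for the (a, b) correlation, drop only cases missing a or b.
pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
r_pairwise = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])

# Listwise deletion: drop any case missing on ANY analyzed variable (a, b, or c).
complete = [(x, y) for x, y, z in zip(a, b, c)
            if x is not None and y is not None and z is not None]
r_listwise = pearson_r([p[0] for p in complete], [p[1] for p in complete])

print(round(r_pairwise, 3), round(r_listwise, 3))
```

Pairwise deletion keeps more cases per correlation, but different correlations in the same matrix may then be based on different subsets of cases.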
Correlation • Assumptions: • Linear relationship between variables • Inspect scatterplot • Normality • Shapiro-Wilk's W • Other issues: • Range restriction & heterogeneous subgroups • Identified methodologically • Outliers • Inspect scatterplot
Correlation • Partial Correlation – removes variance attributable to a 3rd variable, analogous to ANCOVA • Analyze → Correlate → Partial…
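For one control variable, the partial correlation has a closed form built from the three pairwise correlations. A sketch with hypothetical data, where a third variable z drives both x and y:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))"""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

# Hypothetical data: x and y both increase with z.
z = [1, 2, 3, 4, 5, 6]
x = [2, 4, 5, 8, 10, 11]
y = [1, 3, 4, 4, 6, 7]
print(round(partial_r(x, y, z), 3))
```

Because z accounts for most of the shared variance here, the partial correlation is much smaller in magnitude than the raw x–y correlation.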
Regression • Analyze → Regression → Linear… • Use if both the predictor(s) and the criterion variable are continuous • Dependent = Criterion • Independent = Predictor(s) • Statistics… • Regression Coefficients (b & β) • Estimates • Confidence intervals • Covariance matrix
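The unstandardized coefficient b and the standardized coefficient β that SPSS reports can be computed by hand for the one-predictor case. A minimal sketch with hypothetical data:

```python
import statistics

def ols_simple(x, y):
    """Intercept a, unstandardized slope b, and standardized slope (beta)
    for a one-predictor least-squares regression."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx                    # unstandardized slope
    a = my - b * mx                  # intercept
    # beta = b * (sd_x / sd_y); with a single predictor, beta equals Pearson r.
    beta = b * statistics.stdev(x) / statistics.stdev(y)
    return a, b, beta

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, beta = ols_simple(x, y)
print(round(a, 2), round(b, 2), round(beta, 4))
```

b is in the raw units of y per unit of x; β is what the slope would be if both variables were standardized, which is why it is comparable across predictors.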
Regression • Statistics… • Model fit • R square change • Descriptives • Part and partial correlations • Collinearity diagnostics • Recall that you don't want your predictors to be too highly related to one another • Collinearity/Multicollinearity – when predictors are too highly correlated with one another • Eigenvalues of the scaled and uncentered cross-products matrix, condition indices, and variance-decomposition proportions are displayed along with variance inflation factors (VIF) and tolerances for individual predictors • Tolerances should be > .2; VIF should be < 4
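Tolerance and VIF come from regressing each predictor on the remaining predictors: tolerance = 1 − R², VIF = 1/tolerance. With exactly two predictors that R² is just the squared correlation between them, which makes the rule of thumb easy to sketch (hypothetical data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """With two predictors, R^2 from regressing one on the other is r^2,
    so tolerance = 1 - r^2 and VIF = 1 / tolerance."""
    r2 = pearson_r(x1, x2) ** 2
    tolerance = 1 - r2
    return 1 / tolerance, tolerance

# Hypothetical predictors: x2 is nearly a linear copy of x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 2.0, 3.2, 3.9, 5.1, 6.0]
vif, tol = vif_two_predictors(x1, x2)
# Flag against the rules of thumb above: tolerance > .2, VIF < 4.
print(vif > 4, tol < 0.2)
```

Here the two predictors are nearly redundant, so both rules of thumb flag multicollinearity.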
Regression • Statistics… • Residuals • Durbin-Watson • Tests correlation among residuals (i.e. autocorrelation) - significant correlation implies nonindependent data • Clicking on this will also display a histogram of residuals, a normal probability plot of residuals, and the case numbers and standardized residuals for the 10 cases with the largest standardized residuals • Casewise diagnostics • Identifies outliers according to pre-specified criteria
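The Durbin-Watson statistic itself is a simple ratio over the residual series. A sketch with two hypothetical residual series, one roughly independent and one strongly trending:

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals.
    Values near 2 suggest no autocorrelation; values near 0 or 4 suggest
    positive or negative autocorrelation, respectively."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series.
independent = [0.5, 0.9, -1.2, 0.3, -0.4, -1.0, 0.8, 0.1]
trending    = [1.0, 0.9, 0.8, 0.6, 0.5, 0.3, 0.2, 0.1]  # slowly drifting

print(round(durbin_watson(independent), 2), round(durbin_watson(trending), 2))
```

The trending series yields a DW far below 2, the signature of positively autocorrelated (nonindependent) residuals.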
Regression • Plots… • Plot standardized residuals (*ZRESID) on y-axis and standardized predicted values (*ZPRED) on x-axis • Check “Normal probability plot” under “Standardized Residual Plots”
Regression • Assumptions: • Observations are independent • Linearity of regression • Look for residuals that get larger at extreme predicted values • Normality of residuals • Save unstandardized residuals: click Save… and, under "Residuals", check "Unstandardized" when you run your regression • Run a Shapiro-Wilk's W test on the saved variable (RES_1)
Regression • Normality in Arrays • Examine the normal probability plot of the residuals; the residuals should follow the normal distribution curve • [Plots shown: BAD vs. GOOD normal probability plots]
Regression • Homogeneity of Variance in Arrays • Look for residuals getting more spread out as a function of predicted value – i.e. a cone-shaped pattern in the plot of standardized residuals vs. standardized predicted values • [Plots shown: BAD vs. GOOD residual plots]
Logistic Regression • Analyze → Regression → Binary Logistic… • Use if the criterion is dichotomous [no assumptions about predictor(s)] • Use "Multinomial Logistic…" if the criterion is polychotomous (3+ groups) • Don't worry about that though
Logistic Regression • Assumptions: • Observations are independent • Criterion is dichotomous • No stats needed to show either one of these • Important issues: • Outliers • Save… Under "Influence" check "Cook's" and "Leverage values" • Cook's statistic – outlier = any case > 4/(n-k-1), where n = # of cases & k = # of predictors • Leverage values – outlier = anything > .5
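For the one-predictor linear case, leverage and Cook's distance have closed forms, so the 4/(n-k-1) cutoff can be sketched directly. This is a hand-rolled illustration with hypothetical data, not SPSS's saved variables:

```python
def cooks_and_leverage(x, y):
    """Leverage h_ii and Cook's distance D_i for a one-predictor OLS fit:
    h_ii = 1/n + (x_i - mean_x)^2 / Sxx
    D_i  = (e_i^2 / (p * MSE)) * h_ii / (1 - h_ii)^2,  p = # of parameters."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    p = 2                                  # parameters: intercept + slope
    mse = sum(e ** 2 for e in resid) / (n - p)
    lev = [1 / n + (xi - mx) ** 2 / sxx for xi in x]
    cooks = [(e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
             for e, h in zip(resid, lev)]
    return cooks, lev

# Hypothetical data; the last case is an outlier in y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 2.0, 2.9, 4.2, 5.0, 5.9, 7.1, 3.0]
cooks, lev = cooks_and_leverage(x, y)
k = 1                                      # number of predictors
cutoff = 4 / (len(x) - k - 1)              # the 4/(n-k-1) rule above
flagged = [i for i, d in enumerate(cooks) if d > cutoff]
print(flagged)
```

Only the deliberately planted outlier exceeds the cutoff; the rest of the cases fall well below it.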
Logistic Regression • Multicollinearity • Tolerance and/or VIF statistics aren't easily obtained with SPSS, so you'll just have to let this one go • Options… • Classification plots • Table of the actual # of Ss in each criterion group vs. predicted group membership – shows, in detail, how well the regression predicted the data
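The classification table is just a cross-tabulation of observed group against predicted group at the cut value. A minimal sketch with hypothetical predicted probabilities:

```python
def classification_table(actual, predicted_prob, cut=0.50):
    """2x2 table of actual group vs. predicted group at a given cut value,
    in the spirit of SPSS's classification table. actual holds 0/1 labels."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, p in zip(actual, predicted_prob):
        table[(a, 1 if p >= cut else 0)] += 1
    correct = table[(0, 0)] + table[(1, 1)]
    return table, 100.0 * correct / len(actual)

# Hypothetical probabilities from a fitted logistic model.
actual = [0, 0, 0, 0, 1, 1, 1, 1]
probs  = [0.1, 0.3, 0.4, 0.6, 0.4, 0.7, 0.8, 0.9]
table, pct_correct = classification_table(actual, probs)
print(table, pct_correct)
```

The diagonal cells count correct classifications; the overall percentage correct is what SPSS prints beneath the table.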
Logistic Regression • Options… • Hosmer-Lemeshow goodness-of-fit • More robust than traditional χ2 goodness-of-fit statistic, particularly for models with continuous covariates and small sample sizes • Casewise listing of residuals • Helps ID cases with large residuals (outliers)
Logistic Regression • Options… • Correlations of estimates • Just what it sounds like, correlations among the parameter estimates • Iteration history • CI for exp(B) • Provides confidence intervals for the exponentiated coefficients, i.e. the odds ratios • Categorical… • If any predictors are discrete, they must be identified here, as well as which group is the reference group (identified as 0 vs. 1)
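The CI for exp(B) is obtained by building the confidence interval on the raw coefficient and then exponentiating the endpoints. A sketch with a hypothetical coefficient and standard error:

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """95% CI for exp(B): exponentiate the CI of the raw logistic coefficient.
    b is a fitted coefficient and se its standard error (hypothetical here)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient B = 0.693 (odds ratio about 2) with SE = 0.25.
or_, lo, hi = odds_ratio_ci(0.693, 0.25)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio itself; an interval excluding 1 corresponds to a coefficient significantly different from 0.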
Logistic Regression Output • Step number: 1 • [SPSS classification plot: a histogram of cases by predicted probability of membership in the Non-Attritor group, from 0 to 1, with each case's observed group shown by its symbol] • The cut value is .50 • Symbols: A – Attritor, N – Non-Attritor • Each symbol represents 2 cases