This outline provides an overview of multiple regression analysis, a key tool in social science research for predicting outcomes based on multiple variables. It covers concepts like bi-variate regression, model comparison using Delta R², and practical implementation in SPSS. By examining factors such as study time, test time, and sleep hours, researchers can assess their impact on test performance. The document also details different methods for entering variables in SPSS and the interpretation of results, including unstandardized and standardized coefficients.
Statistics for the Social Sciences Prediction with multiple variables Psychology 340 Spring 2010
Outline • Multiple regression • Comparing models, Delta r² • Using SPSS
Multiple Regression • Typically researchers are interested in predicting with more than one explanatory variable • In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.
Multiple Regression • Bi-variate regression prediction models Y = intercept + slope (X) + error
Multiple Regression • Bi-variate regression prediction models Y = intercept + slope (X) + error • “fit”: intercept + slope (X) • “residual”: error • Multiple regression prediction models
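The bi-variate model above can be sketched in a few lines of Python. This is only an illustration: the data values and the names study_hours, test_scores, and fit_bivariate are made up for the example and do not come from the slides.

```python
# Hypothetical data: hours of study (X) and test score (Y) for five students.
study_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
test_scores = [52.0, 60.0, 61.0, 70.0, 77.0]


def fit_bivariate(x, y):
    """Return (intercept, slope) of the least-squares line Y = intercept + slope * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sp_xy / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_bivariate(study_hours, test_scores)

# "fit" is the part of each score the line accounts for;
# "residual" is the error, i.e. whatever is left over.
fit = [intercept + slope * xi for xi in study_hours]
residual = [yi - fi for yi, fi in zip(test_scores, fit)]

print("intercept:", intercept, "slope:", slope)
print("fit:", fit)
print("residual:", residual)
```

Each observed score equals its fit plus its residual, and the residuals of a least-squares line sum to zero.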
Multiple Regression • Multiple regression prediction models • First Explanatory Variable • Second Explanatory Variable • Third Explanatory Variable • Fourth Explanatory Variable • Whatever variability is left over
Multiple Regression • Predict test performance based on: • Study time (First Explanatory Variable) • Test time (Second Explanatory Variable) • What you eat for breakfast (Third Explanatory Variable) • Hours of sleep (Fourth Explanatory Variable) • Whatever variability is left over
Multiple Regression • Predict test performance based on: • Study time • Test time • What you eat for breakfast • Hours of sleep • Typically your analysis consists of testing multiple regression models to see which “fits” best (comparing the r² values of the models) • For example:
Multiple Regression Model #1: Some co-variance between the two variables • Response variable: total variability in test performance • Total study time: r = .6 • If we know the total study time, we can predict 36% of the variance in test performance • R² for Model = .36 • 64% variance unexplained
Multiple Regression Model #2: Add test time to the model • Little co-variance between test performance and test time • We can explain more of the variance in test performance • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • R² for Model = .49 • 51% variance unexplained
Multiple Regression Model #3: No co-variance between test performance and breakfast food • Not related, so we can NOT explain more of the variance in test performance • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • Breakfast: r = .0 • R² for Model = .49 • 51% variance unexplained
Multiple Regression Model #4: Some co-variance between test performance and hours of sleep • We can explain more of the variance • But notice what happens with the overlap (covariation between explanatory variables): we can’t just add r’s or r²’s • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • Breakfast: r = .0 • Hrs of sleep: r = .45 • R² for Model = .60 • 40% variance unexplained
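To see why the r²’s cannot simply be added, here is a small sketch using the standard formula for R² with exactly two predictors. It uses only study time (r = .6) and hours of sleep (r = .45) from the slides. The correlation between the two predictors (r_overlap = 0.3) is an assumed value, since the slides do not report it, so the result is not meant to reproduce the .60 of the four-predictor Model #4.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared for a model with two predictors, computed from correlations.

    r_y1, r_y2: correlation of each predictor with the response variable.
    r_12: correlation between the two predictors (their overlap).
    """
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return numerator / (1 - r_12 ** 2)


r_study = 0.6     # study time with test performance (from the slides)
r_sleep = 0.45    # hours of sleep with test performance (from the slides)
r_overlap = 0.3   # ASSUMED correlation between study time and sleep

naive_sum = r_study ** 2 + r_sleep ** 2
no_overlap = r_squared_two_predictors(r_study, r_sleep, 0.0)
with_overlap = r_squared_two_predictors(r_study, r_sleep, r_overlap)

print("sum of the two r-squared values:", naive_sum)
print("R-squared if predictors are uncorrelated:", no_overlap)
print("R-squared with overlapping predictors:", with_overlap)
```

Only when the predictors are uncorrelated does the model R² equal the sum of the individual r²’s. With the assumed positive overlap, part of what sleep explains is already explained by study time, so the model R² comes out smaller than the sum.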
Multiple Regression in SPSS • Setup as before: Variables (explanatory and response) are entered into columns • A couple of different ways to use SPSS to compare different models
Regression in SPSS • Analyze: Regression, Linear
Multiple Regression in SPSS • Method 1: enter all the explanatory variables together • Enter: • Predicted (criterion) variable into Dependent Variable field • All of the predictor variables into the Independent Variable field
Multiple Regression in SPSS • The variables in the model • r for the entire model • r² for the entire model • Unstandardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)
Multiple Regression in SPSS • The variables in the model • r for the entire model • r² for the entire model • Standardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)
Multiple Regression • Which β to use, standardized or unstandardized? • Unstandardized β’s are easier to use if you want to predict a raw score based on raw scores (no z-scores needed). • Standardized β’s are nice to directly compare which variable is most “important” in the equation
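The difference between the two kinds of coefficients can be sketched for a two-predictor model. The data and variable names below (study, sleep, score) are hypothetical, and the closed-form solution shown works only for exactly two predictors; SPSS uses a general routine.

```python
from statistics import mean, stdev

# Hypothetical raw scores for six students.
study = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]       # hours of study
sleep = [7.0, 5.0, 8.0, 6.0, 9.0, 7.0]         # hours of sleep
score = [55.0, 58.0, 70.0, 71.0, 85.0, 84.0]   # test performance


def centered_sum(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


s11 = centered_sum(study, study)
s22 = centered_sum(sleep, sleep)
s12 = centered_sum(study, sleep)
s1y = centered_sum(study, score)
s2y = centered_sum(sleep, score)

# Unstandardized coefficients: solve the two normal equations directly.
det = s11 * s22 - s12 ** 2
b_study = (s22 * s1y - s12 * s2y) / det
b_sleep = (s11 * s2y - s12 * s1y) / det
intercept = mean(score) - b_study * mean(study) - b_sleep * mean(sleep)

# Standardized coefficients: rescale each slope by sd(predictor) / sd(response).
beta_study = b_study * stdev(study) / stdev(score)
beta_sleep = b_sleep * stdev(sleep) / stdev(score)

# Unstandardized: predict a raw score straight from raw scores.
predicted_raw = intercept + b_study * 9.0 + b_sleep * 8.0

print("unstandardized:", intercept, b_study, b_sleep)
print("standardized:", beta_study, beta_sleep)
print("predicted score for 9 h study, 8 h sleep:", predicted_raw)
```

The unstandardized coefficients are in raw units (points per hour), so they plug directly into a raw-score prediction. The standardized coefficients are in standard-deviation units, which puts both predictors on the same scale for comparison.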
Multiple Regression in SPSS • Method 2: enter first model, then add another variable for second model, etc. • Enter: • Predicted (criterion) variable into Dependent Variable field • First Predictor variable into the Independent Variable field • Click the Next button
Multiple Regression in SPSS • Method 2 cont: • Enter: • Second Predictor variable into the Independent Variable field • Click Statistics
Multiple Regression in SPSS • Click the ‘R squared change’ box
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 1 • r² for the first model • Coefficients for var1 (var name)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 2 • r² for the second model • Coefficients for var1 (var name) • Coefficients for var2 (var name)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Change statistics: is the change in r² from Model 1 to Model 2 statistically significant?
Hypothesis testing with Regression • Multiple Regression • We can test hypotheses about the overall model (how much is “fit” versus “residual”)
Multiple Regression in SPSS • Null Hypotheses • H0: University GPA is not predicted by SAT verbal or SAT Math scores • p < 0.05, so reject H0, SAT math and verbal predict University GPA
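The overall-model test reported by SPSS is an F test, and the F ratio can be computed from R² alone. The sketch below uses the standard formula. The R² value and sample size are hypothetical, because the slides do not report them for the GPA example.

```python
def overall_f(r_squared, n_predictors, n_participants):
    """F ratio for H0: none of the predictors predict the response variable.

    Returns (F, df_model, df_residual).
    """
    df_model = n_predictors
    df_residual = n_participants - n_predictors - 1
    f_ratio = (r_squared / df_model) / ((1 - r_squared) / df_residual)
    return f_ratio, df_model, df_residual


# HYPOTHETICAL values: two predictors (SAT math, SAT verbal), 103 students,
# and an R-squared of .50 for the model.
f_ratio, df_model, df_residual = overall_f(0.50, 2, 103)

print("F(%d, %d) = %.2f" % (df_model, df_residual, f_ratio))
```

The p value that SPSS prints next to this F comes from the F distribution with these two degrees of freedom. Python's standard library has no F distribution, so the sketch stops at the F ratio.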
Hypothesis testing with Regression • Multiple Regression • First Explanatory Variable • Second Explanatory Variable • Third Explanatory Variable • Fourth Explanatory Variable • We can test hypotheses about each of these explanatory variables within a regression model • So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable • We can test hypotheses about the overall model
Multiple Regression in SPSS • Null Hypotheses • H0: Coefficient for var1 = 0 • p < 0.05, so reject H0, var1 is a significant predictor • H0: Coefficient for var2 = 0 • p > 0.05, so fail to reject H0, var2 is not a significant predictor
Hypothesis testing with Regression • Multiple Regression • We can test hypotheses about each of the explanatory variables within a regression model • So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable • We can test hypotheses about the overall model • We can also use hypothesis testing to examine if the change in r² is statistically significant
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 1 • r² for the first model • Coefficients for var1 (var name)
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 2 • r² for the second model • Coefficients for var1 (var name) • Coefficients for var2 (var name)
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Change statistics: is the change in r² from Model 1 to Model 2 statistically significant? • The 0.002 change in r² is not statistically significant (p = 0.46)
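The test on the change in r² is also an F test. The sketch below uses the standard F-change formula for nested models. The two r² values and the sample size are hypothetical, chosen only so that the change is 0.002 as on the slide. The slides do not give the actual values, so this does not reproduce the reported p = 0.46.

```python
def f_change(r2_reduced, r2_full, k_reduced, k_full, n_participants):
    """F ratio for the change in R-squared between two nested models.

    k_reduced, k_full: number of predictors in the smaller and larger model.
    Returns (F, df_change, df_residual).
    """
    df_change = k_full - k_reduced
    df_residual = n_participants - k_full - 1
    f_ratio = ((r2_full - r2_reduced) / df_change) / ((1 - r2_full) / df_residual)
    return f_ratio, df_change, df_residual


# HYPOTHETICAL values: Model 1 (math SAT) has r2 = .468,
# Model 2 (math and verbal SAT) has r2 = .470, with 105 students.
f_ratio, df_change, df_residual = f_change(0.468, 0.470, 1, 2, 105)

print("F change(%d, %d) = %.3f" % (df_change, df_residual, f_ratio))
```

An F ratio well below 1, as here, means the added predictor explains less additional variance than would be expected by chance alone. That matches the slide's conclusion that such a small change is not statistically significant.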
Regression in Research Articles • Bivariate prediction models rarely reported • Multiple regression results commonly reported
Cautions in Multiple Regression • We can use as many predictors as we wish but we should be careful not to use more predictors than is warranted. • Simpler models are more likely to generalize to other samples. • If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample and thus they are not valid. • You probably should have at least 10 participants per predictor variable (and probably should aim for about 30).
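The 100% claim can be checked with pure noise. In the sketch below, everything is randomly generated and hypothetical: six participants, an intercept, and five predictors that have nothing to do with the outcome. When the model includes an intercept, one fewer predictor than participants is already enough for a perfect fit. The small solve helper is an illustrative Gauss-Jordan routine, not what SPSS does internally.

```python
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


random.seed(340)
n_participants = 6

# Each row: 1.0 for the intercept, then five predictors of pure random noise.
design = [[1.0] + [random.gauss(0, 1) for _ in range(n_participants - 1)]
          for _ in range(n_participants)]
outcome = [random.gauss(0, 1) for _ in range(n_participants)]  # also pure noise

# Six coefficients for six participants: the equations can be solved exactly.
coefs = solve(design, outcome)
predicted = [sum(c * x for c, x in zip(coefs, row)) for row in design]

mean_y = sum(outcome) / n_participants
ss_total = sum((y - mean_y) ** 2 for y in outcome)
ss_residual = sum((y - p) ** 2 for y, p in zip(outcome, predicted))
r_squared = 1 - ss_residual / ss_total

print("R-squared with noise predictors:", r_squared)
```

The model reproduces every score in this sample even though the predictors are meaningless. That is why such a fit should not be expected to generalize to another sample.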