
BA 201

BA 201, Lecture 14: Multiple Regression Model. Topics: developing the multiple linear regression model; inferences on population regression coefficients; pitfalls in multiple regression and ethical issues.

liam

Presentation Transcript


  1. BA 201 Lecture 14 Multiple Regression Model

  2. Topics • Developing the Multiple Linear Regression Model • Inferences on Population Regression Coefficients • Pitfalls in Multiple Regression and Ethical Issues

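  3. The Multiple Regression Model • The relationship between 1 dependent and 2 or more independent variables is a linear function • Population model: Yi = β0 + β1X1i + β2X2i + … + βkXki + εi, where β0 is the population Y-intercept, β1, …, βk are the population slopes, and εi is the random error • Sample model: Yi = b0 + b1X1i + b2X2i + … + bkXki + ei, where Yi is the dependent (response) variable, X1i, …, Xki are the independent (explanatory) variables, and ei is the residual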

  4. Simple Linear Regression Model Revisited • [Figure: plot of Y against X, marking an observed value]

  5. Population Multiple Regression Model • Bivariate model (2 independent variables: X1 and X2)

  6. Sample Multiple Regression Model • Bivariate model • [Figure: sample regression plane]

  7. Multiple Linear Regression Equation • Too complicated to compute by hand! Ouch!

  8. Multiple Regression Model: Example • Develop a model for estimating the heating oil used by a single-family home in the month of January, based on average temperature (°F) and amount of insulation (inches).

  9. Multiple Regression in PHStat • PHStat | Regression | Multiple Regression … • EXCEL spreadsheet for the heating oil example.

  10. Sample Multiple Regression Equation: Example (Excel Output) • For each 1-degree increase in temperature, the estimated average amount of heating oil used decreases by 5.437 gallons, holding insulation constant. • For each 1-inch increase in insulation, the estimated average amount of heating oil used decreases by 20.012 gallons, holding temperature constant.
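
As an illustrative sketch (not part of the original slides), the two slope interpretations can be checked numerically. The coefficients are the ones quoted on this slide; the function name and the example changes in temperature and insulation are assumptions chosen for illustration.

```python
# Slopes quoted on the slide (gallons per unit change).
B_TEMP = -5.437         # per 1 degree increase in temperature
B_INSULATION = -20.012  # per 1 inch increase in insulation


def change_in_predicted_oil(delta_temp, delta_insulation):
    """Change in predicted oil use; the intercept cancels out of a difference."""
    return B_TEMP * delta_temp + B_INSULATION * delta_insulation


# Hypothetical scenarios: change one variable while holding the other constant.
print(round(change_in_predicted_oil(1, 0), 3))   # 1 degree warmer, same insulation
print(round(change_in_predicted_oil(0, 1), 3))   # 1 inch more insulation, same temperature
print(round(change_in_predicted_oil(10, 0), 2))  # 10 degrees warmer, same insulation
```

Because only differences are computed, the sketch does not need the intercept, which does not appear in this transcript.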

  11. Interpretation of Estimated Coefficients • Slope (bi) • The average value of Y is estimated to change by bi for each 1-unit increase in Xi, holding all other variables constant (ceteris paribus) • Example: If b1 = -2, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature (X1), holding the inches of insulation (X2) constant • Y-Intercept (b0) • The estimated average value of Y when all Xi = 0

  12. Simple and Multiple Regression Compared • Coefficients in a simple regression pick up the impact of that variable plus the impacts of other variables that are correlated with it and the dependent variable but are excluded from the model. • Coefficients in a multiple regression net out the impacts of other variables in the equation. • Hence they are called the net regression coefficients. • They still pick up the effects of other variables that are excluded from the model but are correlated with the included variables and the dependent variable.
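
The contrast between simple and net regression coefficients can be sketched with a small made-up data set. Everything in this example is hypothetical (the values of x1 and x2, the true coefficients 10, -2 and -3, and the helper names); it is not the heating oil data. The two-regressor slopes are computed with the standard closed-form solution of the normal equations.

```python
# Hypothetical data (not from the slides): x2 is correlated with x1,
# and y is an exact linear function of both, so there is no noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [10.0 - 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Simple regression of y on x1 alone: slope = S1y / S11.
simple_b1 = cross(x1, y) / cross(x1, x1)

# Multiple regression of y on x1 and x2 (closed form for two regressors).
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
multi_b1 = (s22 * s1y - s12 * s2y) / det
multi_b2 = (s11 * s2y - s12 * s1y) / det

print(round(simple_b1, 6), round(multi_b1, 6), round(multi_b2, 6))
```

In this made-up data the simple-regression slope on x1 comes out near -5, because it absorbs part of the effect of the omitted x2, while the multiple regression recovers the net slopes of -2 and -3.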

  13. Simple and Multiple Regression Compared: Example • Two simple regressions: • Multiple Regression:

  14. Simple and Multiple Regression Compared: Excel Output

  15. Simple and Multiple Regression Compared: Excel Output

  16. Venn Diagrams and Explanatory Power of a Simple Regression • [Venn diagram: two overlapping circles, Oil and Temp] • Oil outside the overlap: variations in Oil explained by the error term • Temp outside the overlap: variations in Temp not used in explaining variation in Oil • Overlap: variations in Oil explained by Temp, or variations in Temp used in explaining variation in Oil

  17. Venn Diagrams and Explanatory Power of a Simple Regression (continued) • [Venn diagram: Oil, Temp]

  18. Venn Diagrams and Explanatory Power of a Multiple Regression • [Venn diagram: Oil, Temp, Insulation] • Overlapping variation in both Temp and Insulation is used in explaining the variation in Oil but NOT in the estimation of the slope coefficients for Temp or Insulation • Oil outside both overlaps: variation NOT explained by Temp or Insulation

  19. Coefficient of Multiple Determination (r²) • Proportion of Total Variation in Y Explained by All X Variables Taken Together (r² = SSR / SST) • Never Decreases When a New X Variable Is Added to the Model • A Disadvantage When Comparing Models with Different Numbers of X Variables

  20. Venn Diagrams and Explanatory Power of Regression • [Venn diagram: Oil, Temp, Insulation]

  21. Adjusted Coefficient of Multiple Determination • Proportion of Variation in Y Explained by All X Variables, Adjusted for the Number of X Variables Used and the Sample Size • Penalizes Excessive Use of Independent Variables • Smaller than r² • Useful in Comparing among Models • Could Decrease If an Insignificant New X Variable Is Added to the Model

  22. Coefficient of Multiple Determination: Excel Output • Adjusted r² • reflects the number of explanatory variables and sample size • is smaller than r²

  23. Interpretation of Coefficient of Multiple Determination • 96.56% of the total variation in heating oil use can be explained by the variation in temperature and in the amount of insulation • 95.99% of the total variation in heating oil use can be explained by the variation in temperature and in the amount of insulation, after adjusting for the number of explanatory variables and sample size
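
The link between the two percentages can be sketched with the usual adjusted r² formula, 1 - (1 - r²)(n - 1)/(n - k - 1). The sample size n = 15 is an assumption here: it is inferred from the 12 denominator degrees of freedom (n - k - 1) reported in the overall significance test with k = 2, not stated directly in the transcript. The function name is also illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 = 1 - (1 - r^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


R2 = 0.9656  # r^2 from the slide
K = 2        # explanatory variables: temperature and insulation
N = 15       # assumption: inferred from n - k - 1 = 12

print(round(adjusted_r2(R2, N, K), 4))
```

Under these assumptions the result rounds to the 95.99% quoted on the slide.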

  24. Example: Adjusted r² Can Decrease • Adjusted r² decreases when k increases from 2 to 3

  25. Using the Model to Make Predictions • Predict the amount of heating oil used for a home if the average temperature is 30°F and the insulation is 6 inches. • The predicted heating oil used is 278.97 gallons.
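
A minimal sketch of the prediction step follows. The transcript does not include the intercept from the Excel output, so this example backs it out from the stated prediction, taking the temperature as 30 degrees F; that implied intercept, the function name and the second home are assumptions for illustration only.

```python
B_TEMP = -5.437         # slope for temperature, from the slides
B_INSULATION = -20.012  # slope for insulation, from the slides

# Assumption: the intercept is not in the transcript, so it is backed out
# from the stated prediction (278.97 gallons at 30 degrees F and 6 inches).
implied_b0 = 278.97 - B_TEMP * 30 - B_INSULATION * 6


def predict_oil(temp_f, insulation_in, b0=implied_b0):
    """Predicted January heating oil use in gallons."""
    return b0 + B_TEMP * temp_f + B_INSULATION * insulation_in


print(round(implied_b0, 3))
print(round(predict_oil(30, 6), 2))
# Hypothetical second home: same temperature, 10 inches of insulation.
print(round(predict_oil(30, 10), 2))
```

The first prediction reproduces the slide's 278.97 gallons by construction; the second shows how the same equation would be applied to another home.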

  26. Predictions in PHStat • PHStat | Regression | Multiple Regression … • Check the “Confidence and Prediction Interval Estimate” box • EXCEL spreadsheet for the heating oil example.

  27. Another Example • An Excel spreadsheet containing the results of regressing midterm scores on quiz scores and attendance scores

  28. Residual Plots • Residuals vs. predicted Y • May need to transform the Y variable • Residuals vs. X1 • May need to transform the X1 variable • Residuals vs. X2 • May need to transform the X2 variable • Residuals vs. Time • May have autocorrelation

  29. Residual Plots: Example • [Residual plot] Maybe some non-linear relationship • [Residual plot] No discernible pattern

  30. Testing for Overall Significance • Shows if there is a Linear Relationship between all of the X Variables Together and Y • Shows if Y Depends Linearly on all of the X Variables Together as a Group • Use F Test Statistic • Hypotheses: • H0: β1 = β2 = … = βk = 0 (No linear relationship) • H1: At least one βi ≠ 0 (At least one independent variable affects Y) • The Null Hypothesis is a Very Strong Statement • Almost Always Reject the Null Hypothesis

  31. Testing for Overall Significance (continued) • Test Statistic: F = MSR / MSE = (SSR / k) / (SSE / (n - k - 1)) • where F has k numerator and (n - k - 1) denominator degrees of freedom

  32. Test for Overall Significance: Excel Output, Heating Oil Example • [Excel ANOVA output with callouts] • p-value • k = 2, the number of explanatory variables • n - 1

  33. Test for Overall Significance: Example Solution • H0: β1 = β2 = … = βk = 0 • H1: At least one βi ≠ 0 • α = 0.05 • df = 2 and 12 • Critical value: F = 3.89 • Test statistic: F = 168.47 (Excel output) • Decision: Reject H0 at α = 0.05 • Conclusion: There is evidence that at least one independent variable affects Y
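
As a rough check on the overall test, the F statistic can be rebuilt from r², since SSR = r² × SST and SSE = (1 - r²) × SST. This sketch uses the rounded r² = 0.9656 reported earlier in the deck and assumes n = 15 (from df = 2 and 12), so it lands close to, but not exactly on, the Excel value of 168.47. The function name is illustrative.

```python
def f_statistic(r2, n, k):
    """Overall F = (r^2 / k) / ((1 - r^2) / (n - k - 1)), equal to MSR / MSE."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


R2 = 0.9656        # rounded r^2 from the slides
K = 2              # numerator degrees of freedom
N = 15             # assumption: denominator df of 12 implies n = 12 + K + 1
F_CRITICAL = 3.89  # critical value from the slide, alpha = 0.05

f_value = f_statistic(R2, N, K)
print(round(f_value, 2))
print("Reject H0" if f_value > F_CRITICAL else "Do not reject H0")
```

The small gap from 168.47 comes from using the rounded r²; the decision to reject H0 is unaffected.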

  34. Test for Significance: Individual Variables • Shows if There is a Linear Relationship Between the Variable Xi and Y while Holding the Effects of other X's Fixed • Shows if Y Depends Linearly on a Single Xi Individually while Holding the Effects of other X's Fixed • Use t Test Statistic • Hypotheses: • H0: βi = 0 (No linear relationship) • H1: βi ≠ 0 (Linear relationship between Xi and Y)

  35. t Test Statistic: Excel Output Example • [Excel output with callouts] • t Test Statistic for X1 (Temperature) • t Test Statistic for X2 (Insulation)

  36. t Test: Example Solution • Does temperature have a significant effect on monthly consumption of heating oil? Test at α = 0.05. • H0: β1 = 0 • H1: β1 ≠ 0 • df = 12 • Critical values: t = ±2.1788 (.025 in each tail) • Test statistic: t = -16.1699 • Decision: Reject H0 at α = 0.05 • Conclusion: There is evidence of a significant effect of temperature on oil consumption.
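
The decision rule for the two-tail t test can be sketched as follows. The t statistic and critical value are the slide's figures; the standard error is not in the transcript, so it is backed out from t = b1 / SE(b1) using the temperature slope of -5.437 quoted earlier in the deck, and should be read as an implied value rather than the Excel output.

```python
T_STAT = -16.1699    # t statistic for temperature, from the slides
T_CRITICAL = 2.1788  # two-tail critical value, 12 df, alpha = 0.05
B_TEMP = -5.437      # slope estimate for temperature, from the slides

# t = b1 / SE(b1), so the standard error implied by the slide figures is:
implied_se = B_TEMP / T_STAT

# Two-tail test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(T_STAT) > T_CRITICAL

print(round(implied_se, 4))
print("Reject H0" if reject_h0 else "Do not reject H0")
```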

  37. Confidence Interval Estimate for the Slope • Provide the 95% confidence interval for the population slope β1 (the effect of temperature on oil consumption). • -6.169 ≤ β1 ≤ -4.704 • The estimated average consumption of oil is reduced by between 4.70 and 6.17 gallons for each 1°F increase in temperature, holding insulation constant.
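
The interval has the form b1 ± t × SE(b1). A minimal sketch, assuming the standard error implied by the earlier t statistic (-16.1699) and slope (-5.437), since the Excel standard error itself is not in the transcript; the function name is illustrative.

```python
def slope_confidence_interval(b, se, t_critical):
    """Return (lower, upper) for b +/- t_critical * se."""
    margin = t_critical * se
    return b - margin, b + margin


B_TEMP = -5.437      # slope estimate for temperature, from the slides
T_CRITICAL = 2.1788  # t critical value, 12 df, 95% confidence
# Assumption: standard error backed out from t = b1 / SE(b1) = -16.1699.
SE_TEMP = B_TEMP / -16.1699

lower, upper = slope_confidence_interval(B_TEMP, SE_TEMP, T_CRITICAL)
print(round(lower, 2), round(upper, 2))
```

With these rounded inputs the limits agree with the slide's interval to about two decimal places.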

  38. Additional Pitfalls and Ethical Issues • Failing to Understand that the Estimated Regression Coefficients Are Interpreted Holding All Other Independent Variables Constant • Failing to Evaluate Residual Plots for Each Independent Variable

  39. Summary • Developed the Multiple Regression Model • Addressed Testing the Significance of the Multiple Regression Model • Discussed Inferences on Population Regression Coefficients • Addressed Pitfalls in Multiple Regression and Ethical Issues
