Session 4

Presentation Transcript


  1. Session 4

  2. Outline for Session 4
     • Summary Measures for the Full Model
     • Top Section of the Output
     • Interval Estimation
     • More Multiple Regression
     • Movers
     • Nonlinear Regression
     • Insurance
     Applied Regression -- Prof. Juran

  3. Top Section: Summary Statistics

  6. Top Section: Summary Statistics

  8. As stated earlier, $R^2$ is closely related to the correlation between X and Y; indeed, in simple regression, $R^2 = r_{X,Y}^2$. Furthermore, $R^2$, and thus $r_{X,Y}$, is closely related to the slope of the regression line via $b_1 = r_{X,Y} \, s_Y / s_X$. Thus, testing the significance of the slope, testing the significance of $R^2$, and testing the significance of $r_{X,Y}$ are essentially equivalent.
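A minimal sketch of this equivalence, using the standard simple-regression identities $R^2 = r_{X,Y}^2$ and $b_1 = r_{X,Y} \, s_Y / s_X$. The data and the function name `slope_r_r2` are hypothetical and are not taken from the course materials.

```python
import math
import statistics as st

def slope_r_r2(x, y):
    """Return (b1, r, R^2, t_slope, t_r) for a simple regression of y on x."""
    n = len(x)
    xbar, ybar = st.fmean(x), st.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                      # least-squares slope
    b0 = ybar - b1 * xbar               # least-squares intercept
    r = sxy / math.sqrt(sxx * syy)      # sample correlation

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - sse / syy                # R^2 computed from the residuals

    s_e = math.sqrt(sse / (n - 2))      # standard error of the regression
    t_slope = b1 / (s_e / math.sqrt(sxx))             # t statistic for the slope
    t_r = r * math.sqrt(n - 2) / math.sqrt(1.0 - r2)  # t statistic for r
    return b1, r, r2, t_slope, t_r

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.3, 3.8, 6.1, 8.4, 9.6, 12.5]
b1, r, r2, t_slope, t_r = slope_r_r2(x, y)
```

The two t statistics agree up to rounding, which is why the three significance tests lead to the same conclusion in the one-predictor case.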

  15. Interval Estimation

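  16. An Image of the Residuals
     [Figure: scatter plot with axes X and Y, showing an observed point $(x_i, y_i)$ and the corresponding point $(x_i, \hat{y}_i)$ on the fitted line.]
     The observed values: $y_i$. The fitted values: $\hat{y}_i = b_0 + b_1 x_i$. The residuals: $e_i = y_i - \hat{y}_i$.
     Recall: The regression line passes through the data so that the sum of squared residuals is as small as possible.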
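The least-squares property can be checked numerically. The sketch below uses made-up data and hypothetical helper names `fit_line` and `sse`. It fits the line, computes the residuals, and confirms that nudging the intercept or slope away from the fitted values only increases the sum of squared residuals.

```python
import statistics as st

def fit_line(x, y):
    """Return the least-squares intercept and slope (b0, b1)."""
    xbar, ybar = st.fmean(x), st.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def sse(x, y, b0, b1):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
best = sse(x, y, b0, b1)

# SSE for every small perturbation of (b0, b1) other than "no change".
nudged = [
    sse(x, y, b0 + d0, b1 + d1)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]
```

With an intercept in the model, the residuals also sum to zero (up to floating-point rounding).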

  17. Regression and Prediction
     Regression lines are frequently used for predicting future values of Y given future, conjectural, or speculative values of X. Suppose we posit a future value of X, say $x_0$. The predicted value, $\hat{y}_0$, is $\hat{y}_0 = b_0 + b_1 x_0$.

  18. Under our assumptions this is an unbiased estimate of Y given that $x = x_0$, regardless of the value of $x_0$. Let $\mu_0 = E(Y(x_0))$; since the estimate is unbiased, we estimate it by $\hat{\mu}_0 = b_0 + b_1 x_0$. However, be alert to the fact that this estimate (prediction) of a future value has a standard error of $s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$, where $s_e$ is the standard error of the regression. Furthermore, the standard error of the prediction of the expected (mean) value of Y given $x = x_0$ is $s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$.

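  19. From these facts it follows that a 2-sided “confidence” interval on the expected value of Y given $x = x_0$, $\mu_0$, is given by $\hat{y}_0 \pm t_{\alpha/2,\,n-2} \, s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$, where $t_{\alpha/2,\,n-2}$ is the critical value of the t distribution with $n - 2$ degrees of freedom.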

  20. A 2-sided “prediction” interval on future individual values of Y given $x = x_0$, $y_0$, is given by $\hat{y}_0 \pm t_{\alpha/2,\,n-2} \, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$.
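Both intervals can be computed side by side. The sketch below uses the textbook simple-regression formulas: half-width $t \, s_e \sqrt{1/n + (x_0 - \bar{x})^2 / S_{xx}}$ for the mean, and the same expression with an extra 1 under the square root for an individual value. The rooms and hours figures are invented for illustration and are not the course data. The function name `intervals_at` is hypothetical. The critical value is passed in by the caller (here 2.306, the two-sided 95% value for 8 degrees of freedom) because the Python standard library has no t-distribution quantile.

```python
import math
import statistics as st

def intervals_at(x, y, x0, t_crit):
    """Return (y0_hat, ci_half_width, pi_half_width) at x = x0.

    t_crit is the two-sided critical value of the t distribution with
    n - 2 degrees of freedom, supplied by the caller (e.g. from a t-table).
    """
    n = len(x)
    xbar, ybar = st.fmean(x), st.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))          # standard error of the regression

    y0_hat = b0 + b1 * x0                   # point prediction at x0
    core = 1.0 / n + (x0 - xbar) ** 2 / sxx
    ci_half = t_crit * s_e * math.sqrt(core)        # mean of Y at x0
    pi_half = t_crit * s_e * math.sqrt(1.0 + core)  # one future Y at x0
    return y0_hat, ci_half, pi_half

# Hypothetical data: number of rooms and labor hours for ten moves.
rooms = [1, 2, 2, 3, 3, 4, 4, 5, 6, 7]
hours = [9.0, 14.5, 12.0, 21.0, 18.5, 26.0, 29.5, 31.0, 42.0, 47.5]
T_CRIT = 2.306  # two-sided 95% critical value, 8 degrees of freedom

at_mean = intervals_at(rooms, hours, st.fmean(rooms), T_CRIT)
at_edge = intervals_at(rooms, hours, 7, T_CRIT)
```

Comparing `at_mean` with `at_edge` shows both intervals widening as $x_0$ moves away from $\bar{x}$, with the prediction interval always the wider of the two.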

  21. Confidence Interval on $E(Y(x_0))$; Prediction Interval on $Y(x_0)$

  22. Note that the bounds of both intervals are curved functions of $x_0$ (the squared half-width is quadratic in $x_0$), that both intervals have their minimum width at $x_0 = \bar{x}$, and that their widths depend on the standard error of the regression, $s_e$, and on $S_{xx}$. The sum of squared x term, $\sum_i (x_i - \bar{x})^2$, appears so often in regression equations that it is useful to use the abbreviation $S_{xx}$. Note that $S_{xx}$ can easily be obtained from the variance as computed in most spreadsheets or statistics packages: $S_{xx} = (n - 1) \, s_x^2$.
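As a quick check of that shortcut, the sketch below computes $S_{xx}$ directly and from the sample variance. It assumes the package reports the sample variance with divisor $n - 1$, as Python's `statistics.variance` does. The data are hypothetical.

```python
import statistics as st

# Hypothetical x values, for illustration only.
x = [1, 2, 2, 3, 3, 4, 4, 5, 6, 7]
n = len(x)
xbar = st.fmean(x)

# Direct definition: sum of squared deviations from the mean.
sxx_direct = sum((xi - xbar) ** 2 for xi in x)

# Shortcut: (n - 1) times the sample variance.
sxx_from_variance = (n - 1) * st.variance(x)
```

If a package reports the population variance (divisor $n$) instead, the multiplier would be $n$ rather than $n - 1$.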

  23. An Image of the Prediction and Confidence Intervals

  27. All-Around Movers
     The management question here is whether historical data can be used to create a cost estimation model for intra-Manhattan apartment moves. The dependent variable is the number of labor hours used, which is a proxy for total cost in the moving business. There are two potential independent variables: volume (in cubic feet) and the number of rooms in the apartment being vacated.

  28. Summary Statistics

  33. The Most Obvious Simple Regression

  35. An Alternative Simple Regression Model

  37. A Multiple Regression Model

  39. Preliminary Observations
     • Volume is the best single predictor, but it may not be practical if customers are expected to collect these data and enter them on a web site.
     • Rooms is a pretty good predictor (not as good as Volume), and may be more useful on a practical basis.

  40. Preliminary Observations
     • The multiple regression model makes better predictions, but not much better than either of the simple regression models.
     • The multiple regression model has problems with multicollinearity. Notice the lack of significance for the Rooms variable (and the strange coefficient).
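One common way to quantify multicollinearity between two predictors is the variance inflation factor, which for a two-predictor model reduces to $1 / (1 - r^2)$, where $r$ is the correlation between the predictors. The sketch below is an illustration only. The rooms and volume figures are invented, not the All-Around Movers data, and the helper `pearson_r` is hypothetical.

```python
import math

def pearson_r(a, b):
    """Sample correlation between two equal-length sequences."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    saa = sum((ai - abar) ** 2 for ai in a)
    sbb = sum((bi - bbar) ** 2 for bi in b)
    sab = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    return sab / math.sqrt(saa * sbb)

# Hypothetical predictors: rooms and volume (cubic feet) for eight moves.
rooms = [1, 2, 2, 3, 3, 4, 5, 6]
volume = [220, 410, 380, 610, 560, 790, 1020, 1180]

r_predictors = pearson_r(rooms, volume)

# With exactly two predictors, both share the same variance inflation factor.
vif = 1.0 / (1.0 - r_predictors ** 2)
```

A large factor means the standard errors of the individual coefficients are inflated, which is consistent with one predictor losing significance once the other is in the model.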

  41. • Prediction intervals, corresponding to the estimated number of hours for one specific move, given one specific value for the number of rooms.
     • Confidence intervals, corresponding to the estimated population average number of hours over a large number of moves, all with the same number of rooms.

  42. Validity of the Rooms Model

  43. Analysis of the Residuals

  45. Comments on the Rooms Model
     • Good explanatory power
     • Statistically significant
     • Points fit the line well
     • But…
     • Small apartments tend to be over-estimated
     • Large apartments tend to be badly estimated, especially on the high side
     • Maybe could use more data
     • Maybe nonlinear

  46. A Non-linear Model?
     Note: If $e^B = A$, then $\ln(A) = B$.
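The logarithm identity is what allows an exponential relationship to be fitted with ordinary linear regression. The sketch below assumes a model of the form hours $= e^{b_0 + b_1 \cdot \text{rooms}}$. That functional form is an assumption, since the transcript does not show the model actually fitted. It regresses $\ln(\text{hours})$ on rooms and transforms predictions back with the exponential. The data are synthetic and noise-free, and the function names are hypothetical.

```python
import math
import statistics as st

def fit_exponential(x, y):
    """Fit y = exp(b0 + b1 * x) by least squares on ln(y); return (b0, b1)."""
    ln_y = [math.log(yi) for yi in y]       # requires every y to be positive
    xbar, lbar = st.fmean(x), st.fmean(ln_y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxl = sum((xi - xbar) * (li - lbar) for xi, li in zip(x, ln_y))
    b1 = sxl / sxx
    b0 = lbar - b1 * xbar
    return b0, b1

def predict(b0, b1, x0):
    """Back-transform the fitted log-scale line to the original scale."""
    return math.exp(b0 + b1 * x0)

# Synthetic data generated from known coefficients (1.5 and 0.4).
rooms = [1, 2, 3, 4, 5, 6]
hours = [math.exp(1.5 + 0.4 * r) for r in rooms]

b0, b1 = fit_exponential(rooms, hours)
```

Because the fit minimizes squared error on the log scale, residuals for comparison with a linear model need to be recomputed on the original scale as `hours[i] - predict(b0, b1, rooms[i])`.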

  50. Residual Analysis
     [Figure: two histograms of residuals, one for the Linear Model and one for the Exponential Model; horizontal axis: Residual Error (-35 to 35), vertical axis: Frequency.]
