
Chapter 12


liam

Presentation Transcript


  1. Chapter 12 Multiple Regression Analysis and Model Building

  2. Chapter 12 - Chapter Outcomes After studying the material in this chapter, you should be able to: Understand the general concepts behind model building using multiple regression analysis. Apply multiple regression analysis to business decision-making situations. Analyze the computer output for a multiple regression model and test the significance of the independent variables in the model.

  3. Chapter 12 - Chapter Outcomes (continued) After studying the material in this chapter, you should be able to: Recognize potential problems when using multiple regression analysis and take the steps to correct the problems. Incorporate qualitative variables into the regression model by using dummy variables.

  4. Multiple Regression Analysis SIMPLE LINEAR REGRESSION MODEL (POPULATION MODEL) $y = \beta_0 + \beta_1 x + \varepsilon$ where: $y$ = Value of the dependent variable; $x$ = Value of the independent variable; $\beta_0$ = Population’s y-intercept; $\beta_1$ = Slope of the population regression line; $\varepsilon$ = Error term, or residual

  5. Multiple Regression Analysis ESTIMATED SIMPLE LINEAR REGRESSION MODEL $\hat{y} = b_0 + b_1 x$ where: $b_0$ = Estimated y-intercept; $b_1$ = Estimated slope coefficient

  6. Multiple Regression Analysis A residual or prediction error is the difference between the actual value of y and the predicted value of y.

  7. Multiple Regression Analysis The standard error of the estimate refers to the standard deviation of the model errors. The standard error measures the dispersion of the actual values of the dependent variable around the fitted regression plane.

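  8. Multiple Regression Analysis MULTIPLE REGRESSION MODEL (POPULATION MODEL) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$ where: $\beta_0$ = Population’s regression constant; $\beta_j$ = Population’s regression coefficient for variable $x_j$; $j = 1, 2, \ldots, k$; $k$ = Number of independent variables; $\varepsilon$ = Model error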

  9. Multiple Regression Analysis ESTIMATED MULTIPLE REGRESSION MODEL $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$
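
The estimated coefficients are the values that minimize the sum of squared residuals. The sketch below is one way to compute them, by solving the normal equations with plain Python; it is not the software output the chapter refers to. The function name `fit_multiple_regression` and the data are hypothetical, and the data are generated exactly from y = 1 + 2*x1 + 3*x2 so the fit can be checked by eye.

```python
def fit_multiple_regression(X, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    X is a list of rows, one per observation, each holding the k
    independent-variable values; y holds the dependent-variable values.
    """
    # Prepend a 1 to every row so the constant b0 is estimated with the slopes.
    rows = [[1.0] + [float(v) for v in row] for row in X]
    m = len(rows[0])

    # Build the normal equations (A'A) b = A'y as an augmented matrix.
    aug = []
    for i in range(m):
        lhs = [sum(r[i] * r[j] for r in rows) for j in range(m)]
        rhs = sum(r[i] * yi for r, yi in zip(rows, y))
        aug.append(lhs + [rhs])

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("independent variables are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]

    return [aug[i][m] / aug[i][i] for i in range(m)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b0, b1, b2 = fit_multiple_regression(X, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

In practice a statistics package would be used instead; the point here is only that b0, b1, ..., bk come from a single least squares solve over all variables at once.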

  10. Multiple Regression Analysis A model is a representation of an actual system using either a physical or mathematical portrayal.

  11. Model Specification • Decide what you want to do and select the dependent variable. • List the potential independent variables for your model. • Gather the sample data (observations) for all variables.

  12. Multiple Regression Analysis The correlation coefficient is a quantitative measure of the strength of the linear relationship between two variables. The correlation coefficient, r, ranges between -1.0 and +1.0.

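  13. Multiple Regression Analysis CORRELATION COEFFICIENT One x variable with y: $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$ or $r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$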

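  14. Multiple Regression Analysis CORRELATION COEFFICIENT One x variable with another x: $r = \frac{\sum (x_i - \bar{x}_i)(x_j - \bar{x}_j)}{\sqrt{\left[\sum (x_i - \bar{x}_i)^2\right]\left[\sum (x_j - \bar{x}_j)^2\right]}}$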
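
The same calculation serves both purposes: correlating one x with y, or one x with another x. A minimal sketch using the deviation-from-the-mean form of r follows; the function name and the five data points are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: one independent variable and the dependent variable.
x1 = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x1, y)
print(round(r, 4))
```

Calling `correlation(x1, x2)` on two independent variables gives the x-with-x correlation that is used later when checking for multicollinearity.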

  15. Multiple Regression Analysis (Example) Multiple Regression Model: House Characteristics: x1 = Square feet = 2,100; x2 = Age = 15; x3 = Number of bedrooms = 4; x4 = Number of baths = 3; x5 = Size of garage = 2 Point Estimate for Sale Price:
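
A point estimate is obtained by substituting the house characteristics into the estimated regression equation. The fitted coefficients for this example are not reproduced in the transcript, so the sketch below uses HYPOTHETICAL coefficients chosen only to make the arithmetic easy to follow; only the five house characteristics come from the slide.

```python
def point_estimate(b0, slopes, x):
    """Return y-hat = b0 + b1*x1 + ... + bk*xk."""
    return b0 + sum(b * v for b, v in zip(slopes, x))


# House characteristics from the slide:
# square feet, age, bedrooms, baths, garage size.
house = [2100, 15, 4, 3, 2]

# HYPOTHETICAL coefficients (not the ones fitted in the chapter).
b0 = 30000
slopes = [60, -1000, 5000, 8000, 10000]

price = point_estimate(b0, slopes, house)
print(price)
```

With these made-up coefficients the estimate is 30000 + 60(2100) - 1000(15) + 5000(4) + 8000(3) + 10000(2) = 205000; the chapter's own estimate would differ.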

  16. Coefficient of Determination MULTIPLE COEFFICIENT OF DETERMINATION The percentage of variation in the dependent variable explained by the independent variables in the regression model: $R^2 = \frac{SSR}{SST}$ where: SSR = Sum of squares regression; SST = Total sum of squares
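
As a rough illustration, R-squared can be computed from the actual and predicted values. The sketch assumes the predictions come from a least squares fit that includes a constant, so that SSR = SST - SSE; the data are hypothetical (a simple regression of y on x = 1, ..., 5 with fitted line 2.2 + 0.6x).

```python
def r_squared(y, y_hat):
    """R^2 = SSR / SST, using SSR = SST - SSE (least squares with a constant)."""
    y_bar = sum(y) / len(y)
    sst = sum((v - y_bar) ** 2 for v in y)             # total variation
    sse = sum((v - p) ** 2 for v, p in zip(y, y_hat))  # unexplained variation
    ssr = sst - sse                                    # explained variation
    return ssr / sst


# Hypothetical actual values and their least squares predictions.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, y_hat)
print(round(r2, 4))
```

Here SST = 6 and SSE = 2.4, so about 60 percent of the variation in y is explained.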

  17. Model Diagnosis • Is the overall model significant? • Are the individual variables significant? • Is the standard deviation of the model error too large to provide meaningful results? • Is multicollinearity a problem?

  18. Is the Model Significant? $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$; $H_A$: At least one $\beta_i \neq 0$. If the null hypothesis is true, the overall regression model is not useful for predictive purposes.

  19. Is the Model Significant? F-TEST STATISTIC $F = \frac{SSR / k}{SSE / (n - k - 1)}$ where: SSR = Sum of squares regression; SSE = Sum of squares error; $n$ = Number of data points; $k$ = Number of independent variables; Degrees of freedom = $D_1 = k$ and $D_2 = n - k - 1$

  20. Is the Model Significant? ADJUSTED R-SQUARED A measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the number of cases and the number of independent variables in the regression model. $R^2_{adj} = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$ where: $n$ = Number of data points; $k$ = Number of independent variables
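
Both statistics follow directly from the sums of squares. The sketch below assumes the usual definitions, F = (SSR/k) / (SSE/(n-k-1)) and adjusted R-squared = 1 - (1 - R-squared)(n-1)/(n-k-1); the numeric inputs are hypothetical.

```python
def f_statistic(ssr, sse, n, k):
    """Overall F statistic with k and n - k - 1 degrees of freedom."""
    return (ssr / k) / (sse / (n - k - 1))


def adjusted_r_squared(r2, n, k):
    """R-squared penalized for the number of independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical sums of squares for a model with 4 variables and 25 cases.
ssr, sse, n, k = 80.0, 20.0, 25, 4
r2 = ssr / (ssr + sse)

f_value = f_statistic(ssr, sse, n, k)
r2_adj = adjusted_r_squared(r2, n, k)
print(f_value, round(r2_adj, 4))
```

The computed F is compared with the critical F for D1 = k and D2 = n - k - 1 degrees of freedom. Adjusted R-squared is never larger than R-squared, and the gap widens as k grows relative to n.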

  21. Are the Individual Variables Significant?

  22. Are the Individual Variables Significant? t-TEST FOR SIGNIFICANCE OF EACH REGRESSION COEFFICIENT $t = \frac{b_i - 0}{s_{b_i}}$ where: $b_i$ = Sample slope coefficient for the ith independent variable; $s_{b_i}$ = Estimate of the standard error for the ith sample slope coefficient; $n - k - 1$ = Degrees of freedom

  23. Are the Individual Variables Significant? (From Figure 12-7) Rejection regions: $\alpha/2 = 0.01$ in each tail. Decision Rule: If $-2.364 \le t \le 2.364$, accept $H_0$; otherwise, reject $H_0$.

  24. Are the Individual Variables Significant? (From Figure 12-7)
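
The test for a single coefficient can be sketched as follows, assuming the statistic t = (b_i - 0) / s_bi and a two-tailed decision rule. The critical value 2.364 is the one quoted on the slide; the coefficient and standard error are hypothetical.

```python
def t_statistic(b_i, s_bi):
    """t = (b_i - 0) / s_bi for the null hypothesis beta_i = 0."""
    return b_i / s_bi


def reject_null(t, t_critical):
    """Two-tailed rule: reject H0 when t falls outside [-t_critical, t_critical]."""
    return abs(t) > t_critical


T_CRITICAL = 2.364  # critical value quoted on the slide

# Hypothetical slope coefficient and its standard error.
t = t_statistic(12.0, 3.0)
print(t, reject_null(t, T_CRITICAL))
```

A coefficient whose t value stays inside the interval is not shown to differ from zero, which is the usual signal to consider dropping that variable.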

  25. Is the Standard Deviation of the Regression Model Too Large? ESTIMATE FOR THE STANDARD DEVIATION OF THE MODEL $s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}}$ where: SSE = Sum of squares error; $n$ = Sample size; $k$ = Number of independent variables
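
A short sketch of the calculation, assuming the estimate is the square root of SSE / (n - k - 1). The actual and predicted values are hypothetical and come from a one-variable model (k = 1).

```python
import math


def standard_error_of_estimate(y, y_hat, k):
    """Square root of SSE / (n - k - 1)."""
    n = len(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    return math.sqrt(sse / (n - k - 1))


# Hypothetical actual values and predictions from a model with k = 1.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
s = standard_error_of_estimate(y, y_hat, k=1)
print(round(s, 4))
```

Whether the result is "too large" is a judgment made in the units of y: a rough guide is that predictions are typically within about two standard errors of the actual values.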

  26. Is Multicollinearity A Problem? Multicollinearity refers to the situation when high correlation exists between two independent variables. This means the two variables contribute redundant information to the multiple regression model. When highly correlated independent variables are included in the regression model, they can adversely affect the regression results.

  27. Some Indications of Severe Multicollinearity • Incorrect signs on the coefficients. • A sizable change in the values of the previous coefficients when a new variable is added to the model. • A variable that was previously significant in the model becomes insignificant when a new independent variable is added. • The estimate of the standard deviation of the model increases when a variable is added to the model.

  28. Is Multicollinearity A Problem? The variance inflation factor is a measure of how much the variance of an estimated regression coefficient increases if the independent variables are correlated. A VIF equal to one for a given independent variable indicates that this independent variable is not correlated with the remaining independent variables in the model. The greater the multicollinearity, the larger the VIF will be.

  29. Is Multicollinearity A Problem? VARIANCE INFLATION FACTOR $VIF = \frac{1}{1 - R_j^2}$ where: $R_j^2$ = Coefficient of determination when the jth independent variable is regressed against the remaining $k - 1$ independent variables.
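
A minimal sketch of the VIF calculation, assuming VIF = 1 / (1 - R_j squared). To stay short it covers only the special case of two independent variables, where R_j squared is simply the squared correlation between them; with more variables R_j squared would come from an auxiliary regression. Function names and data are hypothetical.

```python
def r_squared_between(x1, x2):
    """Squared correlation between two variables (R_j^2 when k = 2)."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 ** 2 / (s11 * s22)


def vif(r2_j):
    """Variance inflation factor for a variable with auxiliary R_j^2."""
    if r2_j >= 1.0:
        raise ValueError("perfect collinearity: VIF is undefined")
    return 1.0 / (1.0 - r2_j)


# Hypothetical pair of independent variables.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
print(round(vif(r_squared_between(x1, x2)), 4))
```

Uncorrelated variables give a VIF of exactly 1, and the value grows without bound as R_j squared approaches 1.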

  30. Multiple Regression Analysis CONFIDENCE INTERVAL FOR THE REGRESSION COEFFICIENT $b_i \pm t_{\alpha/2}\, s_{b_i}$ where: $b_i$ = Point estimate for the regression coefficient on $x_i$; $t_{\alpha/2}$ = Critical t-value for a $1 - \alpha$ confidence interval; $s_{b_i}$ = The standard error of the ith regression coefficient

  31. Multiple Regression Analysis (Example from Figure 12-9) Interval limits: $55.16 to $70.97
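
The interval is the point estimate plus and minus the critical t-value times the standard error. The sketch below assumes that form; the inputs are hypothetical round numbers and do not reproduce the Figure 12-9 example.

```python
def coefficient_interval(b_i, s_bi, t_critical):
    """Return (lower, upper) limits of b_i +/- t_critical * s_bi."""
    margin = t_critical * s_bi
    return b_i - margin, b_i + margin


# Hypothetical coefficient, standard error, and critical t-value.
lower, upper = coefficient_interval(b_i=10.0, s_bi=2.0, t_critical=2.0)
print(lower, upper)
```

An interval that excludes zero leads to the same conclusion as rejecting the null hypothesis in the two-tailed t-test at the matching significance level.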

  32. Using Qualitative Independent Variables A dummy variable is a variable that is assigned a value equal to 0 or 1 depending on whether the observation possesses a given characteristic or not.

  33. Using Qualitative Independent Variables (Example 12-2) Dummy Variable: Estimated Regression:

  34. Using Qualitative Independent Variables (Example 12-2) If No MBA: If MBA:

  35. Using Qualitative Independent Variables (Figure 12-11) Plot of the fitted regression lines for MBAs and non-MBAs. $b_2 = 35{,}236$ = Regression coefficient on the dummy variable
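
A dummy variable shifts the regression line up or down without changing its slope. In the sketch below only b2 = 35,236 comes from the slides; the constant, the slope on x1, and the coding of the dummy as 1 for an MBA and 0 otherwise are ASSUMPTIONS made for illustration.

```python
B2_MBA = 35236  # coefficient on the dummy variable, from the slide
B0 = 50000      # HYPOTHETICAL regression constant
B1 = 1000       # HYPOTHETICAL slope on the quantitative variable x1


def predict(x1, has_mba):
    """y-hat = b0 + b1*x1 + b2*x2, with x2 = 1 for an MBA and 0 otherwise."""
    x2 = 1 if has_mba else 0
    return B0 + B1 * x1 + B2_MBA * x2


gap = predict(10, True) - predict(10, False)
print(gap)
```

The gap between the two predictions equals b2 at every value of x1, which is why the two fitted lines are parallel.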

  36. Nonlinear Relationships Exponential Relationship of Increased Demand for Electricity versus Population Growth (plot of Electricity Demand against Population)

  37. Nonlinear Relationships Diminishing Returns Relationship of Advertising versus Sales (plot of Sales against Advertising)

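  38. Nonlinear Relationships POLYNOMIAL POPULATION REGRESSION MODEL $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_p x^p + \varepsilon$ where: $\beta_0$ = Population’s regression constant; $\beta_j$ = Population’s regression coefficient for variable $x^j$; $j = 1, 2, \ldots, p$; $p$ = Order of the polynomial; $\varepsilon$ = Model error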
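
A polynomial model is still linear in its coefficients: each power of x is treated as a separate independent variable. The sketch shows that expansion and the resulting prediction; the function names and coefficients are hypothetical.

```python
def polynomial_terms(x, p):
    """Expand one x value into the p regressors [x, x^2, ..., x^p]."""
    return [x ** j for j in range(1, p + 1)]


def polynomial_prediction(b0, slopes, x):
    """y-hat = b0 + b1*x + b2*x^2 + ... + bp*x^p."""
    terms = polynomial_terms(x, len(slopes))
    return b0 + sum(b * t for b, t in zip(slopes, terms))


# Hypothetical second-order model: y-hat = 1 + 2x + 0.5x^2.
print(polynomial_terms(3, 3))
print(polynomial_prediction(1.0, [2.0, 0.5], 4))
```

Because the expanded columns are ordinary regressors, the same least squares procedure and the same significance tests apply to them.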

  39. Nonlinear Relationships (Figure 12-18)

  40. Nonlinear Relationships (Figure 12-19)

  41. Nonlinear Relationships (Figure 12-20)

  42. Nonlinear Relationships (Figure 12-21) $R^2 = 0.7272$

  43. Nonlinear Relationships Interaction refers to the case in which one independent variable (such as x2) affects the relationship between another independent variable (x1) and a dependent variable (y).

  44. Nonlinear Relationships A composite model is the model that contains both the basic terms and the interactive terms.

  45. Nonlinear Relationships A Composite Model Basic Terms Interactive Terms
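
The composite model equation is not reproduced in the transcript. As an assumed minimal case, the sketch below uses two basic terms and one interactive term, y-hat = b0 + b1*x1 + b2*x2 + b3*x1*x2, to show how x2 changes the effect of x1. All coefficients are hypothetical.

```python
def composite_prediction(b, x1, x2):
    """Basic terms b1*x1 and b2*x2 plus the interactive term b3*x1*x2."""
    b0, b1, b2, b3 = b
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_of_x1(b, x2):
    """Change in y-hat per unit of x1 at a given x2: b1 + b3*x2."""
    _, b1, _, b3 = b
    return b1 + b3 * x2


b = (10, 2, 5, 3)  # HYPOTHETICAL coefficients b0, b1, b2, b3
print(slope_of_x1(b, 0), slope_of_x1(b, 4))
```

With no interactive term (b3 = 0) the slope on x1 would be the same at every level of x2; here it rises from 2 to 14 as x2 goes from 0 to 4.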

  46. Stepwise Regression Stepwise regression refers to a method which develops the least squares regression equation in steps, either through forward selection, backward elimination, or through standard stepwise regression.

  47. Stepwise Regression The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model.

  48. Best Subsets Regression Cp STATISTIC $C_p = \frac{(1 - R_p^2)(n - T)}{1 - R_T^2} - (n - 2p)$ where: $p$ = $k$ + 1, with $k$ = Number of independent variables in the model; $T$ = 1 + The total number of independent variables to be considered for inclusion in the model; $R_p^2$ = Coefficient of multiple determination for the model with $k$ independent variables ($p$ parameters); $R_T^2$ = Coefficient of multiple determination for the model that contains all $T$ parameters; $n$ = Sample size
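
The formula image for this slide is missing from the transcript, so the sketch below ASSUMES a commonly used form of the statistic, Cp = (1 - Rp^2)(n - T) / (1 - RT^2) - (n - 2p), where n is the sample size. The inputs are hypothetical.

```python
def cp_statistic(r2_p, r2_t, n, p, t):
    """Cp for a candidate model with p parameters out of t possible."""
    return (1 - r2_p) * (n - t) / (1 - r2_t) - (n - 2 * p)


# Hypothetical setting: 25 cases, 4 candidate variables (T = 5).
n, t, r2_full = 25, 5, 0.8

# Candidate model with k = 2 variables (p = 3) and R^2 = 0.75.
cp_candidate = cp_statistic(0.75, r2_full, n, p=3, t=t)

# The full model always has Cp equal to T under this formula.
cp_full = cp_statistic(r2_full, r2_full, n, p=t, t=t)
print(round(cp_candidate, 4), round(cp_full, 4))
```

Candidate models whose Cp is close to or below p are usually the ones kept for closer inspection.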

  49. Analysis of Residuals The following problems can be inferred through graphical analysis of residuals: • The regression function is not linear. • The model errors do not have a constant variance. • The model errors are not independent. • The model errors are not normally distributed.

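  50. Analysis of Residuals RESIDUAL $e_i = y_i - \hat{y}_i$ where: $e_i$ = Residual for the ith observation; $y_i$ = Actual value of y; $\hat{y}_i$ = Predicted value of y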
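
Residuals are computed one observation at a time as actual minus predicted. A minimal sketch with hypothetical values follows; in practice the residuals would then be plotted against the predicted values or against each independent variable to look for the problems listed above.

```python
def residuals(y, y_hat):
    """Residual for each observation: actual y minus predicted y."""
    return [a - p for a, p in zip(y, y_hat)]


# Hypothetical actual values and least squares predictions.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
e = residuals(y, y_hat)
print([round(v, 2) for v in e])
```

For a least squares fit that includes a constant, the residuals sum to zero, so it is their pattern, not their average, that carries the diagnostic information.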
