
FORECASTING WITH REGRESSION MODELS TREND ANALYSIS


Presentation Transcript


  1. BUSINESS FORECASTING FORECASTING WITH REGRESSION MODELS TREND ANALYSIS Prof. Dr. Burç Ülengin ITU MANAGEMENT ENGINEERING FACULTY FALL 2011

  2. OVERVIEW • The bivariate regression model • Data inspection • Regression forecast process • Forecasting with simple linear trend • Causal regression model • Statistical evaluation of regression model • Examples...

  3. The Bivariate Regression Model • The bivariate regression model is also known as the simple regression model • It is a statistical tool that estimates the relationship between a dependent variable (Y) and a single independent variable (X) • The dependent variable is the variable we want to forecast

  4. The Bivariate Regression Model • General form: Y = f(X) • Specific form, the linear regression model: Y = β0 + β1X + ε • Y: dependent variable, X: independent variable, ε: random disturbance

  5. The Bivariate Regression Model • The regression model is in fact a line equation: Y = β0 + β1X + ε • β1 is the slope coefficient: the rate of change in Y per unit change in X • If β1 = 5, a one-unit increase in X causes a 5-unit increase in Y • ε is the random disturbance, which is why Y can take different values for a given X • The objective is to estimate β0 and β1 in such a way that the fitted values are as close as possible to the observed values

  6. The Bivariate Regression Model: Geometrical Representation • [Scatter plot with two candidate lines: a good fit and a poor fit] • The red line is closer to the data points than the blue one

  7. Best Fit Estimates • Population model: Y = β0 + β1X + ε • Sample (fitted) model: Ŷ = b0 + b1X, with residuals e = Y − Ŷ

  8. Best Fit Estimates: OLS • Ordinary least squares (OLS) chooses b0 and b1 to minimize the sum of squared residuals Σe² • b1 = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² • b0 = Ȳ − b1X̄
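A minimal numpy sketch of these least-squares formulas; the data points are purely illustrative:

```python
# Closed-form OLS estimates for the bivariate model, on toy data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of cross-deviations over sum of squared deviations in X
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# Intercept: forces the fitted line through the point of means
b0 = y_bar - b1 * x_bar

residuals = y - (b0 + b1 * x)
print(b0, b1, np.sum(residuals ** 2))  # OLS minimizes this last quantity
```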

  9. Misleading Best Fits • [Four scatter plots, each with Σe² = 100 but very different data patterns] • The same sum of squared residuals can correspond to very different fits, so always inspect the data visually

  10. THE CLASSICAL ASSUMPTIONS • 1. The regression model is linear in the coefficients, correctly specified, and has an additive error term. • 2. The error term has a zero population mean: E(ε) = 0. • 3. All explanatory variables are uncorrelated with the error term. • 4. Errors corresponding to different observations are uncorrelated with each other. • 5. The error term has a constant variance. • 6. No explanatory variable is an exact linear function of any other explanatory variable(s). • 7. The error term is normally distributed: ε ~ N(0, σ²)

  11. Regression Forecasting Process • Data consideration: plot each variable over time and as a scatter plot; look at • Trend • Seasonal fluctuation • Outliers • To forecast Y we need the forecasted value of X • Reserve a holdout period for evaluation and test the estimated equation over it (a small sketch follows below)
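A small sketch of reserving a holdout period, assuming a quarterly pandas Series; here the last four quarters are kept out of estimation:

```python
# Split a quarterly series into an estimation sample and a holdout period.
import pandas as pd

y = pd.Series(range(40),
              index=pd.period_range("1986Q1", periods=40, freq="Q"))

holdout = 4                                  # quarters reserved for evaluation
estimation, test = y.iloc[:-holdout], y.iloc[-holdout:]
print(estimation.index[-1], test.index[0])   # 1994Q4 1995Q1
```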

  12. An Example: Retail Car Sales • The main explanatory variables: • Income • Price of a car • Interest rates (credit usage) • General price level • Population • Car stock: the number of cars sold up to now drives replacement purchases • Expectations about the future • For the simple bivariate regression, income is chosen as the explanatory variable

  13. Bi-variate Regression Model • Population regression model: RCS = β0 + β1·DPI + ε • Our expectation is β1 > 0 • But we do not have all the data at hand; the data set covers only the 1990s • We have to estimate the model over the sample period • Sample regression model: RCS = b0 + b1·DPI + e

  14. Retail Car Sales and Disposable Personal Income Figures • [Time-series plots: quarterly car sales (thousand cars) and disposable income ($)]

  15. OLS Estimate

Dependent Variable: RCS
Method: Least Squares
Sample: 1990:1 1998:4
Included observations: 36

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          541010.9      746347.9     0.724878      0.4735
DPI        62.39428      40.00793     1.559548      0.1281

R-squared 0.066759    Mean dependent var 1704222.
Adjusted R-squared 0.039311    S.D. dependent var 164399.9
S.E. of regression 161136.1    Akaike info criterion 26.87184
Sum squared resid 8.83E+11    Schwarz criterion 26.95981
Log likelihood -481.6931    F-statistic 2.432189
Durbin-Watson stat 1.596908    Prob(F-statistic) 0.128128
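A sketch of reproducing this kind of output with statsmodels; the series below are synthetic stand-ins for RCS and DPI (the slide's actual data are not reproduced here):

```python
# Bivariate OLS in statsmodels on hypothetical stand-in data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
idx = pd.period_range("1990Q1", periods=36, freq="Q")
dpi = pd.Series(np.linspace(17000, 19500, 36), index=idx, name="DPI")
rcs = (1.7e6 + 60 * (dpi - dpi.mean())
       + rng.normal(0, 1.5e5, 36)).rename("RCS")

X = sm.add_constant(dpi)     # adds the intercept term, EViews' C
fit = sm.OLS(rcs, X).fit()
print(fit.summary())         # coefficients, t-stats, R-squared, F, DW
```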

  16. Basic Statistical Evaluation • β1 is the slope coefficient that tells us the rate of change in Y per unit change in X • When DPI increases by one $, the number of cars sold increases by about 62 • Hypothesis test on β1: • H0: β1 = 0 • H1: β1 ≠ 0 • The t-test is used to test H0: t = b1 / se(b1) • If t statistic > t table, or Pr < α (e.g. α = 0.05): Reject H0 • If t statistic < t table, or Pr > α: Do not reject H0 • t = 1.56 < t table, or Pr = 0.1281 > 0.05: Do not reject H0 • DPI has no statistically significant effect on RCS

  17. Basic Statistical Evaluation • R² is the coefficient of determination: the fraction of the variation in Y explained by X • 0 ≤ R² ≤ 1 • R² = 0 indicates no explanatory power of X (the equation) • R² = 1 indicates perfect explanation of Y by X (the equation) • R² = 0.066 indicates very weak explanatory power • Hypothesis test on R²: • H0: R² = 0 • H1: R² ≠ 0 • The F-test checks this hypothesis • If F statistic > F table, or Pr < α (e.g. α = 0.05): Reject H0 • If F statistic < F table, or Pr > α: Do not reject H0 • F-statistic = 2.43 < F table, or Pr = 0.1281 > 0.05: Do not reject H0 • The estimated equation has no statistically significant power to explain the RCS figures
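Continuing the sketch after slide 15 (same hypothetical data and fitted `fit` object), both decision rules can be read directly off the statsmodels results, with the 5% level assumed:

```python
# t-test on the slope and F-test on the equation, read off `fit`.
alpha = 0.05
print(fit.tvalues["DPI"], fit.pvalues["DPI"] < alpha)   # slope t-test
print(fit.rsquared, fit.fvalue, fit.f_pvalue < alpha)   # R-squared and F-test
```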

  18. Graphical Evaluation of Fit and Error Terms • [Plot of actual, fitted, and residual series] • Residuals show a clear seasonal pattern

  19. Model Improvement • When we look at the graphs of the series, RCS exhibits clear seasonal fluctuations, but DPI does not • Remove the seasonality using a seasonal adjustment method • Then use the seasonally adjusted RCS as the dependent variable

  20. Seasonal Adjustment • Sample: 1990:1 1998:4 • Included observations: 36 • Ratio to Moving Average • Original Series: RCS • Adjusted Series: RCSSA • Scaling Factors: Q1 0.941503, Q2 1.119916, Q3 1.016419, Q4 0.933083
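A sketch of ratio-to-moving-average seasonal factors in pandas, on illustrative quarterly data (not the actual RCS series); EViews' exact normalization may differ slightly:

```python
# Ratio-to-moving-average seasonal adjustment on synthetic data.
import numpy as np
import pandas as pd

idx = pd.period_range("1990Q1", periods=36, freq="Q")
rng = np.random.default_rng(1)
trend = np.linspace(1.5e6, 1.9e6, 36)
season = np.tile([0.94, 1.12, 1.02, 0.93], 9)
y = pd.Series(trend * season + rng.normal(0, 2e4, 36), index=idx)

ma4 = y.rolling(4).mean()
cma = (ma4.shift(-1) + ma4.shift(-2)) / 2      # centered 2x4 moving average
ratio = y / cma                                # ratio to moving average

factors = ratio.groupby(ratio.index.quarter).mean()
factors /= factors.mean()                      # factors average to one

y_sa = y / y.index.quarter.map(factors).to_numpy()  # seasonally adjusted
print(factors)
```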

  21. Seasonally Adjusted RCS and RCS

  22. OLS Estimate

Dependent Variable: RCSSA
Method: Least Squares
Sample: 1990:1 1998:4
Included observations: 36

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          481394.3      464812.8     1.035674      0.3077
DPI        65.36559      24.91626     2.623411      0.0129

R-squared 0.168344    Mean dependent var 1700000.
Adjusted R-squared 0.143883    S.D. dependent var 108458.4
S.E. of regression 100352.8    Akaike info criterion 25.92472
Sum squared resid 3.42E+11    Schwarz criterion 26.01270
Log likelihood -464.6450    F-statistic 6.882286
Durbin-Watson stat 0.693102    Prob(F-statistic) 0.012939

  23. Basic Statistical Evaluation • β1 is the slope coefficient that tells us the rate of change in Y per unit change in X • When DPI increases by one $, the number of cars sold increases by about 65 • Hypothesis test on β1: • H0: β1 = 0 • H1: β1 ≠ 0 • The t-test is used to test H0: t = b1 / se(b1) • If t statistic > t table, or Pr < α (e.g. α = 0.05): Reject H0 • If t statistic < t table, or Pr > α: Do not reject H0 • t = 2.62 > t table, or Pr = 0.0129 < 0.05: Reject H0 • DPI has a statistically significant effect on RCS

  24. Basic Statistical Evaluation • R² is the coefficient of determination: the fraction of the variation in Y explained by X • 0 ≤ R² ≤ 1 • R² = 0 indicates no explanatory power of X (the equation) • R² = 1 indicates perfect explanation of Y by X (the equation) • R² = 0.1683 still indicates weak explanatory power • Hypothesis test on R²: • H0: R² = 0 • H1: R² ≠ 0 • The F-test checks this hypothesis • If F statistic > F table, or Pr < α (e.g. α = 0.05): Reject H0 • If F statistic < F table, or Pr > α: Do not reject H0 • F-statistic = 6.88 > F table, or Pr = 0.0129 < 0.05: Reject H0 • The estimated equation has some power to explain the RCS figures

  25. Graphical Evaluation of Fit and Error Terms • [Plot of actual, fitted, and residual series] • No seasonality, but the residuals still do not look like a random disturbance • Omitted variable? Business cycle?

  26. Trend Models

  27. Simple Regression Model, Special Case: The Trend Model • The independent variable is time: t = 1, 2, 3, ..., T−1, T • There is no need to forecast the independent variable • Using simple transformations, a variety of nonlinear trend equations can be estimated, so the estimated model can mimic the pattern of the data
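A minimal sketch of the trend model with statsmodels; the series is illustrative, generated to roughly mimic the tuition-index fit on slide 30:

```python
# Linear trend model: the only regressor is the time index itself.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
t = np.arange(36)                      # EViews' @TREND: 0, 1, ..., T-1
y = 115.7 + 3.84 * t + rng.normal(0, 6, 36)

fit = sm.OLS(y, sm.add_constant(t)).fit()

# Forecasting only requires extending the time index past the sample
t_future = np.arange(36, 40)
forecast = fit.predict(sm.add_constant(t_future))
print(fit.params, forecast)
```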

  28. Suitable Data Pattern

  29. Chapter 3, Exercise 13: College Tuition Consumers' Price Index by Quarter • [Time-series plot with the holdout period marked]

  30. OLS Estimates

Dependent Variable: FEE
Method: Least Squares
Sample: 1986:1 1994:4
Included observations: 36

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          115.7312      1.982166     58.38624      0.0000
@TREND     3.837580      0.097399     39.40080      0.0000

R-squared 0.978568    Mean dependent var 182.8889
Adjusted R-squared 0.977938    S.D. dependent var 40.87177
S.E. of regression 6.070829    Akaike info criterion 6.498820
Sum squared resid 1253.069    Schwarz criterion 6.586793
Log likelihood -114.9788    F-statistic 1552.423
Durbin-Watson stat 0.284362    Prob(F-statistic) 0.000000

  31. Basic Statistical Evaluation • β1 is the slope coefficient that tells us the rate of change in Y per unit change in X • Each quarter the tuition index increases by about 3.84 points • Hypothesis test on β1: • H0: β1 = 0 • H1: β1 ≠ 0 • The t-test is used to test H0: t = b1 / se(b1) • If t statistic > t table, or Pr < α (e.g. α = 0.05): Reject H0 • If t statistic < t table, or Pr > α: Do not reject H0 • t = 39.4 > t table, or Pr = 0.0000 < 0.05: Reject H0

  32. Basic Statistical Evaluation • R² is the coefficient of determination: the fraction of the variation in Y explained by X • 0 ≤ R² ≤ 1 • R² = 0 indicates no explanatory power of X (the equation) • R² = 1 indicates perfect explanation of Y by X (the equation) • R² = 0.9785 indicates very strong explanatory power • Hypothesis test on R²: • H0: R² = 0 • H1: R² ≠ 0 • The F-test checks this hypothesis • If F statistic > F table, or Pr < α (e.g. α = 0.05): Reject H0 • If F statistic < F table, or Pr > α: Do not reject H0 • F-statistic = 1552 > F table, or Pr = 0.0000 < 0.05: Reject H0 • The estimated equation has explanatory power

  33. Graphical Evaluation of Fit: Holdout Period

Quarter    Actual   Forecast
1995 Q1    260.00   253.88
1995 Q2    259.00   257.72
1995 Q3    266.00   261.55
1995 Q4    274.00   265.39

  34. Graphical Evaluation of Fit and Error Terms • [Plot of actual, fitted, and residual series] • Residuals exhibit a clear pattern; they are not random • The seasonal fluctuations also cannot be modelled • The regression model is misspecified

  35. Model Improvement • The data may exhibit an exponential trend • In this case, take the logarithm of the dependent variable • Estimate the trend by OLS • After the OLS estimation, forecast the holdout period • Take the exponential of the forecasted logarithmic values to return to the original units (a sketch follows below)
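A sketch of this log-trend recipe: fit log(y) on a trend, forecast, then exponentiate back to original units; the data are illustrative, generated with an exponential trend:

```python
# Log-linear (exponential) trend model and back-transformed forecasts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
t = np.arange(36)
y = 123.0 * np.exp(0.021 * t + rng.normal(0, 0.02, 36))

fit = sm.OLS(np.log(y), sm.add_constant(t)).fit()   # trend in logs

t_future = np.arange(36, 44)                        # holdout quarters
log_forecast = fit.predict(sm.add_constant(t_future))
forecast = np.exp(log_forecast)                     # back to original units
print(forecast)
```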

  36. Suitable Data Pattern

  37. Original and Logarithmic Transformed Data

LOG(FEE)   FEE
4.844187   127.000
4.844187   127.000
4.867534   130.000
4.912655   136.000
4.912655   136.000
4.919981   137.000
4.941642   140.000
4.976734   145.000
4.983607   146.000

  38. OLS Estimate of the Logarithmic Trend Model

Dependent Variable: LFEE
Method: Least Squares
Sample: 1986:1 1994:4
Included observations: 36

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          4.816708      0.005806     829.5635      0.0000
@TREND     0.021034      0.000285     73.72277      0.0000

R-squared 0.993783    Mean dependent var 5.184797
Adjusted R-squared 0.993600    S.D. dependent var 0.222295
S.E. of regression 0.017783    Akaike info criterion -5.167178
Sum squared resid 0.010752    Schwarz criterion -5.079205
Log likelihood 95.00921    F-statistic 5435.047
Durbin-Watson stat 0.893477    Prob(F-statistic) 0.000000

  39. Forecast Calculations

obs      FEE        LFEEF      FEELF = exp(LFEEF)
1993:1   228.0000   5.405651   222.6610
1993:2   228.0000   5.426684   227.3940
1993:3   235.0000   5.447718   232.2276
1993:4   243.0000   5.468751   237.1639
1994:1   244.0000   5.489785   242.2052
1994:2   245.0000   5.510819   247.3536
1994:3   251.0000   5.531852   252.6114
1994:4   259.0000   5.552886   257.9810
1995:1   260.0000   5.573920   263.4648
1995:2   259.0000   5.594953   269.0651
1995:3   266.0000   5.615987   274.7845
1995:4   274.0000   5.637021   280.6254

  40. Graphical Evaluation of Fit and Error Terms • [Plot of actual, fitted, and residual series] • Residuals exhibit a clear pattern; they are not random • The seasonal fluctuations also cannot be modelled • The regression model is misspecified

  41. Model Improvement • To deal with the seasonal variation, remove the seasonal pattern from the data • Fit the regression model to the seasonally adjusted data • Generate forecasts • Add the seasonal movements back to the forecasted values (a sketch follows below)
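A sketch of this recipe with multiplicative factors: deseasonalize, fit the trend, forecast, then reseasonalize. Data and factors are illustrative (the factors roughly mimic those on slide 43):

```python
# Deseasonalize -> fit trend -> forecast -> reseasonalize, on toy data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

idx = pd.period_range("1986Q1", periods=40, freq="Q")
rng = np.random.default_rng(4)
factors = pd.Series([1.002, 0.985, 0.997, 1.016], index=[1, 2, 3, 4])

t = np.arange(40)
y = pd.Series((115.0 + 3.9 * t) * idx.quarter.map(factors).to_numpy()
              + rng.normal(0, 3, 40), index=idx)

y_sa = y / idx.quarter.map(factors).to_numpy()       # deseasonalize
fit = sm.OLS(y_sa, sm.add_constant(t)).fit()         # trend on adjusted data

t_future = np.arange(40, 44)
idx_future = pd.period_range("1996Q1", periods=4, freq="Q")
sa_forecast = fit.predict(sm.add_constant(t_future))
forecast = sa_forecast * idx_future.quarter.map(factors).to_numpy()
print(forecast)
```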

  42. Suitable Data Pattern

  43. Multiplicative Seasonal Adjustment • Included observations: 40 • Ratio to Moving Average • Original Series: FEE • Adjusted Series: FEESA • Scaling Factors: Q1 1.002372, Q2 0.985197, Q3 0.996746, Q4 1.015929

  44. Original and Seasonally Adjusted Data

  45. OLS Estimate of the Seasonally Adjusted Trend Model

Dependent Variable: FEESA
Method: Least Squares
Sample: 1986:1 1995:4
Included observations: 40

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          115.0387      1.727632     66.58749      0.0000
@TREND     3.897488      0.076240     51.12152      0.0000

R-squared 0.985668    Mean dependent var 191.0397
Adjusted R-squared 0.985291    S.D. dependent var 45.89346
S.E. of regression 5.566018    Akaike info criterion 6.319943
Sum squared resid 1177.261    Schwarz criterion 6.404387
Log likelihood -124.3989    F-statistic 2613.410
Durbin-Watson stat 0.055041    Prob(F-statistic) 0.000000

  46. Graphical Evaluation of Fit and Error Terms • [Plot of actual, fitted, and residual series] • Residuals exhibit a clear pattern; they are not random • There are no seasonal fluctuations • The regression model is misspecified

  47. Model Improvement • Take the logarithm to remove the existing nonlinearity • Apply additive seasonal adjustment to the logarithmic data • Apply OLS to the seasonally adjusted logarithmic data • Forecast the holdout period • Add the seasonal movements back to obtain seasonal forecasts • Take the exponential to return to the original units (a sketch follows below)
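A sketch of this combined recipe: log transform, additive seasonal adjustment (difference from moving average), trend fit, then add the factors back and exponentiate. The data are illustrative:

```python
# Log transform + additive seasonal adjustment + trend, on toy data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

idx = pd.period_range("1986Q1", periods=40, freq="Q")
rng = np.random.default_rng(5)
t = np.arange(40)
ly = pd.Series(4.82 + 0.021 * t
               + np.tile([0.002, -0.015, -0.003, 0.016], 10)
               + rng.normal(0, 0.01, 40), index=idx)   # log of the series

ma4 = ly.rolling(4).mean()
cma = (ma4.shift(-1) + ma4.shift(-2)) / 2       # centered 2x4 moving average
diff = ly - cma                                 # difference from moving avg
factors = diff.groupby(diff.index.quarter).mean()
factors -= factors.mean()                       # center the additive factors

ly_sa = ly - idx.quarter.map(factors).to_numpy()
fit = sm.OLS(ly_sa, sm.add_constant(t)).fit()

t_future = np.arange(40, 44)
q_future = pd.period_range("1996Q1", periods=4, freq="Q").quarter
log_forecast = (fit.predict(sm.add_constant(t_future))
                + q_future.map(factors).to_numpy())
forecast = np.exp(log_forecast)                 # back to original units
print(forecast)
```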

  48. Suitable Data Pattern

  49. Logarithmic Transformation and Additive Seasonal Adjustment • Sample: 1986:1 1995:4 • Included observations: 40 • Difference from Moving Average • Original Series: LFEE = log(FEE) • Adjusted Series: LFEESA • Scaling Factors: Q1 0.002216, Q2 -0.014944, Q3 -0.003099, Q4 0.015828

  50. Original and Additively Seasonally Adjusted Logarithmic Series
