Chapter 5 Heteroskedasticity
What is in this Chapter? • How do we detect this problem? • What are the consequences of this problem? • What are the solutions?
What is in this Chapter? • First, we discuss tests based on OLS residuals, the likelihood ratio test, the Goldfeld-Quandt (G-Q) test, and the Breusch-Pagan (B-P) test. The last one is an LM test. • Regarding consequences, we show that the OLS estimators are unbiased but inefficient, and that the standard errors are biased, thus invalidating tests of significance.
What is in this Chapter? • Regarding solutions, we discuss solutions depending on particular assumptions about the error variance and general solutions. • We also discuss transformation of variables to logs and the problems associated with deflators, both of which are commonly used as solutions to the heteroskedasticity problem.
5.1 Introduction • Homoskedasticity: the variance of the error terms is constant • Heteroskedasticity: the variance of the error terms is not constant • Illustrative Example • Table 5.1 presents consumption expenditures (y) and income (x) for 20 families. • Suppose that we estimate the equation by ordinary least squares.
5.1 Introduction • We get (figures in parentheses are standard errors)
$\hat{y} = \underset{(0.703)}{0.847} + \underset{(0.0253)}{0.899}\,x, \qquad R^2 = 0.986, \qquad \text{RSS} = 31.074$
(see Section 5.4)
5.1 Introduction • The residuals from the log-linear form of this equation are presented in Table 5.3. • In this situation there is no perceptible increase in the magnitudes of the residuals as the value of x increases. • Thus there does not appear to be a heteroskedasticity problem in the log-linear form.
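The informal check described in this section (fit by OLS, then see whether the residuals grow in magnitude with x) can be sketched as follows. This is a minimal illustration, not the chapter's computation: the data are hypothetical (Table 5.1 is not reproduced here) and are constructed so that the error size grows with x; the function name `ols_simple` is an assumption of this sketch.

```python
import math

def ols_simple(x, y):
    """Fit y = a + b*x by OLS. Returns (a, b, residuals, se_a, se_b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)          # unbiased error variance
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))
    return a, b, resid, se_a, se_b

# Hypothetical data (NOT Table 5.1): the error magnitude is 5% of x,
# with alternating sign, so heteroskedasticity is built in.
x = [5 + 2 * i for i in range(20)]
y = [1.0 + 0.9 * xi + (-1) ** i * 0.05 * xi for i, xi in enumerate(x)]

a, b, resid, se_a, se_b = ols_simple(x, y)

# Compare the average residual magnitude for small and large x.
half = len(x) // 2
mean_abs_low = sum(abs(e) for e in resid[:half]) / half
mean_abs_high = sum(abs(e) for e in resid[half:]) / (len(x) - half)

print(f"y = {a:.3f} + {b:.3f} x   (se: {se_a:.3f}, {se_b:.4f})")
print(f"mean |resid|, small x: {mean_abs_low:.3f}")
print(f"mean |resid|, large x: {mean_abs_high:.3f}")
```

A clearly larger average magnitude in the large-x half is only a casual indication; the formal tests in Section 5.2 are needed before drawing a conclusion.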
5.2 Detection of Heteroskedasticity • In the illustrative example in Section 5.1 we plotted the estimated residuals $\hat{u}_i$ against $x_i$ to see whether there is any systematic pattern in the residuals that suggests heteroskedasticity in the errors. • Note, however, that by virtue of the normal equations, $\hat{u}_i$ and $x_i$ are uncorrelated, though $\hat{u}_i^2$ could be correlated with $x_i$.
5.2 Detection of Heteroskedasticity • Thus if we are using a regression procedure to test for heteroskedasticity, we should use a regression of $\hat{u}_i$ on $x_i^2, x_i^3, \ldots$ or a regression of $|\hat{u}_i|$ or $\hat{u}_i^2$ on $x_i, x_i^2, x_i^3, \ldots$ • In the case of multiple regression, we should use powers of $\hat{y}_i$, the predicted value of $y_i$, or powers of all the explanatory variables.
5.2 Detection of Heteroskedasticity • The test suggested by Anscombe and a test called RESET suggested by Ramsey both involve regressing $\hat{u}_i$ on $\hat{y}_i^2, \hat{y}_i^3, \ldots$ and testing whether or not the coefficients are significant. • The test suggested by White involves regressing $\hat{u}_i^2$ on all the explanatory variables and their squares and cross products. For instance, with explanatory variables x1, x2, x3, it involves regressing $\hat{u}_i^2$ on $x_1, x_2, x_3, x_1^2, x_2^2, x_3^2, x_1x_2, x_2x_3, x_3x_1$.
5.2 Detection of Heteroskedasticity • Glejser suggested estimating regressions of the type $|\hat{u}_i| = \alpha + \beta x_i$, $|\hat{u}_i| = \alpha + \beta / x_i$, $|\hat{u}_i| = \alpha + \beta \sqrt{x_i}$, and so on, and testing the hypothesis $\beta = 0$.
5.2 Detection of Heteroskedasticity • The implicit assumption behind all these tests is that $\operatorname{var}(u_i) = \sigma_i^2 = f(z_i)$, where $z_i$ is an unknown variable, and the different tests use different proxies or surrogates for the unknown function f(z).
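A rough sketch of Glejser-type regressions: the absolute OLS residuals are regressed on x, 1/x, and sqrt(x), and the t-ratio of the slope is reported for each form. The data are hypothetical (not Table 5.1), and the helper names `ols_simple` and `glejser_t_ratios` are assumptions of this sketch.

```python
import math

def ols_simple(x, y):
    """Fit y = a + b*x by OLS. Returns (a, b, residuals, se_b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return a, b, resid, math.sqrt(s2 / sxx)

def glejser_t_ratios(x, y):
    """Regress |OLS residual| on several functions of x; return slope t-ratios."""
    _, _, resid, _ = ols_simple(x, y)
    abs_u = [abs(e) for e in resid]
    forms = {
        "x": lambda v: v,
        "1/x": lambda v: 1.0 / v,
        "sqrt(x)": math.sqrt,
    }
    t_ratios = {}
    for name, g in forms.items():
        z = [g(v) for v in x]
        _, beta, _, se_beta = ols_simple(z, abs_u)
        t_ratios[name] = beta / se_beta   # tests H0: beta = 0
    return t_ratios

# Hypothetical data (NOT Table 5.1) with error size proportional to x.
x = [5 + 2 * i for i in range(20)]
y = [1.0 + 0.9 * xi + (-1) ** i * 0.05 * xi for i, xi in enumerate(x)]

t = glejser_t_ratios(x, y)
for name, value in t.items():
    print(f"|u| on {name:8s}: t-ratio of beta = {value:.2f}")
```

Each t-ratio would be compared with the usual t critical value for n - 2 degrees of freedom; a significant slope in any form is taken as evidence against homoskedasticity.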
5.2 Detection of Heteroskedasticity • Thus there is evidence of heteroskedasticity even in the log-linear form, although from a casual look at the residuals in Table 5.3 we concluded earlier that the errors were homoskedastic. • The Goldfeld-Quandt test, to be discussed later in this section, also did not reject the hypothesis of homoskedasticity. • The Glejser tests, however, show significant heteroskedasticity in the log-linear form.
Assignment • Redo this illustrative example • Plot the absolute value of the residuals against the x variable • Linear form • Log-linear form • Three types of tests: • Linear form and log-linear form • The EViews output table • Reject or do not reject the null hypothesis of homoskedasticity (constant variance)
5.2 Detection of Heteroskedasticity • Some Other Tests (General tests) • Likelihood Ratio Test • Goldfeld and Quandt Test • Breusch-Pagan Test
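5.2 Detection of Heteroskedasticity Likelihood Ratio Test • If the number of observations is large, one can use a likelihood ratio test. • Divide the residuals (estimated from the OLS regression) into k groups with $n_i$ observations in the i-th group, $\sum_i n_i = n$. • Estimate the error variance in each group by $\hat{\sigma}_i^2$. • Let the estimate of the error variance from the entire sample be $\hat{\sigma}^2$. Then if we define $\lambda$ as $\lambda = \prod_{i=1}^{k} \hat{\sigma}_i^{\,n_i} \big/ \hat{\sigma}^{\,n}$, the statistic $-2\ln\lambda = n\ln\hat{\sigma}^2 - \sum_{i=1}^{k} n_i \ln\hat{\sigma}_i^2$ has, under homoskedasticity and in large samples, a $\chi^2$ distribution with $k-1$ d.f.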
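The grouped likelihood ratio statistic can be computed in a few lines. The sketch below assumes the standard form $-2\ln\lambda = n\ln\hat{\sigma}^2 - \sum_i n_i\ln\hat{\sigma}_i^2$ with $k-1$ degrees of freedom, and uses hypothetical residuals rather than those of the chapter's example; `lr_statistic` is a name chosen for this sketch.

```python
import math

def lr_statistic(groups):
    """Grouped LR statistic for equal error variances.

    groups: list of lists of OLS residuals, one inner list per group.
    Returns (statistic, degrees_of_freedom).
    """
    n = sum(len(g) for g in groups)
    pooled_var = sum(e * e for g in groups for e in g) / n
    stat = n * math.log(pooled_var)
    for g in groups:
        group_var = sum(e * e for e in g) / len(g)
        stat -= len(g) * math.log(group_var)
    return stat, len(groups) - 1

# Hypothetical residuals: group 2 has three times the spread of group 1.
group_1 = [1.0, -1.0] * 5
group_2 = [3.0, -3.0] * 5

stat, df = lr_statistic([group_1, group_2])
chi2_5pct_1df = 3.84   # 5% point of chi-square with 1 d.f.
print(f"-2 ln(lambda) = {stat:.3f} with {df} d.f.")
print("reject homoskedasticity at 5%:", stat > chi2_5pct_1df)
```

When every group has the same estimated variance the statistic is zero, and it grows as the group variances diverge.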
5.2 Detection of Heteroskedasticity • Goldfeld and Quandt Test • If we do not have large samples, we can use the Goldfeld and Quandt test. • In this test we split the observations into two groups, one corresponding to large values of x and the other corresponding to small values of x.
5.2 Detection of Heteroskedasticity • Fit separate regressions for each group and then apply an F-test to test the equality of the error variances. • Goldfeld and Quandt suggest omitting some observations in the middle to increase our ability to discriminate between the two error variances.
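The Goldfeld-Quandt steps (order by x, optionally drop middle observations, fit each group separately, form the ratio of residual variances) can be sketched as below. The data are hypothetical, and the function names and the `n_omit` parameter are assumptions of this sketch.

```python
def residual_variance(x, y):
    """Unbiased residual variance, RSS / (n - 2), from an OLS fit of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return rss / (n - 2)

def goldfeld_quandt(x, y, n_omit=0):
    """Return (F, df_numerator, df_denominator) for the G-Q test."""
    pairs = sorted(zip(x, y))            # order observations by x
    n = len(pairs)
    k = (n - n_omit) // 2                # size of each outer group
    low, high = pairs[:k], pairs[n - k:]
    s2_low = residual_variance(*zip(*low))
    s2_high = residual_variance(*zip(*high))
    return s2_high / s2_low, k - 2, k - 2

# Hypothetical data (NOT Table 5.1) with error size proportional to x.
x = [5 + 2 * i for i in range(20)]
y = [1.0 + 0.9 * xi + (-1) ** i * 0.05 * xi for i, xi in enumerate(x)]

F, df1, df2 = goldfeld_quandt(x, y, n_omit=4)
print(f"F = {F:.2f} with d.f. ({df1}, {df2})")
```

The F value is compared with the tabulated F point for the reported degrees of freedom; here the larger-x group is placed in the numerator because its variance is suspected to be larger.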
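5.2 Detection of Heteroskedasticity Breusch-Pagan Test • Suppose that $\operatorname{var}(u_i) = \sigma_i^2$. • If there are some variables $z_1, z_2, \ldots, z_r$ that influence the error variance and if $\sigma_i^2 = f(\alpha_0 + \alpha_1 z_{1i} + \cdots + \alpha_r z_{ri})$, then the Breusch and Pagan test is a test of the hypothesis $H_0\colon \alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$. • The function $f(\cdot)$ can be any function.
5.2 Detection of Heteroskedasticity • For instance, f(x) can be $x$, $x^2$, $e^x$, and so on. • The Breusch and Pagan test does not depend on the functional form. • Let S0 = regression sum of squares from a regression of $\hat{u}_i^2$ on $z_1, z_2, \ldots, z_r$. Then $\lambda = S_0 / (2\hat{\sigma}^4)$, where $\hat{\sigma}^2 = \sum_i \hat{u}_i^2 / n$, has a $\chi^2$ distribution with d.f. r. • This test is an asymptotic test. An intuitive justification for the test will be given after an illustrative example.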
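A minimal sketch of the Breusch-Pagan statistic for the simplest case of a single variable z (so r = 1), taking $\hat{\sigma}^2 = \sum \hat{u}_i^2 / n$. The data are hypothetical and the function names are assumptions of this sketch; with several z variables a multiple regression would be needed for S0.

```python
def ols_residuals(x, y):
    """Residuals from an OLS fit of y on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return [yi - a - b * xi for xi, yi in zip(x, y)]

def breusch_pagan(resid, z):
    """B-P statistic S0 / (2 * sigma_hat^4) for one variable z (r = 1)."""
    n = len(resid)
    u2 = [e * e for e in resid]
    sigma2 = sum(u2) / n                  # sigma_hat^2; also the mean of u^2
    zbar = sum(z) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szu = sum((zi - zbar) * (ui - sigma2) for zi, ui in zip(z, u2))
    slope = szu / szz
    s0 = slope * slope * szz              # regression sum of squares of u^2 on z
    return s0 / (2.0 * sigma2 ** 2)

# Hypothetical data (NOT Table 5.1) with error size proportional to x.
x = [5 + 2 * i for i in range(20)]
y = [1.0 + 0.9 * xi + (-1) ** i * 0.05 * xi for i, xi in enumerate(x)]

bp = breusch_pagan(ols_residuals(x, y), x)   # use z = x
chi2_5pct_1df = 3.84                          # 5% point of chi-square, 1 d.f.
print(f"B-P statistic = {bp:.3f}; 5% critical value = {chi2_5pct_1df}")
```

Because the test is asymptotic, a comparison based on 20 observations should be read with caution.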
5.2 Detection of Heteroskedasticity Illustrative Example • Consider the data in Table 5.1. To apply the Goldfeld-Quandt test we consider two groups of 10 observations each, ordered by the values of the variable x. • The first group consists of observations 6, 11, 9, 4, 14, 15, 19, 20, 1, and 16. • The second group consists of the remaining 10.
5.2 Detection of Heteroskedasticity Illustrative Example • The estimated equations were (figures in parentheses are standard errors)
Group 1: $\hat{y} = \underset{(0.616)}{1.0533} + \underset{(0.038)}{0.876}\,x, \qquad R^2 = 0.985, \qquad \hat{\sigma}^2 = 0.475$
Group 2: $\hat{y} = \underset{(3.443)}{3.279} + \underset{(0.096)}{0.835}\,x, \qquad R^2 = 0.904, \qquad \hat{\sigma}^2 = 3.154$
5.2 Detection of Heteroskedasticity • The F-ratio for the test is $F = 3.154 / 0.475 = 6.64$. • The 1% point for the F-distribution with d.f. 8 and 8 is 6.03. • Thus the F-value is significant at the 1% level and we reject the hypothesis of homoskedasticity.
5.2 Detection of Heteroskedasticity • For the log-linear form the estimated equations were
Group 1: $\widehat{\log y} = \underset{(0.079)}{0.128} + \underset{(0.030)}{0.934}\,\log x, \qquad R^2 = 0.992, \qquad \hat{\sigma}^2 = 0.001596$
Group 2: $\widehat{\log y} = \underset{(0.352)}{0.276} + \underset{(0.099)}{0.902}\,\log x, \qquad R^2 = 0.912, \qquad \hat{\sigma}^2 = 0.002789$
• The F-ratio for the test is $F = 0.002789 / 0.001596 = 1.75$.
5.2 Detection of Heteroskedasticity • For d.f. 8 and 8, the 5% point from the F-tables is 3.44. • Thus if we use the 5% significance level, we reject the hypothesis of homoskedasticity if we consider the linear form but do not reject it in the log-linear form. • Note that the White test rejected the hypothesis in both forms.
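The arithmetic behind the two Goldfeld-Quandt decisions can be checked directly from the group variances and critical values quoted in this example. The variable names are illustrative only.

```python
# Estimated error variances quoted in the example (group 1 = small x,
# group 2 = large x).
s2_linear_low, s2_linear_high = 0.475, 3.154
s2_log_low, s2_log_high = 0.001596, 0.002789

# F-ratios: larger-x group variance over smaller-x group variance.
f_linear = s2_linear_high / s2_linear_low
f_log = s2_log_high / s2_log_low

# Tabulated F points for d.f. (8, 8), as quoted in the example.
crit_1pct, crit_5pct = 6.03, 3.44

reject_linear = f_linear > crit_5pct
reject_log = f_log > crit_5pct

print(f"linear form:     F = {f_linear:.2f}, reject at 5%: {reject_linear}")
print(f"log-linear form: F = {f_log:.2f}, reject at 5%: {reject_log}")
```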
5.4 Solutions to the Heteroskedasticity Problem • There are two types of solutions that have been suggested in the literature for the problem of heteroskedasticity: • Solutions dependent on particular assumptions about σi. • General solutions. • We first discuss category 1: weighted least squares (WLS)
5.4 Solutions to the Heteroskedasticity Problem Thus the constant term in this equation is the slope coefficient in the original equation.
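The statement about the constant term can be seen from a short derivation. The sketch below assumes the common textbook case in which the error standard deviation is proportional to $x_i$; this particular variance assumption is not stated on the slide and is used here only for illustration.

```latex
\begin{align*}
  y_i &= \alpha + \beta x_i + u_i,
      \qquad \operatorname{var}(u_i) = \sigma^2 x_i^2
      \quad \text{(assumed for illustration)} \\[4pt]
  \frac{y_i}{x_i} &= \alpha \,\frac{1}{x_i} + \beta + v_i,
      \qquad v_i = \frac{u_i}{x_i} \\[4pt]
  \operatorname{var}(v_i) &= \frac{\operatorname{var}(u_i)}{x_i^2} = \sigma^2
\end{align*}
```

The transformed errors $v_i$ are homoskedastic, so OLS on the transformed equation is appropriate; its constant term estimates $\beta$ and the coefficient of $1/x_i$ estimates $\alpha$.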
5.4 Solutions to the Heteroskedasticity Problem • Prais and Houthakker found in their analysis of family budget data that the errors from the equation $y_i = \alpha + \beta x_i + u_i$ had variance increasing with household income. • They considered a model $\sigma_i^2 = \sigma^2 [E(y_i)]^2$, that is, $\sigma_i^2 = \sigma^2 (\alpha + \beta x_i)^2$. • In this case we cannot divide the whole equation by a known constant as before. • For this model we can consider a two-step procedure as follows.
5.4 Solutions to the Heteroskedasticity Problem • First estimate $\alpha$ and $\beta$ by OLS. • Let these estimators be $\hat{\alpha}$ and $\hat{\beta}$. • Now use the WLS procedure as outlined earlier, that is, regress $y_i / (\hat{\alpha} + \hat{\beta} x_i)$ on $1 / (\hat{\alpha} + \hat{\beta} x_i)$ and $x_i / (\hat{\alpha} + \hat{\beta} x_i)$ with no constant term. • The limitation of the two-step procedure is that the estimation error in the first step carries over to the second step.
5.4 Solutions to the Heteroskedasticity Problem • This procedure is called a two-step weighted least squares procedure. • The standard errors we get for the estimates of $\alpha$ and $\beta$ from this procedure are valid only asymptotically. • They are asymptotic standard errors because the weights have been estimated.
5.4 Solutions to the Heteroskedasticity Problem • One can iterate this WLS procedure further, that is, use the new estimates of $\alpha$ and $\beta$ to construct new weights, apply the WLS procedure again, and repeat until convergence. • This procedure is called the iterated weighted least squares procedure. However, there is no gain in (asymptotic) efficiency from iteration.
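The two-step and iterated WLS procedures can be sketched as follows, assuming the model in which the error standard deviation is proportional to $\alpha + \beta x_i$. The data are hypothetical (not Table 5.1), and the function names and the `n_iter` parameter are assumptions of this sketch; `n_iter=1` gives the two-step estimator and larger values iterate the reweighting.

```python
def ols_simple(x, y):
    """OLS fit of y = a + b*x. Returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def wls_prais_houthakker(x, y, n_iter=1):
    """WLS with weights from fitted values w_i = a + b*x_i.

    Step 1: OLS gives starting values for (a, b).
    Step 2: regress y/w on 1/w and x/w with no constant term.
    Repeating step 2 with updated weights gives iterated WLS.
    """
    a, b = ols_simple(x, y)
    for _ in range(n_iter):
        w = [a + b * xi for xi in x]          # assumed positive for all i
        ys = [yi / wi for yi, wi in zip(y, w)]
        z1 = [1.0 / wi for wi in w]
        z2 = [xi / wi for xi, wi in zip(x, w)]
        # Normal equations of the no-constant regression of ys on z1, z2.
        s11 = sum(v * v for v in z1)
        s22 = sum(v * v for v in z2)
        s12 = sum(p * q for p, q in zip(z1, z2))
        t1 = sum(p * q for p, q in zip(z1, ys))
        t2 = sum(p * q for p, q in zip(z2, ys))
        det = s11 * s22 - s12 * s12
        a = (t1 * s22 - t2 * s12) / det       # coefficient of 1/w
        b = (t2 * s11 - t1 * s12) / det       # coefficient of x/w
    return a, b

# Hypothetical data (NOT Table 5.1) with error size proportional to x.
x = [5 + 2 * i for i in range(20)]
y = [1.0 + 0.9 * xi + (-1) ** i * 0.05 * xi for i, xi in enumerate(x)]

a_ols, b_ols = ols_simple(x, y)
a_wls, b_wls = wls_prais_houthakker(x, y)               # two-step
a_it, b_it = wls_prais_houthakker(x, y, n_iter=5)       # iterated

print(f"OLS:          a = {a_ols:.3f}, b = {b_ols:.3f}")
print(f"two-step WLS: a = {a_wls:.3f}, b = {b_wls:.3f}")
print(f"iterated WLS: a = {a_it:.3f}, b = {b_it:.3f}")
```

The coefficients returned are those of the original equation, since the coefficient of 1/w is the intercept and the coefficient of x/w is the slope.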
5.4 Solutions to the Heteroskedasticity Problem • Illustrative Example As an illustration, again consider the data in Table 5.1. We saw earlier that regressing the absolute values of the residuals on x (in Glejser's tests) gave the following estimates: Now we regress (with no constant term) where .
5.4 Solutions to the Heteroskedasticity Problem The resulting equation is If we assume that , the two-step WLS procedure would be as follows. Section 5.1
5.4 Solutions to the Heteroskedasticity Problem • Next we compute and regress . The results were • The $R^2$ values in these equations are not comparable. But our interest is in estimates of the parameters in the consumption function.
Assignment • Use the data of Table 5.1 to do the WLS • Consider the log-linear form • Run Glejser's tests to check whether the log-linear regression model still has non-constant variance • Estimate the non-constant variance and run the WLS • Write a one-step program in GAUSS
5.5 Heteroskedasticity and the Use of Deflators • There are two remedies often suggested and used for solving the heteroskedasticity problem: • Transforming the data to logs • Deflating the variables by some measure of "size."
5.5 Heteroskedasticity and the Use of Deflators • One important thing to note is that the purpose of all these deflation procedures is to get more efficient estimates of the parameters. • But once those estimates have been obtained, one should make all inferences (calculation of the residuals, prediction of future values, etc.) from the original equation, not from the equation in the deflated variables.
5.5 Heteroskedasticity and the Use of Deflators • Another point to note is that since the purpose of deflation is to get more efficient estimates, it is tempting to argue about the merits of the different procedures by looking at the standard errors of the coefficients. • However, this is not correct, because in the presence of heteroskedasticity the standard errors themselves are biased, as we showed earlier.