## Lecture 10: Heteroskedasticity


**Lecture 10: Heteroskedasticity**

Econ 488

**Order of Testing**

• Omitted variables and incorrect functional form (Adjusted $R^2$)
• Either A or B, but not both
• Serial Correlation (Durbin-Watson)
• Heteroskedasticity (Park’s Test, White’s Test)
• Multicollinearity (Correlation Matrix, VIF)
• Irrelevant Variables (t-test)

**Homoskedasticity**

• Ideal case: homoskedasticity
• Error variance $\sigma^2$ is constant across the sample
• $\sigma^2$ measures the dispersion of the dependent variable around the regression line
• Homoskedasticity means that the average relationship between the dependent variable and the independent variable is the same throughout the sample

**Heteroskedasticity**

• Heteroskedasticity (or heteroscedasticity) is when $\sigma^2$ is not constant across the sample
• Dispersion of the dependent variable around the regression line is not constant.

**Why do we care?**

• If we don’t fix heteroskedasticity:
  • Coefficient estimates are not efficient (not minimum variance)
  • Estimated standard errors are biased and inconsistent, meaning
  • t-stats are not right!

**When can it occur?**

• Whenever dispersion around the regression line differs within the sample
  • This means the relationship between the dependent variable and the independent variable differs within the sample
• Example: MLB Payroll and Market Size

**2008 MLB Payrolls**

• Large Markets (Population > 5,000,000):
  • Mean: $104,000,000
  • Std Dev: $44,600,000
  • Min: $21,800,000 (Florida Marlins)
  • Max: $209,000,000 (NY Yankees)
• Small Markets (Population < 5,000,000):
  • Mean: $78,800,000
  • Std Dev: $28,300,000
  • Min: $43,800,000 (Tampa Bay Rays)
  • Max: $139,000,000 (Detroit Tigers)

**Heteroskedasticity**

• Note: The same principle applies when observations are groups that differ in size, e.g.:
  • States (population)
  • Countries (population)
  • Colleges (enrollment)
  • Companies (sales)
  • Etc.

**Another Example**

• Household income and consumption.
• Low-income households
  • Little flexibility in spending
  • Most income spent on necessities:
    • Food, shelter, clothing, transportation, utilities
  • Little dispersion of consumption around mean consumption.
  • Small $\sigma^2$

**Household Income vs. Consumption**

• High-income households
  • More flexibility in spending
  • Once necessities are purchased, much remains to be spent in different ways
    • Big Spenders
    • Savers and Investors
  • Large dispersion of consumption around the mean.

**Pure vs. Impure Heteroskedasticity**

• Impure: occurs when the regression is not correctly specified
  • E.g. omitted variables
  • Can cause heteroskedasticity
• Pure: occurs due to the nature of the data

**Consequences**

• If we ignore heteroskedasticity, coefficient estimates are:
  • Unbiased: OK!
  • Consistent: OK!
  • Inefficient: Not OK.
• t-tests are inaccurate.

**Detection**

• Tests detect heteroskedasticity
  • But won’t distinguish between pure and impure types
• If a test uncovers heteroskedasticity: STOP!
  • Try to decide if you have an omitted variable.
  • If you do…
    • Include it in your model, and then retest for heteroskedasticity
• OR… if you don’t have an omitted variable:
  • Employ one of the remedies we’ll discuss
  • After you “fix” the problem,
    • Test again
  • If you still have heteroskedasticity,
    • It might be the impure type
• Plots
  • Estimate the model, save the residuals
  • Plot the residuals against each independent variable separately

Example: data3-6.gdt

**Park Test**

• If there is heteroskedasticity, then…
  • $\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2$
  • $\varepsilon_i$ = error term
  • $\sigma^2$ = variance of the homoskedastic error term
  • $Z_i$ = proportionality factor
• If you know something about $Z$, you can use the Park test.
• Find a variable that is related to the heteroskedasticity (e.g. population)
• Run the regression, obtain the residuals
• Run the following regression:
  • $\ln(e_i^2) = \alpha_0 + \alpha_1 \ln(Z_i) + u_i$
• Where:
  • $e_i$ = residuals from the regression
  • $Z_i$ = best choice as to the proportionality factor in the data
  • $u_i$ = classical error term
• Test the significance of the coefficient on $\ln(Z_i)$.
• If it is significant, there is evidence of heteroskedasticity.
• Problem: We don’t always have a good $Z$
• So, we can use White’s Test

**White’s Test**

• $H_0$: No Heteroskedasticity
• $H_A$: Heteroskedasticity
• Step 1: Estimate the equation
  • $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
• Step 2: Save the residual and square it.
• Step 3: Regress the squared residual on a constant, $X_1$, $X_2$, $X_1^2$, $X_2^2$, $X_1 X_2$ (all combinations of the X’s)
  • $e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + v_i$
• Step 4: Compute $N R^2$
  • $N$ = sample size
  • $R^2$ = unadjusted $R^2$ of the auxiliary regression
• Reject the null if
  • $N R^2$ exceeds the $\chi^2$ (chi-square) critical value with 5 degrees of freedom
  • Because there are 5 independent vars in the auxiliary regression (step 3)
• If you have 3 independent vars, the auxiliary regression will have 9 independent vars.
  • $X_1, X_2, X_3, X_1^2, X_2^2, X_3^2, X_1 X_2, X_2 X_3, X_1 X_3$
• If you have 6 independent vars, the auxiliary regression will have 27 independent vars!
• This can get out of hand quickly.

**White’s Test Version 2**

• Same as before, except in the auxiliary regression only use the $X$ and $X^2$ terms (no cross products)
• Use when you have a lot of independent variables.

**Remedies For Heteroskedasticity**

• Heteroskedasticity-Corrected Standard Errors
  • Fixes consistency of the standard errors, so when $N$ is large, the standard errors are correct.
  • In gretl, just check the “robust standard error” box when running a regression
• Weighted Least Squares (WLS)
  • (1) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
  • (2) $\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2$
  • Given (2), eqn. (1) is equivalent to
  • (3) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + Z_i u_i$, where $u_i$ is a homoskedastic error term
  • So we can divide through by $Z_i$
• Step one: divide equation (3) through by $Z_i$:

$$\frac{Y_i}{Z_i} = \frac{\beta_0}{Z_i} + \beta_1 \frac{X_{1i}}{Z_i} + \beta_2 \frac{X_{2i}}{Z_i} + u_i$$

• Step two: estimate by OLS
• Caution about step 2: there are two cases.

Case 1: $Z$ is not in the original equation

• Old: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
• New: $\dfrac{Y_i}{Z_i} = \dfrac{\beta_0}{Z_i} + \beta_1 \dfrac{X_{1i}}{Z_i} + \beta_2 \dfrac{X_{2i}}{Z_i} + u_i$
• What’s missing?
  • The constant!
• Solution: Add a constant
• Better: $\dfrac{Y_i}{Z_i} = \alpha_0 + \beta_0 \dfrac{1}{Z_i} + \beta_1 \dfrac{X_{1i}}{Z_i} + \beta_2 \dfrac{X_{2i}}{Z_i} + u_i$

Case 2: $Z$ is in the original equation

• Suppose $X_1$ is $Z$
• Old: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
• New: $\dfrac{Y_i}{X_{1i}} = \dfrac{\beta_0}{X_{1i}} + \beta_1 + \beta_2 \dfrac{X_{2i}}{X_{1i}} + u_i$
• What’s different about this equation?
  • One of the slope coefficients in the original equation becomes an intercept!
  • This happens because $X_{1i} / X_{1i} = 1$
• That is:
  • The intercept in the new equation is the same as the slope $\beta_1$ in the original equation.
  • What should you look at in the new equation to find the effect of $X_1$?
  • The constant.

Example: saving.gdt (weight by income)
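The Park and White test recipes from the Detection slides can be sketched in plain Python. This is only an illustration under stated assumptions: the data are simulated (not data3-6.gdt), the helper names `ols`, `park_test`, and `white_test` are made up for this sketch, and the 5% chi-square critical value for 5 degrees of freedom (11.070) is hard-coded rather than computed.

```python
import math
import random


def ols(X, y):
    """OLS via the normal equations; each row of X must include 1.0 for the constant.

    Returns (coefficients, residuals, unadjusted R^2).
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - sum(e * e for e in resid) / sst
    return beta, resid, r2


def park_test(resid, z):
    """Regress ln(e_i^2) on a constant and ln(Z_i); return (alpha1, t-statistic)."""
    ln_e2 = [math.log(e * e) for e in resid]
    ln_z = [math.log(v) for v in z]
    (a0, a1), u, _ = ols([[1.0, v] for v in ln_z], ln_e2)
    n = len(resid)
    s2 = sum(v * v for v in u) / (n - 2)  # residual variance of the Park regression
    zbar = sum(ln_z) / n
    se = math.sqrt(s2 / sum((v - zbar) ** 2 for v in ln_z))
    return a1, a1 / se


CHI2_5PCT_DF5 = 11.070  # 5% critical value, chi-square with 5 degrees of freedom


def white_test(x1, x2, y):
    """White's test with two regressors; return (N * R^2, reject H0 at 5%?)."""
    _, e, _ = ols([[1.0, a, b] for a, b in zip(x1, x2)], y)
    e2 = [v * v for v in e]
    # Auxiliary regression on X1, X2, their squares, and the cross product
    aux = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x1, x2)]
    _, _, r2 = ols(aux, e2)
    stat = len(y) * r2
    return stat, stat > CHI2_5PCT_DF5


# Simulated data (an assumption for illustration): the error standard deviation
# is proportional to x1, so Z = x1 and Var(eps_i) = sigma^2 * Z_i^2.
random.seed(0)
n = 400
x1 = [random.uniform(1, 10) for _ in range(n)]
x2 = [random.uniform(1, 10) for _ in range(n)]
y = [2 + 3 * a - b + a * random.gauss(0, 1) for a, b in zip(x1, x2)]

_, resid, _ = ols([[1.0, a, b] for a, b in zip(x1, x2)], y)
park_slope, park_t = park_test(resid, x1)
white_stat, white_reject = white_test(x1, x2, y)
print("Park: slope on ln(Z) =", round(park_slope, 2), " t =", round(park_t, 2))
print("White: N*R^2 =", round(white_stat, 2), " reject H0:", white_reject)
```

Because the simulated error variance grows with `x1` squared, the Park slope should land near 2 with a large t-statistic, and White's statistic should clear the critical value comfortably. With real data, the choice of `z` in `park_test` is the analyst's judgment call, which is exactly the weakness the slides point out.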
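The weighted least squares remedy for Case 2 (where $Z$ is one of the regressors) can also be sketched in plain Python. The data below are simulated with assumed true values β0 = 2, β1 = 3, β2 = -1 and Z = X1; this is not the saving.gdt example, and `ols_coefs` and `wls_case2` are hypothetical helper names.

```python
import random


def ols_coefs(X, y):
    """OLS coefficients via the normal equations (rows of X include 1.0 for the constant)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def wls_case2(y, x1, x2):
    """WLS when Z = X1: divide every term by X1, then run OLS.

    Transformed equation: Y/X1 = beta0 * (1/X1) + beta1 + beta2 * (X2/X1) + u.
    The constant of the transformed regression estimates beta1.
    """
    y_t = [yi / a for yi, a in zip(y, x1)]
    X_t = [[1.0, 1.0 / a, b / a] for a, b in zip(x1, x2)]
    const, coef_inv_x1, coef_ratio = ols_coefs(X_t, y_t)
    return {"beta0": coef_inv_x1, "beta1": const, "beta2": coef_ratio}


# Simulated heteroskedastic data: error standard deviation proportional to x1.
random.seed(1)
n = 400
x1 = [random.uniform(1, 10) for _ in range(n)]
x2 = [random.uniform(1, 10) for _ in range(n)]
y = [2 + 3 * a - b + a * random.gauss(0, 1) for a, b in zip(x1, x2)]

ols_est = ols_coefs([[1.0, a, b] for a, b in zip(x1, x2)], y)
wls_est = wls_case2(y, x1, x2)
print("OLS  (beta0, beta1, beta2):", [round(v, 3) for v in ols_est])
print("WLS  (beta0, beta1, beta2):",
      [round(wls_est[name], 3) for name in ("beta0", "beta1", "beta2")])
```

Note how the results are read off: in the transformed regression the constant is the estimate of β1, and the coefficient on 1/X1 is the estimate of the original intercept β0. Both OLS and WLS are unbiased here; the point of WLS is the efficiency gain once the error term is homoskedastic.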
