
Some Topics In Multivariate Regression



Presentation Transcript


  1. Some Topics In Multivariate Regression

  2. Some Topics • We need to address some small topics that often come up in multivariate regression. • I will illustrate them using the Housing example. • To finish the house price section: • Confidence intervals • Dummy Variables • One sided tests • Scale of data • Functional Form

  3. Wooldridge refs • Chapter 1 • Chapter 2.1, 2.2, 2.3, 2.5 • Chapter 3.1, 3.2, 3.3, 3.4, 3.5 • Chapter 4.1, 4.2, 4.5, 4.6 • Chapter 5

  4. 1. Confidence Intervals (4.3) • We can construct an interval that contains the true value of the parameter with a chosen level of confidence • We have seen that • P(-1.96 ≤ t ≤ 1.96) = 0.95 for large N-K • More generally:

  5. The interval $b \pm t_c \cdot se(b)$, where $b$ is the estimate, will contain the true $\beta$ with $100(1-\alpha)\%$ confidence. • Where $t_c$ is the “critical value”, determined by the significance level ($\alpha$) and the degrees of freedom (df = N-K) • For the case where N-K is large (>100) and $\alpha$ is 5%, $t_c = 1.96$ • Same as the set of values of $\beta$ that could not be rejected if they were null hypotheses • The range of possible values consistent with the data • A way of avoiding some of the ambiguity in the formulation of hypothesis tests • Formally: a procedure which will generate an interval containing the true value $100(1-\alpha)\%$ of the time in repeated samples
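
As a minimal sketch of the interval formula on this slide, the snippet below computes $b \pm t_c \cdot se(b)$ using the large-sample normal approximation for $t_c$ (appropriate only when N-K is large). The function name `confidence_interval` and the estimate and standard error are hypothetical values chosen for illustration; they do not come from the Housing example.

```python
from statistics import NormalDist


def confidence_interval(b, se, alpha=0.05):
    """Large-sample interval b ± t_c * se(b), with t_c taken from the normal distribution."""
    t_c = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return b - t_c * se, b + t_c * se


# Hypothetical estimate and standard error, for illustration only.
lower, upper = confidence_interval(b=0.50, se=0.10)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")  # 95% CI: [0.304, 0.696]
```

For small N-K the critical value should come from the t distribution with N-K degrees of freedom instead, which widens the interval.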

  6. 2. Dummy Variables • Wooldridge Chapter 7.1-7.4 • Record classifications • Dichotomous: “yes/no” e.g. trial, gender etc. • Ordinal e.g. level of education • OLS doesn’t treat them differently • Need to be careful about how coefficients are interpreted • Illustrate with “sex” in the wage regression • Use labour2006.dta and dummy.do • sex = 1 iff female, sex = 0 iff male

  7. Our basic model is $\text{wage}_i = \beta_1 + \beta_2 \text{educ}_i + u_i$ • This can be interpreted as predicting wages based on schooling, i.e. $E[\text{wage}_i] = \beta_1 + \beta_2 E[\text{educ}_i]$ • Suspect that wages are systematically different between men and women: $\text{wage}_i = \beta_1 + \beta_2 \text{educ}_i + \beta_3 \text{sex}_i + u_i$

  8. Now the prediction becomes: $E[\text{wage}_i] = \beta_1 + \beta_2 E[\text{educ}_i] + \beta_3$ if female; $E[\text{wage}_i] = \beta_1 + \beta_2 E[\text{educ}_i]$ if male • Note that “sex” disappears when it is zero • This translates into separate intercepts on the graph • $\beta_3$ is the extra € for being a woman • Testing if $\beta_3$ is significant is a test of a significant difference in wages between the two groups
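
The separate-intercepts reading can be sketched in code. The example below is an illustration under stated assumptions, not the course's dummy.do: it uses made-up, noise-free data (not labour2006.dta) generated from wage = 5 + 1.0·educ − 2·sex, and a small hypothetical helper `ols` that solves the normal equations in pure Python. Because the data contain no noise, the fit recovers the coefficients up to rounding, so the two intercepts can be read off directly.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry into the pivot position.
        pivot_row = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Made-up, noise-free data: wage = 5 + 1.0*educ - 2*sex (sex = 1 for women).
educ = [8, 10, 12, 14, 16, 9, 11, 13, 15, 17]
sex = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
wage = [5 + 1.0 * e - 2 * s for e, s in zip(educ, sex)]

X = [[1.0, e, s] for e, s in zip(educ, sex)]  # columns: constant, educ, sex
b1, b2, b3 = ols(X, wage)
print(f"male intercept   (b1)      = {b1:.2f}")
print(f"female intercept (b1 + b3) = {b1 + b3:.2f}")
print(f"common slope     (b2)      = {b2:.2f}")
```

With real data the coefficient on sex would come with a standard error, and its t-statistic is what tests for a difference between the two groups.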

  9. Interaction • While the intercept could be different, the slope could be also • i.e. the degree of discrimination could be different between the two groups • Different returns to education • Model this by an “interaction term”: $\text{wage}_i = \beta_1 + \beta_2 \text{educ}_i + \beta_3 \text{sex}_i + \beta_4 \text{educ}_i \cdot \text{sex}_i + u_i$

  10. Now the prediction becomes: $E[\text{wage}_i] = \beta_1 + (\beta_2 + \beta_4) E[\text{educ}_i] + \beta_3$ if woman; $E[\text{wage}_i] = \beta_1 + \beta_2 E[\text{educ}_i]$ if man • Note that “sex” disappears when it is zero • This translates into separate intercepts and slopes on the graph • The extra € for being a woman and an extra % for each level of education • Testing if $\beta_4$ is significant is a test of a significant difference in the return to education between the two groups
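
Collecting terms makes the two fitted lines explicit. The block below is a worked restatement of the interaction model, writing $\beta_1, \dots, \beta_4$ for the coefficients on the constant, educ, sex and educ × sex, and conditioning on a given level of education; the last line (the gap between the two lines) is an added step that follows from subtracting one line from the other.

```latex
\begin{aligned}
\text{male } (\text{sex}_i = 0):\quad
  & E[\text{wage}_i \mid \text{educ}_i] = \beta_1 + \beta_2\,\text{educ}_i \\
\text{female } (\text{sex}_i = 1):\quad
  & E[\text{wage}_i \mid \text{educ}_i] = (\beta_1 + \beta_3) + (\beta_2 + \beta_4)\,\text{educ}_i \\
\text{female minus male}:\quad
  & \beta_3 + \beta_4\,\text{educ}_i
\end{aligned}
```

So $\beta_3$ shifts the intercept, $\beta_4$ shifts the slope, and the wage gap is no longer a single number: it depends on the level of education.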

  11. 3. One Tailed Test (4.2) • Need to be careful about the interpretation of the null and alternative • Think of the example of gender discrimination • State the hypothesis we want to test: $H_0: \beta_{sex} \ge 0$, $H_1: \beta_{sex} < 0$ • Calculate the test statistic assuming that $H_0$ holds with $\beta_{sex} = 0$: t = -8.33 • Reject null if t < -critical value at chosen sig level • Can reject null as -8.58 < -1.64

  12. “Acceptance” Region

  13. Which you use is up to you. But • Beware of translating directly from English • Be aware of the implications • Rule of thumb: • $H_1$: “what you expect” e.g. guilt • $H_0$: “what you fear” e.g. innocence • So the test procedure limits the prob of rejecting what you fear when it is true to the chosen significance level • This notion works for a two sided test also

  14. Redo the gender discrimination example from the other side • State the hypothesis we want to test: $H_0: \beta_{sex} \le 0$, $H_1: \beta_{sex} > 0$ • Calculate the test statistic assuming that $H_0$ holds with $\beta_{sex} = 0$: t = -8.33 • Reject null if t > critical value at chosen sig level • Cannot reject null as -8.58 < 1.64

  15. The difference between the two • The first: $H_0: \beta_{sex} \ge 0$, $H_1: \beta_{sex} < 0$ • 5% chance of rejecting null when it is correct • i.e. of stating $\beta_{sex} < 0$ when in fact $\beta_{sex} \ge 0$ • i.e. of stating there is discrimination when in fact there is none • The second: $H_0: \beta_{sex} \le 0$, $H_1: \beta_{sex} > 0$ • 5% chance of rejecting null when it is correct • i.e. of stating $\beta_{sex} > 0$ when in fact $\beta_{sex} \le 0$ • i.e. of stating there is no discrimination when in fact there is some
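
The two decision rules can be put side by side in a short sketch. It assumes the large-sample normal approximation for the 5% one-sided critical value (about 1.64, as used on the slides) and takes t = -8.33 from the slides as given; the function name `one_sided_test` and its `alternative` argument are hypothetical.

```python
from statistics import NormalDist


def one_sided_test(t_stat, alpha=0.05, alternative="less"):
    """Return (reject_null, critical_value) for a one-sided test, normal approximation."""
    crit = NormalDist().inv_cdf(1 - alpha)  # about 1.64 at the 5% level
    if alternative == "less":       # H0: beta >= 0 against H1: beta < 0
        return t_stat < -crit, -crit
    if alternative == "greater":    # H0: beta <= 0 against H1: beta > 0
        return t_stat > crit, crit
    raise ValueError("alternative must be 'less' or 'greater'")


t = -8.33  # test statistic quoted on the slides
print(one_sided_test(t, alternative="less"))     # first formulation: rejects H0
print(one_sided_test(t, alternative="greater"))  # second formulation: cannot reject H0
```

The same statistic leads to "reject" in one formulation and "cannot reject" in the other, which is why the choice of null matters.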

  16. 4. Scale (2.4 & 6.1) • The scale of the data may matter • i.e. whether we measure consumption in € or €bn or even £ • Basic model: $y_i = \beta_1 + \beta_2 x_i + e_i$ • Change scale of $x_i$: $x_i^* = x_i / c$ • Estimate: $y_i = \beta_1^* + \beta_2^* x_i^* + e_i$ • $\beta_2^* = c \cdot \beta_2$ • $se(\beta_2^*) = c \cdot se(\beta_2)$ • Slope coefficient and its se change; all other statistics (intercept, t-stats, $R^2$, F, etc.) unchanged.

  17. Change scale of $y_i$: $y_i^* = y_i / c$ • Estimate $y_i^* = \beta_1^* + \beta_2^* x_i + e_i$ • $\beta_2^* = \beta_2 / c$ • $\beta_1^* = \beta_1 / c$ • $se(\beta_2^*) = se(\beta_2) / c$ • $se(\beta_1^*) = se(\beta_1) / c$ • t-stats, $R^2$, F unchanged • Both X and Y rescaled: $y_i^* = y_i / c$, $x_i^* = x_i / c$ • Estimate $y_i^* = \beta_1^* + \beta_2^* x_i^* + e_i$ • If rescaled by same amount: • $\beta_1^* = \beta_1 / c$ • $se(\beta_1^*) = se(\beta_1) / c$ • $\beta_2$ and $se(\beta_2)$ unchanged • t-stats, $R^2$, F unchanged
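
The rescaling rules can be checked numerically. The sketch below fits a simple regression by hand (the helper `simple_ols` is hypothetical, and the six data points and the factor c = 1000 are made up) and refits after dividing x, y, or both by c, so the slope, its standard error and its t-statistic can be compared across the four cases.

```python
from math import sqrt


def simple_ols(x, y):
    """Return (b1, b2, se_b1, se_b2) for the simple regression y = b1 + b2*x + e."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = ybar - b2 * xbar
    ssr = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ssr / (n - 2)  # residual variance with N-K = n-2 degrees of freedom
    se_b2 = sqrt(s2 / sxx)
    se_b1 = sqrt(s2 * (1 / n + xbar ** 2 / sxx))
    return b1, b2, se_b1, se_b2


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
c = 1000.0

base = simple_ols(x, y)
x_scaled = simple_ols([xi / c for xi in x], y)
y_scaled = simple_ols(x, [yi / c for yi in y])
both = simple_ols([xi / c for xi in x], [yi / c for yi in y])

cases = [("original", base), ("x/c", x_scaled), ("y/c", y_scaled), ("both/c", both)]
for name, (b1, b2, se1, se2) in cases:
    print(f"{name:9s} b1={b1:.5g} b2={b2:.5g} se(b2)={se2:.5g} t(b2)={b2 / se2:.4f}")
```

Dividing x by c multiplies the slope and its standard error by c, dividing y by c divides them by c, and dividing both leaves them unchanged; the t-statistic on the slope is the same in every row.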

  18. 5. Functional Form (6.2) • Four common functional forms • Linear: $q_t = \alpha + \beta p_t + u_t$ • Log-Log: $\ln q_t = \alpha + \beta \ln p_t + u_t$ • Semilog: $q_t = \alpha + \beta \ln p_t + u_t$ • or $\ln q_t = \alpha + \beta p_t + u_t$ • How to choose? • Which fits the data best (cannot compare $R^2$ unless the dependent variable is the same) • Which is most convenient (do we want an elasticity, a rate of return?) • How to trade off the two goals

  19. Elasticity and Marginal Effects
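
For the four functional forms listed on slide 18, the marginal effect $dq/dp$ and the elasticity $(dq/dp)(p/q)$ can be obtained by differentiating each equation. The table below is a sketch of those standard calculus results, not a reproduction of the original slide; it writes $\beta$ for the slope coefficient and drops the time subscript and error term, both of which are notational assumptions.

```latex
\begin{array}{llll}
\text{Form} & \text{Model} & \text{Marginal effect } dq/dp & \text{Elasticity } (dq/dp)(p/q) \\
\hline
\text{Linear}  & q = \alpha + \beta p         & \beta        & \beta\, p / q \\
\text{Log-log} & \ln q = \alpha + \beta \ln p & \beta\, q / p & \beta \\
\text{Lin-log} & q = \alpha + \beta \ln p     & \beta / p    & \beta / q \\
\text{Log-lin} & \ln q = \alpha + \beta p     & \beta\, q    & \beta\, p
\end{array}
```

Only the log-log form has a constant elasticity, which is one reason it is convenient when an elasticity is the quantity of interest; in the other forms the elasticity has to be evaluated at chosen values of $p$ and $q$, typically the sample means.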
