Multiple Regression Applications Lecture 16
Phillips Curve example • Phillips curve as an example of a regime shift • Data points from 1950 - 1970: there is a downward-sloping, reciprocal relationship between wage inflation and unemployment • [Figure: wage inflation (W) on the vertical axis against unemployment (UN) on the horizontal axis, showing a downward-sloping curve]
Phillips Curve example (2) • But if we look at data points from 1971 - 1996, we can detect an upward-sloping relationship • ALWAYS graph the data between the two main variables of interest • [Figure: W against UN, showing an upward-sloping relationship]
Phillips Curve example (3) • There seems to be a regime shift between the two periods • note: this is an arbitrary choice of regime shift - it was not dictated by a specific event • We will use the Chow Test (F-test) to test for this regime shift • the test will use a restricted form: Wt = b0 + b2(1/UNt) + et • it will also use an unrestricted form: Wt = b0 + b1Dt + b2(1/UNt) + b3(Dt/UNt) + et • D is the dummy variable for the regime shift, equal to 0 for 1950-1970 and 1 for 1971-1996
Phillips Curve example (4) • L16_1.xls estimates the restricted and unrestricted regression equations and calculates the F-statistic for the Chow Test • The null hypothesis will be: H0 : b1 = b3 = 0 • we are testing to see if the dummy variable for the regime shift alters the intercept or the slope coefficient • The F-statistic is F = [(SSR* − SSR)/q] / [SSR/(n − k)], where q = 2 and * indicates the restricted model
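The Chow test above can be sketched in Python. This is a minimal illustration on synthetic data, not the L16_1.xls series: the sample sizes, unemployment range, coefficients, and noise level are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: unemployment and wage inflation over two regimes.
n1, n2 = 21, 26                       # 1950-1970 and 1971-1996 (21 and 26 years)
un = rng.uniform(2.0, 10.0, n1 + n2)  # invented unemployment rates
d = np.r_[np.zeros(n1), np.ones(n2)]  # regime dummy D: 0 = first period
# Invented true process: the second regime shifts both intercept and slope.
w = 1.0 + 3.0 / un + d * (2.0 - 4.0 / un) + rng.normal(0, 0.3, n1 + n2)

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

x = 1.0 / un
ones = np.ones_like(x)
# Restricted model:   W = b0 + b2*(1/UN)
ssr_r = ssr(np.column_stack([ones, x]), w)
# Unrestricted model: W = b0 + b1*D + b2*(1/UN) + b3*(D/UN)
ssr_u = ssr(np.column_stack([ones, d, x, d * x]), w)

q, n, k = 2, n1 + n2, 4               # q restrictions, k unrestricted parameters
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
print(round(F, 2))                    # large F -> reject H0: b1 = b3 = 0
```

With a genuine regime shift built into the data, the F-statistic comfortably exceeds the 5% critical value of F(2, 43), so the test rejects the single-regime (restricted) model.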
Phillips Curve example (5) • The expectation of wage inflation for the first time period: E(Wt | 1950-1970) = b0 + b2(1/UNt) • The expectation of wage inflation for the second time period: E(Wt | 1971-1996) = (b0 + b1) + (b2 + b3)(1/UNt) • You can use the spreadsheet data to carry out these calculations
Relaxing Assumptions • A review of what we have learned in regression so far, and a look forward to what happens when we relax the assumptions around the regression line • Introduction to new concepts: • Heteroskedasticity • Serial correlation (also known as autocorrelation) • Non-independence of the independent variables
CLRM Revision • Calculating the linear regression model (using OLS) • Use of the sum of squared residuals: calculating the variance of the regression line and the mean squared deviation • Hypothesis tests: t-tests, F-tests, χ2 tests • Coefficient of determination (R2) and adjusted R2 • Modeling: use of log-linear, log, and reciprocal forms • Relationship between F and R2 • Imposing linear restrictions: e.g. H0: b2 = b3 = 0 (q = 2); H0: a + b = 1 • Dummy variables and interactions; the Chow test
Relaxing assumptions • What are the assumptions we have used throughout? • Two assumptions about the population for the bivariate case: 1. E(Y|X) = a + bX (the conditional expectation function is linear); 2. V(Y|X) = σ2 (the conditional variances are constant) • Assumptions concerning the sampling procedure (i = 1..n): 1. Values of Xi (not all equal) are prespecified; 2. Yi is drawn from the subpopulation having X = Xi; 3. the Yi's are independent • The consequences are: 1. E(Yi) = a + bXi; 2. V(Yi) = σ2; 3. C(Yh, Yi) = 0 • How can we test whether these assumptions hold? • What can we do if they don't?
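The three consequences listed above can be checked with a quick Monte Carlo sketch. The X values, true coefficients, and error variance below are invented for the illustration: with prespecified Xi and independent Yi drawn with E(Yi) = a + bXi and V(Yi) = σ2, the OLS estimates recover a and b on average.

```python
import numpy as np

rng = np.random.default_rng(1)

# Prespecified X values (not all equal), as the sampling assumptions require.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X = np.column_stack([np.ones_like(x), x])
a_true, b_true, sigma = 2.0, 0.5, 1.0   # invented population parameters

# Draw many independent samples of Y with E(Yi) = a + b*Xi, V(Yi) = sigma^2.
reps = 5000
estimates = np.empty((reps, 2))
for r in range(reps):
    y = a_true + b_true * x + rng.normal(0, sigma, x.size)
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

# Under the CLRM assumptions OLS is unbiased: the averages recover (a, b).
print(estimates.mean(axis=0).round(2))
```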
Homoskedasticity • We would like our estimates to be BLUE (best linear unbiased estimators) • We need to look out for three potential violations of the CLRM assumptions: heteroskedasticity, autocorrelation, and non-independence of X (or simultaneity bias) • Heteroskedasticity is usually found in cross-section data (and longitudinal data) • In earlier lectures, we saw that the variance of the error term is constant: V(ei) = σ2 • This is an example of homoskedasticity, where the variance is constant
Homoskedasticity (2) • Homoskedasticity can be illustrated like this: • [Figure: Y against X, with constant variance around the regression line at X1, X2, and X3]
Heteroskedasticity • But we don't always have a constant variance σ2 • The variance may vary with each observation, V(ei) = σi2, or it may be a function of Xi • When there is heteroskedasticity, the variance around the regression line varies with the values of X
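A small simulation makes the definition concrete. The data-generating process below is invented for the example: the error standard deviation is set to grow with X, so the residual spread around the fitted line widens as X increases.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical cross-section where the error spread grows with X:
# sd(e_i) = 0.2 * x_i, so V(e_i) = sigma_i^2 is no longer one constant sigma^2.
n = 2000
x = rng.uniform(1.0, 10.0, n)
e = rng.normal(0, 0.2 * x)            # heteroskedastic errors
y = 1.0 + 0.5 * x + e                 # invented regression line

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Residual spread around the regression line widens as X grows.
low = resid[x < 4].std()              # spread at small X
high = resid[x > 7].std()             # spread at large X
print(round(low, 2), round(high, 2))
```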
Heteroskedasticity (2) • The non-constant variance around the regression line can be drawn like this: • [Figure: Y against X, with the spread around the regression line increasing across X1, X2, and X3]
Serial (auto) correlation • Serial correlation is usually found in time series data (and longitudinal data) • Under serial correlation, the observations are not independently drawn: Yi and Yh are correlated for some i ≠ h • This results in nonzero covariance terms: C(Yi, Yh) ≠ 0
Serial (auto) correlation (2) • Example: in time series data, unemployment at time t is related to unemployment in the previous period t-1 • If we have a model with unemployment as the dependent variable Yt, then: • Yt and Yt-1 are related • et and et-1 are also related
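The time-series example above can be simulated with first-order autoregressive (AR(1)) errors, a standard way to model serial correlation. The series length, trend, and the value of rho below are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical AR(1) errors: e_t = rho * e_{t-1} + u_t, so e_t and e_{t-1}
# are correlated, and hence Y_t and Y_{t-1} are correlated as well.
T, rho = 5000, 0.7
u = rng.normal(0, 1.0, T)
e = np.empty(T)
e[0] = u[0]
for t in range(1, T):
    e[t] = rho * e[t - 1] + u[t]

x = np.linspace(0, 10, T)             # invented trending regressor
y = 1.0 + 0.5 * x + e

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# The first-order sample autocorrelation of the residuals is close to rho,
# revealing the serial correlation.
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(round(r1, 2))
```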
Non-independence • The non-independence of the independent variables is the third violation of the ordinary least squares assumptions • Remember that in the OLS derivation we minimized the sum of squared residuals • this required independence between the X variables and the error term • if this fails, the values of X are not truly prespecified • without independence, the estimates are biased
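The bias can be demonstrated directly. In the invented setup below, X and the error term share a common shock, so C(X, e) ≠ 0; even averaged over many samples, the OLS slope does not recover the true value (all coefficients and shock variances here are made up for the example).

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical setup: X is correlated with the error term (both contain a
# common shock z), violating the independence assumption.
n, reps = 500, 2000
b_true = 1.0
slopes = np.empty(reps)
for r in range(reps):
    z = rng.normal(0, 1, n)           # common shock
    e = z + rng.normal(0, 1, n)       # error contains the shock
    x = z + rng.normal(0, 1, n)       # so does X, hence C(X, e) != 0
    y = 2.0 + b_true * x + e
    X = np.column_stack([np.ones(n), x])
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Averaging over many samples does NOT recover b_true = 1.0:
# the bias C(x, e)/V(x) = 1/2 persists, so the mean slope is near 1.5.
print(round(slopes.mean(), 2))
```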
Summary • Heteroskedasticity and serial correlation • make the estimates inefficient • and therefore make the estimated standard errors incorrect • Non-independence of the independent variables • makes the estimates biased • instrumental variables and simultaneous equations are used to deal with this third type of violation • Starting next lecture, we'll take a more in-depth look at these three violations of the CLRM assumptions