
Properties of OLS

  1. Properties of OLS: How Reliable is OLS?

  2. Learning Objectives • Review of the idea that the OLS estimator is a random variable • How do we judge the quality of an estimator? • Show that OLS is unbiased • Show that OLS is consistent • Show that OLS is efficient • The Gauss-Markov Theorem

  3. 1. OLS is a Random Variable • An estimator is any formula/algorithm that produces an estimate of the “truth” • An estimator is a function of the sample data, e.g. OLS • Others: “Draw a line” • For an individual consumption function it’s not so obvious • This has implications for the accuracy of the estimates • So how do we choose between different estimators? • What are the criteria? • What is so special about OLS? • What does it take for OLS to go wrong?

  4. Recall the Definition of OLS • Review of where OLS comes from • Think of fitting a line to the data: it will never pass through every point • Every choice of b0 and b1 will generate a new set of residuals ui • OLS chooses b0 and b1 to minimise the sum of squared ui (see the sketch below) • “Best fit” • R² • So what?
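
A minimal sketch of the “best fit” idea, assuming a synthetic dataset (the intercept 1.0 and slope 0.75 are illustrative; 0.75 echoes the example used later in the slides): the sum of squared residuals (SSR) is evaluated at the closed-form OLS solution and at nearby slopes, and the OLS point gives the smallest SSR.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 0.75 * x + rng.normal(0, 1, size=100)  # assumed DGP: b0=1, b1=0.75

def ssr(b0, b1):
    """Sum of squared residuals for the candidate line y = b0 + b1*x."""
    u = y - b0 - b1 * x
    return np.sum(u ** 2)

# Closed-form OLS estimates
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

# Perturbing the slope away from the OLS value can only increase the SSR
for b1 in (b1_hat - 0.1, b1_hat, b1_hat + 0.1):
    print(f"b1 = {b1:.3f}  SSR = {ssr(b0_hat, b1):.2f}")
```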

  5. OLS Formulae • The OLS estimates have closed forms: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1x̄ • The key issue is that both are functions of the data, so the precise value of each estimate will depend on the particular data points included in the sample • This observation is the basis of all statistical inference and all judgements regarding the quality of estimates • A coded check follows below
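
As a check, the formulae above can be coded directly and compared with NumPy's built-in least-squares fit; the synthetic data are again an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.75 * x + rng.normal(0, 1, size=50)  # assumed DGP

# The slide's formulae
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# np.polyfit returns coefficients highest-degree first: [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
print(b0, b1)            # from the formulae
print(intercept, slope)  # agrees to machine precision
```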

  6. Distribution of Estimator • The estimator is a random variable because the sample is random • “Sampling error” or the “sampling distribution of the estimator” • To see the impact of sampling on the estimates, try different samples (see the histograms on the following slides and the simulation below) • Key point: even if we have the correct model we could get an answer that is way off just because we are unlucky in the sample • How do we know if we have been unlucky? How can we minimise the chances of bad luck? • This is basically how we assess the quality of one estimation procedure compared to another
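
The “try different samples” exercise can be mimicked with a small Monte Carlo, a sketch under an assumed data-generating process with true slope 0.75: each replication redraws the sample and re-runs OLS, and the spread of the resulting estimates is the sampling distribution.

```python
import numpy as np

rng = np.random.default_rng(42)
true_b1, n, reps = 0.75, 50, 2000
estimates = np.empty(reps)

for r in range(reps):
    x = rng.uniform(0, 10, size=n)                    # new sample each time
    y = 1.0 + true_b1 * x + rng.normal(0, 1, size=n)
    estimates[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                    / np.sum((x - x.mean()) ** 2))

print("mean of estimates:", estimates.mean())  # close to the true 0.75
print("std of estimates: ", estimates.std())   # the sampling error
# A histogram of `estimates` would reproduce the slides' sampling histograms.
```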

  7. Comparing MAD and OLS • Both estimators are random variables • The OLS sampling distribution has lower variance than the MAD distribution • Both are centred around the true value of beta (0.75) • A simulation along these lines is sketched below
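
A sketch of the comparison, interpreting MAD as the minimum-absolute-deviations (least absolute deviations) fit — that reading, and the normal-error DGP, are assumptions. Each replication fits both estimators to the same sample; both centre on 0.75, and OLS shows the smaller variance.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
n, reps = 50, 500
ols_b1 = np.empty(reps)
mad_b1 = np.empty(reps)

for r in range(reps):
    x = rng.uniform(0, 10, size=n)
    y = 1.0 + 0.75 * x + rng.normal(0, 1, size=n)      # true slope 0.75
    # OLS slope, closed form
    ols_b1[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                 / np.sum((x - x.mean()) ** 2))
    # MAD slope: minimise the sum of absolute residuals numerically
    fit = minimize(lambda b: np.sum(np.abs(y - b[0] - b[1] * x)),
                   x0=[0.0, 0.0], method="Nelder-Mead")
    mad_b1[r] = fit.x[1]

print(f"OLS: mean {ols_b1.mean():.3f}  variance {ols_b1.var():.5f}")
print(f"MAD: mean {mad_b1.mean():.3f}  variance {mad_b1.var():.5f}")
# Both means sit near 0.75; with normal errors the OLS variance is smaller.
```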

  8. How Do We Judge an Estimator? • Comparing estimators amounts to comparing distributions • Estimators are judged on three criteria: • Unbiasedness • Consistency • Efficiency • These criteria are all different takes on the question: what is the probability that I will get a seriously wrong answer from my regression? • OLS is the Best Linear Unbiased Estimator (BLUE) • Gauss-Markov Theorem • This is why it is used

  9. 2. Bias • The sampling distribution of the estimator is centered around the true value • E(bOLS) = β • Implication: with repeated attempts OLS will give the correct answer on average • This does not imply that it will give the correct answer in any given regression • Consider the stylized distribution of two estimators below • The OLS estimator is centered around the true value; the alternative is not • Is OLS better? • Suppose your criterion were the avoidance of really small answers? • Unbiasedness hinges on the model being correctly specified, i.e. the correct variables • Omitted relevant variables (see the sketch below) • It doesn’t require a large number of observations: it is a “small sample” property • Both MAD and OLS are unbiased
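
The “correct specification” caveat can be illustrated by omitting a relevant regressor. All numbers here (the coefficients and the x–z correlation) are illustrative assumptions; the point is that the correctly specified regression is centered on the true slope while the mis-specified one is not.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 2000
b_full = np.empty(reps)   # slope on x when z is included
b_short = np.empty(reps)  # slope on x when z is (wrongly) omitted

for r in range(reps):
    x = rng.normal(0, 1, size=n)
    z = 0.5 * x + rng.normal(0, 1, size=n)             # z correlated with x
    y = 1.0 + 0.75 * x + 0.5 * z + rng.normal(0, 1, size=n)
    # Correctly specified model: regress y on a constant, x and z
    X = np.column_stack([np.ones(n), x, z])
    b_full[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    # Mis-specified model: z omitted
    b_short[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                  / np.sum((x - x.mean()) ** 2))

print("correct model, mean slope:", b_full.mean())   # ~0.75, unbiased
print("z omitted,     mean slope:", b_short.mean())  # ~1.00, biased upward
```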

  10. OLS is centered around the true value but has a relatively high probability of returning a value that is low

  11. 3. Consistency • Consistency is a large-sample property, i.e. an asymptotic property • As N → ∞, the distribution of the estimator collapses onto the true value • The distribution gets narrower • This is more useful than unbiasedness because it implies that the probability of getting any wrong answer falls as the sample size increases • It formalises the common intuition that more data is better • “Law of Large Numbers” • Note an estimator could be biased but still consistent, e.g. 2SLS • Consistency requires a correctly specified model • A simulation is sketched below
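
A sketch of consistency under the same assumed DGP: re-run the sampling experiment at growing sample sizes and watch the spread of the OLS slope collapse toward the truth.

```python
import numpy as np

rng = np.random.default_rng(11)
reps = 1000

for n in (25, 100, 400, 1600):
    est = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0, 10, size=n)
        y = 1.0 + 0.75 * x + rng.normal(0, 1, size=n)
        est[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                  / np.sum((x - x.mean()) ** 2))
    # The standard deviation shrinks roughly like 1/sqrt(n)
    print(f"n = {n:5d}  mean = {est.mean():.4f}  std = {est.std():.4f}")
```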

  12. Same estimator, larger sample: as the sample size increases we get closer to the truth, i.e. the probability of a large error falls

  13. As the sample size increases we get closer to the truth

  14. 4. Efficiency • An efficient estimator has the minimum variance of all possible alternatives • A “squashed” distribution • Looks similar to consistency but is a small-sample property • Compares different estimators applied to the same sample • OLS is efficient, i.e. “best” • This is the reason it is used where possible • Under the classical assumptions, alternatives such as GLS, IV, WLS and 2SLS are less efficient • OLS is more efficient in our example than MAD (see the sketch below)
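
A sketch of the Gauss-Markov comparison, assuming the same synthetic DGP: OLS against another *linear unbiased* slope estimator (an illustrative grouping estimator that splits the sample at the median of x). Both are centred on the truth; OLS shows the smaller variance, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 2000
ols = np.empty(reps)
grp = np.empty(reps)

for r in range(reps):
    x = rng.uniform(0, 10, size=n)
    y = 1.0 + 0.75 * x + rng.normal(0, 1, size=n)
    ols[r] = (np.sum((x - x.mean()) * (y - y.mean()))
              / np.sum((x - x.mean()) ** 2))
    # Grouping estimator: slope between the low-x and high-x group means.
    # It is linear in y and unbiased (for x fixed), so Gauss-Markov applies.
    hi = x > np.median(x)
    grp[r] = (y[hi].mean() - y[~hi].mean()) / (x[hi].mean() - x[~hi].mean())

print(f"OLS     : mean {ols.mean():.3f}  variance {ols.var():.5f}")
print(f"grouping: mean {grp.mean():.3f}  variance {grp.var():.5f}")
```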

  15. Same sample size, different estimator: the probability of a large error is lower for the efficient estimator at any sample size

  16. 5. Gauss-Markov Theorem • A formal statement of what we have just discussed • A mathematical specification is required in order to do hypothesis tests • Standard model • Observation = systematic component + random error: • yi = β1 + β2xi + ui • Sample regression line estimated using the OLS estimators: • yi = b1 + b2xi + ei

  17. Assumptions • Linear model: ui = yi − β1 − β2xi • Error terms have mean zero: E(ui|x) = 0 => E(yi|x) = β1 + β2xi • Error terms have constant variance, independent of x: Var(ui|x) = σ² = Var(yi|x) (homoscedastic errors) • Cov(ui, uj) = Cov(yi, yj) = 0 for i ≠ j (no autocorrelation) • X is not a constant and is fixed in repeated samples • Additional assumption: • ui ~ N(0, σ²) => yi ~ N(β1 + β2xi, σ²)

  18. Summary: BLUE • Best Linear Unbiased Estimator • Linear: a linear function of the data • Unbiased: the expected value of the estimator equals the true value • Doesn’t mean you always get the correct answer • Algebraic proof in the book • The unbiasedness property hinges on the model being correctly specified, i.e. E(xiui) = 0

  19. Some Comments on BLUE • Also known as the Gauss-Markov Theorem: • The first five assumptions above must hold • OLS estimators are the “best” among all linear and unbiased estimators because • they are efficient, i.e. they have the smallest variance among all linear and unbiased estimators • The G-M result does not depend on normality of the dependent variable • Normality comes in when we do hypothesis tests • G-M refers to the estimators b1, b2, not to the actual values of b1, b2 calculated from a particular sample • G-M applies only to linear and unbiased estimators; there are other types of estimators which may be better, in which case disregard G-M • A biased estimator may be more efficient than an unbiased one which fulfils G-M

  20. Conclusions • OLS estimator is a random variable: • its precise value varies with the particular sample used • OLS is unbiased: • the distribution is centred on the truth • OLS is Consistent: • Probability of large error falls with sample size

  21. OLS is efficient: • the probability of a large error is the smallest among linear unbiased estimators • The Gauss-Markov Theorem: • a formal statement that unbiasedness, consistency and efficiency hold when certain assumptions are true

  22. What’s Missing? • What happens to OLS when the assumptions of the G-M theorem are violated? • There will be some other estimator that will be better • We will look at four violations: • Omitted variable bias • Multicollinearity • Heteroscedasticity • Autocorrelation
