
Inference for Regression

delila

Presentation Transcript


    1. Inference for Regression Course: AP Statistics Chapter: 27 Book: Stats: Modeling the World Authors: BVD (2nd edition)

    2. Inference for: Categorical Variables: Use Chi-Squared Procedures

    3. Regression reminders Regression Line:

    4. Regression reminders Regression Line:

    5. Regression reminders Regression Line:

    6. So…what’s new?? Regression Line:

    7. Chapter 27 Regression Line:

    8. Population Statistics Regression Line:

    9. Population Regression Line Sample Regression Line:

    10. Population Regression Line Population Regression Line:

    11. Confidence Interval (for slope) How do we find the Standard Error??

    12. Confidence Interval (for slope) How do we find the Standard Error?? We don’t! We’ll let our calculator (or a computer printout) give it to us.

    13. Confidence Interval (for slope) Really? We don’t care about the Standard Error for the slope? Well… actually, we care a little. It depends on three things: 1) The spread of the residuals (more about this later!) 2) The spread of the x-values 3) The sample size (n)
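    To see how those three ingredients combine, here is a minimal sketch using the standard textbook formula SE(b1) = s_e / (sqrt(n - 1) * s_x). The function name and the five data points are made up for illustration; they are not from the slides or the printout.

```python
import math
import statistics


def slope_standard_error(x, y):
    """Return (b1, SE(b1)) for the least-squares line through the points (x, y)."""
    n = len(x)
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)

    # Least-squares slope and intercept
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    # 1) Spread of the residuals (n - 2 degrees of freedom)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s_e = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

    # 2) Spread of the x-values (sample standard deviation)
    s_x = statistics.stdev(x)

    # 3) Sample size enters through sqrt(n - 1)
    se_b1 = s_e / (math.sqrt(n - 1) * s_x)
    return b1, se_b1


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, se_b1 = slope_standard_error(x, y)
print(f"slope = {b1:.3f}, SE(slope) = {se_b1:.4f}")
```

    More residual spread makes the Standard Error bigger; more spread in x or a larger n makes it smaller.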

    14. Confidence Interval (for slope) Let’s find the Standard Error. Ready to try?? Here is a sample computer printout.

    15. Confidence Interval (for slope) Help! That’s too confusing. What do I need?

    16. Confidence Interval (for slope) The Constant you see is the value of

    17. Confidence Interval (for slope) Age is the name of x and the slope

    18. Confidence Interval (for slope) Age is the name of x and the slope

    19. Confidence Interval (for slope) Income is the name of y (the response variable)

    20. Confidence Interval (for slope) The degrees of freedom are given: df = 25

    21. Confidence Interval (for slope) And so is the Standard Error for the slope! 337.7

    22. Confidence Interval (for slope) The equation of the regression line would be:

    23. Confidence Interval (for slope) Wait a second….that’s chapter 8. We’re in Ch. 27. We want to find the Confidence Interval for Slope!

    24. Confidence Interval (for slope)

    25. Confidence Interval (for slope)

    26. Confidence Interval (for slope)
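    The interval has the usual form: slope ± t* × SE(slope). A minimal sketch of that arithmetic follows, using the SE (337.7) and df (25) read from the printout. The slope value of 1000 is a hypothetical placeholder, since the printout's slope is not reproduced in this transcript, and t* = 2.060 is the t-table value for 95% confidence with df = 25.

```python
def slope_confidence_interval(b1, se_b1, t_star):
    """Return (low, high) for the interval b1 +/- t_star * SE(b1)."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin


SE_SLOPE = 337.7          # Standard Error for the slope, from the printout
T_STAR_95_DF25 = 2.060    # t-table critical value: 95% confidence, df = 25
b1_example = 1000.0       # HYPOTHETICAL slope, for illustration only

low, high = slope_confidence_interval(b1_example, SE_SLOPE, T_STAR_95_DF25)
print(f"95% CI for slope: {low:.1f} to {high:.1f}")
```

    Swap in the slope from your own printout; the SE and df come from the same table.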

    27. Hypothesis Testing for Slope

    28. Hypothesis Testing for Slope

    29. Hypothesis Testing for Slope

    30. Hypothesis Testing for Slope

    31. Hypothesis Testing for Slope

    32. Hypothesis Testing for Slope

    33. Hypothesis Testing for Slope

    34. Hypothesis Testing for Slope
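    For the test, the usual null hypothesis is that the true slope is 0 (no linear relationship), and the test statistic is t = (b1 - 0) / SE(b1) with df = n - 2. Here is a minimal sketch under that assumption. The slope of 1000 is again a hypothetical placeholder, the SE of 337.7 is from the printout, and 2.060 is the two-sided 5% critical value for df = 25.

```python
def slope_t_statistic(b1, se_b1, beta1_null=0.0):
    """t statistic for testing H0: true slope = beta1_null."""
    return (b1 - beta1_null) / se_b1


SE_SLOPE = 337.7        # from the printout
T_CRIT_DF25 = 2.060     # two-sided 5% critical value, df = 25
b1_example = 1000.0     # HYPOTHETICAL slope, for illustration only

t = slope_t_statistic(b1_example, SE_SLOPE)
reject_h0 = abs(t) > T_CRIT_DF25
print(f"t = {t:.3f}, reject H0 at the 5% level: {reject_h0}")
```

    A calculator or computer printout would report a P-value instead of comparing against a critical value; the conclusion is the same.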

    36. Conditions & Assumptions What about the conditions and assumptions??? We skipped them…. And …… THAT’S BAD!

    37. Conditions & Assumptions There are 4 of them to satisfy. 1) Linearity Assumption The scatterplot of the data should be “roughly linear”. We show this two ways, and we have done both before! a) Graph the scatterplot and look at it. Does it look straight? b) Graph the residuals against the x-variable. They should be randomly scattered. If this condition fails, then straighten the data (see Ch. 9).
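    As a rough illustration of the residual check, the sketch below fits a line to deliberately curved data and lists the residuals. The data and the function name are invented for this example. Notice the pattern (positive, then negative, then positive again), which is exactly what a random scatter should not show.

```python
import statistics


def residuals_from_line(x, y):
    """Fit the least-squares line to (x, y) and return the residuals."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # residual = observed y - predicted y
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical curved data (y = x squared), for illustration only
x = [1, 2, 3, 4, 5, 6]
y_curved = [xi ** 2 for xi in x]

res = residuals_from_line(x, y_curved)
for xi, e in zip(x, res):
    print(f"x = {xi}: residual = {e:+.2f}")
```

    In practice you would plot these residuals against x rather than read them off a list.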

    38. Conditions & Assumptions

    39. Conditions & Assumptions

    40. Conditions & Assumptions 2) Independence Assumption The next three are a little tricky. That’s only because we need to understand what is happening with inference on regression lines. Here’s the situation: When you have a sample of data and you find the sample regression line for that data you are fitting the line that best fits (or passes through) the y-values that you have plotted at each x-value. Here is an example:

    41. Conditions & Assumptions

    42. Conditions & Assumptions

    43. Conditions & Assumptions

    44. Conditions & Assumptions

    45. Conditions & Assumptions

    46. Conditions & Assumptions 2) Independence Assumption Okay, back to #2. We now know the residuals (errors) are what we care about here. For #2 we want these to be independent for a given sample. If the sample was collected randomly, we are fine. Just state that the data can be assumed to be independent because the sample was random. You have no reason to believe that any y-value (or residual) has any impact on another one. Easy!

    47. Conditions & Assumptions 2) Independence Assumption Wait…didn’t you say this was hard? Well, it can be. If you are graphing a time plot (x represents time) the y-values might not be independent. Now you need to check the residuals. So…we graph them against the x-values (you already did this!) and see what we get. It should be a random scatter. Any pattern will show there is some sort of relationship which indicates a lack of independence. Moving on….

    48. Conditions & Assumptions 3) Equal Variance Assumption Okay…this one is a little tricky. But, that’s only because you don’t know WHY we are checking for it. Let’s stop and figure that out first. The best thing to do is to go once more to that image of normal models along the line….

    49. Conditions & Assumptions 3) Equal Variance Assumption What we want is for the spread of each set of y-values to be roughly the same. Remember, we care about residuals, so what this means is that we want the Standard Deviation of the residuals to be uniform. That means the spread of the residuals should be the same throughout.

    50. Conditions & Assumptions 3) Equal Variance Assumption That means we want the spread of each set of y-values to be roughly the same. Remember, we care about residuals, so what this means is that we want the Standard Deviation of the residuals to be uniform. Huh? Well, it means the points should not fan out or clump together. The spread about the line should be the same (constant) throughout. This is called the “DOES THE PLOT THICKEN?” Condition. How do we check for this? Residuals again. If the plot does fan out, it will show up in the plot of the residuals against the predicted values (ŷ). Here it is:
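    The slides check this condition by eye, on the residual plot. As a rough numerical companion (not the method used in the slides), the sketch below compares the Standard Deviation of the residuals in the upper half of the x-values with the lower half. The function name and both residual lists are hypothetical. A ratio near 1 suggests constant spread; a ratio well above 1 suggests the plot thickens.

```python
import statistics


def spread_ratio(x, residuals):
    """SD of residuals for the larger x-values divided by SD for the smaller ones."""
    pairs = sorted(zip(x, residuals))          # order the residuals by x
    half = len(pairs) // 2
    lower = [e for _, e in pairs[:half]]
    upper = [e for _, e in pairs[half:]]
    return statistics.stdev(upper) / statistics.stdev(lower)


# Hypothetical residuals, for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8]
fanning = [0.5, -0.5, 1, -1, 2, -2, 4, -4]   # spread grows with x
steady = [1, -1, 1, -1, 1, -1, 1, -1]        # spread stays constant

print("fanning residuals:", spread_ratio(x, fanning))
print("steady residuals: ", spread_ratio(x, steady))
```

    There is no official cutoff for this ratio; it only backs up what the residual plot already shows.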

    51. Conditions & Assumptions 3) Equal Variance Assumption

    52. Conditions & Assumptions 4) Normal Population Assumption

    53. Conditions & Assumptions 4) Normal Population Assumption

    54. Conditions & Assumptions 4) Normal Population Assumption

    55. Practice Problem!
