
Regression


sheba

Presentation Transcript


  1. Regression

  2. [Figure: scatterplot of Weight against Height.] How much would an adult female weigh if she were 5 feet tall? She could weigh varying amounts; in other words, there is a distribution of weights for adult females who are 5 feet tall. This distribution is normally distributed (we hope). What would you expect for other heights? What about the standard deviations of all these normal distributions? We want the standard deviations of all these normal distributions to be the same. Where would you expect the TRUE LSRL to be?

  3. Regression Model • The mean response $\mu_y$ has a straight-line relationship with $x$: $\mu_y = \alpha + \beta x$ • Where: slope $\beta$ and intercept $\alpha$ are unknown parameters • For any fixed value of $x$, the response $y$ varies according to a normal distribution. Repeated responses of $y$ are independent of each other. • The standard deviation of $y$ ($\sigma_y$) is the same for all values of $x$. ($\sigma_y$ is also an unknown parameter)
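The model on this slide can be simulated to see what "a normal distribution of responses at each fixed x, with the same spread everywhere" looks like. The sketch below is only an illustration: the parameter values for the intercept, slope, and standard deviation are hypothetical and do not come from the presentation.

```python
import random
import statistics

def simulate_weights(height, alpha, beta, sigma, n, seed=0):
    """Draw n independent responses y ~ Normal(alpha + beta * height, sigma)."""
    rng = random.Random(seed)
    mean_response = alpha + beta * height  # mu_y for this fixed value of x
    return [rng.gauss(mean_response, sigma) for _ in range(n)]

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, SIGMA = -100.0, 3.5, 15.0

for height in (60, 64, 68):
    weights = simulate_weights(height, ALPHA, BETA, SIGMA, n=5000, seed=height)
    # The sample mean tracks alpha + beta * height; the spread stays near sigma.
    print(height, round(statistics.mean(weights), 1), round(statistics.stdev(weights), 1))
```

Under these assumed values, the mean weight shifts with height along a straight line while the standard deviation stays roughly constant, which is exactly what the model requires.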

  4. Suppose we look at part of a population of adult women. These women are all 64 inches tall. What distribution does their weight have?

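  5. We use $\hat{y} = a + bx$ to estimate $\mu_y = \alpha + \beta x$ • The slope $b$ of the LSRL is an unbiased estimator of the true slope $\beta$. • The intercept $a$ of the LSRL is an unbiased estimator of the true intercept $\alpha$. • The standard error $s = \sqrt{\sum (y_i - \hat{y}_i)^2 / (n - 2)}$ estimates the true standard deviation of $y$ ($\sigma_y$); $s^2$ is an unbiased estimator of $\sigma_y^2$. Note: df = n − 2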
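As a rough sketch of how the three estimates are obtained, the function below computes the least-squares intercept, slope, and the standard error about the line with n − 2 degrees of freedom. The function name and the small data set are made up for illustration; they are not the presentation's data.

```python
import math

def fit_lsrl(xs, ys):
    """Return (a, b, s): LSRL intercept, slope, and standard error about the line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # estimates the true slope
    a = y_bar - b * x_bar    # estimates the true intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # df = n - 2: two parameters were estimated
    return a, b, s

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, s = fit_lsrl(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, s = {s:.3f}")
```

Dividing by n − 2 rather than n reflects the two quantities (intercept and slope) already estimated from the same data.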

  6. Notes! For a study on student drinking and blood alcohol level, sixteen student volunteers at Ohio State University drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their blood alcohol content (BAC). The results are shown below: Use your calculator to find a regression equation (ax + b) for these data. State your equation using descriptive notation. What does the value a represent in the context of this problem?

  7. We would like to create a confidence interval for the slope of the regression line. In other words, we want to know the true slope $\beta$.

  8. Conditions for regression inference • For any fixed value of $x$, the response variable $y$ varies normally about the true regression line. (Check a histogram or boxplot of the residuals.) • The mean response, $\mu_y$, has a straight-line relationship with $x$. (Check the scatter plot and residual plot.) • The standard deviation of $y$ is the same for all values of $x$. (Check the scatter plot and residual plot.)
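The checks above all start from the residuals (observed minus predicted). As a minimal sketch, the helpers below compute residuals for a given line and the five-number summary that a basic boxplot would display. The helper names, the data, and the fitted line are hypothetical.

```python
import statistics

def residuals(xs, ys, a, b):
    """Observed minus predicted response for each (x, y) pair."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def five_number_summary(values):
    """Min, Q1, median, Q3, max: the numbers a basic boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Made-up data and a made-up fitted line, for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.3, 3.8, 6.1, 8.2, 9.7, 12.1]
res = residuals(xs, ys, a=0.1, b=2.0)
print(five_number_summary(res))
```

A roughly symmetric summary with no extreme values is consistent with the normality condition; plotting the residuals against x is still needed to judge linearity and equal spread.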

  9. For problems involving inference for regression, we use a t distribution with df = n − 2.

  10. [Figure: Weight against Height with a horizontal LSRL.] What is the slope of a horizontal line? Suppose the LSRL is a horizontal line. Would height be useful in predicting weight? A slope of zero means that there is NO linear relationship between x and y!

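  11. Formulas: • Confidence interval: $b \pm t^* \, SE_b$ with df = n − 2, because there are two unknowns, $\alpha$ and $\beta$ • $SE_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$ is the standard error of the least-squares slope, $b$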
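The interval formula can be sketched in code as follows. This is an illustration under stated assumptions: the function name is hypothetical, the critical value t* is passed in by the caller (read from a t table for df = n − 2), and the data are made up.

```python
import math

def slope_confidence_interval(xs, ys, t_star):
    """Return (low, high) for the slope: b +/- t_star * SE_b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))   # standard error about the line
    se_b = s / math.sqrt(sxx)      # standard error of the slope
    return b - t_star * se_b, b + t_star * se_b

# Made-up data; t* = 3.182 is the 95% critical value for df = 5 - 2 = 3.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
low, high = slope_confidence_interval(xs, ys, t_star=3.182)
print(f"95% CI for the slope: ({low:.3f}, {high:.3f})")
```

Passing t* as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the critical value instead.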

  12. Interpretation: We are 95% confident that the mean change in BAC per beer is between ___________ and _____________

  13. Back to our Example: For a study on student drinking and blood alcohol level, sixteen student volunteers at Ohio State University drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their blood alcohol content (BAC). The results are shown below: a) Find the LSRL, correlation coefficient, and coefficient of determination. Predicted BAC = -0.0127 + 0.018 (Beers); r = 0.8943; r² = 0.7998

  14. b) Explain the meaning of slope in the context of the problem. For each additional beer, the predicted BAC increases by approximately 0.018. c) Explain the meaning of the coefficient of determination in context. Approximately 80% of the variation in BAC can be explained by the linear regression of BAC on number of beers drunk.
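The reported equation and correlation can be checked with a few lines of arithmetic. The sketch below uses only the values stated in the example (intercept -0.0127, slope 0.018, r = 0.8943); the function name and the choice of 5 and 6 beers are illustrative.

```python
def predicted_bac(beers, intercept=-0.0127, slope=0.018):
    """Predicted BAC from the LSRL reported in the example."""
    return intercept + slope * beers

# The slope is the change in predicted BAC for one additional beer.
change_per_beer = predicted_bac(6) - predicted_bac(5)
print(round(predicted_bac(5), 4), round(change_per_beer, 4))

# The coefficient of determination is the square of the correlation.
r = 0.8943
r_squared = r ** 2
print(round(r_squared, 4))
```

Note that the intercept gives a negative predicted BAC at zero beers, which is not physically meaningful; the line is only a description of the data over the range of beers actually observed.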

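  15. d) Estimate $\alpha$, $\beta$, and $\sigma_y$. a = -0.0127, b = 0.0180, s = 0.0204 e) Create a scatter plot, residual plot, and boxplot of the residuals for the data. [Figures: scatter plot of BAC against Beers; residual plot of Residuals against Beers; boxplot of Residuals.]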

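  16. f) Give a 95% confidence interval for the true slope of the LSRL. • Assumptions: • The number of beers was randomly assigned to the student volunteers, so the observations can be treated as independent • Since the residual plot is randomly scattered, the relationship between BAC and number of beers is approximately linear • Since the points are evenly spread about the LSRL on the scatterplot, $\sigma_y$ is approximately equal for all values of number of beers • Since the boxplot of residuals is approximately symmetrical, the responses are approximately normally distributed. • With df = 14, t* = 2.145 and $SE_b \approx 0.0024$ (from $b / SE_b = r\sqrt{n-2} / \sqrt{1-r^2}$), the interval is 0.0180 ± 2.145(0.0024). • We are 95% confident that the true slope of the LSRL of BAC on number of beers is between approximately 0.013 and 0.023. Be sure to show all graphs!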
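The summary statistics reported in the example (b = 0.0180, r = 0.8943, n = 16) are enough to compute a 95% interval for the BAC slope, because in simple linear regression the t statistic for the slope satisfies $b / SE_b = r\sqrt{n-2} / \sqrt{1-r^2}$. The sketch below applies that identity; the function name is hypothetical, and t* = 2.145 is the standard table value for df = 14.

```python
import math

def slope_ci_from_summary(b, r, n, t_star):
    """Return (SE_b, low, high) using t = b / SE_b = r*sqrt(n-2)/sqrt(1-r^2)."""
    t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    se_b = b / t_stat          # back out the standard error of the slope
    margin = t_star * se_b
    return se_b, b - margin, b + margin

# Values reported in the example; t* is the 95% critical value for df = 14.
se_b, low, high = slope_ci_from_summary(b=0.0180, r=0.8943, n=16, t_star=2.145)
print(f"SE_b = {se_b:.4f}, 95% CI = ({low:.4f}, {high:.4f})")
```

Because the inputs are rounded, the endpoints are approximate; computing from the raw data would give slightly different trailing digits.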
