
Simple Linear Regression

Simple Linear Regression. Statistics 515 Lecture.


Presentation Transcript


  1. Simple Linear Regression Statistics 515 Lecture

  2. Example for Illustration. The human body takes in more oxygen when exercising than when at rest. To deliver oxygen to the muscles, the heart must beat faster. Heart rate is easy to measure, but measuring oxygen uptake requires elaborate equipment. If oxygen uptake (VO2) can be accurately predicted from heart rate (HR), predicted values can replace actual measurements for various research purposes. Unfortunately, not all human bodies are alike, so no single prediction equation works for everyone. Researchers can, however, measure both HR and VO2 for one person under varying exercise conditions and fit a regression equation for predicting that person's oxygen uptake from heart rate.

  3. Data From an Individual • Goals in this illustration: • Examine the scatterplot: is the relationship linear? • Obtain the best-fitting line using least squares. • Test whether the model is significant. • Obtain a confidence interval for the regression coefficient. • Obtain predictions.

  4. The Scatterplot

  5. Simple Linear Regression Model 1. Conditional on X = x, the response variable Y has mean μ(x) = α + βx. 2. α is the y-intercept, while β is the slope of the regression line, which can be interpreted as the change in the mean response per unit change in the independent variable. 3. For each X = x, the conditional distribution of Y is normal with mean μ(x) and variance σ². 4. Y_1, Y_2, …, Y_n are independent of each other. Shorthand: Y_i = α + βx_i + ε_i with ε_i iid N(0, σ²).
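
A quick simulation helps to illustrate the model. The parameter values and sample size below are hypothetical and are not estimates from the lecture's data; they only show what data generated from Y_i = α + βx_i + ε_i look like.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical parameter values (not from the lecture's data)
    alpha, beta, sigma = 10.0, 0.3, 2.0

    # Heart-rate values for one subject under varying exercise conditions
    hr = rng.uniform(90, 160, size=30)

    # VO2 generated from Y_i = alpha + beta * x_i + eps_i, with eps_i ~ N(0, sigma^2)
    vo2 = alpha + beta * hr + rng.normal(0.0, sigma, size=30)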

  6. Least-Squares (LS) Regression One of the goals in regression analysis is to estimate the parameters α, β, and σ² of the regression model. Denote by ŷ = a + bx the estimate of the regression line, so that a estimates α and b estimates β. Then for the observed values of X, namely x_1, x_2, …, x_n, we may obtain the predicted values of the response variable Y at each of these X-values. These are:
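
The predicted-value formula on the slide is not reproduced in the transcript; in the usual notation it is

    \hat{y}_i = a + b x_i, \qquad i = 1, 2, \ldots, n.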

  7. Predicted Values A good estimate of the regression line should produce predicted values that are close to the actual observed values of the response variable. That is, the set of deviations between the observed and predicted values should ideally be close (if not equal) to zero. These deviations are also called residuals.
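
Written out (the slide's formula does not appear in the transcript), these residuals are

    e_i = y_i - \hat{y}_i = y_i - (a + b x_i), \qquad i = 1, 2, \ldots, n.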

  8. Principle of Least-Squares (LS) In least-squares regression, the best-fitting regression line is the one that makes the sum of these squared deviations, or residuals, as small as possible. Thus, the regression coefficients a and b are chosen to minimize the quantity shown below; using calculus, the values of a and b that minimize it are also given below.
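
The minimized quantity and the resulting least-squares estimates are not in the transcript; the standard forms are

    Q(a, b) = \sum_{i=1}^{n} \left( y_i - a - b x_i \right)^2,

    b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}, \qquad a = \bar{y} - b\,\bar{x}.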

  9. Least-Squares Solution
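
A minimal sketch of the least-squares computation. The (HR, VO2) numbers below are made up for illustration and are not the lecture's data.

    import numpy as np

    # Hypothetical (HR, VO2) measurements for one subject
    x = np.array([94.0, 96.0, 105.0, 110.0, 118.0, 125.0, 131.0, 139.0, 146.0, 152.0])
    y = np.array([0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90])

    sxx = np.sum((x - x.mean()) ** 2)               # S_xx
    sxy = np.sum((x - x.mean()) * (y - y.mean()))   # S_xy

    b = sxy / sxx                  # least-squares slope
    a = y.mean() - b * x.mean()    # least-squares intercept
    print(f"fitted line: yhat = {a:.4f} + {b:.4f} x")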

  10. Estimating the Variance
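
The estimator shown on this slide is not reproduced in the transcript; consistent with slide 11, the common variance σ² is estimated by

    s^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n-2} = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2.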

  11. Interpretations of Quantities • SSE: measures the variation not explained by the predictor variable. • SSR: measures the amount of variation explained by the predictor variable. • SYY: the total variation in the Y-values; it is partitioned into SSR and SSE. • R² = SSR/SYY: the coefficient of determination; it indicates the proportion of the variation in the Y-values explained by the predictor variable. • MSE = SSE/(n-2): the mean squared error, which provides an unbiased estimate of the common variance σ².
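
In symbols, the quantities listed above are

    S_{YY} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \mathrm{SSR} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,

    S_{YY} = \mathrm{SSR} + \mathrm{SSE}, \qquad R^2 = \frac{\mathrm{SSR}}{S_{YY}}, \qquad \mathrm{MSE} = \frac{\mathrm{SSE}}{n-2}.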

  12. Sampling Distributions of Estimators To estimate the variance, σ² is replaced by the MSE.
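
The distributions themselves are not in the transcript; under the model assumptions, the standard results (with σ² replaced by the MSE for estimation, as the slide notes) are

    b \sim N\!\left( \beta,\; \frac{\sigma^2}{S_{xx}} \right), \qquad a \sim N\!\left( \alpha,\; \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right) \right).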

  13. Testing Hypotheses • To test the null hypothesis H0: β = β0 versus H1: β ≠ β0, we use a t-statistic which follows a t-distribution with n - 2 degrees of freedom under the null hypothesis. Thus, we reject H0 if |Tc| > t(n-2; α/2). A similar statistic is used for testing H0: α = α0; both are given below.
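
The two statistics, written out in their standard form since the slide's formulas are not in the transcript:

    T_c = \frac{b - \beta_0}{\sqrt{\mathrm{MSE}/S_{xx}}} \qquad \text{and} \qquad T_c = \frac{a - \alpha_0}{\sqrt{\mathrm{MSE}\left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}}.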

  14. Confidence Interval for the Mean and Predicting the Value of Y for a New Unit • Estimate of the mean and predicted value at x0 • Variance of the estimate • CI for μ(x0) • CI for Y(x0) (the standard forms are written out below)
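
In their standard form, the quantities listed on this slide are

    \hat{y}_0 = a + b x_0, \qquad \widehat{\mathrm{Var}}[\hat{y}_0] = \mathrm{MSE}\left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right),

    \text{CI for } \mu(x_0): \quad \hat{y}_0 \pm t_{n-2;\,\alpha/2} \sqrt{\mathrm{MSE}\left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)},

    \text{interval for a new } Y \text{ at } x_0: \quad \hat{y}_0 \pm t_{n-2;\,\alpha/2} \sqrt{\mathrm{MSE}\left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}.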

  15. Results of Regression Analysis (using Minitab) The slide shows annotated Minitab output highlighting the prediction line, the p-value for the regression, the coefficient of determination, and the F-ratio MSR/MSE.
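
The same summary quantities can be obtained in Python rather than Minitab. This is a sketch only: the data are made up, and scipy.stats.linregress is used here as a stand-in for the Minitab output, not as part of the lecture.

    import numpy as np
    from scipy import stats

    # Hypothetical (HR, VO2) data for one subject; not the lecture's Minitab data
    x = np.array([94.0, 96.0, 105.0, 110.0, 118.0, 125.0, 131.0, 139.0, 146.0, 152.0])
    y = np.array([0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90])

    fit = stats.linregress(x, y)
    print(f"prediction line: yhat = {fit.intercept:.4f} + {fit.slope:.4f} x")
    print(f"p-value for the regression (slope): {fit.pvalue:.4g}")
    print(f"coefficient of determination R^2:   {fit.rvalue ** 2:.4f}")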

  16. Fitted Line on the Scatterplot

  17. Confidence Interval for the Mean and Prediction Interval The slide shows the fitted line with one band for predicting the mean value and a wider band for predicting the value of the response for a new unit.
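
A sketch of how both intervals can be computed at a new x-value x0, following the formulas on slide 14. The function and variable names are mine, not the lecture's.

    import numpy as np
    from scipy import stats

    def mean_ci_and_pi(x, y, x0, level=0.95):
        """Confidence interval for the mean response and prediction interval
        for a new response at x0, under the simple linear regression model."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        n = len(x)
        sxx = np.sum((x - x.mean()) ** 2)
        b = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # slope
        a = y.mean() - b * x.mean()                          # intercept
        mse = np.sum((y - (a + b * x)) ** 2) / (n - 2)       # estimate of sigma^2
        t = stats.t.ppf(1 - (1 - level) / 2, df=n - 2)       # t critical value
        y0 = a + b * x0
        se_mean = np.sqrt(mse * (1 / n + (x0 - x.mean()) ** 2 / sxx))
        se_pred = np.sqrt(mse * (1 + 1 / n + (x0 - x.mean()) ** 2 / sxx))
        ci = (y0 - t * se_mean, y0 + t * se_mean)   # CI for the mean at x0
        pi = (y0 - t * se_pred, y0 + t * se_pred)   # interval for a new Y at x0
        return ci, pi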
