
Chapter 4 Section 2


Presentation Transcript


  1. Chapter 4 Section 2 Least-Squares Regression

  2. Chapter 4 – Section 2 • Learning objectives • Find the least-squares regression line and use the line to make predictions • Interpret the slope and the y-intercept of the least-squares regression line • Compute the sum of squared residuals

  3. Chapter 4 – Section 2 • Learning objectives • Find the least-squares regression line and use the line to make predictions • Interpret the slope and the y-intercept of the least-squares regression line • Compute the sum of squared residuals

  4. Chapter 4 – Section 2 • If we have two variables X and Y, we often would like to model the relationship with a line • Draw a line through the scatter diagram • We want to find the line that “best” describes the linear relationship … the regression line

  5. Chapter 4 – Section 2 • We want to use a linear model • Linear models can be written in several different (equivalent) ways • y = mx + b • y – y1 = m(x – x1) • y = b1x + b0 • Because the slope and the intercept are both important to analyze, we will use y = b1x + b0
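To see that the forms listed on this slide really are equivalent, here is a small Python sketch; the slope m = 2 and the point (1, 3) are made-up values for illustration:

```python
# Hypothetical values for illustration: slope m = 2, line through (x1, y1) = (1, 3)
m, x1, y1 = 2, 1, 3

def slope_intercept(x):
    # y = mx + b, where b = y1 - m*x1 makes the line pass through (x1, y1)
    b = y1 - m * x1
    return m * x + b

def point_slope(x):
    # y - y1 = m(x - x1), solved for y
    return y1 + m * (x - x1)

# The two forms agree at every x
for x in [0, 1, 2.5, 10]:
    assert slope_intercept(x) == point_slope(x)
print(slope_intercept(0))  # the y-intercept b0, here 1
```

In the b1x + b0 notation used from here on, this line has b1 = 2 and b0 = 1.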

  6. Chapter 4 – Section 2 • One difference between math and stat is that statistics assumes that the measurements are not exact, that there is an error or residual • The formula for the residual is always Residual = Observed – Predicted • This relationship is not just for this chapter … it is the general way of defining error in statistics

  7. Chapter 4 – Section 2 • For example, say that we want to predict a value of y for a specific value of x • Assume that we are using y = 10x + 25 as our model • To predict the value of y when x = 3, the model gives us y = 10 × 3 + 25 = 55, or a predicted value of 55 • Assume the actual value of y for x = 3 is equal to 50 • The actual value is 50 and the predicted value is 55, so the residual (or error) is 50 – 55 = –5
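The arithmetic on this slide can be checked with a short Python sketch (the model and the numbers are the slide's own):

```python
# The slide's example model: predicted y = 10x + 25
def predict(x):
    return 10 * x + 25

x = 3
observed = 50            # the actual value given in the example
predicted = predict(x)   # 10 * 3 + 25 = 55
residual = observed - predicted   # Residual = Observed - Predicted
print(predicted, residual)  # 55 -5
```

A negative residual means the model over-predicted at that point; a positive one means it under-predicted.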

  8. Chapter 4 – Section 2 • What the residual is on the scatter diagram • [Figure: a scatter diagram showing the model line, the observed value of y, the predicted value of y, and the residual at the x value of interest]

  9. Chapter 4 – Section 2 • We want to minimize the residuals, but we need to define what this means • We use the method of least-squares • We consider a possible linear model • We calculate the residual for each point • We add up the squares of the residuals • The line that has the smallest sum of squared residuals is called the least-squares regression line
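The three steps on this slide (pick a candidate line, compute each residual, add up the squares) can be sketched in Python; the data set below is made up for illustration and roughly follows y = 2x + 1:

```python
# Toy data set (made up for illustration), roughly following y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]

def sum_squared_residuals(b1, b0):
    # For a candidate line y = b1*x + b0, add up the squared residuals
    return sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))

# Compare two candidate lines: the one with the smaller total fits better
print(sum_squared_residuals(2, 1))   # close to the data, so the total is small (about 0.10)
print(sum_squared_residuals(1, 4))   # a worse fit, so the total is larger (about 9.10)
```

The least-squares regression line is the candidate that makes this total as small as it can possibly be.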

  10. Chapter 4 – Section 2 • The equation for the least-squares regression line is given by y = b1x + b0 • b1 is the slope of the least-squares regression line • b0 is the y-intercept of the least-squares regression line

  11. Chapter 4 – Section 2 • Finding the values of b1 and b0 by hand is a very tedious process • You should use software for this • Finding the coefficients b1 and b0 is only the first step of a regression analysis • We need to interpret the slope b1 • We need to interpret the y-intercept b0 • We need to do quite a bit more statistical analysis … this is covered in Section 4.3 and also in Chapter 14
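The slide recommends software, and that is what the standard formulas behind such software compute: b1 is the sum of (x − x̄)(y − ȳ) divided by the sum of (x − x̄)², and b0 = ȳ − b1·x̄. A minimal sketch, using a made-up data set that lies exactly on y = 2x + 1 so the fit is easy to check:

```python
# Standard least-squares formulas:
#   b1 = sum((x - xbar)*(y - ybar)) / sum((x - xbar)**2)
#   b0 = ybar - b1 * xbar
# Data are made up for illustration and lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar
print(b1, b0)  # 2.0 1.0 -- the fit recovers the line the data came from
```

In practice you would let statistical software do this, but the sketch shows there is nothing mysterious in the computation, only tedium.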

  12. Chapter 4 – Section 2 • Learning objectives • Find the least-squares regression line and use the line to make predictions • Interpret the slope and the y-intercept of the least-squares regression line • Compute the sum of squared residuals

  13. Chapter 4 – Section 2 • Interpreting the slope b1 • The slope is sometimes defined as Rise / Run • The slope is also sometimes defined as Change in y / Change in x • The slope relates changes in y to changes in x

  14. Chapter 4 – Section 2 • For example, if b1 = 4 • If x increases by 1, then y will increase by 4 • If x decreases by 1, then y will decrease by 4 • A positive linear relationship • For example, if b1 = –7 • If x increases by 1, then y will decrease by 7 • If x decreases by 1, then y will increase by 7 • A negative linear relationship
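The rule behind both examples is the same: the predicted change in y is the slope times the change in x. A one-line Python sketch, using the slide's values b1 = 4 and b1 = –7:

```python
# Predicted change in y = slope * change in x
def predicted_change(b1, dx):
    return b1 * dx

print(predicted_change(4, 1))    # b1 = 4: x up by 1 -> y up by 4
print(predicted_change(4, -1))   # x down by 1 -> y down by 4
print(predicted_change(-7, 1))   # b1 = -7: x up by 1 -> y down by 7
```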

  15. Chapter 4 – Section 2 • For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable) • To simplify the calculations, years are measured from 1900 (i.e. x = 55 is the year 1955) • The model used is y = 300x + 12,000 • A slope of 300 means that the model predicts that, on average, the population increases by 300 per year
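The population model on this slide can be evaluated directly; the numbers below are the slide's own:

```python
# The slide's model: population = 300 * x + 12000, where x = years since 1900
def population(x):
    return 300 * x + 12_000

print(population(55))                   # year 1955: 300 * 55 + 12000 = 28500
print(population(56) - population(55))  # the slope: consecutive years differ by 300
print(population(0))                    # the intercept: 12000 in the year 1900
```

The second line makes the slope interpretation concrete: any two consecutive years differ by exactly b1 = 300 in the model's prediction.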

  16. Chapter 4 – Section 2 • Interpreting the y-intercept b0 • Sometimes b0 has an interpretation, and sometimes not • If 0 is a reasonable value for x, then b0 can be interpreted as the value of y when x is 0 • If 0 is not a reasonable value for x, then b0 does not have an interpretation • In general, we should not use the model for values of x that are much larger or much smaller than the observed values

  17. Chapter 4 – Section 2 • For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable) • To simplify the calculations, years are measured from 1900 (i.e. x = 55 is the year 1955) • The model used is y = 300x + 12,000 • An intercept of 12,000 means that the model predicts that the town had a population of 12,000 in the year 1900 (i.e. when x = 0)

  18. Chapter 4 – Section 2 • Learning objectives • Find the least-squares regression line and use the line to make predictions • Interpret the slope and the y-intercept of the least-squares regression line • Compute the sum of squared residuals

  19. Chapter 4 – Section 2 • After finding the slope b1 and the intercept b0, it is very useful to compute the residuals, particularly the sum of squared residuals • Again, this is a tedious computation by hand • Any least-squares regression software will compute this quantity • We will use it in future sections
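The quantity this slide refers to is the sum, over all data points, of (observed − predicted)². A minimal Python sketch; the data and the fitted coefficients below are hypothetical, standing in for what regression software would report:

```python
# Sum of squared residuals for a fitted line: sum of (observed - predicted)^2.
# Data and coefficients are made up for illustration.
xs = [1, 2, 3, 4]
ys = [2.0, 4.5, 5.5, 8.0]
b1, b0 = 1.9, 0.25   # hypothetical fitted slope and intercept

ssr = sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))
print(round(ssr, 4))  # 0.45 for these numbers
```

By the definition of the least-squares regression line, no other choice of b1 and b0 can make this total smaller on the same data.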

  20. Summary: Chapter 4 – Section 2 • We can find the least-squares regression line that is the “best” linear model for a set of data • The slope can be interpreted as the change in y for every change of 1 in x • The intercept can be interpreted as the value of y when x is 0, as long as a value of 0 for x is reasonable
