Welcome Back!

Welcome Back! EDUC 7610, Chapter 2: The Simple Regression Model. Fall 2018. Tyson S. Barrett, PhD.


Presentation Transcript


  1. Welcome Back!

  2. EDUC 7610 • Chapter 2: The Simple Regression Model • Fall 2018 • Tyson S. Barrett, PhD

  3. Let’s start with Scatterplots • Each point represents a single observation • The red line is the line of best fit • The line happens to go through each conditional mean • It goes through the mean of y at each value of x • E.g., when x = 1, mean of y = 2.5 (the conditional mean of y at x = 1 is 2.5)
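To make "conditional mean" concrete, here is a minimal sketch that groups observations by x and averages y within each group. The data are hypothetical (they are not the data behind the slide's scatterplot) and were chosen only so that the conditional mean at x = 1 comes out to 2.5; `conditional_means` is an illustrative helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data (not from the slides), chosen so that the
# conditional mean of y at x = 1 is 2.5.
data = [(1, 2.0), (1, 3.0), (2, 3.0), (2, 4.0), (3, 4.0), (3, 5.0)]


def conditional_means(pairs):
    """Return {x: mean of the y values observed at that x}."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: mean(ys) for x, ys in sorted(groups.items())}


cond_means = conditional_means(data)
for x, m in cond_means.items():
    print(f"x = {x}: conditional mean of y = {m}")
```

In this toy data the conditional means rise by exactly 1 for each one-unit step in x, so a straight line passes through all of them.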

  4. Conditional Means and Prediction • The open circles mark the conditional means • In this case, all conditional means fall on the line • When this happens (or approximately happens) we have linearity • The line is the linear model’s predicted level of y for each level of x

  5. Why is that line the “best”? • That line is the one that minimizes the sum of squared errors between the predicted values and the observed values (each such error is called a “residual” or “error”) • This approach is called Ordinary Least Squares (OLS) regression
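A small numeric check of the least-squares idea, as a sketch on hypothetical data: compute the closed-form OLS line, then confirm that nudging the intercept or slope away from it makes the sum of squared errors larger. The data, the nudge sizes, and the helper name `sse` are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sse(intercept, slope):
    """Sum of squared residuals for the line y_hat = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form OLS estimates for simple regression.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

best = sse(intercept, slope)

# Nudge either coefficient away from the OLS values and recompute the SSE.
nudged = [
    sse(intercept + da, slope + db)
    for da in (-0.5, 0.0, 0.5)
    for db in (-0.2, 0.0, 0.2)
    if (da, db) != (0.0, 0.0)
]

print(f"OLS line: y_hat = {intercept:.2f} + {slope:.2f} * x, SSE = {best:.2f}")
print("every nudged line has a larger SSE:", all(s > best for s in nudged))
```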

  6. Features of the “Best” Line (Simple Regression) • Slope: $b_1 = r_{XY}\,\dfrac{s_Y}{s_X}$ • Intercept: $b_0 = \bar{Y} - b_1 \bar{X}$ • The line: $\hat{Y} = b_0 + b_1 X$
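The usual simple-regression formulas can be sketched directly: the slope as the correlation times the ratio of standard deviations, the intercept as the mean of Y minus the slope times the mean of X. The data and the `predict` helper are hypothetical; the slide's own notation was lost in extraction, so the variable names here are assumptions.

```python
from statistics import mean, stdev

# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1 denominator)

# Sample covariance and Pearson correlation.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
r = cov_xy / (s_x * s_y)

slope = r * s_y / s_x               # slope = r * (s_Y / s_X)
intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)


def predict(x_new):
    """Predicted y on the fitted line."""
    return intercept + slope * x_new


print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"prediction at the mean of x: {predict(x_bar):.3f}")
```

Because the intercept is defined from the two means, the fitted line always passes through the point of means.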

  7. The “Best” Line and Correlation • The slope $b_1$ is only affected by variables that influence both X and Y, while the correlation $r$ is also affected by variables that only influence Y • We unstandardized $r$ by multiplying it by $s_Y / s_X$ • $r$ has no scale but $b_1$ is in the units of the outcome • $r$ is affected by the range of the variables measured • $b_1$ is the effect of X on Y while $r$ is the relative importance of X on Y

  8. We unstandardized $r$ by multiplying it by $s_Y / s_X$ • That is, the correlation $r$ is the standardized version of the slope $b_1$ • If we standardize our variables before using regression, $b_1$ and $r$ are the same • Why?
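One way to explore the "Why?" is to standardize both variables and refit. In this sketch on hypothetical data, the slope from the z-scored variables matches the correlation from the raw variables, because after standardizing both standard deviations equal 1. The helper names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols_slope(xs, ys):
    """Simple-regression slope of ys on xs."""
    xb, yb = mean(xs), mean(ys)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    sxx = sum((a - xb) ** 2 for a in xs)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation between xs and ys."""
    xb, yb = mean(xs), mean(ys)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    sxx = sum((a - xb) ** 2 for a in xs)
    syy = sum((b - yb) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def zscores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


b_raw = ols_slope(x, y)
r = pearson_r(x, y)
b_std = ols_slope(zscores(x), zscores(y))

print(f"raw slope = {b_raw:.4f}")
print(f"correlation = {r:.4f}, slope on standardized variables = {b_std:.4f}")
```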

  9. $r$ has no scale but the slope $b_1$ is in the units of the outcome • $r$ has a range of -1 to 1 • $b_1$ is in the range of the outcome (approximately), and can be anywhere from $-\infty$ to $\infty$ • “For a one unit increase in X there is an associated increase of $b_1$ units in the outcome”

  10. $r$ is affected by the range of the variables measured • The value of the slope $b_1$ is not affected by the range of X (the significance is…) • The correlation $r$ is affected by having a less-than-representative range of X • Why?

  11. $r$ is affected by the range of the variables measured
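Range restriction can be illustrated with a sketch on constructed data. The data below are hypothetical and deliberately balanced: at every x the errors are -1.5, 0, and 1.5, so the conditional means sit exactly on the line y = 2 + 0.5x. Keeping only the middle of the x range leaves the slope untouched but lowers the correlation, since the spread in X shrinks while the error spread stays the same.

```python
from math import sqrt
from statistics import mean


def slope_and_r(pairs):
    """Return (OLS slope, Pearson r) for a list of (x, y) pairs."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    xb, yb = mean(xs), mean(ys)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    sxx = sum((a - xb) ** 2 for a in xs)
    syy = sum((b - yb) ** 2 for b in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical data: true line y = 2 + 0.5 x with balanced errors at each x.
full = [(x, 2 + 0.5 * x + e) for x in range(1, 11) for e in (-1.5, 0.0, 1.5)]

# Same data, but observed only over a narrow, unrepresentative range of x.
restricted = [(x, y) for x, y in full if 4 <= x <= 7]

b_full, r_full = slope_and_r(full)
b_restricted, r_restricted = slope_and_r(restricted)

print(f"full range:       slope = {b_full:.3f}, r = {r_full:.3f}")
print(f"restricted range: slope = {b_restricted:.3f}, r = {r_restricted:.3f}")
```

With real data the slope would not be reproduced exactly under restriction, only approximately; the exact match here comes from the balanced construction.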

  12. The slope $b_1$ is only affected by variables that influence both X and Y, while the correlation $r$ is also affected by variables that only influence Y; $b_1$ is the effect of X on Y while $r$ is the relative importance of X on Y • $r$ is a measure of relative importance compared to other variables • If other variables are important, $r$ will be relatively smaller • $b_1$ is a measure of the effect of X on Y and therefore shouldn’t change much based on the range of X • The standard error is affected though (we’ll discuss later)
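The point about variables that influence only Y can be sketched with constructed data. Here a hypothetical variable `z` takes the values -1 and +1 at every x, so it is uncorrelated with X by design, and `gamma` (an assumed name) controls how strongly it pushes Y around. Making `z` more important leaves the slope alone but shrinks the correlation.

```python
from math import sqrt
from statistics import mean


def slope_and_r(xs, ys):
    """Return (OLS slope, Pearson r) of ys on xs."""
    xb, yb = mean(xs), mean(ys)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    sxx = sum((a - xb) ** 2 for a in xs)
    syy = sum((b - yb) ** 2 for b in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical design: z is balanced (-1, +1) at every x, so it influences
# Y only and is uncorrelated with X.
design = [(x, z) for x in range(1, 11) for z in (-1.0, 1.0)]
xs = [x for x, _ in design]


def outcome(gamma):
    """Y = 2 + 0.5 * X + gamma * Z, where gamma is the importance of Z."""
    return [2 + 0.5 * x + gamma * z for x, z in design]


b_weak, r_weak = slope_and_r(xs, outcome(gamma=0.5))
b_strong, r_strong = slope_and_r(xs, outcome(gamma=2.0))

print(f"Z matters a little: slope = {b_weak:.3f}, r = {r_weak:.3f}")
print(f"Z matters a lot:    slope = {b_strong:.3f}, r = {r_strong:.3f}")
```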

  13. Back to Residuals • The estimate of the slope $b_1$ depends on minimizing the (squared) residuals, so they are kind of a big deal

  14. Back to Residuals • Our $Y$ values can be separated into three parts: $Y_i = b_0 + b_1 X_i + e_i$ • $b_0$: the same for everyone (a constant) • $b_1 X_i$: explained component • $e_i$: unexplained component (residuals)
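The three-part split can be checked observation by observation. This sketch fits a simple regression to hypothetical data and breaks each observed y into a constant, an explained component (slope times x), and a residual; the variable names are assumptions.

```python
from statistics import mean

# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = mean(x), mean(y)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Each observation: (constant, explained component, residual).
parts = [(b0, b1 * xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]

for yi, (constant, explained, residual) in zip(y, parts):
    print(
        f"y = {yi:.1f} = {constant:.2f} (constant) "
        f"+ {explained:.2f} (explained) + {residual:+.2f} (residual)"
    )
```

The constant is identical in every row, the explained part changes only with x, and the residual is whatever is left over.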

  17. Properties of the Residuals • The mean is exactly zero. • The correlation with X is exactly zero. • The variance is $s_e^2 = s_Y^2\,(1 - r^2)$ • $r^2$ is the proportion of variance in Y explained by X • $1 - r^2$ is the proportion of variance in Y not explained by X
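These residual properties can be verified numerically. The sketch below, on hypothetical data, checks that the residuals average to zero, have zero covariance (and so zero correlation) with X, and have variance equal to the variance of Y times one minus the squared correlation. Up to floating-point rounding, the quantities should agree.

```python
from statistics import mean, variance

# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

mean_resid = mean(residuals)
# Covariance of X with the residuals; zero covariance implies zero correlation.
cov_x_resid = sum((xi - x_bar) * ei for xi, ei in zip(x, residuals)) / (n - 1)

r_squared = sxy ** 2 / (sxx * syy)
var_resid = variance(residuals)              # sample variance of the residuals
var_from_r = variance(y) * (1 - r_squared)   # s_Y^2 * (1 - r^2)

print(f"mean of residuals:         {mean_resid:.10f}")
print(f"cov(X, residuals):         {cov_x_resid:.10f}")
print(f"variance of residuals:     {var_resid:.6f}")
print(f"var(Y) * (1 - r^2):        {var_from_r:.6f}")
print(f"r^2 (explained share):     {r_squared:.3f}")
print(f"1 - r^2 (unexplained):     {1 - r_squared:.3f}")
```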

  18. Residuals tell us stuff • Partial relationships, because the residual is what remains in Y after adjusting for X • Residual analysis to detect anomalies • Detect non-linearities • Assess the homoskedasticity assumption
