Exploring relationships between variables

Presentation Transcript


  1. Exploring relationships between variables Unit 2 Scatterplots, Associations, and Correlations

  2. Scatterplots • Show patterns • Show trends • Show relationships • Reveal outlier values • Show change over time when one of the variables is time

  3. Scatterplots • The association can be positive or negative • Show the relationship between two variables • Can be examined in more depth using the z-scores of both variables ($z_x$, $z_y$)

  4. Z-scores • $z_x = (x - \bar{x}) / s_x$, where $s_x$ is the standard deviation of x • $z_y = (y - \bar{y}) / s_y$, where $s_y$ is the standard deviation of y • The standard deviations are calculated the same way as before.
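As a minimal sketch of the z-score calculation on this slide, the snippet below standardizes a small made-up list of x values. The data and the function name `z_scores` are illustrative assumptions, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the z-score of each value: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]

# Illustrative data only, not taken from the slides.
x = [1, 2, 3, 4, 5]
zx = z_scores(x)
print(zx)
```

A z-score of 0 marks a value equal to the mean; positive and negative z-scores mark values above and below it.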

  5. Ratio • Correlation coefficient • $r = \frac{\sum z_x z_y}{n - 1}$ • Correlation measures the strength of the linear association between two variables
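The correlation coefficient can be sketched directly from the z-score definition: multiply the paired z-scores, add them up, and divide by n - 1. The data set and the function name `correlation` below are assumptions made for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient r = sum(z_x * z_y) / (n - 1)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Illustrative data only, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))
```

Values of r near +1 or -1 indicate a strong linear association; values near 0 indicate a weak one.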

  6. Variables • Explanatory variable: x • Response variable: y

  7. Least-Squares Line • $y = a + bx$ • a = y-intercept • b = slope • $a = \bar{y} - b\bar{x}$ • $b = SS_{xy} / SS_x$ • $SS_x$ = sum of squares of x • $SS_{xy}$ = sum of cross products of x and y

  8. SSx • Square each x value and add up the squares: $\sum x^2$ • Then subtract the square of the sum of x divided by n: $SS_x = \sum x^2 - \frac{(\sum x)^2}{n}$ • On the calculator, you can get $SS_x$ by squaring the standard deviation of x and multiplying by (n - 1): $SS_x = s_x^2 (n - 1)$
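To check that the two routes to SSx agree, here is a small sketch that computes it once from the sums and once from the sample standard deviation. The data and the function name `ss_x` are illustrative assumptions.

```python
from statistics import stdev

def ss_x(xs):
    """Sum of squares of x: sum(x^2) - (sum(x))^2 / n."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n

# Illustrative data only, not taken from the slides.
xs = [1, 2, 3, 4, 5]
from_sums = ss_x(xs)                           # 55 - 225/5
from_stdev = stdev(xs) ** 2 * (len(xs) - 1)    # s_x^2 * (n - 1)
print(from_sums, from_stdev)
```

The two results should match up to floating-point rounding.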

  9. SSxy • Sum of cross products of x and y • Multiply each x value by its paired y value and add up the products: $\sum xy$ • Then subtract (sum of x) times (sum of y) divided by n: $SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$

  10. SSxy • The formula $\sum xy - \frac{(\sum x)(\sum y)}{n}$ is a more efficient way of computing $\sum (x - \bar{x})(y - \bar{y})$ • Both give the same value of $SS_{xy}$
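The SSxy formulas and the least-squares slope and intercept can be sketched together as follows. The data and the function names are assumptions for illustration; the formulas are the ones stated on the slides, with the intercept taken as the mean of y minus b times the mean of x.

```python
def ss_xy(xs, ys):
    """Computational formula: sum(x*y) - sum(x)*sum(y)/n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

def ss_xy_deviations(xs, ys):
    """Definitional formula: sum((x - xbar) * (y - ybar))."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x."""
    n = len(xs)
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    b = ss_xy(xs, ys) / ss_x            # slope
    a = sum(ys) / n - b * sum(xs) / n   # intercept: ybar - b * xbar
    return a, b

# Illustrative data only, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(ss_xy(xs, ys), ss_xy_deviations(xs, ys), a, b)
```

For this data both SSxy formulas give 6, so the slope is 6 / 10 = 0.6 and the intercept is 4 - 0.6 * 3 = 2.2, up to rounding.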

  11. Complete problem 4 on page 160

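  12. Standard Error of Estimate • $S_e = \sqrt{\frac{\sum (y - y_p)^2}{n - 2}}$, where $y_p$ is the predicted value of y • Computational shortcut: $S_e = \sqrt{\frac{SS_y - b \, SS_{xy}}{n - 2}}$, where $SS_y$ is the sum of squares of y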
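As a sketch, the standard error of estimate can be computed both from the residuals and from the sums-of-squares shortcut, and the two should agree. The data and function name below are illustrative assumptions; the shortcut assumes SSy is defined the same way as SSx.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Return S_e computed two ways: from residuals and from the SS shortcut."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    # Definition: square root of the sum of squared residuals over (n - 2).
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    from_residuals = math.sqrt(sse / (n - 2))
    # Shortcut: sqrt((SS_y - b * SS_xy) / (n - 2)).
    from_shortcut = math.sqrt((ss_y - b * ss_xy) / (n - 2))
    return from_residuals, from_shortcut

# Illustrative data only, not taken from the slides.
se_def, se_short = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(se_def, se_short)
```

Dividing by n - 2 rather than n reflects the two quantities (a and b) estimated from the data.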

  13. Residuals • You can graph the residuals to check whether a linear model is appropriate for the data • A residual is the difference between the observed value and the predicted value • Residual = observed - predicted = $y - y_p$
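A minimal sketch of computing residuals for a fitted line follows. The data, the function name `residuals`, and the coefficients a = 2.2 and b = 0.6 (the least-squares fit for this made-up data) are assumptions for illustration.

```python
def residuals(xs, ys, a, b):
    """Residual for each point: observed y minus predicted y, where predicted = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Illustrative data only; a and b are the least-squares fit for this data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys, a=2.2, b=0.6)
for x, e in zip(xs, res):
    print(x, round(e, 2))
```

Plotting these residuals against x and seeing no clear pattern suggests a line is a reasonable model; a curved pattern suggests it is not.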

  14. Confidence Intervals • $y_p - E < y < y_p + E$ • $y_p$ = predicted value of y • E = margin of error
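The slide does not say how E is computed. One common textbook form for the margin around a single predicted y at a given x is E = t_c * S_e * sqrt(1 + 1/n + (x - xbar)^2 / SSx); the sketch below assumes that formula, and it assumes the critical value t_c (n - 2 degrees of freedom) is looked up in a t table and passed in. The data and function name are illustrative.

```python
import math

def interval_for_y(xs, ys, x0, t_c):
    """Return (lower, y_p, upper) for y at x = x0, given a t critical value t_c."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    # Standard error of estimate from the residuals.
    s_e = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    y_p = a + b * x0
    # Assumed margin-of-error formula; not stated in the slides.
    margin = t_c * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_p - margin, y_p, y_p + margin

# Illustrative data; 3.182 is the two-tailed 95% t value for 3 degrees of freedom.
low, y_p, high = interval_for_y([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=3, t_c=3.182)
print(low, y_p, high)
```

Under this formula the interval is narrowest when x0 equals the mean of x and widens as x0 moves away from it.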

  15. What does this mean (better understanding)

  16. Types of data • Outlier • Leverage • Influential Point • Lurking Variable

  17. Outlier • Any data point that stands away from the others

  18. Leverage • Data points with x-values that are far from the mean of x • Can alter the least-squares regression line

  19. Influential Point • Omitting this point can drastically alter the regression model
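To see how a single high-leverage point can be influential, the sketch below fits the least-squares line with and without one point whose x-value is far from the rest. The data, including the extra point (20, 2), and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = ss_xy / ss_x
    a = sum(ys) / n - b * sum(xs) / n
    return a, b

# Illustrative data only; the last point has an x-value far from the others.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 2]

a_all, b_all = fit_line(xs, ys)            # fit using every point
a_cut, b_cut = fit_line(xs[:-1], ys[:-1])  # fit with the last point omitted
print("with point:   ", round(a_all, 3), round(b_all, 3))
print("without point:", round(a_cut, 3), round(b_cut, 3))
```

In this example the slope changes sign when the point is dropped, which is the kind of drastic change that marks a point as influential.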

  20. Lurking Variable • A hidden variable that does not appear in the equation • It is not explicitly part of the model but affects how the variables in the model appear to be related
