
Correlation

Correlation. The Problem: Are two variables related? Does one increase as the other increases (e.g., skills and income)? Does one decrease as the other increases (e.g., health problems and nutrition)? How can we get a numerical measure of the degree of relationship? Scatterplots.

yuri

Presentation Transcript


  1. Correlation

  2. The Problem • Are two variables related? • Does one increase as the other increases? • e.g., skills and income • Does one decrease as the other increases? • e.g., health problems and nutrition • How can we get a numerical measure of the degree of relationship?

  3. Scatterplots • Examples from text • See next three slides • Infant mortality and number of physicians • Life expectancy and health care expenditures • Cancer rate and solar radiation

  4. An Example • An actual course with both a lab and an exam component of final grades • Plotting exam component against lab component • Fairly weak relationship • Relationship is positive

  5. Exams and Labs • Note relationship is weak, but real. • Note most data cluster on right. • Why do we care about relationship? • What would students conclude if there were no relationship? • What if the relationship were near perfect? • What if the relationship were negative?

  6. Heart Disease and Cigarettes • Landwehr & Watkins report data on heart disease and cigarette smoking in 21 developed countries • Data have been rounded for computational convenience. • The results were not affected.

  7. The Data Surprisingly, the U.S. is the first country on the list--the country with the highest consumption and highest mortality.

  8. Scatterplot of Heart Disease • CHD Mortality goes on ordinate • Why? • Cigarette consumption on abscissa • Why? • What does each dot represent? • Best fitting line included for clarity

  9. {X = 6, Y = 11}

  10. What Does the Scatterplot Show? • As smoking increases, so does coronary heart disease mortality. • Relationship looks strong • Not all data points on line. • This gives us “residuals” or “errors of prediction” • To be discussed later

  11. Correlation Coefficient • A measure of degree of relationship. • Sign refers to direction. • Based on covariance • Measure of degree to which large scores go with large scores, and small scores with small scores

  12. Covariance • The formula • How this works, and why • When would covXY be large and positive? • When would covXY be large and negative?
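
The formula on this slide was an image and is not in the transcript. The following is a minimal sketch, assuming the usual sample covariance (sum of cross-products of deviations from the means, divided by N - 1). The function name `covariance` and the toy scores are illustrative assumptions, not the heart disease data.

```python
def covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x) * (y - mean_y), divided by N - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Toy scores, made up for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # large X goes with large Y
y_down = [10, 8, 6, 4, 2]  # large X goes with small Y

cov_up = covariance(x, y_up)      # positive
cov_down = covariance(x, y_down)  # negative
print(cov_up, cov_down)
```

When large scores pair with large scores, the deviation products are mostly positive and the covariance is positive; when large scores pair with small scores, the products are mostly negative.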

  13. Correlation Coefficient • Symbolized by r • Covariance ÷ (product of the standard deviations of X and Y)

  14. Calculation • CovXY = 11.13 • sX = 2.33 • sY = 6.69
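
The arithmetic for these three values can be sketched as follows, using r = covariance ÷ (product of the standard deviations). The variable names are illustrative; the numbers are the ones on the slide.

```python
# Values from the slide (rounded data for the 21 countries).
cov_xy = 11.13  # covariance of cigarette consumption (X) and CHD mortality (Y)
s_x = 2.33      # standard deviation of X
s_y = 6.69      # standard deviation of Y

# Correlation coefficient: covariance divided by the product of the standard deviations.
r = cov_xy / (s_x * s_y)
print(round(r, 2))  # 0.71
```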

  15. Correlation--cont. • Correlation = .71 • Sign is positive • Why? • If sign were negative • What would it mean? • Would not alter the degree of relationship.

  16. Factors Affecting r • Range restrictions • See next slide • Data only for countries with low consumption • Nonlinearity • e.g. age and size of vocabulary • Heterogeneous subsamples • Everyday examples
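
A small sketch of two of these factors, assuming made-up toy scores (not data from the text). A perfectly predictable but curved relationship can give r = 0, and restricting the same data to part of its range changes r. The helper `pearson_r` is an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation sums (the N - 1 terms cancel out)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data: Y is completely determined by X, but the relationship is curved.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]

# Nonlinearity: r measures only the linear part of the relationship.
r_full = pearson_r(x, y)

# Range restriction: keep only the cases with X >= 0.
kept = [(a, b) for a, b in zip(x, y) if a >= 0]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])

print(r_full, r_restricted)
```

In this toy case restriction raises r; with other data, such as the low-consumption countries on the next slide, restricting the range can instead lower it.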

  17. Countries With Low Consumption • Data with restricted range, truncated at 5 cigarettes per day • [Scatterplot: CHD Mortality per 10,000 (ordinate) against Cigarette Consumption per Adult per Day (abscissa)]

  18. Testing r • Population parameter = ρ • Null hypothesis H0: ρ = 0 • Test of linear independence • What would a true null mean here? • What would a false null mean here? • Alternative hypothesis (H1): ρ ≠ 0 • Two-tailed

  19. Tables of Significance • Table in Appendix E.2 • For N - 2 = 19 df, rcrit = .433 • Our correlation > .433 • Reject H0 • Correlation is significant. • Greater cigarette consumption associated with higher CHD mortality.
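
The table lookup can be cross-checked with a short sketch. It assumes a two-tailed test at alpha = .05 (the slide does not state alpha) and uses the standard critical t for 19 df, 2.093, together with the usual conversions between r and t. Variable names are illustrative.

```python
import math

r = 0.71        # correlation from the heart disease example
n = 21          # number of countries
df = n - 2      # 19 degrees of freedom
t_crit = 2.093  # assumed: two-tailed .05 critical t for 19 df, from a standard t table

# t statistic for testing H0: rho = 0.
t_obs = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Critical r implied by the critical t.
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)

reject_null = abs(r) > r_crit
print(round(r_crit, 3), round(t_obs, 2), reject_null)
```

Comparing r with the critical r and comparing t with the critical t are the same decision expressed in two forms.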

  20. Computer Printout • Printout gives test of significance. • See next slide. • Double asterisks with footnote indicate p < .01.

  21. SPSS Printout

  22. Intercorrelation Matrix • Matrix of correlations of several variables at once. • Example from Kliewer et al. (1998), JCCP • 99 young children • Measured level of • Witnessing violence, Intrusive thoughts, Social support, and Internalizing symptoms • Define these variables

  23. Cont.

  24. Intercorrelation Matrix--cont. • Describe the table. • What does this tell us about the effects of witnessing violence? • What role does social support play?
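
How such a matrix is built can be sketched as below. The variable names `var_a`, `var_b`, `var_c` and their scores are placeholders invented for illustration; they are not the Kliewer et al. variables or results.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder scores for five hypothetical cases on three hypothetical variables.
scores = {
    "var_a": [1, 2, 3, 4, 5],
    "var_b": [2, 1, 4, 3, 5],
    "var_c": [5, 4, 3, 2, 1],
}

names = list(scores)
# One row per variable, one column per variable: every pairwise correlation.
matrix = [[pearson_r(scores[row], scores[col]) for col in names] for row in names]

for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

The diagonal is always 1 (each variable with itself) and the matrix is symmetric, which is why printed tables often show only one triangle.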
