
Correlation

Key Concept: This section introduces the linear correlation coefficient r, which is a numerical measure of the strength of the relationship between two variables representing quantitative data.


Presentation Transcript


  1. Correlation

  2. Correlation: Key Concept. This section introduces the linear correlation coefficient r, which is a numerical measure of the strength of the relationship between two variables representing quantitative data. Because technology can be used to find the value of r, it is important to focus on the concepts in this section without becoming overly involved in tedious arithmetic calculations.

  3. Definition: A correlation exists between two variables when one of them is related to the other in some way.

  4. Definition: The linear correlation coefficient r measures the strength of the linear relationship between paired quantitative x- and y-values in a sample.

  5. Scatterplots of Paired Data

  6. Scatterplots of Paired Data

  7. Notation for the Linear Correlation Coefficient
     n represents the number of pairs of data present.
     Σ denotes the addition of the items indicated.
     Σx denotes the sum of all x-values.
     Σx² indicates that each x-value should be squared and then those squares added.
     (Σx)² indicates that the x-values should be added and the total then squared.
     Σxy indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
     r represents the linear correlation coefficient for a sample.
     ρ represents the linear correlation coefficient for a population.

  8. Formula: The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample.
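The notation listed in slide 7 (n, Σx, Σx², (Σx)², Σxy) matches the standard computational form of the sample correlation coefficient. Assuming that is the form the slide presented, it can be written as:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\;
          \sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}}
```

Here Σy, Σy², and (Σy)² are defined for the y-values exactly as their x counterparts are in the notation list.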

  9. Example: Calculating r. Using the simple random sample of data below, find the value of r.

  10. Example: Calculating r - cont

  11. Example: Calculating r - cont
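As a rough illustration of the calculation, the sketch below applies the standard computational formula for r, written with the sums from the notation slide. The sample values and the function name `linear_correlation` are hypothetical and chosen only for illustration; they are not the data used in the slides.

```python
from math import sqrt


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    n = len(xs)                                   # number of pairs
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx²
    sum_y2 = sum(y * y for y in ys)               # Σy²
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical paired sample (not the data from the presentation).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = linear_correlation(x, y)
print(round(r, 3))  # 0.775
```

For this sample, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so r = 30 / (√50 · √30) ≈ 0.775.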

  12. 1. –1 r 1 2. The value of rdoes not change if all values of either variable are converted to a different scale. 3. The value of r is not affected by the choice of x and y. Interchange all x- and y-valuesand the value of rwill not change. 4. r measures strength of a linear relationship. Properties of the Linear Correlation Coefficient r

  13. Explained Variation (coefficient of determination): The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.

  14. Example: Using the duration/interval data from an earlier example, we found that the value of the linear correlation coefficient is r = 0.926. What proportion of the variation in y can be explained by the linear relationship between x and y? With r = 0.926, we get r² = 0.857. We conclude that 0.857 (or about 86%) of the variation in y can be explained by the linear relationship between x and y. This implies that about 14% of the variation in y cannot be explained by the linear relationship between x and y.

  15. Common Errors Involving Correlation
     1. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
     2. Linearity: There may be some relationship between x and y even when there is no linear correlation.
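The linearity point can be illustrated with a small constructed example: when y is exactly x² over x-values that are symmetric about zero, y is completely determined by x, yet r is 0. The data and the helper name `linear_correlation` below are hypothetical and serve only as a sketch.

```python
from math import sqrt


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Constructed data: a perfect quadratic relationship, y = x².
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]

r = linear_correlation(x, y)
print(r)  # 0.0
```

Here Σx = 0 and Σxy = 0, so the numerator of r vanishes even though the relationship between x and y is exact. A scatterplot would reveal the pattern that r misses.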

  16. Note: Confidence Interval. A confidence interval (or interval estimate) is a range (or an interval) of values used to estimate the true value of a population parameter.

  17. Confidence Interval for Population Proportion [formula not included in the transcript]

  18. Example: The U.S. Bureau of Labor Statistics collects information on the ages of people in the civilian labor force and publishes the results in Employment and Earnings. Fifty people in the civilian labor force are randomly selected, and their mean age is 36.38 years. Find a 95% confidence interval for the mean age, μ, of all people in the civilian labor force. Assume that the population standard deviation of the ages is 12.1 years.
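One way to work this example is the usual interval for a mean when the population standard deviation is known: sample mean ± z·σ/√n. The sketch below assumes that 36.38 is the mean of the 50 sampled ages (the population mean μ is the unknown being estimated); the function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    # Critical value z for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)   # margin of error
    return sample_mean - margin, sample_mean + margin


# Assumption: 36.38 is the sample mean of the 50 selected ages.
low, high = mean_confidence_interval(sample_mean=36.38, sigma=12.1, n=50)
print(round(low, 1), round(high, 1))  # 33.0 39.7
```

Under these assumptions, we can be 95% confident that the mean age of all people in the civilian labor force lies between about 33.0 and 39.7 years.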
