Chapter 4. The Description of Data: Measures of Variation and Dispersion
Measures of Variation • We have looked at measures of the center, or location, of data. • We also need a measure of the dispersion of data.
Range • The range is the distance spanned by the data. • The range is calculated by subtracting the smallest data value from the largest. • The range is sensitive to outliers. • The range does not provide any information regarding the data between the minimum and maximum.
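As a minimal sketch of the points above, the following Python snippet computes a range and shows how a single outlier inflates it. The data values and the helper name `data_range` are illustrative assumptions, not taken from the slides.

```python
def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)

# Hypothetical data, chosen only for illustration
scores = [12, 15, 11, 14, 13]
with_outlier = scores + [95]

print(data_range(scores))        # 4
print(data_range(with_outlier))  # 84
```

Only the minimum and maximum enter the calculation, which is why the range says nothing about the values in between.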
Interquartile Range • The interquartile range is the distance spanned by the middle 50% of the data. • The interquartile range is calculated by subtracting Q1 from Q3. • The interquartile range is not sensitive to outliers, but still gives insight into the dispersion of the data.
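The calculation of Q3 minus Q1 might be sketched as follows. Quartile conventions differ between textbooks, calculators, and software, so this example assumes the default "exclusive" method of Python's `statistics.quantiles`; the data and the function name `interquartile_range` are hypothetical.

```python
import statistics

def interquartile_range(values):
    """Return Q3 - Q1, using statistics.quantiles with its default method."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = list(range(1, 12))          # 1, 2, ..., 11 (hypothetical data)
with_outlier = data[:-1] + [1000]  # replace 11 with an extreme value

print(interquartile_range(data))          # 6.0
print(interquartile_range(with_outlier))  # 6.0, unchanged by the outlier
```

Because only the middle 50% of the data determines the result, the extreme value has no effect here.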
Mean Absolute Deviation • The mean absolute deviation is the mean distance to the mean. In other words, it is the average of the absolute distances between each data value and the mean µ.
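A short sketch of this definition, assuming the usual formula (sum of |x - µ| divided by the number of values). The data and the function name `mean_absolute_deviation` are illustrative.

```python
import statistics

def mean_absolute_deviation(values):
    """Average of the absolute distances from each value to the mean."""
    mu = statistics.fmean(values)
    return statistics.fmean([abs(x - mu) for x in values])

data = [2, 4, 6, 8, 10]  # hypothetical data, mean is 6
print(mean_absolute_deviation(data))  # 2.4
```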
Variance and Standard Deviation • The variance is the average squared distance to the mean. • The standard deviation is the square root of the variance.
Variance and Standard Deviation • For samples, we divide the sum of squared deviations by n - 1 instead of n, so that the sample variance is not a biased estimate of the population variance. • The standard deviations of populations and samples are available from your calculator. Variance can be calculated as the square of the standard deviation.
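The difference between the population and sample formulas can be sketched by hand. This is a minimal illustration with made-up data; the function names and the `sample` flag are assumptions introduced here.

```python
import math

def variance(values, sample=True):
    """Average squared distance to the mean; divide by n - 1 for samples."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return squared_deviations / divisor

def standard_deviation(values, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(values, sample))

data = [2, 4, 6, 8, 10]  # hypothetical data
print(variance(data, sample=False))  # 8.0  (population: 40 / 5)
print(variance(data, sample=True))   # 10.0 (sample: 40 / 4)
```

Python's standard library offers the same pair as `statistics.pvariance` and `statistics.variance`.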
Chebyshev’s Theorem • The minimum proportion of data that can be found within k standard deviations of the mean is 1 - 1/k², for any k > 1.
Chebyshev’s Theorem • Chebyshev’s Theorem works for any distribution, but its bound is conservative. • The theorem gives the minimum proportion of data that will be found in a given interval; in practice, the actual proportion is usually much higher than this minimum.
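Chebyshev's bound is 1 - 1/k² for k > 1. A small sketch of that formula, with a hypothetical function name, makes the guaranteed minimums concrete:

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(k, round(chebyshev_min_proportion(k), 4))
# 1.5 0.5556
# 2 0.75
# 3 0.8889
```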
The Empirical Rule • If the distribution of data is normal (bell shaped), then: • About 68% of the data will be found within one standard deviation of the mean. • About 95% of the data will be found within two standard deviations of the mean. • About 99.7% of the data will be found within three standard deviations of the mean.
The Empirical Rule • The empirical rule applies only to distributions that are approximately normal (bell shaped). • For such distributions, the empirical rule is much more accurate than Chebyshev’s Theorem.
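The empirical-rule percentages are rounded values of the normal distribution's proportions. As a sketch, they can be recovered with `statistics.NormalDist` from the Python standard library; the function name below is an assumption.

```python
from statistics import NormalDist

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_proportion_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

For comparison, Chebyshev's Theorem guarantees only 75% within two standard deviations, while a normal distribution has about 95.45% there.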
Coefficient of Variation • The coefficient of variation measures the relative variation of a distribution. • Since this is a relative measure, there are no units, making it easier to compare the variation of two different populations.
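The slide does not show a formula, so the sketch below assumes the common definition: the standard deviation divided by the mean, expressed as a percentage. The two data sets and the function name are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.fmean(values) * 100

heights_cm = [160, 170, 180]  # hypothetical data in centimeters
weights_kg = [60, 70, 80]     # hypothetical data in kilograms

print(round(coefficient_of_variation(heights_cm), 2))  # 5.88
print(round(coefficient_of_variation(weights_kg), 2))  # 14.29
```

Both samples have a standard deviation of 10 in their own units, yet the weights vary more relative to their mean, which is the comparison the unit-free measure allows.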
Skewness • Distributions with a long right tail are positively skewed. • Distributions with a long left tail are negatively skewed. • Distributions that are not skewed are symmetric.
Pearson’s Coefficient of Skewness • Pearson’s coefficient of skewness gives a numeric measurement of the skewness of a distribution. • Distributions with an SK of 0 are symmetric. • Distributions with a positive SK are positively skewed, while distributions with a negative SK are negatively skewed.
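The slide does not show the formula. A widely used form is SK = 3(mean - median) / standard deviation, and the sketch below assumes that form with the sample standard deviation; the textbook's own definition may differ. The data sets are made up.

```python
import statistics

def pearson_skewness(values):
    """Assumed form: 3 * (mean - median) / sample standard deviation."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    return 3 * (mean - median) / statistics.stdev(values)

symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 10]   # long right tail
left_tail = [-10, -4, -3, -2, -1]  # long left tail

print(pearson_skewness(symmetric))             # 0.0
print(round(pearson_skewness(right_tail), 4))  # 0.8485
print(round(pearson_skewness(left_tail), 4))   # -0.8485
```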
Try it! • The median price of a home selling in San Diego during 1991 was $195,000. The first and third quartile prices were $170,500 and $232,000 respectively. What was the semi-interquartile range for the cost of a home in San Diego in 1991? • $30,750
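The semi-interquartile range is half the interquartile range, so the answer can be checked with one line of arithmetic. The function name is an illustrative assumption.

```python
def semi_interquartile_range(q1, q3):
    """Half the distance between the first and third quartiles."""
    return (q3 - q1) / 2

print(semi_interquartile_range(170_500, 232_000))  # 30750.0
```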
Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the range of this sample. • $50
Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the variance for the quoted price of the TV. • 490.40 (in squared dollars)
Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the standard deviation for the quoted price of the TV. • $22.14
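The three television-price answers can be checked with Python's `statistics` module, which uses the n - 1 divisor for `variance` and `stdev`. This is a verification sketch, not the method the slides prescribe (they point to a calculator).

```python
import statistics

prices = [326, 299, 345, 295, 310, 345]

price_range = max(prices) - min(prices)
sample_variance = statistics.variance(prices)  # divides by n - 1
sample_std = statistics.stdev(prices)

print(price_range)                # 50
print(round(sample_variance, 2))  # 490.4
print(round(sample_std, 2))       # 22.14
```

The mean is 320, the squared deviations sum to 2452, and 2452 / 5 = 490.4.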
Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 200 • k = -1.2235
Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 238.4 • k = 1.0353
Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 229 • k = 0.4824
Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 198.1 • k = -1.3353
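Each k (or z) value is the distance from the mean measured in standard deviations: k = (x - mean) / standard deviation. A short sketch, with a hypothetical function name, reproduces the four answers:

```python
def z_value(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

mean, std_dev = 220.8, 17.0
for x in (200, 238.4, 229, 198.1):
    print(x, round(z_value(x, mean, std_dev), 4))
# 200 -1.2235
# 238.4 1.0353
# 229 0.4824
# 198.1 -1.3353
```

Negative values lie below the mean and positive values lie above it.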
Try it! • Exercise 4.12 • SK = -0.5430