
Describing Data: Numerical Measures

Describing Data: Numerical Measures. Chapter 3. Modified by Boris Velikson, 2009. GOALS. Calculate the arithmetic mean, weighted mean, median, and mode. Explain the characteristics, uses, advantages, and disadvantages of each measure of location.



Presentation Transcript


  1. Describing Data: Numerical Measures. Chapter 3. Modified by Boris Velikson, 2009.

  2. GOALS • Calculate the arithmetic mean, weighted mean, median, and mode. • Explain the characteristics, uses, advantages, and disadvantages of each measure of location. • Identify the position of the mean, median, and mode for both symmetric and skewed distributions. • Compute and interpret the range, mean deviation, variance, and standard deviation. • Understand the characteristics, uses, advantages, and disadvantages of each measure of dispersion. • Understand Chebyshev’s theorem and the Empirical Rule as they relate to a set of observations.

  3. Two numerical ways of describing data: • Measures of location (averages, measures of central tendency) • To pinpoint the center (in some sense) of a set of values • Measures of distribution • They characterize how wide or how narrow, or how symmetric or how asymmetric the distribution of data values is

  4. Characteristics of the Mean The arithmetic mean is the most widely used measure of location. It requires the interval or ratio scale. Its major characteristics are: • All values are used. • It is unique. • The sum of the deviations from the mean is 0. • It is calculated by summing the values and dividing by the number of values. Ex.: we have 5 values: 1, 10, 5, 4, 7. The arithmetic mean is (1+10+5+4+7)/5=27/5=5.4

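  5. The Population Mean and the Sample Mean Population Mean For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values: \( \mu = \frac{\sum X}{N} \), where \( \mu \) is the population mean, \( N \) is the number of values in the population, and \( \sum X \) is the sum of those values.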

  6. The Population Mean and the Sample Mean EXAMPLE – Population Mean

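  7. The Population Mean and the Sample Mean Sample Mean • For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values: \( \bar{X} = \frac{\sum X}{n} \), where \( \bar{X} \) is the sample mean, \( n \) is the number of values in the sample, and \( \sum X \) is the sum of those values.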

  8. The Population Mean and the Sample Mean EXAMPLE – Sample Mean

  9. Properties of the Arithmetic Mean • Every set of interval-level and ratio-level data has a mean. Non-numeric data does not have a mean! • All the values are included in computing the mean. • A set of data has a unique mean. • The mean is affected by unusually large or small data values. • The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.

  10. Illustrating the 4th property: Consider the set of values: 7, 8, 5, 3, and 400. The mean is 84.6. The mean is affected by unusually large or small data values. Here, it is affected by the very large value of 400.
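
The effect of the value 400 can be checked with a short sketch. It uses Python's standard `statistics` module, which is a choice made here and not part of the slides, and it also prints the median (a measure discussed later in this chapter) for comparison.

```python
from statistics import mean, median

values = [7, 8, 5, 3, 400]

# The single large value 400 pulls the mean far above the other values.
mean_all = mean(values)

# Dropping the last element (400) shows the mean of the remaining values.
mean_without_400 = mean(values[:-1])

# The median depends only on the middle of the ordered data.
median_all = median(values)

print(mean_all, mean_without_400, median_all)
```

Four of the five values are below 10, yet the mean lands above 80, which is why the mean alone can mislead when extreme values are present.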

  11. Example 4: the sum of deviations from the mean is zero Illustrating the 5th property: Consider the set of values: 3, 8, and 4. The mean is 5. The deviations from the mean are (3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0.

  12. Class work: • Do Ex. 9 and 11 p.60.

  13. Weighted Mean We use the weighted mean when we want to assign different weights to different data. For example: we want to calculate the average grade. The student has obtained the grades of 70, 85, and 90 in three quizzes, 80 in the midterm, and 95 in the final. The syllabus specifies that the midterm is 5 times more important than each quiz, and the final, 8 times more important. We assign these weights to the grades: 1 to each quiz, 5 to the midterm, and 8 to the final. To calculate the average grade, we multiply each grade by its weight, add the products, and divide by the sum of the weights: average grade = \( \frac{1(70) + 1(85) + 1(90) + 5(80) + 8(95)}{1 + 1 + 1 + 5 + 8} = \frac{1405}{16} \approx 87.8 \)

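  14. Weighted Mean Or, we can express the weights as percentages, so that they add up to 100%: 6.25% for each quiz, 31.25% for the midterm, and 50% for the final. Then we do not have to divide by their sum (because it is 100% = 1). We obtain the same result by calculating the average grade as \( 0.0625(70) + 0.0625(85) + 0.0625(90) + 0.3125(80) + 0.50(95) \approx 87.8 \)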

  15. Weighted Mean The Weighted Mean of a set of numbers X1, X2, ..., Xn, with corresponding weights w1, w2, ..., wn, is computed from the following formula: \( \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n} \)
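
As a quick check of the grade example from the weighted-mean slides, the sketch below computes the average both ways: with the raw weights (1 per quiz, 5 for the midterm, 8 for the final) and with the same weights rescaled so that they add up to 1. The function name `weighted_mean` is only illustrative.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

grades = [70, 85, 90, 80, 95]   # three quizzes, midterm, final
weights = [1, 1, 1, 5, 8]       # weights from the syllabus

average_grade = weighted_mean(grades, weights)

# Same weights expressed as fractions of the total (they add up to 1),
# so no division by the sum of the weights is needed.
fractions = [w / sum(weights) for w in weights]
average_grade_pct = sum(p * x for x, p in zip(grades, fractions))

print(average_grade, average_grade_pct)
```

Both calculations give the same number because rescaling the weights only moves the division by their sum inside the weights themselves.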

  16. Weighted Mean & Grouped Data During a one-hour period on a hot Saturday afternoon, cabana boy Chris served 50 drinks. He sold five drinks for $0.50, fifteen for $0.75, fifteen for $0.90, and fifteen for $1.15. Compute the weighted mean of the price of the drinks (how much did his average customer pay for a drink?).

  17. Weighted Mean & Grouped Data This can be done in 2 equivalent ways. 1) We can consider the data as representing an ungrouped set of 50 values: five values of $0.50, fifteen of $0.75, fifteen of $0.90, and fifteen of $1.15. Then the average we seek is the arithmetic mean of all these numbers (is it a sample mean or a population mean?): \( \frac{\sum X}{50} = \frac{44.50}{50} = 0.89 \), that is, $0.89.

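  18. Weighted Mean & Grouped Data 2) It is much more convenient to consider these figures to be grouped data. Then we calculate the average price of a drink as a weighted mean, with the number of drinks sold at each price as the weight: \( \bar{X}_w = \frac{5(0.50) + 15(0.75) + 15(0.90) + 15(1.15)}{5 + 15 + 15 + 15} = \frac{44.50}{50} = 0.89 \). As expected, we get the same result, $0.89.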
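
The equivalence of the two approaches to the drinks example can be sketched in a few lines. The variable names are illustrative, and the use of Python's `statistics.mean` is a choice made here, not something the slides prescribe.

```python
from statistics import mean

prices = [0.50, 0.75, 0.90, 1.15]   # price of each kind of drink, in dollars
counts = [5, 15, 15, 15]            # number of drinks sold at each price

# Way 1: expand the table into the 50 individual sales and average them.
ungrouped = [p for p, c in zip(prices, counts) for _ in range(c)]
mean_ungrouped = mean(ungrouped)

# Way 2: weighted mean, using the counts as weights.
mean_grouped = sum(p * c for p, c in zip(prices, counts)) / sum(counts)

print(len(ungrouped), round(mean_ungrouped, 2), round(mean_grouped, 2))
```

The second way avoids writing out all 50 values, which is the convenience the slide refers to.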

  19. Class work: • Do Ex. 13 and 15 p.62.

  20. The Median • The Median is the midpoint of the values after they have been ordered from the smallest to the largest. • There are as many values above the median as below it in the data array. • For an even number of values, the median will be the arithmetic average of the two middle numbers.

  21. Properties of the Median • There is a unique median for each data set. • It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur. • It can be computed for ratio-level, interval-level, and ordinal-level data – i.e. for all types of data except nominal.

  22. EXAMPLES - Median The heights of four basketball players, in inches, are: 76, 73, 80, 75. Arranging the data in ascending order gives: 73, 75, 76, 80. Thus the median is 75.5, the average of the two middle values 75 and 76. The ages for a sample of five college students are: 21, 25, 19, 20, 22. Arranging the data in ascending order gives: 19, 20, 21, 22, 25. Thus the median is 21.
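
The two median examples can be reproduced with a small sketch. The helper `median_of` is a hypothetical name introduced here to make the sort-then-pick-the-middle steps explicit; Python's built-in `statistics.median` is used only as a cross-check.

```python
from statistics import median

def median_of(values):
    """Order the values, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

heights = [76, 73, 80, 75]      # even number of values
ages = [21, 25, 19, 20, 22]     # odd number of values

median_height = median_of(heights)
median_age = median_of(ages)

print(median_height, median_age)
print(median(heights), median(ages))   # cross-check with the standard library
```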

  23. The Mode • The mode is the value of the observation that appears most frequently.

  24. Example - Mode

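  25. Another example - Mode Data can have more than one mode. If it has two modes, it is referred to as bimodal, three modes, trimodal, and the like. Example: The exam scores for 12 students are: 93, 68, 75, 81, 68, 81, 81, 68, 81, 87, 93, 93. Here 81 occurs four times, and 68 and 93 occur three times each. Strictly speaking, 81 alone is the most frequent value, but the three peaks are so close that, most probably, you will decide there are 3 modes here.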

  26. A third example - Mode Data may have no clearly expressed mode. Example: The exam scores for 10 students are: 93, 68, 75, 81, 87, 93, 68, 81, 87, 84. You may decide you see 2 modes here, but you may as well decide you see none.
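
Counting how often each score occurs makes the judgment about modes easier. The sketch below tabulates both sets of exam scores; `Counter` and `statistics.multimode` (Python 3.8+) are tools chosen here for illustration. Note that `multimode` applies the strict definition (only the values tied for the highest count), which can differ from the informal reading of "several peaks" used in the slides.

```python
from collections import Counter
from statistics import multimode

scores_12 = [93, 68, 75, 81, 68, 81, 81, 68, 81, 87, 93, 93]
scores_10 = [93, 68, 75, 81, 87, 93, 68, 81, 87, 84]

# Frequency of each distinct score.
freq_12 = Counter(scores_12)
freq_10 = Counter(scores_10)

# Values tied for the highest frequency, in order of first appearance.
strict_modes_12 = multimode(scores_12)
strict_modes_10 = multimode(scores_10)

for score, count in sorted(freq_12.items()):
    print(score, count)
print(strict_modes_12, strict_modes_10)
```

In the second data set four different scores each occur twice, which is one way to see why the mode is not clearly expressed there.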

  27. Mean, Median, Mode Using Excel Table 2–4 in Chapter 2 shows the prices of the 80 vehicles sold last month at Whitner Autoplex in Raytown, Missouri. Determine the mean and the median selling price. There are 80 vehicles in the study, so doing the calculations with a calculator would be tedious and prone to error. The mean and the median selling prices are reported in the following Excel output.

  28. Mean, Median, Mode Using Excel For explanations see p. 95 See file Whitner-2005.xls

  29. The Relative Positions of the Mean, Median and the Mode

  30. Class work: • Do Ex. 21 p.65.

  31. Dispersion Why Study Dispersion? • A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data. • For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth. • A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.

  32. Samples of Dispersions

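  33. Measures of Dispersion • Range: Range = Largest value - Smallest value • Mean Deviation: \( MD = \frac{\sum |X - \bar{X}|}{n} \) (if population, replace n by N) • Variance and Standard Deviation: \( \sigma^2 = \frac{\sum (X - \mu)^2}{N} \), \( \sigma = \sqrt{\sigma^2} \) (for population; similar expressions exist for a sample)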

  34. EXAMPLE – Range The numbers of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the range for the number of cappuccinos sold. Range = Largest – Smallest value = 80 – 20 = 60

  35. EXAMPLE – Mean Deviation The numbers of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the mean deviation for the number of cappuccinos sold.
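
The range and mean deviation for the cappuccino data can be worked out step by step. This is a minimal sketch that takes the mean deviation to be the average absolute distance from the arithmetic mean; the variable names are illustrative.

```python
cappuccinos = [20, 40, 50, 60, 80]
n = len(cappuccinos)

# Range: largest value minus smallest value.
data_range = max(cappuccinos) - min(cappuccinos)

# Mean deviation: average of the absolute deviations from the mean.
sample_mean = sum(cappuccinos) / n
absolute_deviations = [abs(x - sample_mean) for x in cappuccinos]
mean_deviation = sum(absolute_deviations) / n

print(data_range, sample_mean, absolute_deviations, mean_deviation)
```

The absolute values matter: without them the deviations would cancel and sum to zero, as noted in the properties of the mean.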

  36. EXAMPLE – Variance and Standard Deviation The number of traffic citations issued during the last five months in Beaufort County, South Carolina, is 38, 26, 13, 41, and 22. We consider just these five months, so the data are a population. What is the population variance? The standard deviation:
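
For the traffic citation data, treated as a population, the variance is the mean of the squared deviations from the population mean (dividing by N). The sketch below spells that out and uses `statistics.pvariance` only as a cross-check; the variable names are illustrative.

```python
from math import sqrt
from statistics import pvariance

citations = [38, 26, 13, 41, 22]
N = len(citations)

mu = sum(citations) / N                               # population mean
squared_deviations = [(x - mu) ** 2 for x in citations]
population_variance = sum(squared_deviations) / N     # divide by N
population_sd = sqrt(population_variance)

print(mu, population_variance, round(population_sd, 2))
print(pvariance(citations))                           # standard-library cross-check
```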

  37. EXAMPLE – Sample Variance and SD The hourly wages for a sample of part-time employees at Home Depot are: $12, $20, $16, $18, and $19. What is the sample variance? The sample standard deviation:
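
For a sample, the usual formula divides the sum of squared deviations by n - 1 instead of n. The sketch below applies that to the hourly wage data, assuming this n - 1 form is the "similar expression for a sample" the slides refer to; `statistics.variance` serves as a cross-check.

```python
from math import sqrt
from statistics import variance

wages = [12, 20, 16, 18, 19]    # hourly wages in dollars
n = len(wages)

sample_mean = sum(wages) / n
sum_of_squares = sum((x - sample_mean) ** 2 for x in wages)
sample_variance = sum_of_squares / (n - 1)   # divide by n - 1 for a sample
sample_sd = sqrt(sample_variance)

print(sample_mean, sample_variance, round(sample_sd, 2))
print(variance(wages))                       # standard-library cross-check
```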

  38. Class work: • Do Ex. 51 p.81.

  39. Chebyshev’s Theorem The arithmetic mean biweekly amount contributed by the Dupree Paint employees to the company’s profit-sharing plan is $51.54, and the standard deviation is $7.51. At least what percent of the contributions lie within plus 3.5 standard deviations and minus 3.5 standard deviations of the mean?
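
Chebyshev's theorem states that, for any set of observations, the proportion of values lying within k standard deviations of the mean is at least 1 - 1/k², for any k greater than 1. A minimal sketch for the Dupree Paint question follows; the function name `chebyshev_lower_bound` is hypothetical.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

mean_contribution = 51.54   # dollars
sd_contribution = 7.51      # dollars
k = 3.5

proportion = chebyshev_lower_bound(k)
low = mean_contribution - k * sd_contribution
high = mean_contribution + k * sd_contribution

print(round(proportion * 100, 1), round(low, 3), round(high, 3))
```

So at least about 92% of the contributions lie between roughly $25.26 and $77.83, whatever the shape of the distribution.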

  40. The Empirical Rule

  41. Class work: • Do Ex. 56 p.83

  42. The Arithmetic Mean of Grouped Data

  43. The Arithmetic Mean of Grouped Data Because we replace each individual value Xi by the midpoint of the class to which this value belongs, the resulting mean is not exact. (By grouping the data, we lost some information). But it won’t be far off.
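
The grouped-data mean can be sketched as a weighted mean of the class midpoints, with the class frequencies as weights. The frequency table below is hypothetical, invented only to illustrate the calculation; it is not the Whitner Autoplex data.

```python
# Hypothetical frequency table: (lower limit, upper limit, frequency).
classes = [(10, 20, 3), (20, 30, 5), (30, 40, 2)]

def grouped_mean(classes):
    """Approximate the mean by replacing each value with its class midpoint."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n

approximate_mean = grouped_mean(classes)
print(approximate_mean)
```

Because every value in a class is treated as if it sat exactly at the midpoint, the result is an approximation of the mean of the raw data.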

  44. The Arithmetic Mean of Grouped Data - Example Recall in Chapter 2, we constructed a frequency distribution for the vehicle selling prices. The information is repeated below. Determine the arithmetic mean vehicle selling price.

  45. The Arithmetic Mean of Grouped Data - Example If you open the Excel file Whitner-2005.xls and calculate the exact average, you’ll obtain $23218.1625. By grouping the data, you are off by 0.6%.

  46. Standard Deviation of Grouped Data

  47. Standard Deviation of Grouped Data - Example Refer to the frequency distribution for the Whitner Autoplex data used earlier. Compute the standard deviation of the vehicle selling prices (in thousands). If you open the Excel file Whitner-2005.xls and calculate the exact standard deviation, you’ll obtain $4354.438. By grouping the data, you are off by 1.1%.
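
A grouped-data standard deviation can be sketched the same way as the grouped mean: each value is replaced by its class midpoint M, and the squared deviations are weighted by the class frequency f. The sketch assumes the sample form \( s = \sqrt{\sum f (M - \bar{X})^2 / (n - 1)} \), and the frequency table is hypothetical, not the Whitner Autoplex data.

```python
from math import sqrt

# Hypothetical frequency table: (lower limit, upper limit, frequency).
classes = [(10, 20, 3), (20, 30, 5), (30, 40, 2)]

def grouped_sample_sd(classes):
    """Approximate sample standard deviation from class midpoints and frequencies."""
    n = sum(f for _, _, f in classes)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    freqs = [f for _, _, f in classes]
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    sum_of_squares = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
    return sqrt(sum_of_squares / (n - 1))   # n - 1: sample formula (assumption)

approximate_sd = grouped_sample_sd(classes)
print(round(approximate_sd, 3))
```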

  48. Class Exercises • ex. 75, 77, 82

  49. Homework Chapter 3 • First part 64,68,72,76,78,80,81(only a and b),87 • Second part 81d,83

  50. End of Chapter 3
