Summary Measures of Ungrouped Numerical Data

Comparing Data Sets with respect to:
• Location
• Variability (Dispersion)
• Shape

Calculation and Properties of Some Measures of Location of Ungrouped Numerical Data

Location Measures
• Measures of Central Tendency
• Measures of Noncentral Tendency

Find the following measures of LOCATION.

Central Tendency Measures:
1) mean (X-bar)
2) median (M)
3) mode

Noncentral Tendency Measures (Quantiles or Fractiles):
1) First Quartile (Q1)
2) Third Quartile (Q3)
3) Sixth Decile (D6)
4) Eighty-third percentile (P83)

First, the mean.

Calculation of the Sample Mean
The following sample (n = 7) of data contains the number of years that a company was in business. For the small sample (n = 7), the mean will be calculated. The symbol used for the sample mean is called X-bar. For simplicity, the formula will be written as follows:
X-bar = Σx / n
That is, the sample mean is the sum of the sample values divided by the sample size.
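The calculation can be sketched in a few lines of Python. The slide's data table is not part of this transcript, so the seven values in `years` are an assumption, chosen only so that they reproduce the summary values quoted on the slides (for example, X-bar = 9). The function name `sample_mean` is likewise just illustrative.

```python
# Hypothetical sample (n = 7): the real data table is not in the transcript.
# These values are an assumption chosen to match the stated result X-bar = 9.
years = [10, 3, 14, 6, 3, 15, 12]


def sample_mean(values):
    """Return the sum of the values divided by the sample size n."""
    if not values:
        raise ValueError("the sample must contain at least one value")
    return sum(values) / len(values)


x_bar = sample_mean(years)
print(x_bar)
```

Only the sum of the values and the sample size are needed, which is why the order of the raw data does not matter here.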

Properties of the Sample Mean
The following sample (n = 7) of data contains the number of years that a company was in business. For the small sample (n = 7), X-bar = 9. The mean is a measure of central tendency (the middle).
Q: In what “sense” is the mean in the middle of a set of data?
A: It has to do with the deviations from the mean. The mean is the place where the deviations “balance.” This is easier to see by examining the dot plot.

Properties of the Sample Mean
The following sample (n = 7) of data contains the number of years that a company was in business. Deviations from the mean: the deviations below the mean sum to -15, and the deviations above the mean sum to +15. Deviations below the mean “balance” the deviations above the mean. The sum of the deviations from the mean EQUALS zero!

Properties of the Sample Mean
The following sample (n = 7) of data contains the number of years that a company was in business. Note that the order does not matter. The sum of the deviations from the sample mean is ALWAYS zero!
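The balancing property can be checked numerically. This is a minimal sketch using a hypothetical sample: the values in `years` are an assumption (the slide's data table is not in the transcript), picked so that the mean is 9 and the deviations sum to -15 below and +15 above, as stated on the slide.

```python
# Hypothetical sample (n = 7), assumed for illustration only.
years = [10, 3, 14, 6, 3, 15, 12]

x_bar = sum(years) / len(years)

# Deviation of each value from the sample mean.
deviations = [x - x_bar for x in years]

below = sum(d for d in deviations if d < 0)  # total of negative deviations
above = sum(d for d in deviations if d > 0)  # total of positive deviations
total = sum(deviations)

# Reordering the sample does not change the result.
total_sorted = sum(x - x_bar for x in sorted(years))

print(below, above, total, total_sorted)
```

With values that are not exactly representable in floating point, the total may come out as a tiny nonzero number rather than exactly zero; that is rounding error, not a failure of the property.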

Properties of the Sample Mean
The sample mean is in the “middle” of a sample because the sum of the deviations from the sample mean equals zero. In order for the mean to represent the “middle,” the difference, x - X-bar, must be well defined for each value of x. In other words, for the mean to be a useful calculation, the data must be of at least interval level. Next, a LARGE sample is considered.

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business.
Q: What does one need to know to find the mean of the sample above?
A: The sum of the values and the sample size.

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business. If the sum of the values is Σx, then X-bar = Σx / 120. What is the “typical” number of years by which a value in the sample differs from the mean?

Next, the median.

Determination of the Sample Median
The following sample (n = 7) of data contains the number of years that a company was in business. For the small sample (n = 7), the median will be found. The symbol used for the sample median is M. The median of the sample is a value such that the number of sample values less than or equal to it is the same as the number of sample values greater than or equal to it. This is the “sense” in which the median is in the “middle” of the sample. To determine the value of the median, the values in the sample must be sorted, so use the ordered array. In this example, it is easy to see that the median (M) is 10.

Determination of the Sample Median
If there is an odd number of values in the sample, there will be a value in the middle (when sorted). That value is the median. If there is an even number of values in the sample, then any value that is between the two values “in the middle” (when sorted) could serve as the median. However, the convention is to go halfway between those two middle values.
For example, remove a 3 from the sample. The value 10.25 could be used as the median. The value 11.75 could be used as the median. However, the convention is to use (10 + 12)/2 = 11 as the value of the median.

Determination of the Sample Median
General process for determining the median of a sample:
• Form the ordered array.
• Calculate the position of the median, (n + 1)/2.
• Find the sample value at that position (if n is odd), or average the two values on each side of that position (if n is even).
Note: Later on, when finding quantiles (fractiles), a similar process will be used.
Next, a LARGE sample is considered.
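The general process above can be sketched as follows. The position rule (n + 1)/2 is the usual convention and agrees with the positions shown for the large sample; the function name `sample_median` and the values in `years` are assumptions for illustration, since the slide's data table is not in the transcript.

```python
def sample_median(values):
    """Median via the ordered array and the position (n + 1) / 2."""
    if not values:
        raise ValueError("the sample must contain at least one value")
    ordered = sorted(values)        # step 1: form the ordered array
    n = len(ordered)
    position = (n + 1) / 2          # step 2: 1-based position of the median
    lower = int(position)           # whole part of the position
    if n % 2 == 1:
        # n odd: the position is a whole number, take the value there
        return ordered[lower - 1]
    # n even: the position ends in .5, average the values on each side
    return (ordered[lower - 1] + ordered[lower]) / 2


# Hypothetical sample (n = 7), assumed so that the median is 10.
years = [10, 3, 14, 6, 3, 15, 12]
print(sample_median(years))

# Remove one 3 (n = 6): the two middle values are 10 and 12.
years_without_a_3 = [10, 14, 6, 3, 15, 12]
print(sample_median(years_without_a_3))
```

The subtraction of 1 when indexing is only there because Python lists are 0-based while the slide positions are 1-based.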

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business.
Q: What does one need to know to find the median of the sample above?
A: The ordered array and the sample size (n).

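A Large Sample Example (presented as an Ordered Array)
The following sample (n = 120) of data contains the number of years that a company was in business. Position 60. Position 61. With n = 120, the position of the median is (120 + 1)/2 = 60.5, so the median is the average of the values at positions 60 and 61.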

Next, the mode.

Determination of the Sample Mode
The following sample (n = 7) of data contains the number of years that a company was in business. For the small sample (n = 7), the mode will be found. The mode of the sample is the value that occurs most frequently. It is in the “middle” in a probabilistic sense: if one value is selected randomly from the sample, which value is most likely to be selected? (In a normal distribution, mean = median = mode.) It is easier to determine the value of the mode from the ordered array (below) than from the raw data (above). In this example, it is easy to see that the mode is 3. Next, a LARGE sample is considered.
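Counting how often each value occurs is one way to find the mode without scanning the ordered array by eye. The sketch below uses `collections.Counter` from the Python standard library; the function name `sample_modes` and the values in `years` are assumptions (the slide's data table is not in the transcript), chosen so that the mode is 3. Returning a list is a design choice of this sketch, so that ties are reported rather than hidden.

```python
from collections import Counter


def sample_modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        raise ValueError("the sample must contain at least one value")
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())        # frequency of the most common value
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical sample (n = 7), assumed so that 3 occurs most often.
years = [10, 3, 14, 6, 3, 15, 12]
print(sample_modes(years))
```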

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business.
Q: What does one need to know to find the mode of the sample above?
A: It is a lot easier to find from the ordered array.

A Large Sample Example (presented as an Ordered Array)
The following sample (n = 120) of data contains the number of years that a company was in business. Mode = 4. The value 4 occurs most frequently (17 times). The mode is even easier to find from the dot plot.

Dot Plot of Years Data for Large Sample (n = 120)

Quasi-Normal Dot Plot: mean = median = mode = 15

Next, the noncentral tendency measures. Note: The process is essentially the same as that for finding the median.

Concept of the Quantile (Fractile)
[Diagram: the ordered sample laid out from position 1 (smallest value) to position n (largest value), with the quantile marked on it.]
• M = Median: 1/2 of the values ≤ M; 1/2 of the values ≥ M
• Q1 = First Quartile: 1/4 of the values ≤ Q1; 3/4 of the values ≥ Q1
• Q3 = Third Quartile: 3/4 of the values ≤ Q3; 1/4 of the values ≥ Q3
• D6 = Sixth Decile: 6/10 of the values ≤ D6; 4/10 of the values ≥ D6
• P83 = 83rd Percentile: 83% of the values ≤ P83; 17% of the values ≥ P83

Determination of Quantiles (Fractiles)
General process for determining a quantile (fractile) of a sample:
• Form the ordered array.
• Calculate the position of the quantile (fractile).
• Find the sample value closer to that position (if the fractional part of the position is NOT .5), or average the two values on each side of that position (if the fractional part is .5).
Next, the process is shown for a LARGE sample.
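The general process above can be sketched in Python. The slides do not spell out the position formula, so this sketch assumes position = k(n + 1)/parts, which matches the median position (n + 1)/2 and the Q1 example for n = 120 (a position between 30 and 31, closer to 30). The function name `sample_quantile`, its parameters `k` and `parts`, and the demonstration data are all hypothetical. Integer arithmetic is used so that the ".5" test is exact.

```python
def sample_quantile(values, k, parts):
    """k-th quantile when the ordered sample is split into `parts` parts.

    Examples: Q1 -> (1, 4), Q3 -> (3, 4), D6 -> (6, 10), P83 -> (83, 100).
    Assumes the position rule k * (n + 1) / parts (not stated on the slides).
    """
    if not values or not 0 < k < parts:
        raise ValueError("need a non-empty sample and 0 < k < parts")
    ordered = sorted(values)                 # step 1: ordered array
    n = len(ordered)
    # Step 2: position as whole part plus remainder (exact, no floats).
    whole, remainder = divmod(k * (n + 1), parts)
    if 2 * remainder == parts:
        # Fraction is exactly .5: average the values on each side.
        low = max(whole, 1)
        high = min(whole + 1, n)
        return (ordered[low - 1] + ordered[high - 1]) / 2
    # Otherwise take the value at the closer position, kept within 1..n.
    nearest = whole if 2 * remainder < parts else whole + 1
    nearest = min(max(nearest, 1), n)
    return ordered[nearest - 1]


# Hypothetical data: the values 1..120, so each value equals its position.
demo = list(range(1, 121))
print(sample_quantile(demo, 1, 4))      # Q1
print(sample_quantile(demo, 3, 4))      # Q3
print(sample_quantile(demo, 6, 10))     # D6
print(sample_quantile(demo, 83, 100))   # P83
```

Because each value in `demo` equals its own position, the printed numbers are simply the positions the rule selects. Statistical packages use several different quantile conventions, so their results may differ slightly from this rule.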

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business.
Q: What does one need to know to find the 1st quartile of the sample?
A: The ordered array and the sample size (n).

A Large Sample Example (presented as an Ordered Array)
The following sample (n = 120) of data contains the number of years that a company was in business. Position 30. Position 31. The position of Q1 falls between 30 and 31 and is closer to 30 than to 31, so the value of Q1 is the value at position 30.

A Large Sample Example (presented as Raw Data)
The following sample (n = 120) of data contains the number of years that a company was in business.
Q: What does one need to know to find the 3rd quartile of the sample?
A: The ordered array and the sample size (n).
