Chapter 3 Descriptive Statistics: Numerical Measures, Part A • Measures of Location • Central Tendency Measures • Percentiles and Quartiles
Mean • The mean of a data set is the average of all the data values. • The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.
Sample Mean • $\bar{x} = \frac{\sum x_i}{n}$, where $\sum x_i$ is the sum of the values of the $n$ observations and $n$ is the number of observations in the sample.
Population Mean • $\mu = \frac{\sum x_i}{N}$, where $\sum x_i$ is the sum of the values of the $N$ observations and $N$ is the number of observations in the population.
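The mean formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sample_mean` and the seven data values are assumptions chosen for the example. The population mean uses the same arithmetic, with all N population values in place of the n sampled ones.

```python
def sample_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)

# Illustrative data only (not the apartment rents from the slides).
observations = [26, 18, 27, 12, 14, 27, 19]
x_bar = sample_mean(observations)
print(x_bar)  # 143 / 7, roughly 20.43
```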
Sample Mean • Example: Apartment Rents Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed in ascending order on the next slide.
Considerations in Using the Mean • Requires interval- or ratio-level data • Influenced by extreme scores • Balancing point of the distribution (positive and negative deviations cancel out) • Minimizes the “sum of squares” (the sum of squared deviations around the mean is smaller than around any other number) • The mean is our “best guess” or estimate: it minimizes squared errors in prediction
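Three of these properties can be checked numerically. The sketch below uses a small illustrative data set and a hypothetical helper `sum_of_squares` (neither comes from the slides) to show that deviations around the mean cancel, that the sum of squares is smaller at the mean than at another center, and that a single extreme score moves the mean noticeably.

```python
def sum_of_squares(values, center):
    """Sum of squared deviations of the values around a chosen center."""
    return sum((x - center) ** 2 for x in values)

scores = [12, 14, 18, 19, 26, 27, 27]  # illustrative data
mean_score = sum(scores) / len(scores)

# Balancing point: positive and negative deviations cancel (up to rounding).
deviation_total = sum(x - mean_score for x in scores)

# Least squares: a different center (here the median, 19) gives a larger sum.
ss_at_mean = sum_of_squares(scores, mean_score)
ss_at_median = sum_of_squares(scores, 19)

# Sensitivity to extreme scores: one large value pulls the mean upward.
with_outlier = scores + [200]
mean_with_outlier = sum(with_outlier) / len(with_outlier)

print(deviation_total, ss_at_mean, ss_at_median, mean_with_outlier)
```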
Median • The median is the middle score, or midpoint, of a set of scores when the scores are arranged in order. • For an odd number of observations, the median is the middle value. • Data (7 observations): 26 18 27 12 14 27 19 • In ascending order: 12 14 18 19 26 27 27 • Median = 19
Median • For an even number of observations, the median is the average of the middle two values. • Data (8 observations): 26 18 27 12 14 27 30 19 • In ascending order: 12 14 18 19 26 27 27 30 • Median = (19 + 26)/2 = 22.5
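The odd and even cases can be sketched as one small function. The name `median_value` is hypothetical; the two data sets are the seven and eight observations shown on the median slides.

```python
def median_value(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [26, 18, 27, 12, 14, 27, 19]
even_data = [26, 18, 27, 12, 14, 27, 30, 19]
print(median_value(odd_data))   # 19
print(median_value(even_data))  # 22.5
```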
Median • Apartment rents example (n = 70): averaging the 35th and 36th data values, Median = (475 + 475)/2 = 475
Considerations in using Median • Requires ordinal level or higher • Not sensitive to extreme scores – good for skewed distributions • Examples: annual income and property values
Mode • The mode of a data set is the value that occurs with greatest frequency. • The greatest frequency can occur at two or more different values. • If the data have exactly two modes, the data are bimodal. • If the data have more than two modes, they are multimodal.
Mode 450 occurred most frequently (7 times) Mode = 450
Considerations in Using Mode • May be used at any level of measurement • May not imply “majority” or “most” • The “most common” score or value may not be representative of most cases
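A minimal sketch of finding the mode, allowing for ties, is shown below. The helper name `modes` and all three data sets are illustrative assumptions. The last call uses category labels to show that the mode works at the nominal level, and that the most common value (2 of 5 cases there) need not be a majority.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come back in data order.
    return [value for value, count in counts.items() if count == top]

print(modes([26, 18, 27, 12, 14, 27, 19]))              # [27], one mode
print(modes([1, 1, 2, 2, 3]))                           # [1, 2], bimodal
print(modes(["rent", "own", "rent", "other", "dorm"]))  # ['rent']
```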
Percentiles • The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more. • Admission test scores for colleges and universities are frequently reported in terms of percentiles.
Percentiles • Step 1: Arrange the data in ascending order. • Step 2: Compute the index $i = (p/100)n$, the position of the pth percentile. • Step 3a: If $i$ is not an integer, round up; the pth percentile is the value in the ith position. • Step 3b: If $i$ is an integer, the pth percentile is the average of the values in positions $i$ and $i + 1$.
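The procedure above can be sketched as follows. This is one way to code the slide's rule, not a standard library routine; other software may use different percentile conventions and return different values. The function name `percentile` is hypothetical, and the sketch assumes p is a whole number strictly between 0 and 100 so that the index test can use exact integer arithmetic.

```python
def percentile(values, p):
    """pth percentile using the rule i = (p/100)n described on the slide."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    whole, remainder = divmod(p * n, 100)   # i = whole + remainder/100
    if remainder != 0:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in Python's 0-based list.
        return ordered[whole]
    # i is an integer: average positions i and i + 1 (1-based).
    return (ordered[whole - 1] + ordered[whole]) / 2

data = [26, 18, 27, 12, 14, 27, 30, 19]  # illustrative, n = 8
print(percentile(data, 50))  # 22.5, same as the median
print(percentile(data, 90))  # 30, since i = 7.2 rounds up to position 8
```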
90th Percentile • i = (p/100)n = (90/100)(70) = 63 • Because i is an integer, average the 63rd and 64th data values: 90th Percentile = (580 + 590)/2 = 585
90th Percentile • “At least 90% of the items take on a value of 585 or less.” 63/70 = .9 or 90% • “At least 10% of the items take on a value of 585 or more.” 7/70 = .1 or 10%
Quartiles • Quartiles are specific percentiles. • First Quartile = 25th Percentile • Second Quartile = 50th Percentile = Median • Third Quartile = 75th Percentile
Third Quartile • Third quartile = 75th percentile • i = (p/100)n = (75/100)(70) = 52.5, which rounds up to 53 • Third quartile = value in the 53rd position = 525
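The index arithmetic for the 70 apartment rents can be checked with a short sketch. The helper `percentile_positions` is hypothetical and returns only the 1-based positions to read from the sorted data, since the rent values themselves are not listed here. It assumes p is a whole number strictly between 0 and 100.

```python
def percentile_positions(p, n):
    """1-based position(s) in the sorted data that determine the pth percentile."""
    whole, remainder = divmod(p * n, 100)   # i = (p/100)n, kept exact
    if remainder == 0:
        return (whole, whole + 1)   # i is an integer: average these two values
    return (whole + 1,)             # i is not an integer: round up

n = 70
print(percentile_positions(25, n))  # (18,)     first quartile
print(percentile_positions(50, n))  # (35, 36)  median
print(percentile_positions(75, n))  # (53,)     third quartile
print(percentile_positions(90, n))  # (63, 64)  90th percentile
```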