
Section 3.3

Section 3.3. Measures of Relative Position. With some added content by D.R.S., University of Cordele. Measures of Relative Position. “How do I compare with everybody else?” nth place Percentiles Given percentile P, find data value there. Given data value, what’s its percentile? Quartiles


Presentation Transcript


  1. Section 3.3 Measures of Relative Position With some added content by D.R.S., University of Cordele

  2. Measures of Relative Position “How do I compare with everybody else?” nth place Percentiles • Given percentile P, find data value there. • Given data value, what’s its percentile? Quartiles Five Number Summary and the Box Plot diagram Standard Score (also known as z-score) Outliers

  3. Nth Place The highest and the lowest 2nd highest, 3rd highest, etc. “I earned $41,246. I’m in ___th place out of ___.”

  4. Percentiles “My salary is the same or higher than ____% of the population.” If your test score were at this percentile, would you consider it good, bad, or something else? 90th percentile is _______________ 70th percentile is _______________ 40th percentile is _______________ 10th percentile is _______________

  5. “What is the data value at the Pth percentile?” Formula: Location $L = n \cdot \frac{P}{100}$ ($n$ = number of data values). Your data values must be in order from lowest to highest. If $L$ is an exact integer, take the average of the values in positions $L$ and $L + 1$. If $L$ is not an exact integer, bump it up to the next whole number (never round down) and take the value in that position.
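The location rule can be sketched in Python. This is a minimal illustration, not the textbook's own code: the function name `value_at_percentile` and the sample list `scores` are hypothetical, and it assumes the location formula L = n * P / 100 with P a whole number between 0 and 100.

```python
def value_at_percentile(sorted_values, p):
    """Return the value at the p-th percentile (p a whole number, 0 < p < 100).

    Assumes sorted_values is already in ascending order.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    n = len(sorted_values)
    # Location L = n * P / 100, kept as a whole part and a remainder.
    whole, remainder = divmod(n * p, 100)
    if remainder == 0:
        # Exact integer: average the values in positions L and L + 1 (1-based).
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Not an integer: bump up to position whole + 1 (1-based), i.e. index `whole`.
    return sorted_values[whole]


# Hypothetical sample data, not from the slides.
scores = [55, 61, 64, 70, 72, 75, 78, 81, 88, 93]
print(value_at_percentile(scores, 30))  # L = 3 exactly, so average positions 3 and 4
print(value_at_percentile(scores, 25))  # L = 2.5, so bump up to position 3
```

Integer `divmod` is used instead of floating-point division so that the "exact integer" test cannot be thrown off by rounding error.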

  6. Example 3.18: Finding Data Values Given the Percentiles A car manufacturer is studying the highway miles per gallon (mpg) for a wide range of makes and models of vehicles. The stem-and-leaf plot on the next slides contains the average highway mpg for each of the 135 different vehicles the manufacturer tested. a. Find the value of the 10th percentile. b. Find the value of the 20th percentile.

  7. Example 3.18 data as a stem-and-leaf diagram KEY: 12 | 1 means 12.1 mpg.

  8. Example 3.18: (a) Find the mpg value for the 10th percentile a. There are ____ values in this data set, thus n = ___. We want the 10th percentile, so P = ___. Is it an exact integer? No. ALWAYS BUMP UP, so take the data value in position # ______, which is ______ mpg. Answer: “The 10th percentile is ___ mpg.”

  9. Example 3.18: (b) Find the mpg value for the 20th percentile b. There are ____ values in this data set, thus n = ___. We want the 20th percentile, so P = ___. Location L = ______ (your calculation here). Is it an exact integer? ________. So take the data values in positions # ______ and # ______, and average them. Answer: “The 20th percentile is ___ mpg.”

  10. If you know the value, what’s its percentile? Pth Percentile of a Data Value Figure out how many values are ≤ your data value. Formula: $P = \frac{L}{n} \cdot 100$, where P is the percentile rounded to the nearest whole number, L is the number of values in the data set less than or equal to the given value, and n is the number of data values in the data set. For this formula, always round in the usual way (5 or higher rounds up; 4 or lower rounds down).
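A short Python sketch of this direction of the problem, under the assumption that the formula is P = (L / n) * 100 with ordinary round-half-up rounding. The function names are hypothetical. The call at the end uses L = 49 and n = 135, the figures quoted in Example 3.19.

```python
def percentile_from_count(count_le, n):
    """Percentile P = (L / n) * 100, rounded half up to a whole number."""
    # Integer form of floor(100 * L / n + 0.5). This avoids Python's round(),
    # which rounds exact halves to the nearest even number instead of up.
    return (200 * count_le + n) // (2 * n)


def percentile_of_value(values, x):
    """Percentile of x within values, counting values less than or equal to x."""
    count_le = sum(1 for v in values if v <= x)
    return percentile_from_count(count_le, len(values))


print(percentile_from_count(49, 135))  # Example 3.19: L = 49, n = 135
```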

  11. Example 3.19: Finding the Percentile of a Given Data Value In the data set from the previous example, the Nissan Xterra averaged 21.1 mpg. In what percentile is this value? Solution We begin by making sure that the data are in order from smallest to largest. We know from the previous example that they are, so we can proceed with the next step.

  12. Example 3.19: Finding the Percentile of a Given Data Value (cont.) The Xterra’s value of 21.1 mpg is repeated in the data set, in both the 48th and 49th positions, so we pick the one with the largest location value, which is the 49th. Using a sample size of n = 135 and a location of L = 49, we can substitute these values into the formula for the percentile of a given data value, which gives us the following: $P = \frac{49}{135} \cdot 100 \approx 36.296$

  13. Example 3.19: Finding the Percentile of a Given Data Value (cont.) Since we always need to round a percentile to a whole number, we round 36.296 to 36. Thus, approximately 36% of the data values are less than or equal to the Xterra’s mpg rating. That is, 21.1 mpg is in the 36th percentile of the data set. • Avoid this common error: • If your answer is “36%”, you are WRONG. • The correct answer is “The 36th Percentile”. • Percents and Percentiles are related, sure. • But good grammar and proper usage matter.

  14. Summary: Two kinds of percentile problems They give you the percentile, you find the data value • Formula: position = L = n * P / 100 • If there is a remainder, bump up to the next whole number and take the data value in that position • If exact whole number, take the average of the Lth and (L + 1)th values They give you the data value, you find its percentile • L = how many data values are ≤ your value • P = (L / n) * 100 and do traditional rounding

  15. Excel gives different answers Excel does some fancy interpolation

  16. Quartiles Q1 = First Quartile: 25% of the data are less than or equal to this value. Q2 = Second Quartile: 50% of the data are less than or equal to this value. Q3 = Third Quartile: 75% of the data are less than or equal to this value.

  17. Example 3.20: Finding the Quartiles of a Given Data Set – TWO DIFFERENT WAYS Using the mpg data from the previous examples, find the quartiles. a. Use the percentile method to find the quartiles. b. Use the approximation method to find the quartiles. c. How do these values compare? Solution The data are already in order from smallest to largest. We also know that n = 135.

  18. Example 3.20: Finding the Quartiles of a Given Data Set with the Percentile Method Percentile Method First quartile is the 25th percentile: Position L = 135 * 25 / 100 = 33.75, bump up to 34. Second quartile is the 50th percentile: Position L = 135 * 50 / 100 = 67.5, bump up to 68. Third quartile is the 75th percentile: Position L = 135 * 75 / 100 = 101.25, bump up to 102. Count up to 34th position: “Q1 is 19.8 mpg” Count up to 68th position: “Q2 is 23.6 mpg” “Median is 23.6 mpg” Count up to 102nd position: “Q3 is 25.3 mpg”
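The three positions used in this slide can be checked with a few lines of Python. This is only a sketch: `percentile_positions` is a hypothetical helper that returns the 1-based position (or pair of positions to average) implied by the location rule L = n * P / 100.

```python
def percentile_positions(n, p):
    """Return the 1-based position(s) to read for the p-th percentile of n values."""
    whole, remainder = divmod(n * p, 100)
    if remainder == 0:
        # Exact integer location: average positions L and L + 1.
        return (whole, whole + 1)
    # Otherwise bump up to the next whole position.
    return (whole + 1,)


for name, p in [("Q1", 25), ("Q2", 50), ("Q3", 75)]:
    print(name, percentile_positions(135, p))
```

With n = 135 the three quartile locations all have a remainder, so each one bumps up to a single position.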

  19. Example 3.20: Finding the Quartiles of a Given Data Set (cont.) Approximation Method (probably more common in this course, and also the same as the TI-84’s 1-Var Stats) First find the median; that is the same as Q2. Q1 = median of positions #1, 2, 3, …, 67. The median is at position #___, which is ____ mpg. Q3 = median of positions #69, 70, 71, …, 135. Whole data set: positions #1, 2, …, #_____ | position #____ is _____ mpg | positions #____, #____, …, 135. Lower half: positions #1, #2, …, #___ | position #___ is ______ mpg | positions #____, …, 66, 67. And of course there is the usual complication if there is an even number of values in a data set.

  20. Example 3.20: Finding the Quartiles of a Given Data Set (cont.) c. These two methods result in the same values, which are also the values given by a TI‑83/84 Plus calculator, as shown below. This will always be true for any data set with an even number of data values. For a data set with an odd number of data values (like this one), the larger the data set, the closer the approximations will be to the percentile method’s values.

  21. Example 3.21: Finding the Quartiles of a Given Data Set The following speeds of motorists (in mph) were obtained by a Highway Patrol officer on duty one weekend. Determine the quartiles of each data set using the approximation method. a. 60, 62, 63, 65, 65, 67, 70, 71, 71, 75, 78, 79, 80, 81 Q2 is ______ mph Q1 is ______ mph Q3 is ______ mph

  22. Example 3.21: Finding the Quartiles of a Given Data Set The following speeds of motorists (in mph) were obtained by a Highway Patrol officer on duty one weekend. Determine the quartiles of each data set using the approximation method. b. 59, 66, 67, 67, 72, 74, 75, 75, 75, 76, 78, 79, 80, 81, 85 Q2 is ______ mph Q1 is ______ mph Q3 is ______ mph
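A possible Python sketch of the approximation method (median of each half), applied to the two speed lists from Example 3.21. The function name `quartiles_approx` is hypothetical, and the sketch assumes that when the number of values is odd, the median itself is left out of both halves.

```python
from statistics import median


def quartiles_approx(values):
    """Return (Q1, Q2, Q3) using the median-of-halves approximation method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values below the median position
    upper = data[len(data) - half:]   # values above the median position
    return median(lower), median(data), median(upper)


speeds_a = [60, 62, 63, 65, 65, 67, 70, 71, 71, 75, 78, 79, 80, 81]
speeds_b = [59, 66, 67, 67, 72, 74, 75, 75, 75, 76, 78, 79, 80, 81, 85]
print(quartiles_approx(speeds_a))
print(quartiles_approx(speeds_b))
```

Slicing `half` values from each end handles both cases: with an even count the two halves cover everything, and with an odd count the middle value is skipped.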

  23. Quartiles in Excel =QUARTILE.INC(cells, 1 or 2 or 3) seems to give the same results as the old QUARTILE function. There’s a new =QUARTILE.EXC(cells, 1 or 2 or 3). Excel does fancy interpolation stuff and may give different Q1 and Q3 answers compared to the TI-84 and our by-hand methods.
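Python's standard library shows the same kind of disagreement. The sketch below runs `statistics.quantiles` with its two interpolation methods on the first speed list from Example 3.21. As far as I know, the `inclusive` and `exclusive` methods follow the same interpolation rules as QUARTILE.INC and QUARTILE.EXC respectively, but treat that correspondence as an assumption rather than something the slides state.

```python
from statistics import quantiles

speeds = [60, 62, 63, 65, 65, 67, 70, 71, 71, 75, 78, 79, 80, 81]

# n=4 asks for the three cut points that split the data into four groups.
inclusive = quantiles(speeds, n=4, method="inclusive")
exclusive = quantiles(speeds, n=4, method="exclusive")

print("inclusive:", inclusive)
print("exclusive:", exclusive)
```

Both methods agree on the median but interpolate Q1 and Q3 differently, which is why software answers can differ from by-hand answers on the same data.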

  24. Quintiles and Deciles You might also encounter • Quintiles, dividing data set into 5 groups. • Deciles, dividing data set into 10 groups. Reconcile everything back with percentiles: • Quartiles correspond to percentiles 25, 50, 75 • Deciles correspond to percentiles 10, 20, …, 90 • Quintiles correspond to percentiles 20, 40, 60, 80
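The correspondence with percentiles can be generated rather than memorized. A small sketch, with the hypothetical helper `cut_percentiles` and the assumption that the number of groups divides 100 evenly:

```python
def cut_percentiles(groups):
    """Percentiles that split a data set into `groups` equal-sized groups."""
    # Assumes `groups` divides 100 evenly (4, 5, 10, ...).
    return [100 * k // groups for k in range(1, groups)]


print("quartiles:", cut_percentiles(4))
print("quintiles:", cut_percentiles(5))
print("deciles:  ", cut_percentiles(10))
```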

  25. Example 3.22: Writing the Five-Number Summary of a Given Data Set Write the five-number summary for the data from Example 3.20 (the miles per gallon ratings). Solution The five-number summary is five numbers, separated by commas: minimum, Q1, median, Q3, maximum. For the mpg data, that is 12.1, 19.8, 23.6, 25.3, 35.9.

  26. Five-Number Summary and Box Plots Interquartile Range (IQR) The interquartile range is the range of the middle 50% of the data, given by IQR = Q3-Q1 where Q3 is the third quartile and Q1 is the first quartile. How “wide” is the “middle half” of the data set? For the vehicle mpg ratings example, IQR = _____ - _____ = _____ mpg
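As a quick check of the IQR arithmetic, here is a sketch that uses the five-number summary quoted for the mpg data in Example 3.23 (12.1, 19.8, 23.6, 25.3, 35.9). The variable names are illustrative only.

```python
# Five-number summary for the mpg data, as quoted in Example 3.23.
five_number_summary = (12.1, 19.8, 23.6, 25.3, 35.9)
minimum, q1, median, q3, maximum = five_number_summary

# Round to one decimal place to hide floating-point noise in the subtraction.
iqr = round(q3 - q1, 1)
print(f"IQR = {q3} - {q1} = {iqr} mpg")
```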

  27. Five-Number Summary and Box Plots Creating a Box Plot 1. Begin with a horizontal (or vertical) number line that contains the five-number summary. 2. Draw a small line segment above (or next to) the number line to represent each of the numbers in the five-number summary. 3. Connect the line segment that represents the first quartile to the line segment representing the third quartile, forming a box with the median’s line segment in the middle. 4. Connect the “box” to the line segments representing the minimum and maximum values to form the “whiskers.”

  28. Example 3.23: Creating a Box Plot Draw a box plot to represent the five-number summary from the previous example. Recall that the five-number summary was 12.1, 19.8, 23.6, 25.3, 35.9. Solution Step 1: Label the horizontal axis at even intervals.

  29. Example 3.23: Creating a Box Plot (cont.) Step 2: Place a small line segment above each of the numbers in the five‑number summary.

  30. Example 3.23: Creating a Box Plot (cont.) Step 3: Connect the line segment that represents Q1 to the line segment that represents Q3, forming a box with the median’s line segment in between.

  31. Example 3.23: Creating a Box Plot (cont.) Step 4: Connect the “box” to the line segments representing the minimum and maximum to form the “whiskers.”

  32. Example 3.24: Interpreting Box Plots The box plots below are from the US Geological Survey website. Use them to answer the following questions.

  33. Example 3.24: Interpreting Box Plots (cont.) Note: Box plots showing the distribution of average Spring (April and May) total phosphorus concentrations, for the years 1979 to 2008, for four of the five large subbasins that comprise the Mississippi-Atchafalaya River Basin. (The Lower Mississippi River subbasin was excluded due to the large errors in estimating the average concentrations.) Source: US Geological Survey. “2009 Preliminary Mississippi-Atchafalaya River Basin Flux Estimate.” US Department of the Interior. 2009. http://toxics.usgs.gov/hypoxia/mississippi/oct_jun/images/figure9.png (9 Aug. 2010).

  34. Example 3.24: Interpreting Box Plots (cont.) a. What do the top and bottom bars represent in these box plots according to the key? b. Which subbasin had the highest median average spring total phosphorus concentration? c. Which subbasin had the lowest average spring total phosphorus concentration? (Note: Each data value is an average of April’s and May’s totals, and the lowest average shown for each subbasin is the 10th percentile.) d. Which subbasin had the largest interquartile range?

  35. Example 3.24: Interpreting Box Plots (cont.) a. In each box plot, the top bar represents the 90th percentile of average spring total phosphorus concentration, and the bottom bar represents the 10th percentile. b. The subbasin with the highest median average spring total phosphorus concentration was the Missouri. c. The subbasin with the lowest average spring total phosphorus concentration was the Ohio/Tennessee. d. The subbasin with the largest interquartile range was the Missouri.

  36. Example T.2: Using a TI-83/84 Plus Calculator to Create a Box Plot Create a box plot, using a TI-83/84 Plus calculator, for the following data values, which represent the breathing rates for a sample of adults (in breaths per minute). 12, 14, 15, 19, 12, 10, 13, 19, 20, 12, 23 Solution Begin by entering the data in the first list, L1. Next, go to the STAT PLOTS menu by pressing 2ND and then Y=. Select option 1:Plot1; turn on Plot1 by highlighting On and pressing ENTER.

  37. Example T.2: Using a TI-83/84 Plus Calculator to Create a Box Plot (cont.) Then choose the second of the two box plot options. Next, enter L1 in the Xlist. Then press GRAPH. You should see the box plot shown in the following screenshot. If it does not appear that way at first, pressing ZOOM and choosing option 9:ZoomStat should correct the problem.

  38. Example T.2: Using a TI-83/84 Plus Calculator to Create a Box Plot (cont.) See also separate handout for TI-84 Box Plot. It discusses TRACE, WINDOW, and how to fix some common STAT PLOT problems.
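To check the calculator's box plot by hand, the five-number summary of the breathing-rate data can be computed with a short Python sketch. It uses the median-of-halves approximation method, which the earlier quartile slides describe as matching the TI-84's 1-Var Stats; the function name is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values below the median position
    upper = data[len(data) - half:]   # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


breathing_rates = [12, 14, 15, 19, 12, 10, 13, 19, 20, 12, 23]
print(five_number_summary(breathing_rates))
```

The five numbers printed are the ones the box plot draws: the two whisker ends, the two box edges, and the median line.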

  39. Standard Scores Standard Score The standard score for a population value is given by $z = \frac{x - \mu}{\sigma}$, where x is the value of interest from the population, μ is the population _____________, and σ is the population ___________________.

  40. Standard Scores The standard score for a sample value is given by $z = \frac{x - \bar{x}}{s}$, where x is the value of interest from the sample, $\bar{x}$ is the sample _____________, and s is the sample ____________________.

  41. The standard score answers the question “How does my x compare to the mean?” “Am I in the middle of the pack?” “Am I above or below the middle?” “Am I extremely high or extremely low?” The z score is the measuring stick. If z = 0, then I’m ________________________. If z > 0, then I’m ________________________. If z < 0, then I’m ________________________. z is almost always between _____ and _____.

  42. Example 3.25: Calculating a Standard Score If the mean score on the math section of the SAT test is 500 with a standard deviation of 150 points, what is the standard score for a student who scored a 630? Solution μ = 500 and σ = 150. The value of interest is x = 630, so we have the following: $z = \frac{x - \mu}{\sigma} = \frac{630 - 500}{150} \approx 0.87$

  43. Excel STANDARDIZE function to convert a data value (x) to a standard score (z)
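In Excel the call takes the form =STANDARDIZE(x, mean, standard_dev). For readers working in Python instead, one rough equivalent is the `zscore` method of `statistics.NormalDist` (available in Python 3.9 and later). The sketch below applies it to the SAT figures from Example 3.25. `NormalDist` is used here only as a convenient holder for the mean and standard deviation; computing a z score does not require the data to be normal.

```python
from statistics import NormalDist

# Mean 500 and standard deviation 150, as given in Example 3.25.
sat_math = NormalDist(mu=500, sigma=150)

z = sat_math.zscore(630)  # (630 - 500) / 150
print(f"{z:.2f}")
```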

  44. Example 3.26: Comparing Standard Scores Jodi scored an 87 on her calculus test and was bragging to her best friend about how well she had done. She said that her class had a mean of 80 with a standard deviation of 5; therefore, she had done better than the class average. Her best friend, Ashley, was disappointed. She had scored only an 82 on her calculus test. The mean for her class was 73 with a standard deviation of 6. Who really did better on her test, compared to the rest of her class, Jodi or Ashley?

  45. Example 3.26: Comparing Standard Scores (cont.) Solution Let’s calculate each student’s standard score. Jodi’s standard score: $z = \frac{87 - 80}{5} = 1.40$ Ashley’s standard score: $z = \frac{82 - 73}{6} = 1.50$ Who did “better”, relative to the rest of her class?
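The comparison can be reproduced with a few lines of Python. This is a sketch with a hypothetical helper `z_score`; the numbers are the scores, class means, and standard deviations given in Example 3.26.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


jodi = z_score(87, mean=80, std_dev=5)
ashley = z_score(82, mean=73, std_dev=6)

print(f"Jodi: {jodi:.2f}, Ashley: {ashley:.2f}")
print("Higher relative to her class:", "Ashley" if ashley > jodi else "Jodi")
```

The raw scores favor Jodi, but the standard scores measure each student against her own class, which is the comparison the example asks for.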

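  46. z Score: x is how many standard deviations away from the mean? If you know the x value: Population: $z = \frac{x - \mu}{\sigma}$ Sample: $z = \frac{x - \bar{x}}{s}$ To work backward from z to x: • Population: $x = \mu + z\sigma$ • Sample: $x = \bar{x} + z s$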
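Working backward from z to x is a one-line rearrangement of the z score formula. A minimal sketch, with the hypothetical helper `x_from_z` and figures borrowed from the earlier examples (Ashley's class: mean 73, standard deviation 6; SAT math: mean 500, standard deviation 150):

```python
def x_from_z(z, mean, std_dev):
    """Recover the data value x from its standard score: x = mean + z * std_dev."""
    return mean + z * std_dev


print(x_from_z(1.5, mean=73, std_dev=6))       # Ashley's score from Example 3.26
print(x_from_z(-2.0, mean=500, std_dev=150))   # two standard deviations below the SAT mean
```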

  47. The z score is also called the “Standard Score” No matter what x is measured in or how large or small the values are… The z score of the mean will be 0 • Because the numerator turns out to be 0. If x is above the mean, its z is positive. • Because the numerator turns out to be positive. If x is below the mean, its z is negative. • Because the numerator turns out to be negative.

  48. z score values Typically round to two decimal places. • Don’t say “0.2589”, say “0.26” If not two decimal places, pad with zeros • Don’t say “2”, say “2.00” • Don’t say “-1.1”, say “-1.10” z scores almost always fall within a small interval around 0. Be very suspicious if you calculate a z score that’s not a small number.
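In Python, the two-decimal convention (including the zero padding) is what the `.2f` format specifier produces. A tiny sketch, with the hypothetical helper `format_z` and the three values used as examples on this slide:

```python
def format_z(z):
    """Format a z score with exactly two decimal places, padding with zeros."""
    return f"{z:.2f}"


for raw in (0.2589, 2, -1.1):
    print(raw, "->", format_z(raw))
```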

  49. Practice: Given x, compute z Find the z scores corresponding to the salary values, given the mean and the standard deviation.

  50. Practice: Given z, compute x Find the x values (salaries) corresponding to these standard scores, given the mean and the standard deviation.
