Chapter Two
Chapter Two Organizing Data
A graphical display should • Show the data. • Induce the viewer to think about the substance of the graphic. • Avoid distorting the message.
Frequency Table • Partitions data into classes or intervals • Shows how many data values are in each class • Each data value falls into exactly one class
Frequency Table • Shows the limits of each class • Shows the frequency of each class • Shows the midpoint of each class
To make a frequency table: First determine the number of classes and determine the class width. Five to fifteen classes are most commonly used.
Finding class width • Compute: (largest data value – smallest data value) / (desired number of classes) • Increase the computed value to the next highest whole number
Determining the Class Width
Raw Data: 10.2 18.7 22.3 20.0 6.3 17.8 17.1 5.0 2.4 7.9 0.3 2.5 8.5 12.5 21.4 16.5 0.4 5.2 4.1 14.3 19.5 22.5 0.0 24.7 11.4
Use 5 classes: (24.7 – 0.0) / 5 = 4.94. Round the class width up to 5.
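The class-width arithmetic can be sketched in a few lines of Python. This is only an illustration of the rule on the slide; the variable names are assumptions, not part of the original material.

```python
import math

# Raw data from the slide (25 values).
data = [10.2, 18.7, 22.3, 20.0, 6.3, 17.8, 17.1, 5.0, 2.4, 7.9,
        0.3, 2.5, 8.5, 12.5, 21.4, 16.5, 0.4, 5.2, 4.1, 14.3,
        19.5, 22.5, 0.0, 24.7, 11.4]
num_classes = 5

# (largest value - smallest value) / number of classes
raw_width = (max(data) - min(data)) / num_classes

# Round up to the next whole number. Note: math.ceil leaves a value that is
# already a whole number unchanged; how to treat that case is not covered here.
class_width = math.ceil(raw_width)

print(round(raw_width, 2), class_width)
```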
Class limits • The lower class limit is the lowest data value that can fit in a class. • The upper class limit is the highest data value that can fit in a class.
Making a frequency table: Create the distinct classes. • As a convenience, the lower class limit of the first class may be the smallest data value. • Add the class width to each lower class limit to get the lower class limit of the next class. • Fill in upper class limits to create distinct classes that accommodate all possible data values.
Creating the classes
Raw Data: 10.2 18.7 22.3 20.0 6.3 17.8 17.1 5.0 2.4 7.9 0.3 2.5 8.5 12.5 21.4 16.5 0.4 5.2 4.1 14.3 19.5 22.5 0.0 24.7 11.4
Classes:
 0.0 –  4.9
 5.0 –  9.9
10.0 – 14.9
15.0 – 19.9
20.0 – 24.9
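One way to generate these class limits is sketched below. It assumes the data are recorded to one decimal place, so each upper limit sits 0.1 below the next lower limit; the names `precision` and `classes` are illustrative.

```python
# Smallest data value, class width, and class count from the example above.
first_lower = 0.0
class_width = 5
num_classes = 5
precision = 0.1  # assumption: data are recorded to one decimal place

classes = []
for i in range(num_classes):
    lower = first_lower + i * class_width
    # Upper limit is one unit of precision below the next lower limit.
    upper = round(lower + class_width - precision, 1)
    classes.append((lower, upper))

for lower, upper in classes:
    print(f"{lower:4.1f} - {upper:4.1f}")
```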
To make a frequency table: Tally the data into classes. • Each data value falls into exactly one class. • Total the tallies to obtain each class frequency.
Tallying the data
Raw Data: 10.2 18.7 22.3 20.0 6.3 17.8 17.1 5.0 2.4 7.9 0.3 2.5 8.5 12.5 21.4 16.5 0.4 5.2 4.1 14.3 19.5 22.5 0.0 24.7 11.4
Classes        Tally
 0.0 –  4.9    ||||| |
 5.0 –  9.9    |||||
10.0 – 14.9    ||||
15.0 – 19.9    |||||
20.0 – 24.9    |||||
Class frequencies
Classes        Tally      f
 0.0 –  4.9    ||||| |    6
 5.0 –  9.9    |||||      5
10.0 – 14.9    ||||       4
15.0 – 19.9    |||||      5
20.0 – 24.9    |||||      5
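The tallying step can be checked with a short Python sketch. Each value is counted in the one class whose limits contain it; the variable names are illustrative only.

```python
data = [10.2, 18.7, 22.3, 20.0, 6.3, 17.8, 17.1, 5.0, 2.4, 7.9,
        0.3, 2.5, 8.5, 12.5, 21.4, 16.5, 0.4, 5.2, 4.1, 14.3,
        19.5, 22.5, 0.0, 24.7, 11.4]
classes = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]

# Count the data values that fall between each pair of class limits.
frequencies = []
for lower, upper in classes:
    count = sum(1 for x in data if lower <= x <= upper)
    frequencies.append(count)

print(frequencies)        # one frequency per class
print(sum(frequencies))   # should equal the number of data values
```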
To make a frequency table: Compute the midpoint for each class. • The midpoint is also known as the class mark.
Finding Class Midpoints
# of miles     f    Class midpoint
 0.0 –  4.9    6     2.45
 5.0 –  9.9    5     7.45
10.0 – 14.9    4    12.45
15.0 – 19.9    5    17.45
20.0 – 24.9    5    22.45
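A minimal sketch of the midpoint calculation, taking the midpoint as the average of the lower and upper class limits. The rounding to two decimals is only there to keep floating-point output tidy.

```python
classes = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]

# Midpoint (class mark) = (lower class limit + upper class limit) / 2
midpoints = [round((lower + upper) / 2, 2) for lower, upper in classes]

print(midpoints)
```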
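To make a frequency table: Determine the class boundaries. For integer data: • Upper class boundary = upper class limit + 0.5 units. • Lower class boundary = lower class limit – 0.5 units. • For data recorded to one decimal place, add or subtract 0.05 instead (half the gap between adjacent classes).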
Finding Class Boundaries
# of miles     f    Class boundaries
 0.0 –  4.9    6    -0.05 –  4.95
 5.0 –  9.9    5     4.95 –  9.95
10.0 – 14.9    4     9.95 – 14.95
15.0 – 19.9    5    14.95 – 19.95
20.0 – 24.9    5    19.95 – 24.95
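The boundaries in this table can be reproduced with the sketch below. It assumes an adjustment of 0.05, half the 0.1 gap between one class's upper limit and the next class's lower limit; `half_gap` is an illustrative name.

```python
classes = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]
half_gap = 0.05  # assumption: half the 0.1 gap between adjacent classes

# Lower boundary = lower limit - half_gap; upper boundary = upper limit + half_gap.
# round() keeps floating-point noise out of the results.
boundaries = [(round(lower - half_gap, 2), round(upper + half_gap, 2))
              for lower, upper in classes]

for low, high in boundaries:
    print(f"{low:6.2f} to {high:5.2f}")
```

Each upper boundary equals the next class's lower boundary, so the bars of a histogram drawn on these boundaries touch.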
Relative Frequency • The relative frequency of a class is the proportion of all data that fall into that class. • To find relative frequency of a class divide the class frequency (f) by the total of all frequencies (n).
Finding relative frequencies
# of miles     f     Relative frequency
 0.0 –  4.9     6    6/25 = 0.24
 5.0 –  9.9     5    5/25 = 0.20
10.0 – 14.9     4    4/25 = 0.16
15.0 – 19.9     5    5/25 = 0.20
20.0 – 24.9     5    5/25 = 0.20
Total          25
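As a quick check, the relative frequencies can be computed by dividing each class frequency f by the total n. This is a small sketch; the variable names are illustrative.

```python
frequencies = [6, 5, 4, 5, 5]
n = sum(frequencies)  # total of all frequencies

# Relative frequency of a class = f / n
relative_frequencies = [f / n for f in frequencies]

for f, rel in zip(frequencies, relative_frequencies):
    print(f"{f}/{n} = {rel:.2f}")
```

The relative frequencies should add up to 1 (apart from rounding), which is a useful check on the tally.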
Histogram • A visual display of data organized into a frequency table • Bars represent each class • Height of each bar represents class frequency (or relative frequency) • Width of each bar represents class width
To construct a histogram • Make a frequency table • Place class boundaries on the horizontal axis • Place frequencies or relative frequencies on the vertical axis • For each class draw a bar whose width extends between corresponding class boundaries. The height of each bar is the appropriate frequency or relative frequency.
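A rough text-only sketch of these steps is shown below, using the class boundaries and frequencies from the miles example. It is not a drawn histogram: the bars run horizontally and are built from `#` characters, purely to illustrate that each bar spans one pair of class boundaries and has length equal to the class frequency.

```python
boundaries = [(-0.05, 4.95), (4.95, 9.95), (9.95, 14.95),
              (14.95, 19.95), (19.95, 24.95)]
frequencies = [6, 5, 4, 5, 5]

# One row per class: the boundaries label the bar, the bar length is f.
rows = []
for (low, high), f in zip(boundaries, frequencies):
    rows.append(f"{low:6.2f} to {high:5.2f} | {'#' * f} ({f})")

print("\n".join(rows))
```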
Common Shapes of Histograms • Symmetrical • Uniform or rectangular • Skewed left • Skewed right • Bimodal
Other Common Graphs • Bar Graphs • Pareto Charts • Circle Graphs • Time-Series Graphs
Bar Graph • Bars are of uniform width and are uniformly spaced. • Bars may be vertical or horizontal. • Lengths of bars represent values being displayed, frequency or percentage of occurrence. • Graph annotated with title, labels and scale or value for each bar
Changing Scale • Whenever you use a change of scale in a graphic, use a squiggle (a short zigzag break mark) on the changed axis.
Pareto Chart • Tool of quality control • Start with a bar chart • Arrange bars in decreasing order of frequency • Frequently used to investigate causes of problems
Circle Graph (Pie Chart) • Shows division of whole into component parts • Label parts with appropriate percentages of the whole
Time Series • Data sets composed of similar measurements taken at regular intervals over time
Time Series Graph • Shows data values in chronological order • Place time on horizontal scale • Place other variable on vertical scale • Connect data points with line segments
Rules For Any Graph • Provide a title. • Label axes. • Identify units of measure. • Present information clearly.
Exploratory Data Analysis • Technique particularly useful for detecting patterns and extreme values. • Also called EDA. • Makes use of histograms and other graphics.
Stem and Leaf Display • Organizes and groups data. • Allows recovery of original data. • Data values must have at least two digits.
To Construct a Stem and Leaf Display • Divide digits of each data value into two parts: “stem” and “leaf.” • Align stems in vertical column to left of a vertical line. • Place leaves with same stem on same row as that stem, arranged in increasing order. • Label to include magnitude or decimal point.
Stem and Leaf Display Raw Data: 35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31
Stem and Leaf Display: First data value = 35
stems | leaf
2 |
3 | 5
4 |
5 |
6 |
7 |
Stem and Leaf Display: Data value = 45
2 |
3 | 5
4 | 5
5 |
6 |
7 |
Stem and Leaf Display: Data value = 42
2 |
3 | 5
4 | 5 2
5 |
6 |
7 |
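The construction steps can also be sketched in Python for the same raw data. This assumes two-digit values, with the tens digit as the stem and the ones digit as the leaf; unlike the slides, which add leaves in data order, the sketch sorts the leaves on each row in increasing order, as the construction rules call for.

```python
from collections import defaultdict

data = [35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31]

# Split each value into stem (tens digit) and leaf (ones digit).
leaves = defaultdict(list)
for value in data:
    stem, leaf = divmod(value, 10)
    leaves[stem].append(leaf)

# List every stem from smallest to largest, with its leaves in increasing order.
lines = []
for stem in range(min(leaves), max(leaves) + 1):
    row = " ".join(str(leaf) for leaf in sorted(leaves.get(stem, [])))
    lines.append(f"{stem} | {row}")

print("\n".join(lines))
```

Because every leaf is kept, the original data values can be recovered from the display.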