Télécharger la présentation
## Chapter 1

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Chapter 1**Picturing Distributions with Graphs**Statistics**Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves: • the design of the experiment or sampling procedure, • the collection and analysis of the data, and • making inferences (statements, assumptions, guesses) about the population based upon information in a sample.**Individuals and Variables**• Individuals • the objects described by a set of data • may be people, animals, or things • Variable • any characteristic of an individual • can take different values for different individuals**Variables**• Categorical • Places an individual into one of several groups or categories • Quantitative (Numerical) • Takes numerical values for which arithmetic operations such as adding and averaging make sense**Example 1.1**• Individuals? • Variables? • Categorical? • Quantitative?**Distribution**• Tells what values a variable takes and how often it takes these values • Can be a table, graph, or function**Displaying Distributions**• Categorical variables • Pie charts • Bar graphs • Quantitative variables • Histograms • Stemplots (stem-and-leaf plots) • Time Plots**Example: Class Make-up on First Day**Data Table**Example: Class Make-up on First Day**Pie Chart**Example: Class Make-up on First Day**Bar Graph**Example: U.S. Solid Waste (2000)**Data Table**Example: U.S. Solid Waste (2000)**Pie Chart**Example: U.S. Solid Waste (2000)**Bar Graph**Example: U.S. Solid Waste (2000)**Bar Graph**Histograms**• For quantitative variables that take many values • Divide the possible values into class intervals (we will only consider equal widths) • Count how many observations fall in each interval (may change to percents) • Draw picture representing distribution**Case Study**Weight Data Introductory Statistics classSpring, 1997 Virginia Commonwealth University**Weight Data: Frequency Table**sqrt(53) = 7.2, or 8 intervals; range (260100=160) / 8 = 20 = class width**Number of students**100 120 140 160 180 200 220 240 260 280 Weight Data: Histogram Weight * Left endpoint is included in the group, right endpoint is not.**Histograms: Class Intervals**• How many intervals? • One rule is to calculate the square root of the sample size, and round up. • Size of intervals? • Divide range of data (maxmin) by number of intervals desired, and round to convenient number • Pick intervals so each observation can only fall in exactly one interval (no overlap)**Examining the Distribution of Quantitative Data**• Overall pattern of graph • Shape of the data • Center of the data • Spread of the data (Variation) • Deviations from overall pattern • Outliers**Shape of the Data**• Symmetric • bell shaped • other symmetric shapes • Asymmetric • right skewed • left skewed • Unimodal, bimodal**SymmetricBell-Shaped**SymmetricMound-Shaped SymmetricUniform**AsymmetricSkewed to the Left**AsymmetricSkewed to the Right**Outliers**• Extreme values that fall outside the overall pattern • May occur naturally • May occur due to error in recording • May occur due to error in measuring • Observational unit may be fundamentally different**Stemplots(Stem-and-Leaf Plots)**• For quantitative variables • Separate each observation into a stem (first part of the number) and a leaf (the remaining part of the number) • Write the stems in a vertical column; draw a vertical line to the right of the stems • Write each leaf in the row to the right of its stem; order leaves if desired**10**11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 192 5 152 2 135 2 Weight Data:Stemplot(Stem & Leaf Plot) Key 20|3 means203 pounds Stems = 10’sLeaves = 1’s**10 0166**11 009 12 0034578 13 00359 14 08 15 00257 16 555 17 000255 18 000055567 19 245 20 3 21 025 22 0 23 24 25 26 0 Weight Data:Stemplot(Stem & Leaf Plot) Key 20|3 means203 pounds Stems = 10’sLeaves = 1’s**Extended Stem-and-Leaf Plots**If there are very few stems (when the data cover only a very small range of values), then we may want to create more stems by splitting the original stems.**151516161717**Extended Stem-and-Leaf Plots Example: if all of the data values were between 150 and 179, then we may choose to use the following stems: Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).**Time Plots**• A time plot shows behavior over time. • Time is always on the horizontal axis, and the variable being measured is on the vertical axis. • Look for an overall pattern (trend), and deviations from this trend. Connecting the data points by lines may emphasize this trend. • Look for patterns that repeat at known regular intervals (seasonal variations).