
1st Semester Final Review Day 1: Exploratory Data Analysis

Learn how to analyze and interpret data using graphs, including stemplots, boxplots, histograms, and bar charts. Understand the measures of center, shape, and spread, and how to identify outliers.

jholland

Presentation Transcript


  1. 1st Semester Final Review Day 1: Exploratory Data Analysis. Hardest title, easiest problems!

  2. Graphs – Quantitative Data: Stemplot, Back-to-Back Stemplot [figure labels: Calf Weight (Supplement vs. No Supplement); Parts per million ClO2]

  3. Graphs – Quantitative Data: Boxplot, Histogram, Modified Boxplot

  4. Graphs – Quantitative Data: Side-by-Side Boxplots

  5. Graphs – Categorical Data: Pie Chart, Bar Chart [figure: Political Identification (Liberal, Moderate, Conservative); pie slices 29.0%, 24.8%, 46.2%; bar chart vertical axis: Frequency]

  6. Graphs – Categorical Data: Side-by-Side Bar Chart, Segmented Bar Chart

  7. Describing or comparing distributions
     • Center – mean, median (generally use the median when looking at a graph)
     • Shape – symmetry/skew, modes/peaks, unusual features
     • Spread – standard deviation, range, IQR (generally use the range or IQR when looking at a graph)

  8. Center
     • Mean: add up the values and divide by n
       • Strongly affected by outliers and skew (pulled in the direction of the skew or outliers)
     • Median: order the values and find the middle
       • Resistant to skew and outliers (not strongly affected)
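
To make the contrast between the mean and the median concrete, here is a minimal Python sketch. The data values are made up for illustration and do not come from the slides; the point is only that one large outlier drags the mean upward while the median barely moves.

```python
from statistics import mean, median

# Hypothetical data: five values, then the same values plus one large outlier.
values = [2, 3, 4, 5, 6]
with_outlier = values + [100]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("without outlier: mean =", mean_before, "median =", median_before)
print("with outlier:    mean =", mean_after, "median =", median_after)
```

The mean jumps from 4 to 20 once the outlier is added, while the median only shifts from 4 to 4.5.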

  9. Shape: Approximately Symmetric, Skewed Left, Skewed Right

  10. Shape: Unimodal, Bimodal, Uniform, Multimodal

  11. Spread
      • Range: Max - Min (affected by outliers)
      • Quartiles (resistant to skew/outliers):
        • Q1 = 25th percentile (median of the bottom half)
        • Median = 50th percentile
        • Q3 = 75th percentile (median of the top half)
      • IQR: interquartile range = Q3 - Q1
      • Standard deviation: roughly the average distance of individual values from the mean
        • Strongly affected by outliers/skew (not resistant)
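
The measures of spread above can be sketched in Python as follows. The helper `five_number_summary` and the data are hypothetical. The quartiles use the "median of each half" rule from the slide, with the added assumption that the overall median is left out of both halves when n is odd. Software packages often use slightly different quartile rules, so their results may differ a little.

```python
from statistics import median, stdev

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Assumption: when n is odd, the middle value belongs to neither half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

scores = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data
minimum, q1, med, q3, maximum = five_number_summary(scores)

data_range = maximum - minimum   # Range = Max - Min
iqr = q3 - q1                    # IQR = Q3 - Q1
sample_sd = stdev(scores)        # sample standard deviation (divides by n - 1)

print("range =", data_range, "IQR =", iqr, "SD =", round(sample_sd, 2))
```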

  12. Are there outliers in a data set?
      • Outliers are values that fall outside the upper or lower fence
      • Upper fence = Q3 + 1.5(IQR)
      • Lower fence = Q1 - 1.5(IQR)
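
As a worked example of the fence rule, the sketch below flags outliers in a small made-up data set. The function name `outlier_fences` is hypothetical, and the quartiles again assume the median-of-halves rule, leaving out the middle value when n is odd.

```python
from statistics import median

def outlier_fences(data):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + 1:] if len(xs) % 2 else xs[half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [10, 12, 13, 14, 15, 16, 18, 40]  # hypothetical data
lower_fence, upper_fence = outlier_fences(data)

# A value is an outlier if it lies below the lower fence or above the upper fence.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence, "outliers:", outliers)
```

Here Q1 = 12.5 and Q3 = 17, so IQR = 4.5 and the fences are 5.75 and 23.75. Only the value 40 falls outside them.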

  13. What happens to summary statistics when we…
      • Add a constant to a data set?
        • Measures of position (center, minimum, maximum, quartiles) change
        • Measures of spread (standard deviation, range, IQR) do not change
      • Multiply a data set by a constant?
        • Measures of position and measures of spread change
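
A quick way to check these transformation rules is to shift and rescale a small data set and compare the summaries. This is only a sketch: the data, the constants 10 and 3, and the helper `summarize` are all made up for illustration.

```python
from statistics import mean, median, stdev

def summarize(xs):
    """Collect a few measures of position and spread for a data set."""
    return {
        "mean": mean(xs),
        "median": median(xs),
        "range": max(xs) - min(xs),
        "sd": stdev(xs),
    }

data = [2, 4, 4, 6, 9]               # hypothetical data
shifted = [x + 10 for x in data]     # add a constant
scaled = [x * 3 for x in data]       # multiply by a constant

original_stats = summarize(data)
shifted_stats = summarize(shifted)
scaled_stats = summarize(scaled)

for label, stats in [("original", original_stats),
                     ("shifted", shifted_stats),
                     ("scaled", scaled_stats)]:
    print(label, stats)
```

Adding 10 moves the mean and median up by 10 but leaves the range and standard deviation alone. Multiplying by 3 multiplies all four by 3. With a negative multiplier, the measures of spread are multiplied by its absolute value.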

  14. Standardized Score
      • z-score = the number of standard deviations a value falls above or below the mean
      • For an individual: z = (value - mean) / (standard deviation)
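
The z-score for an individual value can be sketched as a one-line function. The function name `z_score` and the numbers (a mean of 70 and a standard deviation of 10) are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical example: scores with mean 70 and standard deviation 10.
z_high = z_score(85, mean=70, sd=10)
z_low = z_score(55, mean=70, sd=10)

print(z_high, z_low)
```

A score of 85 is 1.5 standard deviations above the mean (z = 1.5), and a score of 55 is 1.5 standard deviations below it (z = -1.5).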

  15. Normal Distribution (68-95-99.7 Rule)
      • Also called the Empirical Rule
      • Standard deviation = distance from the mean to the first inflection point of the curve
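
The 68-95-99.7 percentages can be checked against a normal model with Python's standard library. This sketch assumes a standard normal distribution (mean 0, standard deviation 1). The helper `share_within` is a hypothetical name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD: {share:.1%}")
```

The exact proportions are about 68.3%, 95.4%, and 99.7%, which the rule rounds to 68, 95, and 99.7.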
