
Stats Review


tyler

Presentation Transcript


  1. Stats Review Chapter 1

  2. 1.1 Displaying Distributions Definitions: • Individuals – objects described by a set of data • Variable – any characteristic of an individual • Categorical variable – places an individual into a group • Quantitative variable – numerical data about the individual

  3. Examining a Distribution • Look for the overall pattern and for deviations from that pattern • Describe using shape (symmetric, skewed), center (median, mode, mean) and spread (variation, standard deviation, IQR) • Look for outliers and skewness

  4. Stemplots: • Separate the data into different classes • Write the stems in a vertical column • Write each leaf as a single digit to the side of each stem
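
The stemplot steps above can be sketched in a few lines of Python. This is only an illustration: the helper name `stemplot` and the data are made up, and it assumes non-negative integers with the tens as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: stem = all digits but the last, leaf = last digit.

    Assumes non-negative integers (an assumption of this sketch).
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # List every stem in the range, including empty ones, so gaps stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}")
    return lines

scores = [62, 75, 78, 81, 84, 84, 90, 93, 107]  # made-up data
for line in stemplot(scores):
    print(line)
```

Empty stems are printed on purpose, so that gaps in the distribution remain visible.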

  5. Histograms: • Separate the data into different classes of equal width • Write the classes along the horizontal axis • Write the frequency (count) or relative frequency (percentage) along the vertical axis • Create a bar for each class, with no space between adjacent bars
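
As a rough sketch of the histogram steps, the following Python counts observations in equal-width classes and converts the counts to percentages. The function name `histogram_counts`, the class boundaries, and the data are assumptions chosen for illustration.

```python
def histogram_counts(values, start, width, n_classes):
    """Count observations in classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_classes:  # values outside the classes are ignored
            counts[k] += 1
    return counts

scores = [62, 75, 78, 81, 84, 84, 90, 93, 107]  # made-up data
counts = histogram_counts(scores, start=60, width=10, n_classes=5)
percents = [100 * c / len(scores) for c in counts]

# Text version of the histogram: one bar per class.
for k, c in enumerate(counts):
    lo = 60 + 10 * k
    print(f"{lo:>3} to {lo + 10:<3} {'#' * c}")
```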

  6. Time plots: • Write the time or order along the horizontal axis • Write the measured variable along the vertical axis • Plot each observation's value in the order it occurred

  7. 1.2 Number Summaries Measures of Center: • Mean – the average. Susceptible to influence by outliers and skewness. • Median – the middle value (the average of the middle two if n is even). Not greatly affected by outliers.
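
A small illustration of how differently the two measures react to an outlier, using Python's standard `statistics` module on made-up data:

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]        # made-up data
with_outlier = data + [100]   # same data plus one extreme value

print(mean(data), median(data))                  # 4 4
print(mean(with_outlier), median(with_outlier))  # 20 4.5
```

One extreme observation moves the mean from 4 to 20, while the median only moves from 4 to 4.5.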

  8. Quartiles: • Arrange the data in increasing order and locate the median M. • The first quartile Q1 is the median of the observations that lie below the median M. • The third quartile Q3 is the median of the observations that lie above the median M.
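
The quartile rule above (median of each half) can be sketched as follows. The helper name `quartiles` is hypothetical, and the sketch assumes that when n is odd the median itself is left out of both halves. Library routines such as `statistics.quantiles` use other conventions and may give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                               # observations below M
    upper = xs[half + 1:] if n % 2 else xs[half:]   # observations above M
    return median(lower), median(xs), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up data
q1, m, q3 = quartiles(data)
print(q1, m, q3)  # 3.5 7 11.0
```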

  9. Interquartile Range (IQR) The interquartile range is the distance between the first and third quartiles. IQR=Q3 – Q1

  10. 1.5 × IQR Criterion for Outliers Call an observation a suspected outlier if it falls more than 1.5 × IQR above Q3 or below Q1, that is, if it lies: • below Q1 – (1.5 × IQR), or • above Q3 + (1.5 × IQR)
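
A minimal sketch of the criterion in Python. The function name `suspected_outliers` and the data are made up, and the quartiles are computed with the median-of-halves rule (median excluded from both halves when n is odd).

```python
from statistics import median

def suspected_outliers(values):
    """Return (outliers, (low_fence, high_fence)) under the 1.5 x IQR rule."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + 1:] if n % 2 else xs[half:])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    flagged = [x for x in xs if x < low_fence or x > high_fence]
    return flagged, (low_fence, high_fence)

data = [1, 3, 4, 6, 7, 8, 10, 12, 40]  # made-up data
outliers, fences = suspected_outliers(data)
print(outliers, fences)  # [40] (-7.75, 22.25)
```

Here Q1 = 3.5 and Q3 = 11, so IQR = 7.5 and the fences sit 11.25 beyond each quartile; only 40 falls outside them.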

  11. Five-Number Summary: Minimum, Q1, Median, Q3, Maximum

  12. Boxplots [Figure: boxplot labeled, from top to bottom: suspected outliers (1.5 × IQR rule), maximum non-outlier, third quartile, median, first quartile, minimum non-outlier]

  13. Variance The average of the squared differences between each observation and the mean, computed with divisor n – 1 for a sample. FORMULA: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

  14. Standard Deviation s The square root of the variance. FORMULA: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
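
The two definitions can be checked numerically. The sketch below assumes the sample versions with divisor n − 1, which is what Python's `statistics.variance` and `statistics.stdev` compute; the helper `sample_variance` and the data are made up.

```python
from math import isclose, sqrt
from statistics import stdev, variance

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, mean = 5
s2 = sample_variance(data)        # squared deviations sum to 32, so s2 = 32/7
s = sqrt(s2)

# The hand-rolled values agree with the standard library.
print(isclose(s2, variance(data)), isclose(s, stdev(data)))
```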

  15. Properties of the Standard Deviation • s is a measure of spread about the mean • Use s only when the mean is the chosen measure of center • s = 0 implies that there is no spread: all observations have the same value • s is not resistant: a few outliers can make it very large

  16. Linear Transformations • Multiplying each observation by a positive number b multiplies the mean, median, IQR and standard deviation by b. • Adding the same number a to each observation adds a to the mean and median but does not change the IQR or standard deviation.
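
Both rules are easy to verify on a small example. In this sketch the data, the constants `a` and `b`, and the helper `iqr` (median-of-halves quartiles) are all made up for illustration.

```python
from math import isclose
from statistics import mean, median, stdev

def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return median(upper) - median(lower)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up data
a, b = 5, 2
scaled = [b * x for x in data]    # multiply each observation by b
shifted = [x + a for x in data]   # add a to each observation

# Multiplying by b scales center and spread alike.
print(isclose(mean(scaled), b * mean(data)), isclose(stdev(scaled), b * stdev(data)))
# Adding a moves the center but leaves the spread unchanged.
print(isclose(mean(shifted), mean(data) + a), isclose(stdev(shifted), stdev(data)))
```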

  17. 1.3 Normal Distributions Strategies For Exploring Quantitative Data • Always plot your data (usually a stemplot or histogram). • Look for the overall pattern and for striking deviations such as outliers. • Calculate a numerical summary to briefly describe center and spread (five-number summary, or mean and standard deviation).

  18. Density Curves • Always on or above the horizontal axis • Area under the curve always equals one. Skew: • Skew refers to the direction of the long tail, not the bump • In a skewed distribution, the mean (balance point) is pulled farther toward the long tail than the median (which cuts the area in half).

  19. Standard Deviation of Normal Density curves • Points of inflection on the normal density curve lie 1 σ away from the mean on each side

  20. 68-95-99.7 Rule In a normal distribution, approximately: • 68% of observations fall within σ of the mean μ. • 95% of observations fall within 2σ of the mean μ. • 99.7% of observations fall within 3σ of the mean μ.
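
The three percentages can be reproduced from the normal cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are arbitrary.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, p in within.items():
    print(f"within {k} sigma: {100 * p:.1f}%")
```

The rule is a rounding of these values; the exact figure for 2σ, for instance, is slightly above 95%.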

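  21. Z-score Normal distributions can be standardized by the following formula: $z = \frac{x - \mu}{\sigma}$, where x is an observation from a distribution with mean μ and standard deviation σ.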
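
A short sketch of standardization, assuming the usual definition z = (x − μ)/σ. The function name `z_score` and the numbers (mean 100, standard deviation 15) are made up.

```python
from math import isclose
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma

z = z_score(130, mu=100, sigma=15)
print(z)  # 2.0

# Standardizing does not change probabilities: P(X < 130) equals P(Z < 2).
p_original = NormalDist(100, 15).cdf(130)
p_standard = NormalDist().cdf(z)
print(isclose(p_original, p_standard))
```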

  22. Normal Quantile Plots Used to assess the normality of a distribution • Arrange the data from smallest to largest and record what percentile of the data each value occupies. For example, the smallest observation in a set of 20 is at the 5% point. • Find the z-score from Table A that corresponds to each percentile. For example, z = -1.645 for the 5% point. • Plot each data point x against its corresponding z-score. • If the plotted points lie close to a straight line, the distribution is approximately normal. • If the pattern bends up at the right, the distribution is skewed right. If it bends down at the left, the distribution is skewed left. • Outliers appear as points far away from the overall pattern of points.
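
The coordinates for such a plot can be computed as sketched below, with `NormalDist.inv_cdf` standing in for Table A. One assumption differs from the slide: the percentile of the i-th smallest of n observations is taken as (i − 0.5)/n rather than i/n, because the 100% point has no finite z-score. The function name `normal_quantile_points` and the data are made up.

```python
from statistics import NormalDist

def normal_quantile_points(values):
    """Return (z, x) pairs: each sorted observation paired with its normal score.

    Uses the plotting position (i - 0.5) / n, an assumption of this sketch.
    """
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(xs, start=1)]

data = [4.1, 5.0, 5.2, 5.9, 7.3]  # made-up data
points = normal_quantile_points(data)
for z, x in points:
    print(f"z = {z:+.3f}   x = {x}")

# Table A lookup from the slide: the 5% point corresponds to z of about -1.645.
z_5_percent = NormalDist().inv_cdf(0.05)
```

Plotting x against z with any plotting tool and checking for a roughly straight line completes the assessment.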

  23. Review Exercises: 1.106, 1.114, 1.116, 1.119, 1.123
