Statistical Inference: IQ Scores and CO Emissions Analysis

This overview covers the normal distribution of IQ scores and CO emissions, calculations of probabilities for specific IQ levels, conditions for distribution of sample means, and use of Student's t-distribution for statistical inference about means. It includes examples for calculating probabilities, checking conditions for sample means, and conducting t-tests and intervals. Additionally, it explores constructing confidence intervals and performing hypothesis tests for population means based on sample data.


Presentation Transcript


  1. Warm up: IQ scores are normally distributed with a mean of 120 and a standard deviation of 15. (a) Sketch & label the normal model. (b) What is the probability that a person has an IQ over 140? (c) What is the probability that a person has an IQ between 105 and 130? (d) If I take a sample of 30 people, what will happen to the spread of the distribution?
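Parts (b) through (d) of the warm-up can be checked numerically. The sketch below assumes the Normal model stated on the slide (mean 120, standard deviation 15) and uses Python's standard-library `NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Normal model from the warm-up slide: mean 120, standard deviation 15.
iq = NormalDist(mu=120, sigma=15)

# (b) P(IQ > 140) is the area to the right of 140.
p_over_140 = 1 - iq.cdf(140)

# (c) P(105 < IQ < 130) is the area between the two cut-offs.
p_105_to_130 = iq.cdf(130) - iq.cdf(105)

# (d) For samples of 30, the spread of the sample mean shrinks to sigma / sqrt(n).
sd_of_mean_n30 = iq.stdev / sqrt(30)

print(f"P(IQ > 140)       = {p_over_140:.4f}")
print(f"P(105 < IQ < 130) = {p_105_to_130:.4f}")
print(f"SD of sample mean = {sd_of_mean_n30:.4f}")
```

The probabilities come out near 0.09 and 0.59, and the standard deviation of the sample mean is far smaller than 15, which is the point of part (d).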

  2. Ch. 18 again... DISTRIBUTION OF SAMPLE MEANS • A sample of size n is taken from a population with mean µ and standard deviation σ. Then for the distribution of the sample means: • The mean is the mean of the population: $\mu_{\bar{x}} = \mu$ • The standard deviation is the population standard deviation divided by the square root of the sample size n: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
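A quick simulation can make these two facts concrete. This is only a sketch: it borrows the warm-up population (mean 120, standard deviation 15) for illustration, and the seed, sample size, and number of repetitions are arbitrary choices, not values from the slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # arbitrary seed so the run is repeatable

MU, SIGMA = 120, 15   # population parameters borrowed from the warm-up
N, REPS = 30, 5000    # hypothetical sample size and number of repeated samples

# Draw REPS samples of size N and record each sample's mean.
sample_means = [
    sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    for _ in range(REPS)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"mean of sample means: {mean_of_means:.2f} (theory: {MU})")
print(f"SD of sample means:   {sd_of_means:.2f} (theory: {SIGMA / sqrt(N):.2f})")
```

The simulated values should land close to µ and σ/√n, with small differences from run to run if the seed is changed.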

  3. Conditions/Assumptions: Two conditions must be met to use the Normal model for the distribution of sample means: 1. Randomization (or a representative sample) 2. Large enough sample: * Normal population stated => any sample size is fine * Otherwise, n > 30. If both are met, then by the CLT the distribution of the sample means will be approximately Normal.

  4. EXAMPLE: Carbon monoxide (CO) emissions for a certain kind of car vary with a mean of 2.9 g/mi and standard deviation of 0.4 g/mi. It is known that these readings follow a normal model. 1. What is the probability that a car would have an emission level higher than 3.1 g/mi? 2. What is the probability that a car would have an emission level less than 2.87 g/mi? 3. What is the probability that a car would have an emission level between 2.895 and 2.945 g/mi?

  5. EXAMPLE: Carbon monoxide (CO) emissions for a certain kind of car vary with a mean of 2.9 g/mi and standard deviation of 0.4 g/mi. It is known that these readings follow a normal model. 4. A random sample of 40 cars is taken. Check the conditions to create the model for the distribution of sample means. 5. What is the model and parameters? 6. What is the probability that the sample mean of the 40 cars would be higher than 3.1 g/mi? 7. What is the probability that the sample mean of the 40 cars would be less than 2.87 g/mi? 8. What is the probability that the sample mean of the 40 cars would be between 2.895 and 2.945 g/mi?
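The CO questions can be worked side by side for a single car and for the mean of 40 cars. The sketch below assumes the conditions check out (random sample, Normal population stated) and uses the standard-library `NormalDist`; the helper name `probs` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 2.9, 0.4, 40  # g/mi, g/mi, sample size from the slide

# Model for one car: N(2.9, 0.4).
one_car = NormalDist(MU, SIGMA)

# Model for the mean of 40 cars: N(2.9, 0.4 / sqrt(40)).
mean_of_40 = NormalDist(MU, SIGMA / sqrt(N))

def probs(model):
    """Return the three probabilities asked for on the slides."""
    return {
        "above 3.1": 1 - model.cdf(3.1),
        "below 2.87": model.cdf(2.87),
        "2.895 to 2.945": model.cdf(2.945) - model.cdf(2.895),
    }

single = probs(one_car)
sample = probs(mean_of_40)

for label in single:
    print(f"{label:>15}: one car {single[label]:.4f} | mean of 40 {sample[label]:.4f}")
```

Because the sample mean has a much smaller standard deviation, a mean above 3.1 g/mi is far less likely than a single reading above 3.1 g/mi, while the narrow band around 2.9 g/mi becomes more likely.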

  6. Ch. 23: Inference about means Inference: * Confidence intervals and tests of significance * Definition: Making conclusions about a population from statistics with a known degree of confidence Unit 5 = Ch. 19 - 22 = Inference about proportions Unit 6 = Ch. 23 - 25 = Inference about means

  7. * We will be estimating and testing a population mean * We always list standard deviation with the mean * If we don't know the mean, we CAN'T know the standard deviation. * So we will be estimating 2 things: mean and std. deviation * What statistics do we use to estimate mean and std. deviation? * When we do this, we cannot use Z-scores. We need a new model. Student's t-distribution

  8. BOOK READING

  9. Student's t-Model: • Family of distributions (there's more than one distribution) • Similar to Normal model (normal = only 1 distribution) • center at 0 • unimodal, symmetric Model changes based on sample size and degrees of freedom (because s changes based on sample size) Degrees of freedom (df) = n - 1 (p. 549) Generally wider than Normal model • 2 statistics => more variation and… • s varies in each sample. σ does not! The larger the sample size (and thus, df) the closer the Student's t-model approaches the Normal model.

  10. t-interval and t-test CONDITIONS: 1) Randomization 2) Normality Checking #2 (OPTIONS): • Normal population stated • n > 30 => CLT kicks in and says sampling distribution will be approx. normal • Look at a histogram of the sample values • want normal shape (symmetric and unimodal) • Look at a normal probability plot (on calculator) • want straight line Conditions met => 1 sample t-Test/Interval

  11. 1 sample t Interval: MECHANICS: • Calculate (or given): x̄, s, n • Find degrees of freedom: df = n - 1 • Find critical value t* (invT or Table B on formula sheet) • Formula for interval: statistic ± (critical value)(std. dev. of statistic), i.e. $\bar{x} \pm t^{*} \cdot s/\sqrt{n}$

  12. Interpret: We are ___% confident that the true mean of _____ is between ____ and ______ units. Shortcut: STAT --> TESTS --> #8: T-Interval
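The interval mechanics can be sketched in a few lines. The function name `t_interval` is made up for this example, and the numbers are borrowed from the bottling practice problem later in the deck (n = 36, x̄ = 16.1 oz, s = 0.11 oz, 90% confidence). The critical value t* ≈ 1.690 for df = 35 is assumed to come from a t-table or from invT(0.95, 35); it is an input here, not something the code derives.

```python
from math import sqrt

def t_interval(xbar, s, n, t_star):
    """One-sample t-interval: xbar +/- t* * s / sqrt(n)."""
    se = s / sqrt(n)        # standard error of the sample mean
    margin = t_star * se    # margin of error
    return xbar - margin, xbar + margin

# Bottling numbers from the practice slide; t_star is an assumed table value (df = 35, 90%).
low, high = t_interval(xbar=16.1, s=0.11, n=36, t_star=1.690)

print(f"90% t-interval: ({low:.3f}, {high:.3f}) ounces")
```

Read in the slide's template: we are 90% confident that the true mean fill is between the two printed values, in ounces.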

  13. on calculator: 2nd DISTR --> #4:invT --> ENTER invT(% below, df) --> ENTER Examples: n = 30, 95% confidence n = 85, 92% confidence
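For readers without the calculator, the two invT examples can be approximated from scratch. This is a rough numerical sketch, not how the calculator works internally: it integrates the t density with Simpson's rule and inverts the CDF by bisection. The function names, step count, and search bracket are all arbitrary choices made for illustration.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_below, df):
    """Bisection search for the t value with the given area to its left."""
    lo, hi = -60.0, 60.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_below:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# n = 30, 95% confidence: 2.5% in each tail, so area below t* is 0.975, df = 29.
t_star_n30 = inv_t(0.975, 29)
# n = 85, 92% confidence: 4% in each tail, so area below t* is 0.96, df = 84.
t_star_n85 = inv_t(0.96, 84)
# A very large df, to compare with the Normal critical value of about 1.96.
t_star_big = inv_t(0.975, 1000)

print(f"t* (df=29, 95%)   = {t_star_n30:.3f}")
print(f"t* (df=84, 92%)   = {t_star_n85:.3f}")
print(f"t* (df=1000, 95%) = {t_star_big:.3f}")
```

The key step is converting the confidence level to "% below": for a C% interval the area below t* is C plus half of what is left over. The last line illustrates the earlier slide's point that the t-model approaches the Normal model as df grows.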

  14. Example: A coffee vending machine dispenses coffee into a paper cup. You’re supposed to get 10 ounces of coffee, but the amount varies slightly from cup to cup. Below are the amounts measured in a random sample of 20 cups. Is there evidence that the machine is shortchanging customers? Construct a 95% confidence interval.

  15. 1-sample t-Test * Inference about the mean of a population HYPOTHESES: H0: µ = # Ha: µ <, >, or ≠ # CONDITIONS: * same as interval * Conditions met => 1 sample t test

  16. MECHANICS: • Calculate (or given): x̄, s, n • Find: df = n - 1 • Formula: t = (statistic - parameter) / (std. dev. of statistic (SE)), i.e. $t = (\bar{x} - \mu_0) / (s/\sqrt{n})$

  17. P-Value: P(t < or > test statistic) = tcdf(LB, UB, df) CONCLUSION: - We reject/fail to reject Ho... - We have sufficient/insufficient evidence that the true mean of ____ is ... (use Ha) Shortcut: STAT --> TESTS --> #2: T-test

  18. Example: The EPA wants to show that “the mean carbon monoxide level of air pollution is higher than 4.9 parts per million (ppm).” Does a random sample of 50 readings (with sample mean of 5.1 ppm and sample std. deviation of 1.17 ppm) present sufficient evidence at the 0.05 level of significance to support the EPA’s claim? Previous studies have indicated that such readings have an approximately normal distribution.
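The mechanics for the EPA example can be sketched as follows, assuming the hypotheses H0: µ = 4.9 and Ha: µ > 4.9 and that the conditions are met. The P-value is approximated by numerically integrating the t density (Simpson's rule), standing in for the calculator's tcdf; the function names are illustrative.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values from the slide: H0: mu = 4.9 ppm, Ha: mu > 4.9 ppm.
xbar, mu0, s, n, alpha = 5.1, 4.9, 1.17, 50, 0.05

se = s / sqrt(n)                 # standard error of the sample mean
t_stat = (xbar - mu0) / se       # test statistic
df = n - 1
p_value = upper_tail(t_stat, df) # one-sided (upper-tail) P-value
reject_h0 = p_value < alpha

print(f"t = {t_stat:.3f}, df = {df}, P-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these numbers the P-value is above 0.05, so the sample does not give sufficient evidence that the true mean CO level is higher than 4.9 ppm.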

  19. Practice! 1. A survey was conducted involving 250 families living in a city. The average amount of income tax paid per family in the sample was $3540 with a standard deviation of $1150. (A) Establish and interpret a 99% confidence interval estimate for the taxes paid by families in this city. (B) What does 99% confidence mean in this context?

  21. 2) The estimated U.S. intake of trans-fatty acids is 8 g per day. Consider a research project involving 150 individuals in which their daily intake of trans-fatty acids was measured. Suppose the average fatty acid intake from this sample was 12.5 g, with a standard deviation of 7.7 g. (A) Test the research hypothesis that the average intake has increased at α = 0.05. (B) What is a Type I error? (C) What is a Type II error? (D) What is Power? (E) Interpret the P-value
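Part (A) can also be approached by comparing the test statistic with a critical value instead of computing a P-value. The sketch below assumes H0: µ = 8 and Ha: µ > 8, and takes t* ≈ 1.655 (df = 149, upper-tail area 0.05) as an assumed table or invT lookup; the variable names are illustrative.

```python
from math import sqrt

# Values from the slide: H0: mu = 8 g, Ha: mu > 8 g, alpha = 0.05.
xbar, mu0, s, n = 12.5, 8.0, 7.7, 150

# Assumed critical value for df = 149 with 5% in the upper tail (table / invT lookup).
T_CRIT = 1.655

se = s / sqrt(n)             # standard error of the sample mean
t_stat = (xbar - mu0) / se   # test statistic
reject_h0 = t_stat > T_CRIT  # upper-tail rejection rule

print(f"t = {t_stat:.2f} vs critical value {T_CRIT}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The test statistic is far beyond the critical value, so the corresponding P-value is very small: if the true mean intake were still 8 g, a sample mean this high would almost never occur by chance.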

  23. 3) Suppose that in a sample of 36 bottles from a certain bottling machine, the machine filled the bottles with an average of 16.1 ounces of cola. The sample had a standard deviation of 0.11 ounces. Give a 90% confidence interval for the mean number of ounces. Interpret this interval.

  25. 4) The average stay in days for nongovernmental not-for-profit hospitals is given to be 7.2 days. A sample of 50 such hospitals was selected to test the hypothesis that the average stay is different from the national average. The data collected is below. Is this sufficient evidence to reject the null hypothesis? Use α = 0.01.
