Chapter 21 – More About Tests
Review of Hypothesis Testing • Claim made about population proportion • Set up H0 and HA • Choose model and check conditions • One-proportion z-test • Draw Normal curve and find P-value • State conclusion in context of problem
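The steps above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name, the `alternative` labels, and the sample counts are made up for the sketch, and the "at least 10 expected successes and failures" check is one common version of the conditions.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (p_hat, z, p_value) for a one-proportion z-test.

    alternative is "less", "greater", or "two-sided" (labels chosen for this sketch).
    """
    # Success/failure condition, checked under the null value p0.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("expected successes and failures should both be at least 10")

    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)   # standard deviation of p_hat if H0 is true
    z = (p_hat - p0) / sd

    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return p_hat, z, p_value


# Made-up data: 60 successes in 100 trials, testing H0: p = 0.5 against HA: p > 0.5.
p_hat, z, p_value = one_proportion_z_test(60, 100, 0.5, alternative="greater")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

With these made-up numbers the z-score is 2 and the P-value is roughly 0.023, so the conclusion in context would depend on the alpha level chosen.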
More on the Null Hypothesis • The null hypothesis is always a statement about a population parameter (in this chapter, the population proportion). • Since we can never really prove the null hypothesis, it is usually better to make what you want to show the alternative hypothesis. • Then you can reject the null hypothesis in favor of the alternative hypothesis.
More on P-values • The P-value is a conditional probability: • The probability that we will get results at least as unusual as the ones we saw, given that the null hypothesis is true. • The lower the P-value, the more confident we are in rejecting the null hypothesis. • A high P-value means we aren’t surprised by what we observed. • It doesn’t prove the null hypothesis, but gives no reason to reject it. • Essentially, the P-value tells us how likely findings this extreme would be if only random sampling variation (chance) were at work. It is not the probability that the null hypothesis is true.
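To make "at least as unusual, given that the null hypothesis is true" concrete, here is a small sketch that adds up exact binomial probabilities instead of using the Normal model. The coin-toss numbers are made up for illustration, and the helper name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """P(X >= k) when X ~ Binomial(n, p0), i.e. when the null value p0 is true."""
    return sum(comb(n, x) * p0**x * (1 - p0) ** (n - x) for x in range(k, n + 1))


# Made-up data: 60 heads in 100 tosses, with H0: p = 0.5 and HA: p > 0.5.
p_value = binomial_upper_tail(60, 100, 0.5)
print(f"P(at least 60 heads in 100 tosses | p = 0.5) = {p_value:.4f}")
```

The result is a probability about the data under the assumption that H0 holds. It says nothing directly about the probability that H0 itself is true.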
Example: Reading Program • A new reading program may reduce the number of elementary school students who read below grade level. Statistical analysis of the results of a large-scale test showed that the percentage of students who did not attain the grade-level standard was reduced from 15.9% to 15.1%. The hypothesis that the new reading program produced no improvement was rejected with a P-value of 0.023. • Explain what the P-value means. • If the true percentage of children who did not attain the grade-level standard is still 15.9%, there is only a 2.3% probability of seeing a sample proportion as low as 15.1% through natural sampling variation. • Would you recommend the reading program to your local school? (Example from DeVeaux, Intro to Stats)
Alpha Levels • We have wondered how low a P-value needs to be before we decide to reject the null hypothesis • We can use an alpha level to set a threshold on our P-value • The alpha level is also called the significance level • If our P-value is less than our alpha level, we will reject the null hypothesis • We would then say that the results are statistically significant
More on Alpha Levels • Alpha levels are represented using the symbol α • Typically we use α = 0.1, 0.05, or 0.01 • When in doubt, we use α = 0.05 • Partially depends on importance of claim being made • The more important the claim or higher the stakes, the lower an alpha level you would use
Statistically Significant • When we get a P-value below our alpha level (let’s assume 0.05), we can say “we reject the null hypothesis at the 5% level of significance” • Sometimes, statistical significance doesn’t mean the difference is important in the context of the situation • On the other hand, sometimes a practically important difference may turn out not to be statistically significant • Sometimes a larger sample size can rectify this
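The effect of sample size on statistical significance can be sketched with the proportions from the reading program example (15.9% and 15.1%). The sample sizes below are hypothetical, since the slides do not say how many students were tested, and the function name is invented for the sketch.

```python
from math import sqrt
from statistics import NormalDist

P0 = 0.159      # proportion below grade level before the program (from the example)
P_HAT = 0.151   # observed proportion with the program (from the example)


def lower_tail_p_value(p_hat, p0, n):
    """One-sided P-value for HA: p < p0, using the Normal model."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return NormalDist().cdf(z)


# Hypothetical sample sizes, chosen only to show the trend.
for n in (1_000, 10_000, 100_000):
    print(f"n = {n:>7}: P-value = {lower_tail_p_value(P_HAT, P0, n):.4f}")
```

The same 0.8 percentage point difference is not significant at α = 0.05 with the smallest sample but is with the larger ones. Whether a difference that small matters in practice is a separate question that the P-value cannot answer.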
One-sided Alternative Hypothesis • We might have a one-sided or two-sided alternative hypothesis • [Figure: one-sided alternative; from DeVeaux, Intro to Stats]
Two-sided Alternative Hypothesis • Notice we have α/2 in each tail • [Figure: two-sided alternative; from DeVeaux, Intro to Stats]
Critical Values for Hypothesis Testing • Just as we used critical values in confidence intervals, we will use them with alpha levels. • We could always get these from our z-table, but they are used so commonly that they will be provided from here on out. • If our z-score is more extreme than the critical value, then our P-value will be smaller than our alpha level. (Table from DeVeaux, Intro to Stats)
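The commonly used critical values can also be computed directly from the standard Normal model rather than read from a z-table. The sketch below uses Python's standard library. The function name and the particular alpha levels are choices made for this illustration, not a reproduction of the textbook table.

```python
from statistics import NormalDist

std_normal = NormalDist()


def critical_value(alpha, two_sided):
    """z* that leaves alpha in one tail, or alpha/2 in each of two tails."""
    tail_area = alpha / 2 if two_sided else alpha
    return std_normal.inv_cdf(1 - tail_area)


print("alpha   1-sided  2-sided")
for alpha in (0.05, 0.01, 0.001):
    one = critical_value(alpha, two_sided=False)
    two = critical_value(alpha, two_sided=True)
    print(f"{alpha:<7} {one:7.3f}  {two:7.3f}")
```

For α = 0.05 this gives the familiar 1.645 (one-sided) and 1.96 (two-sided). For a lower-tail alternative, the critical value is the negative of the one-sided value.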
Confidence Intervals and Hypothesis Tests • Confidence intervals and hypothesis tests are built on the same calculations with the same assumptions and conditions. • Our conclusion about the null should be consistent with whether or not the proportion in the claim falls within the confidence interval. • A 95% confidence interval corresponds to a two-sided hypothesis test with α = 5%.
Example: Is Euro a fair coin? • Soon after the Euro was introduced as currency in Europe, it was widely reported that someone had spun a Euro 250 times and gotten heads 140 times. • Estimate the true proportion of heads using a 95% confidence interval. (remember to check conditions) • Does your confidence interval provide evidence that the coin is unfair when spun? Explain. • What is the significance level? Example from DeVeaux, Intro to Stats
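One way to check the interval for this example is the short sketch below. It uses the counts from the slide (140 heads in 250 spins) and the usual one-proportion z-interval. The variable names are arbitrary, and the independence and randomization conditions still have to be argued in words.

```python
from math import sqrt
from statistics import NormalDist

heads, spins = 140, 250          # counts from the example
confidence = 0.95

# Success/failure condition: at least 10 heads and at least 10 tails observed.
if heads < 10 or spins - heads < 10:
    raise ValueError("success/failure condition not met")

p_hat = heads / spins
se = sqrt(p_hat * (1 - p_hat) / spins)                  # standard error of p_hat
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
lower, upper = p_hat - z_star * se, p_hat + z_star * se

print(f"p_hat = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("0.5 inside interval:", lower <= 0.5 <= upper)
```

Whether 0.5 falls inside the interval is what connects the confidence interval to a two-sided test at α = 0.05.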
Errors in Hypothesis Testing • Even with our careful analysis and lots of evidence, we can make an incorrect decision. • Two ways we can make mistakes with hypothesis testing: • Type I: the null hypothesis is true, but we reject it • Type II: the null hypothesis is false, but we fail to reject it • Which error is more serious depends on the situation.
Type I Error • In medical terms, this would be a false positive • A healthy person is diagnosed with a disease incorrectly • Penalty for mistake? • In jury terms, this would mean an innocent person is convicted • Penalty for mistake? • Setting α determines the probability of a Type I Error
Type II Error • In medical terms, this would be a false negative • An infected person goes undiagnosed • Penalty for mistake? • In jury terms, this would mean a guilty person is not convicted • Penalty for mistake? • It is much more difficult to determine the probability of a Type II Error (designated by β)
Example: Spam Filter • Suppose a spam filter uses a point system to score each email based on sender, subject, and keywords. The higher the point total, the more likely that the message is spam. • We can think of the filter’s decision as a hypothesis test. The null hypothesis is that the email is a real message. A high point score would be evidence that it is junk, so the filter will reject the null hypothesis and classify the message as spam. • When the filter allows spam to slip through into your inbox, which kind of error is this? • Which kind of error is it when a real message gets classified as junk? • If the filter has a default cutoff score of 50, but you reset it to 60, is that analogous to choosing a higher or lower value of α for a hypothesis test?
Reducing Errors • We can reduce α to lower the chance of a Type I Error, but that will have the effect of raising β. • The only way to reduce both Type I and Type II errors simultaneously is to increase our sample size, which reduces the standard deviation of the sampling distribution.
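The trade-off between α and β, and the effect of sample size, can be sketched numerically. Computing β requires assuming a specific true proportion, which is part of why it is harder to pin down than α. Everything below (the function name, the true value 0.56, and the sample sizes) is a hypothetical scenario using the Normal model for a one-sided test with HA: p > p0.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def type_ii_error(p0, p_true, n, alpha):
    """Approximate beta for H0: p = p0 vs HA: p > p0 when the truth is p_true."""
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Beta is the chance p_hat falls at or below that cutoff when p_true is real.
    sd_true = sqrt(p_true * (1 - p_true) / n)
    return std_normal.cdf((cutoff - p_true) / sd_true)


# Hypothetical scenario: H0 says p = 0.50, but the true proportion is 0.56.
for n in (250, 1000):
    for alpha in (0.05, 0.01):
        beta = type_ii_error(0.50, 0.56, n, alpha)
        print(f"n = {n:4}, alpha = {alpha:.2f}: beta = {beta:.3f}")
```

In this scenario, lowering α from 0.05 to 0.01 at a fixed n raises β, while increasing n from 250 to 1000 lowers β at either alpha level.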
Hypothesis Testing in Minitab Choose Stat > Basic Statistics > 1 Proportion
Hypothesis Testing in Minitab (cont’d) • Fill in the text fields: number of successes, sample size, and the proportion in the claim • Check the Perform Hypothesis Test checkbox (leaving it unchecked gives a confidence interval only)
Hypothesis Testing in Minitab (cont’d) • Click the Options button • Set the confidence level • Choose the alternative hypothesis (not equal gives a CI) • Check the checkbox marked in the dialog screenshot
Hypothesis Testing in Minitab (cont’d) • Click OK and then OK again • You will see results in session window: • Sample proportion • Z-Value • P-Value