Hypothesis Testing

A hypothesis is a claim or statement about a property of a population (in our case, about the mean or a proportion of the population). A hypothesis test (or test of significance) is a standard procedure for testing such a claim. It is extremely important to realize that we are not making definitive conclusions; we are giving probabilistic ones. We conclude either that results like ours could plausibly occur by chance, or that they would be unlikely to occur by chance.

Examples
• If we flip a coin 100 times and 52 come up heads, this could easily occur by chance. There is not sufficient evidence to suggest that the coin is unfair.
• If we flip a coin 100 times and 75 come up heads, this would be an extremely rare event if the coin were fair. The extremely low probability is evidence that the coin may not be fair.
Note: It would be very sloppy of us to conclude in the second example that the coin is definitely unfair. Although extremely rare, 75 heads is still possible by chance from a fair coin.
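To put numbers on "could easily occur by chance" versus "extremely rare", here is a minimal sketch that computes the exact binomial probability of getting at least a given number of heads from a fair coin. The helper name prob_at_least is ours, not part of the course materials.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) when X counts heads in n flips with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p52 = prob_at_least(52)   # 52 or more heads in 100 fair flips
p75 = prob_at_least(75)   # 75 or more heads in 100 fair flips

print(f"P(at least 52 heads) = {p52:.4f}")
print(f"P(at least 75 heads) = {p75:.2e}")
```

The first probability is large enough that 52 heads is unremarkable. The second is tiny but not zero, which is why we say the coin "may not be fair" rather than "is unfair".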

Another Example
A light bulb is advertised as having a mean life of 1000 hours. From a sample, we find the mean life to be 900 hours. The 95% confidence interval for the population mean is 750 < μ < 1050 hours.
We CANNOT conclude:
• That the actual mean life of light bulbs is 900 hours
• That the advertised life is wrong
• That the advertised life is correct
We CAN conclude: From our sample, we are 95% confident that the population mean is between 750 hours and 1050 hours. Since 1000 hours is included in that interval, we do not have sufficient evidence to say that the advertised life is wrong.

Another approach
Claim: The mean life of light bulbs is less than 1000 hours.
Working assumption: The mean life of light bulbs is 1000 hours.
The sample resulted in a mean life of 900 hours.
Assuming that μ = 1000, the probability that the mean of a sample like ours would be less than 900 is P(x̄ < 900) = 0.0951.
There are two possible explanations for why our sample came out with a mean life of 900 hours: either this occurred by chance (with probability about 9.5%), or the actual mean life of light bulbs is less than 1000 hours. Since the probability (9.5%) isn’t horribly small, we decide that random chance is a reasonable explanation. There isn’t sufficient evidence to support the claim that the mean life of light bulbs is less than 1000 hours.
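The probability 0.0951 can be reproduced with a short sketch. The slides do not give the sample size or standard deviation, so the standard error below is an assumption: it is chosen so that the sample mean sits at z = -1.31, the z-score whose left-tail area is 0.0951.

```python
from statistics import NormalDist

mu_0 = 1000                   # mean under the working assumption
sample_mean = 900             # mean observed in our sample
standard_error = 100 / 1.31   # ASSUMPTION: not stated in the slides; gives z = -1.31

# How many standard errors the sample mean lies below the assumed mean.
z = (sample_mean - mu_0) / standard_error

# Left-tail area of the standard normal distribution.
p = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(sample mean < 900) = {p:.4f}")
```

With a different standard error the probability would change, and so might the conclusion.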

Formal Hypothesis Testing: The brief process
• Convert your claim into symbolic null and alternative hypotheses
• Calculate a test statistic
• Compare the test statistic to critical values OR find a probability (P-value)
• Write a conclusion

Components of a Formal Hypothesis Test
• The null hypothesis (denoted H0) is a statement that the value of a population parameter (such as a proportion or mean) is equal to some claimed value.
• The alternative hypothesis (denoted H1 or Ha) is a statement that the value of a population parameter somehow differs from the value in the null hypothesis. Its symbolic form must be a >, < or ≠ statement.

We will be testing the null hypothesis directly (by assuming it’s true) to reach one of two conclusions: reject H0 or fail to reject H0. Note: We cannot support a claim that a parameter is equal to a value. So, the null hypothesis must always include equality, and the alternative hypothesis must be an inequality (>, < or ≠).

Process
• Identify the claim to be tested and express it in symbolic form.
• Give the symbolic form that must be true when the original claim is false.
• Pick the one not including equality to be H1, and let the null hypothesis be that the parameter equals the value being considered.

Example
Claim: The mean IQ of statistics students is greater than 110.
Symbolic form: μ > 110
Opposite: μ ≤ 110
H0: μ = 110
H1: μ > 110
Note: While your claim will often be the alternative hypothesis, it won’t always be.
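The three-step process for writing H0 and H1 can be sketched as a small function. The function name hypotheses and the plain-text operators (">=", "!=", and so on) are illustrative choices of ours, not notation from the course.

```python
def hypotheses(symbol, claim_op, value):
    """Return (H0, H1, which one the claim became) for a claim such as 'mu > 110'."""
    # Step 2: the statement that must be true when the claim is false.
    opposite = {">": "<=", "<": ">=", "!=": "=", "=": "!=", ">=": "<", "<=": ">"}
    candidates = (claim_op, opposite[claim_op])

    # Step 3: H1 is whichever statement does not include equality.
    alt_op = next(op for op in candidates if op in (">", "<", "!="))

    h0 = f"{symbol} = {value}"
    h1 = f"{symbol} {alt_op} {value}"
    claim_became = "H1" if alt_op == claim_op else "H0"
    return h0, h1, claim_became

print(hypotheses("mu", ">", 110))   # the IQ example: the claim becomes H1
print(hypotheses("p", "<=", 0.5))   # a claim with equality becomes H0
```

The second call shows the case mentioned in the note: when the claim itself includes equality, the claim ends up as the null hypothesis.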

Test Statistics
A test statistic is a value computed from the sample data, used in deciding whether or not to reject the null hypothesis.
• Z value for a proportion
• Z value for a mean (sigma known)
• T value for a mean (sigma unknown)
The test statistic indicates how far our sample deviates from the assumed population parameter.
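The formulas for these three statistics are not reproduced in the text above. The sketch below uses the standard textbook forms, which we assume are the ones intended; the function and parameter names are ours.

```python
from math import sqrt

def z_proportion(p_hat, p0, n):
    """Z value for a proportion: sample proportion p_hat vs. assumed proportion p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def z_mean(x_bar, mu0, sigma, n):
    """Z value for a mean when the population standard deviation sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

def t_mean(x_bar, mu0, s, n):
    """T value for a mean when sigma is unknown; s is the sample standard deviation.
    Compare against a t distribution with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n))

# The coin example: 75 heads in 100 flips, tested against p = 0.5.
z_coin = z_proportion(0.75, 0.5, 100)
print(f"z = {z_coin:.2f}")
```

Each statistic has the same shape: (sample value minus assumed value) divided by the standard error, so it measures the deviation in units of standard errors.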

Critical region and significance
• The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis.
• The significance level (α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. Common values are 0.01, 0.05 and 0.10.
• A critical value is any value that separates the critical region from the values of the test statistic that would not cause us to reject the null hypothesis.

Example
Using a significance level of α = 0.05, let's find the critical value for each of these alternative hypotheses:
• p ≠ 0.5: The critical region is in both tails of the normal distribution. Using the same method we used in chapter 6, we find the critical values to be z = -1.96 and z = 1.96.
• p < 0.5: The critical region is in the left tail of the normal distribution. Using the methods from 5.2, we find c so that P(z < c) = 0.05. The critical value is -1.645.
• p > 0.5: The critical region is in the right tail of the normal distribution. Using the methods from 5.2, we find c so that P(z < c) = 0.95. The critical value is 1.645.
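These critical values can be checked with the inverse of the standard normal cumulative distribution. This is a minimal sketch using Python's standard library rather than the table lookup from the course; the variable names are ours.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()   # mean 0, standard deviation 1

# Two-tailed (p != 0.5): alpha is split evenly between the two tails.
two_tailed = (std_normal.inv_cdf(alpha / 2), std_normal.inv_cdf(1 - alpha / 2))

# Left-tailed (p < 0.5): find c with P(z < c) = alpha.
left_tailed = std_normal.inv_cdf(alpha)

# Right-tailed (p > 0.5): find c with P(z < c) = 1 - alpha.
right_tailed = std_normal.inv_cdf(1 - alpha)

print(f"two-tailed:   {two_tailed[0]:.3f} and {two_tailed[1]:.3f}")
print(f"left-tailed:  {left_tailed:.3f}")
print(f"right-tailed: {right_tailed:.3f}")
```

Changing alpha to 0.01 or 0.10 gives the critical values for the other common significance levels.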

P-Value
The P-value is the probability, assuming the null hypothesis is true, of getting a value of the test statistic that is at least as extreme as the one obtained from the sample data. If the P-value is very small (such as less than 0.05), we will reject the null hypothesis. See the pullout for help on how to calculate the P-value. The exact process depends on your alternative hypothesis.
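As a rough illustration of how the calculation depends on the alternative hypothesis, here is a sketch for a z test statistic. It is not a substitute for the pullout; the function name p_value and the labels "less", "greater" and "two-sided" are our own.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z test statistic.

    alternative: "less" (H1 uses <), "greater" (H1 uses >), or "two-sided" (H1 uses ≠).
    """
    cdf = NormalDist().cdf
    if alternative == "less":
        return cdf(z)                    # area to the left of z
    if alternative == "greater":
        return 1 - cdf(z)                # area to the right of z
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))     # both tails beyond |z|
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

print(f"{p_value(-1.31, 'less'):.4f}")
print(f"{p_value(1.96, 'two-sided'):.4f}")
```

The same test statistic can give different P-values under different alternatives, which is why H1 must be fixed before looking at the data.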

Decisions and Conclusions
Our final conclusion will always be one of these:
• Reject the null hypothesis
• Fail to reject the null hypothesis
Traditional method
• Reject H0 if the test statistic falls within the critical region
• Otherwise, fail to reject H0

Decisions and Conclusions
P-value method
• Reject H0 if P-value ≤ α
• Fail to reject H0 if P-value > α
Less common methods
• Find the P-value, and leave the conclusion to the reader
• Look at whether the claimed value of the population parameter falls within the confidence interval estimate
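The traditional method and the P-value method always lead to the same decision. The sketch below shows this for a left-tailed z test only (an assumption made to keep the example short); the function name decide_left_tailed is ours. The value z = -1.31 is the z-score whose left-tail area is 0.0951, the probability from the light bulb example.

```python
from statistics import NormalDist

def decide_left_tailed(z, alpha=0.05):
    """Return the decision from the traditional method and from the P-value method."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(alpha)   # left edge of the non-rejection values
    p_value = std_normal.cdf(z)                  # area to the left of z

    # Traditional method: is the test statistic inside the critical region?
    traditional = "reject H0" if z <= critical_value else "fail to reject H0"
    # P-value method: is the P-value at most alpha?
    p_method = "reject H0" if p_value <= alpha else "fail to reject H0"
    return traditional, p_method

print(decide_left_tailed(-1.31))   # light bulb example
print(decide_left_tailed(-2.50))   # a more extreme test statistic
```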

Final Wording
If your original claim contains equality (became H0):
• Reject H0: “There is sufficient evidence to warrant rejection of the claim that…”
• Fail to reject H0: “There is not sufficient evidence to warrant rejection of the claim that…”
If your original claim does not contain equality (became H1):
• Reject H0: “The sample data support the claim that…”
• Fail to reject H0: “There is not sufficient sample evidence to support the claim that…”
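The four wordings form a two-by-two lookup: whether the claim contains equality, and whether H0 was rejected. As a small sketch (the function name final_wording and its parameters are ours):

```python
def final_wording(claim, claim_has_equality, reject_h0):
    """Pick the concluding sentence for a hypothesis test."""
    if claim_has_equality:
        # The claim became H0, so we can only reject it or fail to reject it.
        if reject_h0:
            return f"There is sufficient evidence to warrant rejection of the claim that {claim}."
        return f"There is not sufficient evidence to warrant rejection of the claim that {claim}."
    # The claim became H1, so rejecting H0 supports the claim.
    if reject_h0:
        return f"The sample data support the claim that {claim}."
    return f"There is not sufficient sample evidence to support the claim that {claim}."

print(final_wording("the mean life of light bulbs is less than 1000 hours",
                    claim_has_equality=False, reject_h0=False))
```

None of the four sentences says that a claim has been proven true or false, in keeping with the probabilistic nature of the conclusions.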

Homework
7-2: 1-35, every other odd. Every odd recommended.
