Testing Hypotheses About Proportions

Testing Hypotheses About Proportions Chapter 20

Motivating Example Anyone who plays or watches sports has heard of the “home field advantage”. Teams tend to win more often when they play at home. Or do they? In the 2003 Major League Baseball season, there were 2,429 regular season games. It turns out that the home team won in 1,335 of the 2,429 games, or 54.96% of the time. Could this deviation from 50% be explained just from natural sampling variability, or is this evidence to suggest that there really is a home field advantage?

Rare Event Rule for Inferential Statistics • If, under a given assumption, the probability of a particular observed event is exceptionally small, we conclude that the assumption is probably not correct. • Application: • If our assumption is that there is no home field advantage, what is the probability that the home team wins 54.96% or more of the games? • A small probability would indicate that the result is unlikely if the true home field winning percentage is 50%. In that case, we would reject the idea of no home field advantage.
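The probability asked for in the application above can be sketched with an exact binomial count. This is a swapped-in technique for illustration only (the slides use a Normal model), and it assumes the 2,429 games behave like independent coin flips when there is no home field advantage. The variable names are hypothetical.

```python
from math import comb

# Figures quoted in the motivating example (2003 MLB season).
n_games = 2429
home_wins = 1335

# Assumption: under p = 0.50 each game is an independent fair coin flip,
# so every win/loss sequence is equally likely and the tail probability is
# (number of sequences with at least 1335 home wins) / 2**n_games.
favourable = sum(comb(n_games, k) for k in range(home_wins, n_games + 1))
tail_probability = favourable / 2**n_games

print(f"P(at least {home_wins} home wins | p = 0.50) = {tail_probability:.2e}")
```

A result this small is what the Rare Event Rule treats as evidence against the assumption of no home field advantage.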

Other Similar Types of Questions • Has the president’s approval rating changed since last month? • Did the Super Bowl Ad we bought actually increase sales? • Do a majority of Americans run red lights? • Is the global temperature increasing? • Is a new cancer treatment effective? To answer such questions, we test hypotheses about models.

Overview of the Process • Make a hypothesis that proposes a model for the world • Look at the data from the sample • If the facts are consistent with the model, there is no reason to disbelieve the hypothesis. However, we cannot conclude the hypothesis is absolutely true either. • What if the facts are inconsistent with the model? If they are only slightly out of step, we might stick with the model. However, if the data dramatically contradict the model, we have strong evidence that the model is incorrect.

The Reasoning of Hypothesis Testing • There are four basic steps to a hypothesis test: • 1. Hypotheses • 2. Model • 3. Mechanics • 4. Conclusion • Let’s look at these steps in detail…

Step 1: Identify the Hypotheses • A hypothesis is a claim or statement about the value of a population parameter. • There are two types of hypotheses: • Null hypothesis (Ho) specifies a population model parameter of interest and proposes a value for that parameter. • Usually states that the population parameter is equal to some value. • Ho: parameter = hypothesized value. • Our example: • Ho: p = 0.50 • In words – the home team wins 50% of all baseball games (there is no home field advantage).

Step 1: Identify the Hypotheses (cont.) • Two types of hypotheses (cont.) • Alternative Hypothesis (Ha) - contains the values of the parameter we accept if we reject the null. • The claim or research question of interest almost always becomes the alternative hypothesis. • Possible options for the alternative hypothesis if the Ho is p = 0.50 include: • Ha: p > 0.50 • Ha: p < 0.50 • Ha: p ≠ 0.50

Process for Identifying the Hypotheses 1. Identify the specific claim or hypothesis to be tested and express it in symbolic form. 2. Give the symbolic form that must be true when the original claim is false. 3. Let the alternative hypothesis be the one that does not contain the condition of equality. Let the null hypothesis be the one that contains the condition of equality.

Practice Identifying Hypotheses • Example 1: The proportion of drivers who admit to running red lights is greater than 0.5. • In Step 1, we express the given claim as p > 0.5. • In Step 2, we see that if p > 0.5 is false, then p <= 0.5 must be true. • In Step 3, we see that the expression p > 0.5 does not contain equality, so we let the alternative Ha be p > 0.5, and we let Ho be p = 0.5.

Practice Identifying Hypotheses (cont.) Example 2: The percentage of men who watch golf on TV is not 70%, as is claimed by the Madison Advertising Company. Example 3: In a Gallup poll of 1012 randomly selected adults, 9% said that cloning of humans should be allowed. Use a 0.05 significance level to test the claim that less than 10% of all adults say that cloning of humans should be allowed.

Practice Identifying Hypotheses (cont.) • Back to our Baseball Example: • Our question of interest is whether or not there is a home field advantage. • Ho: p = 0.50 • Ha: p > 0.50 • We are interested only in a home field advantage, so the alternative hypothesis is one-sided. • If there is no advantage, we’d expect the proportion to be 0.50.

Step 2: Model • We must state the assumption and check the corresponding conditions to determine whether we can model the sampling distribution of the proportion with a Normal model. • Conditions to check: (same conditions as used for C.I.) • 1. Random Sampling Condition • 2. 10% Condition • 3. Success/Failure Condition

Step 2: Model (cont.) We test the hypothesis Ho: p = po using the statistic $z = \frac{\hat{p} - p_0}{SD(\hat{p})}$, where $SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$. When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.

Step 2: Model (cont.) Back to our Baseball Example: Random Sampling Condition - We have results for all 2429 games of the 2003 season. We are interested in more than 2003, and those games, while not randomly selected, are a reasonable representative sample of all recent professional baseball games. 10% Condition - These 2429 games are fewer than 10% of all games played over the years. Success/Failure Condition - Both npo = 2429(0.50) = 1214.5 and nqo = 2429(0.50) = 1214.5 are greater than 10. Because the conditions are satisfied, we’ll use a Normal model for the sampling distribution of the proportion and do a one-proportion z-test.
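The Success/Failure arithmetic above can be checked with a short sketch. The helper name `success_failure_ok` and its `minimum` parameter are hypothetical, and the cutoff is coded as "at least 10", a common statement of the rule. The Random Sampling and 10% conditions are judgment calls about how the data were collected, so they are not computed here.

```python
def success_failure_ok(n, p0, minimum=10):
    """Return True when both expected counts, n*p0 and n*q0, are at least `minimum`."""
    q0 = 1 - p0
    return n * p0 >= minimum and n * q0 >= minimum


# Baseball example: n = 2429 games, hypothesized proportion p0 = 0.50.
n_games = 2429
p0 = 0.50

expected_successes = n_games * p0        # n * p0
expected_failures = n_games * (1 - p0)   # n * q0

print(f"n*p0 = {expected_successes}, n*q0 = {expected_failures}")
print("Success/Failure Condition met:", success_failure_ok(n_games, p0))
```

With a small sample, for instance 15 observations at p0 = 0.50, the same helper reports that the condition fails, and the Normal model would not be appropriate.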

Step 3: Mechanics Perform the actual calculations of the test statistic given the information in the scenario. The ultimate goal of the calculation is to find a P-value. P-value - the probability that the observed statistic value (or an even more extreme value) could occur if the null hypothesis were correct. The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.
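What counts as "more extreme" depends on the direction of the alternative hypothesis. A minimal sketch, assuming the test statistic follows the standard Normal model: the function name `p_value_from_z` and the labels "greater", "less" and "two-sided" are hypothetical choices for this illustration.

```python
from math import erfc, sqrt


def p_value_from_z(z, alternative):
    """Standard Normal tail probability for a z statistic.

    alternative: "greater" (Ha: p > p0), "less" (Ha: p < p0),
                 or "two-sided" (Ha: p != p0).
    """
    upper_tail = 0.5 * erfc(z / sqrt(2))    # P(Z >= z)
    lower_tail = 0.5 * erfc(-z / sqrt(2))   # P(Z <= z)
    if alternative == "greater":
        return upper_tail
    if alternative == "less":
        return lower_tail
    if alternative == "two-sided":
        # Both tails count as "more extreme", so double the smaller one.
        return 2 * min(upper_tail, lower_tail)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# The same z statistic gives different P-values under different alternatives.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_from_z(1.96, alt), 4))
```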

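Step 3: Mechanics (cont.) Back to our Baseball example: Our model is a Normal model with a mean of 0.50 and a standard deviation of $SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{(0.50)(0.50)}{2429}} \approx 0.01015$. The observed proportion, $\hat{p}$, is 0.5496. So the z-value is $z = \frac{0.5496 - 0.50}{0.01015} \approx 4.89$. The corresponding P-value is < 0.0001.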
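The baseball mechanics can be reproduced in a few lines. This is a sketch using only the counts given earlier (1,335 home wins in 2,429 games) and Python's standard library. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n_games = 2429
home_wins = 1335
p0 = 0.50                                   # proportion under Ho

p_hat = home_wins / n_games                 # observed proportion
sd_p_hat = sqrt(p0 * (1 - p0) / n_games)    # SD of p-hat, computed from p0
z = (p_hat - p0) / sd_p_hat                 # one-proportion z statistic

# One-sided alternative Ha: p > 0.50, so the P-value is the upper tail.
# By symmetry of the Normal model, P(Z >= z) equals cdf(-z).
p_value = NormalDist().cdf(-z)

print(f"p_hat = {p_hat:.4f}, SD = {sd_p_hat:.5f}, z = {z:.2f}")
print(f"P-value = {p_value:.2e}")
```

Note that the standard deviation uses the hypothesized value p0, not the observed proportion, because the calculation assumes the null hypothesis is true.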

Interpreting the P-Value Recall the P-value = P(observed statistic value (or something more extreme) | Ho). Common error: The P-value is not the probability that the null hypothesis is true. All we can say is that, given the null hypothesis, there is less than a 0.01 percent chance of observing a home field winning percentage of 54.96% or larger. The null is rejected if the P-value is very small, say less than 0.05. However, other common threshold values include 0.10 and 0.01. We refer to these values as significance levels.

Step 4: Conclusion • We must clearly explain what we have learned about the original research question (or claim) of interest. • Our conclusion should include these three elements: • 1. State the decision about the null hypothesis. • 2. Link the P-value to the decision. • 3. Interpret that decision in the proper context. • Because the P-value (< 0.0001) is less than 0.05, we reject the null hypothesis. The sample data support the claim that the home team wins more than 50% of the time and that there is a home field advantage.

Confidence Interval as a Follow-up How big of a difference are we talking about? Find a 95% confidence interval for home field advantage. Critical value (z*) = 1.96. Margin of Error is E = z* × SE(phat) = 1.96 × 0.0101 = 0.0198. Confidence interval is 0.5496 ± 0.0198, or (0.5298, 0.5694). ***We are 95% confident that the interval from 53.0% to 56.9% contains the true winning percentage for home teams.
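The interval above can be reproduced as follows. This is a sketch assuming the same one-proportion Normal model. Unlike the test statistic, the standard error here is computed from the observed proportion rather than from the hypothesized 0.50. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n_games = 2429
p_hat = 1335 / n_games                      # observed home winning proportion
confidence = 0.95

# Critical value z* leaves (1 - confidence)/2 in each tail; about 1.96 for 95%.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

se_p_hat = sqrt(p_hat * (1 - p_hat) / n_games)   # SE uses p_hat, not p0
margin = z_star * se_p_hat                        # margin of error E
lower, upper = p_hat - margin, p_hat + margin

print(f"z* = {z_star:.2f}, SE = {se_p_hat:.4f}, E = {margin:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Because the whole interval lies above 0.50, it agrees with the decision to reject Ho in the one-sided test.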

Example - Travel Through the Internet Among 800 randomly selected Internet users, it was found that 360 of them use the Internet for making travel plans. Use a 1% significance level to test the claim that among Internet users, less than 45% use it for making travel plans. A.) Identify the null and alternative hypotheses. B.) Sketch the model. C.) What is the test statistic? D.) Calculate and interpret the P-value. E.) What is the conclusion?

Example – Car Accidents In a study of 11,000 car accidents, it was found that 5720 of them occurred within 5 miles of home (based on data from Progressive Insurance). Use a 10% significance level to test the claim that 51% of car accidents occur within 5 miles of home.

Example - Super Bowl Ads A start-up company is about to market a new computer printer. It decides to gamble by running commercials during the Super Bowl. The company hopes that name recognition will be worth the high cost of the ads. The goal of the company is that at least 40% of the public recognize its brand name and associate it with computer equipment. The day after the game, a pollster contacts 420 randomly chosen adults and finds that 181 of them know that this company manufactures printers. Would you recommend that the company continue to advertise during Super Bowls? Explain.

Example - Jury Selection Census data for a certain county shows that 19% of the adult residents are Hispanic. Suppose 72 people are called for jury duty, and only 9 of them are Hispanic. Does this call into question the fairness of the jury selection system? Explain.

Assignment • Read Chapter 21 - More About Tests • Try the following problems from Chapter 20: • # 1, 3, 9, 11, 21, 23, and 25
