Chapter 10 Hypothesis Testing Using a Single Sample
Sharing prescription drugs with others can be dangerous. Is this a common occurrence among teens? OR The National Association of Colleges and Employers stated that the average starting salary for students graduating with a bachelor’s degree in 2010 is $48,351. Is this true for your college? How do we answer questions like these using sample data? In Chapter 9, we used sample data to estimate the value of an unknown population characteristic. In this chapter, we will use sample data to test some claim or hypothesis about the population characteristic to see if it is plausible. To do this, we use a test of hypotheses or test procedure.
What is a test of hypotheses? A test of hypotheses is a method that uses sample data to decide between two competing claims (hypotheses) about the population characteristic. Is the value of the sample statistic . . . • a random occurrence due to natural variation? Is it one of the values of the sample statistic that are likely to occur? OR • a value that would be considered surprising? Is it one that isn’t likely to occur?
Hypothesis statements: You are usually trying to determine if this claim is believable. The null hypothesis, denoted by H0, is a claim about a population characteristic that is initially assumed to be true. The alternative hypothesis, denoted by Ha, is the competing claim. The hypothesis statements are ALWAYS about the population – NEVER about a sample! To determine what the alternative hypothesis should be, you need to keep the research objectives in mind.
Let’s consider a murder trial . . . What is the null hypothesis? H0: the defendant is innocent. This is what you assume is true before you begin. What is the alternative hypothesis? Ha: the defendant is guilty. You are trying to determine if the evidence supports this claim. To determine which hypothesis is correct, the jury will listen to the evidence. Only if there is “evidence beyond a reasonable doubt” would the null hypothesis be rejected in favor of the alternative hypothesis. If there is not convincing evidence, then we would “fail to reject” the null hypothesis. So we will make one of two decisions: • Reject the null hypothesis • Fail to reject the null hypothesis. Remember that the actual verdict that is returned is “GUILTY” or “NOT GUILTY”. We never end up determining the null hypothesis is true – only that there is not enough evidence to say it’s not true.
The Form of Hypotheses: Null hypothesis H0: population characteristic = hypothesized value. The null hypothesis always includes the equal case. The hypothesized value is a specific number determined by the context of the problem. Alternative hypothesis Ha: population characteristic > hypothesized value, or Ha: population characteristic < hypothesized value, or Ha: population characteristic ≠ hypothesized value. The sign is determined by the context of the problem. Notice that the alternative hypothesis uses the same population characteristic and the same hypothesized value as the null hypothesis. The > and < forms are considered one-tailed tests because you are only interested in one direction. The ≠ form is considered a two-tailed test because you are interested in both directions. Let’s practice writing hypothesis statements.
Sharing prescription drugs with others can be dangerous. A survey of a representative sample of 592 U.S. teens age 12 to 17 reported that 118 of those surveyed admitted to having shared a prescription drug with a friend. Is this sufficient evidence that more than 10% of teens have shared prescription medication with friends? What is the population characteristic of interest? The true proportion p of teens who have shared prescription medication with friends. What is the hypothesized value? What words indicate the direction of the alternative hypothesis? State the hypotheses: H0: p = .1 Ha: p > .1
Compact fluorescent (cfl) lightbulbs are much more energy efficient than regular incandescent lightbulbs. Ecobulb brand 60-watt cfl lightbulbs state on the package “Average life 8000 hours”. People who purchase this brand would be unhappy if the bulbs lasted less than 8000 hours. A sample of these bulbs will be selected and tested. What is the population characteristic of interest? The true mean (μ) life of the cfl lightbulbs. What is the hypothesized value? What words indicate the direction of the alternative hypothesis? State the hypotheses: H0: μ = 8000 Ha: μ < 8000
Because of variation in the manufacturing process, tennis balls produced by a particular machine do not have the same diameters. Suppose the machine was initially calibrated to achieve the specification of μ = 3 inches. However, the manager is now concerned that the diameters no longer conform to this specification. If the mean diameter is not 3 inches, production will have to be halted. What is the population characteristic of interest? The true mean μ diameter of tennis balls. What words indicate the direction of the alternative hypothesis? State the hypotheses: H0: μ = 3 Ha: μ ≠ 3
For each pair of hypotheses, indicate which are not legitimate and explain why. Must use a population characteristic: x̄ is a statistic (from a sample)! Must be only greater than! Must use the same number as in H0! H0 MUST be “=”!
When you perform a hypothesis test you make a decision: reject H0 or fail to reject H0. When you make one of these decisions, there is a possibility that you could be wrong! That you made an error! Each could possibly be a wrong decision; therefore, there are two types of errors.
Type I error • The error of rejecting H0 when H0 is true • The probability of a Type I error is denoted by α (the lower-case Greek letter “alpha”). α is called the significance level of the test.
Type II error • The error of failing to reject H0 when H0 is false • The probability of a Type II error is denoted by β (the lower-case Greek letter “beta”).
Here is another way to look at the types of errors:
• Suppose H0 is true and we fail to reject it, what type of decision was made? Correct
• Suppose H0 is false and we reject it, what type of decision was made? Correct
• Suppose H0 is true and we reject it, what type of decision was made? Type I error
• Suppose H0 is false and we fail to reject it, what type of decision was made? Type II error
The U.S. Bureau of Transportation Statistics reports that for 2009, 72% of all domestic passenger flights arrived on time (meaning within 15 minutes of the scheduled arrival time). Suppose that an airline with a poor on-time record decides to offer its employees a bonus if, in an upcoming month, the airline’s proportion of on-time flights exceeds the overall 2009 industry rate of .72. State the hypotheses. H0: p = .72 Ha: p > .72 State a Type I error in context. Type I error – the airline decides to reward the employees when the proportion of on-time flights does not exceed .72. State a Type II error in context. Type II error – the airline employees do not receive the bonus when they deserve it.
In 2004, Vertex Pharmaceuticals, a biotechnology company, issued a press release announcing that it had filed an application with the FDA to begin clinical trials on an experimental drug VX-680 that had been found to reduce the growth rate of pancreatic and colon cancer tumors in animal studies. Data resulting from the planned clinical trials can be used to test: Let μ = the true mean growth rate of tumors for patients taking the experimental drug H0: μ = mean growth rate of tumors for patients not taking the experimental drug Ha: μ < mean growth rate of tumors for patients not taking the experimental drug State a Type I error in the context of this problem. A Type I error would be to incorrectly conclude that the experimental drug is effective in slowing the growth rate of tumors. What is a potential consequence of this error? A potential consequence of making a Type I error would be that the company would continue to devote resources to the development of the drug when it really is not effective.
In 2004, Vertex Pharmaceuticals, a biotechnology company, issued a press release announcing that it had filed an application with the FDA to begin clinical trials on an experimental drug VX-680 that had been found to reduce the growth rate of pancreatic and colon cancer tumors in animal studies. Data resulting from the planned clinical trials can be used to test: H0: μ = mean growth rate of tumors for patients not taking the experimental drug Ha: μ < mean growth rate of tumors for patients not taking the experimental drug State a Type II error in the context of this problem. A Type II error would be to conclude that the drug is ineffective when in fact the mean growth rate of tumors is reduced. What is a potential consequence of this error? A potential consequence of making a Type II error would be that the company might abandon development of a drug that was effective.
The relationship between α and β The ideal test procedure would result in both α = 0 (probability of a Type I error) and β = 0 (probability of a Type II error). This is impossible to achieve since we must base our decision on sample data. Standard test procedures allow us to select α, the significance level of the test, but we have no direct control over β. Selecting a significance level α = .05 results in a test procedure that, used over and over with different samples, rejects a true H0 about 5 times in 100. So why not always choose a small α (like α = .05 or α = .01)?
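The "about 5 times in 100" statement can be checked with a small simulation. The sketch below is only an illustration under assumed settings that are not from the slides: a true H0: p = .5, samples of size n = 100, an upper-tailed z test, 5000 repetitions, and a fixed random seed. Because the number of successes is a whole number, the estimated rate lands near .05 rather than exactly on it.

```python
import random
from statistics import NormalDist

def simulate_type_i_rate(p0=0.5, n=100, alpha=0.05, trials=5000, seed=1):
    """Estimate how often an upper-tailed z test rejects a TRUE H0: p = p0."""
    rng = random.Random(seed)                    # fixed seed (assumption) for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha)     # rejection cutoff for Ha: p > p0
    se = (p0 * (1 - p0) / n) ** 0.5              # standard deviation of p-hat under H0
    rejections = 0
    for _ in range(trials):
        # Draw one sample of size n from a population where H0 really is true.
        successes = sum(rng.random() < p0 for _ in range(n))
        z = (successes / n - p0) / se
        if z > z_crit:                           # rejecting here is a Type I error
            rejections += 1
    return rejections / trials

rate = simulate_type_i_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

Changing `alpha` to .01 in the call should shrink the estimated rate accordingly.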
The relationship between α and β Let’s consider the following hypotheses: H0: p = .5 Ha: p > .5 Let α = .05 Suppose this normal curve, centered at .5, represents the sampling distribution for p̂ when the null hypothesis is true. The upper tail is the part of the curve that represents α, or the Type I error. If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is believed to be greater than .5 – so the curve should really be shifted to the right. The tail of the shifted curve that falls below the rejection cutoff would represent β, the probability of failing to reject a false H0.
The relationship between α and β Let’s consider the following hypotheses: H0: p = .5 Ha: p > .5 Let α = .01 If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is believed to be greater than .5 – so the curve should really be shifted to the right. The tail of the shifted curve that falls below the rejection cutoff would represent β, the probability of failing to reject a false H0. Notice that as α gets smaller, β gets larger!
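To put numbers on the picture, β can be computed once a specific true proportion and sample size are chosen. The slides give neither, so the values below (true p = .6 and n = 100) are purely hypothetical, and the function name is ours. The sketch uses the normal approximation for the sampling distribution of p̂ under both the null and the shifted curve.

```python
from statistics import NormalDist

def beta_upper_tailed(p0, p_true, n, alpha):
    """Probability of failing to reject H0: p = p0 (Ha: p > p0) when p = p_true."""
    std_normal = NormalDist()
    se_null = (p0 * (1 - p0) / n) ** 0.5
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * se_null
    se_true = (p_true * (1 - p_true) / n) ** 0.5
    # Area of the shifted (true) curve that falls below the cutoff.
    return std_normal.cdf((cutoff - p_true) / se_true)

# Hypothetical scenario: true proportion .6, sample size 100.
beta_05 = beta_upper_tailed(0.5, 0.6, 100, alpha=0.05)
beta_01 = beta_upper_tailed(0.5, 0.6, 100, alpha=0.01)
print(f"alpha = .05 -> beta = {beta_05:.3f}")
print(f"alpha = .01 -> beta = {beta_01:.3f}")
```

Under these assumed values the second β is clearly larger than the first, which is the trade-off the shifted curve illustrates.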
How does one decide what α level to use? After assessing the consequences of Type I and Type II errors, identify the largest α that is tolerable for the problem. Then employ a test procedure that uses this maximum acceptable value – rather than anything smaller – as the level of significance. Remember, using a smaller α increases β.
The EPA has adopted what is known as the Lead and Copper Rule, which defines drinking water as unsafe if the concentration of lead is 15 parts per billion (ppb) or greater or if the concentration of copper is 1.3 parts per million (ppm) or greater. The manager of a community water system might use lead level measurements from a sample of water specimens to test the following hypotheses: H0: μ = 15 versus Ha: μ < 15 State a Type I error in context. A Type I error leads to the conclusion that a water source meets EPA standards when the water is really unsafe. What is a consequence of a Type I error? There are possible health risks to the community. State a Type II error in context. A Type II error leads to the conclusion that a water source does NOT meet EPA standards when the water is really safe. What is a consequence of a Type II error? The community might lose a good water source. Which type of error has a more serious consequence? Since most people would consider the consequence of the Type I error more serious, we would want to keep α small – so select a smaller significance level of α = .01.
Large-Sample Hypothesis Test for a Population Proportion The fundamental idea behind hypothesis testing is: We reject H0 if the observed sample is very unlikely to occur if H0 is true.
Recall the General Properties for Sampling Distributions of p̂:
1. $\mu_{\hat{p}} = p$
2. $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$, as long as the sample size is less than 10% of the population
3. When n is large, the sampling distribution of p̂ is approximately normal.
These three properties imply that the standardized variable $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$ has an approximately standard normal distribution when n is large.
In June 2006, an Associated Press survey was conducted to investigate how people use the nutritional information provided on food packages. Interviews were conducted with 1003 randomly selected adult Americans, and each participant was asked a series of questions, including the following two: Question 1: When purchasing packaged food, how often do you check the nutritional labeling on the package? Question 2: How often do you purchase food that is bad for you, even after you’ve checked the nutrition labels? It was reported that 582 responded “frequently” to the question about checking labels and 441 responded “very often” or “somewhat often” to the question about purchasing bad foods even after checking the labels. Based on these data, is it reasonable to conclude that a majority of adult Americans frequently check nutritional labels when purchasing packaged foods?
Nutritional Labels Continued . . . H0: p = .5 Ha: p > .5 where p = true proportion of adult Americans who frequently check nutritional labels. We use p > .5 to test for a majority of adult Americans who frequently check nutritional labels. For this sample: $\hat{p} = 582/1003 = .58$ This observed sample proportion is greater than .5. Is it plausible a sample proportion of p̂ = .58 occurred as a result of chance variation, or is it unusual to observe a sample proportion this large when p = .5? We will create a test statistic using: $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$ A test statistic indicates how many standard deviations the sample statistic (p̂) is from the population characteristic (p).
Nutritional Labels Continued . . . H0: p = .5 Ha: p > .5 where p = true proportion of adult Americans who frequently check nutritional labels. For this sample: $z = \dfrac{.58 - .5}{\sqrt{.5(.5)/1003}} = 5.08$ Next we find the P-value for this test statistic. The P-value is the probability of obtaining a test statistic at least as inconsistent with H0 as was observed, assuming H0 is true. In the standard normal curve, seeing a value of 5.08 or larger is unlikely. Its probability is approximately 0. P-value ≈ 0 Since the P-value is so small, we reject H0. There is convincing evidence to suggest that the majority of adult Americans frequently check the nutritional labels on packaged foods.
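The arithmetic for the nutritional labels example can be reproduced in a few lines. This is a minimal sketch using Python's standard-library `statistics.NormalDist` for the z curve; the variable names are ours, and the counts (582 of 1003) come from the survey described above.

```python
from statistics import NormalDist

successes, n = 582, 1003     # "frequently" responses out of 1003 adults
p0 = 0.5                     # hypothesized value under H0

p_hat = successes / n
# Standardize using the hypothesized value, since H0 is assumed true.
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
# Ha: p > .5, so the P-value is the area in the upper tail of the z curve.
p_value = 1 - NormalDist().cdf(z)

print(f"p-hat = {p_hat:.2f}, z = {z:.2f}, P-value = {p_value:.2g}")
```

The z statistic agrees with the 5.08 on the slide, and the upper-tail area is far below any usual significance level.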
Computing P-values The calculation of the P-value depends on the form of the inequality in the alternative hypothesis.
• Ha: p > hypothesized value: P-value = area in the upper tail of the z curve, to the right of the calculated z
• Ha: p < hypothesized value: P-value = area in the lower tail of the z curve, to the left of the calculated z
• Ha: p ≠ hypothesized value: P-value = sum of the areas in the two tails of the z curve, beyond the calculated z and –z
Using P-values to make a decision: To decide whether or not to reject H0, we compare the P-value to the significance level α. If the P-value > α, we “fail to reject” the null hypothesis. If the P-value < α, we “reject” the null hypothesis.
Summary of the Large-Sample z Test for p
Null hypothesis: H0: p = hypothesized value
Test Statistic: $z = \dfrac{\hat{p} - \text{hypothesized value}}{\sqrt{\dfrac{(\text{hypothesized value})(1 - \text{hypothesized value})}{n}}}$
Alternative Hypothesis and P-value:
• Ha: p > hypothesized value: Area to the right of calculated z
• Ha: p < hypothesized value: Area to the left of calculated z
• Ha: p ≠ hypothesized value: 2(Area to the right of z) if z is positive, or 2(Area to the left of z) if z is negative
Summary of the Large-Sample z Test for p Continued . . . Assumptions: • p̂ is a sample proportion from a random sample • The sample size n is large (np > 10 and n(1 - p) > 10) • If sampling is without replacement, the sample size is no more than 10% of the population size
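The summary can be collected into one small function. The sketch below is one possible arrangement, not a standard library routine: the function name and the `alternative` labels ("greater", "less", "two-sided") are our own, and the 10% condition is left to the user because it depends on the population size. As a usage example it is applied to the prescription drug survey from earlier in the chapter (118 of 592 teens, H0: p = .1, Ha: p > .1), a test the slides set up but do not carry out.

```python
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative):
    """Large-sample z test for H0: p = p0. Returns (z, P-value)."""
    # Large-sample check from the summary, using the hypothesized value.
    if n * p0 <= 10 or n * (1 - p0) <= 10:
        raise ValueError("sample size is too small for the large-sample z test")
    p_hat = successes / n
    z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
    lower_area = NormalDist().cdf(z)              # area to the left of z
    if alternative == "greater":                  # Ha: p > p0
        p_value = 1 - lower_area
    elif alternative == "less":                   # Ha: p < p0
        p_value = lower_area
    elif alternative == "two-sided":              # Ha: p != p0
        p_value = 2 * min(lower_area, 1 - lower_area)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Prescription drug sharing: 118 of 592 teens, H0: p = .1 versus Ha: p > .1
z_drug, p_drug = one_proportion_z_test(118, 592, 0.10, "greater")
print(f"z = {z_drug:.2f}, P-value = {p_drug:.2g}")
```

The random-sample assumption cannot be checked in code; it has to come from how the data were collected.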
A report states that nationwide, 61% of high school graduates go on to attend a two-year or four-year college the year after graduation. Suppose a random sample of 1500 high school graduates in 2009 from a particular state estimated the proportion of high school graduates that attend college the year after graduation to be 58%. Can we reasonably conclude that the proportion of this state’s high school graduates in 2009 who attended college the year after graduation is different from the national figure? Use α = .01. State the hypotheses. H0: p = .61 Ha: p ≠ .61 where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation
College Attendance Continued . . . H0: p = .61 Ha: p ≠ .61 Where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation • Assumptions: • Given a random sample of 1500 high school graduates • Since 1500(.61) > 10 and 1500(.39) > 10, sample size is large enough. • Population size is much larger than the sample size.
College Attendance Continued . . . H0: p = .61 Ha: p ≠ .61 where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation. Test statistic: $z = \dfrac{.58 - .61}{\sqrt{.61(.39)/1500}} = -2.38$ The area to the left of -2.38 is approximately .0087. P-value = 2(.0087) = .0174 Use α = .01 Since P-value > α, we fail to reject H0. The evidence does not suggest that the proportion of 2009 high school graduates in this state who attended college the year after graduation differs from the national value. What potential error could you have made? Type II
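The two-tailed calculation for the college attendance example can be sketched as follows. Variable names are ours. Because the code keeps the unrounded z rather than the table value -2.38, its P-value comes out near .0172 instead of the slide's .0174; the decision is unchanged.

```python
from statistics import NormalDist

p_hat, n = 0.58, 1500        # sample proportion and sample size
p0, alpha = 0.61, 0.01       # hypothesized value and significance level

z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
# Ha: p != .61, so add the areas in both tails (twice the area beyond |z|).
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```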
In December 2009, a county-wide water conservation campaign was conducted in a particular county. In 2010, a random sample of 500 homes was selected and water usage was recorded for each home in the sample. Suppose the sample results were that 220 households had reduced water consumption. The county supervisors wanted to know if their data supported the claim that fewer than half the households in the county reduced water consumption. State the hypotheses. H0: p = .5 Ha: p < .5 where p is the proportion of all households in the county with reduced water usage. Calculate p̂. $\hat{p} = 220/500 = .44$
Water Usage Continued . . . H0: p = .5 Ha: p < .5 where p is the proportion of all households in the county with reduced water usage. Verify assumptions 1. p̂ is from a random sample of households 2. Sample size n is large because np = 250 > 10 and n(1 - p) = 250 > 10 3. It is reasonable that there are more than 5000 (10n) households in the county.
Water Usage Continued . . . H0: p = .5 Ha: p < .5 where p is the proportion of all households in the county with reduced water usage. Calculate the test statistic and P-value: $z = \dfrac{.44 - .5}{\sqrt{.5(.5)/500}} = -2.68$ Look this value up in the table of z curve areas. P-value = .0037 Use α = .01 Since P-value < α, we reject H0. There is convincing evidence that the proportion of households with reduced water usage is less than half. What potential error could you have made? Type I
Water Usage Continued . . . H0: p = .5 Ha: p < .5 where p is the proportion of all households in the county with reduced water usage. Used α = .01 Since P-value < α, we reject H0. Let’s create a confidence interval with this data. What is the appropriate confidence level to use? Since we are testing Ha: p < .5, α would be in the lower tail. Confidence intervals are two-tailed, so we also need to put .01 in the upper tail (since the curve is symmetrical). With .01 in each tail, that puts .98 in the middle – this is the appropriate confidence level. Compute a 98% confidence interval: $.44 \pm 2.33\sqrt{.44(.56)/500} = (.388, .492)$ Notice that the hypothesized value of .5 is NOT in the 98% confidence interval and that we “rejected” H0!
College Attendance Revisited . . . H0: p = .61 Ha: p ≠ .61 where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation. Use α = .01 Since P-value > α, we fail to reject H0. Let’s compute a confidence interval for this problem. This is a two-tailed test so α gets split evenly into both tails, leaving 99% in the middle. $.58 \pm 2.58\sqrt{.58(.42)/1500} = (.547, .613)$ Notice that the hypothesized value of .61 IS in the 99% confidence interval and that we “failed to reject” H0!
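The link between the test decision and the confidence interval can be checked for both examples with a short sketch. The helper name `proportion_ci` is ours; it uses the usual large-sample interval p̂ ± z*·sqrt(p̂(1 - p̂)/n), with the critical value taken from the standard normal curve rather than a printed table, so the endpoints may differ from table-based ones in the third decimal place.

```python
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    # Critical value leaving (1 - confidence)/2 in each tail of the z curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

# Water usage: one-tailed test at alpha = .01 pairs with a 98% interval.
water_ci = proportion_ci(220 / 500, 500, 0.98)
# College attendance: two-tailed test at alpha = .01 pairs with a 99% interval.
college_ci = proportion_ci(0.58, 1500, 0.99)

print("Water usage 98% CI:", tuple(round(x, 3) for x in water_ci))
print("College attendance 99% CI:", tuple(round(x, 3) for x in college_ci))
print(".5 inside water interval:", water_ci[0] <= 0.5 <= water_ci[1])
print(".61 inside college interval:", college_ci[0] <= 0.61 <= college_ci[1])
```

A hypothesized value outside the matching interval goes with "reject H0"; a value inside goes with "fail to reject H0".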
Let’s review the assumptions for a confidence interval for a population mean: 1) x̄ is the sample mean from a random sample, 2) the sample size n is large (n > 30), and 3) σ, the population standard deviation, is known or unknown. The assumptions are the same for a large-sample hypothesis test for a population mean. This is the test statistic when σ is known: $z = \dfrac{\bar{x} - \text{hypothesized value}}{\sigma/\sqrt{n}}$ P-value is area under the z curve. This is the test statistic when σ is unknown: $t = \dfrac{\bar{x} - \text{hypothesized value}}{s/\sqrt{n}}$ P-value is area under the t curve with df = n - 1.
The One-Sample t-test for a Population Mean
Null hypothesis: H0: μ = hypothesized value
Test Statistic: $t = \dfrac{\bar{x} - \text{hypothesized value}}{s/\sqrt{n}}$
Alternative Hypothesis and P-value:
• Ha: μ > hypothesized value: Area to the right of calculated t with df = n - 1
• Ha: μ < hypothesized value: Area to the left of calculated t with df = n - 1
• Ha: μ ≠ hypothesized value: 2(Area to the right of t) if t is positive, or 2(Area to the left of t) if t is negative
The One-Sample t-test for a Population Mean Continued . . . Assumptions: • x̄ and s are the sample mean and sample standard deviation from a random sample • The sample size n is large (n > 30) or the population distribution is at least approximately normal.
A study conducted by researchers at Pennsylvania State University investigated whether time perception, an indication of a person’s ability to concentrate, is impaired during nicotine withdrawal. After a 24-hour smoking abstinence, 20 smokers were asked to estimate how much time had elapsed during a 45-second period. Researchers wanted to see whether smoking abstinence had a negative impact on time perception, causing elapsed time to be overestimated. Suppose the resulting data on perceived elapsed time (in seconds) were as follows: What are the mean and standard deviation of the sample? x̄ = 59.30 s = 9.84 n = 20
Smoking Abstinence Continued . . . x̄ = 59.30 s = 9.84 n = 20 State the hypotheses. H0: μ = 45 Ha: μ > 45 where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24 hours. Verify assumptions. 1) It is reasonable to believe that the sample of smokers is representative of all smokers. 2) Since the sample size is not at least 30, we must determine if it is plausible that the population distribution is approximately normal. To do this, we need to graph the data using a boxplot or normal probability plot. [Boxplot of perceived elapsed time; axis from 40 to 70 seconds] Since the boxplot is approximately symmetrical, it is plausible that the population distribution is approximately normal.
Smoking Abstinence Continued . . . x̄ = 59.30 s = 9.84 n = 20 H0: μ = 45 Ha: μ > 45 where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24 hours. Compute the test statistic and P-value. Test statistic: $t = \dfrac{59.30 - 45}{9.84/\sqrt{20}} = 6.50$ with df = 19 P-value ≈ 0 α = .05 Since P-value < α, we reject H0. There is convincing evidence that the mean perceived elapsed time is greater than the actual elapsed time of 45 seconds.
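The t statistic and its upper-tail area for the smoking abstinence example can be reproduced with the standard library alone. The sketch below is one way to do it, not the method used on the slides: since Python's standard library has no t distribution, the tail area is obtained by integrating the t density numerically with Simpson's rule. The function names and the number of integration steps are our own choices; a statistics package would normally supply this area directly.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (for t >= 0), via Simpson's rule on [0, t]."""
    h = t / steps                                # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the area lies to the right of 0; subtract the part between 0 and t.
    return 0.5 - total * h / 3

x_bar, s, n = 59.30, 9.84, 20    # sample summary from the slide
mu0, alpha = 45, 0.05            # hypothesized mean and significance level

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)         # Ha: mu > 45, so use the upper tail

print(f"t = {t_stat:.2f}, df = {n - 1}, P-value = {p_value:.2g}")
print("reject H0:", p_value < alpha)
```

As a sanity check on the integration, `t_upper_tail(1.729, 19)` should come out close to .05, the tabled upper-tail area for that critical value.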
Smoking Abstinence Continued . . . x̄ = 59.30 s = 9.84 n = 20 H0: μ = 45 Ha: μ > 45 where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24 hours. α = .05 Since P-value < α, we reject H0. Compute the appropriate confidence interval. Since this is a one-tailed test, α goes in the upper tail. .05 goes in the lower tail, leaving .90 in the middle. $59.30 \pm 1.729\left(9.84/\sqrt{20}\right) = (55.50, 63.10)$ Notice that the hypothesized value of 45 is NOT in the 90% confidence interval and that we “rejected” H0!
A growing concern of employers is time spent in activities like surfing the Internet and emailing friends during work hours. The San Luis Obispo Tribune summarized the findings of a large survey of workers in an article that ran under the headline “Who Goofs Off More than 2 Hours a Day? Most Workers, Survey Says” (August 3, 2006). Suppose that the CEO of a large company wants to determine whether the average amount of wasted time during an 8-hour day for employees of her company is less than the reported 120 minutes. Each person in a random sample of 10 employees was contacted and asked about daily wasted time at work. The resulting data are the following: What are the mean and standard deviation of the sample? x̄ = 116.80 s = 9.45 n = 10