
Hypothesis Testing


reegan


Presentation Transcript


  1. Hypothesis Testing Math 1680

  2. Overview • Introduction • One-Sample z Tests and t Tests • Two-Sample z Tests • Chi-Squared Tests • Summary

  3. Introduction • Very often, we can model a chance process and use that model to predict results • Sometimes we get a result that seems far off from the prediction • An important question is how likely the observed result would be if the chance model were correct • Hypothesis tests offer an answer

  4. Introduction • In a hypothesis test, an observed result is compared with the expected result from an appropriate chance model • We assume that the chance model is correct; this assumption is the null hypothesis, or H0 • We usually want to reject the null hypothesis in favor of some alternative explanation

  5. One-Sample z Tests and t Tests • For example, consider a coin with heads and tails on it • You flip the coin 40 times and get 25 heads • Assuming the coin is fair (note that this is the null hypothesis), how many heads would you expect to get? (20) • How far off should you expect to be? (SE ≈ 3.16) • Is this difference significant?

  6. One-Sample z Tests and t Tests • Recall that with enough flips, the number of heads is approximately normal (via the Central Limit Theorem) • The center of the curve is the expected number of heads • Approximately what is the probability of getting 25 or more heads in 40 flips? • Keep in mind that we need to use 24.5 instead of 25 when we standardize! • This number is the P-value: P ≈ 7.74%
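The coin calculation on slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration using the standard library's `NormalDist`; the variable names are ours, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist

flips = 40
p_heads = 0.5            # null hypothesis: the coin is fair
observed_heads = 25

expected = flips * p_heads                         # expected number of heads
se = sqrt(flips) * sqrt(p_heads * (1 - p_heads))   # standard error of the count

# Continuity correction: "25 or more" starts at 24.5 on the normal curve.
z = (observed_heads - 0.5 - expected) / se
p_value = 1 - NormalDist().cdf(z)                  # right tail only (one-tailed)

print(f"EV = {expected:.0f}, SE = {se:.2f}, z = {z:.2f}, P = {p_value:.2%}")
```

The right-tail area comes out close to the 7.74% quoted on the slide.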

  7. One-Sample z Tests and t Tests • Most scientists will say that a P-value of less than 5% is statistically significant • This is usually good enough evidence to reject the null hypothesis • A P-value of less than 1% is highly significant • The null hypothesis should almost certainly be rejected • Bear in mind that these numbers are arbitrary cutoffs

  8. One-Sample z Tests and t Tests • Since the P-value for the coin is about 7.7%, the result is not statistically significant • However, since the P-value is fairly close to 5%, it may be worth flipping the coin another 40 times and compiling the results to try another test

  9. One-Sample z Tests and t Tests • The previous example illustrates a one-sample z test • We only had one sample and wanted to compare it against a chance model • Since the variable was approximately normal, we used a z score to find the P-value

  10. One-Sample z Tests and t Tests • We were comparing the null hypothesis of flipping a fair coin to the alternative that the coin was biased in favor of heads • We were only looking at the right tail of the curve (one-tailed) • We could also compare against the coin being biased in either direction • We would then look at both tails (two-tailed)

  11. One-Sample z Tests and t Tests • To perform a one-sample z test on the result of an experiment… • State the null hypothesis and an alternative hypothesis • Compute the expected value and standard error for the result using the model from the null hypothesis • Use a normal approximation to find the P-value • If the P-value is less than 5%, the result is significant enough to reject the null hypothesis

  12. One-Sample z Tests and t Tests • In many cities, chlorine is added to drinking water to remove microbes • A typical recommended concentration is 3ppm (parts per million) • A reservoir technician wants to determine if the chlorine concentration is low enough to warrant adding more to the water • She takes 50 samples from the reservoir outlet and measures the concentration • She finds that the average concentration is 2.6ppm with an SD of 0.9ppm • State the null hypothesis and the alternative hypothesis, and find the P-value for the observation • Should the technician restock the reservoir? • H0: The reservoir already has enough chlorine in it • HA: The reservoir needs more chlorine • P ≈ 0.1%, so the reservoir should be restocked
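As a rough check of the chlorine example, the one-sample z test for an average can be written as a small helper. The function name `one_sample_z` is hypothetical, and the sketch assumes a left-tailed test because the alternative is that the concentration is too low.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(observed_avg, null_avg, sd, n):
    """Return (z, left-tail P) for a sample average compared with the null average."""
    se = sd / sqrt(n)                     # standard error of the average
    z = (observed_avg - null_avg) / se    # standardize the observed average
    return z, NormalDist().cdf(z)         # chance of an average this low or lower

z, p = one_sample_z(observed_avg=2.6, null_avg=3.0, sd=0.9, n=50)
print(f"z = {z:.2f}, P = {p:.2%}")
```

The z score is a little beyond -3, which gives a P-value of roughly 0.1%, in line with the slide.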

  13. One-Sample z Tests and t Tests • Sometimes our sample is too small to justify a normal approximation • An engineer working for a steel manufacturer wants to determine the strength of the steel beams the company produces • He places 10 steel bars in a machine and pulls them until they deform • If the sample is too small, you cannot use a z test to check the result!

  14. One-Sample z Tests and t Tests • Instead, we use a t test • A t distribution with m degrees of freedom is used in place of the normal curve • The degrees of freedom will be the number of measurements – 1 in this context • The distribution can be found on page A-106 of the text

  15. One-Sample z Tests and t Tests • Since the sample is small, we have to adjust the SD of the measurements to reflect the true standard error • After adjusting the SD, calculate the SE in the usual way

  16. One-Sample z Tests and t Tests • Once the EV and SE are calculated, standardize the observed result to get the t score • Look this value up in the t table • Find what range of t scores your t score would be in to estimate the P-value

  17. One-Sample z Tests and t Tests • An engineer working for a steel manufacturer wants to determine the strength of the steel beams the company produces • The type of steel he is checking is rated to have a tensile strength of 7,525psi (pounds per square inch) • He places 10 steel bars in a machine and pulls them until they deform • The average tension at deformation was 7,486psi with an SD of 47psi • Make a t test to determine if the steel is up to specifications • P ≈ 1.7%, so the steel is inferior
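The steel example can be reproduced without a t table. The sketch below assumes the small-sample adjustment the slides refer to is the usual SD+ = √(n/(n − 1)) × SD, and it estimates the tail area by integrating the t density numerically with Simpson's rule. The helper names `t_density` and `t_tail` are ours.

```python
from math import gamma, pi, sqrt

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, steps=2000):
    """One-tailed area beyond |t|, using Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps                          # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3             # half the curve minus the area from 0 to |t|

n, avg, sd, rated = 10, 7486, 47, 7525
sd_plus = sqrt(n / (n - 1)) * sd           # assumed small-sample adjustment (SD+)
se = sd_plus / sqrt(n)                     # SE of the average, computed the usual way
t = (avg - rated) / se
p = t_tail(t, df=n - 1)

print(f"t = {t:.2f}, df = {n - 1}, P = {p:.1%}")
```

With 9 degrees of freedom the t score is about -2.5, and the one-tailed area is close to the 1.7% on the slide.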

  18. One-Sample z Tests and t Tests • An engineer working on an air-launched guided missile is testing the missile’s accuracy • His goal is for the missile to strike within 10m of its target • Because each missile costs $2 million, the engineer only gets to test-fire five missiles • The missiles strike at distances 9.2m, 10.4m, 11.7m, 9.6m, and 10.2m • Are the missiles ready for mass production? • P ≈ 63%, so the missiles are good enough
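When the raw measurements are available, as in the missile example, the t score can be computed directly from the data. In this sketch `statistics.stdev` is used because it already divides by n − 1, so it plays the role of the adjusted SD; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

distances = [9.2, 10.4, 11.7, 9.6, 10.2]   # strike distances in meters
goal = 10.0                                # null hypothesis: average miss is 10m

n = len(distances)
avg = mean(distances)
sd_plus = stdev(distances)     # divides by n - 1, i.e. the adjusted SD
se = sd_plus / sqrt(n)
t = (avg - goal) / se

print(f"average = {avg:.2f}, SD+ = {sd_plus:.3f}, SE = {se:.3f}, t = {t:.2f}, df = {n - 1}")
```

The t score is only about 0.5, far smaller than any of the usual cutoffs in the row for 4 degrees of freedom, which is consistent with the large P-value quoted on the slide (a two-tailed lookup gives roughly 63%).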

  19. Two-Sample z Tests • Sometimes we are interested in comparing two averages against each other • If the chance model predicts the averages to be the same, their difference should be 0 • This is going to be the null hypothesis • We can run a z (or t) test on the difference of the two observed averages and compare it to the null hypothesis

  20. Two-Sample z Tests • The expected difference between the two averages is just the difference between the expected averages • The null hypothesis predicts this to be 0 • The standard error for the difference is calculated as follows: SE(difference) = √(SE₁² + SE₂²) • Where SE₁ and SE₂ are the standard errors for the two averages

  21. Two-Sample z Tests • To perform a two-sample z test, standardize the observed difference according to the expected value and standard error for the difference to get the z score • Then look the z score up in the normal table to find the P-value

  22. Two-Sample z Tests • You have two graders for this course • To ensure they are grading consistently, I compare the averages from each of their groups • On HW 5, one group of 38 students had an average of 47.7 with an SD of 13.8 • On HW 5, the other group of 39 students had an average of 46.0 with an SD of 14.8 • Make a z test and determine if there was a significant difference between the two groups • P ≈ 62%, so the graders were consistent
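The grader comparison can be sketched as a two-sample z test. The helper `two_sample_z` is a hypothetical name, and the sketch assumes the two groups are independent and that a two-tailed test is wanted, since a difference in either direction would matter.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(avg1, sd1, n1, avg2, sd2, n2):
    """Return (z, two-tailed P) for the difference between two sample averages."""
    se1 = sd1 / sqrt(n1)                   # SE for the first average
    se2 = sd2 / sqrt(n2)                   # SE for the second average
    se_diff = sqrt(se1 ** 2 + se2 ** 2)    # SE for the difference
    z = (avg1 - avg2) / se_diff            # expected difference is 0 under H0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

z, p = two_sample_z(47.7, 13.8, 38, 46.0, 14.8, 39)
print(f"z = {z:.2f}, P = {p:.0%}")
```

The exact normal curve gives a P-value of about 60%. The 62% on the slide is presumably what a printed table gives after rounding z to 0.5. Either way the difference is nowhere near significant.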

  23. Two-Sample z Tests • Another useful property of two-sample z tests is that they can be used to determine the significance of the difference between treatment and control groups in studies • This is how we know if the studies from Chapters 1 and 2 show statistically significant results • Compare the treatment group’s average/percentage to that of the control group

  24. Two-Sample z Tests • (Hypothetical) One high school does a study to see if there is a link between playing music and better grades in high school • The administrators compare the GPAs (for that year) of students who were enrolled in a music course (such as band, choir, etc.) with those of students not enrolled in any such class • Make a two-sample z test • Do music students really have higher GPAs? • z = 18.756, P ≈ 0%, so music students do have higher GPAs

  25. Chi-Squared Tests • Sometimes we need to compare the sample distribution to the predicted distribution • A gambler observes throws of a die to determine if the die is fair • After observing 48 throws, he has the following observations • Is the die fair?

  26. Chi-Squared Tests • In this case, we are comparing the observations against the null hypothesis that each outcome is equally likely • In 48 throws, we would expect to see 8 of each value come up

  27. Chi-Squared Tests • To get an idea of how far off each observed frequency (OF) is from the expected frequency (EF), we calculate the following for each possible value in the distribution: (OF − EF)² / EF • By summing up these values, we obtain the χ² (chi-squared) statistic • For the die example, χ² = 9.75
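The statistic described above is a short sum in code. The gambler's table of observed throws is not included in this transcript, so the counts below are hypothetical and chosen only so that they total 48 throws; they are not the data behind the slides.

```python
def chi_squared(observed, expected):
    """Sum of (OF - EF)^2 / EF over every value in the distribution."""
    return sum((of - ef) ** 2 / ef for of, ef in zip(observed, expected))

# Hypothetical counts for faces 1..6 (NOT the gambler's data, which the
# transcript does not include); they add up to 48 throws.
observed = [5, 7, 3, 14, 9, 10]
expected = [sum(observed) / 6] * 6     # a fair die predicts 8 of each face

stat = chi_squared(observed, expected)
print(f"chi-squared = {stat}")
```

A larger statistic means the observed frequencies sit further from the expected ones, taken over all six faces at once.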

  28. Chi-Squared Tests • Once we have the χ² statistic, we can look it up in a χ² table with m degrees of freedom • The degrees of freedom will be the number of values in the distribution – 1 in this context • The distribution can be found on page A-107 of the text

  29. Chi-Squared Tests • Since there are six possible values in the die-rolling distribution, there are 6 – 1 = 5 degrees of freedom in the χ² distribution • Look up the χ² value of 9.75 in the row for 5 degrees of freedom • The table tells us that the P-value is between 5% and 10%, so the result is not statistically significant • However, the rolls came heavy on 4 and light on 3 • 3 and 4 are on opposite sides of the die • Perhaps it would be good to observe more throws and retest
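In place of the table lookup, the P-value can be estimated numerically. The sketch below uses a standard series for the regularized incomplete gamma function, which is one of several ways to get the area under a χ² curve; the function name `chi2_p_value` is ours.

```python
from math import exp, gamma

def chi2_p_value(stat, df, terms=500):
    """Right-tail area of the chi-squared curve with df degrees of freedom."""
    a, x = df / 2, stat / 2
    term = 1 / a
    total = term
    for n in range(1, terms):              # series for the lower incomplete gamma
        term *= x / (a + n)
        total += term
    lower = exp(-x) * x ** a / gamma(a) * total   # area to the left of stat
    return max(0.0, 1 - lower)             # guard against tiny negative rounding

p = chi2_p_value(9.75, df=5)
print(f"P = {p:.1%}")
```

For the die example the result falls between 5% and 10%, matching what the table shows.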

  30. Chi-Squared Tests • A programmer designing a random number generator needs to ensure that the numbers are uniformly distributed between 0 and 1 • “Uniformly distributed” means that each number between 0 and 1 is equally likely to be generated • She generates 1,000 numbers and groups them into class intervals based on their first digit after the decimal

  31. Chi-Squared Tests • The results are shown in the table • Are the numbers close enough to uniform, or should the programmer adjust the generator? • Use a χ² test with 9 degrees of freedom • χ² = 98.94, P ≈ 0%, so the generator is certainly not uniform
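The programmer's procedure can be imitated with any generator. The sketch below uses Python's built-in `random` module as a stand-in (it is not the generator from the slides, whose table is not in this transcript), groups 1,000 numbers by their first digit after the decimal point, and computes the χ² statistic against the uniform model. The seed is arbitrary.

```python
import random

def first_digit_counts(numbers):
    """Count how many numbers in [0, 1) fall in each interval 0.0-0.1, ..., 0.9-1.0."""
    counts = [0] * 10
    for u in numbers:
        counts[min(int(u * 10), 9)] += 1   # first digit after the decimal point
    return counts

random.seed(1680)                          # arbitrary seed so the run is repeatable
sample = [random.random() for _ in range(1000)]

observed = first_digit_counts(sample)
expected = len(sample) / 10                # uniform model: 100 numbers per interval
stat = sum((of - expected) ** 2 / expected for of in observed)

print(observed)
print(f"chi-squared = {stat:.2f} with {len(observed) - 1} degrees of freedom")
```

The statistic would then be compared with the row for 9 degrees of freedom, exactly as on the slide. A value anywhere near 98.94 would signal a badly non-uniform generator.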

  32. Chi-Squared Tests • Another use for the χ² test is to determine if two variables are independent • If two variables are independent, then the distribution of one variable should look the same within each category of the other • The χ² test tells us if two distributions look the same

  33. Chi-Squared Tests • To calculate the expected frequency in a block: • Find the proportion of all cases in the table that fall in that block’s row (row total divided by grand total) • Multiply this by the total number of cases in that block’s column

  34. Chi-Squared Tests • Totals for one variable: 1156 and 3007 • Totals for the other variable: 760, 1613, 1218, and 572 • Grand total = 4163 • Expected frequency for the first block: (1156/4163)(760) ≈ 211

  35. Chi-Squared Tests • Expected frequency for the next block, using the same totals: (1156/4163)(1613) ≈ 448

  36. Chi-Squared Tests • [Table: same totals as above; grand total = 4163]
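The expected frequencies for every block follow from the totals alone. The sketch below uses the totals that survive in the transcript. The variable names are ours, and which variable forms the rows and which the columns is left open because the original table is not available.

```python
totals_a = [1156, 3007]               # totals for the two-category variable
totals_b = [760, 1613, 1218, 572]     # totals for the four-category variable
grand_total = sum(totals_a)           # 4163, the same as sum(totals_b)

# Expected frequency in each block: (one total / grand total) * (the other total)
expected = [[a / grand_total * b for a in totals_a] for b in totals_b]

for b, blocks in zip(totals_b, expected):
    print(b, [round(ef) for ef in blocks])

# Degrees of freedom for the independence test
df = (len(totals_b) - 1) * (len(totals_a) - 1)
print("degrees of freedom:", df)
```

The first two blocks round to 211 and 448, matching slides 34 and 35, and each pair of expected frequencies adds back up to its total.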

  37. Chi-Squared Tests • To find how far off each observed case is from the expected case, use the formula (OF − EF)² / EF • To get the value of χ², add up all of these terms • In this case, χ² = 14.3

  38. Chi-Squared Tests • The number of degrees of freedom will be (number of rows – 1)(number of columns – 1) • In this case, there are (4 – 1)(2 – 1) = 3 degrees of freedom • The last step is to estimate the P-value by finding the range in the table which covers χ² = 14.3 for 3 degrees of freedom • The table tells us P < 1%, meaning that we can say the variables are not independent

  39. Summary • When we want to show that a result was not likely to occur by pure chance, we can use a hypothesis test to validate our claim • A hypothesis test takes as a null hypothesis some chance model which could describe the situation

  40. Summary • The goal of the researcher is usually to reject the null hypothesis • This is accomplished by finding a P-value that is small enough to be considered “significant” • P-values less than 5% are generally considered statistically significant • A significant P-value means the observed result would be very unlikely to occur by pure chance

  41. Summary • To compare a sample average or percentage against a chance model, use a z test (if sample is large enough) or a t test (if sample is small) • To compare the averages or percentages from two different samples, use a z (or t) test for the difference between the averages/percentages

  42. Summary • To compare two entire distributions, use a χ² test • The null hypothesis for a χ² test is that the distributions being compared are the same • A χ² test can also be used to check if two variables are independent

  43. Summary • Remember that a hypothesis test can only give the researcher the probability of a result at least as extreme as the observed one, assuming the null hypothesis is true • Even if a result is significant, the test cannot endorse any particular alternative • Proposing a viable alternative is a task for the researcher
