## STAT E100

**STAT E100 Section Week 11: Hypothesis testing, paired t-test, chi-square test for independence**

**Course Review**

- Project proposals are due Nov. 19th; email yours to your TA.
- Exam 2 is Nov. 26th; practice tests have already been posted.
- Exams are cumulative: about 20% of each future exam will cover earlier material.
- Email your TA to join the study group!

**Key Equations**

For the 2-proportion z significance test (with pooling), where $x_1$ and $x_2$ are the counts of successes in the two groups:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

For the 2-proportion z interval:

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

For paired t-tests, where $\bar{x}_{\text{Diff}}$ and $s_{\text{Diff}}$ are the mean and standard deviation of the $n$ paired differences:

$$t = \frac{\bar{x}_{\text{Diff}}}{s_{\text{Diff}}/\sqrt{n}}, \qquad df = n - 1$$

In SPSS: Analyze → Compare Means → Paired-Samples T Test

For the chi-square test of independence, the expected count for each cell of the contingency table is

$$E = \frac{(\text{row total}) \times (\text{column total})}{n}$$

and the test statistic, summed over all cells with observed counts $O$, is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

For this χ² test to be valid, all the expected cell counts must be ≥ 5. The degrees of freedom are df = (#rows − 1) × (#cols − 1).

**Sample Question #1**

In 2008, the Red Sox and Yankees starters’ batting averages were:

[Table: 2008 batting averages of the Red Sox and Yankees starters]

a) Perform a 2-sample t-test for these data. What is your conclusion?

2-sample t significance test:

- H0: μBOS − μNYY = 0
- Ha: μBOS − μNYY ≠ 0

Since p > 0.05, we cannot reject the null hypothesis that there is no difference between the Red Sox and Yankees starters’ mean batting averages. We do not have evidence to support the claim that the batting averages are statistically significantly different.

b) Perform a paired t-test for these data. Do your results in parts a) and b) agree? Why or why not?

Paired t-test:

- H0: μDiff = 0
- Ha: μDiff ≠ 0

Since p > 0.05, we cannot reject the null hypothesis. We do not have evidence to support the claim that the batting averages are statistically significantly different. The two tests agree here, but that is not always the case.
For example, if n is different for the 2 groups, then a paired t-test cannot be performed in this manner.

**Sample Question #2**

A study was conducted to determine whether football helmets with newer anti-concussion technology (Riddell's Revolution helmet) actually led to a lower rate of concussions in high school football players compared to standard helmets. In an observational study in western Pennsylvania, 62 of 1173 Revolution helmet wearers suffered a concussion, while 74 of 968 standard helmet wearers suffered a concussion. Does this study provide evidence of a difference in the risk of suffering a concussion between wearing the two types of helmets?

http://journals.lww.com/neurosurgery/Abstract/2006/02000/Examining_Concussion_Rates_and_Return_to_Play_in.9.aspx

There are 2 mathematically equivalent ways of doing this problem, so this is a situation where the answers should agree. Here is the first way.

2-proportion z significance test:

- H0: p1 − p2 = 0. The proportion of players suffering a concussion is the same for the two types of helmets.
- Ha: p1 − p2 ≠ 0. The proportion of players suffering a concussion is not the same for the two types of helmets.

Since p < 0.05, we can reject the null hypothesis.
We have sufficient evidence to suggest that there is a difference in the risk of suffering a concussion between wearing the two types of helmets.

Here is the second way.

Chi-square test for independence:

- H0: The risk of suffering a concussion is independent of the helmet type.
- Ha: The risk of suffering a concussion is not independent of the helmet type.

This χ² statistic has df = (2 − 1) × (2 − 1) = 1. Since the p-value < 0.05, we reject the null hypothesis. We have enough evidence to suggest that the risk of suffering a concussion is associated with the helmet type.

**Sample Question #3**

Below you will find a contingency table for the breakdown of gender within each class year in this semester’s Stat 104 class along with the χ² test output.

| Year | F | M | Total |
|---|---|---|---|
| Freshman | 71 | 117 | 188 |
| Junior | 18 | 12 | 30 |
| Senior | 10 | 11 | 21 |
| Sophomore | 59 | 50 | 109 |
| Total | 158 | 190 | 348 |

a) What are the hypotheses for this χ² test in this situation?
b) What is the expected number of female Seniors?

**Sample Question #3**

a) What are the hypotheses for this χ² test in this situation?

- H0: The gender breakdown is independent of class year in this semester’s Stat 104 class.
- Ha: The gender breakdown is not independent of class year in this semester’s Stat 104 class.

b) What is the expected number of female Seniors? (Row total × column total)/n = (21 × 158)/348 ≈ 9.53.

c) How many degrees of freedom are in this test? df = (4 − 1)(2 − 1) = 3.

d) SPSS reports the chi-squared test statistic to be 10.39 for this table. What is the approximate p-value for this test?
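0.01 < p < 0.02, since 10.39 lies between the df = 3 critical values 9.84 (upper-tail area 0.02) and 11.34 (upper-tail area 0.01).

e) What is your conclusion? Since p < 0.05, we reject the null hypothesis. There is evidence to suggest that the gender breakdown is not independent of class year in this semester’s Stat 104 class.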
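The section uses SPSS, but the calculations in Sample Questions #2 and #3 can also be checked by hand. The following is a minimal Python sketch using only the standard library. The function names (`chi2_sf`, `two_proportion_z`, `chi_square_independence`) are hypothetical helpers written for this illustration, not part of any package, and the chi-square test here applies no continuity correction, which is the assumption under which it matches the pooled z-test exactly.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled 2-proportion z statistic and its two-sided p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def chi_square_independence(table):
    """Chi-square statistic, df, p-value, and expected counts for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df), expected


# Sample Question #2: rows are helmet types, columns are (concussion, no concussion)
z, p_z = two_proportion_z(62, 1173, 74, 968)
helmet_stat, helmet_df, helmet_p, _ = chi_square_independence([[62, 1173 - 62], [74, 968 - 74]])
print(f"z = {z:.3f}, z^2 = {z * z:.3f}, chi-square = {helmet_stat:.3f}, p = {p_z:.4f}")

# Sample Question #3: rows are Freshman, Junior, Senior, Sophomore; columns are (F, M)
class_table = [[71, 117], [18, 12], [10, 11], [59, 50]]
stat, df, p, expected = chi_square_independence(class_table)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.4f}")
print(f"expected female Seniors = {expected[2][0]:.2f}")
```

For a 2 × 2 table the squared z statistic and the χ² statistic coincide, which is why the two approaches to Sample Question #2 should give the same p-value.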