Chapter 24

Chapter 24. Two-Way Tables and the Chi-square Test. Thought Question 1.

bruce-ellis

Presentation Transcript


  1. Chapter 24: Two-Way Tables and the Chi-square Test

  2. Thought Question 1: A random sample of registered voters was asked whether they preferred balancing the budget or cutting taxes. Each was then categorized as either a Democrat or a Republican. Of the 30 Democrats, 12 preferred cutting taxes, while of the 40 Republicans, 24 preferred cutting taxes. How would you display the data in a table?

  3. Categorical Variables • In this chapter we will study the relationship between two categorical variables (variables whose values fall in groups or categories). • To analyze categorical data, use the counts or percents of individuals that fall into various categories.

  4. Two-Way Table • When there are two categorical variables, the data are summarized in a two-way table • each row represents a value of the row variable • each column represents a value of the column variable • The number of observations falling into each combination of categories is entered into each cell of the table • Relationships between categorical variables are described by calculating appropriate percents from the counts given in the table • using percents prevents misleading comparisons due to unequal sample sizes for different groups
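
As a minimal sketch of the idea above, the counts from Thought Question 1 (slide 2) can be arranged as a two-way table and converted to row percents. The "Balance budget" counts are not stated on the slide; they are derived here by subtraction (30 - 12 and 40 - 24). The names `table` and `row_percents` are illustrative only.

```python
# Two-way table for Thought Question 1 (slide 2).
# Rows: political party. Columns: preferred agenda.
# "Balance budget" counts are derived by subtraction: 30 - 12 and 40 - 24.
table = {
    "Democrat":   {"Balance budget": 18, "Cut taxes": 12},
    "Republican": {"Balance budget": 16, "Cut taxes": 24},
}


def row_percents(two_way):
    """Return every cell as a percent of its own row total."""
    result = {}
    for row, cells in two_way.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result


percents = row_percents(table)
for row, cells in percents.items():
    line = ", ".join(f"{col}: {pct:.0f}%" for col, pct in cells.items())
    print(f"{row} (n={sum(table[row].values())}): {line}")
```

Row percents are used because the two parties have different sample sizes (30 and 40), so raw counts alone would not give a fair comparison.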

  5. Case Study: Helped Pick Up Pencils? Which is More Likely: Females or Males? Source: D. C. Howell, Statistical Methods for Psychology, 3rd edition, 1992, Belmont, CA: Duxbury Press, p. 154.

  6. Case Study: Drop the Pencils. A handful of pencils was dropped, seemingly by accident, by the researcher in an elevator in the presence of either a female subject or a male subject. The subject’s response was observed: did the subject help pick up the pencils or not?

  7. Case Study: The Question. The question was whether the males or the females who observed this mishap would be more likely to help pick up the pencils. • Explanatory variable: gender • Response variable: “pick up” action (Y/N) • Both variables are categorical.

  8. Case Study: Display the Results: Contingency (Two-Way) Table

  9. Case Study: Display the Results: Percentages

  10. Case Study: Statistical Significance. Is the difference between the percentages for males vs. females statistically significant? One of the following must be true: • The percentages are really the same in the population; the observed difference is due to chance. - or - • The percentages are really different in the population; the observed difference reflects this.

  11. Assessing Statistical Significance for a Two-Way Table • Strength of the relationship • measured by the difference in the sample percentages • Much easier to rule out chance with large samples

  12. Measuring the Difference with the Chi-Square Statistic • The chi-square statistic measures the magnitude of the difference in the sample percentages, incorporating sample size in its calculation • If percentages in the population are the same, then the chi-square statistic tends to be small (near 0) • If percentages in the population are different, then the chi-square statistic tends to be large

  13. Make the Decision: Is the relationship statistically significant? • “Critical value” for 2×2 tables = 3.84 • If the chi-square value (for 2×2 tables) is larger than 3.84, then the relationship is considered to be statistically significant. • Note: Z = square root of the chi-square (for 2×2 tables). • Critical value 3.84 is $(1.96)^2$ [$\approx 2^2$] * Note that the procedure given here is specifically for 2×2 tables (2 rows and 2 columns); the general procedure for any two-way table (with any number of rows and columns) is given later in this chapter (see slide 32).
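
The decision rule for 2×2 tables can be sketched in a few lines. The value 1.96 is the familiar two-sided 5% cutoff for a standard normal Z; the constant and function names below are illustrative, not from the slides.

```python
CRITICAL_Z = 1.96                       # two-sided 5% cutoff for a standard normal Z
CRITICAL_CHI_SQUARE = CRITICAL_Z ** 2   # 3.8416, which the slide rounds to 3.84


def is_significant_2x2(chi_square):
    """Apply the slide's rule: significant if chi-square exceeds the critical value."""
    return chi_square > CRITICAL_CHI_SQUARE


print(round(CRITICAL_CHI_SQUARE, 2))
print(is_significant_2x2(8.65))   # pencil case study, slide 14
print(is_significant_2x2(2.75))   # Thought Question 1, slide 15
```

Under this rule the pencil study (8.65) is statistically significant, while the Thought Question 1 value (2.75) is not.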

  14. Case Study: Statistical Significance • Is the difference between the percentages for males vs. females statistically significant? • Chi-square statistic = 8.65 • Since our chi-square is 8.65 > 3.84, we conclude there is a statistically significant relationship between gender and helping to pick up the pencils.

  15. Case Study: Thought Question 1, Agenda versus Political Party. Chi-square = 2.75 (significant?)

  16. Case Study: Quitting Smoking with Nicotine Patches (JAMA, Feb. 23, 1994, pp. 595-600) • Two categorical variables: • Explanatory: treatment assignment (nicotine patch or control patch) • Response: still smoking after 8 weeks? (yes or no)

  17. Case Study: Display the Results: Contingency (Two-Way) Table

  18. Case Study: Statistically Significant Relationship? • Chi-square = 19.2 • There is a statistically significant relationship between the type of patch used and the cessation of smoking for at least 8 weeks.

  19. Case Study: Popular Ad: Seldane-D Allergy Tablets (Time, 27 March 1995, p. 18) • Double-blind study of side effects • Seldane-D: 374 subjects • 27 (7.2%) reported drowsiness • 347 did not • Placebo: 193 subjects • 22 (11.4%) reported drowsiness • 171 did not

  20. Case Study: Summaries • Chi-square = 2.58 • Baseline risk of drowsiness using placebo = 11.4% • Risk of drowsiness using Seldane-D = 7.2% • Relative risk of drowsiness using Seldane-D versus placebo = 0.63
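
The risk and relative-risk figures above follow directly from the counts on slide 19. A small sketch, with a hypothetical helper named `risk`:

```python
def risk(events, total):
    """Proportion of subjects in a group who experienced the event."""
    return events / total


seldane_risk = risk(27, 374)    # drowsiness among Seldane-D subjects
placebo_risk = risk(22, 193)    # baseline risk: drowsiness among placebo subjects
relative_risk = seldane_risk / placebo_risk

print(f"Seldane-D risk: {100 * seldane_risk:.1f}%")
print(f"Placebo risk:   {100 * placebo_risk:.1f}%")
print(f"Relative risk:  {relative_risk:.2f}")
```

A relative risk below 1 means the treatment group reported drowsiness less often than the placebo group in this sample.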

  21. Case Study: Conclusion • Randomized, controlled experiment • No statistically significant relationship between use of Seldane-D allergy tablets and presence of drowsiness. • The evidence does not support the claim that Seldane-D causes drowsiness in some people.

  22. A Caution About Sample Size, Statistical Significance, and Chi-Square • The effect of sample size on the chi-square statistic when the table percentages stay the same: • For example, if n = 850 (instead of 567) and all percentages remain the same, then the chi-square would be 3.87 (instead of 2.58); would the conclusion change?
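
The arithmetic behind this caution can be sketched as follows. When every cell count is multiplied by the same factor, the table percentages stay the same and the chi-square statistic is multiplied by that factor. The sketch takes the slide's reported value of 2.58 as given; the function name is illustrative.

```python
def rescale_chi_square(chi_square, old_n, new_n):
    """Chi-square after changing the sample size while keeping all percentages fixed."""
    return chi_square * new_n / old_n


CRITICAL_VALUE_2X2 = 3.84   # critical value for 2x2 tables, from slide 13

scaled = rescale_chi_square(2.58, old_n=567, new_n=850)
print(round(scaled, 2))
print(scaled > CRITICAL_VALUE_2X2)
```

With n = 850 the statistic crosses the 3.84 cutoff, so the same percentages would now be declared statistically significant.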

  23. Inference for Relative Risk • Confidence Intervals • if two risks are the same, the relative risk is 1 • see if confidence interval contains 1 • Hypothesis Tests • to test if two individual risks are equal, test to see if the relative risk is 1 • use the chi-square value to find the P-value

  24. Case Study: Relationship between breast cancer and induced abortion. Daling et al. (1994), “Risk of breast cancer among young women: relationship to induced abortion,” Journal of the National Cancer Institute, Vol. 86, No. 21, pp. 1584-1592. Is the risk of breast cancer among women who have had an induced abortion different from the risk among those who have not?

  25. Case Study: Sample. Relationship between breast cancer and induced abortion • 845 breast cancer cases were identified in Washington State from 1983 to 1990. • 910 control women were identified using random-digit dialing in the same area. • Women born prior to 1944 were excluded.

  26. Case Study: C.I. Results. Relationship between breast cancer and induced abortion • The relative risk for breast cancer was 1.5, with the higher risk for women who had an induced abortion. • A 95% confidence interval for the relative risk was 1.2 to 1.9 (given). • Note that the confidence interval does not contain the value one (the risks are different).

  27. Case Study: C.I. Results. Relationship between breast cancer and induced abortion • No increased risk was found for women who had spontaneous abortions; the relative risk was 0.9. • A 95% confidence interval for the relative risk was 0.7 to 1.2 (given). • Note that the confidence interval does contain the value one (the risks are not significantly different).
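
The confidence-interval check used on the two preceding slides amounts to asking whether the interval contains 1. A minimal sketch using the reported intervals; the function name is an illustrative choice:

```python
def interval_contains_one(lower, upper):
    """True if a confidence interval for relative risk includes 1 (equal risks)."""
    return lower <= 1.0 <= upper


induced = interval_contains_one(1.2, 1.9)       # induced abortion, slide 26
spontaneous = interval_contains_one(0.7, 1.2)   # spontaneous abortion, slide 27

print(induced)      # False: the data suggest the risks differ
print(spontaneous)  # True: equal risks remain plausible
```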

  28. Case Study (continued): Relationship between breast cancer and induced abortion. Daling et al. (1994), “Risk of breast cancer among young women: relationship to induced abortion,” Journal of the National Cancer Institute, Vol. 86, No. 21, pp. 1584-1592. Is the risk of breast cancer among women who have had an induced abortion different from the risk among those who have not?

  29. Case Study: The Hypotheses • Null: The risk of developing breast cancer for women who have had an induced abortion is the same as the risk for women who have not had an induced abortion. [RR = 1] • Alt: The risk of developing breast cancer for women who have had an induced abortion is different from the risk for women who have not had an induced abortion. [RR ≠ 1]

  30. Case Study: Test Statistic and P-value • Relative Risk = 1.5 • Could also display data in a 2×2 table and compute the chi-square value (9.75). • The P-value (we will not compute this one, just take it from the study) is 0.002. • Recall: Z = square root of the chi-square (for 2×2 tables).
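
Although the slide takes the P-value from the study, the relationship Z = square root of chi-square gives a way to approximate it for a 2×2 table. The sketch below is an illustration under that assumption, not the study's own calculation; it uses the standard library's `math.erfc` to get the two-sided normal tail area.

```python
import math


def p_value_2x2(chi_square):
    """Approximate P-value for a 2x2 table via Z = sqrt(chi-square).

    The two-sided tail area of a standard normal, P(|Z| >= z),
    equals erfc(z / sqrt(2)).
    """
    z = math.sqrt(chi_square)
    return math.erfc(z / math.sqrt(2))


p = p_value_2x2(9.75)
print(round(math.sqrt(9.75), 2))   # Z statistic
print(round(p, 3))                 # P-value, rounded as on the slide
```

As a sanity check, a chi-square equal to the critical value 1.96 squared gives a P-value of about 0.05.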

  31. Case Study: Decision • Since the P-value is small, we reject chance as the reason for the relative risk (1.5) being different from 1.0. • We find the result to be statistically significant. • We reject the null hypothesis. The data provide evidence that the two population risks (of developing breast cancer) are not the same.

  32. Two-Way Table: General Procedure • The remainder of this chapter presents the general procedure for determining if a significant relationship exists between two categorical variables with any number of levels • how to analyze two-way tables with any number of rows and columns • results apply to the special case of 2×2 tables

  33. Case Study: Health Care: Canada and U.S. Mark, D. B. et al., “Use of medical resources and quality of life after acute myocardial infarction in Canada and the United States,” New England Journal of Medicine, 331 (1994), pp. 1130-1135. Data from patients’ own assessment of their quality of life relative to what it had been before their heart attack (data from patients who survived at least a year).

  34. Case Study: Health Care: Canada and U.S.

  35. Case Study: Health Care: Canada and U.S. Compare the Canadian group to the U.S. group in terms of feeling much better: 75 Canadians reported feeling much better, compared to 541 Americans. The groups appear greatly different, but look at the group totals.

  36. Case Study: Health Care: Canada and U.S. Compare the Canadian group to the U.S. group in terms of feeling much better: change the counts to percents. Now, with a fairer comparison using percents, the groups appear very similar in terms of feeling much better.

  37. Case Study: Health Care: Canada and U.S. Is there a relationship between the explanatory variable (Country) and the response variable (Quality of life)? For each level of the explanatory variable (Country), look at the percents across all levels of the response variable (Quality of life). Conclude that a relationship exists if these distributions look significantly different.

  38. Hypothesis Test • In tests for two categorical variables, we are interested in whether a relationship observed in a single sample reflects a real relationship in the population. • Hypotheses: • Null: the percentages for one variable are the same for every level of the other variable (no real relationship). • Alt: the percentages for one variable vary over levels of the other variable (there is a real relationship).

  39. Case Study: Health Care: Canada and U.S. Null hypothesis: The percentages for one variable are the same for every level of the other variable (no real relationship). For example, we could look at differences in percentages between Canada and U.S. for each level of “Quality of life”: 24% vs. 25% for those who felt ‘Much better’, 23% vs. 23% for ‘Somewhat better’, etc. * We want to do all of these comparisons as one overall test…

  40. Hypothesis Test • H0: no real relationship between the two categorical variables that make up the rows and columns of a two-way table • To test H0, compare the observed counts in the table (the original data) with the expected counts (the counts we would expect if H0 were true) • if the observed counts are far from the expected counts, that is evidence against H0 in favor of a real relationship between the two variables

  41. Expected Counts • The expected count in any cell of a two-way table (when H0 is true) is $\text{expected count} = \dfrac{\text{row total} \times \text{column total}}{\text{table total}}$
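
As a worked illustration of an expected count, take the Thought Question 1 data from slide 2 (chosen here only because its counts are fully stated). There are 30 Democrats, 12 + 24 = 36 voters who preferred cutting taxes, and 70 voters in all, so the expected count of Democrats preferring tax cuts under H0 is the row total times the column total, divided by the table total:

```latex
\[
\text{expected count (Democrat, cut taxes)}
  = \frac{30 \times 36}{70}
  = \frac{1080}{70}
  \approx 15.43
\]
```

The observed count in that cell is 12, so the observed value falls about 3.4 below what H0 would predict.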

  42. Case Study: Health Care: Canada and U.S. For the observed data to the right, find the expected value for each cell. For the expected count of Canadians who feel ‘Much better’ (expected count for Row 1, Column 1):

  43. Case Study: Health Care: Canada and U.S. Compare the observed counts with the expected counts to see if the data support the null hypothesis.

  44. Chi-Square Statistic • To determine if the differences between the observed counts and expected counts are statistically significant (to show a real relationship between the two categorical variables), we use the chi-square statistic: $X^2 = \sum \dfrac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$, where the sum is over all cells in the table.
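
A minimal sketch of this calculation for a table with any number of rows and columns is shown below. Expected counts are computed as row total times column total divided by table total, and the statistic sums (observed - expected) squared over expected across all cells. The function names are illustrative, and the example data are the Thought Question 1 counts from slide 2 (with the "balance budget" counts derived by subtraction).

```python
def expected_counts(observed):
    """Expected counts under H0: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: Democrat, Republican. Columns: balance budget, cut taxes.
observed = [[18, 12], [16, 24]]
x2 = chi_square(observed)
print(round(x2, 2))
```

For these counts the result rounds to 2.75, the value quoted on slide 15.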

  45. Chi-Square Statistic • The chi-square statistic is a measure of the distance of the observed counts from the expected counts • is always zero or positive • is only zero when the observed counts are exactly equal to the expected counts • large values of $X^2$ are evidence against H0 because these would show that the observed counts are far from what would be expected if H0 were true • the chi-square test is one-sided (any violation of H0 produces a large value of $X^2$)

  46. Case Study: Health Care: Canada and U.S. Observed counts and expected counts

  47. Chi-Square Test • Calculate value of chi-square statistic • by hand (cumbersome) • using technology (computer software, etc.) • Find P-value in order to reject or fail to reject H0 • use chi-square table for chi-square distribution (next few slides) • from computer output • If significant relationship exists (small P-value): • compare appropriate percents in data table • compare individual observed and expected cell counts • look at individual terms in the chi-square statistic

  48. Case Study: Health Care: Canada and U.S. Using Technology:

  49. Chi-Square Distributions • Family of distributions that take only positive values and are skewed to the right • Specific chi-square distribution is specified by giving its degrees of freedom (formula on next slide)

  50. Chi-Square Distributions • Chi-square test for a two-way table with r rows and c columns uses critical values from a chi-square distribution with $(r - 1)(c - 1)$ degrees of freedom • P-value is the area to the right of $X^2$ under the density curve of the chi-square distribution • use chi-square table
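
The degrees-of-freedom rule is simple enough to sketch directly; the function name is an illustrative choice. For a 2×2 table it gives 1 degree of freedom, which is the distribution behind the 3.84 critical value used earlier in the chapter.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for the chi-square test on an r x c two-way table."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))   # 2x2 table
print(degrees_of_freedom(3, 4))   # a hypothetical 3x4 table
```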
