Analysis of Categorical Data: Goodness-of-Fit Tests

Chapter 12 The Analysis of Categorical Data and Goodness of Fit Tests

Suppose we wanted to determine if the proportions for the different colors in a large bag of M&M candies match the proportions that the company claims are in their candies. We could record the color of each candy in the bag. This would be univariate, categorical data. How many categories for color would there be? There are six colors, so k = 6, where k is used to denote the number of categories for a categorical variable.

M&M Candies Continued . . . We could count how many candies of each color are in the bag. A one-way frequency table is used to display the observed counts for the k categories. A goodness-of-fit test will allow us to determine if these observed counts are consistent with what we expect to have.

Goodness-of-Fit Test Procedure The goodness-of-fit test is used to analyze univariate categorical data from a single sample. Null Hypothesis: H0: p1 = hypothesized proportion for Category 1, . . . , pk = hypothesized proportion for Category k Alternative Hypothesis: Ha: H0 is not true Test Statistic: X² = Σ (observed count – expected count)² / (expected count), summed over all k categories. The goodness-of-fit statistic, denoted by X² (read “chi-squared”), is a quantitative measure of the extent to which the observed counts differ from those expected when H0 is true. The X² value can never be negative.
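The statistic described on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual form of the statistic (squared difference between observed and expected count, divided by the expected count, summed over the categories). The function name and the candy counts are hypothetical and are not taken from the slides.

```python
def goodness_of_fit_statistic(observed, hypothesized_props):
    """Return (chi-square statistic, expected counts) for a one-way table."""
    n = sum(observed)
    # Expected count for each category = n * hypothesized proportion
    expected = [n * p for p in hypothesized_props]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, expected


# Hypothetical counts for k = 6 colors (made up for illustration, not real M&M data)
observed = [12, 8, 10, 11, 9, 10]
# Hypothetical H0: all six colors are equally likely
props = [1 / 6] * 6

x2, expected = goodness_of_fit_statistic(observed, props)
print(f"X2 = {x2:.3f}")
```

Because every term is a squared quantity divided by a positive expected count, the result can never be negative, which matches the remark on the slide.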

Goodness-of-Fit Test Procedure Continued . . . P-values: When H0 is true and all expected counts are at least 5, X² has approximately a chi-square distribution with df = k – 1. Therefore, the P-value associated with the computed test statistic value is the area to the right of X² under the df = k – 1 chi-square curve. Assumptions: • Observed cell counts are based on a random sample • The sample size is large enough as long as every expected cell count is at least 5
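As a rough sketch of how the P-value could be obtained without a printed table, the code below computes the area to the right of a statistic under a chi-square curve using only the Python standard library. The series used for the incomplete gamma function is a choice made for this example and is adequate only for moderate values of the statistic. A statistics library routine such as `scipy.stats.chi2.sf` would normally be used instead. The function names and the numbers in the example are hypothetical.

```python
import math


def chi_square_right_tail(x2, df, terms=500):
    """Area to the right of x2 under the chi-square curve with df degrees of freedom."""
    if x2 <= 0:
        return 1.0
    a, x = df / 2.0, x2 / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x)
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def expected_counts_large_enough(expected, minimum=5):
    """Check the sample-size assumption: every expected count is at least 5."""
    return all(e >= minimum for e in expected)


# Hypothetical values for illustration only
x2, k = 1.0, 6
expected = [10.0] * k

if expected_counts_large_enough(expected):
    p_value = chi_square_right_tail(x2, df=k - 1)
    print(f"df = {k - 1}, P-value = {p_value:.4f}")
```

The assumption check is done first because the chi-square approximation is only recommended when every expected count is at least 5.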

Facts About χ² distributions • Different df have different curves • χ² curves are skewed right • As df increases, the χ² curve shifts toward the right and becomes more like a normal curve (Figure: χ² curves for df = 3, df = 5, and df = 10)
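These facts can be checked numerically with the standard moments of a chi-square distribution: the mean equals df, the variance equals 2·df, and the skewness equals √(8/df). The short sketch below, with a hypothetical helper name, prints those quantities for the three curves named on the slide.

```python
import math


def chi_square_shape(df):
    """Mean, standard deviation, and skewness of a chi-square distribution."""
    return {
        "mean": df,                     # center moves right as df grows
        "sd": math.sqrt(2 * df),
        "skewness": math.sqrt(8 / df),  # positive (right skew), shrinks as df grows
    }


for df in (3, 5, 10):
    shape = chi_square_shape(df)
    print(f"df = {df}: mean = {shape['mean']}, "
          f"sd = {shape['sd']:.2f}, skewness = {shape['skewness']:.2f}")
```

The skewness stays positive but decreases toward 0 as df increases, which is one way to see the curve becoming more like a normal curve.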

A common urban legend is that more babies than expected are born during certain phases of the lunar cycle, especially near the full moon. The table below shows the number of days in the eight lunar phases with the number of births in each phase for 24 lunar cycles. There are eight phases so k = 8.

Lunar Phases Continued . . . Let:
p1 = proportion of births that occur during the new moon
p2 = proportion of births that occur during the waxing crescent moon
p3 = proportion of births that occur during the first quarter moon
p4 = proportion of births that occur during the waxing gibbous moon
p5 = proportion of births that occur during the full moon
p6 = proportion of births that occur during the waning gibbous moon
p7 = proportion of births that occur during the last quarter moon
p8 = proportion of births that occur during the waning crescent moon
There is a total of 699 days in the 24 lunar cycles. If there is no relationship between the number of births and lunar phase, then the expected proportions equal the number of days in each phase out of the total number of days. The hypothesis statements would be:
H0: p1 = .0343, p2 = .2175, p3 = .0343, p4 = .2132, p5 = .0343, p6 = .2146, p7 = .0343, p8 = .2175
Ha: H0 is not true

Lunar Phases Continued . . . H0: p1 = .0343, p2 = .2175, p3 = .0343, p4 = .2132, p5 = .0343, p6 = .2146, p7 = .0343, p8 = .2175 Ha: H0 is not true There is a total of 222,784 births in the sample. If there is no relationship between the number of births and lunar phase, then the expected counts for each category would equal n(hypothesized proportion).
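The expected counts under H0 follow directly from n(hypothesized proportion). The sketch below uses only the sample size and the hypothesized proportions stated on the slides. The observed birth counts are not reproduced in this transcript, so the test statistic itself is not computed here.

```python
n_births = 222_784  # total births in the sample, from the slide

phases = [
    "new moon", "waxing crescent", "first quarter", "waxing gibbous",
    "full moon", "waning gibbous", "last quarter", "waning crescent",
]
# Hypothesized proportions from H0 (days in each phase / 699 total days)
hypothesized = [.0343, .2175, .0343, .2132, .0343, .2146, .0343, .2175]

# Expected count for each phase = n * hypothesized proportion
expected_counts = [n_births * p for p in hypothesized]

for phase, count in zip(phases, expected_counts):
    print(f"{phase:16s} {count:10.2f}")
```

Every expected count is far above 5, so the sample-size assumption for the goodness-of-fit test is comfortably met.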

Lunar Phases Continued . . . H0: p1 = .0343, p2 = .2175, p3 = .0343, p4 = .2132, p5 = .0343, p6 = .2146, p7 = .0343, p8 = .2175 Ha: H0 is not true Test Statistic: The X² test statistic is smaller than the smallest entry in the df = 7 column of Appendix Table 8, so P-value > .10. df = 7, α = .05 Since the P-value > α, we fail to reject H0. There is not sufficient evidence to conclude that lunar phases and number of births are related. What type of error could we have potentially made with this decision? Type II

A study was conducted to determine if collegiate soccer players had an increased risk of concussions over other athletes or students. The two-way frequency table below (also called a contingency table) displays the number of previous concussions for students in independently selected random samples of 91 soccer players, 96 non-soccer athletes, and 53 non-athletes. This is univariate categorical data (number of concussions) from 3 independent samples. The values in green are the observed counts, the values in blue are the marginal totals, and the value in red is the grand total. If there were no difference between these 3 populations in regard to the number of concussions, how many soccer players would you expect to have no concussions? We would expect (158/240)(91) ≈ 59.9.
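The same reasoning gives the expected number with no concussions in each of the three samples. The sketch below uses only the totals quoted on the slide and assumes that 158 is the marginal total for the "no concussions" column, as the expression (158/240)(91) implies. The variable names are chosen for this example.

```python
# Sample sizes (row marginal totals) from the slide
row_totals = {
    "soccer players": 91,
    "non-soccer athletes": 96,
    "non-athletes": 53,
}
grand_total = sum(row_totals.values())  # 240

# Assumed marginal total for the "no concussions" column, read from (158/240)(91)
no_concussion_total = 158

# Expected count = (column total / grand total) * row total
expected_no_concussions = {
    group: no_concussion_total / grand_total * n
    for group, n in row_totals.items()
}

for group, count in expected_no_concussions.items():
    print(f"{group:20s} {count:6.2f}")
```

The three expected counts add back up to the column total of 158, which is a quick check on the arithmetic.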

X² Test for Homogeneity The χ² Test for Homogeneity is used to analyze univariate categorical data from 2 or more independent samples. Null Hypothesis: H0: the true category proportions are the same for all the populations or treatments Alternative Hypothesis: Ha: the true category proportions are not all the same for all the populations or treatments Test Statistic: X² = Σ (observed count – expected count)² / (expected count), summed over all cells of the table.

X² Test for Homogeneity Continued . . . Expected Counts: (assuming H0 is true) expected cell count = (row marginal total)(column marginal total) / (grand total) P-value: When H0 is true and all expected counts are at least 5, X² has approximately a chi-square distribution with df = (number of rows – 1)(number of columns – 1). The P-value associated with the computed test statistic value is the area to the right of X² under the appropriate chi-square curve.
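A minimal sketch of these calculations for a two-way table follows. It assumes the standard expected-count rule (row total times column total divided by grand total) and the df rule stated above. The function name and the small 2 × 3 table are hypothetical and are not data from the slides.

```python
def two_way_chi_square(table):
    """Return (chi-square statistic, df, expected counts) for a two-way table.

    `table` holds observed counts only, without marginal totals.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count = (row total)(column total) / (grand total)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df, expected


# Hypothetical observed counts: 2 samples (rows) by 3 response categories (columns)
table = [
    [20, 30, 50],
    [30, 30, 40],
]
x2, df, expected = two_way_chi_square(table)
print(f"X2 = {x2:.3f}, df = {df}")
```

The same arithmetic applies to a test for independence. Only the way the data were collected and the wording of the hypotheses differ.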

X² Test for Homogeneity Continued . . . Assumptions: • Data are from independently chosen random samples or from subjects who were assigned at random to treatment groups. • The sample size is large: all expected cell counts are at least 5. If some expected counts are less than 5, rows or columns of the table may be combined to achieve a table with satisfactory expected counts.
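Combining columns to satisfy the expected-count assumption might be sketched as follows. The helper names and the 3 × 4 table are hypothetical, chosen so that the last column is too sparse until it is merged with its neighbor.

```python
def expected_counts(table):
    """Expected counts under H0: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def all_expected_at_least(table, minimum=5):
    """True when every expected cell count meets the minimum."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


def combine_columns(table, i, j):
    """Return a new table with column j added into column i (requires i < j)."""
    merged = []
    for row in table:
        new_row = list(row)
        new_row[i] += new_row[j]
        del new_row[j]
        merged.append(new_row)
    return merged


# Hypothetical observed counts: 3 groups (rows) by 4 response categories (columns)
table = [
    [40, 30, 15, 5],
    [50, 30, 8, 2],
    [45, 5, 0, 0],
]
print(all_expected_at_least(table))      # some expected counts fall below 5

combined = combine_columns(table, 2, 3)  # merge the last two categories
print(all_expected_at_least(combined))
df_after = (len(combined) - 1) * (len(combined[0]) - 1)
print(f"df after combining = {df_after}")
```

Merging categories lowers the degrees of freedom, so df should be recomputed from the combined table before finding the P-value.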

Soccer Players Continued . . . State the hypotheses. H0: Proportions in each response category (number of concussions) are the same for all three groups Ha: Category proportions are not all the same for all three groups To find df, count the number of rows and columns, not including the totals: df = (number of rows – 1)(number of columns – 1) = (2)(3) = 6. Another way to find df: cover one row and one column, then count the number of cells left (not including totals).

Soccer Players Continued . . . Notice that NOT all the expected counts are at least 5. So combine the column for 2 concussions and the column for 3 or more concussions. This combined table has df = (2)(2) = 4. Expected counts are shown in parentheses next to the observed counts. Test Statistic: with df = 4, P-value < .001; α = .05

Soccer Players Continued . . . Since the P-value < α, we reject H0. There is strong evidence to suggest that the category proportions for the number of concussions are not the same for the 3 groups. Is that all I can say – that there is a difference in proportions for the groups? We can look at the chi-square contributions: which of the cells above have the greatest contributions to the value of the X² statistic? These cells had the largest contributions to the X² test statistic.
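One way to look at the contributions is to compute (observed – expected)² / expected for each cell and sort the cells from largest to smallest. The sketch below does this for a hypothetical 3 × 2 table. The function name and the counts are illustrative only and are not the concussion data.

```python
def cell_contributions(table):
    """Return per-cell chi-square contributions, largest first.

    Each entry is (row index, column index, observed, expected, contribution).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    cells = []
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            contribution = (observed - expected) ** 2 / expected
            cells.append((i, j, observed, expected, contribution))
    return sorted(cells, key=lambda cell: cell[4], reverse=True)


# Hypothetical observed counts: 3 groups (rows) by 2 response categories (columns)
table = [
    [20, 10],
    [10, 20],
    [15, 25],
]
ranked = cell_contributions(table)
for i, j, observed, expected, contribution in ranked:
    direction = "more" if observed > expected else "fewer"
    print(f"cell ({i}, {j}): observed {observed}, expected {expected:.1f}, "
          f"contribution {contribution:.3f} ({direction} than expected)")
```

Reporting whether each large contribution comes from more or fewer observations than expected gives a more specific description than simply saying the proportions differ.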

X² Test for Independence The χ² Test for Independence is used to analyze bivariate categorical data from a single sample. Null Hypothesis: H0: The two variables are independent Alternative Hypothesis: Ha: The two variables are not independent Test Statistic: X² = Σ (observed count – expected count)² / (expected count), summed over all cells of the table.

X² Test for Independence Continued . . . Expected Counts: (assuming H0 is true) expected cell count = (row marginal total)(column marginal total) / (grand total) P-value: When H0 is true and assumptions for the X² test are satisfied, X² has approximately a chi-square distribution with df = (number of rows – 1)(number of columns – 1). The P-value associated with the computed test statistic value is the area to the right of X² under the appropriate chi-square curve.

X² Test for Independence Continued . . . Assumptions: • The observed counts are based on data from a random sample. • The sample size is large: all expected cell counts are at least 5. If some expected counts are less than 5, rows or columns of the table may be combined to achieve a table with satisfactory expected counts.

The paper “Contemporary College Students and Body Piercing” (Journal of Adolescent Health, 2004) described a survey of 450 undergraduate students at a state university in the southwestern region of the United States. Each student in the sample was classified according to class standing (freshman, sophomore, junior, senior) and body art category (body piercing only, tattoos only, both tattoos and body piercing, no body art). Is there evidence that there is an association between class standing and response to the body art question? Use α = .01. State the hypotheses.

Body Art Continued . . . H0: class standing and body art category are independent Ha: class standing and body art category are not independent Assuming H0 is true, what are the expected counts? How many degrees of freedom does this two-way table have? With 4 class standings and 4 body art categories, df = (4 – 1)(4 – 1) = 9.

Body Art Continued . . . Test Statistic: P-value < .001, α = .01

Body Art Continued . . . Since the P-value < α, we reject H0. There is sufficient evidence to suggest that class standing and the body art category are associated. Which cell contributes the most to the X² test statistic? Seniors having both body piercing and tattoos contribute the most to the X² statistic.
