Chapter 14

Chapter 14 Chi Square - χ² Chi square

Chi Square
• Chi Square is a non-parametric statistic used to test the null hypothesis.
• It is used for nominal data.
• For nominal data, it plays the role that the F test played in single factor and factorial analysis.

… Chi Square
• Nominal data puts each participant in a category. Categories work best when they are mutually exclusive and exhaustive, meaning that every participant fits in one and only one category.
• Chi Square looks at frequencies in the categories.

Expected frequencies and the null hypothesis ...
• Chi Square compares the expected frequencies in categories to the observed frequencies in categories.
• “Expected frequencies” are the frequencies in each cell predicted by the null hypothesis.

… Expected frequencies and the null hypothesis ...
The null hypothesis:
• H0: fo = fe
• There is no difference between the observed frequency and the frequency predicted (expected) by the null.
The experimental hypothesis:
• H1: fo ≠ fe
• The observed frequency differs significantly from the frequency predicted (expected) by the null.

Calculating χ²
For each cell:
• Calculate the deviation of the observed frequency from the expected frequency.
• Square the deviation.
• Divide the squared deviation by the expected value.

Calculating χ²
• Add up the resulting values across all cells.
• Then look up χ² in the Chi Square table.
• df = k - 1 (one-sample χ²), where k is the number of categories
• OR df = (Columns - 1) * (Rows - 1) (two or more samples)
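The calculation steps can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function names `chi_square`, `df_one_sample`, and `df_multi_sample` are made up for this sketch, and the counts are borrowed from the swimmer example later in the chapter.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = o - e              # deviation of observed from expected
        total += deviation ** 2 / e    # squared deviation divided by expected
    return total


def df_one_sample(k):
    """Degrees of freedom for a one-sample test with k categories."""
    return k - 1


def df_multi_sample(rows, columns):
    """Degrees of freedom for a table with the given numbers of rows and columns."""
    return (columns - 1) * (rows - 1)


# Swimmer example: observed 20 and 20, expected 30 and 10.
stat = chi_square([20, 20], [30, 10])
print(round(stat, 2), df_one_sample(2))   # 13.33 1
```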

Critical values of χ²

df      1      2      3      4      5      6      7      8
.05   3.84   5.99   7.81   9.49  11.07  12.59  14.07  15.51
.01   6.63   9.21  11.34  13.28  15.09  16.81  18.48  20.09

df      9     10     11     12     13     14     15     16
.05  16.92  18.31  19.68  21.03  22.36  23.68  25.00  26.30
.01  21.67  23.21  24.72  26.22  27.69  29.14  30.58  32.00

df     17     18     19     20     21     22     23     24
.05  27.59  28.87  30.14  31.41  32.67  33.92  35.17  36.42
.01  33.41  34.81  36.19  37.57  38.93  40.29  41.64  42.98

df     25     26     27     28     29     30
.05  37.65  38.89  40.14  41.34  42.56  43.77
.01  44.31  45.64  46.96  48.28  49.59  50.89

Reading the table: the df rows give the degrees of freedom, the .05 rows give the critical values for α = .05, and the .01 rows give the critical values for α = .01.

Example
If there were 5 degrees of freedom, how big would χ² have to be for significance at the .05 level? (From the table: 11.07.)

Another example
If there were 2 degrees of freedom, how big would χ² have to be for significance at the .05 level? (From the table: 5.99.)
Note: Unlike most other tables you have seen, the critical values for Chi Square get larger as df increase. This is because you are summing over more cells, each of which usually contributes to the total observed value of chi square.
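The two table lookups can be checked with a small sketch. The dictionary holds only the df 1 to 8 values from the critical-value table (with the standard value 7.81 at df = 3, α = .05); the names `CRITICAL`, `critical_value`, and `is_significant` are hypothetical helpers, not something the slides define.

```python
# Critical values for df 1 through 8, keyed by alpha level.
CRITICAL = {
    0.05: {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49,
           5: 11.07, 6: 12.59, 7: 14.07, 8: 15.51},
    0.01: {1: 6.63, 2: 9.21, 3: 11.34, 4: 13.28,
           5: 15.09, 6: 16.81, 7: 18.48, 8: 20.09},
}


def critical_value(df, alpha=0.05):
    """Look up the tabled critical value for the given df and alpha."""
    return CRITICAL[alpha][df]


def is_significant(chi2, df, alpha=0.05):
    """True when the obtained value meets or exceeds the critical value."""
    return chi2 >= critical_value(df, alpha)


print(critical_value(5))   # 11.07
print(critical_value(2))   # 5.99
```

Note that the lookup grows with df, matching the point that more cells contribute to the sum.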

One-sample example from the CPE
Party: 75% male, 25% female. There are 40 swimmers. Since 75% of the people at the party are male, 75% of the swimmers should be male under the null. So the expected value for males is .750 x 40 = 30.00. For females it is .250 x 40 = 10.00.

             Male   Female
Observed       20       20
Expected       30       10
O-E           -10       10
(O-E)²        100      100
(O-E)²/E     3.33    10.00

χ² = 13.33
df = k - 1 = 2 - 1 = 1

χ²(1, n=40) = 13.33
Exceeds the critical value at α = .01 (6.63 for df = 1). Reject the null hypothesis. Gender does affect who goes swimming. Women go swimming more than expected. Men go swimming less than expected.

2 sample example
Freshmen and sophomores who like horror movies.

                        Freshmen   Sophomores
Likes horror films           150           50
Dislikes horror films        100          200

… CPE 15.2.1
Freshmen and sophomores and horror movies. There are 500 altogether: 200 (a proportion of .400) like horror films and 300 (.600) dislike them. (Proportions appear in parentheses in the margins.) Multiplying each row proportion by each column total yields the expected frequency for each cell, for example .400 x 250 = 100 for the first cell. (This time we use the formula: Prop_row x n_col = Expected Frequency.) (EF appears in parentheses in each cell.)

                        Freshmen    Sophomores   Row total (prop.)
Likes horror films      150 (100)    50 (100)    200 (.400)
Dislikes horror films   100 (150)   200 (150)    300 (.600)
Column total            250         250          500

Computing χ²

                 Observed   Expected    O-E   (O-E)²   (O-E)²/E
Fresh Likes           150        100     50     2500      25.00
Fresh Dislikes        100        150    -50     2500      16.67
Soph Likes             50        100    -50     2500      25.00
Soph Dislikes         200        150     50     2500      16.67

χ² = 83.33
df = (C-1)(R-1) = (2-1)(2-1) = 1

χ²(1, n=500) = 83.33
Exceeds the critical value at α = .01 (6.63 for df = 1). Reject the null hypothesis. The Fresh/Soph dimension does affect liking for horror movies. Proportionally, more freshmen than sophomores like horror movies.
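The expected frequencies and χ² for this two-sample example can be reproduced from the observed counts alone. The sketch below assumes the table is given as a list of rows; `expected_frequencies` and `chi_square_table` are illustrative names, and the expected count is written as row total x column total / n, which is algebraically the same as row proportion x column n.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_table(table):
    """Chi square for a rows-by-columns table of observed counts."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: likes, dislikes. Columns: freshmen, sophomores.
horror = [[150, 50], [100, 200]]
print(expected_frequencies(horror))         # [[100.0, 100.0], [150.0, 150.0]]
print(round(chi_square_table(horror), 2))   # 83.33
```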

The only (slightly) hard part is computing expected frequencies
In the one-sample case, multiply n by the hypothetical proportion based on the random model. The random model says that the proportion in each category of the sample should be the same as the proportion in the population.
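As a minimal sketch of the one-sample rule, the helper below multiplies n by each hypothesized proportion. The name `expected_from_proportions` is invented for illustration; the two calls use the party example (75% male, 25% female, 40 swimmers) and the radio station example that follows (100 teenagers, four equally likely stations).

```python
def expected_from_proportions(n, proportions):
    """Expected frequency for each category: n times its hypothesized proportion."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [n * p for p in proportions]


print(expected_from_proportions(40, [0.75, 0.25]))                # [30.0, 10.0]
print(expected_from_proportions(100, [0.25, 0.25, 0.25, 0.25]))   # [25.0, 25.0, 25.0, 25.0]
```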

Simple Example - 100 teenagers listen to radio stations
H1: Some stations are more popular with teenagers than others.
H0: Radio stations do not differ in popularity with teenagers.
NOTE: YOU ALWAYS TEST H0

                   Station A   Station B   Station C   Station D
Expected Values           25          25          25          25
Observed Values           40          30          20          10

Expected frequencies are the frequencies predicted by the null hypothesis. In this case, the problem is simple because the null predicts that an equal proportion of teenagers will prefer each of the four radio stations. Is the observed significantly different from the expected?

            Observed   Expected    O-E   (O-E)²   (O-E)²/E
Station A         40         25     15      225       9.00
Station B         30         25      5       25       1.00
Station C         20         25     -5       25       1.00
Station D         10         25    -15      225       9.00

χ² = 20.00
df = k - 1 = (4-1) = 3
χ²(3, n=100) = 20.00, p<.01
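Written out as a single sum, the radio station calculation looks like this (a worked restatement of the table values, with O the observed and E the expected frequency in each cell):

```latex
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
       &= \frac{(40-25)^2}{25} + \frac{(30-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(10-25)^2}{25} \\
       &= 9.00 + 1.00 + 1.00 + 9.00 = 20.00
\end{aligned}
```

Because each deviation is squared, the sign of O - E does not affect the total; it only tells you whether a station drew more or fewer listeners than expected.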

Example - Admissions to Psychiatric Hospitals Close to a Once/Year Final
H1: More students are admitted to psychiatric hospitals when it is near their final exam.
H0: Time from the final exam does not have an effect on hospital admissions.
Category 1: Within 7 days of final. (11 admitted)
Category 2: Between 8 and 30 days. (24 admitted)
Category 3: Between 31 and 90 days. (69 admitted)
Category 4: More than 90 days. (96 admitted)

Psychiatric Admissions
• Expected frequency = expected proportion of days * n
• There are 365 days and 1 final and 200 patients admitted each year.
• Proportion of each kind of day computed below:
Number of days
Category 1 (within 7):
Category 2 (8-30):
Category 3 (31-90):
Category 4 (rest of year):

Expected Frequencies
To obtain expected frequencies with 200 admissions: multiply the proportion of days of each type by n=200. This time the proportions are not equal.
Days:
Category 1 (within 7):
Category 2 (8-30):
Category 3 (31-90):
Category 4 (rest of year):

Closeness to final exam

             Observed   Expected    O-E   (O-E)²   (O-E)²/E
Category 1         11          8      3        9       1.12
Category 2         24         26     -2        4       0.15
Category 3         69         66      3        9       0.14
Category 4         96        100     -4       16       0.16

χ² = 1.57
df = k - 1 = (4-1) = 3
χ²(3, n=200) = 1.57, n.s.

The only (slightly) hard part is computing expected frequencies
In the multi-sample case, multiply the proportion in each row by the n in each column to obtain the EF in each cell.

Vit C and flu study
• Sixty randomly chosen participants.
• Thirty get Vitamin C.
• Of that 30, 10 get the flu, 20 do not.
• Thirty get placebo.
• Of that 30, 15 get the flu, 15 do not.

Expected frequency = proportion_ROW x n_COL

              got flu   no flu   row n (prop.)
Vit C              10       20   30 (.500)
No Vit C           15       15   30 (.500)
Col. Totals        25       35   n=60

Expected frequencies

             Had Influenza   No influenza
Vitamin C     10 (12.50)      20 (17.50)
Placebo       15 (12.50)      15 (17.50)
Observed values, with (expected) values in parentheses.

Multiply the proportion in each row by the number in each column. Here the Vitamin C row has 30 research participants. Total n = 60. So the proportion in that row = 30/60 = .500. Same for the placebo group. Number in each column: twenty-five got influenza, so 25 x .500 = 12.50 should come from the Vitamin C group. Same for placebo. Thirty-five did not get influenza, so 35 x .500 = 17.50 of each group should not have gotten the flu. Are the observed significantly different from the expected?

Computing χ²

                   Observed   Expected     O-E   (O-E)²   (O-E)²/E
VitC-got flu             10      12.50   -2.50     6.25        .50
VitC-no flu              20      17.50    2.50     6.25        .36
Placebo-got flu          15      12.50    2.50     6.25        .50
Placebo-no flu           15      17.50   -2.50     6.25        .36

χ² = 1.72
df = (C-1)(R-1) = (2-1)(2-1) = 1

Differences are not significant
χ²(1, n=60) = 1.72, n.s.
Vitamin C consumption was not significantly related to getting the flu in this study.
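For reference, the Vitamin C arithmetic can be laid out in one place. This is a worked restatement of the slide values, not a new result:

```latex
\begin{aligned}
E_{\text{got flu}} &= \tfrac{30}{60} \times 25 = 12.50, \qquad
E_{\text{no flu}} = \tfrac{30}{60} \times 35 = 17.50 \\
\chi^2 &= 2 \times \frac{(2.50)^2}{12.50} + 2 \times \frac{(2.50)^2}{17.50}
        = 1.000 + 0.714 \approx 1.71
\end{aligned}
```

The slides report 1.72 because each cell's value is rounded to two decimals before summing (.50 + .36 + .50 + .36). Either figure falls well below the critical value of 3.84 for df = 1 at α = .05.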

A 3 x 4 Chi Square
Women, stress, and seating preferences (perimeter vs. interior and front vs. back).

                               Front   Front   Back    Back    Row
                               Perim   Inter   Perim   Inter   total
Very Stressed Females             10      70       5      15     100
Moderately Stressed Females       15      50      10      25     100
Control Group Females             35      30      15      20     100
Column total                      60     150      30      60   n=300

Proportion in each row: n_ROW/n = 100/300 = .333

Expected frequencies
Women, stress, and perimeter versus interior seating preferences. Each expected frequency (in parentheses) is the row proportion (.333) times the column total, filled in column by column: .333 x 60 = 20, .333 x 150 = 50, .333 x 30 = 10, .333 x 60 = 20.

                               Front     Front     Back      Back      Row
                               Perim     Inter     Perim     Inter     total
Very Stressed Females          10 (20)   70 (50)    5 (10)   15 (20)     100
Moderately Stressed Females    15 (20)   50 (50)   10 (10)   25 (20)     100
Control Group Females          35 (20)   30 (50)   15 (10)   20 (20)     100
Column total                   60        150       30        60          300

                       Observed   Expected    O-E   (O-E)²   (O-E)²/E
Very Stressed
  FrontP                     10         20    -10      100       5.00
  FrontI                     70         50     20      400       8.00
  BackP                       5         10     -5       25       2.50
  BackI                      15         20     -5       25       1.25
Moderately Stressed
  FrontP                     15         20     -5       25       1.25
  FrontI                     50         50      0        0       0.00
  BackP                      10         10      0        0       0.00
  BackI                      25         20      5       25       1.25
Control Group
  FrontP                     35         20     15      225      11.25
  FrontI                     30         50    -20      400       8.00
  BackP                      15         10      5       25       2.50
  BackI                      20         20      0        0       0.00

χ² = 41.00
df = (C-1)(R-1) = (4-1)(3-1) = 6

χ²(6, N=300) = 41.00
Exceeds the critical value at α = .01 (16.81 for df = 6). Reject the null hypothesis. Stress level is related to seating position.

Very stressed women avoid the perimeter and prefer the front interior. The control group prefers the perimeter and avoids the front interior.
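One way to see which cells drive this result is to compute each cell's (O-E)²/E separately and rank them. The sketch below does that for the 3 x 4 seating data; the variable names are illustrative only, and expected counts are computed as row total x column total / n.

```python
rows = ["Very Stressed", "Moderately Stressed", "Control Group"]
cols = ["FrontP", "FrontI", "BackP", "BackI"]
observed = [
    [10, 70, 5, 15],
    [15, 50, 10, 25],
    [35, 30, 15, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell to the overall chi square.
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        contributions[(rows[i], cols[j])] = (o - e) ** 2 / e

chi2 = sum(contributions.values())
largest = max(contributions, key=contributions.get)

print(round(chi2, 2))   # 41.0
print(largest)          # ('Control Group', 'FrontP')
```

The largest single contribution comes from the control group's front perimeter cell, followed by the two front interior cells, which is consistent with the interpretation given in the slides.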

Summary: Different Ways of Computing the Frequencies Predicted by the Null Hypothesis
• One sample
  • Expect subjects to be distributed equally in each cell. OR
  • Expect subjects to be distributed proportionally in each cell. OR
  • Expect subjects to be distributed in each cell based on prior knowledge, such as previous research.
• Multi-sample
  • Expect subjects in different conditions to be distributed similarly to each other. Find the proportion in each row and multiply by the number in each column to do so.

Conclusion - Chi Square
• Chi Square is a non-parametric statistic, used for nominal data.
• For nominal data, it plays the role that the F test played in single factor and factorial analysis.
• Chi Square compares the expected frequencies in categories to the observed frequencies in categories.

… Conclusion - Chi Square
The null hypothesis:
• H0: fo = fe
• There is no difference between the observed frequency and the frequency predicted by the null hypothesis.
The experimental hypothesis:
• H1: fo ≠ fe
• The observed frequency differs significantly from the frequency expected by the null hypothesis.

The end.
