## Chi-Square


Heibatollah Baghi and Mastee Badii

**Chi-Square (χ²) and Frequency Data**

• Up to this point, inference to the population has been concerned with "scores" on one or more variables, such as CAT scores, mathematics achievement, and hours spent on the computer.
• We used these scores to make inferences about population means. To be sure, not all research questions involve score data.
• Today the data that we analyze consist of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale.
• The test statistic for frequency data is the Pearson chi-square. Its magnitude reflects the amount of discrepancy between observed frequencies and expected frequencies.

**Steps in Test of Hypothesis**

1. Determine the appropriate test
2. Establish the level of significance: α
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled/critical value

**1. Determine Appropriate Test**

• Chi-square is used when both variables are measured on a nominal scale.
• It can be applied to interval or ratio data that have been categorized into a small number of groups.
• It assumes that the observations are randomly sampled from the population.
• All observations are independent (an individual can appear only once in a table and there are no overlapping categories).
• It does not make any assumptions about the shape of the distribution or about the homogeneity of variances.

**2. Establish Level of Significance**

• α is a predetermined value
• Conventional values: α = .05, α = .01, α = .001

**3. Determine the Hypothesis: Whether There Is an Association or Not**

• Ho: The two variables are independent
• Ha: The two variables are associated

**4. Calculating Test Statistics**

• The test statistic contrasts the observed frequencies in each cell of a contingency table with the expected frequencies.
• The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true (i.e., the nominal variables are unrelated).
• The expected frequency of two unrelated events is the product of the row and column frequencies divided by the number of cases: Fe = (Fr × Fc) / N
• The statistic sums, over all cells, the squared difference between observed and expected frequency divided by the expected frequency: χ² = Σ (Fo − Fe)² / Fe, where Fo is the observed frequency and Fe is the expected frequency.

**5. Determine Degrees of Freedom**

df = (R − 1)(C − 1), where R is the number of levels in the row variable and C is the number of levels in the column variable.

**6. Compare Computed Test Statistic Against a Tabled/Critical Value**

• The computed value of the Pearson chi-square statistic is compared with the critical value to determine whether the computed value is improbable.
• The critical tabled values are based on sampling distributions of the Pearson chi-square statistic.
• If the calculated χ² is greater than the tabled χ² value, reject Ho.

**Example**

• Suppose a researcher is interested in voting preferences on gun control issues.
• A questionnaire was developed and sent to a random sample of 90 voters.
• The researcher also collects information about the political party membership of the 90 respondents.

**Bivariate Frequency Table or Contingency Table**

(Table of observed frequencies by party membership and voting preference, with row frequencies and column frequencies; not reproduced here.)

**1. Determine Appropriate Test**

• Party membership: 2 levels, nominal
• Voting preference: 3 levels, nominal

**2. Establish Level of Significance**

Alpha of .05

**3. Determine the Hypothesis**

• Ho: There is no difference between Democrats and Republicans in their opinions on the gun control issue.
• Ha: There is an association between responses to the gun control survey and party membership in the population.

**4. Calculating Test Statistics**

• Expected frequency for a cell with row frequency 50 and column frequency 25: Fe = 50 × 25 / 90 ≈ 13.89
• Expected frequency for a cell with row frequency 40 and column frequency 25: Fe = 40 × 25 / 90 ≈ 11.11
• χ² = 11.03

**5. Determine Degrees of Freedom**

df = (R − 1)(C − 1) = (2 − 1)(3 − 1) = 2

**6. Compare Computed Test Statistic Against a Tabled/Critical Value**

• α = 0.05
• df = 2
• Critical tabled value = 5.991
• The test statistic, 11.03, exceeds the critical value
• The null hypothesis is rejected
• Democrats and Republicans differ significantly in their opinions on gun control issues

**Additional Information in SPSS Output**

• Exceptions that might distort χ² assumptions
• Associations in some but not all categories
• Low expected frequency per cell
• The extent of association is not the same as statistical significance

Demonstrated through an example.

**Another Example: Heparin Lock Placement**

Time: 1 = 72 hrs, 2 = 96 hrs (from Polit text, Table 8-1)

**Hypotheses in Heparin Lock Placement**

• Ho: There is no association between complication incidence and length of heparin lock placement. (The variables are independent.)
• Ha: There is an association between complication incidence and length of heparin lock placement. (The variables are related.)

**SPSS Output: Pearson Chi-Square**

• Pearson chi-square = .250, p = .617. Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time.
• Continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10.

**SPSS Output: Phi Coefficient**

• Pearson chi-square provides information about the existence of a relationship between 2 nominal variables, but not about the magnitude of the relationship.
• The phi coefficient is a measure of the strength of the association.

**Cramer’s V**

• When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V.
• If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable.
• V = √(χ² / (N(k − 1))), where N is the number of cases and k is the smaller of the number of rows and the number of columns.

**Take Home Lesson**

How to test the association between the frequencies of two nominal variables.
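
As a rough illustration of the six steps applied to the gun control example, the sketch below computes the expected frequencies, the Pearson chi-square, the degrees of freedom, and Cramer’s V in plain Python. The observed counts are an assumption: the contingency table itself is not reproduced in this text, so the values were chosen only to agree with the figures that are (row frequencies 50 and 40, a column frequency of 25, N = 90, and χ² of 11.025, which rounds to the reported 11.03). The function names are hypothetical, and the critical value 5.991 is copied from the slides rather than computed.

```python
import math

# ASSUMED observed counts; the original table was not recovered.
# Rows: party membership (2 levels). Columns: voting preference (3 levels).
observed = [
    [10, 10, 30],  # row frequency 50
    [15, 15, 10],  # row frequency 40
]


def expected_frequencies(table):
    """Fe = (Fr * Fc) / N for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


def pearson_chi_square(table):
    """Sum of (Fo - Fe)^2 / Fe over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """df = (R - 1)(C - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


def cramers_v(table):
    """V = sqrt(chi2 / (N * (k - 1))), k = smaller of rows and columns."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(pearson_chi_square(table) / (n * (k - 1)))


chi2 = pearson_chi_square(observed)
df = degrees_of_freedom(observed)

# Tabled critical value for alpha = .05 and df = 2, taken from the slides.
CRITICAL_VALUE = 5.991
reject_null = chi2 > CRITICAL_VALUE

print(f"chi-square = {chi2:.3f}, df = {df}, reject Ho: {reject_null}")
print(f"Cramer's V = {cramers_v(observed):.2f}")
```

With real data, only `observed` needs to change; the critical value must be looked up again whenever α or the degrees of freedom differ from the example.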