Chapter 17: Chi-Square Tests

Presentation Transcript


  1. Chapter 17: Chi-Square Tests
     • These tests can be used when all of the data from a study have been measured on nominal scales; that is, the data are in the form of frequencies for different categories (e.g., How many of the consumers studied preferred each of the five most popular brands of toothpaste in the United States? How many professors are female and how many are male in ten major academic fields in China?).
     • You would need a multinomial distribution to determine the exact probabilities associated with random choices. It is much easier to use an approximation instead: a chi-square statistic is calculated, which approximately follows the well-known chi-square distribution when the null hypothesis is true.
     Prepared by Samantha Gaies, M.A.

  2. The Chi-Square (χ²) Distribution
     • This distribution describes the values of χ² expected when the null hypothesis is true and is therefore used to find the appropriate critical values for chi-square tests.
     • Its shape depends on the number of degrees of freedom (df) associated with the table of data.
     • Because χ² can't be less than 0, all chi-square distributions are positively skewed, though the skewing decreases as the df get larger.
     • As the df approach infinity, the chi-square distribution approaches the normal distribution.
     • Similar to the way the F distribution is used for ANOVA, only one tail of the chi-square distribution is used to determine the statistical significance of a chi-square test.
     • Differences larger than expected by chance lead to χ² values in the positive (upper) tail; unusually small values fall in the short lower tail, near zero.
     • Only large discrepancies from chance expectations are inconsistent with the null hypothesis.
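The claim that the skew shrinks as df grow can be checked numerically. The sketch below assumes the standard result that a chi-square distribution with df degrees of freedom has skewness √(8/df); the function name is illustrative and not taken from the slides.

```python
import math


def chi_square_skewness(df: int) -> float:
    """Skewness of a chi-square distribution with `df` degrees of freedom.

    Assumes the standard closed form sqrt(8 / df).
    """
    if df <= 0:
        raise ValueError("df must be a positive integer")
    return math.sqrt(8.0 / df)


# The skewness stays positive but shrinks toward 0 (the value for a
# normal distribution) as df increases.
for df in (1, 2, 5, 10, 50, 1000):
    print(f"df = {df:>4}: skewness = {chi_square_skewness(df):.3f}")
```

The values never reach zero for finite df, which matches the point that the distribution only approaches normality in the limit.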

  3. One-Way Chi-Square Tests
     • "One way" means that all categories are considered levels of the same factor. The df equal one less than the number of categories (i.e., k – 1).
     • χ²crit increases as alpha decreases, and increases as df increase.
     • The formula for any chi-square test is as follows:
       χ² = Σ (fo – fe)² / fe
       where fo is the observed frequency for any one of the categories, fe is the expected frequency for that same category, and the summation is performed over all of the categories.
     • If the calculated value for χ² is larger than the critical value of χ², then the null hypothesis can be rejected.
     • The nature of the null hypothesis depends on the type of chi-square test performed, as described next.
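As a minimal sketch of this calculation, the function below sums (fo – fe)² / fe over the categories, which is the standard Pearson form of the statistic. The function names and the counts are hypothetical and chosen only for illustration; 5.99 is the critical value the slides quote for df = 2 at alpha = .05.

```python
from typing import Sequence


def chi_square_statistic(observed: Sequence[float],
                         expected: Sequence[float]) -> float:
    """Sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(fe <= 0 for fe in expected):
        raise ValueError("every expected frequency must be positive")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def one_way_df(k: int) -> int:
    """Degrees of freedom for a one-way test with k categories."""
    return k - 1


# Hypothetical counts for three categories, with equal expected frequencies.
observed = [10, 20, 30]
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
critical_value = 5.99  # quoted in the slides for df = 2, alpha = .05

print(f"chi-square = {chi2:.2f}, df = {one_way_df(len(observed))}")
print("reject H0" if chi2 > critical_value else "cannot reject H0")
```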

  4. Types of One-Way χ² Tests
     1. Expected frequencies are hypothesized to be equal
        • Games of chance (with equal frequencies of outcomes), or
        • Testing for equal preferences among a set of categories
        • The expected frequency for each category is the total number of obtained responses (N) divided by the number of categories (k)
     2. Population proportions are known
        • Sometimes we have good estimates of population proportions
        • Number of voters registered in different political parties
        • Percentage of the population at different income levels
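The two ways of obtaining expected frequencies can be sketched as follows. The equal-frequency case follows the N / k rule stated above; for known proportions, the sketch assumes the usual rule that each fe is N times the population proportion for that category. The function names and the example proportions are hypothetical.

```python
from typing import List, Sequence


def expected_equal(n_total: float, k: int) -> List[float]:
    """Equal expected frequencies: N divided by the number of categories."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [n_total / k] * k


def expected_from_proportions(n_total: float,
                              proportions: Sequence[float]) -> List[float]:
    """Expected frequencies from known population proportions (N * p)."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [n_total * p for p in proportions]


# N = 90 and k = 3, as in the energy-drink example later in the chapter.
print(expected_equal(90, 3))

# Hypothetical proportions for three categories in a population.
print(expected_from_proportions(200, [0.5, 0.3, 0.2]))
```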

  5. Types of One-Way Chi-Square Tests (cont.)
     3. The shape of a distribution is being tested
        • e.g., the distribution of annual income in a sample. The expected frequencies are generated by the appropriate normal distribution. Rejection of the null implies, if the sample were truly random, that the underlying population is not normally distributed.
        • If the population were known to be normal, rejection of the null would imply that either:
          • The sample is not a random one, or
          • The sample has not been drawn from the hypothesized population
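One way to generate expected frequencies from a normal distribution is to cut the scale into intervals and multiply N by the normal probability of each interval. The sketch below assumes this approach and uses Python's `statistics.NormalDist`; the mean, standard deviation, sample size, and cut points are hypothetical values, not figures from the slides.

```python
from statistics import NormalDist
from typing import List, Sequence


def normal_expected_frequencies(n_total: float, mean: float, sd: float,
                                cut_points: Sequence[float]) -> List[float]:
    """Expected counts per interval under a normal distribution.

    `cut_points` must be sorted ascending; they split the scale into
    len(cut_points) + 1 intervals, with the two end intervals open-ended.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    cumulative = [0.0] + [dist.cdf(c) for c in cut_points] + [1.0]
    return [n_total * (hi - lo) for lo, hi in zip(cumulative, cumulative[1:])]


# Hypothetical: annual income in $K, hypothesized mean 50 and SD 15,
# N = 200 people sorted into four income bands.
fe = normal_expected_frequencies(200, mean=50, sd=15, cut_points=[30, 50, 80])
for band, value in zip(["< 30", "30 to 50", "50 to 80", "> 80"], fe):
    print(f"{band:>9}: fe = {value:.1f}")
```

The resulting fe values would then be compared with the observed counts in each band using the usual chi-square formula.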

  6. Types of One-Way Chi-Square Tests (cont.)
     4. A theoretical model is being tested
        • Sometimes a well-formulated theory can make quantitative predictions in terms of frequencies expected in different categories.
        • In such a case, you will want your observed frequencies to be as close as possible to the theoretically expected frequencies.
        • For this type of chi-square test, you do not want to reject the null hypothesis, because the expected frequencies come from your research hypothesis.

  7. Try this example of the first type of chi-square test: Are any of the following three energy drinks preferred in the United States to any of the others?
     Note that the fes were based on dividing N (= 90) by k (= 3), the number of categories.
     • df = k – 1 = 2; the critical value is χ².05(2) = 5.99
     • The calculated χ² of 3.47 is less than 5.99, so the null hypothesis (that these three drinks are equally preferred in the U.S. population) cannot be rejected
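The arithmetic of this example can be sketched in code. The table of observed counts is not reproduced in this transcript, so the counts below are illustrative values chosen only because they sum to 90 and give the reported χ² of 3.47; they are not the original data. The p-value line assumes the closed form exp(–χ²/2), which holds only for df = 2.

```python
import math

# ASSUMPTION: illustrative counts, not the slide's original table.
observed = [28, 24, 38]

n_total = sum(observed)          # 90
k = len(observed)                # 3 categories
fe = n_total / k                 # equal expected frequency of 30 per drink

chi2 = sum((fo - fe) ** 2 / fe for fo in observed)
df = k - 1
critical_value = 5.99            # quoted in the slides for df = 2, alpha = .05

# For df = 2 only, the upper-tail probability has a simple closed form.
p_value = math.exp(-chi2 / 2)

print(f"chi-square({df}) = {chi2:.2f}")
print("reject H0" if chi2 > critical_value else "cannot reject H0")
```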

  8. The Relationship between the Binomial Test and the Chi-Square Test with Two Categories
     • With two categories, the null hypothesis can be tested by using the normal approximation to the binomial distribution, yielding a z score, or by performing a one-way χ² test. The χ² statistic will equal the square of the z score.
     • Squaring a standard normal (z) variable yields a variable that follows the chi-square distribution with one degree of freedom.
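The equality between χ² and z² can be verified numerically. The sketch below assumes the normal-approximation z score (X – NP) / √(NPQ) with no continuity correction; the function names and the example counts (60 of 100 cases in the first category) are hypothetical.

```python
import math


def binomial_z(x: int, n: int, p: float = 0.5) -> float:
    """z score for x cases out of n in the first category
    (normal approximation, no continuity correction)."""
    return (x - n * p) / math.sqrt(n * p * (1 - p))


def two_category_chi_square(x: int, n: int, p: float = 0.5) -> float:
    """One-way chi-square for the same data, treated as two categories."""
    observed = [x, n - x]
    expected = [n * p, n * (1 - p)]
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical data: 60 of 100 respondents choose the first category.
z = binomial_z(60, 100)
chi2 = two_category_chi_square(60, 100)
print(f"z = {z:.3f}, z squared = {z ** 2:.3f}, chi-square = {chi2:.3f}")
```

The match is algebraic rather than a coincidence of these numbers: summing the two cells gives (X – NP)² (1/NP + 1/NQ) = (X – NP)² / (NPQ), which is z².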

  9. Two-Way Chi-Square Tests
     • The most interesting psychological questions involve the relationship between at least two variables rather than the distribution of just one variable.
     • The two-way chi-square test is appropriate for testing the relationship between two categorical variables.
     • It is often referred to as Pearson's Chi-Square Test of Association (or Independence).
     • The Null Hypothesis for the two-way test (H0) is: there is no association between the two variables; that is, the way one of the variables is distributed into categories does not change at different levels of the second variable.
     • The Alternative Hypothesis (HA) is simply: the null hypothesis is not true; that is, the two variables are not independent.

  10. Contingency Tables
      • The data for a two-way chi-square test are usually arranged in a two-way contingency table, also known as a cross-classification table. (Therefore, the data are often referred to as cross-classified categorical data.)
      • The df for a two-way chi-square test are equal to (R – 1)(C – 1), where R = the number of rows in the table, and C = the number of columns.
      The following is an example of a two-way contingency table for a study investigating the relationship between musical ability (high or low) and confidence (high, medium, or low) in 10-year-old children.

  11. Try this example…
      Fill in the expected frequencies in the empty parentheses in the table below for the musical ability–confidence example, using the formula shown below the table (where N is the total of all the frequencies in the table):
        fe = (row total × column total) / N
      After all the fes are found, the two-way chi-square statistic is calculated using the same formula as in the one-way case, applied to each cell of the table.
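The two-way procedure can be sketched as follows, assuming the standard rule that each cell's fe is its row total times its column total, divided by N. The table of counts for the musical ability and confidence study is not reproduced in this transcript, so the 2 x 3 table below is hypothetical, as are the function names.

```python
from typing import List, Sequence

Table = Sequence[Sequence[float]]


def expected_frequencies(table: Table) -> List[List[float]]:
    """fe for each cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    return [[r * c / n_total for c in col_totals] for r in row_totals]


def two_way_chi_square(table: Table) -> float:
    """Same formula as the one-way case, summed over every cell."""
    expected = expected_frequencies(table)
    return sum((fo - fe) ** 2 / fe
               for row_o, row_e in zip(table, expected)
               for fo, fe in zip(row_o, row_e))


def two_way_df(table: Table) -> int:
    """(R - 1)(C - 1) for a table with R rows and C columns."""
    return (len(table) - 1) * (len(table[0]) - 1)


# HYPOTHETICAL counts: rows = musical ability (high, low),
# columns = confidence (high, medium, low).
table = [[20, 10, 10],
         [10, 20, 30]]

print(expected_frequencies(table))
print(f"chi-square({two_way_df(table)}) = {two_way_chi_square(table):.2f}")
```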

  12. Simplified Formula for the 2 x 2 Case
        χ² = N(AD – BC)² / [(A + B)(C + D)(A + C)(B + D)]
      where the frequencies of the 2 x 2 table are labeled as:
        A  B
        C  D
      Strength of Association
      • Can also be called effect size
      • For a 2 x 2 contingency table, it is appropriate to calculate Pearson's r for the two variables, but first you have to assign arbitrary values (e.g., 1 and 2) to the two levels of each variable.
      • Pearson's r for two dichotomous variables is called the phi coefficient, and it can be obtained directly from the chi-square statistic, as shown next.
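A quick check that the 2 x 2 shortcut agrees with the general cell-by-cell calculation is sketched below. It assumes the standard shortcut χ² = N(AD – BC)² / [(A + B)(C + D)(A + C)(B + D)], with A and B in the top row and C and D in the bottom row; the counts and function names are hypothetical.

```python
def chi_square_2x2_shortcut(a: float, b: float, c: float, d: float) -> float:
    """Shortcut chi-square for a 2 x 2 table laid out as [[A, B], [C, D]]."""
    n_total = a + b + c + d
    numerator = n_total * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def chi_square_2x2_general(a: float, b: float, c: float, d: float) -> float:
    """Same statistic via expected frequencies, cell by cell."""
    table = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n_total = a + b + c + d
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n_total
            total += (fo - fe) ** 2 / fe
    return total


# Hypothetical counts.
a, b, c, d = 10, 20, 30, 40
print(f"shortcut: {chi_square_2x2_shortcut(a, b, c, d):.4f}")
print(f"general:  {chi_square_2x2_general(a, b, c, d):.4f}")
```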

  13. Phi (φ) Coefficient (can also be called the fourfold point correlation)
      • Measure of effect size for the 2 x 2 case only:
        φ = √(χ² / N)
      • Because the square root is positive, the value of φ ranges between 0 and +1.0.
      • The square of φ is a measure of the proportion of variance accounted for in one variable by the other.
      Cramér's phi (φC)
      • For contingency tables larger than 2 x 2:
        φC = √(χ² / [N(k – 1)])
      • k = number of rows or columns, whichever is smaller, or either if both are the same
      • Ranges from 0 to +1.0
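Both effect-size measures are simple functions of χ² and N. The sketch below assumes the standard definitions φ = √(χ² / N) and φC = √(χ² / [N(k – 1)]), with k the smaller of the number of rows and columns; the function names and input values are hypothetical.

```python
import math


def phi_coefficient(chi2: float, n_total: float) -> float:
    """Phi for a 2 x 2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n_total)


def cramers_phi(chi2: float, n_total: float, n_rows: int, n_cols: int) -> float:
    """Cramer's phi: sqrt(chi-square / (N * (k - 1))), k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    if k < 2:
        raise ValueError("the table needs at least 2 rows and 2 columns")
    return math.sqrt(chi2 / (n_total * (k - 1)))


# Hypothetical values.
print(phi_coefficient(chi2=4.0, n_total=100))                     # 2 x 2 table
print(cramers_phi(chi2=8.0, n_total=100, n_rows=3, n_cols=4))     # 3 x 4 table
```

For a 2 x 2 table k – 1 equals 1, so Cramér's phi reduces to the ordinary phi coefficient.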

  14. Assumptions of the χ² Test
      1. The categories are mutually exclusive and exhaustive (all cases fall into one and only one category)
      2. Independence of observations:
         • This assumption is usually violated when the same subject is categorized more than once.
         • A violation of this assumption seriously undermines the validity of the test.
      3. Minimal size of expected frequencies:
         • The use of the chi-square distribution as an approximation becomes inaccurate if the fes are too low.
         • Strict rule: fe for each cell should be at least 5 when df > 1; 10 when df = 1
         • Less strict: no fe < 1, and no more than 20% of the fes < 5
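The two rules of thumb for expected frequencies can be written as simple checks. This is a minimal sketch that takes the fes as a flat list of cell values; the function names and example values are hypothetical.

```python
from typing import Sequence


def meets_strict_rule(expected: Sequence[float], df: int) -> bool:
    """Strict rule: every fe >= 5 when df > 1, every fe >= 10 when df = 1."""
    minimum = 10 if df == 1 else 5
    return all(fe >= minimum for fe in expected)


def meets_less_strict_rule(expected: Sequence[float]) -> bool:
    """Less strict rule: no fe below 1, and at most 20% of fes below 5."""
    if any(fe < 1 for fe in expected):
        return False
    share_below_5 = sum(1 for fe in expected if fe < 5) / len(expected)
    return share_below_5 <= 0.20


# Hypothetical expected frequencies for a 2 x 3 table (df = 2).
fes = [12, 12, 16, 18, 18, 24]
print(meets_strict_rule(fes, df=2), meets_less_strict_rule(fes))

# A hypothetical table with one small cell out of five.
print(meets_strict_rule([4, 6, 7, 8, 9], df=4),
      meets_less_strict_rule([4, 6, 7, 8, 9]))
```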

  15. Some Uses for the χ² Test for Independence
      • It can test for a relationship between two categorical variables, and φ or Cramér's φ can then measure the strength of that relationship.
      • It can be used for a study with a quantitative DV, originally designed to be analyzed with a t test or ANOVA
        • If the distribution of the DV is very far from normal, and N is not very large, it can be desirable to transform the DV to a few distinct categories (e.g., annual income < $30K; between $30K and $50K; $50K to $80K; > $80K).
        • Some power is lost by throwing away most of the quantitative information, but the validity of the chi-square test does not depend on making an assumption about the distribution of the DV.
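Converting a quantitative DV into categories might look like the sketch below, which uses the income bands given above. The slide does not say which band an income of exactly $50K belongs to, so assigning it to the $50K to $80K band is an assumption here; the two groups and their incomes are hypothetical and far too small for a real test.

```python
from collections import Counter
from typing import List, Sequence

CATEGORIES = ["< $30K", "$30K to < $50K", "$50K to $80K", "> $80K"]


def income_category(income_k: float) -> str:
    """Map an annual income (in $K) to one of the four bands.

    ASSUMPTION: exactly 50 falls in the "$50K to $80K" band.
    """
    if income_k < 30:
        return CATEGORIES[0]
    if income_k < 50:
        return CATEGORIES[1]
    if income_k <= 80:
        return CATEGORIES[2]
    return CATEGORIES[3]


def contingency_row(incomes_k: Sequence[float]) -> List[int]:
    """Counts per income band for one group, in CATEGORIES order."""
    counts = Counter(income_category(x) for x in incomes_k)
    return [counts[c] for c in CATEGORIES]


# Hypothetical, heavily skewed incomes (in $K) for two groups.
group_a = [22, 28, 35, 41, 47, 52, 64, 250]
group_b = [31, 45, 58, 66, 72, 79, 95, 400]

table = [contingency_row(group_a), contingency_row(group_b)]
print(table)
```

The resulting table of counts is what the test for independence would be run on in place of the raw incomes.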
