Contingency Tables • Often we have limited measurement of our data. • Contingency Tables are a means of looking at the impact of nominal and ordinal measures on each other. • They are called contingency tables because one variable's value is contingent upon the other's. • Also called cross-tabulation or crosstabs.
Contingency Tables • The procedure is quite simple and intuitively appealing • Construct a table with the independent variable across the top and the dependent variable on the side • This works fairly well for low numbers of categories (r,c < 6 or so)
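A minimal sketch of how such a table might be built, using only the Python standard library. The observations and category labels are invented for illustration and are not data from the lecture; `crosstab` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations: (independent value, dependent value) pairs.
observations = [
    ("low approval", "used force"),
    ("low approval", "used force"),
    ("low approval", "no force"),
    ("high approval", "used force"),
    ("high approval", "no force"),
    ("high approval", "no force"),
]

def crosstab(pairs):
    """Count pairs into a table with the independent variable across the
    top (columns) and the dependent variable down the side (rows)."""
    counts = Counter(pairs)
    col_labels = sorted({x for x, _ in pairs})
    row_labels = sorted({y for _, y in pairs})
    table = [[counts[(x, y)] for x in col_labels] for y in row_labels]
    return row_labels, col_labels, table

rows, cols, table = crosstab(observations)

# Print the table with a header row of column labels.
print("".ljust(12), *[c.ljust(14) for c in cols])
for label, row in zip(rows, table):
    print(label.ljust(12), *[str(n).ljust(14) for n in row])
```

Sorting the labels alphabetically is only a convenience here; for ordinal variables the categories would need to be listed in their natural order instead.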
Contingency Tables: An example • Presidents are often suspected of using military force to enhance their popularity. • What do you suppose the data actually look like? • Any conjectures? • Let’s categorize presidents as using force, or not, and as having popularity above and below 50% • Are there definition problems here? • Which is independent and which is dependent?
Measures of Independence • Are the variables actually contingent upon each other? • Is the use of force contingent upon the president’s level of popularity? • We would like to know whether these variables are independent of each other, or whether the use of force actually depends upon the level of approval that the president has at that time.
χ² Test of Independence • The χ² Test of Independence gives us a test of statistical significance. • It is accomplished by comparing the actual observed values to those you would expect to see if the two variables are independent.
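χ² Test of Independence • Formula: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ • Where $O_{ij}$ is the observed count in row $i$, column $j$, and $E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}$ is the count expected if the two variables are independent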
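The statistic sums, over every cell, the squared gap between the observed count and the count expected under independence, divided by the expected count. A small sketch of that calculation follows; the counts are hypothetical and `chi_square` is an illustrative name, not a library function.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the row and column variables are independent.
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat

# Hypothetical 2x2 counts (not the presidential data from the lecture).
observed = [[30, 20],
            [20, 30]]
stat = chi_square(observed)
print(stat)
```

This sketch assumes every row and column total is positive; an all-zero row or column would make an expected count zero and the division would fail.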
Interpreting the χ² • The Table gives us a χ² of 5.55 with 1 degree of freedom [d.f. = (r-1)*(c-1)] • The critical value of χ² with 1 degree of freedom is 3.84 (see χ² Table) • We therefore conclude that Presidential popularity and use of force are related. • We technically “reject the null hypothesis that Presidential popularity and use of force are independent.” • Note: χ² is influenced by sample size. • It ranges from 0.0 to ∞.
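The decision rule on this slide can be sketched in a few lines. The value 5.55 and the critical value 3.84 come from the slide; 3.84 is the conventional .05-level cutoff for 1 degree of freedom. The p-value helper is an addition for illustration and relies on an identity that holds only for 1 degree of freedom.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """d.f. = (r - 1) * (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def p_value_1df(chi2):
    """Upper-tail probability for a chi-square variate with 1 d.f.

    Uses P(X > x) = erfc(sqrt(x / 2)), which is valid only when d.f. = 1.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))

chi2 = 5.55       # statistic reported on the slide
critical = 3.84   # .05 critical value for 1 d.f.

df = degrees_of_freedom(2, 2)
reject_independence = chi2 > critical
p = p_value_1df(chi2)
print(df, reject_independence, p)
```

For tables with more than 1 degree of freedom, the critical value would have to be read from a χ² table or computed with a statistics library instead.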
Corrected χ² measures • Small tables have slightly biased measures of χ² • If there are cell frequencies that are low, then there are some adjustments to make that correct the probability estimates that χ² provides.
Yates’ Corrected χ² • For use with a 2x2 table with low cell frequencies (5<n<10) • If there are any cell frequencies < 5, the χ² is invalid. • Use Fisher’s Exact Test
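Both adjustments can be sketched for a 2x2 table with the standard library alone. The Yates version subtracts 0.5 from each absolute observed-minus-expected gap before squaring; the Fisher version sums the hypergeometric probabilities of every table with the same margins that is no more likely than the one observed (one common two-sided convention, assumed here). Function names and counts are hypothetical.

```python
from math import comb

def yates_chi_square(table):
    """Continuity-corrected chi-square for a 2x2 table [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            # Shrink each gap by 0.5, but never below zero.
            stat += max(0.0, abs(o - e) - 0.5) ** 2 / e
    return stat

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small tolerance so floating-point ties with p_obs are included.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))

small = [[3, 1],
         [1, 3]]   # hypothetical counts, all below 5
print(yates_chi_square(small), fisher_exact_two_sided(small))
```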
Measures of Association • Not only do we want to see whether the variables of a cross-tabulation are independent, we often want to see if the relationship is a strong or weak one. • To do this, we use what are referred to as measures of association. • The level of measurement determines what measure of association we might use.
Measures of Association • We group them according to whether the variables are nominal or ordinal. • If one variable is nominal, use nominal measures. • If both are ordinal, use an ordinal measure. • If either is interval, generally we use a different statistical design.
Measures based on χ² • Contingency Coefficient • Cramér’s V
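Both measures rescale χ² so that sample size matters less. A sketch using the usual textbook definitions, C = sqrt(χ² / (χ² + N)) and V = sqrt(χ² / (N · min(r-1, c-1))); the input values below are hypothetical.

```python
import math

def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient C."""
    return math.sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V for an r x c table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Hypothetical inputs: chi-square of 4.0 from a 2x2 table with N = 100.
c_coef = contingency_coefficient(4.0, 100)
v = cramers_v(4.0, 100, 2, 2)
print(c_coef, v)
```

Note that C cannot reach 1.0 even for a perfect relationship, which is one reason V is often preferred when comparing tables of different sizes.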
Yule’s Q • May be used on any 2x2 table, nominal or ordinal • If we define our table with cell counts as a, b (first row) and c, d (second row) • Yule’s Q is calculated as: $Q = \frac{ad - bc}{ad + bc}$ • Q ranges from -1.0 to 1.0 (0 to 1.0 in magnitude) • Q compares concordant pairs to discordant pairs
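A short sketch of the calculation, assuming the usual cell labelling (a and b in the first row, c and d in the second) and the standard definition Q = (ad - bc) / (ad + bc). Here ad counts the concordant pairs and bc the discordant pairs; the sign of Q gives the direction of the relationship. The counts are hypothetical.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]]."""
    concordant = a * d
    discordant = b * c
    return (concordant - discordant) / (concordant + discordant)

q_positive = yules_q(30, 20, 20, 30)   # hypothetical counts
q_negative = yules_q(20, 30, 30, 20)   # same counts, columns swapped
q_empty_cell = yules_q(10, 0, 5, 5)    # one empty cell
print(q_positive, q_negative, q_empty_cell)
```

Q is undefined when both ad and bc are zero, so this sketch would raise a `ZeroDivisionError` for such a table.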
Gamma • In a 2x2 table, will equal 1.0 in magnitude if any cell is empty
Lambda • Asymmetric measure of association • Calculation depends on whether the column variable or the row variable is independent
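The asymmetry can be seen in a small sketch of Goodman and Kruskal's lambda, which measures the proportional reduction in prediction errors for the dependent variable once the independent variable is known. The function name, the `dependent` argument, and the counts are all hypothetical.

```python
def goodman_kruskal_lambda(table, dependent="rows"):
    """Lambda for predicting the dependent variable from the other one.

    table is a list of rows; dependent says which dimension holds the
    dependent variable ("rows" or "columns").
    """
    if dependent == "columns":
        # Transpose so the dependent variable is always on the rows.
        table = [list(col) for col in zip(*table)]
    elif dependent != "rows":
        raise ValueError("dependent must be 'rows' or 'columns'")
    n = sum(sum(row) for row in table)
    # Errors made guessing the modal dependent category for everyone.
    errors_without = n - max(sum(row) for row in table)
    # Errors made guessing the modal category within each column.
    errors_with = n - sum(max(col) for col in zip(*table))
    if errors_without == 0:
        raise ValueError("lambda is undefined: only one dependent category")
    return (errors_without - errors_with) / errors_without

table = [[40, 30],
         [10, 20]]   # hypothetical counts
lam_rows = goodman_kruskal_lambda(table, dependent="rows")
lam_cols = goodman_kruskal_lambda(table, dependent="columns")
print(lam_rows, lam_cols)
```

With these counts the two directions give different answers, which is why the choice of dependent variable has to be made before computing lambda.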
Ordinal Measures • Goodman & Kruskal’s Gamma • For Ordinal x Ordinal tables • May also be used if one of the variables is a nominal dichotomy
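Gamma is conventionally defined as (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs. A sketch under the assumption that rows and columns are both listed in ascending category order; the helper names and counts are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:                   # below and to the right
                        concordant += pairs
                    elif l < j:                 # below and to the left
                        discordant += pairs
    return concordant, discordant

def gamma(table):
    """Goodman and Kruskal's gamma; tied pairs are ignored."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

ordinal = [[10, 5, 2],
           [4, 8, 6],
           [1, 3, 9]]   # hypothetical 3x3 counts
print(concordant_discordant(ordinal), gamma(ordinal))
```

On a 2x2 table this reduces to Yule's Q, since the only concordant pairs are ad and the only discordant pairs are bc.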
Lambda • Asymmetric
Tau-b & Tau-c • Similar to Gamma • If r=c, use tau-b; if r<>c, use tau-c
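Unlike gamma, the tau measures take tied pairs into account. A sketch assuming the usual definitions: Kendall's tau-b divides C - D by the geometric mean of the pairs not tied on the row variable and the pairs not tied on the column variable, and Stuart's tau-c is 2m(C - D) / (n²(m - 1)) with m = min(r, c). Function names and counts are hypothetical.

```python
import math

def concordant_minus_discordant(table):
    """C - D for a table whose rows and columns are in ascending order."""
    n_rows, n_cols = len(table), len(table[0])
    total = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        total += table[i][j] * table[k][l]
                    elif l < j:
                        total -= table[i][j] * table[k][l]
    return total

def tau_b(table):
    """Kendall's tau-b, suited to square tables (r = c)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    all_pairs = n * (n - 1) / 2
    tied_rows = sum(t * (t - 1) / 2 for t in row_totals)
    tied_cols = sum(t * (t - 1) / 2 for t in col_totals)
    denom = math.sqrt((all_pairs - tied_rows) * (all_pairs - tied_cols))
    return concordant_minus_discordant(table) / denom

def tau_c(table):
    """Stuart's tau-c, suited to rectangular tables (r != c)."""
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    return 2 * m * concordant_minus_discordant(table) / (n ** 2 * (m - 1))

ordinal = [[10, 5, 2],
           [4, 8, 6],
           [1, 3, 9]]   # hypothetical 3x3 counts
print(tau_b(ordinal), tau_c(ordinal))
```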