
Contingency Table Analysis


Presentation Transcript


  1. Contingency Table Analysis Mary Whiteside, Ph.D.

  2. Overview • Hypotheses of equal proportions • Hypotheses of independence • Exact distributions and Fisher’s test • The Chi squared approximation • Median test • Measures of dependence • The Chi squared goodness-of-fit test • Cochran’s test

  3. Contingency Table Examples • Countries - religion by government • States – dominant political party by geographic region • Mutual funds - style by family • Companies - industry by location of headquarters

  4. More examples - • Countries - government by GDP categories • States - divorce laws by divorce rate categories • Mutual funds - family by Morning Star rankings • Companies - industry by price earnings ratio category

  5. Statistical Inference: hypothesis of equal proportions • H0: all probabilities (estimated by proportions, i.e. relative frequencies) in the same column are equal • H1: at least two of the probabilities in the same column are not equal • Here, for an r x c contingency table, r populations are sampled with fixed row totals n1, n2, …, nr

  6. Hypothesis of independence • H0: no association, i.e. the row and column variables are independent • H1: an association, i.e. the row and column variables are not independent • Here, one population is sampled with sample size N. Row totals are random variables.

  7. Exact distribution for 2 x 2 tables: hypothesis of equal proportions; n1 = n2 = 2
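
  The table of exact probabilities from this slide is not reproduced in the transcript. As a minimal sketch of the idea, assuming a common success probability p shared by both rows under H0 (the value p = 0.5 below is an arbitrary choice for illustration), the exact null distribution for n1 = n2 = 2 can be enumerated in Python:

  # Sketch: enumerate the exact null distribution of a 2 x 2 table when each of
  # two populations contributes a fixed sample of size 2 (n1 = n2 = 2) and, under
  # H0, both share a common success probability p.  p = 0.5 is an assumption.
  from math import comb

  p = 0.5
  n1 = n2 = 2
  for x1 in range(n1 + 1):          # successes in row 1
      for x2 in range(n2 + 1):      # successes in row 2
          prob = (comb(n1, x1) * p**x1 * (1 - p)**(n1 - x1)
                  * comb(n2, x2) * p**x2 * (1 - p)**(n2 - x2))
          print(f"x1={x1}, x2={x2}: P={prob:.4f}")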

  8. Fisher’s Exact Test • For 2 x 2 tables, assuming fixed row and column totals r, N - r, c, N - c • Test statistic = x, the frequency of cell (1,1) • Probability = the hypergeometric probability of x successes in a sample of size r from a population of size N containing c successes
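
  A minimal sketch of this calculation in Python, using a made-up 2 x 2 table (scipy's hypergeom and fisher_exact are used here; they are not part of the original slides):

  # Sketch of Fisher's exact test on a hypothetical 2 x 2 table.  The cell (1,1)
  # count x follows a hypergeometric distribution with population size N, c
  # "successes" in the population, and a sample of size r (the row 1 total).
  import numpy as np
  from scipy.stats import hypergeom, fisher_exact

  table = np.array([[8, 2],
                    [1, 5]])          # made-up counts
  x = table[0, 0]                     # frequency of cell (1,1)
  r = table[0].sum()                  # row 1 total
  c = table[:, 0].sum()               # column 1 total
  N = table.sum()

  print(hypergeom(M=N, n=c, N=r).pmf(x))   # exact probability of this table
  print(fisher_exact(table))               # odds ratio and two-sided p-value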

  9. Large sample approximation for either test • Chi squared = Σ (Observed - Expected)² / Expected, summed over all cells • Observed frequency for cell ij comes from the cross-tabulation of the data • Expected frequency for cell ij = Probability of cell ij * N • Degrees of freedom = (r - 1)*(c - 1)
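
  As a sketch of the large-sample test in practice, scipy.stats.chi2_contingency returns the Chi squared statistic, the p-value, the degrees of freedom, and the expected frequencies in one call; the 2 x 3 counts below are made up for illustration:

  # Hypothetical 2 x 3 table of counts: rows = populations (or levels of one
  # variable), columns = categories of the other variable.
  import numpy as np
  from scipy.stats import chi2_contingency

  observed = np.array([[30, 15, 5],
                       [20, 25, 15]])

  chi2, p_value, dof, expected = chi2_contingency(observed)
  print(chi2, p_value, dof)   # dof = (r - 1) * (c - 1) = 2
  print(expected)             # expected counts under H0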

  10. Computing Cell Probabilities • Assumes independence or equal probabilities (the null hypothesis) • Probability of cell ij = Probability of row i * Probability of column j = (Ri/N) * (Cj/N) • Expected frequency for cell ij = (Ri/N)*(Cj/N)*N = Ri*Cj/N
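
  The expected frequencies and the statistic can also be checked directly from these formulas; a sketch using the same made-up 2 x 3 counts as in the previous example:

  # Sketch: expected frequency for cell ij is Ri * Cj / N, and the Chi squared
  # statistic is the sum of (Observed - Expected)^2 / Expected over all cells.
  import numpy as np

  observed = np.array([[30, 15, 5],
                       [20, 25, 15]])
  row_totals = observed.sum(axis=1, keepdims=True)   # Ri
  col_totals = observed.sum(axis=0, keepdims=True)   # Cj
  N = observed.sum()

  expected = row_totals * col_totals / N             # Ri * Cj / N
  chi2 = ((observed - expected) ** 2 / expected).sum()
  dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
  print(chi2, dof)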

  11. Distribution of the Sum • Chi squared with (r - 1)*(c - 1) degrees of freedom • Assumes (Observed - Expected)² / Expected is a squared standard normal

  12. Implies (Observed - Expected) / √Expected is standard normal • Implies μ = σ² and Observed is a Poisson RV • A Poisson is approximately normal if μ > 5, the traditional guideline • Conover’s relaxed guideline, page 201
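
  A quick simulation illustrates the guideline: once the Poisson mean exceeds roughly 5, the standardized count (Observed - Expected)/√Expected is close to standard normal. The mean of 8 and the number of draws below are arbitrary choices:

  # Sketch: standardized Poisson counts (Observed - Expected) / sqrt(Expected)
  # are approximately standard normal once the mean exceeds roughly 5.
  # The mean mu = 8 and the number of draws are illustration choices only.
  import numpy as np

  rng = np.random.default_rng(0)
  mu = 8.0                                   # Poisson mean (also its variance)
  counts = rng.poisson(mu, size=100_000)     # simulated Observed values
  z = (counts - mu) / np.sqrt(mu)            # standardized counts

  print(z.mean(), z.std())                   # close to 0 and 1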

  13. Measures of Strength: Categorical Variables • Phi (2 x 2 tables) • Cramer's V (r x c tables) • Pearson's Contingency Coefficient • Tschuprow's T
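
  All four measures are simple functions of the Chi squared statistic, the sample size N, and the table dimensions r and c; a sketch (the inputs passed at the end are made up for illustration):

  # Sketch: nominal measures of strength computed from the Chi squared statistic.
  import numpy as np

  def association_measures(chi2, N, r, c):
      phi = np.sqrt(chi2 / N)                                # Phi (2 x 2 tables)
      cramers_v = np.sqrt(chi2 / (N * (min(r, c) - 1)))      # Cramer's V (r x c)
      pearson_c = np.sqrt(chi2 / (chi2 + N))                 # contingency coefficient
      tschuprow_t = np.sqrt(chi2 / (N * np.sqrt((r - 1) * (c - 1))))
      return phi, cramers_v, pearson_c, tschuprow_t

  print(association_measures(chi2=10.5, N=110, r=2, c=3))    # made-up inputs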

  14. Measures of Strength: Ordinal Variables • Lambda A (rows dependent) • Lambda B (columns dependent) • Symmetric Lambda • Kendall's tau-B • Kendall's tau-C • Gamma
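
  As one example from this list, Goodman and Kruskal's gamma can be computed directly from concordant and discordant pairs in an ordered table; the 3 x 3 counts below are made up and both variables are assumed to be ordinal:

  # Sketch: gamma = (C - D) / (C + D), where C counts concordant pairs and D
  # counts discordant pairs in an ordered r x c table.
  import numpy as np

  table = np.array([[20, 10,  5],
                    [ 8, 15, 12],
                    [ 3,  9, 18]])

  C = D = 0
  r, c = table.shape
  for i in range(r):
      for j in range(c):
          C += table[i, j] * table[i + 1:, j + 1:].sum()   # below and to the right
          D += table[i, j] * table[i + 1:, :j].sum()       # below and to the left
  gamma = (C - D) / (C + D)
  print(gamma)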

  15. Steps of Statistical Analysis: Significance, then Strength • 1 - Test for significance of the observed association • 2 - If significant, measure the strength of the association

  16. Consider the correlation coefficient, a measure of association (the linear relationship between two quantitative variables). It can be: • significant but not strong • significant and strong • not significant but “strong” • not significant and not strong

  17. r and its p-value (matching the four cases above) • r = .20, p-value < .05 • r = .90, p-value < .05 • r = .90, p-value > .05 • r = .20, p-value > .05

  18. Concepts • Predictive associations must be both significant and strong • In a particular application, an association may be important even if it is not predictive (i.e., strong)

  19. More concepts • Highly significant, weak associations result from large samples • Insignificant “strong” associations result from small samples; they may prove to be either predictive or weak with larger samples

  20. Examples • Heart Attack Outcomes by Anticoagulant Treatment • Admission Decisions by Gender

  21. Summary • Is there an association? Investigate with the Chi squared p-value • If so, how strong is it? Select the appropriate measure of strength of association • Where does it occur? Examine the cell contributions
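
  A sketch of the last step, examining cell contributions, using the same made-up 2 x 3 counts as earlier: each cell's (Observed - Expected)²/Expected term shows how much it contributes to the overall Chi squared.

  # Sketch: per-cell contributions (Observed - Expected)^2 / Expected show where
  # a significant association comes from.  The counts are made up for illustration.
  import numpy as np

  observed = np.array([[30, 15, 5],
                       [20, 25, 15]])
  expected = observed.sum(axis=1, keepdims=True) * observed.sum(axis=0) / observed.sum()
  contributions = (observed - expected) ** 2 / expected

  print(np.round(contributions, 2))   # large entries flag the cells driving Chi squared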
