Understanding Types of Sample Designs and the Chi-Square Test in Research
This chapter surveys the sampling designs used in research: naturalistic samples, purposive cohorts, and case-control samples. It illustrates each design with examples, chiefly a study of the association between CMV (cytomegalovirus) and restenosis. It also covers the chi-square test of association, including hypothesis testing, P-values, and the pitfalls of small samples, and it highlights how these methods are applied in epidemiological studies of disease trends and risk factors.
In Chapter 18: • 18.1 Types of Samples • 18.2 Naturalistic and Cohort Samples • 18.3 Chi-Square Test of Association • 18.4 Test for Trend • 18.5 Case-Control • 18.6 Matched Pairs
Types of Samples I. Naturalistic Samples ≡ simple random sample or complete enumeration of the population II. Purposive Cohorts ≡ select fixed number of individuals in each exposure group III. Case-Control ≡ select fixed number of diseased and non-diseased individuals
Naturalistic (Type I) Sample Random sample of study base • How did we study CMV (the exposure) and restenosis (the disease) with a naturalistic sample? • A population was identified and sampled • The sample was classified as CMV+ and CMV− • The outcome (restenosis) was studied and compared in the groups.
Purposive Cohorts (Type II sample) Fixed numbers in exposure groups • How would I study CMV and restenosis with a purposive cohort design? • A population of CMV+ individuals would be identified. • From this population, select, say, 38 individuals. • A population of CMV− individuals would be identified. • From this population, select, say, 38 individuals. • The outcome (restenosis) would be studied and compared among the groups.
Case-Control (Type III sample) Set number of cases and non-cases • How would I study CMV and restenosis with a case-control design? • A population of patients who experienced restenosis (cases) would be identified. • From this population, select, say, 38 individuals. • A population of patients who did not restenose (controls) would be identified. • From this population, select, say, 38 individuals. • The exposure (CMV) would be studied and compared among the groups.
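The three designs differ only in which margin of the cross-tabulation is fixed by the investigator. The sketch below draws all three kinds of sample from a simulated study base. Everything in it is hypothetical: the 60% CMV prevalence, the restenosis risks, the population size, and the function names are assumptions made for illustration, not values from the CMV study.

```python
import random

rng = random.Random(18)  # fixed seed so the sketch is repeatable

# Hypothetical study base of (cmv, restenosis) pairs; all rates are made up.
study_base = []
for _ in range(5000):
    cmv = rng.random() < 0.60
    restenosis = rng.random() < (0.40 if cmv else 0.10)
    study_base.append((cmv, restenosis))


def naturalistic(base, n):
    """Type I: simple random sample of the whole study base."""
    return rng.sample(base, n)


def purposive_cohorts(base, n_per_group):
    """Type II: a fixed number of individuals from each exposure group."""
    exposed = [p for p in base if p[0]]
    unexposed = [p for p in base if not p[0]]
    return rng.sample(exposed, n_per_group) + rng.sample(unexposed, n_per_group)


def case_control(base, n_per_group):
    """Type III: a fixed number of cases and of non-cases."""
    cases = [p for p in base if p[1]]
    controls = [p for p in base if not p[1]]
    return rng.sample(cases, n_per_group) + rng.sample(controls, n_per_group)


type1 = naturalistic(study_base, 76)
type2 = purposive_cohorts(study_base, 38)
type3 = case_control(study_base, 38)

# Type II fixes the exposure margin; Type III fixes the disease margin.
print("CMV+ in purposive cohorts:", sum(cmv for cmv, _ in type2))
print("cases in case-control sample:", sum(dis for _, dis in type3))
```

In the naturalistic sample neither margin is fixed, so both the exposure and the disease totals vary from sample to sample.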
Naturalistic Sample Illustrative Example • SRS of 585 • Cross-classify education level (categorical exposure) and smoking status (categorical disease) • Tally an R-row by C-column “cross-tab”
Table Margins Row margins Total Column margins
Example Prevalence of smoking by education: Example, prevalence group 1:
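The prevalence in each group is the number of smokers in that row divided by the row total. The cell counts are not reproduced in this transcript, so the counts below are assumed, illustrative values chosen only to add up to the SRS of 585. Treat them as placeholders, not as the slide's data.

```python
# ASSUMED counts (smokers, non-smokers) by education group, 1 through 5.
# These are illustrative values, not taken from the transcript.
table = {
    1: (12, 38),
    2: (18, 67),
    3: (27, 95),
    4: (32, 239),
    5: (5, 52),
}

# Row margins, column margins, and grand total of the cross-tab.
row_totals = {g: s + ns for g, (s, ns) in table.items()}
col_totals = (
    sum(s for s, _ in table.values()),
    sum(ns for _, ns in table.values()),
)
grand_total = sum(row_totals.values())

# Prevalence of smoking in each group = smokers / row total.
prevalence = {g: s / row_totals[g] for g, (s, _) in table.items()}

for g, p in prevalence.items():
    print(f"group {g}: {table[g][0]}/{row_totals[g]} = {p:.3f}")
```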
Relative Risks Let group 1 represent the least exposed group
Illustration: RRs Note trend
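With group 1 as the baseline, each relative risk is that group's prevalence divided by the prevalence in group 1. The counts here are assumed for illustration (the transcript does not show the table), so the resulting RRs are only an example of the calculation.

```python
# ASSUMED smokers and row totals by education group (illustrative only).
smokers = [12, 18, 27, 32, 5]
totals = [50, 85, 122, 271, 57]

prevalence = [s / n for s, n in zip(smokers, totals)]

# Group 1 (index 0) is the baseline, so its RR is 1 by definition.
baseline = prevalence[0]
relative_risks = [p / baseline for p in prevalence]

for group, rr in enumerate(relative_risks, start=1):
    print(f"RR group {group} vs group 1: {rr:.2f}")
```

Reading the RRs in order of group is what reveals a trend, if there is one.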
k Levels of Response Efficacy of Echinacea. Randomized controlled clinical trial: echinacea vs. placebo in treatment of URI in children. Response variable ≡ severity of illness Source: JAMA 2003, 290(21), 2824-30
Echinacea Example • Purposive cohorts row percents • % severe, echinacea = 48 / 329 = .146 = 14.6% • % severe, placebo = 40 / 367 = .109 = 10.9% • Echinacea group fared worse than placebo
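The row percents above can be checked with a few lines of arithmetic. The risk ratio and risk difference at the end are added here as a convenience; they are derived from the two proportions on the slide and are not quoted from it.

```python
# Counts quoted on the slide: severe illness / group size.
severe = {"echinacea": 48, "placebo": 40}
group_size = {"echinacea": 329, "placebo": 367}

# Row proportions (multiply by 100 for row percents).
prop_severe = {g: severe[g] / group_size[g] for g in severe}

# Derived measures of association (not stated on the slide).
risk_ratio = prop_severe["echinacea"] / prop_severe["placebo"]
risk_difference = prop_severe["echinacea"] - prop_severe["placebo"]

for g, p in prop_severe.items():
    print(f"% severe, {g}: {100 * p:.1f}%")
print(f"risk ratio: {risk_ratio:.2f}, risk difference: {risk_difference:.3f}")
```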
§18.3 Chi-Square Test of Association A. Hypotheses. H0: no association in population Ha: association in population B. Test statistic – by hand or computer C. P-value. Via Table E or software
Chi-Square Example H0: no association in the population Ha: association in the population Data
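A minimal sketch of the "by hand" calculation: expected counts come from the margins (row total × column total / grand total), and the statistic sums (observed − expected)² / expected over all cells. The data table is not reproduced in this transcript, so the counts below are assumed, illustrative values; the function name is also an assumption.

```python
def pearson_chi_square(table):
    """Return (statistic, df) for an R-by-C table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# ASSUMED 5-by-2 table (smokers, non-smokers per education group).
observed = [[12, 38], [18, 67], [27, 95], [32, 239], [5, 52]]

x2_stat, df = pearson_chi_square(observed)
print(f"X2 = {x2_stat:.2f} with {df} df")
```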
Chi-Square P-value • X²stat = 13.20 with 4 df • In Table E, go to the 4 df row, bracket the chi-square statistic between adjacent entries, and read off the tail regions (approximate P-value) • The bracketing values for this example are 11.14 (P = .025) and 13.28 (P = .01); thus .01 < P < .025
Illustration: X²stat = 13.20 with 4 df The P-value = AUC in the tail beyond X²stat
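The tail area can also be computed directly rather than bracketed from a table. The sketch below uses the closed-form upper tail of the chi-square distribution that holds for even degrees of freedom, so it needs only the standard library. The function name is an assumption; statistical software would normally do this step.

```python
import math


def chi_square_upper_tail(x, df):
    """Area under the chi-square density beyond x, for even df only.

    For df = 2m the tail is exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


p_value = chi_square_upper_tail(13.20, 4)
print(f"P = {p_value:.4f}")

# The Table E bracketing values should land near their stated tail areas.
print(f"tail beyond 11.14: {chi_square_upper_tail(11.14, 4):.4f}")
print(f"tail beyond 13.28: {chi_square_upper_tail(13.28, 4):.4f}")
```

The computed P-value falls inside the bracket .01 < P < .025 obtained from the table.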
WinPEPI > Compare2 > F1 Input screen row 5 not visible Output
Continuity Corrected Chi-Square • Two different chi-square statistics • Both used in practice • Pearson’s (“uncorrected”) chi-square • Yates’ continuity-corrected chi-square:
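For reference, Pearson's statistic sums (O − E)² / E over the cells, while Yates' version sums (|O − E| − 0.5)² / E. The sketch below computes both for a 2-by-2 table. As an assumption made for illustration, it collapses the echinacea trial to severe vs. not severe using the counts quoted earlier (48 of 329 and 40 of 367); the original analysis used more response levels.

```python
def chi_square_2x2(table, yates=False):
    """Pearson (default) or Yates continuity-corrected chi-square, 2-by-2."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= 0.5  # continuity correction
            statistic += diff ** 2 / expected
    return statistic


# Rows: echinacea, placebo. Columns: severe, not severe (collapsed; assumed).
table = [[48, 329 - 48], [40, 367 - 40]]

pearson = chi_square_2x2(table)
yates = chi_square_2x2(table, yates=True)
print(f"Pearson X2 = {pearson:.3f}, Yates X2 = {yates:.3f}")
```

The corrected statistic is always the smaller of the two, so it gives the larger (more conservative) P-value.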
Chi-Square, cont. 1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values grow, evidence against H0 mounts. 2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
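The small-sample rule can be checked mechanically before running the test. This is a sketch under stated assumptions: the function names are made up, the 5-by-2 table uses assumed illustrative counts, and the small 2-by-2 table is hypothetical.

```python
def expected_counts(table):
    """Expected counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def too_small_for_chi_square(table, threshold=5, max_fraction=0.20):
    """True when more than 20% of cells have an expected count below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    n_small = sum(e < threshold for e in cells)
    return n_small / len(cells) > max_fraction


# ASSUMED education-by-smoking counts (illustrative only).
education = [[12, 38], [18, 67], [27, 95], [32, 239], [5, 52]]
# Hypothetical tiny table to show the rule being triggered.
tiny = [[3, 2], [1, 6]]

print("education table too small?", too_small_for_chi_square(education))
print("tiny table too small?", too_small_for_chi_square(tiny))
```

Note that the rule is about expected counts, not observed counts; a cell with a small observed count can still have an adequate expected count.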
Chi-Square, cont. 3. Supplement chi-squares with measures of association. Chi-square statistics do not quantify effects (need RR, RD, or OR). 4. For a 2-by-2 table, chi-square and z tests (Ch 17) produce identical P-values. The relationship between the statistics is: X²stat = (zstat)²
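The equivalence of the two tests for a 2-by-2 table can be verified numerically: the squared z statistic for comparing two proportions (with the pooled standard error) equals Pearson's uncorrected chi-square. The table here is an assumption made for illustration, the echinacea trial collapsed to severe vs. not severe.

```python
import math

# Collapsed 2-by-2 table (assumed): a, b = echinacea; c, d = placebo.
a, b = 48, 329 - 48   # severe, not severe
c, d = 40, 367 - 40
n1, n2 = a + b, c + d
n = n1 + n2

# z test for two independent proportions, pooled standard error.
p1, p2 = a / n1, c / n2
p_pooled = (a + c) / n
z = (p1 - p2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# Pearson chi-square via the 2-by-2 shortcut formula.
x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Two-sided P from z, and upper-tail P from chi-square with 1 df.
p_from_z = math.erfc(abs(z) / math.sqrt(2))
p_from_x2 = math.erfc(math.sqrt(x2 / 2))

print(f"z^2 = {z ** 2:.4f}, X2 = {x2:.4f}")
print(f"P (z) = {p_from_z:.4f}, P (chi-square) = {p_from_x2:.4f}")
```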
18.4 Test for Trend See pp. 431 – 436
§18.5 Case-Control Sampling • Identify all cases in source population • Randomly select non-cases (controls) from source population • Ascertain exposure status of subjects • Cross-tabulate • Efficient way to study rare outcomes
Case-Control Sampling Select non-case at random when case occurs Miettinen. Am J Epidemiol 1976; 103, 226-235.
Odds Ratio Cross-tabulate exposure (E) & disease (D) Calculate cross-product ratio OR stochastically = RR
BD1 Data • Cases: esophageal cancer • Controls: noncases selected at random from electoral lists • Exposure: alcohol consumption dichotomized at 80 g/day • The OR estimates the relative risk associated with exposure
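The cross-product ratio is OR = (a × d) / (b × c), where a and b are exposed cases and controls and c and d are unexposed cases and controls. The counts below are assumptions: the transcript does not show the table, and these are the values commonly quoted for this esophageal cancer dataset. The confidence interval uses Woolf's log-based method, which is a choice made for this sketch and not necessarily the method used on the slides.

```python
import math

# ASSUMED 2-by-2 counts (not shown in the transcript).
a, b = 96, 109    # exposed (80+ g/day): cases, controls
c, d = 104, 666   # unexposed: cases, controls

# Cross-product (odds) ratio.
odds_ratio = (a * d) / (b * c)

# Woolf's 95% CI: exponentiate ln(OR) +/- 1.96 * SE(ln OR).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Because the outcome is rare, the OR from a design like this can be read as an estimate of the relative risk.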
WinPEPI’s Mid-P interval similar to ours WinPEPI > Compare2 > A. Data entry Output
Ordinal Exposure Break data up into multiple tables, using the least exposed level as baseline each time
Dose-response Ordinal Exposure
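Breaking an ordinal exposure into separate 2-by-2 tables amounts to computing one odds ratio per exposure level against the least exposed level. The sketch below does that. The four alcohol categories and their counts are assumptions (the transcript does not show them); they are values commonly quoted for this dataset, and the category labels are illustrative.

```python
# ASSUMED (cases, controls) by daily alcohol level, least exposed first.
levels = {
    "0-39 g": (29, 386),
    "40-79 g": (75, 280),
    "80-119 g": (51, 87),
    "120+ g": (45, 22),
}

names = list(levels)
base_cases, base_controls = levels[names[0]]  # least exposed = baseline

odds_ratios = []
for name in names:
    cases, controls = levels[name]
    # 2-by-2 table of this level vs. baseline; cross-product ratio.
    odds_ratios.append((cases * base_controls) / (controls * base_cases))

for name, o in zip(names, odds_ratios):
    print(f"{name}: OR = {o:.2f}")
```

Odds ratios that rise steadily with exposure level are what the slide calls a dose-response relationship.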
18.6 Matched Pairs • Cohort matched pairs: each exposed individual uniquely matched to non-exposed individual • Case-control matched pairs: each case uniquely matched to a control • Controls for matching (confounding) factor • Requires special matched-pair analysis
Matched-Pairs Case-Control Example Cases = colon polyps; Controls = no polyps Exposure = low fruit & veg consumption 88% higher risk with low fruit/veg consumption
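In a matched-pair analysis only the discordant pairs carry information: the odds ratio is the number of pairs in which the case alone was exposed divided by the number in which the control alone was exposed. The pair counts are not given in this transcript, so the ones below are hypothetical, chosen so that the ratio comes out near the 88% excess quoted above.

```python
# HYPOTHETICAL discordant-pair counts (not from the transcript).
case_exposed_only = 45      # case exposed, matched control unexposed
control_exposed_only = 24   # case unexposed, matched control exposed

# Concordant pairs (both exposed or both unexposed) drop out of the estimate.
odds_ratio = case_exposed_only / control_exposed_only
percent_higher = (odds_ratio - 1) * 100

print(f"matched-pair OR = {odds_ratio:.3f}")
print(f"about {percent_higher:.1f}% higher odds with exposure")
```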
WinPEPI > PairEtc > A. Input Output
Hypothesis Test, Matched Pairs A. H0: OR = 1 B. McNemar’s test (z or chi-square) C. P-value from z stat Avoid if fewer than 5 discordant pairs are expected
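McNemar's test uses only the two discordant counts, b and c: z = (b − c) / √(b + c), and the chi-square form is z² with 1 df. The sketch below is uncorrected (no continuity correction) and uses hypothetical counts. It reads the small-sample warning as "expected count of each discordant type, (b + c) / 2, is below 5", which is an interpretation rather than something the slide spells out.

```python
import math


def mcnemar(b, c):
    """Uncorrected McNemar test from discordant-pair counts b and c.

    Returns (z, chi_square, two_sided_p).
    """
    if (b + c) / 2 < 5:
        raise ValueError("too few discordant pairs for the z/chi-square test")
    z = (b - c) / math.sqrt(b + c)
    chi_square = z ** 2
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, chi_square, p_value


# HYPOTHETICAL discordant-pair counts.
z, chi_square, p_value = mcnemar(45, 24)
print(f"z = {z:.3f}, chi-square = {chi_square:.3f}, P = {p_value:.4f}")
```

With too few discordant pairs, an exact binomial procedure is the usual alternative; that is outside the scope of this sketch.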