Bias, Confounding and Fallacies in Epidemiology M. Tevfik DORAK http://www.dorak.info/epi
BIAS Definition Types Examples Remedies CONFOUNDING Definition Examples Remedies FALLACIES Definition (Effect Modification)
What is Bias? Bias is one of the three major threats to internal validity: Bias Confounding Random error / chance
What is Bias? Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001) A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988) Systematic error in design or conduct of a study (Szklo et al, 2000)
Bias is systematic error • Errors can be differential (systematic) or non-differential (random) • Random error: use of an invalid outcome measure that equally misclassifies cases and controls • Differential error: use of an invalid measure that misclassifies cases in one direction and misclassifies controls in another • Term 'bias' should be reserved for differential or systematic error
Random Error [Figure: per cent by size of induration (mm). Source: WHO (www)]
Systematic Error [Figure: per cent by size of induration (mm). Source: WHO (www)]
Chance vs Bias Chance is caused by random error Bias is caused by systematic error Errors from chance will cancel each other out in the long run (large sample size) Errors from bias will not cancel each other out whatever the sample size Chance leads to imprecise results Bias leads to inaccurate results
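The contrast between chance and bias can be illustrated with a small simulation. This is only a sketch under assumed values: the true value (10.0), the noise level (2.0) and the constant bias (1.5) are hypothetical numbers chosen for illustration, and `simulate_mean` is a made-up helper, not part of the original slides.

```python
import random
import statistics

def simulate_mean(n, true_value=10.0, noise_sd=2.0, bias=0.0, seed=0):
    """Mean of n measurements with random noise and an optional constant bias."""
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)

TRUE_VALUE = 10.0  # hypothetical true value

for n in (10, 1_000, 100_000):
    # Random error only: the estimate drifts towards the truth as n grows.
    unbiased = simulate_mean(n, TRUE_VALUE, bias=0.0, seed=1)
    # Systematic error: the same noise plus a constant shift of 1.5.
    biased = simulate_mean(n, TRUE_VALUE, bias=1.5, seed=1)
    print(f"n={n:>6}  random error only: {unbiased - TRUE_VALUE:+.3f}  "
          f"with bias: {biased - TRUE_VALUE:+.3f}")
```

With a larger sample the error from chance shrinks towards zero, while the error from bias stays near the size of the assumed shift whatever the sample size.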
Types of Bias • Selection bias • Unrepresentative nature of sample • Information (misclassification) bias • Errors in measurement of exposure or disease • Confounding bias • Distortion of exposure - disease relation by some other factor • Types of bias not mutually exclusive • (effect modification is not bias) • This classification is by Miettinen OS in 1970s • See for example Miettinen & Cook, 1981 (www)
Selection Bias • Selective differences between comparison groups that affect the relationship between exposure and outcome • Usually results from comparative groups not coming from the same study base and not being representative of the populations they come from
Selection Bias Examples (www)
Selection Bias Examples Selective survival (Neyman's) bias (www)
Selection Bias Examples • Case-control study: • Controls have less potential for exposure than cases • Outcome = brain tumour; exposure = overhead high voltage power lines • Cases chosen from province wide cancer registry • Controls chosen from rural areas • Systematic differences between cases and controls
Case-Control Studies: Potential Bias Schulz & Grimes, 2002 (www) (PDF)
Selection Bias Examples • Cohort study: • Differential loss to follow-up • Especially problematic in cohort studies • Subjects in follow-up study of multiple sclerosis may differentially drop out due to disease severity • Differential attrition selection bias
Selection Bias Examples Self-selection bias: - You want to determine the prevalence of HIV infection - You ask for volunteers for testing - You find no HIV - Is it correct to conclude that there is no HIV in this location?
Selection Bias Examples • Healthy worker effect: • Another form of self-selection bias • “self-screening” process – people who are unhealthy “screen” themselves out of active worker population • Example: • - Course of recovery from low back injuries in 25-45 year olds • - Data captured on worker’s compensation records • - But prior to identifying subjects for study, self-selection has already taken place
Selection Bias Examples • Diagnostic or workup bias: • Also occurs before subjects are identified for study • Diagnoses (case selection) may be influenced by physician’s knowledge of exposure • Example: • - Case control study – outcome is pulmonary disease, exposure is smoking • - Radiologist aware of patient’s smoking status when reading x-ray – may look more carefully for abnormalities on x-ray and differentially select cases • Legitimate for clinical decisions, inconvenient for research
Types of Bias • Selection bias • Unrepresentative nature of sample • ** Information (misclassification) bias ** • Errors in measurement of exposure or disease • Confounding bias • Distortion of exposure - disease relation by some other factor • Types of bias not mutually exclusive • (effect modification is not bias)
Information / Measurement / Misclassification Bias Method of gathering information is inappropriate and yields systematic errors in measurement of exposures or outcomes If misclassification of exposure (or disease) is unrelated to disease (or exposure) then the misclassification is non-differential If misclassification of exposure (or disease) is related to disease (or exposure) then the misclassification is differential Distorts the true strength of association
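The difference between non-differential and differential misclassification can be worked through on expected counts. This is a minimal sketch: the 2x2 counts and the sensitivity and specificity values are hypothetical, and `misclassify` is an illustrative helper rather than a standard function.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a=exposed cases, b=unexposed cases,
    c=exposed controls, d=unexposed controls."""
    return (a * d) / (b * c)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after imperfect
    measurement of a binary exposure."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed

# Hypothetical true exposure distribution.
cases = (60, 40)      # (exposed, unexposed)
controls = (30, 70)
true_or = odds_ratio(cases[0], cases[1], controls[0], controls[1])

# Non-differential: same error rates in cases and controls.
nd_cases = misclassify(*cases, sensitivity=0.8, specificity=0.9)
nd_controls = misclassify(*controls, sensitivity=0.8, specificity=0.9)
nondiff_or = odds_ratio(*nd_cases, *nd_controls)

# Differential (e.g. recall bias): cases recall exposure better than controls.
d_cases = misclassify(*cases, sensitivity=0.95, specificity=0.9)
d_controls = misclassify(*controls, sensitivity=0.7, specificity=0.9)
diff_or = odds_ratio(*d_cases, *d_controls)

print(f"true OR {true_or:.2f}, non-differential {nondiff_or:.2f}, "
      f"differential {diff_or:.2f}")
```

In this example the non-differential error pulls the odds ratio towards 1, while the differential error pushes it above the true value. With other error patterns, differential misclassification can distort the association in either direction.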
Information / Measurement / Misclassification Bias Sources of information bias: Subject variation Observer variation Deficiency of tools Technical errors in measurement
Information / Measurement / Misclassification Bias • Recall bias: • Those exposed have a greater sensitivity for recalling exposure (reduced specificity) • - specifically important in case-control studies • - when exposure history is obtained retrospectively • cases may more closely scrutinize their past history looking for ways to explain their illness • - controls, not feeling a burden of disease, may less closely examine their past history • Those who develop a cold are more likely to identify the exposure than those who do not – differential misclassification • - Case: Yes, I was sneezed on • - Control: No, can’t remember any sneezing
Information / Measurement / Misclassification Bias Reporting bias: Individuals with severe disease tend to have complete records, and therefore more complete information about exposures, so a greater association is found Individuals who are aware of being participants of a study behave differently (Hawthorne effect)
Controlling for Information Bias • - Blinding • prevents investigators and interviewers from knowing case/control or exposed/non-exposed status of a given participant • - Form of survey • mail may impose less “white coat tension” than a phone or face-to-face interview • - Questionnaire • use multiple questions that ask same information • acts as a built in double-check • - Accuracy • multiple checks in medical records • gathering diagnosis data from multiple sources
Types of Bias • Selection bias • Unrepresentative nature of sample • Information (misclassification) bias • Errors in measurement of exposure or disease • ** Confounding bias ** • Distortion of exposure - disease relation by some other factor • Types of bias not mutually exclusive • (effect modification is not bias)
Cases of Down Syndrome by Birth Order EPIET (www)
Cases of Down Syndrome by Age Groups EPIET (www)
Cases of Down Syndrome by Birth Order and Maternal Age EPIET (www)
Confounding • A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two • Confounder not a result of the exposure • e.g., association between child’s birth rank (exposure) and Down syndrome (outcome); mother’s age a confounder? • e.g., association between mother’s age (exposure) and Down syndrome (outcome); birth rank a confounder?
Confounding To be a confounding factor, a third variable must meet two conditions: • Be associated with exposure - without being the consequence of exposure • Be associated with outcome - independently of exposure (not an intermediary) [Diagram: Exposure, Outcome, Third variable]
Confounding Birth Order Down Syndrome Maternal Age Maternal age is correlated with birth order and a risk factor even if birth order is low
Confounding ? Down Syndrome Maternal Age Birth Order Birth order is correlated with maternal age but not a risk factor in younger mothers
Confounding Coffee CHD Smoking Smoking is correlated with coffee drinking and a risk factor even for those who do not drink coffee
Confounding ? Smoking CHD Coffee Coffee drinking may be correlated with smoking but is not a risk factor in non-smokers
Confounding Alcohol Lung Cancer Smoking Smoking is correlated with alcohol consumption and a risk factor even for those who do not drink alcohol
Confounding ? Smoking CHD Yellow fingers Not related to the outcome Not an independent risk factor
Confounding ? Diet CHD Cholesterol On the causal pathway
Confounding Imagine you have tried to repeat, in another sample, a positive finding of a birth order association with Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not why? Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association? Imagine you have stratified your dataset for smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata? Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?
Confounding Imagine you have tried to repeat, in another sample, a positive finding of a birth order association with Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not why? You would not necessarily be able to replicate the original finding because it was a spurious association due to confounding. In another sample where all mothers are below 30 yr, there would be no association with birth order. In another sample in which there are few smokers, the coffee association with CHD would not be replicated.
Confounding Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association? No, because the first study was confounded. The association with alcohol was actually due to smoking. By restricting the study to non-smokers, we have found the truth. Restriction is one way of preventing confounding at the time of study design.
Confounding Imagine you have stratified your dataset for smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata? The alcohol association would yield similar odds ratios in both strata, and both would be close to unity. In confounding, the stratum-specific odds ratios should be similar to each other and different from the crude odds ratio by at least 15%. Stratification is one way of identifying confounding at the time of analysis. If the stratum-specific odds ratios are different, then this is not confounding but effect modification.
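The stratification check can be sketched on a made-up dataset. All counts below are hypothetical, chosen so that alcohol has no effect within either smoking stratum. The Mantel-Haenszel pooled odds ratio is used here as one common way to summarise the strata; the slides do not name a specific estimator.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a=exposed cases, b=unexposed cases,
    c=exposed controls, d=unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR over an iterable of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical alcohol - lung cancer data, stratified by smoking status.
# Each table: (drinker cases, non-drinker cases,
#              drinker controls, non-drinker controls)
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (10, 40, 20, 80),
}

# The crude table ignores smoking: add the strata cell by cell.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
mh_or = mantel_haenszel_or(strata.values())

print(f"crude OR: {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"  {name}: OR {value:.2f}")
print(f"Mantel-Haenszel OR: {mh_or:.2f}")
```

In these made-up data the stratum-specific odds ratios agree with each other and differ from the crude odds ratio, which is the pattern expected under confounding. Clearly different stratum-specific odds ratios would instead point to effect modification.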
Confounding Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association? If smoking is included in the statistical model, the alcohol association would lose its statistical significance. Adjustment by multivariable modelling is another method to identify confounders at the time of data analysis.
Confounding For confounding to occur, the confounders should be differentially represented in the comparison groups. Randomisation is an attempt to evenly distribute potential (unknown) confounders in study groups. It does not guarantee control of confounding. Matching is another way of achieving the same. It ensures equal representation of subjects with known confounders in study groups. It has to be coupled with matched analysis. Restriction for potential confounders in design also prevents confounding but causes loss of statistical power (instead stratified analysis may be tried).
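How randomisation spreads a potential confounder across study groups can be illustrated with a short simulation. This is a sketch only: the 30% smoking prevalence and the self-selection probabilities are assumptions invented for the example, and `smoker_share_by_arm` is a hypothetical helper.

```python
import random

def smoker_share_by_arm(n, randomised, seed=0):
    """Proportion of smokers in arms A and B after allocating n subjects."""
    rng = random.Random(seed)
    counts = {"A": [0, 0], "B": [0, 0]}  # arm -> [smokers, total]
    for _ in range(n):
        smoker = rng.random() < 0.3  # assumed smoking prevalence
        if randomised:
            p_arm_a = 0.5  # allocation ignores smoking status
        else:
            # Assumed self-selection: smokers prefer arm A.
            p_arm_a = 0.8 if smoker else 0.4
        arm = "A" if rng.random() < p_arm_a else "B"
        counts[arm][0] += smoker
        counts[arm][1] += 1
    return {arm: (s / t if t else float("nan"))
            for arm, (s, t) in counts.items()}

for n in (20, 100_000):
    rand = smoker_share_by_arm(n, randomised=True, seed=1)
    self_sel = smoker_share_by_arm(n, randomised=False, seed=1)
    print(f"n={n:>6}  randomised: A={rand['A']:.2f} B={rand['B']:.2f}  "
          f"self-selected: A={self_sel['A']:.2f} B={self_sel['B']:.2f}")
```

In a large sample random allocation gives the two arms nearly the same share of smokers, whereas self-selection leaves them imbalanced. In a small sample the randomised arms can still differ by chance, which is why randomisation does not guarantee control of confounding.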
Confounding Randomisation, matching and restriction can be tried at the time of designing a study to reduce the risk of confounding. At the time of analysis: Stratification and multivariable (adjusted) analysis can achieve the same. It is preferable to try something at the time of designing the study.
Effect of randomisation on outcome of trials in acute pain Bandolier Bias Guide (www)
Confounding Obesity Mastitis Age In cows, older ones are heavier and older age increases the risk for mastitis. This association may appear as an obesity association