

  1. EPI 5240: Introduction to Epidemiology. Screening and diagnostic test evaluation. November 2, 2009. Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

  2. Session Overview • Review key features of tests for disease. • Diagnostic test evaluation • Study designs • Key biases • Screening programmes • Overview • Criteria for utility • Issues in evaluation and implementation • Regression to the mean

  3. Scenario (1) A 54-year-old female teacher visited her FP for an 'annual checkup'. She reported no illnesses in the previous year, felt well and had no complaints. Hot flashes related to menopause had resolved. A detailed physical examination, including breast palpation, was unremarkable. A screening mammogram was recommended as per current guidelines.

  4. Scenario (2) The mammogram results were 'not normal' and a follow-up breast biopsy was recommended. The surgeon confirmed the negative clinical exam but, based on the abnormal mammogram, recommended a fine-needle aspiration (FNA) biopsy of the abnormal breast under radiological guidance. Pathological review of the biopsy revealed the presence of a malignant breast tumor. Further surgery was scheduled to pursue this abnormal finding.

  5. Why not here? [figure not reproduced in transcript]

  6. FNA positive risk • 100% vs 64%: depends on the definition of a 'positive' FNA. • If a 'positive' must show clear carcinoma → 100% of positives are true cancers (0% false positives). • If a 'positive' includes abnormal cells that may not be cancer → 64% of positives are true cancers (36% false positives). • Why use the second approach? It reduces the risk that you will miss someone who has a true cancer. • This is the tradeoff of sensitivity and specificity (more later).

  7. Test Properties (1) • Most common situation (for teaching at least) assumes: • Dichotomous outcome (ill/not ill) • Dichotomous test results (positive/negative) • Represented as a 2x2 table (yet another variant!). • Advanced methods can consider tests with multiple outcomes • advanced; moderate; minimal; no disease

  8. Test Properties (2)

                        Disease present         Disease absent
     Test positive      True positives (a)      False positives (b)
     Test negative      False negatives (c)     True negatives (d)

  9. Test Properties (4) [worked 2×2 example; table not reproduced in transcript] Sensitivity = 0.90, Specificity = 0.95

  10. Test Properties (5) Sensitivity = a / (a + c); Specificity = d / (b + d)

  11. Test Properties (6) • Sensitivity = Pr(test positive in a person with disease) • Specificity = Pr(test negative in a person without disease) • Range: 0 to 1 • > 0.9: Excellent • 0.8-0.9: Not bad • 0.7-0.8: So-so • < 0.7: Poor
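
To make the definitions concrete, here is a minimal Python sketch (not part of the original slides; the cell counts follow the a/b/c/d layout of the 2×2 table on slide 8):

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity from 2x2 cell counts (a=TP, b=FP, c=FN, d=TN)."""
    sensitivity = tp / (tp + fn)   # a / (a + c): Pr(test positive | disease present)
    specificity = tn / (fp + tn)   # d / (b + d): Pr(test negative | disease absent)
    return sensitivity, specificity

# Cell counts from the work-up bias example on slide 34:
print(sensitivity_specificity(150, 50, 100, 900))  # (0.60, ~0.95)
```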

  12. Test Properties (7) • Generally, high sensitivity associated with low specificity and vice-versa (more later). • Do you want a test with high sensitivity or specificity? • Depends on cost of ‘false positive’ and ‘false negative’ cases. • PKU – one false negative is a disaster. • Ottawa Ankle Rules

  13. Test Properties (8) • Patients don't ask: 'If I've got the disease, how likely is it that the test will be positive?' • They ask: 'My test is positive. Does that mean I have the disease?' • That question is answered by predictive values.

  14. Test Properties (9) [worked 2×2 example; table not reproduced in transcript] PPV = 0.95, NPV = 0.90

  15. Test Properties (10) PPV = a / (a + b); NPV = d / (c + d)

  16. Test Properties (11) • PPV = Pr(subject has the disease given that their test was positive) • NPV = Pr(subject doesn't have the disease given that their test was negative) • Range: 0 to 1 • PPV is affected by the prevalence of the disease in the target population; sensitivity and specificity are not. • To use a test in a new population, you need to 'calibrate' the PPV/NPV. • Example: sens = 0.85; spec = 0.90

  17. Test Properties (12) Tertiary care research study: prevalence = 0.5 → PPV = 0.89 (using sens = 0.85, spec = 0.90 from the previous slide)

  18. Test Properties (13) Calibration by hypothetical table. Fill the cells in the following order:

                        Disease present    Disease absent    Total            PV
     Test positive          4th                7th            8th             10th
     Test negative          5th                6th            9th             11th
     Total                  2nd                3rd            1st (10,000)

  19. Test Properties (14) Primary care: prevalence = 0.01

                        Disease present        Disease absent     Total
     Test positive           85                     990            1,075
     Test negative           15                   8,910            8,925
     Total                  100 (0.01 × 10,000)  9,900           10,000

     (85 = 0.85 × 100; 8,910 = 0.9 × 9,900)    PPV = 85 / 1,075 = 0.08
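
A minimal Python sketch of the hypothetical-table method (not from the original slides; the function name calibrate_ppv_npv is illustrative). It reproduces both the tertiary care PPV of 0.89 (slide 17) and the primary care PPV of 0.08 above:

```python
def calibrate_ppv_npv(sens, spec, prevalence, n=10_000):
    """Recalculate PPV/NPV for a new prevalence by filling a
    hypothetical 2x2 table with a fixed total of n subjects."""
    diseased = prevalence * n     # 2nd cell: disease-present column total
    healthy = n - diseased        # 3rd cell: disease-absent column total
    tp = sens * diseased          # 4th cell: test positive, disease present
    fn = diseased - tp            # 5th cell: test negative, disease present
    tn = spec * healthy           # 6th cell: test negative, disease absent
    fp = healthy - tn             # 7th cell: test positive, disease absent
    return tp / (tp + fp), tn / (tn + fn)   # (PPV, NPV)

print(calibrate_ppv_npv(0.85, 0.90, 0.50))  # tertiary care: PPV ~0.89
print(calibrate_ppv_npv(0.85, 0.90, 0.01))  # primary care:  PPV ~0.08
```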

  20. Test Properties (15) Likelihood Ratio

     LR(+ve) = post-test odds / pre-test odds

  21. Test Properties (16) Likelihood ratio, using the earlier example (sens = 0.90, spec = 0.95, prevalence = 0.5): Post-test odds = 18.0; Pre-test odds = 1.00; LR(+ve) = 18.0 / 1.0 = 18.0

  22. Test Properties (17) • LR(+ve) gives the amount by which the odds of disease increase if the test is positive. • Big values are good. Need at least 8-10 to have an acceptable test.

     LR(+ve) = [a × (b + d)] / [(a + c) × b] = sensitivity / (1 − specificity)

     • LR(+ve) is not affected by disease prevalence. • Can be used to adjust PPV/NPV for differences in prevalence.

  23. Test Properties (18) • Adjusting PPV/NPV using LR(+ve) • Compute LR(+ve) from your test sample (LR_test) • Convert the new disease prevalence into odds (pre-test odds): pre-test odds = p / (1 − p) • Multiply the pre-test odds by LR_test to give the post-test odds • Convert the post-test odds to PPV: PPV = post-test odds / (1 + post-test odds) • A code sketch follows the worked example on the next slide.

  24. Test Properties (19)PPV via LR(+ve) • Previous example • Prevalence = 1%; sens = 85%; spec = 90% • Pretest odds = .01/.99 = 0.0101 • LR+ = .85/.1 = 8.5 (>1, but not that great) • Post-test odds (+ve) = .0101*8.5 = .0859 • PPV = .0859/1.0859 = 0.079 = 7.9% • Compare to the ‘hypothetical table’ method (PPV=8%)
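
The same adjustment, coded as the four steps of slide 23 (a sketch, not from the original slides; ppv_via_lr is an illustrative name). It reproduces the PPV of 7.9% above:

```python
def ppv_via_lr(sens, spec, prevalence):
    """PPV for a new prevalence via the positive likelihood ratio."""
    lr_pos = sens / (1 - spec)                    # LR(+ve) from the test sample
    pretest_odds = prevalence / (1 - prevalence)  # p / (1 - p)
    posttest_odds = pretest_odds * lr_pos         # odds after a positive test
    return posttest_odds / (1 + posttest_odds)    # odds back to probability

print(ppv_via_lr(0.85, 0.90, 0.01))  # ~0.079, matching the hypothetical table
```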

  25. Test Properties (20) • Most tests give continuous readings • Serum hemoglobin • PSA • X-rays • How do you determine the 'cut-point' for normal vs diseased (negative vs positive)? • ↑ sensitivity → ↓ specificity • Receiver Operating Characteristic (ROC) curves

  26. [Figure: overlapping distributions of test values in the non-diseased ('Negative') and diseased ('Positive') groups; the cut-point splits the overlap into false negative and false positive regions]

  27. [Figure: the same distributions with the cut-point shifted, trading false positives against false negatives]

  28. AUC = Area Under the Curve [Figure: ROC curve, sensitivity vs 1 − specificity]
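
A self-contained Python sketch of how an ROC curve and its AUC are computed (not from the original slides): sweep every candidate cut-point, record (1 − specificity, sensitivity), and integrate by the trapezoidal rule. The PSA-like values are invented for illustration:

```python
def roc_points(scores_diseased, scores_healthy):
    """ROC curve: one (1 - specificity, sensitivity) point per cut-point,
    where 'positive' means score >= cut."""
    cuts = sorted(set(scores_diseased) | set(scores_healthy))
    points = [(0.0, 0.0), (1.0, 1.0)]   # cut above / below all scores
    for c in cuts:
        sens = sum(s >= c for s in scores_diseased) / len(scores_diseased)
        spec = sum(s < c for s in scores_healthy) / len(scores_healthy)
        points.append((1 - spec, sens))
    return sorted(points)

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Invented PSA-like values for diseased vs healthy men:
pts = roc_points([6.1, 7.5, 8.0, 9.2], [3.0, 4.2, 5.5, 6.8])
print(auc(pts))  # ~0.94; 1.0 = perfect discrimination, 0.5 = useless test
```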

  29. Diagnostic test study issues (1) • How do you select the subjects for a study to evaluate the properties of a diagnostic test? • Most test evaluations are done in tertiary care settings → PPV/NPV issues. • Three main methods of choosing subjects: • Take 'all comers' • Select a group of people with disease and a group without disease • Select a group who are test positive and a group who are test negative.

  30. Diagnostic test study issues (2) [Figure: 2×2 table annotated with the three sampling schemes (1, 2, 3) listed on the previous slide]

  31. Diagnostic test study issues (3) • Method 1: • Inefficient – most people won’t have disease. • Method 2: • Hard to implement if test must be administered before outcome is known (e.g. a measure of reactive arterial narrowing and diagnosis of a heart attack) • Method 3: • Gives biased estimates of sensitivity/specificity (Work-up Bias)

  32. Diagnostic test study issues (4) • Spectrum Bias • It’s easy to diagnose a broken leg in a person with a compound fracture. • It’s much harder to distinguish someone with a hairline fracture from a person with a deep bruise or ligament injury. • Study must include subjects with the relevant spectrum of disease states. • Spectrum needed depends on purpose of the test.

  33. Diagnostic test study issues (5) • Work-up bias • The study selects patients based on the result of the diagnostic test (e.g. 100 test +ve and 100 test −ve). • Sens/spec will be biased. • Example: • Evaluating a new method to screen men with chest pain. It's hard to assemble men with known CHD (it can't be done in the ED alone), so you might be tempted to select men based on the results of the screening test.

  34. Work-up Bias: TRUE TEST PERFORMANCE

                        Disease present    Disease absent    Total
     Test positive          150                 50             200
     Test negative          100                900           1,000
     Total                  250                950           1,200

     Sensitivity = 150/250 = 60%   Specificity = 900/950 = 95%

     NOW, suppose we only studied 100 people with a negative test but everyone with a positive test?

  35. Work-up Bias (2): TEST PERFORMANCE FROM STUDY (all 200 test-positives plus a 10% sample of the 1,000 test-negatives)

                        Disease present     Disease absent      Total
     Test positive          150                  50               200
     Test negative           10 (0.1 × 100)      90 (0.1 × 900)   100 (0.1 × 1,000)
     Total                  160                 140               300

     Sensitivity = 150/160 = 94%, not 60%. Specificity = 90/140 = 64%, not 95%. BIAS!
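
The bias is easy to reproduce in a few lines of Python (a sketch, not from the original slides; workup_bias is an illustrative name):

```python
def workup_bias(tp, fp, fn, tn, neg_sample_frac):
    """Apparent sens/spec when every test-positive gets the gold standard
    but only a fraction of test-negatives are worked up."""
    fn_s = neg_sample_frac * fn    # diseased test-negatives actually studied
    tn_s = neg_sample_frac * tn    # healthy test-negatives actually studied
    return tp / (tp + fn_s), tn_s / (fp + tn_s)

# True performance: sens 60%, spec 95%; work up only 10% of test-negatives:
print(workup_bias(150, 50, 100, 900, 0.10))  # (~0.94, ~0.64): biased!
```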

  36. Screening (1) • Screening • The presumptive identification of an unrecognized disease or defect by the application of tests, examinations or other procedures • Can be applied to an unselected population or to a high risk group. • Examples • Pap smears (cervical cancer) • Mammography (breast cancer) • Early childhood development • PKU

  37. Screening (2) • Levels of prevention: • Primary prevention (prevent the disease from occurring at all) • Secondary prevention (detect and treat disease early, before symptoms; this is where screening fits) • Tertiary prevention (limit the impact of established disease)

  38. Screening (3) [Figure: disease natural history timeline showing the DPCP§] § Detectable Pre-Clinical Phase

  39. Screening (4) [figure not reproduced in transcript]

  40. Screening (5) Criteria to determine if a screening programme should be implemented • Disease Factors • Severity • Presence of a lengthy DPCP • Evidence that earlier treatment improves prognosis

  41. Screening (6) • Test Factors • Valid - sensitive and specific with respect to DPCP • Reliable and reproducible (omitted from most lists, but shouldn't be) • Acceptable - cf. sigmoidoscopy • Easy • Cheap • Safe

  42. Screening (7) • Test Factors (cont.) • Test must reach high-risk groups - cf. Pap smears • Sequential vs parallel tests • Sequential → higher specificity • Parallel → higher sensitivity (see the sketch after this slide) • System Factors • Follow-up provided and available to all • Treatment resources adequate
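
A sketch of why the sequential/parallel rule of thumb holds, assuming the two tests are conditionally independent given disease status (an assumption; combine_tests is an illustrative name, and the 0.85/0.90 values are borrowed from the earlier example):

```python
def combine_tests(se1, sp1, se2, sp2):
    """Combined sensitivity/specificity of two conditionally independent tests."""
    # Sequential (series): call the result positive only if BOTH tests are positive
    serial = (se1 * se2, 1 - (1 - sp1) * (1 - sp2))
    # Parallel: call the result positive if EITHER test is positive
    parallel = (1 - (1 - se1) * (1 - se2), sp1 * sp2)
    return serial, parallel

serial, parallel = combine_tests(0.85, 0.90, 0.85, 0.90)
print(serial)    # (0.7225, 0.99): sensitivity falls, specificity rises
print(parallel)  # (0.9775, 0.81): sensitivity rises, specificity falls
```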

  43. Screening (8) • Evaluation of Screening • Can it work? • Does it work in the real world? • Case-control vs. cohort vs. RCT • Are we evaluating: • Screening alone (e.g. mammography and breast cancer detection)? • Screening plus therapy (e.g. mammography and survival)?

  44. Screening (9) • Biases in interpreting evaluations of screening programmes. • Lead-time Bias • Detecting disease early gives more years of ‘illness’ but doesn’t prolong life • Length Bias • Slowly progressive cases are more likely to be detected than rapidly progressive cases
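
Length bias can be illustrated with a toy simulation (entirely a construction for illustration, not from the slides): give each case an exponentially distributed DPCP, screen every two years at a random phase, and note that the chance a screen falls inside the DPCP grows with its length, so screening preferentially catches slowly progressive cases:

```python
import random

def length_bias_demo(n_cases=100_000, screen_interval=2.0, seed=1):
    """Toy model: a screen every `screen_interval` years, at a uniformly
    random phase, detects a case only if it lands inside the case's DPCP."""
    random.seed(seed)
    all_dpcp, detected = [], []
    for _ in range(n_cases):
        dpcp = random.expovariate(1.0)   # sojourn time, mean 1 year
        all_dpcp.append(dpcp)
        # Detection probability is proportional to DPCP length (capped at 1):
        if random.random() < min(dpcp / screen_interval, 1.0):
            detected.append(dpcp)
    print("mean DPCP, all cases:      ", sum(all_dpcp) / len(all_dpcp))
    print("mean DPCP, screen-detected:", sum(detected) / len(detected))

length_bias_demo()  # screen-detected cases have a much longer mean DPCP
```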

  45. Screening (10) [figure not reproduced in transcript]

  46. Screening (11) [figure not reproduced in transcript]

  47. Screening (12) • A study proposes to evaluate a screening programme in an RCT by comparing survival (adjusted for lead-time bias) in the people who were actually screened to those who were not screened. • This will give a biased estimate of effectiveness (screening will look 'too good'): the lead-time adjustment does nothing about length bias, and comparing people by whether they were actually screened rather than by randomized assignment breaks the randomization.
