
TESTING A TEST


Presentation Transcript


  1. TESTING A TEST Ian McDowell Department of Epidemiology & Community Medicine November, 2004

  2. A Lab Report (Montfort Hospital Biochem Lab)

  3. The Challenge of Clinical Measurement
  • Diagnoses are based on information, whether from formal measurements or from your clinical judgment
  • This information is seldom perfectly accurate:
    • Random errors can occur
    • Biases in judgment or measurement can occur
    • Due to biological variability, this patient may not fit the general rule
  • Diagnosis (e.g., hypertension) involves a categorical judgment; this often requires dividing a continuous score (blood pressure) into categories. Choosing the cutting-point may be arbitrary

  4. Therefore… You need to be aware:
  • That diagnosis is a matter of probabilities
  • That using a quantitative approach is better than just guessing!
  • That you will ultimately become familiar with the typical accuracy of measurements in your chosen clinical field
  • Of some of the ways to describe the accuracy of a measurement
  • That the principles apply to both diagnostic and screening tests

  5. Attributes of Tests or Measures
  • Cost, safety, acceptability, etc.
  • Reliability: reproducibility; this considers chance or random errors
  • Validity: does it measure what it is supposed to measure? By extension, what diagnostic conclusion can I draw from a particular score on the test? Validity may be affected by bias, or systematic errors

  6. Reliability and Validity
  [Figure: scatters of scores illustrating the four combinations of low/high reliability and low/high validity]

  7. Ways of Assessing Validity
  • Face and content validity: does it make clinical or biological sense? Does it include the relevant symptoms?
  • Criterion validity: comparison to a “gold standard” definitive measure, expressed as sensitivity and specificity
  • Construct validity (this is used with abstract themes, such as “quality of life”, for which there is no definitive standard)

  8. “Gold Standards”
  Sensitivity and specificity are judged against:
  • More definitive (but expensive or invasive) tests, such as a complete work-up, or against
  • Eventual outcome (for screening tests, when work-up of well patients is unethical)

  9. 2 x 2 Table for Testing a Test

                     Gold standard
                     Disease present   Disease absent
  Positive test      a (TP)            b (FP)
  Negative test      c (FN)            d (TN)

  Validity: Sensitivity = a/(a+c); Specificity = d/(b+d)
  (TP = true positive; FP = false positive; FN = false negative; TN = true negative)
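
  A minimal sketch of these two formulas in Python; the counts below are not new data, they echo the referral-hospital example in slide 17:

```python
# Sketch: validity indices from a 2 x 2 table.
# a = true positives, b = false positives, c = false negatives, d = true negatives.

def sensitivity(a: int, c: int) -> float:
    """Proportion of diseased patients the test detects: a / (a + c)."""
    return a / (a + c)

def specificity(b: int, d: int) -> float:
    """Proportion of non-diseased patients the test clears: d / (b + d)."""
    return d / (b + d)

a, b, c, d = 50, 10, 5, 100  # counts from slide 17, scenario A
print(f"Sensitivity = {sensitivity(a, c):.0%}")  # 91%
print(f"Specificity = {specificity(b, d):.0%}")  # 91%
```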

  10. A Bit More on Sensitivity
  Sensitivity = the ability to detect disease when it is present
  • a/(a+c) = TP/(TP+FN)
  • Mnemonic: a seNsitive person is one who can detect your feelings
  • (1 − seNsitivity) = false Negative rate (i.e., how many cases does the screening test miss?)
  • Cf. the power of a statistical test (1 − β)

  11. …and More on Specificity
  Specificity = the ability to detect the absence of disease when it is truly absent (can it detect non-disease?)
  • d/(b+d) = TN/(FP+TN)
  • Mnemonic: a sPecific test would identify only that type of disease; “nothing else looks like this”
  • (1 − sPecificity) = false Positive rate (how many are falsely classified as having the disease?)

  12. Clinical Applications
  • A specific test can be useful to rule a disease in: if the result on a specific test is positive, you can be sure the patient has the condition (“SpPin”)
  • A sensitive test can be useful for ruling a disease out: a negative result on a very sensitive test reassures you that the patient does not have the disease (“SnNout”)

  13. The Selection of a Cutting Point
  [Figure: overlapping score distributions for the well population (healthy scores) and the sick population (pathological scores); moving the cut-point toward the healthy end increases sensitivity, moving it toward the pathological end increases specificity]
  Crucial issue: changing the cut-point can improve sensitivity or specificity, but only at the expense of the other.

  14. Problems with Wrong Results
  • False positives can arise from other factors (such as other medications, diet, etc.); they entail the cost and danger of investigations, labeling, and worry
  • This is similar to Type I (alpha) error in a test of statistical significance: the possibility of falsely concluding that an intervention has an effect
  • False negatives imply missed cases, and so potentially bad outcomes if left untreated
  • Cf. Type II (beta) error: the chance of missing a true difference

  15. The Crucial Point: Predictive Values
  • Sensitivity and specificity are characteristics of the test
  • But the clinician, of course, gets only the test result and does not know whether this patient is a true positive or a false positive (or a true or false negative). Hmmm…
  • How do we assess the predictive value of a positive or negative result?

  16. Predictive Values

            D +     D −
  T +       a       b
  T −       c       d

  • Based on rows, not columns
  • PPV = a/(a+b); interprets a positive test
  • NPV = d/(c+d); interprets a negative test
  • Immediately useful to the clinician: they tell us about the population and thus the patient
  • Depend upon the prevalence of disease, so must be determined for each clinical setting
  • As prevalence goes down, PPV goes down and NPV rises

  17. Same Test, Two Clinical Situations

  A. Referral hospital: Prevalence = 55/165 = 33%

            D +     D −
  T +       50      10
  T −       5       100

  Sensitivity = 50/55 = 91%; Specificity = 100/110 = 91%
  PPV = 50/60 = 83%; NPV = 100/105 = 95%

  B. Primary care: Prevalence = 55/1155 ≈ 5%

            D +     D −
  T +       50      100
  T −       5       1000

  Sensitivity = 50/55 = 91%; Specificity = 1000/1100 = 91%
  PPV = 50/150 = 33%; NPV = 1000/1005 = 99.5%
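
  The prevalence dependence is easy to verify in a short Python sketch; the counts are taken straight from the two tables above:

```python
# Sketch: same test (sensitivity = specificity = 91%) at two prevalences.
# Counts are (TP, FP, FN, TN), read from each 2 x 2 table above.

def predictive_values(tp, fp, fn, tn):
    """PPV = TP/(TP+FP); NPV = TN/(FN+TN)."""
    return tp / (tp + fp), tn / (fn + tn)

for setting, counts in [
    ("Referral hospital (prevalence 33%)", (50, 10, 5, 100)),
    ("Primary care (prevalence ~5%)", (50, 100, 5, 1000)),
]:
    ppv, npv = predictive_values(*counts)
    print(f"{setting}: PPV = {ppv:.0%}, NPV = {npv:.1%}")
# Referral hospital (prevalence 33%): PPV = 83%, NPV = 95.2%
# Primary care (prevalence ~5%): PPV = 33%, NPV = 99.5%
```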

  18. Practical Question: “Doctor, what’s my likelihood of having the disease?”
  To answer this question:
  • You need a general idea of the sensitivity and specificity of the test
  • To interpret the results, you also need to know roughly the prevalence of the condition in your practice; you can then work out the PPV and answer the patient’s question
  “Give me a break, dude… Surely there is an easier way to bring all this together?”

  19. Prevalence of Disease
  • We have seen how this influences the interpretation of a test score
  • Before you do the test, prevalence gives your best guess about the probability that the patient has the disease
  • Also known as the pretest probability of disease: (a+c)/N in the 2 x 2 table
  • Or, it can be expressed as the odds of disease: (a+c)/(b+d)

  20. Estimating predictive values for a specific setting is called “calibrating” the test. You could:
  • Apply the test and a definitive test to a consecutive series of patients (rarely feasible)
  • Calculate from Bayes’s theorem (ouch!)
  • Draw a hypothetical table (maybe?)
  • Use a nomogram (tell me how)

  21. Calibration by Hypothetical Table
  Fill the cells in the following order:

                  “Truth”
                  Disease present          Disease absent           Total   PV
  Test pos        4th (from sensitivity)   7th                      8th     10th (PPV)
  Test neg        5th                      6th (from specificity)   9th     11th (NPV)
  Total           2nd (from prevalence)    3rd                      1st
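
  The same fill order is easy to express in code. A sketch, assuming a hypothetical round total of 1,000 patients (the choice of N is arbitrary; the predictive values do not depend on it):

```python
# Sketch: build the hypothetical table from prevalence, sensitivity and
# specificity, following the fill order above. N = 1000 is an assumption.

def calibrate(prevalence, sens, spec, n=1000):
    diseased = prevalence * n        # 2nd: column total, from prevalence
    non_diseased = n - diseased      # 3rd: the remaining column total
    tp = sens * diseased             # 4th: from sensitivity
    fn = diseased - tp               # 5th
    tn = spec * non_diseased         # 6th: from specificity
    fp = non_diseased - tn           # 7th
    ppv = tp / (tp + fp)             # 8th then 10th: positive row total, PPV
    npv = tn / (fn + tn)             # 9th then 11th: negative row total, NPV
    return ppv, npv

# The same test as slide 17, calibrated to a 5% prevalence setting:
ppv, npv = calibrate(prevalence=0.05, sens=0.91, spec=0.91)
print(f"PPV = {ppv:.0%}, NPV = {npv:.1%}")  # PPV = 35%, NPV = 99.5%
```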

  22. Combining Sensitivity and Specificity: Receiver Operating Characteristic Curves
  Work out sensitivity and specificity at every possible cut-point, then plot these. The area under the curve indicates the information provided by the test.
  [Figure: ROC curve plotting sensitivity (y-axis, 0 to 1) against 1 − specificity, the false positive rate (x-axis, 0 to 1)]
  Note: the theme of sensitivity & (1 − specificity) will appear again!
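
  A minimal sketch of how an ROC curve is traced, using made-up scores for well and sick patients (purely illustrative, not from the lecture) and a trapezoidal estimate of the area under the curve:

```python
# Sketch: sweep the cut-point over hypothetical test scores, record
# (1 - specificity, sensitivity) at each cut, and integrate for the AUC.

well = [3, 4, 5, 5, 6, 7, 8]     # hypothetical scores, well patients
sick = [6, 7, 8, 8, 9, 10, 11]   # hypothetical scores, sick patients

points = []
for cut in sorted(set(well + sick + [0, 99])):      # include extreme cuts
    sens = sum(s >= cut for s in sick) / len(sick)  # true positive rate
    fpr = sum(s >= cut for s in well) / len(well)   # 1 - specificity
    points.append((fpr, sens))
points.sort()  # order by false positive rate before integrating

auc = sum((x2 - x1) * (y1 + y2) / 2                 # trapezoidal rule
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(f"AUC = {auc:.2f}")  # about 0.90 for these made-up scores
```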

  23. Likelihood Ratios
  • Defined as the odds that a given level of a diagnostic test result would be expected in a patient with the disease, as opposed to a patient without: true positives / false positives
  • Advantages:
    • Express sensitivity and specificity in one number
    • Can be calculated for many levels of the test
    • Can be turned into predictive values
  • LR for a positive test = sensitivity / (1 − specificity)
  • LR for a negative test = (1 − sensitivity) / specificity

  24. Calibration with a Nomogram
  1) You need the LR.
  2) Select the pretest probability (prevalence) on the left axis.
  3) Select the likelihood ratio on the center axis.
  4) Draw a line through the right axis to read the post-test probability of disease.
  Example: Prevalence = 30%; LR+ = 20; post-test probability = 91%
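
  The nomogram is a graphical shortcut for the odds form of Bayes’s theorem: post-test odds = pretest odds × LR. A minimal sketch of the same arithmetic:

```python
# Sketch: the calculation behind the nomogram.
# odds = p / (1 - p); probability = odds / (1 + odds).

def post_test_probability(pretest_prob, lr):
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)

# The slide's example: prevalence 30%, LR+ = 20.
print(f"{post_test_probability(0.30, 20):.0%}")  # 90% (the nomogram reads ~91%)
```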

  25. Chaining LRs Together
  • Example: a 45-year-old woman with a 1-month history of intermittent chest pain
  • Pretest probability of CAD is about 1%
  • The history is suggestive of angina (substernal pain; radiating down the arm; induced by effort; relieved by rest…)
  • The LR of this history for angina is about 100

  26. The Previous Example: 1. From the History
  [Nomogram: she’s young, so the pretest probability is about 1%; based on the history, the probability rises to about 50%]

  27. Chaining LRs Together
  • The 45-year-old woman with a 1-month history of intermittent chest pain… After the history, the post-test probability is now about 50%. What will you do? Record an ECG.
  • Results: 2.2 mm ST-segment depression. The LR for a 2.2 mm ECG result is about 10.
  • The overall post-test probability is now >90% for coronary artery disease (see next slide)

  28. The Previous Example: 2. ECG Results
  [Nomogram: now start the pretest probability (i.e., prior to the ECG) at 50%, based on the history; the post-test probability rises to about 90%]
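
  Chaining is just the same odds update applied twice, with each post-test probability becoming the next step’s pretest probability. A sketch of the whole example:

```python
# Sketch: chain likelihood ratios via pretest/post-test odds.

def update(prob, lr):
    odds = prob / (1 - prob) * lr     # post-test odds = pretest odds x LR
    return odds / (1 + odds)          # back to a probability

p = 0.01                # pretest probability of CAD: about 1%
p = update(p, 100)      # history of typical angina, LR ~ 100
print(f"After history: {p:.0%}")  # about 50%
p = update(p, 10)       # 2.2 mm ST-segment depression, LR ~ 10
print(f"After ECG: {p:.0%}")      # about 91%
```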
