Facing Challenging Situations When Grading Strength of Evidence

Presenters: Nancy Santesso, RD, MLIS, McMaster University • Nancy Berkman, PhD, RTI International

Background • Systematic reviewers need to provide clear judgments about the evidence that underlies conclusions of the review to enable decision-makers to use them effectively. • “Strength of evidence” grading is a key indicator of a review team’s level of confidence that the studies included in the review collectively reflect the true effect of an intervention on a health outcome. • Deciding on the appropriate strength of evidence grades can be challenging because of the complexity and unique characteristics of the evidence included in the review.

Session approach and goals • Briefly review the AHRQ approach to grading the strength of evidence • Assume some prior experience in grading • Present a series of strength of evidence grading challenges • Not necessarily one “right answer” and would like session participants to share their thoughts with their neighbor and then discuss with the full group • Nancy S. will review how GRADE would approach the decision

Steps in AHRQ EPC Approach to Grading SOE • Separately for RCT and observational study evidence, aggregated across studies, for each outcome • Score 5 required domains: risk of bias (study limitations), consistency, directness, precision, and possibly publication bias • Consider, and possibly score, 3 additional domains: dose-response association, plausible confounding, strength of association • Combine the domain scores into separate SOE grades for RCTs and observational studies, then combine these into a final grade

Risk of bias domain score • Concerns adequate control for bias based on both study design and study conduct of individual studies • Assesses the aggregate risk of bias of studies separately for RCTs and observational studies • Scores: high, medium, or low • Based on design, RCTs start as low Risk of Bias and Observational studies start as higher Risk of Bias • May be adjusted based on individual study conduct

Consistency domain score • Degree of similarity in the magnitude (or direction) of effect across the different studies within the evidence base • Consistent: same direction of effect (same side of “no effect”) and narrow range of effect sizes • Inconsistent: non-overlapping confidence intervals, significant unexplained clinical or statistical heterogeneity, etc. • Unknown or not applicable: single study, so consistency cannot be assessed

Directness domain score • Whether evidence reflects a single, direct link between the intervention of interest and the ultimate health outcome under consideration • Direct: single direct link between the intervention and health outcome • Indirect: evidence relies on surrogate or proxy outcomes, or on more than one body of evidence (no head-to-head studies)

Precision domain score • Degree of certainty for estimate of effect with respect to a specific outcome • Precise: estimate allows a clinically useful decision • Imprecise: confidence interval is so wide that it could include clinically distinct (even conflicting) conclusions • Unknown: measures of dispersion not provided
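The precision scores above can be sketched as a simple check of where the confidence interval falls relative to a clinical threshold. This is only an illustration under stated assumptions: the function name, the use of a symmetric minimally important difference (`mid`), and the three decision zones are hypothetical choices, not part of the AHRQ guidance.

```python
from typing import Optional


def precision_score(ci_low: Optional[float], ci_high: Optional[float], mid: float) -> str:
    """Classify an estimate of a difference (0 = no effect) as precise or imprecise.

    mid is an assumed minimally important difference (positive number).
    """
    # Unknown: measures of dispersion were not provided.
    if ci_low is None or ci_high is None:
        return "unknown"

    def zone(x: float) -> str:
        # Assumed decision zones around "no effect".
        if x <= -mid:
            return "important decrease"
        if x >= mid:
            return "important increase"
        return "trivial"

    # If both ends of the interval lead to the same clinical conclusion,
    # the whole interval does, so the estimate supports a decision.
    return "precise" if zone(ci_low) == zone(ci_high) else "imprecise"


# Hypothetical pain differences on a 10 cm VAS, with an assumed MID of 1 cm.
print(precision_score(-2.5, -1.2, mid=1.0))  # interval lies entirely beyond the MID
print(precision_score(-2.0, 0.5, mid=1.0))   # interval spans distinct conclusions
```

A review team would still need to justify the threshold itself; the sketch only makes the comparison explicit.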

Reporting bias domain score • Publication bias: nonreporting of results • Selective outcome reporting: nonreporting of planned outcomes • Selective analysis reporting: reporting only the most favorable analyses • Scores: suspected or undetected

Additional “discretionary” domains • Dose-response association (pattern of larger effect with greater exposure): present, not present, NA • Plausible confounding (confounding that would work in the direction opposite to the observed effect, i.e. would weaken it): present, absent • Strength of association (effect so large that it cannot have occurred solely as a result of bias from confounders): strong, weak • Applicability is considered separately

Integrating domain scores into a SOE grade • EPCs can use different approaches to incorporating multiple domains into an overall strength of evidence grade • Important that it is consistent within the review and transparent • Evaluation needs to be made by (at least) 2 reviewers • Must document approach used
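Because the approach must be documented and applied by at least two reviewers, it can help to record each reviewer's domain scores in a fixed structure and list where they differ before reconciling. The sketch below is hypothetical: the `DomainScores` class, its field names, and the example scores for the two reviewers are assumptions made for illustration and do not come from the AHRQ guidance.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DomainScores:
    """One reviewer's scores for one outcome (hypothetical record layout)."""
    risk_of_bias: str     # "low", "medium", "high"
    consistency: str      # "consistent", "inconsistent", "unknown"
    directness: str       # "direct", "indirect"
    precision: str        # "precise", "imprecise", "unknown"
    reporting_bias: str   # "suspected", "undetected"
    soe_grade: str        # "high", "moderate", "low", "insufficient"


def disagreements(a: DomainScores, b: DomainScores) -> dict:
    """Return {field: (score_a, score_b)} for every field where the reviewers differ."""
    da, db = asdict(a), asdict(b)
    return {name: (da[name], db[name]) for name in da if da[name] != db[name]}


# Invented scores for a single-study outcome, for illustration only.
reviewer_a = DomainScores("low", "unknown", "direct", "imprecise", "undetected", "low")
reviewer_b = DomainScores("low", "unknown", "direct", "precise", "undetected", "moderate")

for field_name, (score_a, score_b) in disagreements(reviewer_a, reviewer_b).items():
    print(f"{field_name}: reviewer A = {score_a}, reviewer B = {score_b}")
```

The record does not decide the grade; it only makes the two judgments and their differences transparent.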

AHRQ and GRADE Grading Categories

Challenge 1: 1 study, continuous outcome, ‘significant effects’ • Question: What are the effects of a ‘fasting followed by vegan’ diet for rheumatoid arthritis in adults? • Outcome: Pain (13 months), measured on a 10 cm visual analogue scale (VAS) • Kjeldsen-Kragh 1991: population aged 18-75 with mild to severe rheumatoid arthritis

Challenge 1: 1 study, continuous outcome, ‘significant effects’ • Risk of bias: LOW • Allocation concealment • Random sequence generation by computerised random number generator • Blinding: participants not blinded; outcome assessors, investigators, and data analysts blinded • No loss to follow-up • Other biases: none • Reporting bias: UNDETECTED (comprehensive search of major databases and grey literature, contact with authors in the field, and government funding; no additional studies found)

Challenge 1: What is the strength of evidence and why? • Discuss with your neighbor • Vote! Strength of evidence: High, Moderate, Low, or Insufficient

Challenge 1: Assessment • Risk of bias LOW • Consistency: Unknown (one study) • Reporting bias: Undetected • Directness: Direct (outcome, population, intervention) • Precision? • Confidence intervals? • 34 people?

Optimal information size • We suggest the following: if the total number of patients included in a systematic review is less than the number of patients generated by a conventional sample size calculation for a single adequately powered trial, consider rating down for imprecision. Authors have referred to this threshold as the “optimal information size” (OIS) • http://stat.ubc.ca/~rollin/stats/ssize/
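As a rough illustration of the OIS comparison for a continuous outcome, the sketch below uses the conventional normal-approximation sample size formula for comparing two means. The function names, the target difference, and the standard deviation are assumptions chosen for the example; they are not values reported for the study in Challenge 1.

```python
from math import ceil
from statistics import NormalDist


def ois_continuous(delta: float, sd: float, alpha: float = 0.05, beta: float = 0.20) -> int:
    """Total sample size (two equal groups) to detect a mean difference `delta`.

    Normal approximation: n per group = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    n_per_group = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return 2 * ceil(n_per_group)


def meets_ois(total_patients: int, ois: int) -> bool:
    """True if the review's total sample size reaches the OIS."""
    return total_patients >= ois


# Assumed values: detect a 1 cm difference on a 10 cm VAS with an SD of 2.5 cm.
ois = ois_continuous(delta=1.0, sd=2.5)
print(ois)                 # 198
print(meets_ois(34, ois))  # False: 34 participants fall well short
```

Under these assumed inputs a body of evidence with 34 participants would not meet the OIS, which is the kind of result that prompts reviewers to consider rating down for imprecision.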

Rule of thumb • For continuous outcomes: suggest at least a sample size of 400 • More empirical evidence needed • Minimally Important Differences

Challenge 1 Assessment (modification) • Risk of bias: MEDIUM (no allocation concealment; 30% loss to follow-up, mostly treatment-related but evenly distributed) • Consistency: Unknown (single study) • Reporting bias: Undetected • Directness: Indirect (outcome, population aged >65 only, intervention) • Precision: Imprecise • Rating?

Challenge 2: 1 study, dichotomous outcome, ‘non-significant effects’ • Question: What are the effects of over-the-counter medications in acute pneumonia in children? • Outcome: not cured or not improved • Principi 1986: inpatients aged 2-16

Challenge 2: Assessment • Risk of bias: LOW • Allocation concealment: unclear? • Adequate sequence generation: computer-generated random numbers • Blinding of participants and outcome assessors; unclear for data analysts • Complete outcome data • Reporting bias: Undetected • Selective outcome reporting bias: none; the one study found for this medication reported this outcome

Precision? • Confidence intervals • Power calculation • Rules of thumb

Sample size: Optimal information size given alpha of 0.05 and beta of 0.2, for varying control event rates and relative risks. For any chosen line, the evidence meets the optimal information size criterion if the sample size is above the line.
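The relationship between control event rate, relative risk, and OIS can be sketched with the conventional normal-approximation formula for comparing two proportions. The function name and the example inputs are assumptions for illustration; they are not read from the figure or from the study in Challenge 2.

```python
from math import ceil
from statistics import NormalDist


def ois_dichotomous(control_rate: float, relative_risk: float,
                    alpha: float = 0.05, beta: float = 0.20) -> int:
    """Total sample size (two equal groups) for a dichotomous outcome.

    n per group = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    where p1 is the control event rate and p2 = p1 * relative_risk.
    """
    p1 = control_rate
    p2 = control_rate * relative_risk
    if not 0 < p1 < 1 or not 0 < p2 < 1:
        raise ValueError("event rates must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("relative risk of 1 implies no difference to detect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n_per_group = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return 2 * ceil(n_per_group)


# Assumed inputs: 20% control event rate and a relative risk of 0.75.
print(ois_dichotomous(control_rate=0.20, relative_risk=0.75))  # 1806
```

Smaller relative risk reductions or lower control event rates push the required sample size up, which is why the figure plots one line per relative risk.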

Number of events

Precision: • Confidence intervals • Power calculation • Rules of thumb

Challenge 3: Inconsistency and Precision • Question: Effects of taxane chemotherapy in early breast cancer • Outcome: febrile neutropaenia (adverse event) • A priori exploration of heterogeneity: type of cancer, age, dose of taxane; none could explain the heterogeneity

Challenge 3: Assessment • Risk of bias: LOW • Reporting bias: Undetected • Directness: Direct (population, intervention, outcome) • Discuss with your neighbor • Vote! Strength of evidence: High, Moderate, Low, or Insufficient

Consistency and Precision • Precision: confidence intervals (non-significant?), rules of thumb, optimal information size (power calculation) • Consistency: unexplained inconsistency, overlapping confidence intervals, I², p value of Chi²
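The consistency checks named above can be computed from study-level estimates. The sketch below uses the standard inverse-variance (fixed-effect) formulas for Cochran's Q (the Chi² statistic) and I², plus a simple test of whether the confidence intervals share a common region. The function names and the example data are assumptions for illustration; ratio measures such as relative risks would be entered on the log scale. The p value for Q is left out to keep the example dependency-free; it would come from a chi-square distribution with k - 1 degrees of freedom.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled estimate, Cochran's Q, degrees of freedom, I-squared in %)."""
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, df, i_squared


def intervals_overlap(intervals):
    """True if all confidence intervals share at least one common point."""
    highest_lower = max(low for low, _ in intervals)
    lowest_upper = min(high for _, high in intervals)
    return highest_lower <= lowest_upper


# Invented log relative risks and standard errors from three studies.
log_rr = [0.10, 0.65, 1.20]
se = [0.20, 0.25, 0.30]
pooled, q, df, i2 = heterogeneity(log_rr, se)
cis = [(y - 1.96 * s, y + 1.96 * s) for y, s in zip(log_rr, se)]
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%, CIs overlap: {intervals_overlap(cis)}")
```

These statistics describe the heterogeneity; whether it is explained, and whether it warrants a lower grade, remains a reviewer judgment.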

Challenge 4: RCT and observational study data • Major bleeding: Cold Knife Conization vs. LEEP for women with confirmed cervical abnormalities • What is the overall SOE grade?

Challenge 4: Are you more or less confident in the RCT data given the observational data? • Discuss with your neighbor: Does the addition of the observational study data make you more or less confident? • Vote! Overall strength of evidence: High, Moderate, Low, or Insufficient

Challenge 5: Telephone counselling to improve adherence to diet • Narrative synthesis • Total number of studies: 4 • Total number of participants: 255

Challenge 5: Assessment • Risk of bias: Medium • Precision: All together 162 participants with small effect; OIS not met • Consistency: Some inconsistency • Directness: No concern • Reporting bias: No small negative study?

More Information • Nancy Santesso, RD, MLIS, PhD Cand, Department of Clinical Epidemiology and Biostatistics, McMaster University, santesna@mcmaster.ca • Nancy Berkman, PhD, Senior Health Policy Research Analyst, Program on Healthcare Quality and Outcomes, berkman@rti.org
