AHRQ Annual Meeting 2009:"Research to Reform: Achieving Health System Change" September 14, 2009 Yngve Falck-Ytter, M.D. Case Western Reserve University, Cleveland, Ohio Holger Schünemann, M.D., Ph.D. Chair, Department of Clinical Epidemiology & Biostatistics Michael Gent Chair in Healthcare Research McMaster University, Hamilton, Canada Rating the Evidence: Using GRADE to Develop Clinical Practice Guidelines
Disclosure In the past 5 years, Dr. Falck-Ytter received no personal payments for services from industry. His research group received research grants from Three Rivers, Valeant and Roche that were deposited into non-profit research accounts. He is a member of the GRADE working group which has received funding from various governmental entities in the US and Europe. Some of the GRADE work he has done is supported in part by grant # 1 R13 HS016880-01 from the Agency for Healthcare Research and Quality (AHRQ).
Content
Part 1 • Introduction
Part 2 • Why revisit guideline methodology?
Part 3 • The GRADE approach: quality of evidence
Part 4 • The GRADE approach: strength of recommendations
Questions to the audience
• Are you involved in giving recommendations?
• Do you use any form of grading system?
• Familiarity with GRADE:
• Heard about GRADE before this conference?
• Read a GRADE article published by the GRADE working group?
• Attended a GRADE presentation?
• Attended a hands-on GRADE workshop?
Reassessment of clinical practice guidelines • Editorial by Shaneyfelt and Centor (JAMA 2009) • "Too many current guidelines have become marketing and opinion-based pieces…" • "AHA CPG: 48% of recommendations are based on level C = expert opinion…" • "…clinicians do not use CPG […] greater concern […] some CPG are turned into performance measures…" • "Time has come for CPG development to again be centralized, e.g., AHRQ…"
Evidence-based clinical decisions (Haynes et al. 2002)
• Clinical state and circumstances
• Patient values and preferences
• Research evidence
• Clinical expertise integrates all three
Before GRADE (Oxford Centre for Evidence-Based Medicine; http://www.cebm.net)
Level of evidence (source of evidence) → grade of recommendation:
• I (systematic reviews, RCTs) → A
• II (cohort studies) → B
• III (case-control studies) → B
• IV (case series) → C
• V (expert opinion) → D
Where GRADE fits in
• Prioritize problems, establish panel
• Systematic review: searches, selection of studies, data collection and analysis
• Assess the relative importance of outcomes
• Prepare evidence profile: quality of evidence for each outcome and summary of findings
• GRADE: assess overall quality of evidence; decide direction and strength of recommendation
• Draft guideline
• Consult with stakeholders and/or external peer reviewers
• Disseminate guideline
• Implement the guideline and evaluate
Disclosure Dr. Schünemann receives no personal payments for services from the pharmaceutical industry. The research group he belongs to received research grants from industry that are deposited into research accounts. Institutions or organizations that he is affiliated with likely receive funding from for-profit sponsors that support infrastructure and research that may serve his work. He is documents editor for the American Thoracic Society and co-chair of the GRADE Working Group.
Content
• Why grading?
• Confidence in information and recommendations
• Introduction to: quality of evidence, strength of recommendations
Please discuss the difference between consensus statements and guidelines. Be prepared to share your answer.
There are no RCTs! • Do you think that users of recommendations would like to be informed about the basis (explanation) for a recommendation or coverage decision if they were asked (by their patients)? • I suspect the answer is "yes" • If we need to provide the basis for recommendations, we need to say whether the evidence is good or not so good; in other words, perhaps "there are no RCTs"
Hierarchy of evidence (by study design; risk of bias increases down the hierarchy)
• Randomized controlled trials
• Cohort studies and case-control studies
• Case reports and case series, non-systematic observations
• Expert opinion
Confidence in evidence • There always is evidence • "When there is a question there is evidence" • Better research → greater confidence in the evidence and decisions
Who can explain the following? • Concealment of randomization • Bias, confounding and effect modification • Blinding (who is blinded in a double-blind trial?) • Intention-to-treat analysis and its correct application • Why do trials stopped early for benefit overestimate treatment effects? • P-values and confidence intervals
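As a concrete illustration of one of these concepts, here is a minimal sketch of intention-to-treat versus per-protocol analysis; the trial numbers and function name are invented for illustration, not taken from any study in this talk.

```python
# Minimal sketch (hypothetical numbers): intention-to-treat (ITT) vs.
# per-protocol analysis of a two-arm trial. ITT analyzes every randomized
# patient in the arm they were assigned to, regardless of adherence.

def event_rate(events, randomized):
    """Proportion of randomized patients with the outcome event."""
    return events / randomized

# Invented example: 100 patients randomized per arm. In the treatment arm,
# 20 patients stopped the drug early; 5 of those 20 had events.
# (For simplicity, the control arm is assumed fully adherent.)
itt_treatment = event_rate(events=15, randomized=100)          # all 100 analyzed as assigned
per_protocol_treatment = event_rate(events=10, randomized=80)  # only completers analyzed
itt_control = event_rate(events=20, randomized=100)

print(f"ITT risk ratio:          {itt_treatment / itt_control:.2f}")
print(f"Per-protocol risk ratio: {per_protocol_treatment / itt_control:.2f}")
# Dropping non-adherent patients (per-protocol) can make the treatment look
# better than the ITT estimate and breaks the balance created by randomization.
```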
Hierarchy of evidence (revisited)
• Randomized controlled trials
• Cohort studies and case-control studies
• Case reports and case series, non-systematic observations
• Expert opinion appears alongside every level of study design: expert judgment is needed to interpret evidence at all levels, rather than forming a level of its own
Reasons for grading evidence
• Appraisal of evidence has become complex and daunting
• People draw conclusions about the quality of evidence and strength of recommendations
• Systematic and explicit approaches can help: protect against errors, resolve disagreements, communicate information and fulfil needs, change practitioner behavior
• However, wide variation in approaches
GRADE working group. BMJ. 2004 & 2008
Which grading system?
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease:
• AHA: evidence B, Class I recommendation
• ACCP: evidence A, grade 1 recommendation
• SIGN: evidence IV, grade C recommendation
Limitations of older systems & approaches • confuse quality of evidence with strength of recommendations • lack well-articulated conceptual framework • criteria not comprehensive or transparent • focus on single outcomes
GRADE Quality of Evidence In the context of a systematic review • The quality of evidence reflects the extent to which we are confident that an estimate of effect is correct. In the context of making recommendations • The quality of evidence reflects the extent to which our confidence in an estimate of the effect is adequate to support a particular recommendation.
Confident in the evidence?
• A meta-analysis of observational studies showed that bicycle helmets reduce the risk of head injuries in cyclists. OR 0.31, 95% CI 0.26 to 0.37
• A meta-analysis of observational studies showed that warfarin prophylaxis reduces the risk of thromboembolism in patients with cardiac valve replacement. RR 0.17, 95% CI 0.13 to 0.24
GRADE: Quality of evidence The extent to which our confidence in an estimate of the treatment effect is adequate to support a particular recommendation. GRADE defines 4 categories of quality: • High • Moderate • Low • Very low
Quality of evidence across studies
Quality is rated separately for each outcome, across the body of studies:
• Outcome #1: High
• Outcome #2: Moderate
• Outcome #3: Low
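A minimal sketch of how per-outcome ratings can feed an overall rating, assuming the usual GRADE rule that the overall quality is the lowest rating among the outcomes critical for the decision; the outcome names and ratings below are invented.

```python
# Hedged sketch: overall quality across outcomes, assuming the GRADE rule
# that the overall rating equals the lowest rating among critical outcomes.

RANK = {"high": 4, "moderate": 3, "low": 2, "very low": 1}

def overall_quality(outcomes):
    """outcomes: list of (name, quality, is_critical) tuples."""
    critical = [q for _, q, is_critical in outcomes if is_critical]
    return min(critical, key=lambda q: RANK[q])

outcomes = [
    ("mortality",         "high",     True),   # critical for the decision
    ("hospitalization",   "moderate", True),   # critical for the decision
    ("minor side effect", "low",      False),  # important but not critical
]
print(overall_quality(outcomes))  # -> "moderate" (lowest among critical outcomes)
```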
Determinants of quality • RCTs start high • Observational studies start low
Determinants of quality
What lowers the quality of evidence? Five factors:
• Methodological limitations (risk of bias)
• Inconsistency of results
• Indirectness of evidence
• Imprecision of results
• Publication bias
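A minimal sketch of this starting-point-plus-downgrading logic, assuming each factor can lower the rating by one or two levels according to the reviewer's judgment; the example downgrades are hypothetical, and the function name is invented for illustration.

```python
# Sketch of the GRADE starting point and downgrading described above:
# randomized trials start "high", observational studies start "low",
# and each of the five factors can lower the rating. The specific
# penalties passed in are the reviewer's judgments, not computed values.

LEVELS = ["very low", "low", "moderate", "high"]

def grade_quality(study_design, downgrades):
    """study_design: 'rct' or 'observational';
    downgrades: dict of factor -> levels to subtract (0, 1, or 2)."""
    start = LEVELS.index("high") if study_design == "rct" else LEVELS.index("low")
    total = sum(downgrades.values())
    return LEVELS[max(0, start - total)]

# Hypothetical body of RCT evidence with serious risk of bias and imprecision:
print(grade_quality("rct", {
    "risk_of_bias": 1, "inconsistency": 0, "indirectness": 0,
    "imprecision": 1, "publication_bias": 0,
}))  # -> "low"
```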
Assessment of detailed design and execution (risk of bias)
For RCTs:
• Lack of allocation concealment
• No true intention-to-treat principle
• Inadequate blinding
• Loss to follow-up
• Early stopping for benefit
Allocation concealment (Schulz KF et al. JAMA 1995)
250 RCTs from 33 meta-analyses; effect of allocation concealment on treatment effect estimates (ratio of odds ratios):
• Adequate: 1.00 (reference)
• Unclear: 0.67 [0.60 – 0.75]
• Not adequate: 0.59 [0.48 – 0.73] *
* significant
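A hedged worked example of what a ratio of odds ratios of about 0.67 implies; the baseline odds ratio below is invented purely to illustrate the arithmetic.

```python
# Worked interpretation of the ratio-of-odds-ratios figure above (a sketch;
# the example OR is invented). A ratio of 0.67 means trials with unclear
# allocation concealment yielded odds ratios that were, on average, about
# one third more extreme (further below 1.0) than adequately concealed trials.

ratio_of_or = 0.67

adequately_concealed_or = 0.80  # hypothetical effect from well-concealed trials
expected_unclear_or = adequately_concealed_or * ratio_of_or
print(f"Expected OR in poorly concealed trials: {expected_unclear_or:.2f}")
# -> about 0.54: the apparent benefit is exaggerated when concealment fails.
```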
What about scoring tools? Example: Jadad score (Jadad AR et al. Control Clin Trials 1996)
• Was the study described as randomized? (1 point)
• Adequate description of randomization? (1 point)
• Double blind? (1 point)
• Method of double blinding described? (1 point)
• Description of withdrawals and dropouts? (1 point)
Maximum of 5 points for quality
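For illustration, a minimal sketch that tallies the five Jadad items listed above (the full instrument also deducts points for inappropriate randomization or blinding methods, which this sketch omits); the example trial is hypothetical.

```python
# Sketch of the 5-item Jadad tally shown above (0-5 points). Note the GRADE
# point: a single summary number hides *which* limitation is present, which
# is why GRADE assesses risk-of-bias factors separately for each outcome.

def jadad_score(randomized, randomization_described, double_blind,
                blinding_described, withdrawals_described):
    items = [randomized, randomization_described, double_blind,
             blinding_described, withdrawals_described]
    return sum(1 for item in items if item)

# Hypothetical trial: randomized and double blind, but methods not described
# and withdrawals not reported.
print(jadad_score(True, False, True, False, False))  # -> 2 out of 5
```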
Look for an explanation for inconsistency
• Differences in patients, intervention, comparator, outcome, methods
• Judgment based on: variation in size of effect, overlap in confidence intervals, statistical significance of heterogeneity, I²
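A minimal sketch of the I² statistic mentioned above, computed from Cochran's Q with inverse-variance weights; the study estimates are invented.

```python
# Sketch of I², computed as max(0, (Q - df) / Q) * 100%, where Q is
# Cochran's heterogeneity statistic from an inverse-variance pooled estimate.

def i_squared(effects, variances):
    """effects: per-study log effect estimates; variances: their variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Three hypothetical studies (log risk ratios and their variances):
print(f"I² = {i_squared([-0.5, -0.1, -0.9], [0.02, 0.03, 0.02]):.0f}%")
# -> about 85%, i.e. substantial heterogeneity to explain or downgrade for.
```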
Heterogeneity Neurological or vascular complications or death within 30 days of endovascular treatment (stent, balloon angioplasty) vs. surgical carotid endarterectomy (CEA)
Indirect comparisons
• Interested in a head-to-head comparison of drug A versus drug B (e.g., tenofovir versus entecavir in hepatitis B treatment), but often only comparisons against a common comparator are available
• Indirectness also arises from differences in: patients (early cirrhosis vs. end-stage cirrhosis), interventions (CRC screening: flexible sigmoidoscopy vs. colonoscopy), comparator (e.g., differences in dose), outcomes (NSAID safety: ulcer on endoscopy vs. symptomatic ulcer complications)
Imprecision of results
• Small sample size
• Small number of events
• Wide confidence intervals
• Uncertainty about the magnitude of effect
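A minimal sketch of why few events produce wide confidence intervals, using the standard approximate 95% CI for a risk ratio on the log scale; all counts are invented.

```python
# Sketch: approximate 95% CI for a risk ratio, with
# SE(log RR) = sqrt(1/a - 1/n1 + 1/c - 1/n2). Counts are hypothetical.
import math

def rr_ci(a, n1, c, n2):
    """a/n1 events in the treatment arm, c/n2 events in the control arm."""
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1/a - 1/n1 + 1/c - 1/n2)
    return rr, rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)

# Same 50% relative risk reduction, very different numbers of events:
for label, args in [("few events ", dict(a=5,  n1=100,  c=10,  n2=100)),
                    ("many events", dict(a=50, n1=1000, c=100, n2=1000))]:
    rr, lo, hi = rr_ci(**args)
    print(f"{label}: RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
# Few events  -> CI roughly 0.18 to 1.41 (crosses 1.0, imprecise)
# Many events -> CI roughly 0.36 to 0.69 (excludes 1.0, precise)
```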
Imprecision Any stroke (or death) within 30 days of endovascular treatment (stent, balloon angioplasty) vs. surgical carotid endarterectomy (CEA)
Reporting of studies
• Publication bias
• Suspect particularly when the evidence comes from a number of small studies
All phase II and III licensing trials for antidepressant drugs between 1987 and 2004 (74 trials; 23 were not published)
Quality assessment criteria
• Study design determines the starting point: randomized trials start High; observational studies start Low
• Lower if: study limitations (design and execution), inconsistency, indirectness, imprecision, publication bias
• Higher if: what can raise the quality of evidence?
• Resulting quality of evidence: High, Moderate, Low, Very low