Analysis Issues in Assessing Efficacy in Randomized Clinical Trials “Intention to Treat” and Compliance Elizabeth Garrett-Mayer Oncology Biostatistics April 26, 2004
Randomized Clinical Trials • Why are randomized trials the “gold-standard” for assessing treatment efficacy? • Randomization! • Balances factors that might be related to treatment effects across groups • Controls confounding. • Avoids selection bias in forming groups.
General Problem • Study subjects do not always adhere to protocol • Drop-out • Switch treatments • Take only a portion of assigned treatment • How do we account for ‘compliance’? • Most would say: “we don’t and we shouldn’t!”
Example: Coronary Drug Project • Total mortality using clofibrate vs. placebo in men with history of myocardial infarction • Good adherers, clofibrate: 15% mortality • Poor adherers, clofibrate: 25% mortality • Good adherers, placebo: 15% mortality • Poor adherers, placebo: 28% mortality • Tried to ‘adjust’ but it didn’t help.
Intention to Treat (ITT) • What is “intention to treat”? • Analyze the data based purely on the randomization • Ignore the following: • Cross-overs • Non-compliance/Drop-outs • Sounds illogical, but, in principle, it isn’t. • Some encourage ‘supplementary analyses’ which look at compliers only
Examples of Violation of ITT • Compare only patients who actually received assigned treatment. • Assign patients to comparison groups based on the treatment they received. • Exclude patients with low adherence/compliance
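To make the distinction concrete, here is a minimal sketch of how the three analysis populations might be formed from the same records. The `Patient` fields, the eight patients, and the `adherent` flag are hypothetical and chosen only for illustration; they do not come from any trial discussed here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    assigned: str   # arm chosen by randomization: "A" or "B"
    received: str   # treatment the patient actually took
    adherent: bool  # hypothetical flag: met the protocol's adherence threshold
    event: bool     # outcome of interest occurred


def event_rate(patients):
    """Proportion of patients in the list who had the event."""
    return sum(p.event for p in patients) / len(patients)


def itt(patients, arm):
    """Intention to treat: group purely by randomized assignment."""
    return [p for p in patients if p.assigned == arm]


def as_treated(patients, arm):
    """Violates ITT: group by treatment actually received."""
    return [p for p in patients if p.received == arm]


def per_protocol(patients, arm):
    """Violates ITT: keep only adherent patients who took the assigned treatment."""
    return [p for p in patients
            if p.assigned == arm and p.received == arm and p.adherent]


# Hypothetical records, for illustration only.
patients = [
    Patient("A", "A", True, False),
    Patient("A", "A", True, False),
    Patient("A", "B", False, True),   # crossed over to B
    Patient("A", "A", False, True),   # poor adherence
    Patient("B", "B", True, False),
    Patient("B", "B", True, True),
    Patient("B", "B", False, True),   # poor adherence
    Patient("B", "A", False, False),  # crossed over to A
]

for name, select in [("ITT", itt), ("as treated", as_treated),
                     ("per protocol", per_protocol)]:
    print(name, event_rate(select(patients, "A")), event_rate(select(patients, "B")))
```

In this made-up data the ITT event rates are equal in the two arms, while the other two groupings suggest a difference, which is the kind of discrepancy these slides warn about.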
What do we know about compliance? • In general, compliance ….. • Is not random • Individuals who are not compliant might also have other ‘factors’ which are related to the outcome • Is not dichotomous • Non-compliers can have varying levels of non-compliance • E.g. might only take ½ of prescribed medications, might only take ¼. • Can fluctuate over time • Often, compliance is good early in study and then tapers off. • Sometimes, patients will take lots of meds close to office visit to ‘make-up’ for non-compliance. • Is hard to measure • Reliability • Completeness • Inequality of follow-up across arms
So….. • It is potentially “hazardous” to rely on analyses that allow for non-compliance • ITT is unbiased: it measures ‘effect’ in global sense • If people are non-compliant on trial, they are likely to be non-compliant in “real-life” • If people switch medications, or self-medicate on trial, they are likely to do that in “real-life” • And, compliance analyses are usually an afterthought: • Not part of the clinical trial protocol • Ad hoc analyses decided after the study is over.
Tempting… • It is tempting to analyze by ‘treatment received’, BUT! • The groups are no longer comparable • Effectiveness of treatment should incorporate compliance (outside trial people may be even LESS compliant)
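A small simulation can illustrate why the groups are no longer comparable. The sketch below assumes a drug with no effect at all, and assumes that good adherers have lower mortality than poor adherers in both arms (the 15% and 25% risks loosely echo the Coronary Drug Project pattern; the 70% adherence rate, sample size, and seed are invented for illustration).

```python
import random


def simulate(n_per_arm=20000, seed=1):
    """Simulate a two-arm trial in which the drug has NO effect on mortality."""
    rng = random.Random(seed)
    rows = []
    for arm in ("drug", "placebo"):
        for _ in range(n_per_arm):
            good_adherer = rng.random() < 0.7            # assumed adherence rate
            death_risk = 0.15 if good_adherer else 0.25  # identical in both arms
            died = rng.random() < death_risk
            took_drug = arm == "drug" and good_adherer   # poor adherers take no drug
            rows.append((arm, took_drug, died))
    return rows


def mortality(rows):
    """Proportion of rows in which the patient died."""
    return sum(died for _, _, died in rows) / len(rows)


rows = simulate()

# ITT: compare by randomized arm.
itt_diff = (mortality([r for r in rows if r[0] == "drug"])
            - mortality([r for r in rows if r[0] == "placebo"]))

# Treatment received: compare those who took the drug with everyone else.
as_treated_diff = (mortality([r for r in rows if r[1]])
                   - mortality([r for r in rows if not r[1]]))

print(f"ITT difference in mortality:        {itt_diff:+.3f}")
print(f"As-treated difference in mortality: {as_treated_diff:+.3f}")
```

Under these assumptions the ITT difference should sit near zero, while the "treatment received" comparison makes the useless drug look protective, because the people who took it are the good adherers.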
But, ITT is not always ideal • Supplementary analyses are often warranted • They can provide additional information • But, by and large, experts agree: ANALYSIS BY “INTENTION TO TREAT” SHOULD REMAIN THE MAIN STATISTICAL APPROACH FOR PRESENTING COMPARATIVE RESULTS FROM RANDOMIZED CLINICAL TRIALS.
Example • Serum Cholesterol in elderly hypertension trial • Patients were randomized to either (A) diuretic, (B) beta-blockers, or (C) placebo • 1 year post randomization: • (A) vs. (C): +0.12 mmol/l change in serum cholesterol (p=0.001) • (B) vs. (C): +0.08 mmol/l change in serum cholesterol (p=0.003) • SURPRISING: Why would there be a lipid effect of beta-blockers?
Compliance issues • 30% of beta-blocker group were also receiving diuretic by 1 year either instead of or in addition to beta-blocker. • Alternative analysis: Consider 3 groups • Diuretic alone • Beta-blocker alone • Both • Results: • Diuretic alone: +0.11 (p<0.001) • Beta-blocker alone: +0.03 (p=0.20)
How to interpret these results? • ITT is not the “wrong” analysis • But the additional analysis provides insight. • Sometimes, however, it gets messy and hard to interpret.
Example: Febrile Seizures • Use of phenobarbital for the prevention of recurrence of febrile seizures in children. • Question: it might help seizures, but does it hurt the child’s cognition? • Randomized, double-blind, placebo-controlled trial • Outcomes: Seizure recurrence, change in IQ • Some failed compliance • Some crossed over • Depending on how adherence is defined, the results and the inferences differ.
Sometimes ITT is not an option • Two kinds of outcomes (generally): • Visit-related: quantitative lab measures, symptoms • Events: death, relapse, development of disease. • Visit-related endpoints are harder for follow-up • Patients may drop out between the baseline and follow-up visit. • Non-compliance with treatment is related to non-compliance with follow-up. • Non-compliance is not independent of treatment group.
Example: Incomplete Follow-Up • MAAS: Multicentre Anti-Atheroma Study • Simvastatin versus placebo • N = 381 patients with coronary artery disease (CAD) • Outcomes: Mean change over 4 years in mean and minimum lumen diameter of preselected segments of coronary arteries • Study planners realized that four-year follow-up would be achieved by only a subset of patients
Example: Incomplete Follow-Up • How can we plan ahead for that? • Options: • Increase sample size? • Use 4 year data on completers only? • Use LOCF (last observation carried forward)? • Problems: • Sample size increase will still not help with the bias • Completers-only analysis introduces bias • LOCF has validity issues: it assumes that a patient’s observation at, for example, 2 years is the same as at 4 years.
Example: Incomplete Follow-Up • Planners decided to use LOCF • Preserved the ITT approach • Introduced bias into the measurements
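For readers unfamiliar with LOCF, the mechanics can be sketched in a few lines. The function name `locf` and the yearly lumen diameters below are hypothetical; this is an illustration of the general technique, not the MAAS analysis itself.

```python
def locf(values):
    """Last observation carried forward.

    Replaces each missing value (None) with the most recent observed value.
    Values missing before the first observation stay None.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical lumen diameters (mm) at years 0 to 4; the patient
# dropped out after the year-2 visit.
observed = [2.00, 1.95, 1.90, None, None]
print(locf(observed))  # the year-2 value is reused for years 3 and 4
```

If this patient's diameter had kept declining after year 2, the carried-forward value would overstate the year-4 measurement, which is the validity concern raised above.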
Another example: Differential Dropout • Inhaled corticosteroids vs. placebo • 116 kids with asthma • Outcome measure is FEV (forced expiratory volume) • More patients withdrew on placebo arm than on corticosteroid arm (26 vs. 3). • Dropout due to exacerbation of symptoms (so, maybe treatment works!) • Difficult to interpret quantitative results • “Informative censoring”
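The effect of informative dropout on a completers-only comparison can be sketched with a simulation. Everything numeric here is an assumption made for illustration (a true treatment effect of 0.3 on the outcome scale, a standard deviation of 0.4, dropout whenever the change falls below -0.3, and 5000 patients per arm); none of it is taken from the asthma trial.

```python
import random
from statistics import mean


def simulate_arm(n, true_mean, rng, sd=0.4, dropout_below=-0.3):
    """Return (all changes, changes among completers) for one arm.

    A patient drops out if their change in the outcome falls below
    `dropout_below`, mimicking withdrawal due to worsening symptoms.
    """
    changes = [rng.gauss(true_mean, sd) for _ in range(n)]
    completers = [c for c in changes if c >= dropout_below]
    return changes, completers


rng = random.Random(7)
placebo_all, placebo_done = simulate_arm(5000, 0.0, rng)
steroid_all, steroid_done = simulate_arm(5000, 0.3, rng)

placebo_dropouts = len(placebo_all) - len(placebo_done)
steroid_dropouts = len(steroid_all) - len(steroid_done)

# Effect if every patient's outcome were observed vs. completers only.
full_effect = mean(steroid_all) - mean(placebo_all)
completer_effect = mean(steroid_done) - mean(placebo_done)

print(f"dropouts: placebo={placebo_dropouts}, steroid={steroid_dropouts}")
print(f"effect with full data:      {full_effect:.2f}")
print(f"effect in completers only:  {completer_effect:.2f}")
```

Because the placebo arm loses more of its worst-performing patients, its completers look healthier than the arm as a whole, and the completers-only comparison understates the treatment effect in this setup.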
Using Compliance Data • Example: Obesity study • European multi-centre double-blind randomized trial of dexfenfluramine (dF) versus placebo. • 1 year follow-up of 822 obese patients • Compliance data: • Plasma concentrations of fenfluramine (F) and its metabolite norfenfluramine (nF) taken at 6 and 12 months. • Compliance “outcome” is nF+F. • Original study found a significant effect of dF, but the investigators wanted to address the issue of compliance
So, now what? • How can we use the compliance information in assessing efficacy? • Think of a regression approach: Pocock et al.
How to understand the equation • dF: [equation missing in source] • Placebo: [equation missing in source]
What does this tell us? • It helps understand the mechanism • Model makes certain assumptions • “Linear” change in weight loss • Placebo treated are “like” dF treated patients • But, we can make useful inferences • Missing data???
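As a rough illustration of the “linear” assumption, the sketch below fits a straight line of weight loss against a drug-concentration measure by ordinary least squares. This is not the model from Pocock et al., whose exact form is not reproduced in these slides; the function `fit_line`, the variable names, and the data are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: plasma nF+F concentration (arbitrary units)
# and weight loss (kg) for five dF-treated patients.
concentration = [0, 10, 20, 30, 40]
weight_loss = [2.0, 3.5, 5.0, 6.5, 8.0]

intercept, slope = fit_line(concentration, weight_loss)
print(f"weight loss = {intercept:.2f} + {slope:.3f} * concentration")
```

In such a fit the intercept is the predicted weight loss for a patient with no detectable drug, and the slope is the additional loss per unit of concentration. Comparing that intercept with the placebo group only makes sense under the assumption noted above that placebo patients are “like” dF patients.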
Other compliance approaches • Pill counts • Pros • Easy and non-invasive approach • Can ‘blind’ the patients • Cons • Easy for patient to pretend (by getting rid of pills) • Compliance may vary • Patient may take many pills just prior to visit • MEMS caps (Medication Event Monitoring System) • Diaries: interesting mechanism that not only ‘records’, but also might change the behavior.
Broader Issue • Confounding? • Compliance associated with treatment. • Compliance associated with outcome. • Treatment associated with outcome???? [Diagram: treatment, compliance, and outcome, with compliance linked to both treatment and outcome and a “?” on the link from treatment to outcome]
Why then perform ITT and ignore compliance? • First, compliance is hard to measure • Second, we don’t want to make inferences where we have to ‘condition’ on compliance. • Third, and most importantly, it is a mistake to adjust for something that is related to treatment (e.g. compliance)! Recall the “causal pathway” idea. [Diagram: treatment → compliance → outcome]
What if compliance is not related to treatment? • No longer have confounding! [Diagram: treatment, compliance, and outcome, with no link between treatment and compliance]
Notice directionality of arrows • Panel 1 (CONFOUNDING!): compliance → treatment, compliance → outcome, treatment →? outcome. Compliance is NOT on the causal pathway. What could give rise to this figure? If treatment can be self-selected, non-compliers might choose a different treatment. • Panel 2: treatment → compliance → outcome. Compliance is on the causal pathway between treatment and outcome.
Broader Issue: Adjustment • My favorite confounding example • Observational study of the effects of coffee on lung cancer • Smoking is associated with coffee. • Smoking is associated with cancer. • But coffee is NOT associated with cancer. [Diagram: coffee, smoking, and cancer, with smoking linked to both coffee and cancer and a “?” on the link from coffee to cancer]
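The coffee example can be made concrete with a small worked table. The counts below are invented so that coffee has no effect within either smoking stratum; they are not data from any study.

```python
# Hypothetical counts: (lung cancer cases, people) by smoking status and coffee use.
strata = {
    "smokers":     {"coffee": (80, 800), "no_coffee": (20, 200)},
    "non_smokers": {"coffee": (2, 200),  "no_coffee": (8, 800)},
}


def risk_ratio(table):
    """Risk of cancer in coffee drinkers divided by risk in non-drinkers."""
    cases_c, n_c = table["coffee"]
    cases_n, n_n = table["no_coffee"]
    return (cases_c / n_c) / (cases_n / n_n)


# Crude table: ignore smoking and pool the two strata.
crude = {
    group: (sum(s[group][0] for s in strata.values()),
            sum(s[group][1] for s in strata.values()))
    for group in ("coffee", "no_coffee")
}

crude_rr = risk_ratio(crude)
stratum_rr = {name: risk_ratio(table) for name, table in strata.items()}

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio among {name}: {rr:.2f}")
```

In this made-up table the crude risk ratio is close to 3, yet within smokers and within non-smokers it is 1: the apparent coffee effect arises entirely because coffee drinkers are more often smokers.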
What if? • What if coffee consumption were causally associated with smoking (i.e. coffee causes smoking)? • Coffee causes smoking. • Smoking causes cancer. • Does coffee cause cancer? [Diagram: coffee → smoking → cancer, with a “?” on the link from coffee to cancer]
Adjustment • Attempt to remove effect of differences in baseline composition of groups on the outcome of interest. • Analytic procedure • Only for observational studies? • No: randomized studies might have imbalance that can be adjusted • How to adjust? • stratification or subgroup analyses • regression approaches (e.g. linear or logistic regression) • Adjustment factors SHOULD be measured prior to treatment assignment • Do not want to adjust for factors that are a result of protocol!
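One standard way to carry out a stratified adjustment is the Mantel-Haenszel pooled risk ratio. The sketch below applies it to a hypothetical randomized trial in which, by chance, more severe patients ended up in the treated arm; baseline severity is measured before treatment assignment, as the slide requires. The counts and dictionary keys are invented for illustration.

```python
def crude_risk_ratio(strata):
    """Risk ratio (treated vs. control) ignoring the stratifying factor."""
    treated_events = sum(s["treated_events"] for s in strata)
    treated_n = sum(s["treated_n"] for s in strata)
    control_events = sum(s["control_events"] for s in strata)
    control_n = sum(s["control_n"] for s in strata)
    return (treated_events / treated_n) / (control_events / control_n)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel risk ratio pooled across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s["treated_n"] + s["control_n"]
        numerator += s["treated_events"] * s["control_n"] / total
        denominator += s["control_events"] * s["treated_n"] / total
    return numerator / denominator


# Hypothetical trial, stratified by baseline disease severity.
strata = [
    # severe at baseline: chance imbalance puts more of them in the treated arm
    {"treated_events": 18, "treated_n": 60, "control_events": 16, "control_n": 40},
    # mild at baseline
    {"treated_events": 4, "treated_n": 40, "control_events": 9, "control_n": 60},
]

print(f"crude risk ratio:    {crude_risk_ratio(strata):.2f}")
print(f"adjusted risk ratio: {mantel_haenszel_risk_ratio(strata):.2f}")
```

With these numbers the crude risk ratio is 0.88 and the adjusted one is about 0.73: the imbalance in baseline severity partly masks the treatment benefit until it is adjusted for.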