Correcting for Selection Bias in Randomized Clinical Trials

Vance W. Berger, NCI. FDA/Industry Workshop, DC, 9/15/05.

Outline • 1. What do we expect of randomization (4)? • 2. Chronological bias (2). • 3. Randomized blocks (3). • 4. Selection bias (7). • 5. Correcting selection bias (5). • 6. Further reading (4).

1. What Do We Expect? (1/4) • The success of randomization has often been questioned in randomized trials, because of baseline imbalances [1]. • For example, Schor [2] raised this concern in The University Group Diabetes Program. • Altman [3] raised this concern for a randomized comparison of talc to mustine for control of pleural effusions [4].

1. What Do We Expect? (2/4) • Because of an imbalance in the numbers of patients randomized to each group (134 vs. 116), the Western Washington Intracoronary Streptokinase Trial statisticians were “particularly concerned in verifying that the randomization process had been carried out as planned” [5]. • Weiss, Gill, and Hudis [6] audited a randomized South African trial of high-dose chemotherapy for metastatic breast cancer [7], noted imbalances in the numbers of patients allocated over time, and concluded that “It is unlikely that this sequence of treatment assignments could have occurred if the study were truly randomized.”

1. What Do We Expect? (3/4) • In a randomized study of a culturally sensitive AIDS education program [8], Marcus [9] hypothesized that “subjects with lower baseline knowledge scores … may have been channeled into the treatment group”, because of baseline imbalances across the randomized groups. • Jordhoy et al. [10] discussed a cluster randomized trial of palliative care conducted at the Palliative Medicine Unit of Trondheim University Hospital and noted that “The individual patient results [meaning baseline imbalances] suggested that diagnosis was not randomly distributed across the two groups”.

1. What Do We Expect? (4/4) • Two common themes emerge from all of these challenges of ostensibly randomized trials. • Questions are raised when either 1) the numbers of subjects do not match expectations or 2) the baseline characteristics of the participants differ greatly across the randomized groups. • Clearly, then, we expect more from randomized trials than just that they be randomized, and in fact randomization does not always create the balanced groups we would have hoped for.

2. Chronological Bias (1/2) • How can baseline imbalances be large enough that one would question the success of the randomization? • Completely unrestricted randomization ensures independence, but allows for unbalanced group sizes, and so is not used very often in practice. • Instead, some form of restricted randomization is used to ensure balanced group sizes at the end of the trial. • The random allocation rule makes this terminal balance in group sizes its only restriction, and so it allows for large baseline imbalances during the trial. • Suppose that many more early allocations are to one group, and more late allocations are to the other group. • Suppose further that the covariate distribution changes during the course of the trial; this is quite likely.

2. Chronological Bias (2/2) • There could be more females early, but during the trial another trial opens up just for females, so there are more males in this trial henceforth. • Gender is confounded with time, which, because of the imbalance, is confounded with treatments. • This is chronological bias [11], although the name is a misnomer as chronological bias does not systematically favor one group or the other. • Still, it is one cause of baseline imbalances. • The only way to control chronological bias is to introduce restrictions on the randomization.

3. Randomized Blocks (1/3) • Perhaps the most common form of restricted randomization is randomized or permuted blocks. • The idea is to force perfect balance every so often. • Block sizes may be fixed (e.g., 4) or varied (e.g., 2 & 4), and the random allocation rule is used within each block to ensure perfect balance in the block. • In unmasked trials, prior allocations are known. • Once all but one group has been exhausted in the block (e.g., EECC with size 4), all remaining allocations to that block will be deterministic.

3. Randomized Blocks (2/3) • In fact, in an EECC block even the 2nd allocation is predictable, as one can use knowledge of the 1st allocation to do better than guessing. • Let P{E} be the proportion of remaining assignments to the experimental group E. • With 1:1 allocation between experimental group E and control C and block size 4, P{E} just before each of the four allocations is:

Block   1st   2nd   3rd   4th
CCEE    2/4   2/3   2/2   1/1
EECC    2/4   1/3   0/2   0/1
CECE    2/4   2/3   1/2   1/1
ECEC    2/4   1/3   1/2   0/1
CEEC    2/4   2/3   1/2   0/1
ECCE    2/4   1/3   1/2   1/1

3. Randomized Blocks (3/3) • Only the 1st allocation of an EECC or CCEE block is unpredictable, and only the 1st and 3rd of CECE, CEEC, ECEC, or ECCE blocks are unpredictable. • Even if the investigator has never actually seen the allocation sequence, he or she will still know P{E} at the time a patient is considered for trial entry. • In fact, the investigator will know both P{E} (the predicted treatment assignment) and the set of covariates specific to the patient being considered. • Only if P{E} equals the unconditional probability (or 0.5 with 1:1 allocation) is there no prediction.
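
The predictability pattern described above can be checked with a short script. This is a minimal sketch, not code from the talk: the function names `p_e_sequence` and `all_blocks` are hypothetical, and it assumes 1:1 allocation with the block size known to the investigator.

```python
from fractions import Fraction
from itertools import permutations


def p_e_sequence(block):
    """Return P{E} just before each allocation in a block such as "EECC".

    P{E} is the proportion of the allocations still remaining in the
    block that are to the experimental group E.
    """
    remaining_e = block.count("E")
    remaining = len(block)
    sequence = []
    for arm in block:
        sequence.append(Fraction(remaining_e, remaining))
        if arm == "E":
            remaining_e -= 1
        remaining -= 1
    return sequence


def all_blocks(size=4):
    """All distinct balanced blocks of the given (even) size."""
    half = size // 2
    return sorted({"".join(p) for p in permutations("E" * half + "C" * half)})


for block in all_blocks(4):
    probs = p_e_sequence(block)
    # An allocation is unpredictable only when P{E} equals 1/2.
    unpredictable = [i + 1 for i, p in enumerate(probs) if p == Fraction(1, 2)]
    print(block, [str(p) for p in probs], "unpredictable:", unpredictable)
```

Running it lists the six blocks of size 4 and flags, for each, the positions at which P{E} equals 1/2.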

4. Selection Bias Mechanism (1/7) • Many authors state that, as a consequence of randomization, any baseline imbalances in a randomized trial must be random in origin. • Yet selection bias occurs if healthier patients are enrolled when P{E}>0.5 and sicker patients are enrolled when P{E}<0.5 (or vice versa). • Of course, this is not a concern in masked trials, because unmasking is required for P{E} to assume any value other than the uninformative 0.5. • But in practice, are there any truly masked trials?

4. Selection Bias Mechanism (2/7) • It will help to define our terms carefully. • Some define masked trials as those in which nobody knows who got what until the end. • Indeed, this is the objective of masking; to define randomization similarly in terms of its objective is to define a trial to be randomized if and only if any of its baseline imbalances are random. • And yet one cannot help but recall Socrates asking if an act was pious because the heavens approved, or if the heavens approved because it was pious.

4. Selection Bias Mechanism (3/7) • Just as one cannot confer with Zeus to inquire as to his approval of an action one is contemplating, so too is one unable to verify that each observed baseline imbalance was of a random origin. • This ideal would have to be a consequence, and not the definition, of randomization, and we are now left to wonder – what is randomization? • To make randomization, masking, and allocation concealment useful concepts, and avoid circular logic, we must define these three terms as actions that one can take (processes), and not as the realization of their intended outcomes [12].

4. Selection Bias Mechanism (4/7) • The process of randomization is nothing more, or less, than constructing treatment groups by randomly selecting non-overlapping subsets of the set of all accession numbers to be used [13]. • Note that this definition allows one to actually conduct a randomized trial (it is an action). • Can one eliminate selection bias as a consequence of randomization according to the definition? • Without allocation concealment (often defined as masking of each allocation only until a treatment is assigned to the patient in question), the answer is clearly no, but perfect masking implies perfect allocation concealment, which implies no bias.

4. Selection Bias Mechanism (5/7) • But do masking & allocation concealment claims confer true allocation concealment (and no bias)? • The process of masking, or not telling patients or physicians who got what, is clearly worthwhile, but information is not often contained very well. • Tell-tale side effects, e.g., may lead to unmasking. • Sealed envelopes have been held up to lights, files have been raided, and fake patients have been called in to ascertain the next allocation [14]. • So the effect of masking may not match its goal. • Unmasking may lead to evaluation biases; if it occurs after the patients have been selected then it should not lead to selection bias; however …

4. Selection Bias Mechanism (6/7) • Most RCTs use restricted randomization (blocks). • The patterns in the allocation sequence allow for prediction of the future allocations based on knowledge of the past ones, and selection bias [1]. • So even “masked” randomized trials with planned allocation concealment are not immune [12]. • One can compute the expected imbalance in a binary covariate to be 50% with blocks of size 2, 42% (block size 4), or 28% (block size 6) [15]. • The result is artificially large test statistics and posterior probabilities, artificially low p-values, and artificially narrow confidence intervals.

4. Selection Bias Mechanism (7/7) • 20 blocks of size two each: 10 ‘CE’ blocks, 10 ‘EC’ blocks. • For ‘CE’, P{E}=0.5, then 1.0. • For ‘EC’, P{E}=0.5, then 0.0. • Females respond better than males. • [Figure: all patients randomized (20 male, 20 female) are sorted by P{E}. P{E}=0.0 (10 male): selectively semi-permeable, 100% to control. P{E}=0.5 (10 male, 10 female): permeable, 50% to each group. P{E}=1.0 (10 female): selectively semi-permeable, 100% to experimental.] • Control Group (25% female, 75% male); Experimental Group (75% female, 25% male).
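
The group compositions in this example can be reproduced by enumeration. The sketch below is an illustration under stated assumptions, not the calculation from [15]: it assumes the investigator enrolls a better-responding (female) patient whenever P{E}>0.5, a worse-responding (male) patient whenever P{E}<0.5, and either with equal probability when P{E}=0.5. The function name `expected_female_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def expected_female_fraction(block_size):
    """Expected fraction of females in (E, C) under the assumed selection rule.

    Assumed rule: enroll a female if P{E} > 1/2, a male if P{E} < 1/2,
    and a female with probability 1/2 if P{E} == 1/2. All balanced
    blocks of the given size are taken to be equally likely.
    """
    half = block_size // 2
    blocks = set(permutations("E" * half + "C" * half))
    females = {"E": Fraction(0), "C": Fraction(0)}
    totals = {"E": 0, "C": 0}
    for block in blocks:
        remaining_e, remaining = half, block_size
        for arm in block:
            p_e = Fraction(remaining_e, remaining)
            if p_e > Fraction(1, 2):
                prob_female = Fraction(1)
            elif p_e < Fraction(1, 2):
                prob_female = Fraction(0)
            else:
                prob_female = Fraction(1, 2)
            females[arm] += prob_female
            totals[arm] += 1
            if arm == "E":
                remaining_e -= 1
            remaining -= 1
    return females["E"] / totals["E"], females["C"] / totals["C"]


for size in (2, 4):
    e_frac, c_frac = expected_female_fraction(size)
    print(size, float(e_frac), float(c_frac), float(e_frac - c_frac))
```

With blocks of size 2 this gives 75% females in E and 25% in C, an imbalance of 50%; with blocks of size 4 the imbalance is 5/12, or about 42%. Figures for other block sizes depend on the selection model used in [15], which this sketch does not attempt to reproduce.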

5. Correcting Selection Bias (1/5) • Selection bias can be prevented, detected, and corrected, but specialized methods are needed. • Recall that E & C are the experimental & control treatment groups (TG), respectively; P{E} is the proportion of E allocations remaining in the block. • If E is superior to C, then treatment group TG and response Y are correlated, as are P{E} and TG. • So P{E} should be unbalanced across treatment groups and, overall, possibly prognostic for Y. • But P{E} should not predict Y within a given TG. • Consider two patients who receive E, one known up front to get E (P{E}=1), one not (P{E}=0.50).

5. Correcting Selection Bias (2/5) • If E[Y|TG=E, P{E}] depends on P{E}, then P{E} is on the causal pathway of the mechanism of action of E; this would suggest selection bias. • For example, consider a study with 24 patients, 12 blocks of size two each, six each of EC and CE. • P{E}=0.5 if block position BP=1, P{E}=0 if BP=2 (EC block), and P{E}=1 if BP=2 (CE block). • Suppose that the response data (responders/patients) turn out as follows.

TG   BP=2, P{E}=0   BP=1, P{E}=1/2   BP=2, P{E}=1   Total
C    0/6            3/6              0/0            3/12
E    0/0            3/6              6/6            9/12

5. Correcting Selection Bias (3/5) • Fisher’s exact p-values are 0.04 (two-sided) or 0.02 (one-sided) for comparing either E to C or EC blocks to CE blocks; p=0.0003 one-sided or p=0.0007 two-sided for testing for trend in P{E} binomial proportions (Jonckheere-Terpstra). • So P{E} is even more predictive than treatment is! • Without allocation concealment P{E} is a perfect predictor of treatment group (TG), but allocation concealment (meaning the ability to predict but not observe) separates the effects of P{E} and TG.
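
The Fisher's exact p-values quoted above can be reproduced from the totals 3/12 (C) versus 9/12 (E). The following is a minimal standard-library sketch; the helper name `fisher_exact_2x2` is hypothetical, and the two-sided p-value uses the common convention of summing all tables no more probable than the observed one, which is an assumption about the convention used in the talk.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the table [[a, b], [c, d]].

    a, b: responders and non-responders in group 1 (here C).
    c, d: responders and non-responders in group 2 (here E).
    Returns (one_sided, two_sided), where one_sided is the probability
    of a or fewer responders in group 1 given the margins.
    """
    n1, n2, k = a + b, c + d, a + c
    lo, hi = max(0, k - n2), min(k, n1)
    # Unnormalized hypergeometric weights, kept as exact integers.
    weight = {x: comb(n1, x) * comb(n2, k - x) for x in range(lo, hi + 1)}
    total = comb(n1 + n2, k)
    one_sided = sum(w for x, w in weight.items() if x <= a) / total
    two_sided = sum(w for w in weight.values() if w <= weight[a]) / total
    return one_sided, two_sided


one_sided, two_sided = fisher_exact_2x2(3, 9, 9, 3)
print(round(one_sided, 4), round(two_sided, 4))  # 0.0196 0.0391
```

Because EC blocks contribute 3 responders out of 12 patients and CE blocks 9 out of 12, the same table, and hence the same p-values, arises when comparing block types.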

5. Correcting Selection Bias (4/5) • The Berger-Exner test of selection bias [16] exploits this separation of effects, and is based on the ability of P{E} to predict Y, adjusting for TG. • The quantity P{E} can also be used to correct for selection bias, because there is no bias within a group of patients with the same P{E} value. • That is, P{E} is a balancing score much like the propensity score (used in observational studies). • P{E} functions as the propensity score, and was termed the “reverse propensity score” (RPS) [17]. • So compare TGs within P{E} values [17] to ensure that the comparisons are free of bias.
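
Applied to the 24-patient example, comparing treatment groups within P{E} values might look like the sketch below. It is an illustration only: the data layout, the function names, and the choice to weight each stratum by its size are assumptions, and it is not the analysis procedure of [16] or [17].

```python
from fractions import Fraction

# (P{E}, treatment group) -> (responders, patients), from the example table.
data = {
    (Fraction(0), "C"): (0, 6),
    (Fraction(1, 2), "C"): (3, 6),
    (Fraction(1, 2), "E"): (3, 6),
    (Fraction(1), "E"): (6, 6),
}


def crude_difference(data):
    """Response rate in E minus response rate in C, ignoring P{E}."""
    responders = {"E": 0, "C": 0}
    patients = {"E": 0, "C": 0}
    for (_, tg), (r, n) in data.items():
        responders[tg] += r
        patients[tg] += n
    return Fraction(responders["E"], patients["E"]) - Fraction(
        responders["C"], patients["C"]
    )


def stratified_difference(data):
    """Size-weighted mean of E-minus-C differences within P{E} strata.

    Only strata containing both groups are informative; weighting by
    stratum size is an assumption made for this sketch.
    """
    weighted_sum, weight = Fraction(0), 0
    for p_e in sorted({p for p, _ in data}):
        if (p_e, "E") in data and (p_e, "C") in data:
            r_e, n_e = data[(p_e, "E")]
            r_c, n_c = data[(p_e, "C")]
            w = n_e + n_c
            weighted_sum += w * (Fraction(r_e, n_e) - Fraction(r_c, n_c))
            weight += w
    return weighted_sum / weight if weight else None


print(crude_difference(data), stratified_difference(data))  # 1/2 0
```

In this example the crude difference of 1/2 disappears once the comparison is restricted to the only stratum containing both groups (P{E}=1/2), while within E the response rate still rises with P{E} (3/6 at P{E}=1/2 versus 6/6 at P{E}=1).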

5. Correcting Selection Bias (5/5) • That is, the suggestion is to use the RPS as a covariate, although it is an unusual covariate. • We might call the RPS a “reverse causality” covariate, because it does not bring about better outcomes but rather suggests that the patient was found to possess attributes that would do so. • So the RPS is a credential that reflects selection based on all attributes, but is not itself an attribute. • Further work is needed to clarify if the RPS should replace or supplement other covariates.

6. Further Reading (1/4) • More information is available -- just send me a message and I will send you articles. • Vance Berger • Vb78c@nih.gov • (301) 435-5303

6. Further Reading (2/4) • [1]. Berger VW, Weinstein S (2004). Ensuring the Comparability of Comparison Groups: Is Randomization Enough? Controlled Clinical Trials 25, 515-524. • [2]. Schor, S. (1971). The University Group Diabetes Program: A Statistician Looks at the Mortality Results. JAMA 217, 12, 1671-1675. • [3]. Altman, D. G. (1985). Comparability of Randomized Groups. The Statistician 34, 125-136. • [4]. Fentiman, I. S., Rubens, R. D., Hayward, J. L. (1983). Control of Pleural Effusions in Patients with Breast Cancer. Cancer 52, 737-739. • [5]. Hallstrom, A., Davis, K. (1988). Imbalance in Treatment Assignments in Stratified Blocked Randomization. Controlled Clinical Trials 9, 375-382. • [6]. Weiss, R. B., Gill, G. G., and Hudis, C. A. (2001). An On-Site Audit of the South African Trial of High-Dose Chemotherapy for Metastatic Breast Cancer and Associated Publications. Journal of Clinical Oncology 19, 11, 2771-2777.

6. Further Reading (3/4) • [7]. Bezwoda, W. R., Seymour, L., and Dansey, R. D. (1995). High-Dose Chemotherapy with Hematopoietic Rescue as Primary Treatment for Metastatic Breast Cancer: A Randomized Trial. Journal of Clinical Oncology 13, 2483-2489. • [8]. Stevenson, H. C., Davis, G. (1994). Impact of Culturally Sensitive AIDS Video Education on the AIDS Risk Knowledge of African American Adolescents. AIDS Education and Prevention 6, 40-52. • [9]. Marcus SM (2001). Sensitivity Analysis for Subverting Randomization in Controlled Trials. Statistics in Medicine 20, 545-555. • [10]. Jordhoy, M. S., Fayers, P. M., Ahlner-Elmqvist, M., Kaasa, S. (2002). Lack of Concealment May Lead To Selection Bias in Cluster Randomized Trials of Palliative Care. Palliative Medicine 16, 43-49. • [11]. Matts, J. P. and McHugh, R. B. (1983). Conditional Markov chain design for accrual clinical trials. Biometrical Journal 25, 563-577. • [12]. Berger, VW, Christophi, CA (2003). “Randomization Technique, Allocation Concealment, Masking, and Susceptibility of Trials to Selection Bias”, JMASM 2, 1, 80-86. • [13]. Berger, VW (2004). “Selection Bias and Baseline Imbalances in Randomized Trials”, Drug Information Journal 38, 1-2.

6. Further Reading (4/4) • [14]. Berger, VW (2005). Selection Bias and Covariate Imbalances in Randomized Clinical Trials, John Wiley & Sons, Chichester. • [15]. Berger, VW (2005). “Quantifying the Magnitude of Baseline Covariate Imbalances Resulting from Selection Bias in Randomized Clinical Trials” (with discussion), Biometrical Journal 47, 2, 119-139. • [16]. Berger, VW, Exner, DV (1999). “Detecting Selection Bias in Randomized Clinical Trials”, Controlled Clinical Trials 20, 319-327. • [17]. Berger, VW (2005). “The Reverse Propensity Score To Manage Baseline Imbalances in Randomized Trials”, Statistics in Medicine 24, in press.
