Methodology Research Group. Methods of explanatory analysis for psychological treatment trials workshop. Session 1 Introduction to causal inference and the analysis of treatment effects in the presence of departures from random allocation Ian White. Funded by: MRC Methodology Grant G0600555
Plan of session 1 • Describe departures from random allocation • Intention-to-treat analysis, per-protocol analysis and their limitations • What do we want to estimate? • Estimation methods: principal stratification • Instrumental variables • Structural mean model • Extensions: complex departures, missing data, covariates • Small group discussion Illustrated with data from the ODIN and SoCRATES trials
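Parallel-group trial • Recruit, then randomise to standard treatment (S) or experimental treatment (E) • Without departures: each arm gets its allocated treatment (Get S / Get E), then measure outcome • Switches: participants get the other trial treatment (Get E in the S arm, Get S in the E arm) • Changes to non-trial treatment: participants get a non-trial treatment (Get X) or no treatment (Get 0)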
Aim of session 1 • Infer causal effect of treatment in the presence of departures from randomised intervention • Better term than “non-compliance”: includes both non-adherence and changes in prescribed treatment • Types of departure: • Switches to other trial treatment or changes to non-trial (or no) treatment • Yes / no or quantitative (e.g. attend some sessions) • Constant or time-dependent • We’ll start by considering the simplest case: all-or-nothing switches to the other trial treatment • The methods introduced here will be used in later sessions
Intention-To-Treat (ITT) Principle http://www.consort-statement.org/ glossary: • “A strategy for analyzing data in which all participants are included in the group to which they were assigned, whether or not they completed the intervention given to the group. • “Intention-to-treat analysis prevents bias caused by the loss of participants, which may disrupt the baseline equivalence established by random assignment and which may reflect non-adherence to the protocol.” • Now the standard analysis – and rightly so
Intention-to-treat analysis • Compare groups as randomised, ignoring any departures • Answers an important pragmatic question • e.g. the public health impact of prescribing E • Disadvantage: this may be the wrong question! • may want to explore public health impact of prescribing E outside the trial, when compliance might be less • alternative pragmatic question • may want to know the effect of receiving E • explanatory question
Disadvantage of ITT • “Doctor doctor, will psychotherapy cure my depression?” • “I don’t know, but I expect prescribing psychotherapy to reduce your BDI score by 5 units … • on average … • that’s on average over whether you attend or not” • Clearly, judgements about whether a patient is likely to attend, take a drug, etc., should be a part of prescribing • But we often need to know effects of attendance, the drug, etc. in themselves
Per-protocol (PP) analysis • Alternative to ITT • Exclude any data collected after a departure from randomised treatment • requires careful pre-definition: what will be counted as departures? • Idea is to exclude data that doesn’t allow for the full effect of treatment • However, PP implicitly assumes that individuals with different treatment experience are comparable • rarely true • in practice there can be substantial selection bias
Alternative to ITT and PP • We adopt a “causal modelling” approach that carefully considers what we want to estimate and what assumptions are needed to do so • Estimation will avoid assumptions of comparability between groups as treated • will instead be based on comparisons of randomised groups
What do we want to estimate? • The effect of the intervention, if everyone had received their randomised intervention? • “average causal effect”, ACE • “average treatment effect”, ATE • conceptual difficulties: • how could we make them receive their randomised intervention? • would this be ethical? • would it have other consequences? • technical difficulties: • turns out to be unidentified (unestimable) without further strong assumptions
What do we want to estimate? (2) Alternatives to the average causal effect: • “Average treatment effect in the treated”, ATT • “Complier-average causal effect”, CACE • to be defined below • Note how we separate what we want to estimate from analysis methods
Counterfactuals • Consider a trial of intervention E vs. control S • Define “counterfactual” or “potential” outcomes: • Yi(1) = outcome for individual i if they received intervention • Yi(0) = outcome for individual i if they received control • We can only observe one of these! • Intervention effect for individual i is Δi = Yi(1) - Yi(0) • Then average causal effect of intervention is E[Δi] • the average difference between outcome with intervention and outcome with control
Estimation with perfect compliance • With perfect compliance, we observe • Yi(1) in everyone in the intervention arm • Yi(0) in everyone in the control arm • Randomisation means that mean outcome with intervention can be estimated by mean outcome of those who got intervention: E[Yi | R=E] – E[Yi | R=S] = E[Yi(1) | R=E] – E[Yi(0) | R=S] = E[Yi(1)] – E[Yi(0)] = E[Yi(1) – Yi(0)] • So ITT estimates the average causal effect of intervention • Not true with imperfect compliance!
Estimation with imperfect compliance • Assume “all-or-nothing” compliance • everyone gets either intervention or control • In the intervention arm, we observe • Yi(1) in compliers • Yi(0) in non-compliers • In the control arm, we observe • Yi(0) in compliers • Yi(1) in “contaminators” • Need assumptions to estimate the average causal effect of intervention • A very simple assumption is • Yi(1) - Yi(0) = b • b is the (average) causal effect of intervention
Estimation with imperfect compliance (2) • Continuing with “causal model” Yi(1) - Yi(0) = b • can be written as Yi = Yi(0) + b Di • Di = 1 if intervention was received, else 0 • Implies that expected difference in outcome (between randomised groups) = causal effect of intervention × expected difference in intervention receipt • E[Yi|R=E] – E[Yi|R=S] = b{E[Di|R=E] – E[Di|R=S]} • This gives the simplest causal estimator: • causal effect of intervention = expected difference in outcome / expected difference in intervention receipt
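As a rough illustration of this ratio estimator, the Python sketch below divides the difference in mean outcome by the difference in intervention receipt. The function name `ratio_estimator` is an assumption made for illustration. The inputs are the ODIN complete-case summaries quoted later in these slides (arm means 13.29 and 15.16, 118 of 177 attending in the intervention arm, no attendance possible in the control arm).

```python
def ratio_estimator(mean_y_e, mean_y_s, prop_d_e, prop_d_s):
    """Causal effect b = (difference in outcome) / (difference in receipt)."""
    diff_receipt = prop_d_e - prop_d_s
    if diff_receipt == 0:
        # Randomisation must change receipt, otherwise b is not identified
        raise ValueError("randomisation did not change intervention receipt")
    return (mean_y_e - mean_y_s) / diff_receipt


# ODIN complete-case summaries as quoted in the slides
itt = 13.29 - 15.16
b_hat = ratio_estimator(13.29, 15.16, 118 / 177, 0.0)
print(round(itt, 2), b_hat)
```

The guard on a zero denominator reflects the requirement that allocation must actually shift treatment receipt; with a small difference in receipt the estimate becomes very unstable.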
But … • Angrist, Imbens and Rubin (1996) took a different perspective and showed that this estimator isn’t what it seems • To see this, consider “counterfactual treatments”: • DiE = treatment if randomised to intervention • DiS = treatment if randomised to control • both are 0/1 (received standard / intervention) • Implies 4 types of person (“compliance-types”): • DiE=1, DiS=1: always-takers • DiE=1, DiS=0: compliers • DiE=0, DiS=0: never-takers • DiE=0, DiS=1: defiers – assumed absent
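The four compliance-types can be written down as a small lookup, sketched here in Python. The function names `compliance_type` and `possible_types` are hypothetical. The second function lists which types are consistent with what is actually observed for one person (their arm and the treatment they received), assuming defiers are absent as on the slide.

```python
def compliance_type(d_e, d_s):
    """Classify a person from their two counterfactual treatments.

    d_e: treatment received if randomised to intervention (0 or 1)
    d_s: treatment received if randomised to control (0 or 1)
    """
    types = {
        (1, 1): "always-taker",
        (1, 0): "complier",
        (0, 0): "never-taker",
        (0, 1): "defier",
    }
    return types[(d_e, d_s)]


def possible_types(arm, d_observed):
    """Types consistent with the observed arm and treatment, assuming no defiers."""
    candidates = []
    for d_e in (0, 1):
        for d_s in (0, 1):
            seen = d_e if arm == "E" else d_s  # only one of the pair is observed
            kind = compliance_type(d_e, d_s)
            if seen == d_observed and kind != "defier":
                candidates.append(kind)
    return candidates


print(possible_types("E", 1))
print(possible_types("S", 0))
```

Someone who receives the intervention in arm E could be a complier or an always-taker, so compliance-type is generally not known for individuals.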
Introducing the complier-average causal effect • The observed data tell us nothing about the causal effects of treatment in always-takers and never-takers • In fact, our simple estimator estimates the “complier-average causal effect” (CACE) = E[Yi(1) – Yi(0) | DiE=1, DiS=0] • This is all we can hope to estimate in RCTs!
Problems with the CACE • We don’t know who is a “complier” • In practice, we may want to know what will be observed • if compliance is worse than in the trial (e.g. if rolled out in clinical practice) • if compliance is better than in the trial (e.g. because intervention is well publicised / marketed) This means we want to know the average causal effect in a different subgroup. We might assume this is the CACE – but it is an assumption
Summary of things we can estimate • ITT: E[Y|R=E] – E[Y|R=S] • PP: E[Y|R=E, DE=1] – E[Y|R=S, DS=0] • ACE/ATE: E[Y(1) – Y(0)] • ATT: E[Y(1) – Y(0) | DE=1] • CACE: E[Y(1) – Y(0) | DE=1, DS=0] We are going to explore ways to estimate the CACE
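To see how these quantities can differ, here is a minimal Python sketch that evaluates them exactly on a made-up population. All shares and potential-outcome means in `population` are hypothetical numbers chosen for illustration, not trial data, and the helper names are assumptions.

```python
# Hypothetical population, one row per compliance-type:
# (type, share, D if randomised to E, D if randomised to S, mean Y(0), mean Y(1))
population = [
    ("complier", 0.6, 1, 0, 16.0, 13.0),
    ("never-taker", 0.3, 0, 0, 13.0, 11.0),
    ("always-taker", 0.1, 1, 1, 18.0, 16.0),
]


def mean_outcome(rows, arm, received=None):
    """Mean observed outcome in one arm, optionally only among those with D == received."""
    total_share = 0.0
    total_y = 0.0
    for _name, share, d_e, d_s, y0, y1 in rows:
        d = d_e if arm == "E" else d_s
        if received is not None and d != received:
            continue
        total_share += share
        total_y += share * (y1 if d == 1 else y0)
    return total_y / total_share


def prop_received(rows, arm):
    """Proportion receiving the intervention in one arm."""
    return sum(share * (d_e if arm == "E" else d_s)
               for _name, share, d_e, d_s, _y0, _y1 in rows)


itt = mean_outcome(population, "E") - mean_outcome(population, "S")
pp = mean_outcome(population, "E", received=1) - mean_outcome(population, "S", received=0)
ace = sum(share * (y1 - y0) for _n, share, _de, _ds, y0, y1 in population)
cace = next(y1 - y0 for name, _s, _de, _ds, y0, y1 in population if name == "complier")
ratio = itt / (prop_received(population, "E") - prop_received(population, "S"))

print(itt, pp, ace, cace, ratio)
```

In this made-up population the ratio of ITT to the difference in receipt reproduces the complier effect, while the per-protocol contrast matches neither the CACE nor the ACE because the groups as treated mix different compliance-types.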
Principal stratification • An idea of Frangakis and Rubin (1999), generalising the simple compliance-types above • Again, let • DiE = treatment if randomised to intervention • DiS = treatment if randomised to control where both could be complex (e.g. numbers of sessions of psychotherapy) • Principal strata are the levels of the pair (DiE, DiS)
Using principal stratification • We should model outcomes conditional on principal strata • typically allow a different mean for each principal stratum – avoids assuming they are comparable • allow differences between randomised groups within principal strata • these parameters have a causal meaning • Of course this may not be easy, since for every individual we only know one of (DiE, DiS) so we don’t know their principal stratum
Example: ODIN trial • Trial of 2 psychological interventions to reduce depression (Dowrick et al, 2000) • Randomised individuals: • 236 to the psychological interventions (E) • 191 to treatment as usual (S) • Outcome: Beck Depression Inventory (BDI) at 6 months • recorded on 317 randomised individuals
ODIN trial: compliance • Of 236 individuals randomised to psychological interventions, 128 (54%) attended in full • others refused, did not attend or discontinued • Psychological interventions weren’t available to the control arm (no “contaminators”) so DS=0 for all • Only 2 principal strata: • would attend if randomised to intervention • DE=1, “compliers” • would not attend if randomised to intervention • DE=0, “never-takers”
Exclusion restriction • Key assumption used to identify the CACE • In individuals for whom randomisation has no effect on treatment (e.g. in never-takers and always-takers), randomisation has no effect on outcome • Often reasonable: e.g. in a double-blind drug trial, not taking active drug is the same as not taking placebo • But not always reasonable: e.g. not attending counselling despite being invited could be different from not attending because uninvited • “I wouldn’t have gone, but I’d like to have been invited”
Exclusion restriction in ODIN • In ODIN, the exclusion restriction means that randomisation has no effect on outcomes in those who would not attend if randomised to psychological intervention • But recall that we included those who discontinued as “non-attenders” • their partial attendance is very likely to have had some effect on them • the exclusion restriction would be more plausible if we defined compliance as any attendance • we’ll return to this later
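CACE analysis (2) • Complete cases: 177 in the intervention arm, 140 in the control arm • Intervention arm: 118 compliers (66.7% compliance, 118/177), mean BDI 13.32; 59 non-compliers, mean BDI 13.22 • Randomisation balance: the control arm contains 59*140/177 = 46.7 would-be non-compliers and 93.3 would-be compliers • Exclusion restriction: would-be non-compliers in the control arm have the same mean as non-compliers in the intervention arm (13.22), so the mean in the 93.3 would-be compliers is 16.13 • Complier-average causal effect: CACE = 13.32 – 16.13 = -2.81 (cf. ITT = 13.29 – 15.16 = -1.87) • Note: ITT / 0.667 = CACE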
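The arithmetic on this slide can be retraced with a short Python sketch. The function name `cace_from_strata` is hypothetical, and reading 13.22 as the non-complier mean is an assumption inferred from the slide's figures (it is the value that reproduces 16.13 and -2.81).

```python
def cace_from_strata(n_e, n_comp_e, mean_comp_e, mean_nc_e, n_s, mean_s):
    """CACE from arm-level summaries when the control arm cannot access treatment."""
    # Randomisation balance: same share of would-be non-compliers in both arms
    n_nc_s = (n_e - n_comp_e) * n_s / n_e
    n_comp_s = n_s - n_nc_s
    # Exclusion restriction: non-compliers have the same mean outcome in both arms
    mean_nc_s = mean_nc_e
    # The control-arm mean is a mixture of the two strata; solve for the complier mean
    mean_comp_s = (n_s * mean_s - n_nc_s * mean_nc_s) / n_comp_s
    return n_nc_s, n_comp_s, mean_comp_s, mean_comp_e - mean_comp_s


# ODIN complete cases; 13.22 is assumed to be the non-complier mean
n_nc_s, n_comp_s, mean_comp_s, cace = cace_from_strata(
    n_e=177, n_comp_e=118, mean_comp_e=13.32, mean_nc_e=13.22, n_s=140, mean_s=15.16
)
print(round(n_nc_s, 1), round(n_comp_s, 1), round(mean_comp_s, 2), round(cace, 2))
```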
CACE vs. PP • CACE is based on the “exclusion restriction” assumption • Per-protocol analysis estimates the CACE under the “random non-compliance” assumption
Instrumental variables (IV) • Popular in econometrics • Model: • Model of interest: Yi = a + b Di + ei • Error ei may be correlated with Di (“endogenous”) • Example in econometrics: D is years of education, Y is adult wage, e includes unobserved confounders • We can’t estimate b by ordinary linear regression • Instead, we assume error ei is independent of a third, instrumental variable Ri • i.e. Ri only affects outcome through its effect on Di • or: randomisation only affects outcome through its effect on treatment actually received
IV estimation • Estimation by “two-stage least squares”: model implies • E[Yi | Ri] = a + b E[Di | Ri] • so first regress Di on Ri to get E[Di | Ri] • then regress Yi on E[Di | Ri] • NB standard errors not quite correct by this method: general IV uses different standard errors • More generally, we use an estimating equation based on Σi Ri (Yi – a – b Di) = 0
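The two-stage procedure can be sketched in plain Python as below. The toy data are invented for illustration, the function names are assumptions, and only point estimates are computed (as the slide notes, naive second-stage standard errors would not be correct). With a single binary instrument the two-stage slope should equal the ratio of the difference in mean outcome to the difference in mean treatment receipt.

```python
def mean(values):
    return sum(values) / len(values)


def simple_regression(x, y):
    """Least-squares (intercept, slope) of y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(y, d, r):
    """Point estimates (a, b) for Y = a + b D + e with instrument R."""
    # Stage 1: regress treatment received D on randomised arm R
    a1, b1 = simple_regression(r, d)
    d_hat = [a1 + b1 * ri for ri in r]
    # Stage 2: regress outcome Y on the fitted values of D
    return simple_regression(d_hat, y)


def ratio_of_differences(y, d, r):
    """(Difference in mean Y) / (difference in mean D) between arms R=1 and R=0."""
    y1 = mean([yi for yi, ri in zip(y, r) if ri == 1])
    y0 = mean([yi for yi, ri in zip(y, r) if ri == 0])
    d1 = mean([di for di, ri in zip(d, r) if ri == 1])
    d0 = mean([di for di, ri in zip(d, r) if ri == 0])
    return (y1 - y0) / (d1 - d0)


# Invented toy data: R = randomised arm, D = treatment received, Y = outcome
r = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 0]
y = [10.0, 12.0, 11.0, 15.0, 16.0, 14.0, 15.0, 17.0]

a_hat, b_hat = two_stage_least_squares(y, d, r)
print(a_hat, b_hat, ratio_of_differences(y, d, r))
```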
Instrumental variables for ODIN

. ivreg bdi6 (treata=z)

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =     317
-------------+------------------------------           F(  1,   315) =    2.64
       Model | -58.5115086     1 -58.5115086           Prob > F      =  0.1049
    Residual |  32532.4232   315  103.277534           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  32473.9117   316  102.765543           Root MSE      =  10.163

------------------------------------------------------------------------------
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -2.803511   1.724143    -1.63   0.105    -6.195802    .5887801
       _cons |   15.15714   .8588927    17.65   0.000     13.46725    16.84703
------------------------------------------------------------------------------
Instrumented:  treata
Instruments:   z
------------------------------------------------------------------------------

Same estimate as before!
Easy to extend to include covariates

. ivreg bdi6 (treata=z) bdi0

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =     317
-------------+------------------------------           F(  2,   314) =   43.26
       Model |  6808.64828     2  3404.32414           Prob > F      =  0.0000
    Residual |  25665.2634   314  81.7365076           R-squared     =  0.2097
-------------+------------------------------           Adj R-squared =  0.2046
       Total |  32473.9117   316  102.765543           Root MSE      =  9.0408

------------------------------------------------------------------------------
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -3.428509   1.539881    -2.23   0.027    -6.458298   -.3987196
        bdi0 |   .5813933   .0630405     9.22   0.000     .4573581    .7054285
       _cons |   2.395561   1.546673     1.55   0.122    -.6475924    5.438714
------------------------------------------------------------------------------
Instrumented:  treata
Instruments:   bdi0 z
------------------------------------------------------------------------------

Usual gain in precision
Structural mean model (SMM) • Extends our simple model Yi(1) - Yi(0) = b • SMM is E[YiE - YiC | DiE, DiC, X] = b Di* • where Di* is a summary of treatment thought to have a causal effect, e.g.: • Di* = DiE – DiC: causal effect of treatment is proportional to amount of treatment • Di* = (DiE – DiC , Xi(DiE – DiC)): and X is an effect modifier • Goetghebeur and Lapp, 1997 (assumed DiC=0) • Estimation is equivalent to instrumental variables with R and R*X as instruments • in other words, we also assume that X does not modify the causal effect of treatment
Summary for binary compliance • The principal stratification approach divides individuals into always-takers, compliers and never-takers • We can then identify the complier-average causal effect, provided we make the exclusion restriction assumption • This works for binary or continuous outcomes • Instrumental variables and structural mean models approaches lead to the same estimates for continuous outcomes • For binary outcomes, instrumental variables are problematic, and generalised structural mean models are needed (Vansteelandt and Goetghebeur, 2003)
Example with missing outcome data • Our IV analyses of ODIN used complete cases only • This is a bad idea • Follow-up rates were worse in non-attenders (55%) than in attenders (92%) • So we modify the previous analysis • We will now assume the data are “missing at random” given randomised group and attendance • e.g. among non-attenders, there is no difference on average between non-responders and responders
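CACE analysis under MAR • All randomised: 236 in the intervention arm, 191 in the control arm • Intervention arm: 128 compliers, mean BDI 13.32; 108 non-compliers, mean BDI 13.22 • Randomisation balance: the control arm contains 108*191/236 = 87.4 would-be non-compliers and 103.6 would-be compliers • Exclusion restriction: would-be non-compliers in the control arm have the same mean as non-compliers in the intervention arm (13.22), so the mean in the 103.6 would-be compliers is 16.80 • CACE (MAR) = 13.32 – 16.80 = -3.48 • cf. CACE (CC) = 13.32 – 16.13 = -2.81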
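One way to check the MAR figure is to reweight the intervention-arm stratum means by the numbers randomised rather than the numbers responding, and then divide the resulting ITT difference by the compliance proportion. The Python sketch below does this. The function name `cace_under_mar` is hypothetical, reading 13.22 as the non-attender mean is an assumption inferred from the slide, and the control-arm mean is left at its complete-case value of 15.16 because no control participant could attend.

```python
def cace_under_mar(n_comp_e, n_nc_e, mean_comp_e, mean_nc_e, mean_s):
    """CACE assuming outcomes are missing at random given arm and attendance."""
    n_e = n_comp_e + n_nc_e
    # Weight each stratum mean by the number randomised, not the number followed up
    mean_e = (n_comp_e * mean_comp_e + n_nc_e * mean_nc_e) / n_e
    itt = mean_e - mean_s
    compliance = n_comp_e / n_e  # no attendance in the control arm
    return mean_e, itt, itt / compliance


# ODIN: 128 attenders and 108 non-attenders among 236 randomised to intervention
mean_e, itt_mar, cace_mar = cace_under_mar(128, 108, 13.32, 13.22, 15.16)
print(round(mean_e, 2), round(itt_mar, 2), round(cace_mar, 2))
```

The estimate moves further from zero than the complete-case value because non-attenders, who were under-represented among responders, now count for their full share of the arm.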
A more general approach • We can allow for missing data by using inverse probability weights • Suppose a certain group of individuals has only 50% chance of responding • give each responder in that group a weight of 2 • accounts for their non-responding fellows • In ODIN, we will consider the baseline-adjusted analysis • We will construct weights depending on baseline BDI, randomised group and attendance
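The weighting idea can be illustrated with a tiny made-up data set in Python. This sketch uses observed response rates within groups as the response probabilities, which is a simplification: the analysis in these slides estimates them from a logistic regression instead. The data and function names are hypothetical.

```python
def response_weights(groups, responded):
    """Weight each responder by 1 / (observed response rate in their group)."""
    counts = {}
    for g, r in zip(groups, responded):
        n, k = counts.get(g, (0, 0))
        counts[g] = (n + 1, k + r)
    # Non-responders get weight 0: they contribute no outcome data
    return [counts[g][0] / counts[g][1] if r else 0.0
            for g, r in zip(groups, responded)]


def weighted_mean(values, weights):
    """Weighted mean over records with positive weight (missing values are skipped)."""
    pairs = [(v, w) for v, w in zip(values, weights) if w > 0]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# Made-up data: group A has a 50% response rate, group B has 100%
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
responded = [1, 1, 0, 0, 1, 1, 1, 1]
outcome = [20.0, 16.0, None, None, 12.0, 10.0, 14.0, 12.0]

weights = response_weights(groups, responded)
complete_case_mean = weighted_mean(outcome, responded)
ipw_mean = weighted_mean(outcome, weights)
print(weights, complete_case_mean, ipw_mean)
```

Each responder in group A stands in for one non-responding fellow, so the weights sum to the total number of individuals and the weighted mean shifts towards group A's outcomes.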
Constructing the weights

. logistic resp6 z treata bdi0

Logistic regression                               Number of obs   =        427
                                                  LR chi2(3)      =      49.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -218.70364                       Pseudo R2       =     0.1023

------------------------------------------------------------------------------
       resp6 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .4327186   .1102412    -3.29   0.001     .2626333    .7129535
      treata |    10.1753   3.909568     6.04   0.000     4.791789    21.60713
        bdi0 |   .9750455   .0136551    -1.80   0.071     .9486461     1.00218
------------------------------------------------------------------------------

. predict presp
(option pr assumed; Pr(resp6))

. gen wt=1/presp
Examining the weights • [Figure: weights shown for three groups: therapy, non-compliers; control; therapy, compliers]
Weighted IV analysis

. ivreg bdi6 (treata=z) bdi0 [pw=wt]
(sum of wgt is   4.2710e+02)

Instrumental variables (2SLS) regression               Number of obs =     317
                                                       F(  2,   314) =   37.28
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2183
                                                       Root MSE      =  9.0521

------------------------------------------------------------------------------
             |               Robust
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -3.953868   1.944846    -2.03   0.043    -7.780444   -.1272916
        bdi0 |   .5810663   .0680343     8.54   0.000     .4472056     .714927
       _cons |    2.37602   1.554941     1.53   0.128    -.6834003    5.435441
------------------------------------------------------------------------------
Instrumented:  treata
Instruments:   bdi0 z
------------------------------------------------------------------------------
Back to the exclusion restriction • Recall that partial attenders were included as non-compliers • If instead we include them as compliers, the exclusion restriction is much more plausible • The estimated causal effect is smaller because it is an average over a wider group that includes partial compliers
Example with continuous compliance: the SoCRATES trial • SoCRATES was a multi-centre RCT designed to evaluate the effects of cognitive behaviour therapy (CBT) and supportive counselling (SC) on the outcomes of an early episode of schizophrenia. • 201 participants were allocated to one of three groups: • Control: Treatment as Usual (TAU) • Treatment: TAU plus psychological intervention, either CBT + TAU or SC + TAU • The two treatment groups are combined in our analyses • Outcome: psychotic symptoms score (PANSS) at 18 months