Strengthening Causal Inference in HIV Studies: Introduction and Practical Examples

CAPS Methods Core Presentation, April 18, 2012
Starley Shade, Sheri Lippman, Mi-Suk Kang Dufour & Carol Camlin

Outline
• Answering causal questions: common roadblocks in HIV research
• Causal inference framework and overview of methods
• Concrete example: using treatment and censoring weighting in Prevention with Positives
• Concrete example: G-computation for population-level attributable risk in the SHAZ! study
• Q & A

Common roadblocks in HIV research: selection bias / who gets exposed
• Population surveillance and surveys in probability-based samples
  • Study participants (in testing, in survey research, etc.) almost always differ systematically from non-participants
• Observational studies
  • Using ‘comparison’ clinics or communities: systematic differences between study arms exist and/or may accrue over time

Common roadblocks in HIV research: loss to follow-up
• Cohort studies of HIV+ individuals are highly susceptible to loss to follow-up
  • >20% after 2 years, in resource-poor settings: medical records don’t capture patient mobility
  • Death registries are rarely available, and those who die are mistakenly assumed to be lost to follow-up
• Those who drop out are systematically different from those who stay engaged in care

Common roadblocks in HIV research: time-dependent confounding
[Figure: causal diagram with nodes C (&U), Expos, and STI at times 0 through 3]
• Time-dependent confounding arises if C is related to prior exposure and affects subsequent exposure
• C = group of confounders
• U = unmeasured confounders

Common roadblocks in HIV research: complex, multi-component intervention studies
• Increasing calls for comprehensive HIV prevention interventions addressing multiple levels and domains of influence on individual behavior
• Evaluation of such studies is hampered by:
  • Diverse levels of exposure to individual intervention components
  • Difficulty distinguishing the relative contributions of individual intervention components to observed outcomes

Mending our comparison: the causal/counterfactual framework
• “We may define a cause to be an object followed by another… where, if the first object had not been, the second never had existed” (Hume 1748)
• An association can be considered causal when, if the exposure had been altered, the outcome would have been different
• The key part is the counterfactual element: reference to what would have happened if, contrary to fact, the exposure had been something other than what it actually was

Counterfactual framework
• The “ideal experiment” illustrates the framework
  • A hypothetical study which, if we could actually conduct it, would allow us to infer causality
• Ideal experiment:
  • A person or population experiences one exposure and is observed for the outcome over a given time period
  • Roll back the clock
  • Change the exposure but leave everything else the same, and observe for the outcome over the same time period

Counterfactual framework
[Figure: observed timeline for Person A: AIDS, then ART, then death at time t]
Counterfactual question: how long would Person A have survived if he/she had not received treatment?

Counterfactual framework
[Figure: two timelines for Person A. Observed: AIDS, ART, Death_t. Unobserved: AIDS, no ART, Death_nt]

Counterfactuals: specifying what we really want to know
• It is instructive to think of the counterfactual outcome(s) as something we are missing and are trying to estimate when we analyze HIV studies or any epidemiologic data
  • Akin to a missing data problem
• When we compare groups of people observed as exposed or unexposed, we want to compare groups that best estimate the counterfactual outcomes that are unobserved or missing

Notation for presentation
[Figure: diagram relating A, Y, and W, L]
• A = treatment
• Y = outcome
• W = confounders (point treatment)
• L = confounders (longitudinal)
• The likelihood of the data simplifies to:
  • L(O) = P(Y | A, W, L) P(A | W, L)

Rationale for causal inference approach
• Basic regression models produce stratum-specific, or conditional, estimates (i.e., “while holding constant a set of covariates”)
  [Equation not preserved: conditional regression model], where Y is the outcome, A is the observed exposure, and L is a matrix of time-dependent covariates
• Therefore, our estimates of effect are also conditional
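The regression equation on this slide did not survive in the text. One plausible form, assuming a generalized linear model with a link function $g$ (the link and the coefficient names are assumptions, not taken from the slide), is:

$$
g\big(E[Y \mid A, L]\big) = \beta_0 + \beta_1 A + \beta_2^{\top} L
$$

Under this form, $\beta_1$ describes the association between $A$ and $Y$ within strata of $L$, which is why the resulting effect estimate is conditional.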

Rationale for causal inference approach
• Causal inference approaches help us model our way back to the ideal (counterfactual) experiment
  [Equation not preserved: marginal model for the counterfactual outcome], where Y is the outcome and a is the counterfactual exposure under which all individuals are exposed (a=1) or unexposed (a=0)
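The model on this slide was also lost in extraction. A common way to write a marginal model for the counterfactual outcome $Y_a$, again assuming a link function $g$ and illustrative coefficient names, is:

$$
g\big(E[Y_a]\big) = \beta_0 + \beta_1 a, \qquad a \in \{0, 1\}
$$

Here $\beta_1$ contrasts the whole population under exposure ($a=1$) with the same population under no exposure ($a=0$), rather than comparing subjects within covariate strata.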

Inverse Probability Weighting

Inverse Probability of Treatment Weighting (IPTW)
• Re-create the counterfactual data set by weighting
• IPTW assigns each subject a weight equal to the inverse of the probability of being in their observed exposure group at each interval
• The treatment model is based on values of past and current covariates (L(j)) and past exposures (A(j-1))
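One way to write the weight described above is sketched below. It assumes K follow-up intervals indexed by j, uses overbars to denote history through the given interval, and shows the unstabilized form only; none of this notation beyond L(j) and A(j-1) comes from the slide.

$$
w_i = \prod_{j=0}^{K} \frac{1}{P\big(A(j) = a_i(j) \mid \bar{A}(j-1) = \bar{a}_i(j-1),\ \bar{L}(j) = \bar{l}_i(j)\big)}
$$

Each factor is the inverse of the modeled probability that subject i received the exposure actually observed at interval j, so the product grows for subjects whose exposure history was unlikely given their covariates.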

Inverse Probability of Treatment Weighting (IPTW)
• The treatment weights are applied to the observed population (e.g., in a weighted logistic regression)
• This creates a pseudo-population in which the distribution of confounders is balanced between the two exposure groups, essentially mimicking a randomized trial
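As a minimal sketch of the weighting idea for a single time point, the Python below uses hypothetical toy counts (not study data) with one binary confounder `w`, a binary exposure `a`, and a binary outcome `y`. It estimates P(A = a | W = w) from stratum frequencies instead of a fitted treatment model, an assumption made to keep the example free of dependencies. All function and variable names are illustrative.

```python
from collections import defaultdict

# Hypothetical toy counts, not study data: (w, a, y) -> number of subjects.
# w = binary confounder, a = binary exposure, y = binary outcome.
TOY_COUNTS = {
    (0, 0, 1): 4, (0, 0, 0): 4,   # W=0, unexposed: risk 0.50
    (0, 1, 1): 1, (0, 1, 0): 3,   # W=0, exposed:   risk 0.25
    (1, 0, 1): 3, (1, 0, 0): 1,   # W=1, unexposed: risk 0.75
    (1, 1, 1): 4, (1, 1, 0): 4,   # W=1, exposed:   risk 0.50
}
rows = [key for key, n in TOY_COUNTS.items() for _ in range(n)]


def iptw_weights(rows):
    """Weight each subject by 1 / P(A = observed a | W = observed w)."""
    n_w = defaultdict(int)
    n_wa = defaultdict(int)
    for w, a, _ in rows:
        n_w[w] += 1
        n_wa[(w, a)] += 1
    # n_w / n_wa is the inverse of the stratum-specific exposure probability.
    return [n_w[w] / n_wa[(w, a)] for w, a, _ in rows]


def weighted_mean(values, weights):
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)


def risk_in_arm(rows, weights, arm):
    """Weighted outcome risk among subjects observed in exposure group `arm`."""
    sel = [(y, wt) for (_, a, y), wt in zip(rows, weights) if a == arm]
    return weighted_mean([y for y, _ in sel], [wt for _, wt in sel])


def confounder_prevalence(rows, weights, arm):
    """Weighted proportion with W = 1 in exposure group `arm`."""
    sel = [(w, wt) for (w, a, _), wt in zip(rows, weights) if a == arm]
    return weighted_mean([w for w, _ in sel], [wt for _, wt in sel])


unit = [1.0] * len(rows)
weights = iptw_weights(rows)
crude_rd = risk_in_arm(rows, unit, 1) - risk_in_arm(rows, unit, 0)
iptw_rd = risk_in_arm(rows, weights, 1) - risk_in_arm(rows, weights, 0)
print(f"crude risk difference: {crude_rd:.3f}")
print(f"IPTW risk difference:  {iptw_rd:.3f}")
```

In the toy data the confounder is more common among the exposed, so the crude contrast understates the protective effect. After weighting, `confounder_prevalence` is the same in both arms, which is the balance property the slide describes.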

Inverse Probability of Censoring Weighting (IPCW)
• Like IPTW, IPCW assigns each subject a weight equal to the inverse of the probability of remaining in the study at each interval, based on values of observed covariates and past outcomes and exposures
• The censoring weights are applied to the observed population, creating a pseudo-population in which censored subjects are “replaced” by up-weighting uncensored subjects with the same values of past exposures and covariates
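The "replacement" of censored subjects can be illustrated with a small sketch. It assumes a single follow-up interval, one binary baseline covariate, and retention probabilities estimated from stratum frequencies; the cohort is hypothetical and the names are illustrative.

```python
from collections import defaultdict

# Hypothetical toy cohort, not study data. Each tuple is
# (cov, retained, y): cov = binary baseline covariate, retained = 1 if the
# subject completed follow-up, y = outcome (None when censored).
cohort = (
    [(0, 1, 1)] * 2 + [(0, 1, 0)] * 6 + [(0, 0, None)] * 2    # cov=0: 8 of 10 retained
    + [(1, 1, 1)] * 3 + [(1, 1, 0)] * 1 + [(1, 0, None)] * 6  # cov=1: 4 of 10 retained
)


def ipcw_weights(cohort):
    """Return one weight per retained subject: 1 / P(retained | cov)."""
    n_cov = defaultdict(int)
    n_retained = defaultdict(int)
    for cov, retained, _ in cohort:
        n_cov[cov] += 1
        n_retained[cov] += retained
    return [n_cov[cov] / n_retained[cov] for cov, retained, _ in cohort if retained]


observed_y = [y for _, retained, y in cohort if retained]
weights = ipcw_weights(cohort)

complete_case_risk = sum(observed_y) / len(observed_y)
ipcw_risk = sum(y * wt for y, wt in zip(observed_y, weights)) / sum(weights)
print(f"complete-case risk: {complete_case_risk:.3f}")
print(f"IPCW risk:          {ipcw_risk:.3f}")
```

The weights of the retained subjects sum to the size of the original cohort, so each retained subject also stands in for dropouts who share the same covariate value. This only removes bias if dropout is unrelated to the outcome within covariate strata, which the toy data assume by construction.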

Example: Prevention with Positives Demonstration Projects
• Fifteen HRSA-funded demonstration projects implemented prevention with positives in clinical settings
• Each site decided whether to randomize patients to:
  • Provider-delivered intervention vs. assessment
  • Specialist-delivered intervention vs. assessment
  • Mixed intervention vs. provider intervention
• How do we assess the effectiveness of each intervention type?

Example: Prevention with Positives Patient characteristics

Example: Prevention with Positives Retention
• At the 12-month follow-up assessment:
  • 58% of patients were retained in the standard of care group
  • 76% were retained in the provider intervention sites
  • 62% were retained in the specialist sites
  • 44% were retained in the mixed intervention sites
• There were differences in retention by patient characteristics
  • Older, white, gay males with more than a high school education who did not use cocaine or injection drugs were more likely to be retained in the study at 12 months

Example: Prevention with Positives Risk Behavior

Example: Prevention with Positives Analysis • Inverse probability of treatment weights

Example: Prevention with Positives Analysis • Inverse probability of censoring weights • Weighted logistic regression
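The formulas on the two analysis slides were not preserved. As a rough sketch of how the two sets of weights are commonly combined, the Python below multiplies a treatment weight and a censoring weight for each retained subject and computes a weighted odds ratio. The rows and weight values are hypothetical and are not results from the demonstration projects. For a logistic model whose only term is a binary exposure, the weighted regression coefficient equals the log of this weighted odds ratio, so the sketch avoids a model-fitting library.

```python
def weighted_odds_ratio(rows):
    """Odds ratio from a 2x2 table whose cells are sums of combined weights.

    Each row is (a, y, w_treat, w_cens) for one uncensored subject:
    a = exposure, y = outcome, and the two inverse probability weights.
    """
    cell = {(a, y): 0.0 for a in (0, 1) for y in (0, 1)}
    for a, y, w_treat, w_cens in rows:
        cell[(a, y)] += w_treat * w_cens  # combined weight = product
    odds_exposed = cell[(1, 1)] / cell[(1, 0)]
    odds_unexposed = cell[(0, 1)] / cell[(0, 0)]
    return odds_exposed / odds_unexposed


# Hypothetical subjects with made-up weights, for illustration only.
rows = [
    (1, 1, 2.0, 1.0),
    (1, 0, 2.0, 1.0),
    (1, 0, 1.0, 2.0),
    (0, 1, 1.0, 2.0),
    (0, 1, 2.0, 2.0),
    (0, 0, 1.0, 1.0),
    (0, 0, 2.0, 1.0),
]
unweighted = [(a, y, 1.0, 1.0) for a, y, _, _ in rows]

or_weighted = weighted_odds_ratio(rows)
or_unweighted = weighted_odds_ratio(unweighted)
print(f"unweighted OR: {or_unweighted:.2f}, weighted OR: {or_weighted:.2f}")
```

In practice the weights would come from fitted treatment and censoring models, and standard errors would need to account for the weighting (for example with robust variance estimates), which this sketch does not attempt.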

Example: Prevention with Positives Results

G-computation and Population Intervention Models

G-computation
• Sometimes called the substitution estimation approach
• The G-computation approach is to model the exposure and outcome relationship and then “control” exposure in the population by substituting counterfactual exposures into the model
• Population intervention models use this approach to answer practical questions

Population Intervention Models
• Standard regression models give a conditional estimate: [equation not preserved]
• Marginal structural models allow a total effect estimate: [equation not preserved]
• For interventions, what we care about is the population difference when the intervention is present or absent: [equation not preserved]
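The three expressions on this slide were lost in extraction. Written in the notation used elsewhere in the talk, and assuming a binary exposure and a contrast on the additive scale, the three targets can be sketched as:

$$
\begin{aligned}
\text{conditional:}\quad & E[Y \mid A=1, W] - E[Y \mid A=0, W] \\
\text{marginal (total effect):}\quad & E[Y_1] - E[Y_0] \\
\text{population intervention:}\quad & E[Y_a] - E[Y]
\end{aligned}
$$

The last line compares the mean outcome if everyone were set to exposure level $a$ with the mean outcome actually observed, which matches the final estimation step described for population intervention models.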

Analogous to Attributable Risk
• Traditional population attributable risk or attributable fraction:
  • The proportion of the disease risk in the total population associated with the exposure
• This assumes the exposure causes the outcome and that there are no other causes, i.e., in the absence of that exposure there would be no outcome
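The slide's formula is missing from the text. The usual textbook definitions, written here with the talk's symbols for a binary outcome Y and binary exposure A (an assumption about which form the slide showed), are:

$$
\mathrm{PAR} = P(Y=1) - P(Y=1 \mid A=0), \qquad
\mathrm{PAF} = \frac{P(Y=1) - P(Y=1 \mid A=0)}{P(Y=1)}
$$

Both use the risk among the unexposed as a stand-in for the risk the whole population would have without the exposure, which is only justified when exposed and unexposed groups are otherwise comparable.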

Why PIMs?
• We are rarely looking at outcomes with only one important predictor/confounder
  • PIMs allow assessment of effect averaged across covariates
• We are rarely able to completely eliminate a risk factor from a population
  • PIMs allow estimation for realistic interventions

Population Intervention Models: estimation
1) Estimate outcome model
2) Create new dataset, setting covariate(s) of interest to intervention levels
3) Predict outcome of interest using the model estimated in step 1
4) Calculate the difference between predicted mean outcome and observed mean outcome
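The four steps above can be sketched in a few lines of Python. This version assumes a binary risk factor `a`, one binary covariate `w`, and a saturated outcome model (cell means) in place of a fitted regression; the counts are hypothetical and are not SHAZ! data.

```python
from collections import defaultdict

# Hypothetical toy counts, not SHAZ! data: (w, a, y) -> number of subjects.
# w = binary covariate, a = binary risk factor, y = binary outcome.
TOY_COUNTS = {
    (0, 0, 1): 2, (0, 0, 0): 6,   # W=0, A=0: risk 0.25
    (0, 1, 1): 2, (0, 1, 0): 2,   # W=0, A=1: risk 0.50
    (1, 0, 1): 2, (1, 0, 0): 2,   # W=1, A=0: risk 0.50
    (1, 1, 1): 6, (1, 1, 0): 2,   # W=1, A=1: risk 0.75
}
rows = [key for key, n in TOY_COUNTS.items() for _ in range(n)]


def fit_outcome_model(rows):
    """Step 1: estimate E[Y | A, W] as the mean outcome in each (w, a) cell."""
    total = defaultdict(float)
    count = defaultdict(int)
    for w, a, y in rows:
        total[(w, a)] += y
        count[(w, a)] += 1
    return {cell: total[cell] / count[cell] for cell in count}


def population_intervention_effect(rows, set_a_to):
    model = fit_outcome_model(rows)
    # Step 2: copy the data with the risk factor set to the intervention level.
    intervened = [(w, set_a_to) for w, _, _ in rows]
    # Step 3: predict each subject's outcome under the intervention.
    predicted = [model[cell] for cell in intervened]
    # Step 4: predicted mean outcome minus observed mean outcome.
    observed_mean = sum(y for _, _, y in rows) / len(rows)
    predicted_mean = sum(predicted) / len(predicted)
    return predicted_mean - observed_mean


effect = population_intervention_effect(rows, set_a_to=0)
print(f"change in mean outcome if the risk factor were removed: {effect:.3f}")
```

Because the comparison is against the observed mean, the result reflects how common the risk factor is in this population as well as how strongly it is related to the outcome.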

Example: SHAZ! study
• SHAZ! (Shaping the Health of Adolescents in Zimbabwe)
• Enrolled adolescent orphan girls ages 16 to 19
• Overall project was designed as an HIV prevention intervention based on provision of reproductive health services, economic livelihoods training and life-skills education

Example: SHAZ! study
• Using baseline data to look at a secondary outcome
• Interested in the potential of interventions to improve mental health for adolescent orphan girls
• Several structural factors considered as potentially modifiable with intervention

[Figure: conceptual diagram of factors related to baseline mental health status (SSQ). Labels, in the order extracted: Social environment; Female caregiver relationship; Social support; Exposure to violence; Feeling safe at home; Caring for ill person; Orphaning; Age at orphaning; Socioeconomic status; Food security; Ability to pay for medication; Ever homeless; Changes in household; Completed education; Psychological distress (unmeasured); Baseline self-efficacy; Poor physical health; General health status; Viral infection; Baseline mental health status (SSQ)]

PIM question: What is the potential impact on this population’s mental health status of intervening on these factors?

Traditional regression results

Extension of this approach to longitudinal context
[Figure: timeline diagram. Covariates: baseline, 6 month, 12 month, 18 month. Intervention participation: life-skills, Red Cross; start vocational training; finish vocational training; receive grant. Mental health: baseline, 6, 12, 18, and 24 months. Horizontal axis: time]

Question: Does poor mental health status affect participation in the intervention over time?

Analytic approach
• Interested in the effect of exposure (A) on outcome (Y) given covariates and past exposure and outcome
• $E_W\big[E_0(Y \mid A=1, W) - E_0(Y \mid A=0, W)\big]$
• where W includes past exposure and outcome and other covariates

Analytic approach cont. Fit a series of point treatment models for outcomes at timepoints following exposure(s) of interest

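Example 1:
[Figure: the first part of the longitudinal diagram with roles marked]
• A (exposure): baseline mental health
• Y (outcome): intervention participation (life-skills, Red Cross)
• W (covariates): baseline covariates
• Shown without a role label: 6 month covariates, start of vocational training, mental health at 6 months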

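Example 2:
[Figure: the first part of the longitudinal diagram with roles marked]
• A (exposure): mental health at 6 months
• Y (outcome): intervention participation (start vocational training)
• W (covariates): baseline covariates, 6 month covariates, Red Cross participation, baseline mental health
• Shown without a role label: life-skills participation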
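The idea of re-using one point treatment estimator with different role assignments, as in Examples 1 and 2, can be sketched as follows. The cohort is generated deterministically and is entirely hypothetical (it is not SHAZ! data). The variable names `b`, `mh0`, `mh6`, `life`, and `voc` are illustrative, the outcome model is a saturated one (stratum means), and treating life-skills participation as part of W in the second call is an assumption.

```python
from collections import defaultdict


def make_toy_cohort():
    """Build a deterministic, hypothetical cohort (not SHAZ! data).

    b = baseline covariate, mh0 / mh6 = poor mental health at baseline and
    6 months, life = life-skills participation, voc = started vocational
    training. All variables are binary.
    """
    rows = []
    for b in (0, 1):
        for mh0 in (0, 1):
            n = 400 if b == mh0 else 200          # b is associated with mh0
            n_life = n * (6 - 2 * mh0 - b) // 10  # P(life=1) = 0.6 - 0.2*mh0 - 0.1*b
            for life, n_sub in ((1, n_life), (0, n - n_life)):
                for mh6 in (0, 1):
                    size = n_sub // 2             # half of each group has mh6 = 1
                    # P(voc=1) = 0.5 - 0.1*mh6 + 0.2*life
                    n_voc = size * (5 - mh6 + 2 * life) // 10
                    for i in range(size):
                        rows.append({"b": b, "mh0": mh0, "life": life,
                                     "mh6": mh6, "voc": int(i < n_voc)})
    return rows


def point_treatment_effect(rows, a, y, w):
    """Estimate E_W[E(Y | A=1, W) - E(Y | A=0, W)] from stratum means."""
    total = defaultdict(float)
    count = defaultdict(int)
    for r in rows:
        key = (tuple(r[k] for k in w), r[a])
        total[key] += r[y]
        count[key] += 1
    effect_sum = 0.0
    for r in rows:
        stratum = tuple(r[k] for k in w)
        if count[(stratum, 1)] == 0 or count[(stratum, 0)] == 0:
            raise ValueError(f"positivity violated in stratum {stratum}")
        effect_sum += (total[(stratum, 1)] / count[(stratum, 1)]
                       - total[(stratum, 0)] / count[(stratum, 0)])
    # Averaging over all rows averages over the observed distribution of W.
    return effect_sum / len(rows)


cohort = make_toy_cohort()
# Example 1 roles: A = baseline mental health, Y = life-skills participation.
effect_1 = point_treatment_effect(cohort, a="mh0", y="life", w=["b"])
# Example 2 roles: A = mental health at 6 months, Y = start vocational training.
effect_2 = point_treatment_effect(cohort, a="mh6", y="voc", w=["b", "mh0", "life"])
print(f"Example 1 effect: {effect_1:.3f}")
print(f"Example 2 effect: {effect_2:.3f}")
```

The same function serves both questions; only the assignment of variables to A, Y, and W changes, with earlier exposures and outcomes moving into W as the time point of interest moves forward.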

Assumptions and Limitations

Assumptions
• No unmeasured confounding
  • There is no way to empirically test for no unmeasured confounding
  • Collection of data on a complete set of covariates should be incorporated in the design phase
• Experimental Treatment Assignment (ETA) or positivity
  • Groups defined by all possible combinations of covariates must have the potential to be in any (either) treatment group
  • If there are covariate groups that will only be observed in one treatment state, then we cannot estimate the effect of the exposure within that group
• Time-ordering (temporality)
  • Need to be certain that covariates used in the treatment weights were measured prior to treatment, and that treatment is prior to outcome
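Unlike unmeasured confounding, the positivity assumption can be partly checked in the data. A minimal sketch, assuming categorical covariates and hypothetical records, is to tabulate which treatment levels were observed in each covariate stratum:

```python
from collections import defaultdict


def positivity_violations(rows, treatment_levels=(0, 1)):
    """Return {covariate stratum: missing treatment levels} for strata in
    which at least one treatment level was never observed."""
    seen = defaultdict(set)
    for stratum, treatment in rows:
        seen[stratum].add(treatment)
    return {
        stratum: sorted(set(treatment_levels) - levels)
        for stratum, levels in seen.items()
        if set(treatment_levels) - levels
    }


# Hypothetical records: ((age_group, sex), treatment).
rows = [
    (("young", "F"), 0), (("young", "F"), 1),
    (("young", "M"), 1), (("young", "M"), 1),   # never observed untreated
    (("old", "F"), 0), (("old", "F"), 1),
    (("old", "M"), 0),                          # never observed treated
]
violations = positivity_violations(rows)
for stratum, missing in violations.items():
    print(stratum, "has no subjects with treatment level(s)", missing)
```

An empty result only shows that every observed stratum contains both treatment states. Very small cells can still produce extreme weights, so inspecting the distribution of the estimated weights remains worthwhile.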

Acknowledgements
Thanks to:
• Alan Hubbard, UCB
• Mark van der Laan, UCB
• Jennifer Ahern, UCB
