Estimating a causal effect using observational data

Estimating a causal effect using observational data Aad van der Vaart Afdeling Wiskunde, Vrije Universiteit Amsterdam Joint with Jamie Robins, Judith Lok, Richard Gill

CAUSALITY Operational Definition: If individuals are randomly assigned to a treatment and control group, and the groups differ significantly after treatment, then the treatment causes the difference We want to apply this definition with observational data

Counter factuals treatment indicator A  {0,1} outcome Y Given observations (A, Y) for a sample of individuals, mean treatment effect might be defined as E( Y | A=1 ) – E( Y | A=0 ) However, if treatment is not randomly assigned this is NOT what we want to know

Counter factuals (2) treatment indicator A  {0,1} outcome Y outcome Y1 if individual had been treated outcome Y0 if individual had not been treated mean treatment effect E Y1 – E Y0 Unfortunately, we observe only one of Y1and Y0, namely: Y= YA

Counter factuals (3) ASSUMPTION: there exists a measured covariate Z with A  (Y0, Y1 ) given Z Under ASSUMPTION: E Y1 – E Y0 =  {E (Y | A=1, Z=z) - E (Y | A=1, Z=z) } dPZ(z) CONSEQUENCE: underASSUMPTION the mean treatment effect is estimable from the observed data (Y,Z,A) ASSUMPTION is more likely to hold if Z is “bigger”  means “are statistically independent”

Longitudinal Data times: t0 < t1 < . . . . < tK treatments: a = (a0, a1, . . . , aK ) observed treatments: A = (A0, A1, . . . , AK ) counterfactual outcomes: Ya observed outcome: YA We are interested inE Yafor certaina

Longitudinal Data (2) times: t0 < t1 < . . . . < tK treatments: a = (a0, a1, . . . , aK ) observed treatments: A = (A0, A1, . . . , AK ) observed covariates: Z = (Z0, Z1, . . . , ZK ) ASSUMPTION: Ya  Ak given ( Zk , Ak-1 ), for all k Under ASSUMPTION E Ya can be expressed in the distribution of the observed data (Y, Z, A ) “It is the task of an epidemiologist to collect enough information so that ASSUMPTION is satisfied”

Estimation and Testing • Under ASSUMPTION it is possible, in principle • to test whether treatment has effect • to estimate the mean counterfactual treatment effects A standard statistical approach would be to model and estimate all unknowns. However there are too many. We look for a “semiparametric approach” instead.

Shift function The quantile-distribution shift function is the (only monotone) function that transforms a variable “distributionally” into another variable. It is convenient to model a change in distribution.

treatment until time k outcome of this treatment shift map corresponding to these distributions, transforms into IDEA: modelby a parameter and estimate it Structural Nested Models

treatment until time k outcome of this treatment transforms into negative effect no effect tk-1 tk time Structural Nested Models (2) positive effect no effect negative effect

Under ASSUMPTION: • is distributed as • Make regression model for • Make model for g • Add as explanatory variable • Estimate g by the value such that does NOT add explanatory value. Estimation

Estimation (2) Example: if treatment A is binary, then we might use a logistic regression model We estimate (a , b, d ) by standard software for given g. The “true” g is the one such that the estimated d is zero. We can also test whether treatment has an effect at all by testing H0: d=0 in this model with Y instead of Yg .

End Lok, Gill, van der Vaart, Robins, 2004, Estimating the causal effect of a time-varying treatment on time-to-event using structural nested failure time models Lok, 2001 Statistical modelling of causal effects in time Proefschrift, Vrije Universiteit

Estimating a causal effect using observational data

Estimating a causal effect using observational data

Presentation Transcript

ESG Observational Data Integration

Estimating Causal Effects from Large Data Sets Using Propensity Scores

Econometrics with Observational Data

Causal Inference for Complex Observational Data Using Stata

Observational Data

Modeling with Observational Data

Causal Data Mining

Learning Causal Structure from Observational and Experimental Data

Causal diagrams, the placebo effect, and the expectation effect

Econometrics with Observational Data

Estimating Parameters Using Measured Data

Chapter 6 Estimating Parameters From Observational Data

Complier Average Causal Effect analysis

Case Study: Causal Discovery Methods Using Causal Probabilistic Networks

Near Surface Soil Moisture Estimating using Satellite Data

Estimating Causal Effects with Experimental Data

Searching Simulation and Observational Climate Data Using OGC CSW

Cause and effect (aka causal analysis)

Model vs. observational data

Causal Data Mining

Enhancing causal influence (in observational studies )