Sequential Imputations and Bayesian Approaches to Missing Data Problems

SEQUENTIAL IMPUTATIONS AND BAYESIAN MISSING DATA PROBLEMS • AUGUSTINE KONG, JUN LIU, WING HUNG WONG • JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, MARCH 1994, VOL. 89, NO. 425

Setting • X = (x_1, …, x_n) = (y_1, z_1, …, y_n, z_n) = (Y, Z), where y_t is the observed part and z_t the missing part of observation x_t • If observation t is complete, then y_t = x_t

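Goal • Estimate the posterior distribution of the parameter θ given the observed data Y: p(θ|Y) = ∫ p(θ|Y, Z) p(Z|Y) dZ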

Importance sampling • Draw m independent copies Z(1), …, Z(m) of Z from the conditional distribution p(Z|Y), then approximate the posterior of the parameter θ by p(θ|Y) ≈ (1/m) Σ_{j=1..m} p(θ|Y, Z(j))

Problem • Drawing from p(Z|Y) directly is usually difficult • The Gibbs sampler and data augmentation do this only approximately, by iteration

Sequential Imputation • Step 1: • For t = 1, …, n, draw z_t* from the conditional distribution p(z_t | y_1, z_1*, …, y_{t-1}, z_{t-1}*, y_t) • The z_t* must be drawn sequentially, because each z_t* is drawn conditional on the previously imputed missing parts z_1*, …, z_{t-1}*

Sequential Imputation • Step 2: • Compute the predictive probabilities p(y_t | y_1, z_1*, …, y_{t-1}, z_{t-1}*) and update the weight • w_t = w_{t-1} p(y_t | y_1, z_1*, …, y_{t-1}, z_{t-1}*), starting from w_1 = p(y_1) • Let w = w_n, so that • w = p(y_1) ∏_{t=2..n} p(y_t | y_1, z_1*, …, y_{t-1}, z_{t-1}*)
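
One way to see why the accumulated product acts as an importance weight is sketched below. The symbol p*(Z|Y), for the distribution the imputations are actually drawn from in Step 1, is introduced here for illustration only. Factor the joint density of (Y, Z) one observation at a time; the t = 1 predictive term is simply p(y_1).

```latex
\begin{align*}
p(Y, Z) &= \prod_{t=1}^{n}
   p(y_t \mid y_1, z_1, \dots, y_{t-1}, z_{t-1})\,
   p(z_t \mid y_1, z_1, \dots, y_{t-1}, z_{t-1}, y_t) \\
        &= w(Z)\, p^{*}(Z \mid Y), \\[4pt]
p(Z \mid Y) &= \frac{p(Y, Z)}{p(Y)}
             = \frac{w(Z)}{p(Y)}\, p^{*}(Z \mid Y).
\end{align*}
```

The weight w is therefore proportional to the ratio p(Z|Y) / p*(Z|Y). The unknown constant p(Y) cancels once the weights are normalized by their sum.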

Sequential Imputation • Step 1 and Step 2 are repeated independently m times • Denote the results by Z*(1), Z*(2), …, Z*(m) and w(1), w(2), …, w(m), where Z*(j) = (z_1*(j), …, z_n*(j)) for j = 1, …, m

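Sequential Imputation • Posterior distribution: p(θ|Y) ≈ (1/W) Σ_{j=1..m} w(j) p(θ|Y, Z*(j)), where W = w(1) + … + w(m) and θ is the parameter of interest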
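
The two steps and the weighted posterior estimate can be sketched in Python on a toy model. Everything specific here is an assumption made for illustration, not taken from the slides: each observation is a cell (a, b) of a 2×2 table, the cell probabilities θ have a Dirichlet prior, and b is sometimes missing. The function names `sequential_imputation` and `posterior_mean` are hypothetical. Under this model both the predictive probability of y_t and the conditional distribution of z_t are ratios of running counts, so each step has a closed form.

```python
import numpy as np


def sequential_imputation(rows, cols, prior, m, rng):
    """Run m independent sequential imputations on a toy 2x2 table model.

    rows[t] is always observed (0 or 1); cols[t] is 0, 1, or None if missing.
    prior is a 2x2 array of Dirichlet pseudo-counts (an assumed toy prior).
    Returns (weights, counts): weights[j] is w(j); counts[j] is the prior
    plus the completed table for imputation j.
    """
    weights = np.ones(m)
    counts = np.tile(np.asarray(prior, dtype=float), (m, 1, 1))
    idx = np.arange(m)
    for a, b in zip(rows, cols):
        # Predictive cell probabilities given the completed past, shape (m, 2, 2).
        pred = counts / counts.sum(axis=(1, 2), keepdims=True)
        if b is None:
            row_prob = pred[:, a, :]      # joint probability of (a, 0) and (a, 1)
            p_y = row_prob.sum(axis=1)    # Step 2: p(y_t | past)
            p_b1 = row_prob[:, 1] / p_y   # Step 1: p(z_t = 1 | past, y_t)
            b_star = (rng.random(m) < p_b1).astype(int)
        else:
            p_y = pred[:, a, b]           # complete observation: y_t = x_t
            b_star = np.full(m, b)
        weights *= p_y                    # w_t = w_{t-1} * p(y_t | past)
        counts[idx, a, b_star] += 1.0     # add the completed observation
    return weights, counts


def posterior_mean(weights, counts):
    """Weighted average of the complete-data Dirichlet posterior means."""
    cond_mean = counts / counts.sum(axis=(1, 2), keepdims=True)
    return np.tensordot(weights, cond_mean, axes=1) / weights.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rows = [0, 0, 1, 1, 0, 1]
    cols = [0, None, 1, None, 0, 1]
    w, c = sequential_imputation(rows, cols, np.ones((2, 2)), 1000, rng)
    print(posterior_mean(w, c))
```

The same weights can be reused for any other posterior summary by replacing the Dirichlet mean with the corresponding complete-data quantity.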
