Bayesian Multivariate Logistic Regression
by Sean O’Brien and David Dunson (Biometrics, 2004)
Presented by Lihan He, ECE, Duke University, May 16, 2008
Outline
• Univariate logistic regression
• Multivariate logistic regression
• Prior specification and convergence
• Posterior computation
• Experimental results
• Conclusions
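Univariate Logistic Regression Model
Model: $\operatorname{logit} \Pr(y_i = 1 \mid x_i, \beta) = x_i'\beta$
Equivalent: $y_i = 1(z_i > 0)$, $z_i \sim L(z_i \mid x_i'\beta)$
$z_i$: latent variable; $L(\cdot)$: logistic density
Logistic density: $L(z \mid \mu) = \dfrac{e^{-(z-\mu)}}{\{1 + e^{-(z-\mu)}\}^2}$
CDF: $\Pr(z \le c \mid \mu) = \dfrac{1}{1 + e^{-(c-\mu)}}$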
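The latent-variable form can be checked by simulation. The sketch below assumes the usual formulation (y = 1 when a latent z with logistic noise around the linear predictor is positive); the value `eta = 0.8` and the function names are hypothetical and chosen only for illustration.

```python
import math
import random

def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_binary(eta, n, rng):
    """Fraction of draws with y = 1(z > 0), where z = eta + logistic noise."""
    count = 0
    for _ in range(n):
        # Inverse-CDF draw of standard logistic noise; clamp to avoid log(0).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        z = eta + math.log(u / (1.0 - u))
        count += z > 0
    return count / n

rng = random.Random(0)
eta = 0.8  # hypothetical linear predictor x_i' beta
freq = simulate_binary(eta, 100_000, rng)
print(freq, logistic_cdf(eta))  # the two numbers should be close
```

The simulated frequency of y = 1 should agree with the logistic CDF evaluated at the linear predictor, which is what makes the two formulations interchangeable.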
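Univariate Logistic Regression Model
Approximation using a t distribution: $L(z \mid \mu) \approx T_\nu(z \mid \mu, \sigma^2)$, setting $\nu = 7.3$ and $\sigma^2 = \pi^2(\nu - 2)/(3\nu)$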
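A rough numerical check of how close the t approximation is. This sketch assumes ν = 7.3 degrees of freedom and a scale chosen so that the t variance equals the logistic variance π²/3; these settings are assumptions of the sketch, and the helper names are hypothetical.

```python
import math

NU = 7.3                                          # assumed degrees of freedom
SIGMA2 = math.pi ** 2 * (NU - 2.0) / (3.0 * NU)   # matches the variance pi^2 / 3

def logistic_pdf(z, mu=0.0):
    """Logistic density; abs() uses symmetry and avoids overflow."""
    e = math.exp(-abs(z - mu))
    return e / (1.0 + e) ** 2

def t_pdf(z, mu=0.0, nu=NU, sigma2=SIGMA2):
    """Density of a location-scale t distribution."""
    sigma = math.sqrt(sigma2)
    x = (z - mu) / sigma
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu)) / sigma

# Largest pointwise gap between the two densities on a grid over [-10, 10].
grid = [i / 100.0 for i in range(-1000, 1001)]
max_gap = max(abs(logistic_pdf(z) - t_pdf(z)) for z in grid)
print(max_gap)
```

The gap is small relative to the peak density of 0.25, which is why a t proposal is a natural stand-in for the logistic density.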
Multivariate Logistic Regression Model
Binary variable for each output; the marginal pdf of each latent variable has univariate logistic density.
$F^{-1}(\cdot)$ denotes the inverse CDF of the corresponding density.
Multivariate Logistic Regression Model: Properties
• The marginal univariate densities of $z_j$, for j = 1, …, p, have univariate logistic form.
• For p = 1, the density reduces to the univariate logistic density.
• R is a correlation matrix (with 1’s on the diagonal) reflecting the correlations between the $z_j$, and hence the correlations between the $y_j$.
• For R = diag(1, …, 1), the density reduces to a product of univariate logistic densities, and the elements of z are uncorrelated.
• Good convergence properties for MCMC sampling.
Multivariate Logistic Regression Model: Likelihood
M-ary (ordered) variable for each output
Assume
Define
Prior specification and convergence
R: uniform density on [-1, 1] for each off-diagonal element
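A draw from a uniform prior on the off-diagonal elements can be sketched with rejection sampling. The restriction to positive definite matrices is an assumption made explicit here (a correlation matrix must satisfy it); the function names are hypothetical.

```python
import random

def is_positive_definite(a):
    """Attempt a Cholesky factorization; success means a is positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:
                    return False
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True

def sample_correlation_matrix(p, rng, max_tries=10_000):
    """Uniform(-1, 1) off-diagonals, rejecting draws that are not positive
    definite (assumption: the prior is restricted to valid matrices)."""
    for _ in range(max_tries):
        r = [[1.0] * p for _ in range(p)]
        for i in range(p):
            for j in range(i):
                r[i][j] = r[j][i] = rng.uniform(-1.0, 1.0)
        if is_positive_definite(r):
            return r
    raise RuntimeError("no positive definite draw found")

R = sample_correlation_matrix(3, random.Random(1))
for row in R:
    print(row)
```

For p = 2 every draw is accepted, since any off-diagonal value strictly inside (-1, 1) gives a valid matrix; the rejection rate grows with p.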
Posterior Computation
Posterior: the prior and the likelihood are not conjugate.
Importance sampling: sample from a proposal distribution that approximates the posterior, and use importance weights for exact inference.
Proposal distribution: use a multivariate t distribution to approximate the multivariate logistic density in the likelihood.
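Posterior Computation
Introduce latent variables, including z, so that the proposal can be expressed in an augmented form.
Sample the parameters and the latent variables from their full conditionals, since the augmented likelihood is conjugate to the prior.
Update R using a Metropolis step (accept/reject): set R to the candidate value with the acceptance probability; otherwise keep the current value.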
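The accept/reject update can be illustrated on a single correlation r. This is only a sketch: it swaps in a standardized bivariate normal likelihood as a stand-in target (an assumption, not the t-based conditional used for the proposal), with a uniform(-1, 1) prior and a symmetric random-walk candidate. All names and the simulated data are hypothetical.

```python
import math
import random

def log_target(r, pairs):
    """Log posterior of r up to a constant: uniform(-1, 1) prior times a
    standardized bivariate normal stand-in likelihood (an assumption)."""
    if not -1.0 < r < 1.0:
        return -math.inf
    q = sum(a * a - 2.0 * r * a * b + b * b for a, b in pairs)
    return -0.5 * len(pairs) * math.log(1.0 - r * r) - q / (2.0 * (1.0 - r * r))

def metropolis_step(r, pairs, rng, step=0.1):
    """One Metropolis update with a symmetric random-walk candidate."""
    cand = r + rng.uniform(-step, step)
    log_ratio = log_target(cand, pairs) - log_target(r, pairs)
    # Accept the candidate with probability min(1, ratio); else keep r.
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return cand
    return r

rng = random.Random(7)
true_r = 0.7  # hypothetical correlation used to simulate data
pairs = []
for _ in range(1000):
    a = rng.gauss(0.0, 1.0)
    b = true_r * a + math.sqrt(1.0 - true_r ** 2) * rng.gauss(0.0, 1.0)
    pairs.append((a, b))

r, draws = 0.0, []
for it in range(2000):
    r = metropolis_step(r, pairs, rng)
    if it >= 1000:          # discard the first half as burn-in
        draws.append(r)
post_mean = sum(draws) / len(draws)
print(post_mean)
```

Candidates outside (-1, 1) have zero prior density and are always rejected, so the chain never leaves the valid range.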
Posterior Computation
Importance weights for inference
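A one-dimensional sketch of the weighting idea: draw from a t proposal, weight each draw by the ratio of the logistic target density to the proposal density, and form a self-normalized estimate. The settings ν = 7.3 and the variance-matched scale are assumptions of this sketch, and the function names are hypothetical.

```python
import math
import random

NU = 7.3                                                    # assumed df
SIGMA = math.sqrt(math.pi ** 2 * (NU - 2.0) / (3.0 * NU))   # assumed scale

def logistic_logpdf(z):
    """Log density of the standard logistic distribution (target)."""
    a = abs(z)
    return -a - 2.0 * math.log1p(math.exp(-a))

def t_logpdf(z):
    """Log density of the scaled t distribution (proposal)."""
    x = z / SIGMA
    log_c = (math.lgamma((NU + 1) / 2) - math.lgamma(NU / 2)
             - 0.5 * math.log(NU * math.pi) - math.log(SIGMA))
    return log_c - (NU + 1) / 2 * math.log1p(x * x / NU)

def draw_t(rng):
    """Scaled t draw: normal / sqrt(chi-square / nu), chi-square via gamma."""
    chi2 = rng.gammavariate(NU / 2.0, 2.0)
    return SIGMA * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / NU)

def importance_estimate(h, n, rng):
    """Self-normalized importance sampling estimate of E[h(z)] under the target."""
    num = den = 0.0
    for _ in range(n):
        z = draw_t(rng)
        w = math.exp(logistic_logpdf(z) - t_logpdf(z))  # importance weight
        num += w * h(z)
        den += w
    return num / den

rng = random.Random(42)
second_moment = importance_estimate(lambda z: z * z, 50_000, rng)
print(second_moment, math.pi ** 2 / 3)  # estimate vs. exact logistic variance
```

Because the t proposal has heavier tails than the logistic target, the weights stay bounded, which keeps the estimate stable.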
Application
Subjects: 584 twin pregnancies
Output: small for gestational age (SGA), defined as a birthweight below the 10th percentile for a given gestational age in a reference population.
Binary output: $y_{ij} \in \{0, 1\}$, i = 1, …, 584, j = 1, 2
Covariates: $x_{ij}$ for the ith pregnancy and the jth infant
Application
• The estimates of the regression coefficients are nearly identical to those of the AP study.
• Female gender (β1), prior preterm delivery (β4, β5), and smoking (β8) are associated with an increased risk of SGA.
• Outcomes for twins are highly correlated, as captured by R.
Conclusions
• Proposes a multivariate logistic density for the multivariate logistic regression model.
• The proposed multivariate logistic density is closely approximated by a multivariate t distribution.
• It has properties that facilitate efficient sampling and guaranteed convergence.
• The marginals are univariate logistic densities.
• The correlation structure is embedded within the model.