Pre-processing for Approximate Bayesian Computation in Image Analysis

Presentation Transcript


  1. Pre-processing for Approximate Bayesian Computation in Image Analysis
     Matt Moores, Chris Drovandi, Christian Robert, Kerrie Mengersen

  2. Context: Radiotherapy planning

  3. Fast information synthesis
     1. Use a fan-beam CT to establish a treatment plan. Less subject to artifacts induced by X-ray scatter or metal implants.
     2. Use a cone-beam CT each day to determine any changes in the image, and hence in the plan. Less accurate, but can be used in situ.
     • At present, all images are inspected separately and manually.
     • We want to automatically update the spatial prior information to improve identification of features in images with a low contrast-to-noise ratio.
     • Represent the prior as an external field in a hidden Potts model of the image lattice: formulate the prior distribution of the latent pixel labels as a mixture of Gaussian fields, centred on the positions of the objects at a previous point in time.

  4. Mixture of Gaussians [figure: Gaussian mixture fits for the fan-beam and cone-beam images]

  5. Potts model
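
  The equation on this slide is not preserved in the transcript. For reference, the hidden Potts model takes the standard form (a reconstruction consistent with the notation C(β) and S(x) used on slides 12–14):

      p(\mathbf{x} \mid \beta) = \frac{\exp\{\beta\, S(\mathbf{x})\}}{C(\beta)},
      \qquad S(\mathbf{x}) = \sum_{i \sim \ell} \delta(x_i, x_\ell)

  where i ∼ ℓ ranges over neighbouring pixels on the lattice and δ is the Kronecker delta.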

  6. Potts model with informed external field [figure: fan-beam image and the derived external field]
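
  One plausible form of the informed-field extension described on slide 3, with the external field entering as a pixel-wise prior term α_i(x_i) derived from the Gaussian fields fitted to the fan-beam plan. This reconstruction is an assumption, since the slide's equation is not in the transcript:

      p(\mathbf{x} \mid \beta, \alpha) \propto \exp\Big\{\beta\, S(\mathbf{x}) + \sum_i \alpha_i(x_i)\Big\}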

  7. Image segmentation [figure panels: fan-beam CT; cone-beam CT; segmentation with the simple Potts model; segmentation with the external prior]

  8. Classification accuracy

  9. Problem 1: images
     Computational challenges with big data

  10. Problem 2: Potts model
     Doubly intractable likelihood: the normalising constant C(β) is both analytically and computationally intractable.

  11. Problem 3: Potts model
     Estimation of the inverse temperature β

  12. Option 1: Pseudolikelihood (Rydén and Titterington, 1998)
     • Approximate C(β) using the product of the conditional densities p(x_i | x_{i∼ℓ}, β)
     • Draw proposed values of β using a random walk and evaluate them using a pseudolikelihood approximation to the Metropolis-Hastings ratio
     Fast, but the approximation error increases for large values of β. A sketch of the computation follows below.
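
  A minimal sketch of the pseudolikelihood computation for a k-state Potts model on a regular lattice with 4 neighbours per pixel. The function and variable names are illustrative, not taken from BayesImageS:

      import numpy as np

      def log_pseudolikelihood(x, beta, k):
          """Log-pseudolikelihood of a k-state Potts label image x (2D int array).

          Approximates log p(x | beta) by summing the log full conditionals
          log p(x_i | neighbours of i, beta), avoiding the intractable C(beta).
          """
          # n[..., c] counts how many of each pixel's 4 neighbours carry label c.
          # np.roll wraps around (toroidal boundary); a free boundary would need
          # the wrapped edges masked, omitted here for brevity.
          n = np.zeros(x.shape + (k,))
          for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
              nb = np.roll(x, shift, axis=axis)
              for c in range(k):
                  n[..., c] += (nb == c)
          # log p(x_i | .) = beta * n_i(x_i) - log sum_c exp(beta * n_i(c))
          num = beta * np.take_along_axis(n, x[..., None], axis=-1)[..., 0]
          den = np.log(np.exp(beta * n).sum(axis=-1))
          return float((num - den).sum())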

  13. Option 2: Path sampling (Gelman and Meng, 1998)
     • Pre-compute E[S(x)], where S(x) = Σ_{i∼ℓ} δ(x_i, x_ℓ), for fixed values of β
     • Use these values during model fitting to approximate E[S(x) | β′]
     • Approximate the MH ratio using numerical integration, as sketched below
     • Pre-computation can be costly for large datasets, but the output can be reused for multiple datasets with the same size and number of mixture components.
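
  A sketch of how the pre-computed values can be used. By the thermodynamic identity d/dβ log C(β) = E[S(x) | β], the log-ratio of normalising constants in the MH ratio is an integral of E[S(x) | t], which can be approximated numerically from the grid. Function names are illustrative:

      import numpy as np

      def log_ratio_C(beta_old, beta_new, beta_grid, es_grid):
          """Path-sampling estimate of log C(beta_new) - log C(beta_old).

          beta_grid : ascending values of beta at which E[S(x)|beta] was pre-computed
          es_grid   : matching Monte Carlo estimates of E[S(x)|beta]
          """
          # Integrate the interpolated E[S(x)|t] over [beta_old, beta_new]
          # using the trapezoid rule.
          ts = np.linspace(beta_old, beta_new, 100)
          es = np.interp(ts, beta_grid, es_grid)
          return np.trapz(es, ts)

      # MH log-ratio for a symmetric proposal and uniform prior on beta:
      #   log r = (beta_new - beta_old) * S(x) - log_ratio_C(beta_old, beta_new, ...)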

  14. Option 3: Exchange algorithm (Murray, Ghahramani and MacKay, 2006)
     • Use an auxiliary variable w, drawn from the Potts model at the proposed β′, so that the partition function cancels out in the MH ratio
     • If β′ is drawn from a symmetric random walk, then the ratio of the proposal densities q is 1
     • If a uniform prior is used for β, then the ratio of the priors also cancels, so log(r) = (β′ − β) S(x) + (β − β′) S(w)
     • Simulation of the auxiliary variable can be expensive (it has the same dimension as the latent labels x)
     • The method is exact when perfect sampling is used to simulate w; a sketch follows below.
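
  A minimal sketch of one exchange-algorithm update for β, under a symmetric random-walk proposal and uniform prior, as on the slide. The helper gibbs_sampler is an assumed stand-in for the auxiliary simulation; exactness requires perfect sampling here:

      import numpy as np

      def exchange_step(beta, S_x, gibbs_sampler, rng, step=0.05):
          """One exchange-algorithm update for the inverse temperature beta.

          S_x           : observed sufficient statistic S(x)
          gibbs_sampler : assumed helper, beta -> S(w) for auxiliary labels w
                          drawn (approximately) from the Potts model at beta
          """
          beta_new = beta + step * rng.standard_normal()  # symmetric random walk
          S_w = gibbs_sampler(beta_new)                   # auxiliary draw at beta_new
          # The partition functions cancel, leaving:
          log_r = (beta_new - beta) * S_x + (beta - beta_new) * S_w
          if np.log(rng.uniform()) < log_r:
              return beta_new
          return beta   # reject (prior support check omitted for brevity)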

  15. Option 4: Approximations
     • Cucala, Marin, Robert and Titterington (2009): an approximate exchange algorithm using 500 iterations of Gibbs sampling for w, with reduced computational cost
     • Fast (iterative, deterministic) approximations, e.g.:
       • Iterated conditional modes (ICM; Besag, 1986)
       • Variational Bayes (VB; McGrory, Titterington, Reeves and Pettitt, 2009)
       but the problem of estimating the inverse temperature persists!
     • Path sampling and the exchange algorithm estimate a ratio of normalising constants, which is ideal for MCMC but unsuitable for use within these algorithms
     • Pseudolikelihood directly approximates C(β), so it can be used in combination with these algorithms

  16. Software: BayesImageS

  17. Comparison of methods [figure: number of pixels vs. elapsed time]
     • Allocation of pixels was similar across the methods
     • The exchange algorithm was the most computationally expensive
     • Pre-computation of E[S(x) | β] for path sampling helped slightly
     • Pseudolikelihood was faster, but produced a much higher posterior estimate of β

  18. Option 5: SMC-ABC

  19. Still costly

  20. Avoiding computational cost by pre-computation
     The distribution of S(x) | θ is independent of the observed data.
     • By simulating pseudo-data for a range of values of θ, we can create a mapping function f(θ) to approximate E[S(x) | θ]
     • This mapping function can be reused across multiple datasets, amortising its computational cost
     • By mapping directly from θ to S(x), we avoid the need to simulate pseudo-data during model fitting; see the sketch below
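
  A sketch of the pre-computation under simple assumptions: the summary statistic is simulated at a grid of parameter values, and its mean and standard deviation are interpolated to form the mapping. All names are illustrative, and piecewise-linear interpolation is one simple choice for representing the nonlinear, heteroskedastic relationship described on slide 27:

      import numpy as np

      def precompute_mapping(simulate_stat, theta_grid, n_reps=20, rng=None):
          """Build an approximate mapping theta -> (mean, sd) of S(x).

          simulate_stat : function (theta, rng) -> S(x) for pseudo-data
                          x simulated from the model at theta
          theta_grid    : ascending grid covering the prior support
          """
          rng = rng or np.random.default_rng()
          stats = np.array([[simulate_stat(t, rng) for _ in range(n_reps)]
                            for t in theta_grid])
          mu, sd = stats.mean(axis=1), stats.std(axis=1, ddof=1)
          # Reusable across datasets of the same size: the cost is amortised.
          f_mu = lambda t: np.interp(t, theta_grid, mu)
          f_sd = lambda t: np.interp(t, theta_grid, sd)
          return f_mu, f_sd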

  21. Sufficient statistics of the Potts model

  22. Scalable SMC-ABC for the Potts model
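
  The algorithm itself is not preserved in the transcript. The sketch below shows how one particle could be scored inside SMC-ABC using the pre-computed mapping, following the description on slide 28; f_mu and f_sd are the hypothetical interpolators from the previous sketch:

      import numpy as np

      def abc_distance(theta, S_obs, f_mu, f_sd, rng):
          """ABC distance using the mapping instead of fresh pseudo-data.

          Draws a surrogate statistic from a Gaussian with mean f_mu(theta)
          and sd f_sd(theta), then compares it with the observed S(x).
          """
          S_sim = rng.normal(f_mu(theta), f_sd(theta))
          return abs(S_sim - S_obs)

      # In an SMC-ABC sweep, particles with distance below the current
      # tolerance epsilon are kept, and epsilon shrinks over iterations.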

  23. Simulation study

  24. Accuracy of posterior estimates of β [figure: posterior estimate of β against true β, for the pseudo-data and pre-computed algorithms]

  25. Distribution of posterior sampling error for β [figure: sampling error for the pseudo-data and pre-computed algorithms]

  26. Improvement in runtime [figure: elapsed (wall clock) time and CPU time for the pseudo-data and pre-computed algorithms]

  27. Summary
     Scalability of SMC-ABC can be improved by pre-computing an approximate mapping θ → S(x)
     • Pre-computation took 8 minutes on a 16-core Xeon server
     • Average runtime for SMC-ABC improved from 74.4 hours to 39 minutes
     The mapping function represents the nonlinear, heteroskedastic relationship between the parameter and the summary statistic. The method handles noisy data, which is non-trivial.

  28. Comments
     • Based on the pre-computation step, we draw the sufficient statistic from a Gaussian with mean and variance estimated from the pre-computations, then compare this with the observed sufficient statistic. This adds unnecessary noise. Instead, we could use the Gaussian density directly in the algorithm (see the sketch below), which would avoid having to use ABC tolerances, hence removing that approximation effect entirely. We could still use SMC, but we would need to anneal the Gaussian likelihood in.
     • The method could be extended to multivariate applications, such as estimating both β and k for the hidden Potts model.
     • Most pixels do not change allocation.
     • Careful consideration of spatial scale helps.
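
  A sketch of the suggested refinement: score θ with the Gaussian density itself, a synthetic-likelihood-style weight, rather than drawing a surrogate statistic and applying an ABC tolerance. The names f_mu and f_sd reuse the hypothetical mapping from the earlier sketches:

      from scipy.stats import norm

      def log_synthetic_likelihood(theta, S_obs, f_mu, f_sd):
          """Gaussian log-density of the observed statistic under the mapping.

          No surrogate statistic is simulated and no tolerance epsilon is
          needed; within SMC this likelihood would be annealed in gradually.
          """
          return norm.logpdf(S_obs, loc=f_mu(theta), scale=f_sd(theta))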

  29. Finally, “so what”?

  30. Further details
