
Variational Inference and Variational Message Passing




Presentation Transcript


  1. Variational Inference and Variational Message Passing John Winn, Microsoft Research, Cambridge. 12th November 2004, Robotics Research Group, University of Oxford

  2. Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example

  3. Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example

  4. Bayesian networks • Directed graph • Nodes represent variables • Links show dependencies • Conditional distributions at each node • Defines a joint distribution: P(C,L,S,I) = P(L) P(C) P(S|C) P(I|L,S) [Example network: object class C with prior P(C), lighting colour L with prior P(L), surface colour S with P(S|C), image colour I with P(I|L,S).]
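To make this factorisation concrete, here is a minimal Python sketch (not part of the slides); the conditional probability tables are made-up numbers over binary variables, and only the structure of the product matters.

    # Joint distribution P(C,L,S,I) = P(L) P(C) P(S|C) P(I|L,S)
    # with illustrative (made-up) tables over binary variables.
    P_C = {0: 0.7, 1: 0.3}                       # P(C): object class
    P_L = {0: 0.5, 1: 0.5}                       # P(L): lighting colour
    P_S_given_C = {0: {0: 0.9, 1: 0.1},          # P(S|C): surface colour
                   1: {0: 0.2, 1: 0.8}}
    P_I_given_LS = {(0, 0): {0: 0.95, 1: 0.05},  # P(I|L,S): image colour
                    (0, 1): {0: 0.30, 1: 0.70},
                    (1, 0): {0: 0.60, 1: 0.40},
                    (1, 1): {0: 0.10, 1: 0.90}}

    def joint(c, l, s, i):
        """Product of the local conditional distributions."""
        return P_L[l] * P_C[c] * P_S_given_C[c][s] * P_I_given_LS[(l, s)][i]

    # Sanity check: the joint sums to 1 over all 16 configurations.
    print(sum(joint(c, l, s, i)
              for c in (0, 1) for l in (0, 1)
              for s in (0, 1) for i in (0, 1)))   # 1.0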

  5. Bayesian inference • Observed variables V and hidden variables H. • Hidden variables include parameters and latent variables. • Learning/inference involves finding: P(H1, H2, … | V) [In the example network, object class C, lighting colour L and surface colour S are hidden; image colour I is observed.]

  6. Bayesian inference vs. ML/MAP • Consider learning one parameter θ • How should we represent this posterior distribution?

  7. Bayesian inference vs. ML/MAP • Consider learning one parameter θ [Plot: P(V|θ) P(θ) against θ, with the maximum marked at θMAP.]

  8. Bayesian inference vs. ML/MAP • Consider learning one parameter θ [Plot: P(V|θ) P(θ) against θ; θMAP sits at a point of high probability density, away from the region of high probability mass.]

  9. Bayesian inference vs. ML/MAP • Consider learning one parameter θ [Plot: P(V|θ) P(θ) against θ, with θML marked and samples drawn from the posterior.]

  10. Bayesian inference vs. ML/MAP • Consider learning one parameter θ [Plot: P(V|θ) P(θ) against θ, with θML marked and a variational approximation to the posterior.]

  11. Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example

  12. Variational Inference (in three easy steps…) • Choose a family of variational distributions Q(H). • Use Kullback-Leibler divergence KL(Q||P) as a measure of ‘distance’ between P(H|V) and Q(H). • Find Q which minimises divergence.
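For reference, the 'distance' in step 2 is the standard Kullback-Leibler divergence between the approximation Q(H) and the true posterior P(H|V) (not written out on the slide):

    \mathrm{KL}(Q \,\|\, P) \;=\; \sum_{H} Q(H) \, \ln \frac{Q(H)}{P(H \mid V)} \;\ge\; 0,

with equality exactly when Q(H) = P(H|V).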

  13. KL Divergence [Two panels compare fitting a single-mode Q to a two-mode P:] • Minimising KL(Q||P) is exclusive: Q locks onto one mode of P. • Minimising KL(P||Q) is inclusive: Q spreads to cover all of P.

  14. Minimising the KL divergence • For arbitrary Q(H): ln P(V) = L(Q) + KL(Q||P), where L(Q) = Σ_H Q(H) ln [ P(H,V) / Q(H) ]. The left-hand side is fixed, so maximising L(Q) minimises KL(Q||P). • We choose a family of Q distributions where L(Q) is tractable to compute.
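This decomposition is a standard identity; the missing step is to substitute P(H|V) = P(H,V)/P(V) into the definition of the divergence:

    \mathrm{KL}(Q \,\|\, P)
      \;=\; \sum_{H} Q(H) \ln \frac{Q(H)\,P(V)}{P(H,V)}
      \;=\; \ln P(V) \;-\; \sum_{H} Q(H) \ln \frac{P(H,V)}{Q(H)}
      \;=\; \ln P(V) \;-\; L(Q).

Since ln P(V) does not depend on Q, maximising the tractable bound L(Q) is equivalent to minimising KL(Q||P).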

  15. Minimising the KL divergence [Bar diagram, animated over slides 15-19: the fixed total ln P(V) splits into L(Q) plus KL(Q||P); as L(Q) is maximised, the KL(Q||P) term shrinks.]


  20. Factorised Approximation • Assume Q factorises: Q(H) = ∏i Qi(Hi). • Optimal solution for one factor is given by the standard result below. • No further assumptions are required!
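The result the slide refers to is the optimal single-factor update (as in Winn & Bishop): the log of the optimal factor is the expected log joint, averaged over all the other factors,

    \ln Q_i^{*}(H_i) \;=\; \big\langle \ln P(H, V) \big\rangle_{\prod_{j \neq i} Q_j} \;+\; \text{const.}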

  21. Example: Univariate Gaussian • Likelihood function • Conjugate prior • Factorized variational distribution

  22. Initial Configuration [Contour plot of Q over (μ, γ).]

  23. After Updating Q(μ) [Contour plot over (μ, γ).]

  24. After Updating Q(γ) [Contour plot over (μ, γ).]

  25. Converged Solution [Contour plot over (μ, γ).]
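For the univariate Gaussian example on slides 21-25, the two factor updates take the following standard forms, assuming a Gaussian prior N(μ | m0, 1/β0) on the mean and a Gamma(a0, b0) prior on the precision γ (the prior-parameter symbols are labels introduced here, not taken from the slides):

    Q(\mu) = \mathcal{N}(\mu \mid m_N, \beta_N^{-1}), \qquad
      \beta_N = \beta_0 + N\langle\gamma\rangle, \quad
      m_N = \frac{\beta_0 m_0 + \langle\gamma\rangle \sum_n x_n}{\beta_N},

    Q(\gamma) = \mathrm{Gamma}(\gamma \mid a_N, b_N), \qquad
      a_N = a_0 + \tfrac{N}{2}, \quad
      b_N = b_0 + \tfrac{1}{2}\sum_n \big( x_n^2 - 2 x_n \langle\mu\rangle + \langle\mu^2\rangle \big).

Each update uses the current moments of the other factor (⟨γ⟩ = a_N/b_N, ⟨μ⟩ = m_N, ⟨μ²⟩ = m_N² + 1/β_N), which is exactly the alternation pictured on slides 22-25.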

  26. Lower Bound for GMM

  27. Variational Equations for GMM

  28. Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example

  29. Variational Message Passing • VMP makes it easier and quicker to apply factorised variational inference. • VMP carries out variational inference using local computations and message passing on the graphical model. • Modular algorithm allows modifying, extending or combining models.

  30. Local Updates • For factorised Q, the update for each factor depends only on its Markov blanket. • Updates can therefore be carried out locally at each node.

  31. VMP I: The Exponential Family • Conditional distributions expressed in exponential family form: ln P(X | θ) = θᵀ u(X) + g(θ) + f(X), where θ is the 'natural' parameter vector and u(X) is the sufficient statistics vector. • For example, the Gaussian distribution: ln P(X | μ, γ) = [γμ, −γ/2]ᵀ [X, X²] + ½(ln γ − γμ² − ln 2π).

  32. VMP II: Conjugacy • Parents and children are chosen to be conjugate, i.e. of the same functional form: ln P(X | θ) = θᵀ u(X) + g(θ) + f(X) and ln P(Z | X, Y) = φ(Y, Z)ᵀ u(X) + g'(X) + f'(Y, Z) share the same u(X). [Graph: X and Y are the parents of Z.] • Examples: • Gaussian for the mean of a Gaussian • Gamma for the precision of a Gaussian • Dirichlet for the parameters of a discrete distribution
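As a concrete instance of this conjugacy, the Gaussian likelihood viewed as a function of its mean μ rearranges into the same exponential-family form as a Gaussian distribution over μ (a standard rearrangement, written out here):

    \ln \mathcal{N}(x \mid \mu, \gamma^{-1})
      = \begin{bmatrix} \gamma x \\ -\gamma/2 \end{bmatrix}^{\mathsf T}
        \begin{bmatrix} \mu \\ \mu^{2} \end{bmatrix}
        + \tfrac{1}{2}\big( \ln\gamma - \gamma x^{2} - \ln 2\pi \big),

so the sufficient statistics in μ are (μ, μ²), matching those of a Gaussian over μ; this is why a Gaussian prior is conjugate for the mean of a Gaussian.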

  33. VMP III: Messages [Graph: X and Y are the parents of Z.] • Conditionals: ln P(X | θ) = θᵀ u(X) + g(θ) + f(X) and ln P(Z | X, Y) = φ(Y, Z)ᵀ u(X) + g'(X) + f'(Y, Z). • Messages: parent to child (X→Z) and child to parent (Z→X); their forms are summarised after slide 34 below.

  34. VMP IV: Update • Optimal Q(X) has the same form as P(X|θ), but with an updated parameter vector θ* computed from the messages the node receives. • These messages can also be used to calculate the bound on the evidence L(Q) – see Winn & Bishop, 2004.
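For reference, the message forms in Winn & Bishop's VMP can be summarised as follows (notation as on slides 31-33; this is a paraphrase added for clarity, not text from the slides):

    m_{X \to Z} \;=\; \langle u_X(X) \rangle_{Q(X)}
      \quad \text{(parent to child: expected sufficient statistics of } X\text{)},

    m_{Z \to X} \;=\; \langle \varphi(Y, Z) \rangle
      \quad \text{(child to parent: expected natural parameter, using the current } Q \text{ over } Z \text{ and the co-parents } Y\text{)},

    \theta_X^{*} \;=\; \theta_X(\text{messages from the parents of } X)
      \;+\; \sum_{Z \,\in\, \mathrm{children}(X)} m_{Z \to X}.

Because every message is an expectation of a natural-parameter or sufficient-statistics vector, each node's update needs only messages from its Markov blanket, as stated on slide 30.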

  35. VMP Example • Learning parameters of a Gaussian from N data points. [Graphical model: mean μ and precision γ (inverse variance) are parents of the data x, in a plate of size N.]

  36. VMP Example • Message from γ to all x (needs an initial Q(γ)).

  37. VMP Example • Messages from each xₙ to μ. • Update Q(μ) parameter vector.

  38. VMP Example • Message from updated μ to all x.

  39. VMP Example • Messages from each xₙ to γ. • Update Q(γ) parameter vector.
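To see how the schedule on slides 36-39 plays out, here is a small Python sketch; the data and prior parameters (m0, beta0, a0, b0) are invented for illustration, and the messages are folded directly into the moment updates they produce rather than being passed as explicit vectors.

    import numpy as np

    # Model: x_n ~ N(mu, 1/gamma), Gaussian prior on mu, Gamma prior on gamma.
    rng = np.random.default_rng(0)
    x = rng.normal(loc=2.0, scale=0.5, size=100)   # synthetic data
    N = len(x)

    m0, beta0 = 0.0, 1e-3        # prior on mu: N(mu | m0, 1/beta0)
    a0, b0 = 1e-3, 1e-3          # prior on gamma: Gamma(a0, b0) (shape, rate)

    E_gamma = a0 / b0            # initial Q(gamma) (slide 36: "need initial Q(gamma)")

    for _ in range(50):
        # Messages from each x_n to mu, then update Q(mu) (slides 37-38).
        beta_N = beta0 + N * E_gamma
        m_N = (beta0 * m0 + E_gamma * x.sum()) / beta_N
        E_mu, E_mu2 = m_N, m_N**2 + 1.0 / beta_N

        # Messages from each x_n to gamma, then update Q(gamma) (slide 39).
        a_N = a0 + N / 2.0
        b_N = b0 + 0.5 * np.sum(x**2 - 2 * x * E_mu + E_mu2)
        E_gamma = a_N / b_N

    print("posterior mean of mu:", m_N)            # close to 2.0
    print("posterior mean of gamma:", E_gamma)     # close to 1/0.5**2 = 4.0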

  40. Features of VMP • Graph does not need to be a tree – it can contain loops (but not cycles). • Flexible message passing schedule – factors can be updated in any order. • Distributions can be discrete or continuous, multivariate, truncated (e.g. rectified Gaussian). • Can have deterministic relationships (A=B+C). • Allows for point estimates e.g. of hyper-parameters

  41. VMP Software: VIBES Free download from vibes.sourceforge.net

  42. Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example

  43. Flexible sprite model • Proposed by Jojic & Frey (2001). • Set of images x, e.g. frames from a video (plate of size N).

  44. Flexible sprite model • Sprite appearance f and shape π.

  45. Flexible sprite model • Sprite transform T for this image (discretised). • Mask m for this image.

  46. Flexible sprite model • Background b. • Noise β.

  47. VMP [Full graphical model with nodes π, f, b, T, m, β and x in a plate of size N.] Winn & Blake (NIPS 2004)

  48. Results of VMP on hand video Original video Learned appearance and mask Learned transforms (i.e. motion)

  49. Conclusions • Variational Message Passing allows approximate Bayesian inference for a wide range of models. • VMP simplifies the construction, testing, extension and comparison of models. • You can try VMP for yourself at vibes.sourceforge.net

  50. That’s all folks!
