
Incomplete Graphical Models



  1. Incomplete Graphical Models Nan Hu

  2. Outline • Motivation • K-means clustering • Coordinate descent algorithm • Density estimation • EM on unconditional mixtures • Regression and classification • EM on conditional mixtures • A general formulation of the EM algorithm

  3. K-means clustering Problem: given a set of observations, how do we group them into K clusters, assuming the value of K is given? • First phase: assign each observation to the nearest cluster mean • Second phase: recompute each cluster mean from the observations assigned to it
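Concretely, the two phases alternately minimize the distortion measure J introduced on slide 5. A minimal sketch in one common notation (with binary indicators r_{nk} = 1 if observation x_n is assigned to cluster k; these symbols are chosen here, not taken from the slides):

\[
J \;=\; \sum_{n=1}^{N}\sum_{k=1}^{K} r_{nk}\,\lVert x_n - \mu_k \rVert^2
\]
\[
\text{Phase 1 (assignments):}\quad
r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \lVert x_n - \mu_j \rVert^2 \\ 0 & \text{otherwise} \end{cases}
\qquad
\text{Phase 2 (means):}\quad
\mu_k = \frac{\sum_n r_{nk}\, x_n}{\sum_n r_{nk}}
\]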

  4. K-means clustering [Figure: K-means on a sample data set: the original set, then the first, second, and third iterations]

  5. K-means clustering • K-means is a coordinate descent algorithm • It minimizes the distortion measure J by setting the partial derivatives to zero, one block of coordinates (assignments, then means) at a time; see the sketch below
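A minimal NumPy sketch of this two-phase coordinate descent (function and variable names are illustrative, not from the slides):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means: alternate assignment and mean-update phases."""
    rng = np.random.default_rng(seed)
    # Initialize the K means with randomly chosen observations.
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Phase 1: assign each point to its nearest mean (minimizes J over assignments).
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        z = dists.argmin(axis=1)
        # Phase 2: recompute each mean from its assigned points (minimizes J over means).
        new_mu = np.array([X[z == k].mean(axis=0) if np.any(z == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):
            break
        mu = new_mu
    return mu, z

# Example usage on synthetic 2-D data with two clusters.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
means, assignments = kmeans(X, K=2)
```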

  6. Unconditional Mixture Problem: if the given sample data exhibit a multimodal density, how do we estimate the true density? Fitting a single density to this bimodal case: although the algorithm converges, the result bears little relationship to the truth.

  7. Unconditional Mixture • A “divide-and-conquer” way to solve this problem • Introduce a latent variable Z, a multinomial node taking on one of K values, as the parent of the observed node X in the graphical model • Assign a density model to each subpopulation; the overall density is a mixture of the component densities

  8. Unconditional Mixture • Gaussian Mixture Models • In this model, the mixture components are Gaussian distributions, each with its own mean and covariance parameters • The probability model for a Gaussian mixture is the weighted sum of the component Gaussians, as written below
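In one standard notation (mixing proportions \pi_k, component means \mu_k and covariances \Sigma_k, with \theta collecting all parameters; these symbols are chosen here), the Gaussian mixture density is:

\[
p(x \mid \theta) \;=\; \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \;\; \sum_{k=1}^{K} \pi_k = 1 .
\]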

  9. Unconditional Mixture • Posterior probability of the latent variable Z given an observation (the “responsibility” of each component) • Log likelihood of the observed data under the mixture model (both written out below)
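In the notation above (with \tau_{nk} denoting the posterior probability that observation x_n was generated by component k), these are:

\[
\tau_{nk} \;=\; p(z_{nk} = 1 \mid x_n, \theta)
\;=\; \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
           {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}
\]
\[
\ell(\theta; x) \;=\; \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)
\]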

  10. Unconditional Mixture • Partial derivative of the log likelihood with respect to the mixing proportions, using a Lagrange multiplier for the constraint that they sum to one • Solving it, we obtain the update for the mixing proportions

  11. Unconditional Mixture • Partial derivative of the log likelihood with respect to the component means • Setting it to zero, we obtain the update for the means

  12. Unconditional Mixture • Partial derivative of the log likelihood with respect to the component covariances • Setting it to zero, we obtain the update for the covariances (the three resulting updates are collected below)
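Collecting the three solutions in the notation used above (these are the standard Gaussian-mixture updates, with the responsibilities \tau_{nk} held fixed at their current values):

\[
\pi_k \;=\; \frac{1}{N} \sum_{n=1}^{N} \tau_{nk},
\qquad
\mu_k \;=\; \frac{\sum_{n} \tau_{nk}\, x_n}{\sum_{n} \tau_{nk}},
\qquad
\Sigma_k \;=\; \frac{\sum_{n} \tau_{nk}\, (x_n - \mu_k)(x_n - \mu_k)^{\top}}{\sum_{n} \tau_{nk}} .
\]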

  13. Unconditional Mixture • The EM Algorithm • First phase (E step): compute the posterior probabilities (responsibilities) using the current parameter values • Second phase (M step): re-estimate the mixing proportions, means, and covariances using those responsibilities

  14. Unconditional Mixture • EM algorithm from the expected complete log likelihood point of view • Suppose we observed the latent variables Z; the data set would then be completely observed, and the resulting likelihood is called the complete log likelihood
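With z_{nk} the binary indicator that observation x_n came from component k, the complete log likelihood can be written as:

\[
\ell_c(\theta; x, z) \;=\; \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk}
\left[ \log \pi_k + \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right]
\]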

  15. Unconditional Mixture • We treat the latent indicators as random variables and take expectations conditioned on X and the current parameters • Note that the indicators are binary random variables, so their conditional expectations are simply the posterior probabilities \tau_{nk} • Using these as the “best guess” for the indicators, we obtain the expected complete log likelihood

  16. Unconditional Mixture • Maximizing the expected complete log likelihood by setting its derivatives to zero, we recover the updates for the mixing proportions, means, and covariances given above
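A minimal NumPy/SciPy sketch of the resulting EM loop for a Gaussian mixture (names and the simple initialization are illustrative, not from the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iters=100, seed=0):
    """EM for a Gaussian mixture: E step = responsibilities, M step = closed-form updates."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                       # mixing proportions
    mu = X[rng.choice(N, size=K, replace=False)]   # means, initialized at random points
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E step: responsibilities tau[n, k] = p(z_n = k | x_n, theta).
        tau = np.column_stack([
            pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
            for k in range(K)
        ])
        tau /= tau.sum(axis=1, keepdims=True)
        # M step: closed-form updates for pi, mu, Sigma.
        Nk = tau.sum(axis=0)
        pi = Nk / N
        mu = (tau.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (tau[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
    return pi, mu, Sigma, tau
```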

  17. Conditional Mixture • Graphical model for regression and classification: X is the input, Y is the output, and the latent variable Z is a multinomial node taking on one of K values • The relationship between X and Z can be modeled in a discriminative classification way, e.g. with a softmax function

  18. Conditional Mixture • By marginalizing over Z, we obtain the conditional density of Y given X • X is taken to be always observed; the posterior probability of Z given a data pair is defined accordingly (both written out below)
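In one common mixture-of-experts style notation (gating probabilities p(z = k | x) and component models p(y | x, z = k, \theta_k); symbols chosen here), these are:

\[
p(y \mid x, \theta) \;=\; \sum_{k=1}^{K} p(z = k \mid x)\, p(y \mid x, z = k, \theta_k)
\]
\[
p(z = k \mid x, y, \theta) \;=\;
\frac{p(z = k \mid x)\, p(y \mid x, z = k, \theta_k)}
     {\sum_{j=1}^{K} p(z = j \mid x)\, p(y \mid x, z = j, \theta_j)}
\]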

  19. Conditional Mixture • Some specific choices of mixture components • Gaussian components (for regression) • Logistic components (for binary classification), where \sigma(z) = 1 / (1 + e^{-z}) is the logistic function

  20. Conditional Mixture • Parameter estimation via EM • Complete log likelihood: as in the unconditional case, but with the gating and component densities conditioned on X • Using the posterior expectation of the latent indicators as the “best guess”, we obtain the expected complete log likelihood

  21. Conditional Mixture • The expected complete log likelihood can then be written as a sum of gating and component log probabilities weighted by the posterior probabilities • Taking partial derivatives and setting them to zero gives the update formulas for EM

  22. Conditional Mixture Summary of the EM algorithm for a conditional mixture • (E step): Calculate the posterior probabilities of the latent variable for each data pair • (M step): Use the IRLS algorithm to update the gating (softmax) parameters, based on the inputs and the posterior probabilities • (M step): Use the weighted IRLS algorithm to update the component parameters, based on the data pairs (x, y), with the posterior probabilities as weights
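A minimal sketch of the E step for a conditional mixture with a softmax gate and Gaussian (linear-regression) components; names and the parameterization are illustrative, and the M step would call IRLS / weighted least squares as summarized above:

```python
import numpy as np

def softmax(A):
    """Row-wise softmax, numerically stabilized."""
    A = A - A.max(axis=1, keepdims=True)
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def e_step(X, y, V, W, sigma2):
    """Posterior responsibilities for a softmax-gated mixture of linear experts.

    V: (D, K) gating parameters, W: (D, K) expert regression weights,
    sigma2: (K,) expert noise variances.  X is (N, D), y is (N,).
    """
    gate = softmax(X @ V)                                # p(z = k | x_n)
    mean = X @ W                                         # expert predictions, shape (N, K)
    lik = np.exp(-0.5 * (y[:, None] - mean) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
    tau = gate * lik                                     # joint, up to normalization
    return tau / tau.sum(axis=1, keepdims=True)          # p(z = k | x_n, y_n)
```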

  23. General Formulation • X: all observable variables • Z: all latent variables • θ: all parameters • Suppose Z were observed; the ML estimate would maximize the complete log likelihood • However, Z is in fact not observed, so we only have the incomplete log likelihood (both defined below)
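In this notation (lower-case letters for assignments of the variables):

\[
\ell_c(\theta; x, z) \;=\; \log p(x, z \mid \theta)
\qquad \text{(complete log likelihood)}
\]
\[
\ell(\theta; x) \;=\; \log p(x \mid \theta) \;=\; \log \sum_{z} p(x, z \mid \theta)
\qquad \text{(incomplete log likelihood)}
\]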

  24. General Formulation • Suppose p(x, z | θ) factors in some way; the complete log likelihood then decomposes into a sum of simpler terms • Since z is unknown, it is not clear how to solve this ML estimation directly; however, we can average over the randomness of z

  25. General Formulation • Using an averaging distribution over z as an estimate of the unobserved latent variables, the complete log likelihood becomes the expected complete log likelihood • This expected complete log likelihood is solvable, and hopefully it also improves the incomplete log likelihood in some way (the basic idea behind EM)

  26. General Formulation • EM maximizes the incomplete log likelihood by maximizing a lower bound on it • The bound follows from Jensen’s inequality and defines the auxiliary function (see below)
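With q(z | x) an arbitrary averaging distribution over the latent variables, the bound is:

\[
\ell(\theta; x) \;=\; \log \sum_{z} p(x, z \mid \theta)
\;=\; \log \sum_{z} q(z \mid x)\, \frac{p(x, z \mid \theta)}{q(z \mid x)}
\;\ge\; \sum_{z} q(z \mid x)\, \log \frac{p(x, z \mid \theta)}{q(z \mid x)}
\;=\; \mathcal{L}(q, \theta)
\]

where \mathcal{L}(q, \theta) is the auxiliary function.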

  27. General Formulation • Given q, maximizing the auxiliary function over θ is the same as maximizing the expected complete log likelihood, since the entropy term of q does not depend on θ

  28. General Formulation • Given θ, the choice q(z | x) = p(z | x, θ) yields the maximum of the auxiliary function • Note that the incomplete log likelihood is an upper bound of the auxiliary function

  29. General Formulation • From the above, at every step of EM we maximize the auxiliary function • However, how do we know that maximizing the auxiliary function also maximizes the incomplete log likelihood?

  30. General Formulation • The difference between the incomplete log likelihood and the auxiliary function is a KL divergence, which is non-negative and uniquely minimized (at zero) when q(z | x) equals the posterior p(z | x, θ)
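Written out:

\[
\ell(\theta; x) \;-\; \mathcal{L}(q, \theta)
\;=\; \sum_{z} q(z \mid x)\, \log \frac{q(z \mid x)}{p(z \mid x, \theta)}
\;=\; \mathrm{KL}\!\left( q(z \mid x) \,\|\, p(z \mid x, \theta) \right) \;\ge\; 0
\]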

  31. General Formulation • EM and alternating minimization • Recall that maximizing the likelihood is exactly the same as minimizing the KL divergence between the empirical distribution and the model • Including the latent variable z, this becomes a “complete KL divergence” between joint distributions on (x, z)

  32. General Formulation • Reformulated EM algorithm: each step minimizes the complete KL divergence over one of its two arguments • (E step): minimize over the averaging distribution q, holding the parameters fixed • (M step): minimize over the parameters θ, holding q fixed • This is an alternating minimization algorithm (see below)
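In symbols, one common way of writing the alternating scheme (with \tilde{p}(x) the empirical distribution; this particular notation is chosen here):

\[
\text{(E step)}\quad
q^{(t+1)} \;=\; \arg\min_{q}\;
\mathrm{KL}\!\left( \tilde{p}(x)\, q(z \mid x) \,\|\, p(x, z \mid \theta^{(t)}) \right)
\]
\[
\text{(M step)}\quad
\theta^{(t+1)} \;=\; \arg\min_{\theta}\;
\mathrm{KL}\!\left( \tilde{p}(x)\, q^{(t+1)}(z \mid x) \,\|\, p(x, z \mid \theta) \right)
\]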

  33. Summary • Unconditional mixture • Graphical model • EM algorithm • Conditional mixture • Graphical model • EM algorithm • A general formulation of the EM algorithm • Maximizing the auxiliary function • Minimizing the “complete KL divergence”

  34. Incomplete Graphical Models Thank You!
