Message Passing Algorithms for Optimization • Nicholas Ruozzi • Advisor: Sekhar Tatikonda • Yale University
The Problem • Minimize a real-valued objective function that factorizes as a sum of potentials: $f(x_1, \ldots, x_n) = \sum_i \phi_i(x_i) + \sum_{\alpha \in A} \psi_\alpha(x_\alpha)$ • ($A$ is a multiset whose elements are subsets of the indices 1,…,n)
Corresponding Graph • [Figure: the corresponding graph on nodes 1, 2, 3]
Local Message Passing Algorithms • Pass messages on this graph to minimize f • Distributed message passing algorithm • Ideal for large scientific problems, sensor networks, etc.
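The Min-Sum Algorithm • Messages at time t, with $\phi_i$ the single-variable potentials, $\psi_\alpha$ the potentials over subsets $\alpha$, and $\partial i$ the potentials that contain variable $i$: $m^t_{i \to \alpha}(x_i) = \phi_i(x_i) + \sum_{\beta \in \partial i \setminus \alpha} m^{t-1}_{\beta \to i}(x_i)$ and $m^t_{\alpha \to i}(x_i) = \min_{x_{\alpha \setminus i}} \big[\psi_\alpha(x_\alpha) + \sum_{k \in \alpha \setminus i} m^{t-1}_{k \to \alpha}(x_k)\big]$ • [Figure: graph on nodes 1, 2, 3, 4]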
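Computing Beliefs • The min-marginal corresponding to the ith variable is given by $f_i(x_i) = \min_{x' : x'_i = x_i} f(x')$ • Beliefs approximate the min-marginals: $b^t_i(x_i) = \phi_i(x_i) + \sum_{\alpha \in \partial i} m^t_{\alpha \to i}(x_i)$, where $\partial i$ is the set of potentials that contain variable $i$ • Estimate the optimal assignment as $x^t_i \in \arg\min_{x_i} b^t_i(x_i)$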
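The message and belief computations can be illustrated with a minimal sketch. It assumes a pairwise model on the chain 0 - 1 - 2 (a tree) and uses the common pairwise simplification in which messages go directly from variable to variable; the potentials `phi` and `psi` and every function name are hypothetical choices made for illustration, not taken from the talk. On a tree, the beliefs should match the min-marginals computed by brute force.

```python
import itertools

# Hypothetical toy problem: chain 0 - 1 - 2 (a tree), K states per variable.
K = 3
phi = [[0.0, 1.0, 2.0], [1.5, 0.0, 1.0], [2.0, 1.0, 0.0]]  # phi[i][x_i]
edges = [(0, 1), (1, 2)]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

def psi(xi, xj):
    # Same symmetric pairwise potential on every edge (an arbitrary choice).
    return 0.7 * abs(xi - xj)

def f(x):
    unary = sum(phi[i][x[i]] for i in range(3))
    pairwise = sum(psi(x[i], x[j]) for i, j in edges)
    return unary + pairwise

def min_sum(num_iters):
    # msgs[(i, j)][x_j]: message from variable i to neighbor j, initialized to zero.
    msgs = {(i, j): [0.0] * K for i in neighbors for j in neighbors[i]}
    for _ in range(num_iters):
        new_msgs = {}
        for (i, j) in msgs:
            new_msgs[(i, j)] = [
                min(
                    phi[i][xi] + psi(xi, xj)
                    + sum(msgs[(k, i)][xi] for k in neighbors[i] if k != j)
                    for xi in range(K)
                )
                for xj in range(K)
            ]
        msgs = new_msgs  # synchronous update of all messages
    # Belief of variable i: its own potential plus all incoming messages.
    return [
        [phi[i][xi] + sum(msgs[(k, i)][xi] for k in neighbors[i]) for xi in range(K)]
        for i in range(3)
    ]

def min_marginal(i, xi):
    # Brute-force min-marginal, used only to check the beliefs.
    return min(f(x) for x in itertools.product(range(K), repeat=3) if x[i] == xi)

beliefs = min_sum(num_iters=5)
estimate = [min(range(K), key=lambda xi: beliefs[i][xi]) for i in range(3)]
```

On graphs with cycles the same updates can be run unchanged, but the beliefs are then only approximations and the iteration may not converge.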
Min-Sum: Convergence Properties • Iterations do not necessarily converge • Always converges when the factor graph is a tree • Converged estimates need not correspond to the optimal solution • Performs well empirically
Previous Work • Prior work focused on two aspects of message passing algorithms • Convergence: coordinate ascent schemes; not necessarily local message passing algorithms • Correctness: no combinatorial characterization of failure modes; concerned only with global optimality
Contributions • A new local message passing algorithm • Parameterized family of message passing algorithms • Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a global optimum • Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a local optimum
Contributions • What makes a graphical model “good”? • Combinatorial understanding of the failure modes of the splitting algorithm via graph covers • Can be extended to other iterative algorithms • Techniques for handling objective functions for which the known convergent algorithms fail • Reparameterization centric approach
Publications • Convergent and correct message passing schemes for optimization problems over graphical models. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), July 2010 • Fixing Max-Product: A Unified Look at Message Passing Algorithms (invited talk). Proceedings of the Forty-Eighth Annual Allerton Conference on Communication, Control, and Computing, September 2010 • Unconstrained minimization of quadratic functions via min-sum. Proceedings of the Conference on Information Sciences and Systems (CISS), Princeton, NJ/USA, March 2010 • Graph covers and quadratic minimization. Proceedings of the Forty-Seventh Annual Allerton Conference on Communication, Control, and Computing, September 2009 • s-t paths using the min-sum algorithm. Proceedings of the Forty-Sixth Annual Allerton Conference on Communication, Control, and Computing, September 2008
Outline • Reparameterizations • Lower Bounds • Convergent Message Passing • Finding a Minimizing Assignment • Graph covers • Quadratic Minimization
Factorizations • Some factorizations are better than others • If each $x_i$ takes one of k values this requires at most $2k^2 + k$ operations
Factorizations • Some factorizations are better than others • Suppose • Only need k operations to compute the minimum value!
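The formulas on these two slides did not survive, so the following is only a sketch under an assumed form: an objective `psi12(x1, x2) + psi23(x2, x3)` on three variables, which is one factorization that matches the stated count of 2k² + k. The helper name `chain_min` and the integer tables are hypothetical. Eliminating x1 and x3 for each value of x2 examines 2k² candidates, and the final minimum over x2 examines k more, compared with k³ for brute force.

```python
import itertools

def chain_min(psi12, psi23, k):
    """Minimize psi12[x1][x2] + psi23[x2][x3] by eliminating x1 and x3 first.

    Returns (minimum value, number of candidate values examined).
    """
    ops = 0
    totals = []
    for x2 in range(k):
        left = min(psi12[x1][x2] for x1 in range(k))   # k candidates
        right = min(psi23[x2][x3] for x3 in range(k))  # k candidates
        ops += 2 * k
        totals.append(left + right)
    ops += k  # final minimum over the k values of x2
    return min(totals), ops

# Hypothetical integer-valued potentials, chosen only to make the check exact.
k = 4
psi12 = [[(3 * x1 + 5 * x2) % 7 for x2 in range(k)] for x1 in range(k)]
psi23 = [[(2 * x2 + x3 * x3) % 5 for x3 in range(k)] for x2 in range(k)]

value, ops = chain_min(psi12, psi23, k)
brute = min(
    psi12[x1][x2] + psi23[x2][x3]
    for x1, x2, x3 in itertools.product(range(k), repeat=3)
)
```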
Reparameterizations • We can rewrite the objective function as $f(x) = \sum_i \big[\phi_i(x_i) + \sum_{\alpha \in \partial i} m_{\alpha \to i}(x_i)\big] + \sum_{\alpha} \big[\psi_\alpha(x_\alpha) - \sum_{k \in \alpha} m_{\alpha \to k}(x_k)\big]$, where $\partial i$ is the set of potentials that contain variable $i$ • This does not change the objective function as long as the messages are real-valued at each x • The objective function is reparameterized in terms of the messages • The reparameterization has the same factor graph as the original factorization • Many message passing algorithms produce a reparameterization upon convergence
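A small numerical check of this invariance, as a sketch: every message added to a variable term is subtracted from the matching potential term, so the sum is unchanged for any real-valued messages. The pairwise triangle model, the random potentials, and the names `f_reparam` and `msg` are assumptions made for illustration.

```python
import itertools
import random

random.seed(0)
K = 2
nodes = [0, 1, 2]
factors = [(0, 1), (1, 2), (0, 2)]  # hypothetical pairwise potentials on a triangle

phi = {i: [random.uniform(-1, 1) for _ in range(K)] for i in nodes}
psi = {
    a: {xa: random.uniform(-1, 1) for xa in itertools.product(range(K), repeat=2)}
    for a in factors
}
# Arbitrary real-valued messages from each factor a to each variable i in a.
msg = {(a, i): [random.uniform(-5, 5) for _ in range(K)] for a in factors for i in a}

def f(x):
    return sum(phi[i][x[i]] for i in nodes) + sum(
        psi[a][(x[a[0]], x[a[1]])] for a in factors
    )

def f_reparam(x):
    # Variable terms absorb the incoming messages ...
    node_terms = sum(
        phi[i][x[i]] + sum(msg[(a, i)][x[i]] for a in factors if i in a)
        for i in nodes
    )
    # ... and factor terms give the same messages back.
    factor_terms = sum(
        psi[a][(x[a[0]], x[a[1]])] - sum(msg[(a, k)][x[k]] for k in a)
        for a in factors
    )
    return node_terms + factor_terms

max_gap = max(
    abs(f(x) - f_reparam(x)) for x in itertools.product(range(K), repeat=len(nodes))
)
```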
The Splitting Reparameterization • Let c be a vector of non-zero reals • If c is a vector of positive integers, then we could view this as a factorization in two ways: • Over the same factor graph as the original potentials • Over a factor graph where each potential has been “split” into several pieces
The Splitting Reparameterization • [Figure: the original factor graph on variables 1, 2, 3, and the factor graph resulting from “splitting” each of the pairwise potentials 3 times]
The Splitting Reparameterization • Beliefs: • Reparameterization:
Lower Bounds • Can lower bound the objective function with these reparameterizations: • Find the collection of messages that maximize this lower bound • Lower bound is a concave function of the messages • Use coordinate ascent or subgradient methods
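The bound itself is missing from the slide, so the sketch below uses the simplest variant rather than the splitting form: rewrite f with plain messages (all splitting parameters equal to 1, an assumption) and minimize each term separately. Because a sum of minima never exceeds the minimum of the sum, every choice of messages gives a lower bound on min f. The triangle model and all names are hypothetical.

```python
import itertools
import random

random.seed(1)
K = 3
nodes = [0, 1, 2]
factors = [(0, 1), (1, 2), (0, 2)]  # hypothetical pairwise potentials

phi = {i: [random.uniform(-1, 1) for _ in range(K)] for i in nodes}
psi = {
    a: {xa: random.uniform(-1, 1) for xa in itertools.product(range(K), repeat=2)}
    for a in factors
}

def f(x):
    return sum(phi[i][x[i]] for i in nodes) + sum(
        psi[a][(x[a[0]], x[a[1]])] for a in factors
    )

f_min = min(f(x) for x in itertools.product(range(K), repeat=len(nodes)))

def lower_bound(msg):
    # Minimize each reparameterized term on its own and add the results.
    node_part = sum(
        min(
            phi[i][xi] + sum(msg[(a, i)][xi] for a in factors if i in a)
            for xi in range(K)
        )
        for i in nodes
    )
    factor_part = sum(
        min(
            psi[a][(xi, xj)] - msg[(a, a[0])][xi] - msg[(a, a[1])][xj]
            for xi in range(K)
            for xj in range(K)
        )
        for a in factors
    )
    return node_part + factor_part

def random_messages():
    return {(a, i): [random.uniform(-2, 2) for _ in range(K)] for a in factors for i in a}

zero_messages = {(a, i): [0.0] * K for a in factors for i in a}
bounds = [lower_bound(zero_messages)] + [lower_bound(random_messages()) for _ in range(20)]
```

Different messages give bounds of different quality, which is why the slide turns to maximizing the bound over the messages.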
Lower Bounds and the MAP LP • Equivalent to minimizing f • Dual provides a lower bound on f • Messages are a side-effect of certain dual formulations
The Splitting Algorithm • A local message passing algorithm for the splitting reparameterization • Contains the min-sum algorithm as a special case • For the integer case, can be derived from the min-sum update equations
The Splitting Algorithm • For certain choices of c, an asynchronous version of the splitting algorithm can be shown to be a block coordinate ascent scheme for the lower bound: • For example:
Coordinate Ascent • Guaranteed to converge • Does not necessarily maximize the lower bound • Can get stuck in a suboptimal configuration • Can be shown to converge to the maximum in restricted cases (e.g., pairwise-binary objective functions)
Other Ascent Schemes • Many other ascent algorithms are possible over different lower bounds: • TRW-S [Kolmogorov 2007] • MPLP [Globerson and Jaakkola 2007] • Max-Sum Diffusion [Werner 2007] • Norm-product [Hazan 2010] • Not all coordinate ascent schemes are local
Constructing the Solution • Construct an estimate, x*, of the optimal assignment from the beliefs by choosing $x^*_i \in \arg\min_{x_i} b_i(x_i)$ • For certain choices of the vector c, if each argmin is unique, then x* minimizes f • A simple choice of c guarantees both convergence and correctness (if the argmins are unique)
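A minimal sketch of this decoding step, assuming the beliefs are available as one list of values per variable. The function name `decode` and the tolerance `tol` are hypothetical; the point is that the estimate is read off coordinate by coordinate, and that ties in any belief should be reported because the guarantee above requires unique argmins.

```python
def decode(beliefs, tol=1e-9):
    """Pick an argmin of each belief and report whether every argmin is unique.

    beliefs[i][x] is the belief of variable i at value x.
    """
    estimate = []
    all_unique = True
    for b in beliefs:
        best = min(b)
        # Values within tol of the minimum are treated as tied.
        winners = [x for x, v in enumerate(b) if v - best <= tol]
        estimate.append(winners[0])
        all_unique = all_unique and len(winners) == 1
    return estimate, all_unique

unique_case = decode([[0.0, 1.0], [2.0, 0.5]])
tied_case = decode([[1.0, 1.0], [0.0, 3.0]])
```

When `all_unique` is false, the per-variable choices need not combine into a minimizing assignment, which is the situation the next slide discusses.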
Correctness • If the argmins are not unique, then we may not be able to construct a solution • When does the algorithm converge to the correct minimizing assignment?
Graph Covers • A graph H covers a graph G if there is a homomorphism from H to G that is a bijection on neighborhoods • [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]
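The covering condition can be checked mechanically. The sketch below assumes G is a triangle and takes the 6-cycle as its connected 2-cover, which may or may not be the exact cover drawn on the slide; the names `is_cover` and `project` are hypothetical. A map is accepted when it sends the neighbors of every node of H one-to-one onto the neighbors of its image in G.

```python
def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_cover(h_edges, g_edges, cover_map):
    """True if cover_map maps every neighborhood of H bijectively onto one of G."""
    h_adj, g_adj = adjacency(h_edges), adjacency(g_edges)
    for v, nbrs in h_adj.items():
        images = [cover_map(u) for u in nbrs]
        no_collisions = len(set(images)) == len(images)
        onto_neighborhood = set(images) == g_adj.get(cover_map(v), set())
        if not (no_collisions and onto_neighborhood):
            return False
    # Every node of G must be covered by some node of H.
    return {cover_map(v) for v in h_adj} == set(g_adj)

def project(v):
    # Copies 1', 2', 3' map to 1, 2, 3.
    return v.rstrip("'")

triangle = [("1", "2"), ("2", "3"), ("1", "3")]
six_cycle = [("1", "2"), ("2", "3"), ("3", "1'"), ("1'", "2'"), ("2'", "3'"), ("3'", "1")]
path = [("1", "2"), ("2", "3")]  # not a cover: the end nodes see only one neighbor

six_cycle_covers = is_cover(six_cycle, triangle, project)
path_covers = is_cover(path, triangle, project)
```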
Graph Covers • Potential functions are “lifts” of the nodes they cover • [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]
Graph Covers • The lifted potentials define a new objective function • Objective function: • 2-cover objective function
Graph Covers • Indistinguishability: for any cover and any choice of initial messages on the original graph, there exists a choice of initial messages on the cover such that the messages passed by the splitting algorithm are identical on both graphs • For choices of c that guarantee correctness, any assignment that uniquely minimizes each belief must also minimize the objective function corresponding to any finite cover
Maximum Weight Independent Set • [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]
Maximum Weight Independent Set • [Figure: graph G with node weights 2, 2, 5 and its 2-cover with the lifted weights]
Maximum Weight Independent Set • [Figure: graph G with node weights 2, 2, 3 and its 2-cover with the lifted weights]
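The two weightings can be compared by brute force. This is a sketch under assumptions: G is taken to be a triangle with weights 2, 2 and w3 on nodes 1, 2, 3, the 2-cover is taken to be the 6-cycle, and the function names are hypothetical. With w3 = 5 the best set on the cover is the lift of the best set on G; with w3 = 3 the cover has a heavier independent set that is not a lift of any set on G.

```python
import itertools

def best_independent_sets(nodes, edges, weight):
    """Brute-force search; returns (best value, list of all optimal sets)."""
    best_value, best_sets = -1, []
    for r in range(len(nodes) + 1):
        for subset in itertools.combinations(nodes, r):
            chosen = set(subset)
            if any(u in chosen and v in chosen for u, v in edges):
                continue  # not independent
            value = sum(weight[v] for v in chosen)
            if value > best_value:
                best_value, best_sets = value, [chosen]
            elif value == best_value:
                best_sets.append(chosen)
    return best_value, best_sets

def base(v):
    # Copies 1', 2', 3' on the cover correspond to 1, 2, 3 on G.
    return v.rstrip("'")

g_nodes = ["1", "2", "3"]
g_edges = [("1", "2"), ("2", "3"), ("1", "3")]
h_nodes = ["1", "2", "3", "1'", "2'", "3'"]
h_edges = [("1", "2"), ("2", "3"), ("3", "1'"), ("1'", "2'"), ("2'", "3'"), ("3'", "1")]

def compare(w3):
    g_weight = {"1": 2, "2": 2, "3": w3}
    h_weight = {v: g_weight[base(v)] for v in h_nodes}
    g_value, g_sets = best_independent_sets(g_nodes, g_edges, g_weight)
    h_value, h_sets = best_independent_sets(h_nodes, h_edges, h_weight)
    # The lift of a set on G contains every copy of each of its nodes.
    lifted = [{v for v in h_nodes if base(v) in s} for s in g_sets]
    cover_optima_are_lifts = all(s in lifted for s in h_sets)
    return g_value, h_value, cover_optima_are_lifts

heavy = compare(5)
light = compare(3)
```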
More Graph Covers • If covers of the factor graph have different solutions: the splitting algorithm cannot converge to the correct answer for choices of c that guarantee correctness; the min-sum algorithm may converge to an assignment that is optimal on a cover • There are applications for which the splitting algorithm always works (minimum cuts, shortest paths, and more…)
Graph Covers • Suppose f factorizes over a set with corresponding factor graph G and the choice of c guarantees correctness • Theorem: the splitting algorithm can only converge to beliefs that have unique argmins if • f is uniquely minimized at the assignment x* • The objective function corresponding to every finite cover H of G has a unique minimum that is a lift of x*
Graph Covers • This result suggests that • There is a close link between “good” factorizations and the difficulty of a problem • Convergent and correct algorithms are not ideal for all applications • Convex functions can be covered by functions that are not convex
Quadratic Minimization • $f(x) = \tfrac{1}{2} x^T \Gamma x - h^T x$ • $\Gamma$ symmetric positive definite implies a unique minimum • Minimized at $x^* = \Gamma^{-1} h$
Quadratic Minimization • For a positive definite matrix, min-sum convergence implies a correct solution: • Min-sum is not guaranteed to converge for all symmetric positive definite matrices
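A minimal sketch of min-sum on a quadratic, assuming the objective has the form ½ xᵀΓx − hᵀx (the symbols `gamma` and `h` below are assumptions, since the slide's formula did not survive). Each message is itself a quadratic in the receiving variable, so only two coefficients per message need to be tracked. The matrix is chosen to be diagonally dominant, a regime in which the iteration is known to converge; for general symmetric positive definite matrices it may not.

```python
def min_sum_quadratic(gamma, h, num_iters=100):
    """Synchronous min-sum for f(x) = 0.5 * x^T gamma x - h^T x.

    The message i -> j has the form 0.5 * a[(i, j)] * x_j**2 + b[(i, j)] * x_j.
    """
    n = len(h)
    nbrs = [[j for j in range(n) if j != i and gamma[i][j] != 0.0] for i in range(n)]
    a = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    b = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    for _ in range(num_iters):
        new_a, new_b = {}, {}
        for (i, j) in a:
            # Collect everything known about x_i except what came from j.
            quad = gamma[i][i] + sum(a[(k, i)] for k in nbrs[i] if k != j)
            lin = -h[i] + sum(b[(k, i)] for k in nbrs[i] if k != j)
            # Minimizing over x_i leaves a quadratic in x_j.
            new_a[(i, j)] = -gamma[i][j] ** 2 / quad
            new_b[(i, j)] = -gamma[i][j] * lin / quad
        a, b = new_a, new_b
    estimate = []
    for i in range(n):
        quad = gamma[i][i] + sum(a[(k, i)] for k in nbrs[i])
        lin = -h[i] + sum(b[(k, i)] for k in nbrs[i])
        estimate.append(-lin / quad)  # minimizer of the belief of x_i
    return estimate

# Hypothetical example: h is built so that the exact minimizer is (1, 2, 3).
gamma = [[4.0, 1.0, 1.0], [1.0, 4.0, 1.0], [1.0, 1.0, 4.0]]
x_true = [1.0, 2.0, 3.0]
h = [sum(gamma[i][j] * x_true[j] for j in range(3)) for i in range(3)]

x_est = min_sum_quadratic(gamma, h)
```

The factor graph here is a triangle, so the model has a cycle; the converged estimate still matches the exact minimizer, in line with the statement that convergence implies correctness for positive definite matrices.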