Sampling and Approximate Counting for Weighted Matchings

Roy Cagan

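Motivation – the permanent function
• The permanent of an n×n integer matrix A = (a_ij) is defined by: per A = Σ_{σ∈S_n} Π_{i=1}^n a_{i,σ(i)}, where S_n is the set of permutations of [n]
• Evaluating the permanent of a 0,1-matrix is #P-complete [Valiant, 1979]
• Target: finding an FPRAS (fully polynomial randomized approximation scheme) for the permanent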

Matchings in a graph
• A matching in a graph G = (V,E) is any subset of edges that are pairwise vertex-disjoint
• A matching is said to be perfect if it covers every vertex
• For a 0,1-matrix A, per A is equal to the number of perfect matchings in the bipartite graph G = (V1,V2,E), where V1 = V2 = [n] and (i,j) ∈ E iff a_ij = 1 [Jerrum et al. 1986]
• So: an approximation for the number of perfect matchings in a graph yields an approximation for the permanent function
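The correspondence between the permanent and perfect matchings can be checked by brute force on a tiny instance. The sketch below is illustrative only: the function names `permanent` and `count_perfect_matchings_bipartite` and the example matrix are assumptions made for this example, and both routines take exponential time.

```python
from itertools import permutations

def permanent(a):
    """Permanent of a square matrix by direct summation over all permutations."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= a[i][sigma[i]]
        total += prod
    return total

def count_perfect_matchings_bipartite(a):
    """Count perfect matchings of the bipartite graph with (i, j) in E iff a[i][j] == 1."""
    n = len(a)

    def extend(i, used):
        # Match left vertex i to any free right vertex j that is adjacent to it.
        if i == n:
            return 1
        return sum(extend(i + 1, used | {j})
                   for j in range(n) if a[i][j] == 1 and j not in used)

    return extend(0, frozenset())

# Hypothetical 3x3 example matrix.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
print(permanent(A), count_perfect_matchings_bipartite(A))
```

Both numbers agree because each permutation σ with a_{i,σ(i)} = 1 for all i is exactly one perfect matching.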

Strategy
• Weighted matchings: w(M) = λ^|M| (λ is a positive real parameter)
• The partition function of (weighted) matchings in G (a general graph with 2n vertices) is: Z_G(λ) = Σ_M λ^|M| = Σ_{k=0}^n m_k λ^k
• m_k is the number of k-matchings (matchings with k edges) in G
• Sampling a weighted matching in a general graph
• Our goal: building an FPRAS for Z_G
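For very small graphs the partition function can be computed exactly by enumerating all matchings, which gives a reference value for the approximation scheme. This is a minimal sketch assuming Z_G(λ) = Σ_k m_k λ^k; the helper names and the 4-cycle example are hypothetical.

```python
def matchings(edges):
    """Yield every matching of the graph, as a frozenset of edges."""
    def rec(i, chosen, covered):
        if i == len(edges):
            yield frozenset(chosen)
            return
        # Branch 1: leave edge i out of the matching.
        yield from rec(i + 1, chosen, covered)
        # Branch 2: take edge i if both endpoints are still free.
        u, v = edges[i]
        if u not in covered and v not in covered:
            yield from rec(i + 1, chosen + [edges[i]], covered | {u, v})
    yield from rec(0, [], frozenset())

def matching_counts(edges, n):
    """Return [m_0, ..., m_n], where m_k is the number of k-matchings."""
    m = [0] * (n + 1)
    for M in matchings(edges):
        m[len(M)] += 1
    return m

def partition_function(edges, n, lam):
    """Z_G(lam) = sum over k of m_k * lam**k."""
    return sum(mk * lam ** k for k, mk in enumerate(matching_counts(edges, n)))

# Hypothetical example: the 4-cycle on vertices 0..3 (so n = 2).
cycle4 = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(matching_counts(cycle4, 2), partition_function(cycle4, 2, 2))
```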

FPRAS for Z_G
• Approach: simulate a suitable Markov chain (MC), Mmatch(λ), parameterized on G and λ
• State space (Ω): the set of all matchings in G
• Stationary distribution: π_λ(M) = λ^|M| / Z_G(λ)

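Mmatch(λ) – transitions
• With probability ½, let M′ = M
• Otherwise:
  • select an edge e = {u,v} ∈ E u.a.r. and set
    M′ = M − e if e ∈ M (↓-transition)
    M′ = M + e if both u and v are unmatched in M (↑-transition)
    M′ = M + e − e′ if exactly one of u, v is matched in M and e′ is the matching edge (↔-transition)
    M′ = M otherwise
  • go to M′ with probability φ = min{1, π_λ(M′)/π_λ(M)}
  • (note: the ratio π_λ(M′)/π_λ(M) is λ^−1, λ or 1)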

Mmatch(λ) – transitions – Example
• ↑-transition: φ = min{1, λ}
• ↓-transition: φ = min{1, λ^−1}
• ↔-transition: φ = 1
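A single transition of the chain can be sketched as follows. This assumes the usual three move types (remove the chosen edge, add it when both endpoints are free, or slide when exactly one endpoint is matched) with the acceptance probabilities listed above; the function name `mmatch_step` and the representation of matchings as frozensets of vertex pairs are choices made for this illustration.

```python
import random

def mmatch_step(M, edges, lam, rng):
    """One transition of the chain from matching M (a frozenset of edges)."""
    if rng.random() < 0.5:
        return M                                  # lazy step: stay put
    e = rng.choice(edges)                         # edge chosen u.a.r.
    u, v = e
    matched = {x: f for f in M for x in f}        # vertex -> its matching edge
    if e in M:                                    # down-transition: remove e
        proposal, accept = M - {e}, min(1.0, 1.0 / lam)
    elif u not in matched and v not in matched:   # up-transition: add e
        proposal, accept = M | {e}, min(1.0, lam)
    elif (u in matched) != (v in matched):        # slide: swap e' for e
        old = matched[u] if u in matched else matched[v]
        proposal, accept = (M - {old}) | {e}, 1.0
    else:                                         # both endpoints matched elsewhere
        return M
    return proposal if rng.random() < accept else M

# Hypothetical usage on the 4-cycle, starting from the empty matching.
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
M = frozenset()
for _ in range(1000):
    M = mmatch_step(M, edges, 2.0, rng)
print(sorted(M))
```

Edges must be given in one fixed orientation (here as sorted tuples) so that membership tests against M work.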

Mmatch(λ) – continued
• Claim: Mmatch(λ) is irreducible
  • all states communicate via the empty matching
• Claim: Mmatch(λ) is aperiodic
  • by the first step (stay put with probability ½), the self-loop probabilities P(M,M) are all non-zero

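Mmatch(λ) – continued
• Claim: Mmatch(λ) is reversible with respect to π_λ
• Proof: for all M, M′ ∈ Ω we need to show π_λ(M) P(M,M′) = π_λ(M′) P(M′,M)
  • P(M,M′) = 0 iff P(M′,M) = 0, and in that case both sides vanish
  • Otherwise, there are 3 cases:
    1. |M| = |M′| + 1
    2. |M| = |M′| − 1
    3. |M| = |M′|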

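Mmatch(λ) – continued
• Assume w.l.o.g. that |M| = |M′| + 1 (case 1; case 2 is symmetric). Then π_λ(M) = λ π_λ(M′), P(M,M′) = min{1, λ^−1}/(2|E|) and P(M′,M) = min{1, λ}/(2|E|), so both sides equal π_λ(M′) min{1, λ}/(2|E|)
• In case 3, π_λ(M) = π_λ(M′) and P(M,M′) = P(M′,M) = 1/(2|E|)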
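Reversibility can also be verified exactly on a small graph by building the full transition matrix with rational arithmetic and testing detailed balance, λ^|M| P(M,M′) = λ^|M′| P(M′,M), for every pair of matchings. The sketch below assumes the standard add, remove and slide moves of Mmatch(λ); all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def all_matchings(edges):
    """All matchings of the graph, by brute force over edge subsets."""
    out = []
    for k in range(len(edges) + 1):
        for sub in combinations(edges, k):
            verts = [x for e in sub for x in e]
            if len(verts) == len(set(verts)):
                out.append(frozenset(sub))
    return out

def transition_matrix(edges, lam):
    """Exact transition probabilities P[M, N] of the lazy Metropolis chain."""
    states = all_matchings(edges)
    P = {(M, N): Fraction(0) for M in states for N in states}
    pick = Fraction(1, 2 * len(edges))            # 1/2 (not lazy) times 1/|E|
    for M in states:
        P[M, M] += Fraction(1, 2)                 # lazy self-loop
        matched = {x: f for f in M for x in f}
        for e in edges:
            u, v = e
            if e in M:
                N, acc = M - {e}, min(Fraction(1), 1 / lam)
            elif u not in matched and v not in matched:
                N, acc = M | {e}, min(Fraction(1), lam)
            elif (u in matched) != (v in matched):
                old = matched[u] if u in matched else matched[v]
                N, acc = (M - {old}) | {e}, Fraction(1)
            else:
                N, acc = M, Fraction(1)
            P[M, N] += pick * acc                 # accepted move
            P[M, M] += pick * (1 - acc)           # rejected move
    return states, P

def is_reversible(edges, lam):
    """Check detailed balance with respect to weights lam**|M|."""
    states, P = transition_matrix(edges, lam)
    return all(lam ** len(M) * P[M, N] == lam ** len(N) * P[N, M]
               for M in states for N in states)

print(is_reversible([(0, 1), (1, 2), (2, 3), (0, 3)], Fraction(3, 2)))
```

Using `Fraction` avoids floating-point tolerance issues, so the equality test is exact.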

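Approximating Z(λ)
• Express Z(λ) as the product: Z(λ) = Z(λ_r)/Z(λ_{r−1}) × Z(λ_{r−1})/Z(λ_{r−2}) × … × Z(λ_1)/Z(λ_0) × Z(λ_0)
• Where 0 = λ_0 < λ_1 < … < λ_r = λ
• Note: Z(λ_0) = Z(0) = 1
• We will use λ_1 = (2|E|)^−1 and λ_i = (1+1/n)^{i−1} λ_1 for 1 ≤ i < r
• The length r of the sequence is taken to be minimal s.t.: (1+1/n)^{r−1} λ_1 ≥ λ
• Note: this gives r = O(n ln(|E|λ))
• Goal: estimate each ratio Z(λ_i)/Z(λ_{i−1})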
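The sequence of intermediate parameters can be generated as below. This is a sketch under the assumption that the schedule is geometric with λ_1 = 1/(2|E|) and ratio 1 + 1/n, which is consistent with the later remark that Z_i = (1+1/n)^−|M|; the function name `lambda_schedule` is hypothetical.

```python
def lambda_schedule(n, num_edges, lam):
    """Return [lam_1, ..., lam_r] with lam_r = lam and consecutive ratio <= 1 + 1/n."""
    growth = 1.0 + 1.0 / n
    current = 1.0 / (2 * num_edges)       # assumed starting value lam_1
    seq = []
    while current < lam:                  # lam_1, ..., lam_{r-1} stay below lam
        seq.append(current)
        current *= growth
    seq.append(lam)                       # the last value is the target itself
    return seq

# Hypothetical example: a graph with 2n = 4 vertices, 4 edges, target lam = 1.
schedule = lambda_schedule(2, 4, 1.0)
print(len(schedule), schedule)
```

Because each step multiplies λ by at most 1 + 1/n and a matching has at most n edges, every sample value (λ_{i−1}/λ_i)^|M| for i ≥ 2 lies in [e^−1, 1].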

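Approximating Z(λ) – continued
• Define the random variable Z_i(M) = (λ_{i−1}/λ_i)^|M|, where M is a matching drawn from π_{λ_i}
• Then: E[Z_i] = Σ_M (λ_{i−1}/λ_i)^|M| · λ_i^|M| / Z(λ_i) = Z(λ_{i−1})/Z(λ_i)
• Thus estimating the ratio Z(λ_{i−1})/Z(λ_i) can be done by sampling matchings from the distribution π_{λ_i} and computing the sample mean of Z_i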

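The algorithm – MatchSample
• Step 1: Compute the sequence λ_1 = (2|E|)^−1 < λ_2 < … < λ_r = λ, where λ_i = (1+1/n)^{i−1} λ_1 for 1 ≤ i < r and r is the least integer s.t. (1+1/n)^{r−1} λ_1 ≥ λ
• Step 2: For each value λ_i (i = 1, …, r) in turn, compute an estimate X_i of the ratio ρ_i = Z(λ_{i−1})/Z(λ_i), with λ_0 = 0:
  • Obtain an independent sample of size S from (close to) the distribution π_{λ_i} by performing S independent simulations of the Markov chain Mmatch(λ_i), each of length T_i
  • Let X_i be the sample mean of the quantity (λ_{i−1}/λ_i)^|M|
• Step 3: Output the product Y = Π_{i=1}^r X_i^−1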
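The three steps can be put together in a compact simulation. This is a rough sketch, not a tuned implementation: it assumes the geometric schedule starting at 1/(2|E|), uses one fixed simulation length T for every stage instead of stage-dependent T_i, and picks S and T by hand rather than from the bounds in the propositions. The names `mmatch_step` and `match_sample` are hypothetical.

```python
import random

def mmatch_step(M, edges, lam, rng):
    """One lazy Metropolis transition on matchings (frozensets of edges)."""
    if rng.random() < 0.5:
        return M
    e = rng.choice(edges)
    u, v = e
    matched = {x: f for f in M for x in f}
    if e in M:
        proposal, accept = M - {e}, min(1.0, 1.0 / lam)
    elif u not in matched and v not in matched:
        proposal, accept = M | {e}, min(1.0, lam)
    elif (u in matched) != (v in matched):
        old = matched[u] if u in matched else matched[v]
        proposal, accept = (M - {old}) | {e}, 1.0
    else:
        return M
    return proposal if rng.random() < accept else M

def match_sample(edges, n, lam, S, T, rng):
    """Estimate Z_G(lam) as the product of inverse sample means X_i."""
    # Step 1: schedule lam_1 < ... < lam_r = lam (assumed geometric).
    schedule, cur = [], 1.0 / (2 * len(edges))
    while cur < lam:
        schedule.append(cur)
        cur *= 1.0 + 1.0 / n
    schedule.append(lam)
    # Steps 2 and 3: estimate each ratio and accumulate the product.
    Y, prev = 1.0, 0.0                      # prev plays the role of lam_0 = 0
    for lam_i in schedule:
        total = 0.0
        for _ in range(S):                  # S independent simulations
            M = frozenset()
            for _ in range(T):              # each of (fixed) length T
                M = mmatch_step(M, edges, lam_i, rng)
            total += (prev / lam_i) ** len(M)
        Y /= total / S                      # multiply by X_i^-1
        prev = lam_i
    return Y

# Hypothetical usage: the 4-cycle, whose exact value is Z(1) = 1 + 4 + 2 = 7.
cycle4 = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(match_sample(cycle4, 2, 1.0, 400, 80, random.Random(1)))
```

In the first stage the sample value is 0^|M|, which Python evaluates to 1 for the empty matching and 0 otherwise, so X_1 estimates 1/Z(λ_1) as the fraction of empty samples.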

MatchSample – Sample size S
• Proposition 1: If the simulation length T_i in Step 2 is large enough that the variation distance of Mmatch(λ_i) from its stationary distribution is at most δ = ε/(3er), then for a sample size S = O(ln(r)∙r^2∙ε^−2), the output random variable Y satisfies: Pr[(1−ε) Z(λ) ≤ Y ≤ (1+ε) Z(λ)] ≥ 3/4
• Proof:
  • Let X_ij be the value of Z_i(M) for the j'th sample in round i
  • So X_i is the mean of X_ij for j = 1..S
  • Note: e^−1 ≤ X_ij ≤ 1 (since Z_i = (1+1/n)^−|M| and |M| ≤ n)
  • Let … Then: …
  • So, for … we get …
  • Therefore, with probability at least … (union bound on the r variables X_i), we get: …

MatchSample – Sample size S
• So, a modest sample size at each stage (polynomial in n and in ε^−1) suffices to ensure a good final estimate Y, provided of course that the samples come from a distribution that is close enough to the stationary distribution
• We still need to show that reaching a distribution close enough to the stationary distribution does not require T_i to be too big, i.e. that the mixing time of Mmatch(λ_i) is not too big

The mixing time of Mmatch(λ_i)
• Proposition 2: The mixing time of the Markov chain Mmatch(λ) satisfies τ_X(ε) ≤ 4|E|nλ′ (n(ln n + ln λ′) + ln ε^−1), where λ′ = max{1, λ}
• So, we can sample from (close to) the complex distribution π_λ over the exponentially large space of weighted matchings in G, by performing a Markov chain simulation of length only a low-degree polynomial in the size of G

MatchSample – FPRAS for Z(λ)
• Theorem: Algorithm MatchSample is an FPRAS for Z(λ)
• Proof:
  • Proposition 1 ensures that the output of Algorithm MatchSample satisfies the requirements of an FPRAS for Z
  • The running time is dominated by the number of Markov chain simulation steps, which is Σ_{i=1}^r S·T_i; since T_i increases with i, this is at most r·S·T_r
  • Substituting the upper bounds for r, S and T_r, we see that the overall running time of the algorithm is bounded by a quantity that grows only polynomially with n, λ′ and ε^−1

Proof of proposition 2
• Reminders (Ilan, Omri):
  • if M is reversible, then τ_X(ε) ≤ ρ̄ (ln π(X)^−1 + ln ε^−1), where ρ̄ = max_e (1/Q(e)) Σ_{γ_XY ∋ e} π(X)π(Y)|γ_XY| is the bottleneck measure of a collection of canonical paths γ_XY and Q(e) = π(M)P(M,M′) for a transition e = (M,M′)
• Strategy:
  • choose a collection of canonical paths in the Markov chain Mmatch(λ) for which the "bottleneck" measure ρ̄ is small
  • Specifically, we shall show that our paths satisfy ρ̄ ≤ 4|E|nλ′
  • Since the number of matchings in G is certainly bounded above by (2n)!, the stationary probability π_λ(X) of any matching X is bounded below by π_λ(X) ≥ 1/((2n)! λ′^n)
  • Using the bound on ρ̄ and the fact that ln n! ≤ n ln n, we get the bound on the mixing time claimed in Proposition 2

Proof of proposition 2 – the paths
• For a pair of matchings X, Y in G, we define the canonical path γ_XY as follows:
  • Consider the symmetric difference X ⊕ Y
  • This consists of a disjoint collection of paths in G (some of which may be closed cycles), each of which has edges that belong alternately to X and to Y
  • Fix some arbitrary ordering on all simple paths in G, and designate in each of them a so-called "start vertex", which is arbitrary if the path is a closed cycle but must be an endpoint otherwise
  • This ordering induces a unique ordering P_1, P_2, …, P_m on the paths appearing in X ⊕ Y
  • The canonical path from X to Y involves "unwinding" each of the P_i in turn
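The decomposition of X ⊕ Y into alternating paths and cycles is easy to compute directly. The sketch below is only meant to illustrate the structure that the canonical paths are built from; the function name and the example matchings are hypothetical, and components are ordered by smallest vertex rather than by a fixed ordering on all simple paths of G.

```python
def symmetric_difference_components(X, Y):
    """Split X xor Y into connected components; flag each as cycle or open path."""
    diff = set(X) ^ set(Y)
    adj = {}                                  # vertex -> incident edges of X xor Y
    for e in diff:
        for x in e:
            adj.setdefault(x, []).append(e)
    seen, comps = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, verts, edges = [start], set(), set()
        while stack:                          # depth-first search of one component
            x = stack.pop()
            if x in verts:
                continue
            verts.add(x)
            for e in adj[x]:
                edges.add(e)
                stack.extend(w for w in e if w not in verts)
        seen |= verts
        # Every vertex has degree at most 2, so degree 2 everywhere means a cycle.
        is_cycle = all(len(adj[x]) == 2 for x in verts)
        comps.append((sorted(edges), is_cycle))
    return comps

# Hypothetical example: one alternating 4-cycle, one open path, one shared edge.
X = {(0, 1), (2, 3), (4, 5), (7, 8)}
Y = {(1, 2), (0, 3), (5, 6), (7, 8)}
for comp_edges, is_cycle in symmetric_difference_components(X, Y):
    print("cycle" if is_cycle else "path", comp_edges)
```

The shared edge (7, 8) belongs to X ∩ Y and therefore does not appear in any component.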

Proof of proposition 2 – the paths
• There are two cases to consider:
  • P_i is not a cycle
    • If P_i starts with an X-edge, remove it (↓-transition)
    • Perform a sequence of ↔-transitions (each removes an X-edge and inserts a Y-edge)
    • If P_i's length is odd (we remain with one Y-edge), insert it (↑-transition)
  • P_i is a cycle
    • Let P_i = (v_0, v_1, …, v_{2l+1})
    • v_0 is the start vertex, {v_{2j}, v_{2j+1}} ∈ X for 0 ≤ j ≤ l, and the remaining edges belong to Y
    • Remove (v_0, v_1), an X-edge (↓-transition)
    • We are left with an open path with endpoints v_0, v_1, one of which must be the start vertex of that path
    • Continue as above, but:
      • If v_0 is the start vertex, use v_1 as the start vertex
      • If v_1 is the start vertex, use v_0 as the start vertex
    • (This trick serves to distinguish paths from cycles)

Example: matchings X and Y and their symmetric difference X ⊕ Y (figure)

Unwinding a path (figure); transitions shown: ↔-transition, ↓-transition

Unwinding a cycle (figure); transitions shown: ↓-transition, ↔-transition, ↔-transition, ↑-transition

The full path

Bounding the bottleneck measure
• Let ε be an arbitrary edge in the Markov chain, i.e., a transition from M to M′ ≠ M
• Let cp(ε) = {(X,Y) : γ_XY ∋ ε} denote the set of all canonical paths that use ε
• Obtain a bound on the total weight of all paths that pass through ε by defining an injective mapping σ_ε : cp(ε) → Ω
• More precisely, it is defined by:
  • σ_ε(X,Y) = X ⊕ Y ⊕ (M ∪ M′) − e_XYt, if ε is a ↔-transition and the current path is a cycle
  • σ_ε(X,Y) = X ⊕ Y ⊕ (M ∪ M′), otherwise
  • e_XYt is the edge of X adjacent to the start vertex of the path currently being unwound

Example

Bounding the bottleneck measure
• Claim 1: σ_ε(X,Y) is always a matching
• Proof:
  • Idea: show that no vertex degree exceeds 1
  • Step 1: no vertex degree exceeds 2
    • Let A = X ⊕ Y ⊕ (M ∪ M′)
    • Since M ∪ M′ ⊆ X ∪ Y it follows that A ⊆ X ∪ Y and therefore no vertex degree can exceed 2
  • Step 2:
    • Suppose that some vertex u has degree 2 in A
    • Then A contains edges {u, v1}, {u, v2} and since A ⊆ X ∪ Y, one of these edges must belong to X and the other to Y
    • Hence both edges belong to X ⊕ Y, so neither can belong to M ∪ M′
    • Since all edges of P_1…P_{i−1} agree with Y and all edges of P_{i+1}…P_m agree with X, these edges must belong to P_i
    • Inside P_i, again, they can't be edges that have already been unwound, as they agree with Y, and can't be edges that were not unwound yet, as they agree with X
    • The only option left is that they are the "first" and "last" edges of the cycle, and this is not possible since we removed e_XYt from A

Bounding the bottleneck measure
• Claim 2: σ_ε is injective
• Proof:
  • the symmetric difference X ⊕ Y can be recovered from σ_ε(X,Y) using:
    • X ⊕ Y = (σ_ε(X,Y) + e_XYt) ⊕ (M ∪ M′), if ε is a ↔-transition and the current path is a cycle
    • X ⊕ Y = σ_ε(X,Y) ⊕ (M ∪ M′), otherwise
  • Note: Once we have formed the set σ_ε(X,Y) ⊕ (M ∪ M′), it will be apparent whether the current path is a cycle from the sense of unwinding (did it start from the start vertex or from the end vertex)
  • Given X ⊕ Y, we can find the paths P_1,...,P_m, and ε tells us which of these, P_i say, is the path currently being unwound
  • The partition of X ⊕ Y into X and Y is now straightforward:
    • X equals σ_ε(X,Y) on the paths already unwound and equals M on the paths not yet unwound; this gives all the edges of X which don't belong to Y
    • Similarly, we can separate Y from X ⊕ Y
    • And to complete X and Y, we add X ∩ Y = M − (X ⊕ Y)
  • Hence X and Y can be uniquely recovered from σ_ε(X,Y), so σ_ε is injective

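Bounding the bottleneck measure
• Claim 3: σ_ε is "weight-preserving": π_λ(X)π_λ(Y) ≤ 2|E|λ′^2 Q(ε) π_λ(σ_ε(X,Y))
• Proof:
  • First, note that Q(ε) = π_λ(M)P(M,M′) = (1/(2|E|)) min{π_λ(M), π_λ(M′)}
  • since P(M,M′) = (1/(2|E|)) min{1, π_λ(M′)/π_λ(M)}
  • We now distinguish four cases:
    1. ε is a ↓-transition: Suppose M′ = M − e. Then σ_ε(X,Y) = X ⊕ Y ⊕ M, so, viewed as multisets, M ∪ σ_ε(X,Y) and X ∪ Y are identical. Hence we have: π_λ(X)π_λ(Y) = π_λ(M)π_λ(σ_ε(X,Y)) = 2|E| max{1, λ} Q(ε) π_λ(σ_ε(X,Y))
    2. ε is an ↑-transition: The same as (1), with the roles of M and M′ interchanged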

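Bounding the bottleneck measure
    3. ε is a ↔-transition and the current path is a cycle: Suppose M′ = M + e − e′, and consider the multiset M ∪ σ_ε(X,Y). Then σ_ε(X,Y) = X ⊕ Y ⊕ (M + e) − e_XYt, so the multiset M ∪ σ_ε(X,Y) differs from X ∪ Y only in that e and e_XYt are missing from it. Note that in this case π_λ(M) = π_λ(M′), and therefore Q(ε) = π_λ(M)/(2|E|). Thus we have: π_λ(X)π_λ(Y) = λ^2 π_λ(M)π_λ(σ_ε(X,Y)) = 2|E|λ^2 Q(ε) π_λ(σ_ε(X,Y))
    4. ε is a ↔-transition and the current path is not a cycle: This is identical with (3), except that the edge e_XYt does not appear in the analysis. Accordingly, the bound is π_λ(X)π_λ(Y) = 2|E|λ Q(ε) π_λ(σ_ε(X,Y))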

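Bounding the bottleneck measure
• For every transition ε: (1/Q(ε)) Σ_{γ_XY ∋ ε} π_λ(X)π_λ(Y)|γ_XY| ≤ 2|E|λ′^2 Σ_{γ_XY ∋ ε} π_λ(σ_ε(X,Y))|γ_XY| ≤ 4|E|nλ′^2 Σ_{γ_XY ∋ ε} π_λ(σ_ε(X,Y)) ≤ 4|E|nλ′^2
• The first inequality is claim 3
• The second inequality follows from the fact that the length of any canonical path is bounded by 2n
• The last inequality follows from claim 2 and from the fact that π_λ is a probability distribution
• The factor λ′^2 can be easily improved to λ′ to reach the claimed bound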

Summary (so far)
• The canonical paths and the bound on the bottleneck measure lead to the proof of proposition 2, the bound on the mixing time of Mmatch(λ)
• Proposition 2, together with proposition 1, proves that the algorithm MatchSample is indeed an FPRAS for Z_G
• So, until now, we showed that we can sample weighted matchings from π_λ and approximate Z_G
• Now, we will use these results to approximate the number of perfect matchings in G

Perfect Matchings
• Let G = (V,E) be a graph with |V| = 2n
• Perfect matchings: matchings which cover every vertex in the graph
• Near-perfect matchings: matchings with exactly two unmatched vertices
• Note:
  • the number of perfect matchings is m_n
  • the number of near-perfect matchings is m_{n−1}
• Theorem: There exists a randomized approximation scheme for the number of perfect matchings m_n whose running time is polynomial in n, ε^−1 and the ratio m_{n−1}/m_n
• This is not in general an FPRAS, since there exist 2n-vertex graphs for which the ratio m_{n−1}/m_n is exponential in n
• However, the probability that a randomly selected G on 2n vertices violates the inequality m_{n−1}/m_n ≤ 4n tends to 0 as n → ∞
• Thus, the above algorithm constitutes an FPRAS for almost all graphs

Perfect Matchings – continued
• Idea: get a good estimator for m_n by sampling matchings from the distribution π_λ and computing the proportion X of perfect matchings in the sample; its expectation is the ratio between the perfect matchings' weight m_n λ^n and the total weight Z_G(λ)
• Suppose that we have computed a good estimate Z′_G(λ) of Z_G(λ)
  • We showed this can be done in time polynomial in n and λ′
• Since E[X] = m_n λ^n / Z_G(λ), our estimator for m_n will be Y = X λ^−n Z′_G(λ)
• The sample size required to ensure a good estimate depends on the variance of a single sample, or more precisely on the quantity (E[X])^−1
• By making λ large enough, we can make this quantity small
  • This corresponds to placing very large weight on the perfect matchings
• We will next show that, for a suitable λ, (E[X])^−1 ≤ n+1 and therefore the sample size required grows only linearly with n

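The log-concave lemma
• Definition: (a_0, a_1, …, a_n) is log-concave if a_{k−1} a_{k+1} ≤ (a_k)^2 for k = 1, 2, …, n−1
• Note: if a sequence of positive numbers is log-concave, then: a_{k−1}/a_k ≤ a_k/a_{k+1} for all k, and therefore: a_k/a_n ≤ (a_{n−1}/a_n)^{n−k}
• Lemma: The sequence m_0, m_1, …, m_n is log-concave
• As noted, it follows that m_k/m_n ≤ (m_{n−1}/m_n)^{n−k}
• This means that, if we take λ ≥ m_{n−1}/m_n, we get: m_k λ^k ≤ m_n λ^n for 0 ≤ k ≤ n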

Perfect Matchings – continued
• Using m_k λ^k ≤ m_n λ^n for all k, we get: (E[X])^−1 = Z_G(λ)/(m_n λ^n) = Σ_{k=0}^n m_k λ^k / (m_n λ^n) ≤ n+1
• So, the sample size required grows only linearly with n
• Since the time required to generate a single sample grows linearly with λ and ε^−1 (proposition 2), the running time of the overall algorithm is polynomial in n, ε^−1 and the ratio m_{n−1}/m_n as claimed
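Both the log-concavity of the matching counts and the bound (E[X])^−1 ≤ n+1 can be checked exactly on a small example. The sketch below uses brute-force enumeration and rational arithmetic; the function names and the 6-cycle example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def matching_counts(edges, n):
    """m[k] = number of k-matchings, by brute force over k-subsets of edges."""
    m = [0] * (n + 1)
    for k in range(n + 1):
        for sub in combinations(edges, k):
            verts = [x for e in sub for x in e]
            if len(verts) == len(set(verts)):
                m[k] += 1
    return m

def is_log_concave(a):
    """Check a[k-1] * a[k+1] <= a[k]**2 for every interior index k."""
    return all(a[k - 1] * a[k + 1] <= a[k] ** 2 for k in range(1, len(a) - 1))

def inverse_perfect_weight(m, lam):
    """(E[X])^-1 = Z_G(lam) / (m_n * lam**n), computed exactly."""
    n = len(m) - 1
    Z = sum(mk * lam ** k for k, mk in enumerate(m))
    return Z / (m[n] * lam ** n)

# Hypothetical example: the 6-cycle (2n = 6 vertices, so n = 3).
cycle6 = [(i, (i + 1) % 6) for i in range(6)]
m = matching_counts(cycle6, 3)
lam = Fraction(m[2], m[3])                 # lam = m_{n-1} / m_n
print(m, is_log_concave(m), inverse_perfect_weight(m, lam))
```

Choosing λ exactly equal to m_{n−1}/m_n is the smallest value covered by the argument; any larger λ only increases the weight on perfect matchings.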

The log-concave lemma – proof
• Let M_k = M_k(G) be the set of k-matchings of G
• Thus m_k = |M_k(G)|
• We need to show that m_{k−1} m_{k+1} ≤ m_k^2, and so we can assume that m_{k+1} > 0
• Let A = M_{k+1} × M_{k−1} and B = M_k × M_k, so we need to show |A| ≤ |B|
• If M, M′ are matchings, we know that M ⊕ M′ consists of paths and cycles
• Let a path of M ⊕ M′ be an
  • M-path: if it contains more M-edges than M′-edges
  • M′-path: if the reverse is true
• For any pair (M,M′) ∈ A, the number of M-paths exceeds the number of M′-paths by exactly two

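The log-concave lemma – proof
• We partition A into disjoint classes A_r (r = 1, 2, ..., k), where A_r = {(M,M′) ∈ A : M ⊕ M′ contains exactly r+1 M-paths}
• Similarly, we partition B into: B_r = {(M,M′) ∈ B : M ⊕ M′ contains exactly r M-paths}
• We will now show that |A_r| ≤ |B_r| for all r > 0, and therefore |A| ≤ |B|
• Let us call a pair (L,L′) ∈ B reachable from (M,M′) ∈ A iff L ⊕ L′ = M ⊕ M′ and L = M ⊕ P for some M-path P of M ⊕ M′
• The number of elements of B_r reachable from a given (M,M′) ∈ A_r is r+1
• Conversely, any given element of B_r is reachable from precisely r elements of A_r
• Hence if |A_r| > 0 we have |A_r|/|B_r| = r/(r+1) < 1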

Summary
• We showed that we can sample weighted matchings from π_λ and approximate Z_G
• We used these results to show that there exists a randomized approximation scheme for the number of perfect matchings, whose running time is polynomial in n, ε^−1 and the ratio m_{n−1}/m_n
