Much Faster Algorithms for Matrix Scaling

Presentation Transcript


  1. Much Faster Algorithms for Matrix Scaling: Matrix Scaling and Balancing via Box-Constrained Newton's Method and Interior Point Methods. Zeyuan Allen-Zhu, Yuanzhi Li, Rafael Oliveira, Avi Wigderson. Michael Cohen, Aleksander Mądry, Dimitris Tsipras, Adrian Vladu

  2. Matrix Scaling and Matrix Balancing
  • Matrix Scaling: given a nonnegative matrix A and target vectors r, c, find positive diagonal matrices X, Y such that M = X A Y satisfies M 1 = r and Mᵀ1 = c
  • Matrix Balancing: find a positive diagonal matrix X such that M = X A X⁻¹ satisfies M 1 = Mᵀ1
  • (The slide illustrates both with a small 2×2 example.)
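
As a concrete illustration of the two definitions (not on the slides; the numerical data below is hypothetical), here is a minimal NumPy sketch: the scaling part just exhibits the notation M = XAY with its induced row and column sums, and the balancing part solves the 2×2 case in closed form.

```python
import numpy as np

# Hypothetical 2x2 data; not the slide's example.
A = np.array([[1.0, 2.0],
              [0.5, 1.0]])

# Matrix scaling: M = X A Y; its row sums are r = M 1 and column sums are c = M^T 1.
X = np.diag([1.0, 0.5])          # hypothetical diagonal scalings
Y = np.diag([0.4, 0.8])
M = X @ A @ Y
print("r =", M.sum(axis=1), " c =", M.sum(axis=0))

# Matrix balancing: M = X A X^{-1} with X = diag(exp(x)) should satisfy M 1 = M^T 1.
# For a 2x2 matrix only the off-diagonal entries move, so balance requires
# A12 * exp(x1 - x2) = A21 * exp(x2 - x1), i.e. x1 - x2 = 0.5 * ln(A21 / A12).
x = np.array([0.5 * np.log(A[1, 0] / A[0, 1]), 0.0])
Mb = np.diag(np.exp(x)) @ A @ np.diag(np.exp(-x))
print("row sums:", Mb.sum(axis=1), " col sums:", Mb.sum(axis=0))   # equal
```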

  3. Why Care?
  • Preconditioning linear systems: A z = b becomes (XAY) Y⁻¹z = X b
  • Approximating the permanent of nonnegative matrices: Per(A) = Per(XAY) / (Per(X) Per(Y)), and if XAY is doubly stochastic then exp(-n) ≤ Per(XAY) ≤ 1
  • Detecting perfect matchings: for A the adjacency matrix of a bipartite graph, a perfect matching exists ⇔ Per(A) ≠ 0
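
To make the permanent identity on this slide concrete, here is a tiny brute-force check (hypothetical 3×3 data; `permanent` is a throwaway helper, not a library routine). For diagonal X, Y the permanent factors, which rearranges to the formula on the slide.

```python
import numpy as np
from itertools import permutations

def permanent(B):
    """Brute-force permanent; only sensible for tiny matrices."""
    n = B.shape[0]
    return sum(np.prod([B[i, p[i]] for i in range(n)]) for p in permutations(range(n)))

A = np.array([[0.2, 0.5, 0.3],
              [0.7, 0.1, 0.9],
              [0.4, 0.6, 0.8]])      # hypothetical nonnegative matrix
X = np.diag([2.0, 0.5, 1.5])         # hypothetical diagonal scalings
Y = np.diag([0.8, 1.2, 0.4])

# Per(X A Y) = Per(X) * Per(A) * Per(Y) for diagonal X, Y, so
# Per(A) = Per(X A Y) / (Per(X) * Per(Y)) as on the slide.
print(permanent(X @ A @ Y), permanent(X) * permanent(A) * permanent(Y))
```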

  4. Why Care? • Intensively studied in scientific computing literature • [Wilkinson ’59], [Osborne ’60], [Sinkhorn’64], [Parlett, Reinsch’69], [Kalantari, Khachiyan’15], [Schulman, Sinclair ’15], … • Matrix balancing routines implemented in MATLAB, R • Generalizations (operator scaling) are related to Polynomial Identity Testing • [Gurvits’04],[Garg, Gurvits, Oliveira, Wigderson’17] , …

  5. Generalized Matrix Balancing Via Convex Optimization
  • Captures the problem's difficulty; solves matrix scaling via a simple reduction
  • M = exp(X) A exp(-X), with row sums r_M = M 1 and column sums c_M = Mᵀ1; goal: r_M - c_M = d (d = 0 is plain balancing)
  • f(x) = ∑_ij A_ij exp(x_i - x_j) - ∑_i d_i x_i is a nice convex function with ∇f(x) = r_M - c_M - d
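
A minimal NumPy sketch of this potential, assuming the conventions above (X = diag(x), M = exp(X) A exp(-X)); it evaluates f and confirms numerically that ∇f(x) = r_M - c_M - d. The random data is hypothetical.

```python
import numpy as np

def f_and_grad(A, d, x):
    """f(x) = sum_ij A_ij exp(x_i - x_j) - sum_i d_i x_i ;  grad f(x) = r_M - c_M - d."""
    M = np.exp(x)[:, None] * A * np.exp(-x)[None, :]   # M = exp(X) A exp(-X)
    r, c = M.sum(axis=1), M.sum(axis=0)
    return M.sum() - d @ x, r - c - d

rng = np.random.default_rng(0)
n = 4
A = rng.random((n, n))           # hypothetical nonnegative matrix
d = np.zeros(n)                  # d = 0 is plain matrix balancing
x = rng.standard_normal(n)

_, g = f_and_grad(A, d, x)
eps = 1e-6
g_fd = np.array([(f_and_grad(A, d, x + eps * e)[0] - f_and_grad(A, d, x - eps * e)[0]) / (2 * eps)
                 for e in np.eye(n)])
print(np.max(np.abs(g - g_fd)))  # agreement up to finite-difference error
```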

  6. Equivalent Nonlinear Flow Problem
  • "Nonlinear Ohm's Law": f_uv = A_uv exp(x_u - x_v); classical Ohm's Law: f_uv = A_uv (x_u - x_v)
  • (The slide shows a small s-t flow network illustrating both laws; edge weights play the role of capacitances.)

  7. Generalized Matrix Balancing Via Convex Optimization
  • Captures the difficulty of both problems; solves matrix scaling via a simple reduction
  • M = exp(X) A exp(-X), r_M = M 1, c_M = Mᵀ1
  • Exact goal: r_M - c_M = d; approximate goal: |r_M - c_M - d| ≤ ε
  • f(x) = nice convex function with ∇f(x) = r_M - c_M - d

  8. Generalized Matrix Balancing Via Convex Optimization
  • f(x) = nice convex function, ∇f(x) = r_M - c_M - d
  • General convex optimization framework: f(x + Δ) = f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ + …
  • First-order methods: Δ = argmin of the linear model over a small region; second-order methods: Δ = argmin of the quadratic model over a small region
  • Sinkhorn/Osborne iterations are instantiations of this framework (coordinate descent)
  • Previous bounds for matrix balancing: first-order [Ostrovsky, Rabani, Yousefi '17] O(m + nε⁻²); second-order [Kalantari, Khachiyan, Shokoufandeh '97] Õ(n⁴ log ε⁻¹)
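
For context, the classical Sinkhorn alternation for the doubly stochastic case (r = c = 1), written as a few lines of NumPy; this is the coordinate-descent baseline the slide refers to, not the faster methods of these papers. The example matrix is hypothetical.

```python
import numpy as np

def sinkhorn(A, iters=200):
    """Alternately fix row sums and column sums to 1; returns diagonal scalings (x, y)."""
    n = A.shape[0]
    x, y = np.ones(n), np.ones(n)
    for _ in range(iters):
        x = 1.0 / (A @ y)        # make row sums of diag(x) A diag(y) equal to 1
        y = 1.0 / (A.T @ x)      # make column sums equal to 1
    return x, y

A = np.random.default_rng(1).random((5, 5)) + 0.1   # hypothetical strictly positive matrix
x, y = sinkhorn(A)
M = np.diag(x) @ A @ np.diag(y)
print(M.sum(axis=1), M.sum(axis=0))                  # both close to the all-ones vector
```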

  9. Our Results
  • A new second-order framework, the Box-Constrained Newton Method (essentially identical in both papers)
  • [AZLOW '17]: first-order, Accelerated Gradient Descent, O(m·n^(1/3)·ε^(-2/3)); second-order, Box-Constrained Newton Method, Õ((m + n^(4/3)) log κ(X*))
  • [CMTV '17]: second-order, Interior Point Method, Õ(m^(3/2) log ε⁻¹); Box-Constrained Newton Method, Õ(m log κ(X*))
  • κ(X*) = condition number of the matrix that yields perfect balancing

  10. Generalized Matrix Balancing Via Convex Optimization
  • f(x) = nice convex function; ∇f(x) = r_M - c_M - d, where M = exp(X) A exp(-X), r_M = M 1, c_M = Mᵀ1
  • Can we use second-order information to obtain a good solution in few iterations?
  • f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ (*), where H_x = diag(r_M + c_M) - (M + Mᵀ)
  • The Hessian matrix is a graph Laplacian, so H_x⁻¹b can be computed in Õ(m) time [Spielman-Teng '08, …]
  • If |Δ|∞ ≤ 1 then H_x ≈_O(1) H_{x+Δ}
  (* valid whenever the Hessian does not change too much along the segment between x and x+Δ)
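
A quick numerical sanity check of the Hessian formula (hypothetical data, same conventions as before): H_x is symmetric, has nonpositive off-diagonal entries, and its rows sum to zero, i.e. it is the Laplacian of a weighted graph, which is what makes fast Laplacian solvers applicable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.random((n, n))
np.fill_diagonal(A, 0.0)                  # diagonal entries do not affect balancing
x = rng.standard_normal(n)

M = np.exp(x)[:, None] * A * np.exp(-x)[None, :]
r, c = M.sum(axis=1), M.sum(axis=0)
H = np.diag(r + c) - (M + M.T)            # Hessian of f at x

print(np.allclose(H, H.T))                            # symmetric
print(np.allclose(H @ np.ones(n), 0.0))               # zero row sums
print(np.all(H - np.diag(np.diag(H)) <= 1e-12))       # nonpositive off-diagonals
```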

  11. Box-Constrained Newton's Method
  • f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ
  • Key idea: if |Δ|∞ ≤ 1 then H_x ≈_O(1) H_{x+Δ}
  • Suppose we can exactly minimize the second-order approximation over |Δ|∞ ≤ 1
  • Goal: show that moving to the minimizer inside the box makes a lot of progress
  • Let Δ be the minimizer of the quadratic approximation in the L∞ region and Δ* the minimizer of f in the L∞ region; then f(x) - f(x+Δ) ≥ (1/10)·(f(x) - f(x+Δ*))

  12. Box-Constrained Newton's Method
  • Define R∞ = max over {x : f(x) ≤ f(x₀)} of |x - x*|∞
  • Progress per step: f(x) - f(x+Δ) ≥ (1/10)·(f(x) - f(x*)) / |x - x*|∞, since by convexity a unit L∞ step from x toward x* already gains that much
  • |x - x*|∞ is bounded by the absolute upper bound R∞
  • So f(x) gets arbitrarily close to f(x*) in Õ(R∞) iterations
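
Writing out the arithmetic this slide compresses (the constant 1/10 is taken from the previous slide; Õ hides the logarithmic factor):

```latex
\[
  f\!\left(x + \tfrac{x^* - x}{\|x^* - x\|_\infty}\right) - f(x)
  \;\le\; \tfrac{f(x^*) - f(x)}{\|x^* - x\|_\infty}
  \qquad \text{(convexity; this step lies in the unit } \ell_\infty \text{ box),}
\]
\[
  \text{so}\quad
  f(x_{t+1}) - f(x^*)
  \;\le\; \Bigl(1 - \tfrac{1}{10\,R_\infty}\Bigr)\bigl(f(x_t) - f(x^*)\bigr),
  \quad\text{hence}\quad
  T = O\!\Bigl(R_\infty \log \tfrac{f(x_0) - f(x^*)}{\varepsilon}\Bigr)
  \ \text{iterations reach error } \varepsilon.
\]
```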

  13. Box-Constrained Newton's Method (R∞ = max over {x : f(x) ≤ f(x₀)} of |x - x*|∞)
  • f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ
  • Key idea: if |Δ|∞ ≤ 1 then H_x ≈_O(1) H_{x+Δ}
  • Suppose we can exactly minimize the second-order approximation over |Δ|∞ ≤ 1; with Δ the minimizer of the quadratic approximation in the L∞ region and Δ* the minimizer of f in the L∞ region, f(x) - f(x+Δ) ≥ (1/10)·(f(x) - f(x+Δ*))
  • This gives Õ(R∞) box-constrained quadratic minimizations in total (sketched in code below)
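
A schematic end-to-end sketch of the loop, for illustration only: `box_quadratic_min` below is a hypothetical placeholder that hands the box-constrained quadratic to a generic bounded solver, standing in for the k-oracles of the next two slides; it is not how either paper implements the step.

```python
import numpy as np
from scipy.optimize import minimize

def grad_and_hess(A, d, x):
    """Gradient r_M - c_M - d and Hessian diag(r_M + c_M) - (M + M^T) of the potential."""
    M = np.exp(x)[:, None] * A * np.exp(-x)[None, :]
    r, c = M.sum(axis=1), M.sum(axis=0)
    return r - c - d, np.diag(r + c) - (M + M.T)

def box_quadratic_min(g, H):
    """Placeholder step: roughly minimize g^T D + 0.5 D^T H D over |D|_inf <= 1
    with a generic bounded solver (NOT the k-oracles of [AZLOW '17] / [CMTV '17])."""
    n = len(g)
    res = minimize(lambda D: g @ D + 0.5 * D @ H @ D, np.zeros(n),
                   jac=lambda D: g + H @ D, bounds=[(-1.0, 1.0)] * n)
    return res.x

def box_newton(A, d, iters=30):
    x = np.zeros(A.shape[0])
    for _ in range(iters):
        g, H = grad_and_hess(A, d, x)
        x = x + box_quadratic_min(g, H)   # each step stays inside the unit L_inf box
    return x

A = np.random.default_rng(3).random((6, 6))   # hypothetical nonnegative matrix
np.fill_diagonal(A, 0.0)
x = box_newton(A, np.zeros(6))
g, _ = grad_and_hess(A, np.zeros(6), x)
print(np.abs(g).sum())                         # remaining imbalance |r_M - c_M|_1, should be small
```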

  14. Box-Constrained Newton's Method (R∞ = max over {x : f(x) ≤ f(x₀)} of |x - x*|∞)
  • f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ; key idea: if |Δ|∞ ≤ 1 then H_x ≈_O(1) H_{x+Δ}
  • Exactly minimizing the second-order approximation over |Δ|∞ ≤ 1 would give Õ(R∞) box-constrained quadratic minimizations, but it is unclear how to solve this subproblem fast
  • Instead, relax the L∞ constraint by a factor of k and outsource the subproblem to a k-oracle
  • This gives Õ(k·R∞) box-constrained quadratic minimizations

  15. k-oracle
  • Input: graph Laplacian L, vector b
  • Ideally: output the minimizer of bᵀΔ + ½ ΔᵀLΔ over |Δ|∞ ≤ 1
  • Instead: output some Δ with |Δ|∞ ≤ k whose value is comparable (up to a constant factor) to that minimum
  • [CMTV '17]: Õ(m), based on the Laplacian solver of [LPS '15]
  • [AZLOW '17]: Õ(m + n^(4/3)), based on the approximate max-flow algorithm of [CKMST '11]
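
To pin down what the oracle promises, here is a contract-level Python sketch with hypothetical helper names; the generic bounded solver used below is only a slow stand-in and has nothing to do with the constructions based on [LPS '15] or [CKMST '11].

```python
import numpy as np
from scipy.optimize import minimize

def quad(L, b, D):
    """The quadratic model b^T D + 0.5 * D^T L D that the oracle must (approximately) minimize."""
    return b @ D + 0.5 * D @ L @ D

def reference_k_oracle(L, b, k=2.0):
    """Slow stand-in: minimize the model directly over the k-times-larger box.
    The whole point of the fast oracles is achieving (a constant-factor version of)
    this guarantee in nearly linear time; this function only illustrates the contract."""
    n = len(b)
    res = minimize(lambda D: quad(L, b, D), np.zeros(n),
                   jac=lambda D: b + L @ D, bounds=[(-k, k)] * n)
    return res.x

def satisfies_contract(L, b, D, k=2.0):
    """D must lie in the k-box and be at least as good as the best point of the unit box."""
    n = len(b)
    best_unit = minimize(lambda z: quad(L, b, z), np.zeros(n),
                         jac=lambda z: b + L @ z, bounds=[(-1.0, 1.0)] * n).fun
    return np.abs(D).max() <= k + 1e-6 and quad(L, b, D) <= best_unit + 1e-6

rng = np.random.default_rng(4)
W = rng.random((5, 5))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W            # Laplacian of a small weighted graph
b = rng.standard_normal(5)
print(satisfies_contract(L, b, reference_k_oracle(L, b)))   # True
```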

  16. Conclusions and Future Outlook
  • Nearly-linear time algorithms for matrix scaling and balancing
  • New framework for second-order optimization: used Hessian smoothness while avoiding self-concordance
  • Can we use any of these ideas for faster interior point methods?
  • The dependence on the condition number, log κ(X*), comes from the R∞ bound; if we want to detect perfect matchings, R∞ = Θ(n)
  • Is there a way to improve this dependence, e.g. to (log κ(X*))^(1/2)?
  • We saw an extension of Laplacian solving; what else is there? Better primitives for convex optimization?

  17. Thank You!
