
Approximating the MST Weight in Sublinear Time

Bernard Chazelle (Princeton), Ronitt Rubinfeld (NEC), Luca Trevisan (U.C. Berkeley)


Presentation Transcript


  1. Approximating the MST Weight in Sublinear Time • Bernard Chazelle (Princeton), Ronitt Rubinfeld (NEC), Luca Trevisan (U.C. Berkeley)

  2. Sublinear Time Algorithms • Make sense for problems on very large data sets • Go against the common intuition that “an algorithm must be given at least enough time to read all the input” • Are probabilistic in most non-trivial cases • Are approximate in most non-trivial cases

  3. Approximation • For decision problems: the output is the correct answer either for the given input, or at least for some other input “close” to it. (Property Testing) • For optimization problems: the output is a number that is close to the cost of the optimal solution for the given input. (There is not enough time to construct a solution.)

  4. Previous Examples • The cost of the max cut in a graph with $n$ nodes and $cn^2$ edges can be approximated to within relative error $\varepsilon$ in time $2^{\mathrm{poly}(1/(\varepsilon c))}$. (Goldreich, Goldwasser, Ron) • Other results for “dense” instances of optimization problems, for low-rank approximation of matrices, for metric spaces, etc. • No results (that we know of) for problems on bounded-degree graphs.

  5. Our Result • Given a connected weighted graph $G$ with maximum degree $d$ and weights in the range $\{1, \ldots, w\}$, • we can compute the weight of the minimum spanning tree of $G$ to within a factor of $1+\varepsilon$ in time $O(d w \varepsilon^{-2} \log(w/\varepsilon))$; • we also prove that it is necessary to look at $\Omega(d w \varepsilon^{-2})$ entries in the representation of $G$. (We assume that $G$ is represented using adjacency lists.)

  6. Algorithm

  7. Main Intuition • Suppose all weights are 1 or 2 • Then the MST weight is equal to $n - 2$ + (# of connected components induced by weight-1 edges) • [Figure: a graph with weight-1 and weight-2 edges, its MST, and the connected components induced by the weight-1 edges]
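The identity on this slide follows from counting MST edges by weight. A short derivation, writing $c_1$ for the number of connected components induced by the weight-1 edges and assuming $G$ is connected:

```latex
\begin{align*}
\#\{\text{weight-1 edges in the MST}\} &= n - c_1
  && \text{(a spanning forest of the weight-1 subgraph)} \\
\#\{\text{weight-2 edges in the MST}\} &= c_1 - 1
  && \text{(joining the $c_1$ components into one tree)} \\
\text{MST weight} &= (n - c_1) + 2\,(c_1 - 1) = n - 2 + c_1 .
\end{align*}
```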

  8. Algorithm for weights in {1,2} • To approximate the MST weight to within a multiplicative factor $(1+\varepsilon)$ it is enough to approximate $c_1$ to within an additive error $\varepsilon n$ ($c_1$ := # of connected components induced by weight-1 edges) • To approximate $c_1$ we use ideas from Goldreich-Ron (property testing of connectivity) • The algorithm runs in time $O(d \varepsilon^{-2} \log \varepsilon^{-1})$

  9. Approximating # of connected components • Given a graph $G$ of max degree $d$ with $n$ nodes, we want to compute $c$, the number of connected components of $G$, up to an additive error $\varepsilon n$. • For every vertex $u$, define $n_u := 1 / (\text{size of the component of } u)$ • Then $c = \sum_u n_u$ • And if we define $a_u := \max\{n_u, \varepsilon\}$ • Then $c = \sum_u a_u \pm \varepsilon n$

  10. Analysis • We can estimate the sum of the $a_u$ using sampling • Once we pick a vertex $u$ at random, the value $a_u$ can be computed in time $O(d/\varepsilon)$ • We need to pick $O(1/\varepsilon^2)$ vertices, so we get running time $O(d/\varepsilon^3)$

  11. Algorithm CC-APPROX($\varepsilon$)
      Repeat $O(1/\varepsilon^2)$ times:
          pick a random vertex v
          do a BFS from v, stopping after $2/\varepsilon$ steps
          b := 1 / (number of visited vertices)
      Return (average of the values b) * n
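A minimal Python sketch of CC-APPROX, assuming the graph is given as a list of adjacency lists with vertices numbered from 0. The function name `cc_approx`, the constant 4 in the sample count, and counting BFS "steps" as visited vertices are illustrative assumptions; the slide only specifies $O(1/\varepsilon^2)$ repetitions and a cutoff of $2/\varepsilon$.

```python
import random
from collections import deque

def cc_approx(adj, eps, samples=None, rng=random):
    """Estimate the number of connected components up to roughly eps * n.

    adj: list of adjacency lists, vertices are 0..n-1.
    samples: number of sampled vertices; the constant 4 is an arbitrary
             stand-in for the O(1/eps^2) on the slide.
    """
    n = len(adj)
    if samples is None:
        samples = int(4 / eps ** 2) + 1
    limit = int(2 / eps) + 1              # cut the BFS off after about 2/eps vertices
    total = 0.0
    for _ in range(samples):
        v = rng.randrange(n)              # pick a random vertex
        seen = {v}
        queue = deque([v])
        while queue and len(seen) < limit:
            u = queue.popleft()
            for x in adj[u]:
                if x not in seen:
                    seen.add(x)
                    queue.append(x)
                    if len(seen) >= limit:
                        break
        total += 1.0 / len(seen)          # b := 1 / number of visited vertices
    return total / samples * n            # (average of the values b) * n
```

On a graph whose components are all smaller than the cutoff, every sample returns exactly $n_u$, so the only error left is the sampling error of the average.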

  12. Improved Algorithm • Pick vertices at random as before, but then stop the BFS after $2^k$ steps with probability $2^{-k}$ • If the output is appropriately “scaled”, the average output is right • The BFS takes about $\log(1/\varepsilon)$ steps on average instead of $1/\varepsilon$ • The variance is still low • The improved algorithm runs in time $O(d \varepsilon^{-2} \log(1/\varepsilon))$
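The slide does not spell out the scaling, so the sketch below uses one possible choice and should be read as an assumption rather than the authors' procedure. The BFS budget starts at 1 vertex and doubles each time a fair coin comes up heads, so budget $2^k$ is reached with probability $2^{-k}$; if the component is exhausted at budget $B$, the estimate returned is $B / \text{size}$, which has mean $1/\text{size}$. The names `bounded_component_size`, `scaled_estimate`, and `cc_approx_improved` are hypothetical, and the BFS is restarted at each stage for simplicity (this costs only a constant factor).

```python
import random
from collections import deque

def bounded_component_size(adj, v, budget):
    """Return the size of v's component if it is <= budget, else None."""
    seen = {v}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for x in adj[u]:
            if x not in seen:
                if len(seen) == budget:
                    return None           # the component is larger than the budget
                seen.add(x)
                queue.append(x)
    return len(seen)

def scaled_estimate(adj, v, eps, rng=random):
    """One randomized estimate of 1 / (size of v's component)."""
    budget = 1
    while True:
        size = bounded_component_size(adj, v, budget)
        if size is not None:
            # This stage is reached with probability 1/budget,
            # so scaling by budget makes the mean equal to 1/size.
            return budget / size
        # Stop if the component is already known to be large (> about 2/eps)
        # or if the coin comes up tails; otherwise double the budget.
        if budget >= 2 / eps or rng.random() < 0.5:
            return 0.0
        budget *= 2

def cc_approx_improved(adj, eps, samples, rng=random):
    """Average the scaled estimates over random vertices and multiply by n."""
    n = len(adj)
    total = sum(scaled_estimate(adj, rng.randrange(n), eps, rng)
                for _ in range(samples))
    return total / samples * n
```

Each stage costs $O(d)$ in expectation (work $O(d \cdot 2^k)$ with probability $2^{-k}$) and there are about $\log(1/\varepsilon)$ stages. Every returned value is either 0 or lies in $[1, 2)$, which keeps the variance bounded.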

  13. General Weights • Generalize the argument for weights 1 and 2. • Let $c_i$ := # of connected components induced by edges of weight at most $i$ • Then the MST weight is $n - w + \sum_{i=1}^{w-1} c_i$
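The formula can be checked exactly on a small example. The sketch below compares a standard Kruskal computation against the component-counting formula; the edge-list format `(u, v, weight)` and all function names are illustrative assumptions, and the graph must be connected with integer weights in $\{1, \ldots, w\}$.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_components(n, edges):
    """Number of connected components of the graph (n vertices, given edges)."""
    parent = list(range(n))
    components = n
    for u, v, _ in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components

def mst_weight_kruskal(n, edges):
    """Exact MST weight by Kruskal's algorithm."""
    parent = list(range(n))
    total = 0
    for u, v, weight in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += weight
    return total

def mst_weight_by_counting(n, edges, w):
    """MST weight as n - w + sum of c_i for i = 1..w-1."""
    return n - w + sum(
        count_components(n, [e for e in edges if e[2] <= i])
        for i in range(1, w)
    )

# Small connected example with weights in {1, 2, 3}.
edges = [(0, 1, 1), (1, 2, 3), (2, 3, 1), (3, 4, 2), (4, 0, 3), (1, 3, 2)]
assert mst_weight_kruskal(5, edges) == mst_weight_by_counting(5, edges, 3)
```

Here $c_1 = 3$ and $c_2 = 1$, so the formula gives $5 - 3 + 3 + 1 = 6$, matching the tree of weight $1 + 1 + 2 + 2$ that Kruskal selects.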

  14. Final Algorithm • For $i = 1, \ldots, w-1$, call the algorithm to approximate the # of connected components on the subgraph of $G$ obtained by removing edges of cost $> i$ • Get $a_i$, an approximation of $c_i$ • Return $n - w + \sum_{i=1}^{w-1} a_i$ • The average answer is within $\varepsilon n/2$ of the cost of the MST, and the variance is bounded • Total running time $O(d w \varepsilon^{-2} \log(w/\varepsilon))$
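The steps on this slide can be sketched end to end as follows. This version uses the plain truncated-BFS estimator rather than the improved one, reads the thresholded subgraph on the fly by skipping heavier edges, and picks its parameters (per-threshold accuracy `eps / w`, a fixed `samples` count) purely for illustration; none of these constants come from the analysis. The adjacency format, a list of `(neighbor, weight)` pairs per vertex, is also an assumption.

```python
import random
from collections import deque

def approx_components(adj, max_weight, eps, samples, rng):
    """Estimate # of components of the subgraph with edge weights <= max_weight."""
    n = len(adj)
    limit = int(2 / eps) + 1
    total = 0.0
    for _ in range(samples):
        v = rng.randrange(n)
        seen, queue = {v}, deque([v])
        while queue and len(seen) < limit:
            u = queue.popleft()
            for x, weight in adj[u]:
                # Heavier edges are skipped on the fly; the subgraph is never built.
                if weight <= max_weight and x not in seen:
                    seen.add(x)
                    queue.append(x)
        total += 1.0 / len(seen)
    return total / samples * n

def approx_mst_weight(adj, w, eps, samples=200, rng=random):
    """Return n - w + sum of the estimates a_i for i = 1..w-1."""
    n = len(adj)
    estimate = n - w
    for i in range(1, w):
        estimate += approx_components(adj, i, eps / w, samples, rng)
    return estimate

# Usage: a cycle on 200 vertices whose edge weights alternate 1, 2, 1, 2, ...
n = 200
cycle = [[] for _ in range(n)]
for i in range(n):
    j = (i + 1) % n
    weight = 1 if i % 2 == 0 else 2
    cycle[i].append((j, weight))
    cycle[j].append((i, weight))
result = approx_mst_weight(cycle, w=2, eps=0.5, samples=50, rng=random.Random(0))
```

In this example every weight-1 component is a single edge, so each BFS finishes before the cutoff and the estimate equals the true MST weight, $100 \cdot 1 + 99 \cdot 2 = 298$.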

  15. Extensions • Low average degree • Non-integer weights

  16. Lower Bound

  17. Abstract sampling problem • Define two binary distributions $A$, $B$ such that • $\Pr[A=1] = 1/w$, $\Pr[A=0] = 1 - 1/w$ • $\Pr[B=1] = 1/w + \varepsilon/w$, $\Pr[B=0] = 1 - 1/w - \varepsilon/w$ • Distinguishing $A$ from $B$ with constant probability requires $\Omega(w/\varepsilon^2)$ samples
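A back-of-the-envelope calculation shows where the $w/\varepsilon^2$ bound comes from. This is a heuristic comparison of means against standard deviations, not the proof: with $m$ samples, let $X$ be the number of ones observed.

```latex
\begin{align*}
\mathbb{E}_A[X] &= \frac{m}{w}, \qquad
\mathbb{E}_B[X] = \frac{m}{w}\,(1+\varepsilon), \qquad
\text{gap} = \frac{m\,\varepsilon}{w}, \\
\sigma_A(X) &= \sqrt{m \cdot \tfrac{1}{w}\left(1 - \tfrac{1}{w}\right)} \approx \sqrt{\frac{m}{w}}, \\
\frac{m\,\varepsilon}{w} \gtrsim \sqrt{\frac{m}{w}}
\quad &\Longleftrightarrow \quad
m \gtrsim \frac{w}{\varepsilon^{2}} .
\end{align*}
```

With fewer samples, the gap between the two means is smaller than the natural fluctuation of $X$, so the two distributions look alike.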

  18. Reduction • We consider two distributions of weights over a cycle of length $n$ • In distribution $G$, for each edge we sample from $A$; if $A=0$ the edge gets weight 1, otherwise it gets weight $w$ • In distribution $H$, same with $B$ • $H$ and $G$ are likely to have MST costs that differ by about $\varepsilon n$ • To distinguish them we need to look at $\Omega(w/\varepsilon^2)$ edge weights
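A small simulation illustrates the gap in MST cost between the two distributions. On a cycle the MST keeps every edge except one of maximum weight, so the cost is the sum of the weights minus the maximum. The function names and the parameter values ($n$, $w$, $\varepsilon$, the seed) are arbitrary choices for illustration.

```python
import random

def cycle_mst_cost(weights):
    """MST cost of a cycle: drop one edge of maximum weight."""
    return sum(weights) - max(weights)

def sample_cycle(n, w, p_heavy, rng):
    """Each edge independently gets weight w with probability p_heavy, else 1."""
    return [w if rng.random() < p_heavy else 1 for _ in range(n)]

rng = random.Random(0)
n, w, eps = 100_000, 4, 0.2
g = sample_cycle(n, w, 1 / w, rng)           # edges drawn from A
h = sample_cycle(n, w, (1 + eps) / w, rng)   # edges drawn from B
gap = cycle_mst_cost(h) - cycle_mst_cost(g)
expected_gap = eps * n * (w - 1) / w         # each extra heavy edge adds w - 1
```

The expected gap is $\varepsilon n (w-1)/w$, which approaches $\varepsilon n$ as $w$ grows, in line with "about $\varepsilon n$" on the slide.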

  19. Higher Degree • Sample from $G$ or $H$ as before, • also add $d-1$ forward edges of weight $w+1$ from each vertex • randomly permute the names of the vertices • Now, on average, reading $t$ edge weights gives us $t/d$ samples from $A$ or $B$, so $t = \Omega(dw/\varepsilon^2)$

  20. Conclusions • A plausibility result showing that approximation for a standard graph problem in bounded-degree (and sparse) graphs can be achieved in time independent of the number of vertices • Use of approximate cost without solution? • More problems? • Max SAT (work in progress) • Something really useful?
