
What I learned about cut, expansion and density problems



Presentation Transcript


  1. What I learned about cut, expansion and density problems Guy Kortsarz

  2. The character flaw of Michael Myers, Freddy Krueger and Jason Voorhees: Cut problems. Input: an edge-weighted graph G(V,E) and a collection of pairs {s_i, t_i}. Output: a minimum cost collection of edges whose removal disconnects all pairs. We look at it as a fractional relaxation that says: put lengths on edges so that the distance between s_i and t_i is at least 1 for every i.
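The fractional relaxation on this slide can be written out as a linear program. This is a sketch: x_e (the length put on edge e) and c_e (its cost) follow the notation used on a later slide, while dist_x, the shortest-path distance under the lengths x, is a name assumed here.

```latex
\begin{align*}
\min\ & \sum_{e \in E} c_e\, x_e \\
\text{s.t.}\ & \operatorname{dist}_x(s_i, t_i) \ge 1 && \text{for every pair } i, \\
& x_e \ge 0 && \text{for every } e \in E.
\end{align*}
```

Any integral cut gives a feasible solution (length 1 on cut edges, 0 elsewhere), so the LP value is a lower bound on the minimum cut cost.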

  3. The “good cut point” lemma. [Figure: a graph between s and t with fractional LP lengths on its edges.] There is a radius r < 0.5 such that if you cut at r, the cost of the cut is only an O(log n) factor larger than the LP value inside the sphere.

  4. Remarks • The fact that r < 0.5 is important. • It means that no s,t pair can both be in the sphere. • The total cost is roughly O(log n) times the LP value. • The argument needs to give some nonzero value to radius 0 for this to work.

  5. Vertex separators in undirected graphs • Say that we have a collection of {s,t} pairs so that the distance between every pair is at least d. • Say that you want to delete vertices and cut all such pairs. • A fractional solution of value n/d is obtained by giving every vertex value 1/d.

  6. Small vertex separator • There is a collection of at most O(log n) · n/d vertices that separates all the pairs of distance at least d. Not that far from the fractional solution. • This is not written anywhere (but not hard to prove). • Credit: the upper bound is due to Garg, Vazirani and Yannakakis.

  7. Separating pairs of distance at least d: directed graphs • How many edges do you need to remove if we have a collection {s,t} of pairs so that dist(s,t) ≥ d? • We worked on this problem without knowing that it is closely related to the Maximum Number of Disjoint Paths problem in directed graphs. • Thus some results were known. • Chekuri and Khanna: O(n^4/d^4) edges. • Hajiaghayi, Leighton: O(n^3/d^3) edges.

  8. What we proved • We proved that there exists a separator with O(n^2/d^2) edges (K, Nutov). • Good achievement. • Then I found out: the above easily implies an O(n^{2/3}) approximation for the “Maximum number of disjoint paths problem on directed graphs”. • Thus I started to suspect this lemma was proved before.

  9. It was done 3 months before in a SODA paper • The authors: Varadarajan and Venkataraman. Hence the O(n^{2/3}) was known. • The algorithm: among the pairs not yet joined that are still reachable, take the shortest path. • Still, we proved an O(n^{2/3}/opt^{1/3}) ratio for unit capacities. • Also O(opt) is known (see later, Anupam Gupta).
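The greedy rule on this slide can be sketched in a few lines. This is a minimal sketch, not the authors' code: it assumes unit capacities (edge-disjoint paths), measures path length in hops via BFS, and the names `shortest_path` and `greedy_disjoint_paths` are hypothetical.

```python
from collections import deque


def shortest_path(adj, s, t):
    """BFS in a directed graph; returns the vertex list of a shortest s-t path, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def greedy_disjoint_paths(edges, pairs):
    """Repeatedly route the unrouted pair with the shortest remaining path.

    Assumption: unit capacities, so every routed path consumes its edges.
    Returns a dict mapping each routed (s, t) pair to its path.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    remaining = list(pairs)
    routed = {}
    while remaining:
        best = None
        for pair in remaining:
            path = shortest_path(adj, pair[0], pair[1])
            if path is not None and (best is None or len(path) < len(best[1])):
                best = (pair, path)
        if best is None:
            break  # no unrouted pair is reachable any more
        pair, path = best
        routed[pair] = path
        remaining.remove(pair)
        for u, v in zip(path, path[1:]):
            adj[u].discard(v)  # the edge is used up
    return routed
```

For example, on the edges `[(0, 1), (1, 2)]` with pairs `[(0, 2), (0, 1)]`, the sketch routes the one-hop pair `(0, 1)` first, after which `(0, 2)` is no longer reachable.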

  10. Clearly, the sqrt{n} ratio arises only if opt = sqrt{n} • Best ratio known is by Amit Agarwal, Noga Alon and Moses Charikar: an O(n^{11/23}) ratio. • The paper seems very complex to me. • For the unit capacity case, since we have sqrt{n} only for opt = sqrt{n}, maybe there is a simpler algorithm that returns o(n) edges? • Hardness of approximation: Label Cover hard. Chuzhoy and Khanna. • Both the breaking of the sqrt{n} and the lower bound are huge achievements.

  11. The elegant and simple paper by Anupam Gupta • The ratio is only O(sqrt{n}) but very elegant. • The paper of Agarwal et al. used ideas from the paper of Gupta. • Also from my paper with Nutov: reducing reachability among pairs. • And now to another less known but very simple cut lemma. • In this case we may charge all edges.

  12. The ideas of Gupta • Solve the LP. Add all edges with x_e ≥ 1/sqrt{n} to the solution. • Consider a non-separated pair s_j, t_j. • We now show that there is a cut at some distance r from s_j, with 1/3 ≤ r ≤ 2/3, whose value is at most 3 times the fractional optimum. • For this, break the LP distances into multiples of a small ε. • We get at most Σ x_e c_e if we go from 1/3 to 2/3 and multiply the costs by ε.

  13. The i-th fractional value adds the cost of the edges crossing from iε to (i+1)ε, times ε • Σ_{1/3 ≤ iε ≤ 2/3} ε · Cut(i) ≤ opt_f • It cannot be that for every i in this range Cut(i) > 3 opt_f • This means that there is a cut whose value is at most 3 opt_f • Note that here every (relevant) edge for s_j, t_j is charged.
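The averaging step can be spelled out as follows. This is a sketch under assumed notation: ε is the small step into which the LP distances are broken, Cut(i) is the cost of the edges crossing distance level iε from s_j, opt_f is the fractional optimum, and rounding at the ends of the range is ignored.

```latex
\[
\sum_{i:\ 1/3 \le i\varepsilon \le 2/3} \varepsilon \cdot \mathrm{Cut}(i)
\;\le\; \sum_{e} x_e c_e \;=\; \mathrm{opt}_f ,
\]
because an edge of length $x_e$ crosses roughly $x_e/\varepsilon$ levels and pays
$\varepsilon\, c_e$ at each. The range contains about $1/(3\varepsilon)$ levels, so if
$\mathrm{Cut}(i) > 3\,\mathrm{opt}_f$ held for every such $i$ we would get
\[
\sum_{i} \varepsilon \cdot \mathrm{Cut}(i)
\;>\; \frac{1}{3\varepsilon} \cdot \varepsilon \cdot 3\,\mathrm{opt}_f
\;=\; \mathrm{opt}_f ,
\]
a contradiction. Hence some level $i$ has $\mathrm{Cut}(i) \le 3\,\mathrm{opt}_f$.
```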

  14. Loss of reachability. Let (u,v) be an edge on a path from s_j to t_j. Note that there is a path of length 1/3 in the LP distances that u was able to reach and now cannot. Since every remaining edge has x_e < 1/sqrt{n}, making length 1/3 requires Ω(sqrt{n}) vertices. Thus (u,v) is charged at most O(sqrt{n}) times, hence the sqrt{n} ratio. In addition, since each charge is for length 1/3 and the charged segments are disjoint, an O(opt_f) ratio follows.

  15. A very interesting alternative objective function. After the cut is chosen, give length 1 to cut edges and length 0 to non-cut edges. Minimize the maximum distance between s_j and t_j. Called the Checkpoint Problem. A paper by M. Hajiaghayi, R. Khandekar, K. and J. Mestre. On trees, a tight ratio of 2. If all paths go up, there is a gem of a combinatorial, polynomial-time optimal solution. Only polynomial ratios for the case of general G.

  16. Conductance • Conductance: e(S,V-S)/deg(S) (a.k.a. sparsest cut) • Approximation algorithms • O(log n) [Bourgain], [Leighton, Rao], [Linial, London, Rabinovich] • Uses a (quite elementary) embedding of a metric into L_1 with O(log n) stretch factor.
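The conductance formula on this slide is easy to evaluate directly. A minimal sketch, assuming an unweighted undirected graph given as an edge list; the function name `conductance` is hypothetical.

```python
def conductance(edges, S):
    """Return e(S, V-S) / deg(S) for an undirected, unweighted edge list.

    e(S, V-S) counts edges with exactly one endpoint in S;
    deg(S) is the sum of the degrees of the vertices of S.
    """
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    deg_S = sum((u in S) + (v in S) for u, v in edges)
    if deg_S == 0:
        raise ValueError("deg(S) is zero, so the ratio is undefined")
    return cut / deg_S


# Two triangles joined by the single edge (2, 3).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(conductance(two_triangles, {0, 1, 2}))  # one cut edge over degree sum 7
```

A single triangle of this graph has low conductance (one cut edge against a degree sum of 7), while a single vertex of degree 2 has conductance 1.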

  17. The best known ratio. By Arora, Rao and Vazirani. Approximation ratio O(sqrt{log n}). Uses the so-called negative type metrics. Uses what the authors call expander flows. We now focus on the small set expansion conjecture.

  18. The small set expansion conjecture. Let δ ≤ 0.5 be a constant and let ε be an arbitrarily small constant. Let Φ(S) = e(S,V-S)/deg(S) be the expansion factor of S. Let X be the collection of all subsets of V of size δn. Consider Φ(δ) = min_{S ∈ X} Φ(S).

  19. The conjecture. It is hard to distinguish between the following two cases: (a) Φ(δ) ≤ ε and (b) Φ(δ) ≥ 1 - ε. This is a stronger assumption than the Unique Games Conjecture of Khot: if we prove the SSEC we prove the UGC.

  20. A closely related problem. There is an alternative way to disprove the conjecture (due to Gandhi and K): by describing a very simple problem that has ratio 2 even in the weighted case. We will show that under the SSEC this is the best ratio. Equivalently, a 2 - ε ratio for the problem disproves the SSEC.

  21. The problem. Given: a graph G(V,E) and a number k. Required: find a set U of k vertices with the minimum number of touching edges. A touching edge: at least one endpoint in U. Remark: our main results were for the weighted case. We improved a result by Shmoys et al. and a different one by Hochbaum et al.

  22. A trivial ratio of 2 • Let OPT, |OPT| = k, be the best solution, and let t(U) denote the number of edges touching U. • Let U be the k least-degree vertices, thus deg(OPT) ≥ deg(U). • Clearly t(OPT) ≥ deg(OPT)/2. • Therefore: t(U) ≤ deg(U) ≤ deg(OPT) ≤ 2t(OPT).
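The trivial ratio-2 rule above can be checked on small instances. A minimal sketch for the unweighted case, assuming vertices are numbered 0..n-1; the names `touching_edges`, `k_least_degree` and `brute_force_opt` are hypothetical, and the brute force is only meant for tiny graphs.

```python
from itertools import combinations


def touching_edges(edges, U):
    """t(U): the number of edges with at least one endpoint in U."""
    U = set(U)
    return sum(1 for u, v in edges if u in U or v in U)


def k_least_degree(n, edges, k):
    """The slide's heuristic: take the k vertices of smallest degree."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return set(sorted(range(n), key=lambda v: (deg[v], v))[:k])


def brute_force_opt(n, edges, k):
    """Exact optimum by trying every k-subset (exponential time)."""
    return min(touching_edges(edges, U) for U in combinations(range(n), k))


# Two triangles joined by the edge (2, 3), with k = 3.
g = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
U = k_least_degree(6, g, 3)
print(U, touching_edges(g, U), brute_force_opt(6, g, 3))
```

On this graph the heuristic picks three low-degree vertices that touch 5 edges, while a whole triangle touches only 4, so the heuristic is not optimal but stays within the factor 2.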

  23. The weighted case. Vertices have weights. Minimum edges under cost at least k: find a set U of cost at least k that minimizes the number of edges touching U. We explain the ratio 2 for a related question: given a bound m' on the number of touching edges, maximize the vertex cost.

  24. Some ideas of how to give ratio 2 for Maximum cost with at most M edges. We use dynamic programming. We guess the number P of edges between the optimum set OPT and V-OPT. We guess the sum of degrees of OPT, which may be as large as 2M. A serious technical problem: we are only able to compute A[n, P, M].

  25. The reason for that. This is the only way, it seems, to ensure feasibility. Indeed, if deg(U) ≤ M then t(U) ≤ M. The question is whether we lose a lot by bounding the sum of degrees by M while the sum may be 2M. One more detail: we need to guess the highest-cost vertex in OPT and add it to our solution.

  26. If deg(U) ≤ M, how much cost do we lose? Let OPT = A ∪ {x} ∪ B, where x is the first vertex for which deg(A + x) exceeds M. Thus A is a feasible solution for M. Clearly B too is a feasible solution for M, because deg(A + x) > M and the total is at most 2M. One of A or B has at least half of their combined weight. The fact that we guess the highest-cost vertex in OPT compensates for x. Thus ratio 2.

  27. What are the properties of a good solution? Let us check the case of d-regular graphs. The question is whether the edges are internal or external. [Figure: a vertex set S.]

  28. What are the properties of a good solution? In this example most of the edges of S stay inside, which means that t(S) is close to kd/2. [Figure: a vertex set whose edges are mostly internal.]

  29. What are the properties of a good solution? But S can behave badly: namely, most edges go to V-S. In this case t(S) is close to dk. [Figure: a vertex set whose edges mostly leave it.]

  30. Is the Small Set Expansion Conjecture reliable? Opinions vary. I think: VERY RELIABLE. We tried to disprove the SSEC and failed. Namely, we tried to give a ratio of 2 - ε for the problem described before. It seems that the SSEC is related to the Dense k-Subgraph problem.

  31. Partition via Sparsest cut • Motivation: • Natural Social Communities [MSST08, ABL10, …] • Better clusters (AGM) • Easier to compute (GLMY) • Useful for Distributed Computation (AGM) • Good Clusters → Low Conductance? • Inside: well-connected. • Toward outside: not so well-connected.

  32. Overlapping Clustering • Find a set of (at most K) overlapping clusters, each cluster with degree sum ≤ B, covering all nodes, and minimize: • Maximum conductance of clusters (Min-Max) • Sum of the conductance of clusters (Min-Sum) • Overlapping vs. non-overlapping variants?

  33. Summary of Results [Khandekar, K, Mirrokni.] Overlap vs. no-overlap: • Min-Sum: Within a factor 2 using Uncrossing. • Min-Max: Might be arbitrarily different.

  34. Arora et al.: Finding overlapping communities in social networks: Toward a rigorous approach. A lot of follow-up work. M. Balcan et al.: again, a rigorous study of overlapping clusters. And much more. Now the last topic: Density.

  35. The densest subgraph problem • Let e(S) be the number of edges in the graph induced by S. • This problem requires finding a subset S of V that maximizes e(S)/|S|. • A faster algorithm approximates the best density within a factor of 2 but runs in linear time, which is much faster than flow. • Was done by K, Peleg in 1992. Also Charikar 1998. • Very extensively cited for social networks. The result is almost always attributed to Charikar.

  36. A quick approximation for densest subgraph • Let ρ be e(OPT)/|OPT|. • We show that all vertices in the optimum have degree at least ρ inside OPT. • Otherwise, removing a vertex with degree less than ρ increases the density of the optimum. • Therefore we can iteratively remove vertices of degree strictly less than ρ. We never remove a vertex of OPT and so the remaining graph is not empty.

  37. The 2 approximation continued • The degree of all vertices in the final graph is at least ρ. • Say that there are i vertices. The density is the sum of the degrees, over 2, divided by i. • In other words the density is at least i·ρ/(2i) = ρ/2. Thus ratio 2. With a bit of data structure, linear running time.
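The argument above removes vertices of degree below ρ, which is not known in advance. A common way around this, sketched here as an assumption rather than as the method of either cited paper, is to always peel a minimum-degree vertex and remember the densest intermediate graph. The name `greedy_peel` is hypothetical, and this simple version uses a linear scan for the minimum instead of the bucket structure needed for linear time.

```python
def greedy_peel(n, edges):
    """Peel a minimum-degree vertex repeatedly; return (best density, best vertex set).

    Assumes a simple undirected graph on vertices 0..n-1 with n >= 1.
    Density of a set S is e(S) / |S|.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # current edge count
    best_density, best_set = m / n, set(alive)
    while len(alive) > 1:
        v = min(alive, key=lambda x: (len(adj[x]), x))  # a minimum-degree vertex
        m -= len(adj[v])
        for w in adj[v]:
            adj[w].discard(v)
        adj[v].clear()
        alive.remove(v)
        density = m / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
    return best_density, best_set


# K4 on {0, 1, 2, 3} with a pendant vertex 4 attached to vertex 3.
k4_plus_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
print(greedy_peel(5, k4_plus_tail))
```

On this example the pendant vertex is peeled first, leaving K4 with density 6/4, which is also the optimum here.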

  38. The dense k-subgraph problem Input: A graph G(V,E) and a number k Required: a set U of size k with maximum e(U) I started working on this circa 1993. What did I learn?

  39. Unfortunately, not much. It may be fair to say that this is a central problem. Amazing number of applications (too long to list). A big disappointment: under P ≠ NP there is still no hardness. No PTAS, assuming NP does not have subexponential-time algorithms (Khot). Hardness under assumptions on random 3-SAT (Feige).

  40. A fact about walks (and a one-line proof). If the minimum degree is d_min, then clearly the number of walks of length d is at least n · d_min^d. What about the average degree? It turns out that this is also true for the average degree.

  41. The number of walks with respect to the average degree d_avg. It turns out that the number of walks is also at least n · d_avg^d. I do not know the proof of this, but I discovered years later that it has been known since the 1920s. When a problem is hard: bypass the proof.

  42. Do not solve problems: bypass them. By the claim above, the number of walks of length d is at least n · d_avg^d, where d_avg is the average degree. We bypass this hard proof. We give a one-line (or so) proof that there are always u, v so that Walks(u,v) ≥ d_avg^d / n. It uses a known fact: the largest eigenvalue of the adjacency matrix of G is at least the average degree in G.

  43. [Feige, K, Peleg]: n^δ approximation for some δ < 1/3. Consider the adjacency matrix A of the graph. Let λ_1 ≥ λ_2 ≥ λ_3 ≥ … be the (real) eigenvalues of A. Well known: trace(A) = Σ_i λ_i. Well known: the eigenvalues of A^k are λ_i^k. Well known: A^k[i,j] is the number of walks from i to j of length k.

  44. The proof. Σ_{i,j} A^k[i,j] · A^k[i,j] = trace(A^{2k}) = Σ_i λ_i^{2k} ≥ λ_1^{2k} ≥ d_avg^{2k}, where d_avg is the average degree. This implies that there are i and j so that (A^k[i,j])^2 ≥ d_avg^{2k}/n^2. The claim follows by taking the square root of both sides. Why did we count walks? Walks are an example of trees that exist if the graph is dense enough but not if it is random.
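The walk-counting claim can be checked numerically on small graphs. A minimal sketch in pure Python, assuming a simple undirected graph on vertices 0..n-1 with no repeated edges; the names `mat_mul`, `walk_matrix` and `max_walks_and_bound` are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def walk_matrix(n, edges, k):
    """A^k for the adjacency matrix A; entry [i][j] counts walks of length k from i to j."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity = A^0
    for _ in range(k):
        P = mat_mul(P, A)
    return P


def max_walks_and_bound(n, edges, k):
    """Return (max over i, j of A^k[i][j], d_avg^k / n) for comparison."""
    P = walk_matrix(n, edges, k)
    d_avg = 2 * len(edges) / n  # average degree, assuming no repeated edges
    return max(max(row) for row in P), d_avg ** k / n


# K4 with a pendant vertex: some pair should have at least d_avg^3 / 5 walks of length 3.
best, bound = max_walks_and_bound(
    5, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)], 3)
print(best >= bound)
```

The comparison only illustrates the inequality on one instance; the eigenvalue argument on the slide is what proves it in general.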

  45. The additional insight. Bhaskara, Charikar, Chlamtac, Feige, Vijayaraghavan. Almost 20 years later! The intuition comes from comparing a random graph to a random graph with a dense subgraph planted in it. It also turns out that walks are not the best thing to count.

  46. Counting local trees. It turns out that in a dense graph some trees appear more often than in G(n,p). For example, walks. From ad hoc to systematic. The state of the art: we cannot tell between a graph chosen from G(n, 1/sqrt{n}) and the same graph with sqrt{n} vertices changed to G(sqrt{n}, 1/n^{1/4}).

  47. Note the random gap. The average degree of the planted graph is n^{1/4}. In the random graph any set of sqrt{n} vertices will have O(sqrt{n}) edges. The authors complement this with essentially an O(n^{1/4}) approximation. This was done by two groups, one using LP lift-and-project and the other using combinatorial methods.

  48. Important remarks. In this example, the average degree equals the square root of the number of vertices both in the random planted graph and in the original graph. As far as I understand, if this is not the case this paper can get a better-than-trivial ratio. Technically the improvement is obtained by counting caterpillars and not walks.

  49. Something is missing in our lower bound understanding. We keep adding assumptions: the SSEC, the 2-to-1 unique games conjecture, the projection games conjecture. Countless papers show hardness just by saying the problem is Dense k-Subgraph hard to approximate.

  50. Technically, there is no value to such hardness. Still, I would not try to give a polylog ratio for problems that are Dense k-Subgraph hard. An example of a simple such problem: k-Steiner Forest, connect k of the pairs. At best we can expect polynomial ratios. It seems OK to assume the exponential time hypothesis, namely that 3-SAT cannot be solved in time 2^{o(n)}.
