Greedy Approximation Algorithms for finding Dense Components in a Graph

Paper by Moses Charikar. Presentation by Paul Horn.

Presentation Transcript


  1. Greedy Approximation Algorithms for finding Dense Components in a Graph Paper by Moses Charikar Presentation by Paul Horn

  2. Overview • Differing definitions of density • The problem • Undirected Case • Linear Programming • Network Flows • Approximation • Directed Case • Linear Programming • Approximation

  3. Defining Density • The natural definition of density relates the number of edges to the number of possible edges. In other words, given G(V,E), the density is $|E| / \binom{|V|}{2}$.

  4. Problems with Density • This simple definition of density does not make sense when looking for a densest subgraph: any two vertices connected by an edge already have density 1, and asking for the largest subgraph of density 1 reduces the problem to maximum clique.

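  5. Redefining Density • Instead we define the density of a subgraph through its average degree: for a vertex set S, f(S) = |E(S)|/|S|, which is half the average degree of the subgraph induced by S. • This definition of density is appropriate for sparse graphs. • This definition is, however, inappropriate for Erdős-Rényi random graphs.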
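To make the contrast between the two definitions concrete, here is a minimal sketch. It assumes density is normalized as f(S) = |E(S)|/|S| (half the average degree); the function names and the example graph are illustrative and do not come from the slides.

```python
from itertools import combinations

def induced_edges(edges, subset):
    """Return the edges with both endpoints inside `subset`."""
    s = set(subset)
    return [(u, v) for u, v in edges if u in s and v in s]

def edge_ratio_density(edges, subset):
    """|E(S)| divided by the number of possible edges, C(|S|, 2)."""
    k = len(set(subset))
    possible = k * (k - 1) // 2
    return len(induced_edges(edges, subset)) / possible if possible else 0.0

def f(edges, subset):
    """Assumed normalization: f(S) = |E(S)| / |S|, half the average degree."""
    k = len(set(subset))
    return len(induced_edges(edges, subset)) / k if k else 0.0

# Hypothetical example graph: a 4-clique {0,1,2,3} plus a path 3-4-5.
edges = list(combinations(range(4), 2)) + [(3, 4), (4, 5)]

print(edge_ratio_density(edges, [4, 5]))        # 1.0 (a single edge)
print(edge_ratio_density(edges, [0, 1, 2, 3]))  # 1.0 (the 4-clique)
print(f(edges, [4, 5]))                         # 0.5
print(f(edges, [0, 1, 2, 3]))                   # 1.5
```

Under the edge-ratio definition the single edge ties with the clique, while the degree-based definition prefers the clique.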

  6. Density of a Directed Graph • Introduced by Kannan and Vinay. • Given a digraph G(V,E), consider vertex subsets S, T and let E(S,T) be the set of directed edges from S to T. Then the density of the sets S and T is $d(S,T) = \frac{|E(S,T)|}{\sqrt{|S|\,|T|}}$. • The density of the graph G is $d(G) = \max_{S,T \subseteq V} d(S,T)$.
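As a small illustration, the directed density can be computed directly from an edge list. The sketch assumes the Kannan and Vinay definition d(S,T) = |E(S,T)| / sqrt(|S| |T|); the function name and the hub/authority example are hypothetical.

```python
from math import sqrt

def directed_density(edges, S, T):
    """d(S, T) = |E(S, T)| / sqrt(|S| * |T|) for directed edges (u, v)."""
    S, T = set(S), set(T)
    if not S or not T:
        return 0.0
    # Count the edges that start in S and end in T.
    e_st = sum(1 for u, v in edges if u in S and v in T)
    return e_st / sqrt(len(S) * len(T))

# Hypothetical example: two "hubs" (0, 1), each pointing at three
# "authorities" (2, 3, 4).
edges = [(h, a) for h in (0, 1) for a in (2, 3, 4)]

print(directed_density(edges, {0, 1}, {2, 3, 4}))  # 6 / sqrt(6), about 2.449
print(directed_density(edges, {0}, {2}))           # 1.0
```

S and T need not be disjoint or of equal size, which is why the normalization uses the geometric mean of their sizes.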

  7. The problem • Known exact algorithms for finding a maximum density subgraph of a graph are cubic or slower. • For large graphs, such as the webgraph or even any sizable chunk of it, this is too slow.

  8. Linear programming • In the undirected case, an exact solution can be found by solving the following LP.
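The LP on this slide is not reproduced in the transcript. As a sketch, the LP formulation in Charikar's paper for the undirected case has the following shape, with a variable $y_i$ per vertex and $x_{ij}$ per edge (the variable names follow the usual presentation and should be read as an assumption here):

```latex
\begin{align*}
\text{maximize}\quad & \sum_{\{i,j\}\in E} x_{ij} \\
\text{subject to}\quad & x_{ij} \le y_i, \quad x_{ij} \le y_j && \forall \{i,j\}\in E \\
& \sum_{i\in V} y_i \le 1 \\
& x_{ij},\, y_i \ge 0
\end{align*}
```

To see why this captures density, take any vertex set S and set $y_i = 1/|S|$ for $i \in S$ and $x_{ij} = 1/|S|$ for every edge inside S, with all other variables zero. The solution is feasible and its objective value is $|E(S)|/|S|$.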

  9. Go with the flow? • A flow-based algorithm to find a maximum density subgraph exists: Finding a Maximum Density Subgraph, by A.V. Goldberg. • It creates a digraph from the undirected graph and uses flows to partition the graph. • Requires O(log n) executions of a max-flow algorithm.

  10. Getting Greedy… • Since the density of a subgraph S is determined by its average degree, nodes of lowest degree are least likely to be part of the densest subgraph. • Algorithm: Repeatedly remove the lowest-degree vertex; among all the intermediate subgraphs, return the one of maximum density. • Runs in linear time, O(|V| + |E|). • Theorem: The algorithm is a 2-approximation for the maximum of f(S).
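The peeling procedure above can be sketched as follows. This is a minimal illustration, not the linear-time implementation: it rescans for the minimum-degree vertex at every step, so it runs in roughly O(n^2 + m) time, and bucketed degree lists would be needed to reach linear time. Density is assumed to be f(S) = |E(S)|/|S|; the function name and the example graph are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def greedy_densest_subgraph(edges):
    """Peel minimum-degree vertices; return (best_density, best_vertex_set)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    alive = set(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2   # edges still present
    best_density, best_set = 0.0, set()
    while alive:
        density = m / len(alive)        # f(S) = |E(S)| / |S|
        if density > best_density:
            best_density, best_set = density, set(alive)
        # Remove a vertex of minimum degree in the current subgraph.
        v = min(alive, key=lambda x: len(adj[x]))
        for w in adj[v]:
            adj[w].discard(v)
        m -= len(adj[v])
        alive.remove(v)
        del adj[v]
    return best_density, best_set

# Hypothetical example graph: a 4-clique {0,1,2,3} plus a path 3-4-5.
edges = list(combinations(range(4), 2)) + [(3, 4), (4, 5)]
density, vertices = greedy_densest_subgraph(edges)
print(density, sorted(vertices))  # 1.5 [0, 1, 2, 3]
```

On this example the greedy pass strips the path vertices 5 and 4 first, and the best intermediate subgraph is the clique.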

  11. Directing our Insight • Finding the maximum d(S,T) is harder, as we need to maximize over all pairs of subsets S and T. • For the exact case, we can generalize our LP by using |S|/|T| = c as a parameter, giving a new linear program LP(c). • The exact solution requires solving O(n^2) linear programs, one for each possible value of c.

  12. LP(c) • A solution to this linear program corresponds to the densest sets S, T such that |S|/|T| = c for a given value of c. Therefore d(G) is the maximum, over all values of c, of the optimum of LP(c).
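The linear program itself is not reproduced in the transcript. As a sketch, LP(c) in Charikar's paper takes roughly the following form, with variables $s_i$ and $t_j$ for membership in S and T and $x_{ij}$ for each directed edge (the variable names are an assumption here):

```latex
\begin{align*}
\text{maximize}\quad & \sum_{(i,j)\in E} x_{ij} \\
\text{subject to}\quad & x_{ij} \le s_i, \quad x_{ij} \le t_j && \forall (i,j)\in E \\
& \sum_{i} s_i \le \sqrt{c} \\
& \sum_{j} t_j \le \frac{1}{\sqrt{c}} \\
& x_{ij},\, s_i,\, t_j \ge 0
\end{align*}
```

For sets S, T with $|S|/|T| = c$, setting $s_i = \sqrt{c}/|S|$ on S and $t_j = 1/(\sqrt{c}\,|T|)$ on T makes both quantities equal to $1/\sqrt{|S|\,|T|}$. Choosing $x_{ij} = 1/\sqrt{|S|\,|T|}$ on the edges of E(S,T) then gives objective value $|E(S,T)|/\sqrt{|S|\,|T|} = d(S,T)$.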

  13. Approximate this. • Idea: Maintain two sets, S and T. At each iteration, remove the vertex of lowest ‘degree’ from either S or T, based on a certain rule. • We define the degree of a vertex x in S to be |E({x}, T)| and the degree of a vertex y in T to be |E(S,{y})|. • Our rule is based on the same idea of c = |S|/|T| that we found in the linear program, so each pass looks for an S and T of high density for that particular c.
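A minimal sketch of one pass for a fixed c is given below. The slide does not state the removal rule, so the rule used here is an assumption: remove from S when sqrt(c) times the minimum degree in S is at most the minimum degree in T divided by sqrt(c), and otherwise remove from T. The function name and example are hypothetical, and degrees are recomputed at every step for clarity rather than speed.

```python
from math import sqrt

def greedy_directed(edges, vertices, c):
    """One greedy pass for ratio c; returns (best_density, best_S, best_T)."""
    S, T = set(vertices), set(vertices)
    E = {(u, v) for u, v in edges if u in S and v in T}  # live edges S -> T
    best = (0.0, set(), set())
    while S and T:
        d = len(E) / sqrt(len(S) * len(T))
        if d > best[0]:
            best = (d, set(S), set(T))
        # Degree of x in S is |E({x}, T)|; degree of y in T is |E(S, {y})|.
        out_deg = {x: 0 for x in S}
        in_deg = {y: 0 for y in T}
        for u, v in E:
            out_deg[u] += 1
            in_deg[v] += 1
        i_min = min(S, key=out_deg.get)
        j_min = min(T, key=in_deg.get)
        # Assumed rule (not stated on the slide): compare scaled min degrees.
        if sqrt(c) * out_deg[i_min] <= in_deg[j_min] / sqrt(c):
            S.remove(i_min)
            E = {(u, v) for u, v in E if u != i_min}
        else:
            T.remove(j_min)
            E = {(u, v) for u, v in E if v != j_min}
    return best

# Hypothetical example: hubs 0, 1 each point at authorities 2, 3, 4.
edges = [(h, a) for h in (0, 1) for a in (2, 3, 4)]
density, S, T = greedy_directed(edges, range(5), c=2 / 3)
print(round(density, 3), sorted(S), sorted(T))  # 2.449 [0, 1] [2, 3, 4]
```

The pass first strips vertices with no outgoing edges from S and vertices with no incoming edges from T, arriving at the hub and authority sets.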

  14. Analyzing our Approximation • When run over all values of c, this algorithm gives a 2-approximation of d(G). • There are, however, roughly n^2 possible values of c. • Each iteration can run in O(m+n) time. • Therefore running through all possible values becomes restrictive. • A (2+ε)-approximation is possible in O((log n)/ε) iterations of the algorithm.
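To get a feel for the number of values of c involved, the sketch below enumerates every distinct ratio |S|/|T| for sets of size 1 to n. For comparison it also builds a geometric grid over [1/n, n]; the grid is only an assumption about how fewer values of c might be sampled, and its approximation guarantee is not checked here. Both function names are hypothetical.

```python
from fractions import Fraction

def candidate_ratios(n):
    """All distinct values a/b with 1 <= a, b <= n (every possible |S|/|T|)."""
    return sorted({Fraction(a, b)
                   for a in range(1, n + 1)
                   for b in range(1, n + 1)})

def geometric_ratios(n, eps):
    """Geometric grid from 1/n up to at least n, with step factor (1 + eps)."""
    grid, c = [], 1.0 / n
    while c <= n * (1 + eps):
        grid.append(c)
        c *= 1 + eps
    return grid

print(len(candidate_ratios(3)))        # 7
print(len(candidate_ratios(10)))       # 63
print(len(geometric_ratios(10, 0.5)))  # far fewer values than the exact list
```

The exact list grows quadratically in n, while the grid grows only logarithmically in n for a fixed step.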

  15. Generalizations, and notes • While there is a flow-based algorithm for finding a maximum density subgraph of an undirected graph, none is known for a digraph. • Both cases can be generalized to weighted graphs; however, the linear running time of the algorithm no longer holds. • Using Fibonacci heaps, it can run in O(m + n log n) (in the directed case, for a single value of c).

  16. Wrapping Up • Finding dense subgraphs is important in areas such as clustering. • The Kannan and Vinay definition of density is motivated by the idea of hubs and authorities. • With large graphs (such as any sizable chunk of the webgraph), solving the n^2 LPs to find the exact densest subgraph is unrealistic.

  17. Wrapping Up: The Sequel • Therefore, the paper: • Provides LP solutions to both the directed and undirected cases. • Provides a linear-time approximation algorithm for undirected graphs. • Generalizes the algorithm to directed graphs, finding sets S and T given |S|/|T| = c. • Observes that this is a 2-approximation when run over all values of c, and that a (2+ε)-approximation is possible in O((log n)/ε) iterations.

  18. Future Work • Flow-based algorithm for the directed case. • The definition of density which we used does not require S and T to be disjoint. How does this requirement affect the algorithm and its complexity? • An n-approximation of d(G) can provide an O(n)-approximation of d’(G).
