Dense Subgraphs on Dynamic Networks

Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Amitabh Trehan (Presented by Anisur Molla)

Density • “Network density is probably the most fundamental network metric for understanding how networks tick....” Third Degree Centrality (blog), June 16, 2011

Sparse but Dense • While a graph may be globally sparse, it often still has dense substructures • These provide topological characteristics that are often important to understand • It is important to study how to find dense subgraphs in large graphs, such as: • The World Wide Web • Search query-document click logs • Social networks • Telephone call logs • Peer-to-peer backbone networks

What do dense structures reveal? • Web social network communities (potentially hidden) • Friend groups / shared interest groups • Good structures to study for “cohesive” webpages • Helpful for identifying similar webpages • Potentially helps spam detection • Network backbones in peer to peer networks • Understand connectivity structure • User behavior/interest analysis from click logs

Properties of Density • Largely “robust” to graph alterations • Small changes in the graph (such as edge additions/deletions) only marginally affect the density, so it is “smooth” in this regard • Relatively stable for dynamic graphs • Measures a “local” structural property that often reveals local and global topological insights • Some variants of the problem are poly-time solvable

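Density • Density of graph G(V,E): $\rho(G) = |E| / |V|$ • Density of subgraph S (= induced density on G): $\rho(S) = |E(S)| / |S|$, where $E(S)$ is the set of edges of G with both endpoints in S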
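A minimal sketch of these two definitions, assuming density means edges per node (the reading suggested by ratios such as 16/11 in the worked example later in the deck). The names `density`, `induced_density`, and the toy graph `EDGES` are hypothetical and not from the talk.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]

def density(num_nodes: int, edges: Iterable[Edge]) -> float:
    """Edges per node of the whole graph: |E| / |V| (assumed definition)."""
    edge_set = {frozenset(e) for e in edges}  # undirected, de-duplicated
    return len(edge_set) / num_nodes if num_nodes else 0.0

def induced_density(subset: Set[int], edges: Iterable[Edge]) -> float:
    """Density of the subgraph induced by `subset`: |E(S)| / |S|."""
    inside = {frozenset((u, v)) for u, v in edges
              if u in subset and v in subset}  # keep edges fully inside S
    return len(inside) / len(subset) if subset else 0.0

# Toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
```

On this toy graph the whole graph has 8 edges on 6 nodes, while the 4-clique alone has 6 edges on 4 nodes, so the clique is denser than the graph that contains it.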

Our problem • Efficient distributed algorithms for discovering densest subgraphs and bounded-size densest subgraphs • Maintaining the subgraphs when edges change (dynamic graphs)

Our Dynamic Model • Initial graph over n nodes • Edge-dynamic model • At each time step, an adversary may add or remove up to r edges • Constraint: the “dynamic diameter” is bounded by D • After the adversarial action, nodes communicate with their direct neighbors under the CONGEST model
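One way to see why a bounded change rate keeps density stable is the short bound below. It assumes density is measured as edges per node, and the symbols $\rho_t(S)$ and $E_t(S)$ (density and induced edge set of a fixed node set S at time step t) are introduced here only for illustration. If at most r edges are added or removed in one step, then:

```latex
\left| \rho_{t+1}(S) - \rho_t(S) \right|
  = \frac{\bigl|\, |E_{t+1}(S)| - |E_t(S)| \,\bigr|}{|S|}
  \le \frac{r}{|S|}
```

So under this reading, the density of a large node set moves only slightly per time step, while small sets can change more abruptly.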

Distributed CONGEST Model • Synchronous communication “rounds” • A node can exchange messages with each of its neighbors in each round • Each message should be of size O(log n) (bandwidth restriction) • Objective: minimize the number of “rounds” (time complexity) • The model is a well-studied theoretical abstraction, motivated by peer-to-peer networks
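To make the round structure concrete, here is a small simulation sketch of synchronous rounds in which every node sends one small value (its largest known node ID) to each neighbor per round. The function `flood_max` and the example graph `PATH` are hypothetical illustrations, not part of the talk, and the simulation runs on a static graph only.

```python
from typing import Dict, List, Tuple

def flood_max(adj: Dict[int, List[int]]) -> Tuple[Dict[int, int], int]:
    """Simulate synchronous rounds: each node sends its current best value
    (a single integer, i.e. a short message) to every neighbor.
    Returns (final values, number of rounds in which some value changed)."""
    value = {v: v for v in adj}  # each node starts with its own ID
    rounds = 0
    while True:
        # All messages of a round are computed from the previous round's state.
        incoming = {v: [value[u] for u in adj[v]] for v in adj}
        new_value = {v: max([value[v]] + incoming[v]) for v in adj}
        if new_value == value:
            return value, rounds
        value = new_value
        rounds += 1

# Path 0 - 1 - 2 - 3, whose diameter is 3.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On the path, the largest ID needs three rounds to reach the far end, which matches the diameter. This is the basic reason that any answer depending on the whole graph needs a number of rounds that grows with the diameter.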

Additional details • Algorithms run continuously to maintain the approximations at all times • Self-awareness: nodes are aware of whether they are part of the output subgraph • Nodes need knowledge of the dynamic diameter D • Cost is measured in approximation guarantees as well as time bounds

Related Work • Lots of work on finding size-bounded dense subgraphs in the classical setting • NP-hard with a size restriction (poly-time solvable otherwise) • No approximation scheme for size-exactly-k or size-at-most-k (and no constant factor known) • Khuller-Saha and Andersen-Chellapilla gave constant-factor algorithms for size at-least-k • Some of our algorithms are based on Khuller-Saha • Surprisingly, no prior work in the distributed (CONGEST model) and dynamic settings

Related Work - 2 • Lots of work on dynamic networks • Notable recent model for edge alteration by Kuhn-Lynch-Oshman (stability property with T-interval connectivity) • Our model is slightly different (though similar graphs are generated) • Many graph problems have been studied in the CONGEST model (Peleg) • Very fast distributed approximation algorithms have been studied • Densest subgraph falls under the category of “global problems”, so it needs Ω(D) rounds • For several global problems there is a recent lower bound (Das Sarma et al.) • Densest subgraph is a problem for which this lower bound does not hold

Our Results: Densest Subgraph Problem • We give a distributed algorithm for any dynamic graph with dynamic diameter D and rate r that w.h.p. (2+ε)-approximates the densest subgraph at that time, given that the max density is at least …

Our Results: Densest Subgraph Problem, static graphs • We give a distributed algorithm that obtains a (2+ε)-approximation w.h.p. in O(D log n) rounds of the CONGEST model • Note: first known algorithm for this problem in the CONGEST model

Our Results: At-least-k Densest Subgraph (the subgraph should have at least k vertices) • We give an algorithm that w.h.p. approximates the at-least-k densest subgraph at that time, given that the max at-least-k density is at least … • The centralized version is known to be NP-hard

Our Results: At-least-k Densest Subgraph, static graphs • A distributed algorithm that w.h.p. obtains an approximation in O(D log n) rounds of the CONGEST model

Algorithm for Densest Subgraph

Previous 2-Approximation Algorithm [Khuller-Saha’09] • Iteratively remove the lowest-degree node and keep track of density • Densities: 16/11, 15/10, 14/9, 13/8, 12/7, 10/6, 9/5, 6/4, 3/3, 1/2, 0 • 2-approximated density = largest density (here 9/5) • Figure legend: red = lowest degree, gray = deleted • Slide idea borrowed from Sergei Vassilvitskii, “MapReduce Algorithmics”, presented at the “Large-Scale Distributed Computation” Seminar 2011, Japan

Previous 2-Approximation Algorithm [Khuller-Saha’09] • Iteratively remove the lowest-degree node and keep track of density • Inefficient to implement even on static distributed networks (needs Ω(n) rounds)
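The peeling procedure above can be sketched in a few lines of centralized Python. This is only an illustration under the assumption that density is edges per node. The function name `greedy_peel`, the tie-breaking by node ID, and the toy graph `EDGES` are choices made here, not taken from the talk.

```python
from typing import Dict, Iterable, Set, Tuple

def greedy_peel(edges: Iterable[Tuple[int, int]]) -> Tuple[float, Set[int]]:
    """Remove one lowest-degree node at a time and remember the densest
    intermediate subgraph (density assumed to be edges / nodes)."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    best_density, best_nodes = 0.0, set()
    while adj:
        d = num_edges / len(adj)
        if d > best_density:
            best_density, best_nodes = d, set(adj)
        # Lowest-degree node; ties broken by smaller ID (arbitrary choice).
        u = min(adj, key=lambda x: (len(adj[x]), x))
        for w in adj[u]:
            adj[w].discard(u)
        num_edges -= len(adj[u])
        del adj[u]
    return best_density, best_nodes

# Toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
```

The loop removes exactly one node per iteration, and each removal depends on the degrees left by the previous one. That sequential chain of n steps is what makes a direct distributed implementation slow.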

Our (2+ε)-approximation algorithm

Our (2+ε)-approximation algorithm • Iteratively remove all nodes such that … and keep track of density (say, all nodes with degree below the average degree) • Average degree: 32/11, 18/5, 6/3, 0 • Densities: 16/11, 9/5, 3/3, 0 • (2+ε)-approximated density = largest density (here 9/5) • Figure legend: red = degree lower than average, gray = deleted

Our (2+ε)-approximation algorithm • Iteratively remove all nodes such that … and keep track of density • Can be implemented on static distributed networks in O(D log n) rounds • A similar idea was independently discovered by Bahmani, Kumar, and Vassilvitskii [VLDB’12] for solving this problem in the streaming and MapReduce models
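A centralized sketch of the batch-removal idea follows. The slides elide the exact removal condition, so the threshold used here is an assumption: a node is removed when its degree is at most 2(1+ε) times the current density, i.e. (1+ε) times the average degree, which is the rule commonly associated with the streaming variant. With this particular threshold the guarantee is a factor 2(1+ε), so this is not necessarily the exact rule or constant of the talk. The names `batch_peel`, `eps`, and `EDGES` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

def batch_peel(edges: Iterable[Tuple[int, int]],
               eps: float = 0.1) -> Tuple[float, Set[int], int]:
    """Each pass removes ALL nodes whose degree is at most
    2 * (1 + eps) * density (assumed threshold, see lead-in).
    Returns (best density seen, its node set, number of passes)."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    best_density, best_nodes = 0.0, set()
    passes = 0
    while adj:
        d = num_edges / len(adj)
        if d > best_density:
            best_density, best_nodes = d, set(adj)
        threshold = 2 * (1 + eps) * d
        # The minimum degree never exceeds the average degree 2 * d,
        # so at least one node is removed in every pass.
        low = [v for v in adj if len(adj[v]) <= threshold]
        for u in low:
            for w in adj[u]:
                adj[w].discard(u)
            num_edges -= len(adj[u])
            del adj[u]
        passes += 1
    return best_density, best_nodes, passes

# Toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
```

On the toy graph the first pass drops the two path nodes together and the second pass sees the 4-clique, so two passes suffice where one-at-a-time peeling needs six removals. Under the assumed threshold, every pass discards at least a constant fraction of the remaining nodes, which is where a logarithmic number of passes comes from. In a distributed setting each pass would presumably cost on the order of D rounds to aggregate the node and edge counts.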
