1 / 20

Fast Shortest Path Distance Estimation in Large Networks

Fast Shortest Path Distance Estimation in Large Networks. Michalis Potamias Francesco Bonchi Carlos Castillo Aristides Gionis. Context-aware Search. …use shortest-path distance in wikipedia links-graph!. Social Search. John searches Mary Ranking: Mary A Mary B

mei
Télécharger la présentation

Fast Shortest Path Distance Estimation in Large Networks

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Fast Shortest Path Distance Estimation in Large Networks Michalis Potamias Francesco BonchiCarlos Castillo Aristides Gionis

  2. Context-aware Search …use shortest-path distance in wikipedia links-graph! Shortest Paths in Large Networks @ CIKM 2009

  3. Social Search • John searches Mary • Ranking: • Mary A • Mary B • Mary C …use shortest-path distance in friendship graph! Shortest Paths in Large Networks @ CIKM 2009

  4. Problem and Solutions • DB: Graph G = (V,E) • Query: Nodes s and t in V • Goal: Compute fast shortest path d(s,t) • Exact Solution • BFS - Dijkstra • Bidirectional- Dijkstra with A* (aka ALT methods) • [Ikeda, 1994] [Pohl, 1971] [Goldberg and Harrelson, SODA 2005] • Heuristic Solution • Random Landmarks • [Kleinberg et al, FOCS 2004] [Vieira et al, CIKM 2007] • Better Landmarks! Shortest Paths in Large Networks @ CIKM 2009

  5. The Landmarks’ Method • Offline • Precompute distance of all nodes to a small set of nodes (landmarks) • Each node is associated with a vector with its SP-distance from each landmark (embedding) • Query-time • d(s,t) = ? • Combine the embeddings of s and t to get an estimate of the query Shortest Paths in Large Networks @ CIKM 2009

  6. Contribution • Proved that covering the network w. landmarks is NP-hard. • Devised heuristics for good landmarks. • Experiments with 5 large real-world networks and more than 30 heuristics. Comparison with state of the art. • Application to Social Search. Shortest Paths in Large Networks @ CIKM 2009

  7. Algorithmic Framework • Triangle Inequality • Observation: the case of equality Shortest Paths in Large Networks @ CIKM 2009

  8. The Landmarks’ Method • Selection: Select klandmarks • Offline: Run kBFS/Dijkstra and store the embeddings of each node: Φ(s)= <dG(u1, s), dG(u2, s), … , dG(uk, s)> = <s1, s2, …, sd> • Query-time: dG(s,t) = ? • Fetch Φ(s)and Φ(t) • Compute mini{si + ti} (i.e. inf of UB) ... in time O(k) Shortest Paths in Large Networks @ CIKM 2009

  9. Example query: d(s,t) Shortest Paths in Large Networks @ CIKM 2009

  10. Coverage Using Upper Bounds • A landmark ucovers a pair (s, t), if ulies on a shortest path from s to t • Problem Definition : find a set of k landmarks that cover as many pairs (s,t) in V x V • NP-hard • k = 1 : node with the highest betweenness centrality • k > 1 : greedy set-cover (too expensive) Shortest Paths in Large Networks @ CIKM 2009

  11. Basic Heuristics • Random (baseline) • Choose central nodes! • Degree • Closeness centrality • Closeness of u is the average distance of u to any vertex in G • Caveat: The selected landmarks may cover the same pairs: we need to make sure that landmarks cover different pairs!! Shortest Paths in Large Networks @ CIKM 2009

  12. Constrained Heuristics • Spread the landmarks in the graph! • Rank all nodes according to Degree or Centrality • Iteratively choose the highest ranking nodes. Remove h-neighbors of each selected node from candidate set • Denote as • Degree/h • Closeness/h • Best results for h = 1 Shortest Paths in Large Networks @ CIKM 2009

  13. Partitioning-based Heuristics • Use partitioning to spread nodes! • Utilize any partitioning scheme and… • Degree/P • Pick the node with the highest degree in each partition • Closeness/P • Pick the node with the highest closeness in each partition • Border/P • Pick the nodes close to the border in each partition. Maximize the border-value that is given from the following formula: Shortest Paths in Large Networks @ CIKM 2009

  14. Border/P d1(u) = 3 d2(u) = 3 d3(u) = 2 b(u) = d1(u)*(d2(u) + d3(u))= 3(3 + 2) Shortest Paths in Large Networks @ CIKM 2009

  15. Versus Random - error Shortest Paths in Large Networks @ CIKM 2009

  16. Versus Random - triangulation Shortest Paths in Large Networks @ CIKM 2009

  17. Versus ALT - efficiency Shortest Paths in Large Networks @ CIKM 2009

  18. Social Search Task Shortest Paths in Large Networks @ CIKM 2009

  19. Conclusion • Heuristic landmarks yield remarkable tradeoffs for SP-distance estimation in huge graphs • Hard to find the optimal landmarks • Border/P and Centrality heuristics outperform Random even by a factor of 250. • For a 10% error, thousand times faster than state of the art exact algorithms (ALT) • Novel search paradigms need distance as primitive • Approximations should be computed in milliseconds • Future Work • Provide fast estimation for more graph primitives! Shortest Paths in Large Networks @ CIKM 2009

  20. Thank you! ? Shortest Paths in Large Networks @ CIKM 2009

More Related