This presentation describes a sketch-based algorithm for online distance computation on massive graphs such as social networks and the web graph. A small sketch is precomputed for every node, and a distance query between two nodes u and v is answered by combining their two sketches. Both undirected and directed distances are covered. The approach is evaluated on a web graph of 65 million pages and 2.3 billion edges, where sketches take roughly 200 bytes per node for undirected distances and roughly 400 bytes for directed distances.
Sketch-Based Distance Estimates for Web Scale Graphs
Atish Das Sarma (Georgia Tech), Sreenivas Gollapudi, Marc Najork, and Rina Panigrahy (Microsoft)

Distance Computation Algorithm
• Online Distance Computation on Massive Graphs
• Distance/path computation on Social Networks
• Distance between search and ad results
• Building block for other online algorithms
• Pre-computation: all sketches
• Query time: nodes u and v; sketches are retrieved at runtime
• Road Networks
  – Already solved very efficiently (specific to 2D)

Sketch Based Distances
For all nodes x, precompute a small piece of information, Sketch(x). At query time, combine Sketch(u) and Sketch(v) to estimate the distance.

Effectiveness of our Algorithm
Real Data
• 65M web pages, 420M URLs, 2.3B edges
• C = 60M (directed), C = 128M (undirected)
• Undirected distance in [1, 15]
• Directed distance in [1, 100] (∞ otherwise)
• Sketch size: (s + 8) · k · ⌊log C⌋ bits
  – k = 3: number of copies of seed sets
  – s = 12: size of seed id; 8 to store distance
  – ~200 bytes (undirected), ~400 bytes (directed)

Sketch computation
Repeatedly (k times), sample random sets of nodes S of sizes 2^0, 2^1, 2^2, …, 2^⌊log C⌋ from the candidate set C and, for every node in the graph, store the nearest node of S and the distance to it.
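
The pre-computation and query steps above can be illustrated with a minimal Python sketch for the undirected case. It is an illustration under stated assumptions, not the authors' implementation: the function names (`nearest_seed_bfs`, `build_sketches`, `estimate_distance`, `sketch_size_bits`), the adjacency-dict graph representation, the use of a multi-source BFS, the reading of the garbled exponent as ⌊log2 C⌋, and the rule of combining two sketches by taking the minimum over shared seeds are all assumptions made here.

```python
import random
from collections import deque


def nearest_seed_bfs(adj, seeds):
    """Multi-source BFS: map every reachable node to (nearest seed, hop distance)."""
    nearest = {s: (s, 0) for s in seeds}
    queue = deque(seeds)
    while queue:
        x = queue.popleft()
        seed, dist = nearest[x]
        for y in adj[x]:
            if y not in nearest:
                nearest[y] = (seed, dist + 1)
                queue.append(y)
    return nearest


def build_sketches(adj, candidates, k=3, rng=None):
    """Precompute Sketch(x) for every node x as a dict {seed: distance}.

    Assumption: seed-set sizes are 2^0 .. 2^floor(log2 |C|), repeated k times.
    """
    rng = rng or random.Random(0)
    sketches = {x: {} for x in adj}
    max_exp = len(candidates).bit_length() - 1  # floor(log2 |C|)
    for _ in range(k):
        for i in range(max_exp + 1):
            seeds = rng.sample(candidates, 2 ** i)
            for x, (seed, dist) in nearest_seed_bfs(adj, seeds).items():
                # dist is the exact distance from x to this particular seed
                sketches[x][seed] = dist
    return sketches


def estimate_distance(sketches, u, v):
    """Combine Sketch(u) and Sketch(v); returns None if they share no seed.

    Assumed combination rule: min over common seeds w of d(u, w) + d(w, v),
    which is an upper bound on the true undirected distance.
    """
    if u == v:
        return 0
    su, sv = sketches[u], sketches[v]
    common = su.keys() & sv.keys()
    if not common:
        return None
    return min(su[w] + sv[w] for w in common)


def sketch_size_bits(num_candidates, k=3, seed_id_size=12, dist_size=8):
    """Slide formula (s + 8) * k * log C, with log C taken as floor(log2 C)."""
    log_c = num_candidates.bit_length() - 1
    return (seed_id_size + dist_size) * k * log_c
```

For example, on a small path graph `adj = {0: [1], 1: [0, 2], 2: [1]}`, calling `build_sketches(adj, [0, 1, 2])` and then `estimate_distance(sketches, 0, 2)` returns a value that is never below the true distance of 2. Under the floor assumption, `sketch_size_bits(128_000_000)` gives 1560 bits (195 bytes), in line with the roughly 200 bytes quoted for the undirected case. For the directed case, one sketch built by following edges forward and one by following them backward would be needed (an assumption here), which would be consistent with the roughly doubled size.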