PAGE: A Partition Aware Graph Computation Engine

Yingxia Shao, Junjie Yao, Bin Cui, Lin Ma. EECS, Peking University, China.


Presentation Transcript


  1. PAGE: A Partition Aware Graph Computation Engine
     Yingxia Shao, Junjie Yao, Bin Cui, Lin Ma
     EECS, Peking University, China

  2. Agenda
     • Background
     • Design of PAGE
     • Experimental results
     • Conclusion

  3. Background
     • Prevalent large-scale graphs
       • Social networks
       • Web graph
       • …
     • Graph computing systems
       • Pregel (Google)
       • Giraph (Apache)
       • GPS (Stanford)
       • GraphLab (CMU)
       • …

  4. Background
     • Graph partitioning
       • Offline approach
         • METIS (Karypis Lab)
       • Online approach
         • Streaming partitioning
         • Linear Deterministic Greedy (LDG) algorithm (I. Stanton)
     Problem: Existing graph computation systems cannot efficiently integrate high-quality graph partitioning.
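
The LDG rule named on this slide can be sketched as follows. This is a minimal illustration, not code from PAGE: each arriving vertex goes to the partition that already holds most of its neighbors, discounted by how full that partition is. The function name `ldg_partition`, the `(vertex, neighbors)` stream format, and the tie-break toward the emptier partition are assumptions made for the example.

```python
def ldg_partition(stream, k, capacity):
    """Assign streamed vertices to k partitions with the LDG scoring rule.

    stream   : iterable of (vertex, neighbors) pairs in arrival order (assumed format)
    capacity : maximum number of vertices allowed in one partition
    """
    assignment = {}      # vertex -> partition index
    sizes = [0] * k      # number of vertices currently in each partition
    for vertex, neighbors in stream:
        best, best_score = None, -1.0
        for p in range(k):
            if sizes[p] >= capacity:
                continue  # skip full partitions
            # Count neighbors that were already placed in partition p.
            shared = sum(1 for n in neighbors if assignment.get(n) == p)
            # LDG score: neighbor count weighted by remaining free capacity.
            score = shared * (1.0 - sizes[p] / capacity)
            # Ties go to the less loaded partition (an assumption of this sketch).
            if score > best_score or (score == best_score and sizes[p] < sizes[best]):
                best, best_score = p, score
        if best is None:
            raise ValueError("all partitions are full; increase capacity")
        assignment[vertex] = best
        sizes[best] += 1
    return assignment


# Two triangles joined by the single edge 2-3.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
parts = ldg_partition(adjacency.items(), k=2, capacity=3)
```

On this toy graph each triangle should end up in its own partition, so only the bridging edge is cut.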

  5. Inefficient partition integration
     • Higher-quality graph partitioning leads to worse overall performance.
     • Partition quality improves from left to right.
     Figure: Running PageRank on Giraph with six different graph partition qualities.
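
The slide does not say how partition quality is measured. A common choice is the edge-cut ratio (the fraction of edges whose endpoints lie in different partitions) together with a balance factor. The sketch below assumes those two definitions; the helper names are hypothetical.

```python
def edge_cut_ratio(edges, assignment):
    """Fraction of edges whose two endpoints are in different partitions."""
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    return cut / len(edges)


def balance_factor(assignment, k):
    """Relative excess of the largest partition over the ideal size (assumed definition)."""
    sizes = [0] * k
    for p in assignment.values():
        sizes[p] += 1
    ideal = len(assignment) / k
    return max(sizes) / ideal - 1.0


# Two triangles joined by one edge, split into one partition per triangle.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
assignment = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
ratio = edge_cut_ratio(edges, assignment)
balance = balance_factor(assignment, k=2)
```

A lower edge-cut ratio means a higher-quality partition under this definition, because fewer messages have to cross partition boundaries.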

  6. Motivation of PAGE
     Call for a novel graph computation engine that efficiently integrates graph partitionings of varying quality.

  7. Agenda
     • Background
     • Design of PAGE
     • Experimental results
     • Conclusion

  8. Message processor

  9. Inefficient partition integration
     • The local message processing cost dominates the overall cost.
     • Existing systems cannot provide enough local message processors.
     Figure: Running PageRank on Giraph with six different graph partition qualities.
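
To see why a better partition shifts the load onto local message processing, it helps to count messages in one vertex-centric PageRank superstep. The sketch below is a single-process illustration, not Giraph or PAGE code; the function name, the damping value 0.85, and the dictionary-based graph are assumptions for the example.

```python
def pagerank_superstep(adj, rank, assignment, damping=0.85):
    """Run one PageRank superstep and count local versus remote messages."""
    n = len(adj)
    incoming = {v: 0.0 for v in adj}
    local_msgs = remote_msgs = 0
    for u, neighbors in adj.items():
        if not neighbors:
            continue  # a vertex without neighbors sends nothing
        share = rank[u] / len(neighbors)
        for v in neighbors:
            incoming[v] += share  # the message u sends to v
            if assignment[u] == assignment[v]:
                local_msgs += 1   # sender and receiver share a partition
            else:
                remote_msgs += 1  # message crosses a partition boundary
    new_rank = {v: (1 - damping) / n + damping * incoming[v] for v in adj}
    return new_rank, local_msgs, remote_msgs


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
assignment = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
rank = {v: 1 / len(adj) for v in adj}
new_rank, local_msgs, remote_msgs = pagerank_superstep(adj, rank, assignment)
```

With one triangle per partition, 12 of the 14 messages stay local, which is the regime where the slide says local processing becomes the bottleneck.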

  10. Overview of PAGE
      PAGE applies an adaptive tuning mechanism and new cooperation methods.

  11. Newly Designed PAGE Worker

  12. Dual Concurrent Message Processor
      • First type of concurrency
        • A remote message processor (MP) and a local MP are embedded
      • Second type of concurrency
        • Each message processor contains a set of message process units
      • The degree of concurrency is determined automatically by the system itself.
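
The two levels of concurrency described on this slide can be sketched with two thread pools: one pool for local messages and one for remote messages, each with its own number of worker threads standing in for message process units. This is an illustrative sketch only; the class name `DualMessageProcessor`, the `(target, value)` message format, and the lock-based accumulator are assumptions, not the PAGE implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DualMessageProcessor:
    """Separate local and remote processors, each with its own pool of units."""

    def __init__(self, n_local_units, n_remote_units, handler):
        self.local_pool = ThreadPoolExecutor(max_workers=n_local_units)
        self.remote_pool = ThreadPoolExecutor(max_workers=n_remote_units)
        self.handler = handler

    def submit(self, message, is_local):
        # Route the message to the processor that matches its origin.
        pool = self.local_pool if is_local else self.remote_pool
        return pool.submit(self.handler, message)

    def close(self):
        # Wait until every queued message has been handled.
        self.local_pool.shutdown(wait=True)
        self.remote_pool.shutdown(wait=True)


def make_accumulator():
    """Build a thread-safe handler that sums message values per target vertex."""
    totals = {}
    lock = threading.Lock()

    def handle(message):
        target, value = message
        with lock:
            totals[target] = totals.get(target, 0.0) + value

    return totals, handle


totals, handle = make_accumulator()
processor = DualMessageProcessor(n_local_units=2, n_remote_units=1, handler=handle)
processor.submit(("a", 1.0), is_local=True)
processor.submit(("a", 2.0), is_local=False)
processor.submit(("b", 0.5), is_local=True)
processor.close()
```

In this sketch the unit counts are fixed at construction time, whereas the slide states that PAGE chooses them automatically.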

  13. Dynamic Concurrency Control Model (DCCM)
      • The DCCM determines the proper parameters, such as nmp, nmpl, nmpr.
      • The DCCM is built on top of two heuristic rules:
        • Ability Lower-bound
        • Workload Balance Ratio
      • Monitor
        • Tracks the necessary metrics
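
The slide names the two heuristic rules but does not give their formulas, so the sketch below is one plausible reading and should not be taken as the authors' model. It assumes that the ability lower-bound means total processing capacity must at least match the incoming message rate, and that the workload balance ratio means units are split between local and remote processors in proportion to their message rates. The mapping of `n_mp`, `n_mpl`, and `n_mpr` to total, local, and remote unit counts is likewise inferred from the parameter names, and `unit_rate` is a hypothetical monitored metric.

```python
import math


def choose_concurrency(local_rate, remote_rate, unit_rate):
    """Pick (n_mp, n_mpl, n_mpr) from monitored message rates.

    local_rate, remote_rate : incoming messages per second of each kind
    unit_rate               : messages per second one process unit can handle
    All three inputs are assumed to come from a monitor component.
    """
    total = local_rate + remote_rate
    if total == 0:
        return 2, 1, 1  # keep one unit of each kind when idle
    # Assumed ability lower-bound: enough units to keep up with the incoming rate,
    # and never fewer than one local plus one remote unit.
    n_mp = max(2, math.ceil(total / unit_rate))
    # Assumed workload balance ratio: split units in proportion to the two rates.
    n_mpl = round(n_mp * local_rate / total)
    n_mpl = min(max(1, n_mpl), n_mp - 1)
    n_mpr = n_mp - n_mpl
    return n_mp, n_mpl, n_mpr


well_partitioned = choose_concurrency(local_rate=900, remote_rate=100, unit_rate=100)
evenly_split = choose_concurrency(local_rate=500, remote_rate=500, unit_rate=250)
```

Under these assumptions a well-partitioned graph, where most messages are local, gets most of its units assigned to the local processor.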

  14. Agenda
      • Background
      • Design of PAGE
      • Experimental results
      • Conclusion

  15. Environment & Datasets
      • Experiment environment
        • A 24-node cluster
      • Dataset: uk-2007-05-u
        • Undirected
        • Vertex #: 105,153,952
        • Edge #: 6,603,753,128
      • Benchmark: PageRank
      Partition qualities: balance factor < 1%.

  16. Partition Awareness in PAGE
      Figure panels: PAGE, Giraph

  17. Comparison with the naive solution
      * Giraph-GPSop is the naive solution.

  18. Contribution & Conclusion
      • We identify the problem of partition-unaware inefficiency.
      • We build a new partition-aware graph computation engine, PAGE.
      • We design a Dynamic Concurrency Control Model, based on several heuristic rules, to better profile the characteristics of a graph partition.
      • Finally, we demonstrate PAGE's robustness and efficiency across different graph partition qualities.

  19. Thanks! Email: simon0227@gmail.com
