
Tree-based Overlay Networks for Scalable Applications and Analysis


Presentation Transcript


  1. Tree-based Overlay Networksfor Scalable Applications and Analysis Dorian Arnold darnold@cs.wisc.edu Barton P. Miller bart@cs.wisc.edu Computer Sciences Department University of Wisconsin

  2. Overview • Extremely large scale systems are here • Effective, scalable programming is hard • Tree-based Overlay Networks (TBŌNs) • Simple yet powerful model • Effective for tool scalability • Applied to a variety of areas • Paradyn Performance Tools • Vision algorithms • Stack trace analysis (new, this summer) • New concepts in fault tolerance (no logs, no hot-backups).

  3. HPC Trends: June ’06 processor count distribution; systems larger than 1024K [chart not reproduced in the transcript]

  4. “I think that I shall never see / An algorithm lovely as a tree.” (adapted from “Trees” by Joyce Kilmer, 1919) • Original: “I think that I shall never see / A poem lovely as a tree.” (“Trees”, Joyce Kilmer, 1919) • If you can formulate the problem so that it is hierarchically decomposed, you can probably make it run fast.

  5. Hierarchical Distributed Systems • Hierarchical topologies • Application control • Data collection • Data centralization/analysis • As scale increases, the front-end becomes the bottleneck [diagram: FE connected directly to all BEs]

  6. TBŌNs for Scalable Systems • Scalable multicast • Scalable gather • Scalable data aggregation [diagram: FE and back-ends (BE)]

  7. TBŌN Model • Application front-end (FE) • Tree of communication processes (CPs) • Application back-ends (BEs) [diagram: FE at the root, CPs in the middle, BEs at the leaves] (an example topology sketch follows)
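To make the tree concrete, a front-end, two communication processes, and four back-ends might be laid out with a small topology description along the following lines. This is a minimal sketch: the host names are placeholders, and the exact MRNet topology-file syntax may differ between releases.

    fe-host:0  => cp-host1:1 cp-host2:2 ;
    cp-host1:1 => be-host1:3 be-host2:4 ;
    cp-host2:2 => be-host3:5 be-host4:6 ;

Each line names a parent process and its children; the front-end passes a description like this to the Network constructor shown on slide 11, and MRNet instantiates the tree of communication processes accordingly.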

  8. TBŌN Model • Application-level packets travel through the tree • Each communication process applies a packet filter • A filter may keep filter state between waves of packets (see the sketch below) [diagram: FE, CPs with packet filter and filter state, BEs]
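To illustrate what "filter state" means, here is a hypothetical stateful filter in the style of slide 13: it keeps a running sum that persists across successive waves of packets. The state argument and its handling are simplified assumptions, not the exact MRNet filter signature.

    // Sketch only: a filter whose state survives between invocations, so the
    // value it forwards reflects every packet this CP has seen so far.
    struct SumState {
        long running_sum = 0;                    // persists across waves of packets
    };

    void running_sum_filter(vector<Packet>& packets_in,
                            vector<Packet>& packets_out,
                            SumState* state)     // hypothetical per-stream state object
    {
        for (unsigned i = 0; i < packets_in.size(); i++)
            state->running_sum += packets_in[i].get_int();   // fold in children's values

        Packet p("%ld", state->running_sum);     // emit one packet with the cumulative sum
        packets_out.push_back(p);
    }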

  9. TBŌNs at Work • Multicast • ALMI [Pendarakis, Shi, Verma and Waldvogel ’01] • End System Multicast [Chu, Rao, Seshan and Zhang ’02] • Overcast [Jannotti, Gifford, Johnson, Kaashoek and O’Toole ’00] • RMX [Chawathe, McCanne and Brewer ’00] • Multicast/gather (reduction) • Bistro (no reduction) [Bhattacharjee et al. ’00] • Gathercast [Badrinath and Sudame ’00] • Lilith [Evensky, Gentile, Camp and Armstrong ’97] • MRNet [Roth, Arnold and Miller ’03] • Ygdrasil [Balle, Brett, Chen, LaFrance-Linden ’02] • Distributed monitoring/sensing • Ganglia [Sacerdoti, Katz, Massie, Culler ’03] • Supermon (reduction) [Sottile and Minnich ’02] • TAG (reduction) [Madden, Franklin, Hellerstein and Hong ’02]

  10. Example TBŌN Reductions • Simple • Min, max, sum, count, average (see the averaging sketch below) • Concatenate • Complex • Clock synchronization [Roth, Arnold, Miller ’03] • Time-aligned aggregation [Roth, Arnold, Miller ’03] • Graph merging [Roth, Miller ’05] • Equivalence relations [Roth, Arnold, Miller ’03] • Mean-shift image segmentation [Arnold, Pack, Miller ’06] • Stack trace analysis
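One subtlety with reductions such as average: each communication process must forward a partial (sum, count) pair, because averaging the children's averages weights subtrees incorrectly. A minimal sketch in the style of slide 13, where the two-integer packet layout and the unpack call are assumptions for illustration:

    void avg_filter(vector<Packet>& packets_in, vector<Packet>& packets_out)
    {
        int sum = 0, count = 0;
        for (unsigned i = 0; i < packets_in.size(); i++) {
            int s, c;
            packets_in[i].unpack("%d %d", &s, &c);   // assumed layout: partial sum, sample count
            sum   += s;
            count += c;
        }
        Packet p("%d %d", sum, count);               // forward the combined partial result;
        packets_out.push_back(p);                    // the front-end divides sum by count at the root
    }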

  11. MRNet Front-end Interface

    front_end_main() {
        Network* net = new Network(topology);
        Communicator* comm = net->get_BroadcastCommunicator();
        Stream* stream = new Stream(comm, IMAX_FILT, WAITFORALL);
        int result;
        stream->send("%s", "go");      // multicast the command to all back-ends
        stream->recv("%d", &result);   // one reply: the tree-wide reduced value
    }

  12. MRNet Back-end Interface

    back_end_main() {
        Stream* stream;
        char* s;
        Network* net = new Network();
        net->recv("%s", &s, &stream);        // block until the front-end sends a command
        if (strcmp(s, "go") == 0) {          // compare string contents, not pointers
            stream->send("%d", rand_int);    // send this back-end's value up the tree
        }
    }

  13. MRNet Filter Interface

    imax_filter(vector<Packet>& packets_in, vector<Packet>& packets_out) {
        int result = INT_MIN;
        for (unsigned i = 0; i < packets_in.size(); i++) {
            result = max(result, packets_in[i].get_int());   // fold in each child's value
        }
        Packet p("%d", result);
        packets_out.push_back(p);      // forward a single reduced packet upstream
    }
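Putting the three pieces together: because the front-end opened the stream with IMAX_FILT, this filter runs at every communication process in the tree. Each CP folds its children's packets into one maximum and forwards a single packet, so the front-end's recv on slide 11 returns one integer, the global maximum, regardless of how many back-ends answered "go".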

  14. TBŌNs for Tool Scalability • MRNet integrated into Paradyn • Efficient tool startup • Performance data analysis • Scalable visualization • Equivalence computations • Graph merging • Trace analysis • Data clustering (image analysis) • Scalable stack trace analysis

  15. Paradyn Start-up Latency Results • Paradyn with SMG2000 on ASCI Blue Pacific [charts not reproduced in the transcript]

  16. TBŌNs for Scalable Apps: Mean-Shift Algorithm • Cluster points in feature spaces • Useful for image segmentation • Prohibitively expensive as feature-space complexity increases

  17. TBŌNs for Scalable Apps: Mean-Shift Algorithm • ~6x speedup with only 6% more nodes

  18. Recent Project: Peta-scalable Tools • In collaboration with LLNL • Stack Trace Analysis (STA) • Data representation • Data analyses • Visualization of results

  19. STA Motivation • Discover application behavior • Progressing or deadlocked? • Infinite loop? • Starvation? • Load balanced? • Target: petascale systems • Targeted for BG/L

  20. Some Observations • Debugging occurs after problems manifest • Tool goals: • Pinpoint symptoms as precisely as possible • Direct users to the root cause

  21. Users need better tools • LLNL parallel debug sessions (03/01/2006 – 05/11/2006): 18,391 sessions!

  22. STA Approach • Sample application stack traces • Merge/analyze traces: • Discover equivalent process behavior • Group similar processes • Facilitates scalable analysis/data presentation • Leverage TBŌN model (MRNet)
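A minimal sketch of the merge step, with illustrative data structures rather than the tool's actual implementation: each sampled stack trace is a path of function names, and merging inserts that path into a prefix tree whose nodes record which process ranks reached them, so processes with identical call paths collapse into a single branch.

    #include <map>
    #include <memory>
    #include <set>
    #include <string>
    #include <vector>

    // One node per distinct call-path prefix (illustrative only).
    struct TraceNode {
        std::set<int> processes;                                    // ranks whose traces pass through this frame
        std::map<std::string, std::unique_ptr<TraceNode>> children; // callee frames, keyed by function name
    };

    // Merge one process's stack trace (outermost frame first) into the tree.
    void merge_trace(TraceNode& root, const std::vector<std::string>& frames, int rank)
    {
        TraceNode* node = &root;
        for (const std::string& f : frames) {
            std::unique_ptr<TraceNode>& child = node->children[f];
            if (!child) child.reset(new TraceNode);   // first time this frame is seen on this path
            node = child.get();
            node->processes.insert(rank);             // record that this rank reached this frame
        }
    }

Because equivalent traces share one path, the merged tree grows with the number of distinct behaviors rather than the number of processes, which is what lets the reduction scale when it is applied at every level of the TBŌN.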

  23. The Basic Stack Trace

  24. 2D-Process/Space View • Single sample, multiple processes • Loosely synchronized distributed snapshot • Color distinguishes similar behavior • Distinction by invocation leads to trees • Misses temporal information.

  25. 2D-Process/Time View • Multiple samples, single process • Track process behavior over time • Chronology information lost • One graph per process

  26. 2D-Process/Time View

  27. 3D-Process/Space/Time Analysis • Multiple samples, multiple processes • Track global program behavior over time • Folds all processes together • Challenges: • Scalable data representations • Scalable analyses • Scalable and useful visualizations/results

  28. 3D-Process/Space/Time Analysis

  29. Scalable 3D Analysis • Merge temporal traces locally • Combine merged process-local traces into global program trace

  30. STA Tool Front-end • MRNet front-end creates tree to appl. nodes • Sets up MRNet stream w/ STA filter • Controls daemon sampling (start, count, freq.) • Collects single merged stack trace tree • Post-process: color code equivalence classes
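A sketch of how the front-end might drive this, reusing the calls from slide 11; STA_FILT, the message formats, and the sampling parameters are assumptions for illustration, not the tool's actual protocol.

    void sta_front_end(const char* topology)
    {
        Network* net = new Network(topology);
        Communicator* comm = net->get_BroadcastCommunicator();
        Stream* stream = new Stream(comm, STA_FILT, WAITFORALL);  // trace-merging filter at each CP

        // Ask every daemon for 10 samples, one second apart (start, count, freq.)
        stream->send("%s %d %d", "sample", 10, 1);

        char* merged_traces;                  // serialized, already-merged trace tree
        stream->recv("%s", &merged_traces);   // a single reply, merged on the way up the tree

        // Post-process: color-code equivalence classes and display the 2D/3D views.
    }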

  31. STA Tool Daemon • MRNet back-end • Dyninst to sample traces from unmodified applications (no source code needed) • 1 daemon per node • Merge collected traces locally • Propagate to front-end
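And a matching sketch of the daemon side, in the style of slide 12. Here sample_stack_trace(), my_rank(), and serialize() are placeholders (the real daemon walks the stack with Dyninst), and merge_trace/TraceNode are the illustrative structures from the slide 22 sketch.

    void sta_back_end()
    {
        Stream* stream;
        char* cmd;
        int count, freq;

        Network* net = new Network();
        net->recv("%s %d %d", &cmd, &count, &freq, &stream);   // sampling request from the front-end

        if (strcmp(cmd, "sample") == 0) {
            TraceNode local_tree;                              // per-process trace, merged locally first
            for (int i = 0; i < count; i++) {
                merge_trace(local_tree, sample_stack_trace(),  // placeholder for the Dyninst stack walk
                            my_rank());
                sleep(freq);                                   // wait between samples
            }
            stream->send("%s", serialize(local_tree).c_str()); // propagate the local result up the tree
        }
    }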

  32. STA Tool Performance

  33. TBŌNs Summary • Simple model to understand, simple to program. • Good foundation for run-time tools, monitoring and many distributed applications. • Current research: no log, no hot-backup fault tolerance. • Open source: http://www.cs.wisc.edu/paradyn/mrnet/

  34. MRNet References • Arnold, Pack and Miller: “Tree-based Overlay Networks for Scalable Applications”, Workshop on High-Level Parallel Programming Models and Supportive Environments, April 2006. • Roth and Miller, “The Distributed Performance Consultant and the Sub-Graph Folding Algorithm: On-line Automated Performance Diagnosis on Thousands of Processes”, PPoPP, March 2006. • Schulz et al, “Scalable Dynamic Binary Instrumentation for Blue Gene/L”, Workshop on Binary Instrumentation and Applications, September, 2005. • Roth, Arnold and Miller, “Benchmarking the MRNet Distributed Tool Infrastructure: Lessons Learned”, 2004 High-Performance Grid Computing Workshop, April 2004. • Roth, Arnold and Miller, “MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools”, SC 2003, November 2003. www.cs.wisc.edu/paradyn

  35. 2D-Process/Space (Totalview)
