Efficient and contention-free virtualisation of fat-trees

Efficient and contention-free virtualisation of fat-trees. Frank Olaf Sem-Jacobsen, Åshild Grønstad Solheim, Olav Lysne, Tor Skeie, and Thomas Sødring. May 16, 2011. Outline: the virtualisation challenge; the different design choices; the allocation algorithm; exploration; conclusion.

Presentation Transcript


  1. Efficient and contention-free virtualisation of fat-trees • Frank Olaf Sem-Jacobsen, Åshild Grønstad Solheim, Olav Lysne, Tor Skeie, and Thomas Sødring • May 16, 2011

  2. Outline • The virtualisation challenge • The different design choices • The allocation algorithm • Exploration • Conclusion

  3. The virtualisation challenge • What to aim for? • Virtualisation is a tool for getting maximum performance out of a computer cluster. • The goal: • Full utilisation of the HPC hardware • The means: • Multiple applications in parallel • Minimal interference between the applications • Leads to higher revenue and predictable performance • Relies on an allocation algorithm [Figure labels: system utilisation, application performance, #allocated jobs]

  4. The virtualisation challenge • Virtualising what? • Typical topologies are mesh, torus, and fat-tree • Mesh/Torus • Locality almost guarantees contention freedom (submeshes) • Fat-trees may seem straightforward, but they suffer from contention at high utilisation

  5. The different design choices • The design space • Locality • Maximum utilisation • Contention freedom/routing containment [Figure label: unused, fragmented]

  10. The resources • Multiple paths versus virtual channels • Multiple paths • Allows full physical separation of traffic from different applications • K sets • Unused channels reserved to one application cannot be utilised by another • Virtual channels • Logical separation • Traffic shares physical channels • Capacity is dynamically shared between applications • Applications can affect each other, but to a lesser extent • Virtual channels can be assigned to different priorities • Combination • Multiple virtual channels can be combined with specific paths to increase the allocation granularity. • For example • Multiple applications with little communication can be separated using virtual channels on the same paths. • An application with higher communication requirements may benefit from being isolated to its own paths.

  11. We have… • Developed an allocation algorithm to exploit these design possibilities • Explored a set of the trade-offs offered by the design choices

  12. The allocation algorithm • Calculate the tree depth based on allocation size () • Traverse the tree from this depth and above • Find a node with a sufficient amount of free processors through one or more branches • If all the necessary branches have a common free resource, allocation is successful • Alternatively, allocate a fraction of resources equal to the fraction of nodes allocated from the subtree • Otherwise, try the next node, or move up one level in the allocation tree • Deallocation • Free every node upwards in the allocation tree
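The steps on this slide can be sketched in Python. This is a minimal first-fit illustration under stated assumptions, not the authors' implementation: the names `Node`, `available`, `take`, `allocate`, and `deallocate` are hypothetical, the tree is a plain complete tree with processors as leaves, a "resource" is an integer id (a virtual channel or a path set), and "a common free resource" is read as one id that is still unused on every link the job would cross. The fractional-allocation alternative from the slide is left out.

```python
class Node:
    """Allocation-tree node; height 0 is a processor (assumed model)."""
    def __init__(self, arity, height, parent=None):
        self.parent = parent
        self.height = height
        self.busy = set()              # resource ids in use on the link to the parent
        self.free = arity ** height    # free processors in this subtree
        self.children = ([Node(arity, height - 1, self) for _ in range(arity)]
                         if height else [])

def available(node, r):
    """Free processors below `node` that are reachable using resource r only."""
    if r in node.busy:
        return 0
    if not node.children:
        return node.free
    return sum(available(c, r) for c in node.children)

def take(node, r, want, touched):
    """Claim up to `want` processors below `node`, marking r on every used link."""
    if want == 0 or r in node.busy:
        return 0
    if not node.children:
        if node.free == 0:
            return 0
        n = node
        while n is not None:           # update free counts up to the root
            n.free -= 1
            n = n.parent
        got = 1
    else:
        got = 0
        for c in node.children:
            got += take(c, r, want - got, touched)
    if got:
        node.busy.add(r)
        touched.append(node)
    return got

def nodes_at_height(root, h):
    if root.height == h:
        return [root]
    return [n for c in root.children for n in nodes_at_height(c, h)]

def allocate(root, size, arity, resources):
    """Return (resource, touched nodes) or None if the job cannot be placed."""
    h = 1
    while arity ** h < size:           # lowest level whose subtree can hold the job
        h += 1
    for height in range(h, root.height + 1):       # this depth, then upwards
        for node in nodes_at_height(root, height):
            if node.free < size:
                continue
            for r in range(resources):
                if sum(available(c, r) for c in node.children) >= size:
                    touched, got = [], 0
                    for c in node.children:
                        got += take(c, r, size - got, touched)
                    return r, touched
    return None

def deallocate(job):
    """Release the job's resource and return its processors to all ancestors."""
    r, touched = job
    for node in touched:
        node.busy.discard(r)
        if not node.children:
            n = node
            while n is not None:
                n.free += 1
                n = n.parent

root = Node(arity=2, height=3)         # 8 processors, 2 resources below
job_a = allocate(root, 3, 2, 2)
job_b = allocate(root, 5, 2, 2)
print(job_a[0], job_b[0])              # prints: 0 1
```

In this toy run the second job has to span the whole tree, and resource 0 is already taken on the links of the first job's subtree, so the search settles on resource 1. With a single resource the same request would be rejected, which is the utilisation cost of containment in this simplified model.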

  13. The allocation algorithm • An example… • Allocating three nodes for a job • Grey and white nodes are busy; black nodes are free. • 2 virtual channels • Allocation tree level is 0 • Choose virtual channel 2

  14. The allocation algorithm • Complexity • Allocation: • Deallocation (tree traversal):

  15. Exploration • Experiment setup • Allocation simulations • 24-port 3-tree, Oracle 3456-Port InfiniBand switch • Random, local, contained with 1 to 12 resources • 20,000 jobs, with a mean runtime of 1000 cycles (exponential distribution) • Packet simulations • Based on snapshots from the allocation simulations • Using either virtual channels or multiple paths as the isolation resource • Two and six resources, mixed task sizes • Uniform traffic within each application • Maximises the chance of interference • Packet size is 256 bytes, and link rate is 52 bytes/cycle
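As a quick check on the numbers in the setup, the sketch below sizes a k-port n-tree using the standard counts for that topology, 2(k/2)^n end nodes and (2n-1)(k/2)^(n-1) switches, and estimates how many cycles one packet occupies a link. The function name `fat_tree_size` is hypothetical, and treating the per-packet link time as packet size divided by link rate is an assumption, not something stated on the slide.

```python
import math

def fat_tree_size(k, n):
    """End nodes and switches in a k-port n-tree (k even)."""
    half = k // 2
    end_nodes = 2 * half ** n
    switches = (2 * n - 1) * half ** (n - 1)
    return end_nodes, switches

end_nodes, switches = fat_tree_size(24, 3)
print(end_nodes, switches)             # prints: 3456 720

# Assumed model: a packet holds a link for size / rate cycles, rounded up.
packet_bytes, link_rate = 256, 52
print(math.ceil(packet_bytes / link_rate))   # prints: 5
```

The 3456 end nodes match the port count of the switch named on the slide.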

  16. Exploration • Utilisation [Figure panels: small jobs, mixed jobs, large jobs]

  17. Exploration • Throughput and queueing time, mixed sizes

  18. Exploration • Throughput without containment [Figure panels: maximum locality, random] • Containment ratio: the fraction of ideal throughput achieved when sharing the system • Network performance ratio: the fraction of mean throughput reduction caused by resource constraints

  19. Exploration • Throughput with containment, two resources [Figure panels: virtual channels, multiple paths]

  20. Exploration • Throughput with containment, six resources [Figure panels: multiple paths, virtual channels]

  21. Conclusion • Allocation strategy has an impact on system performance • Interesting trade-off decisions! • Routing containment gives predictable performance • It has a cost • Uncontained routing may severely punish individual applications • We have presented a flexible allocation algorithm that utilises these trade-offs to increase virtualisation efficiency.
