
System Architecture: Near, Medium, and Long-term Scalable Architectures

Panel Discussion Presentation Sandia CSRI Workshop on Next-generation Scalable Applications: When MPI-only is not enough June 4, 2008 Kevin Pedretti Scalable System Software Dept. Sandia National Laboratories ktpedre@sandia.gov.


Presentation Transcript


  1. Panel Discussion Presentation. Sandia CSRI Workshop on Next-generation Scalable Applications: When MPI-only is not enough. June 4, 2008. Kevin Pedretti, Scalable System Software Dept., Sandia National Laboratories, ktpedre@sandia.gov. System Architecture: Near, Medium, and Long-term Scalable Architectures. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

  2. Near Term
     • Odds are good, but goods are odd...
     • Multi-core, many-core, mega-core
     • Heterogeneous ISAs, cores, systems
     • Accelerators: GPU, Cell, ClearSpeed, FPGA, etc.
     • Embedded: Tilera, SPI, Ambric (336-core), Tensilica
     • Scalable Architectures
     • Peak FLOPS not bottleneck
     • Improving per-socket efficiency on real applications is “low-hanging fruit”
     • Decreasing memory size & bandwidth per core
     • Symbiosis of architecture and system software

  3. Near Term (Cont.)
     • Adapting MPI implementations for architecture
     • Shared memory copies vs. NIC
     • Cache pollution, injection
     • Leverage hierarchy / intra-node locality
     • Adapting MPI applications for architecture
     • MPI + shared memory: LIBSM
     • MPI + something else for intra-node
     • OpenMP, Threading Building Blocks, ALF Streaming, CUDA, RapidMind, PeakStream/Google, etc.
     • All incompatible, some similar concepts
     • Adapting architecture for MPI?
     • Leveraging interconnect capabilities for PGAS
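The "shared memory copies vs. NIC" bullet can be illustrated with a minimal sketch of an intra-node transfer that never touches the network: one side copies a payload into a shared segment and the other side attaches by name and copies it out. This is not MPI and not LIBSM; it uses Python's standard `multiprocessing.shared_memory` module, and the function name `send_via_shared_memory` is hypothetical. Both "sender" and "receiver" run in one process here to keep the example self-contained.

```python
from multiprocessing import shared_memory


def send_via_shared_memory(payload: bytes) -> bytes:
    """Move a non-empty payload through a shared segment (copy in, copy out)."""
    # "Sender": create a named segment and copy the payload into it.
    seg = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        seg.buf[:len(payload)] = payload
        # "Receiver": attach to the same segment by name, as a second
        # process on the same node would, and copy the bytes out.
        peer = shared_memory.SharedMemory(name=seg.name)
        try:
            received = bytes(peer.buf[:len(payload)])
        finally:
            peer.close()
    finally:
        seg.close()
        seg.unlink()  # remove the segment once both sides are done
    return received


if __name__ == "__main__":
    print(send_via_shared_memory(b"halo exchange"))
```

The two copies (in and out) are the cost an MPI implementation weighs against handing the same message to the NIC; they are also where the cache pollution mentioned above comes from.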

  4. OS Scalability. At 8192 nodes, CNL (2.0.44) is 49% worse than Catamount on this Partisn problem. Doesn’t appear to be a bandwidth issue.

  5. Task and Memory Placement
     • No standard mechanisms, most punt and hope for best
     • Explicit vs. implicit mechanisms
     • More important than node placement?
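As one example of an explicit task-placement mechanism, the sketch below pins the calling process to a single logical CPU using the Linux-only `os.sched_getaffinity` / `os.sched_setaffinity` calls. The helper name `pin_to_cpu` is hypothetical, and the sketch covers only the task side; memory placement would need a NUMA interface such as libnuma, which is not shown.

```python
import os


def pin_to_cpu(cpu: int) -> set:
    """Bind the calling process to one logical CPU (Linux only)."""
    allowed = os.sched_getaffinity(0)  # CPUs this process may currently use
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})     # explicit placement: exactly one CPU
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    target = min(before)
    print("pinned to:", pin_to_cpu(target))
    os.sched_setaffinity(0, before)    # restore the original mask
```

Leaving the affinity mask untouched is the implicit alternative: the OS scheduler decides where the task runs and may migrate it.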

  6. Intra-node MPI

  7. Virtual Memory Nice, but Gets in Way
     [Figure legend] Dashed line = small pages; solid line = large pages (dual-core Opteron). Open shapes = existing logarithmic algorithm (Gibson/Bruck); solid shapes = new constant-time algorithm (Slepoy, Thompson, Plimpton).
     Unexpected behavior due to TLB: TLB misses increased with large pages, but time to service a miss decreased dramatically (10x). Page table fits in L1! (vs. 2 MB per GB with small pages)
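The "2 MB per GB" figure can be reproduced with a short calculation. The sketch assumes x86-64 paging with 8-byte page table entries, 4 KiB small pages, and 2 MiB large pages, and counts only the lowest-level entries that map the data (upper levels are ignored). The 64 KiB L1 data cache size is also an assumption about the Opteron in question, not a figure from the slide.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PTE_BYTES = 8        # assumption: x86-64 page table entry size
L1_BYTES = 64 * KIB  # assumption: L1 data cache size


def leaf_table_bytes(mapped_bytes: int, page_bytes: int) -> int:
    """Bytes of lowest-level page table entries needed to map a region."""
    pages = mapped_bytes // page_bytes
    return pages * PTE_BYTES


small = leaf_table_bytes(GIB, 4 * KIB)  # 262144 entries
large = leaf_table_bytes(GIB, 2 * MIB)  # 512 entries

if __name__ == "__main__":
    print(f"small pages: {small // MIB} MiB of entries per GiB mapped")
    print(f"large pages: {large // KIB} KiB of entries per GiB mapped")
    print("large-page entries fit in L1:", large <= L1_BYTES)
```

Under these assumptions, small pages need 2 MiB of entries per GiB while large pages need 4 KiB, which is consistent with the slide's point that a miss becomes cheap to service when the table stays in cache.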

  8. So, Answer is Large Pages?
     • DRAM bank conflicts can be considerable depending on data alignment
     • OS-level and hardware mitigation strategies
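How alignment can lead to bank conflicts is easiest to see with a toy address-to-bank mapping. Everything below is hypothetical: real memory controllers use their own (often hashed) mappings, and the bank count, interleave size, and function name `bank_of` are chosen only for illustration. The sketch shows three arrays whose bases are aligned to large power-of-two boundaries landing in the same bank, and a small per-array padding spreading them out.

```python
N_BANKS = 8              # hypothetical number of DRAM banks
INTERLEAVE_BYTES = 4096  # hypothetical: consecutive 4 KiB chunks rotate across banks


def bank_of(addr: int) -> int:
    """Toy mapping from a physical address to a DRAM bank index."""
    return (addr // INTERLEAVE_BYTES) % N_BANKS


# Three arrays placed on 1 MiB boundaries, as large-page-aligned allocation might do.
aligned_bases = [0, 1 << 20, 2 << 20]
aligned_banks = [bank_of(b) for b in aligned_bases]

# Mitigation sketch: offset array i by i interleave units.
padded_bases = [b + i * INTERLEAVE_BYTES for i, b in enumerate(aligned_bases)]
padded_banks = [bank_of(b) for b in padded_bases]

if __name__ == "__main__":
    print("aligned:", aligned_banks)
    print("padded: ", padded_banks)
```

In this toy model, a loop that streams through all three aligned arrays at the same index keeps hitting one bank, while the padded layout touches three different banks.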

  9. Affects SpMV Also (28 Node HPCCG Run)

  10. Medium Term
     • More accelerators, normalization
     • Attractive power and memory efficiency
     • Commodity processors will integrate GPUs on-chip
     • HPC-centric off-chip accelerators
     • General-purpose cores not getting much faster
     • Leverage architecture for specific app domains
     • Some common mechanism will/must emerge for dealing with data-parallel accelerators
     • General-purpose cores become more light-weight, better match for light-weight system software
     • Chip stacking
     • Off-chip optics

  11. Long Term
     • MPP-on-a-chip
     • On and off-chip optics
     • More intelligent memory systems
     • Application driven architectures
