This proposal focuses on enhancing energy efficiency and compute density in high-performance compute nodes, targeting a fivefold increase in energy efficiency compared to common IA32-based systems. By utilizing ARM Cortex CPUs and Texas Instruments DSP technology, we aim to achieve a compute density up to ten times that of traditional systems. The hybrid programming approach combines OpenMP and MPI for efficient utilization of resources. Partnerships with industry leaders like Texas Instruments and Supermicro will bolster the project's technological foundations and real-world applications.
SNIC/KTH Proposal
• Objective
  • Improved energy efficiency over common IA32-based nodes by a factor of at least 5
  • High compute density, possibly 10 times that of an IA32-based system in single precision (SP)
  • Modest increase in programming complexity
  • High-volume component technologies
• Technology
  • Embedded processor technology
    • ARM Cortex-A9 4-core CPU (0.8 – 2 GHz, 0.4 – 2 W)
    • TI DSP (designed in Nice)
  • Hybrid programming: OpenMP + MPI
• Industry Partners (tentative)
  • TI (4th largest IC company by revenue in 2009, after Intel, Samsung and Toshiba)
  • Supermicro
  • Smooth Stone (start-up with funding from ARM), targeting energy-efficient servers for Internet and web applications. CPU chip by TI.
High Performance Compute Node
• HPC Compute Node Performance
  • 1024 GMAC (1 TMAC) (MAC = multiply-accumulate, 32-bit)
  • 512 G single-precision floating point operations per second @ 1 GHz (= 614.4 GF SP @ 1.2 GHz)
  • Support for double-precision floating point announced
  • Approximately 50 to 60 W
  • DDR3 (number of DIMMs not yet fixed)
• Interconnect
  • DSP to DSP: SRIO
  • CPU to DSP: PCIe x2 Gen2 (5 GT/s per lane)
  • Node to node: 10G Ethernet
• Board
  • 4 – 8 nodes
  • 2.5 – 5 TF SP per board/blade
  • 3.5 – 7 TF/U SP
  • ~400 – 800 W/U
• Programming Model
  • MPI across compute nodes
  • OpenMP within a node
  • DSPs and ARM processor both programmed in a high-level language
  • OpenMP-style directives define accelerated regions that are executed on the DSPs

[Figure: HPC Compute Node block diagram. Four Texas Instruments 8-core DSPs @ 1.2 GHz with acceleration memory (DDR3-1333); PCIe/SRIO/Ethernet connectivity; CPU (ARM/x86) with external CPU memory; 10G Ethernet.]
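The node and board figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original proposal. It assumes that peak throughput scales linearly with clock frequency and that a node holds four 8-core DSPs, as the block diagram suggests. All function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the slide's peak-performance figures.
# Assumptions (not stated explicitly in the slides):
#   - peak throughput scales linearly with DSP clock frequency
#   - one node = four 8-core DSPs, as read from the block diagram

DSPS_PER_NODE = 4           # assumption, from the block diagram
CORES_PER_DSP = 8           # "8 core DSP" in the block diagram
SP_GFLOPS_AT_1GHZ = 512.0   # per node, from the slide
CLOCK_GHZ = 1.2             # DSP clock, from the slide


def node_sp_gflops(clock_ghz=CLOCK_GHZ):
    """Peak single-precision GFLOPS of one node at the given clock."""
    return SP_GFLOPS_AT_1GHZ * clock_ghz


def flops_per_cycle_per_core():
    """Implied SP operations per cycle per DSP core."""
    return SP_GFLOPS_AT_1GHZ / (DSPS_PER_NODE * CORES_PER_DSP)


def board_sp_tflops(nodes, clock_ghz=CLOCK_GHZ):
    """Peak single-precision TFLOPS of a board carrying `nodes` nodes."""
    return nodes * node_sp_gflops(clock_ghz) / 1000.0


def node_gflops_per_watt(node_watts, clock_ghz=CLOCK_GHZ):
    """Peak SP GFLOPS per watt for one node drawing `node_watts`."""
    return node_sp_gflops(clock_ghz) / node_watts


if __name__ == "__main__":
    print(f"node peak: {node_sp_gflops():.1f} GF SP")
    print(f"per core:  {flops_per_cycle_per_core():.0f} flops/cycle")
    for n in (4, 8):
        print(f"{n} nodes:   {board_sp_tflops(n):.2f} TF SP per board")
    for w in (50, 60):
        print(f"at {w} W:   {node_gflops_per_watt(w):.1f} GF/W")
```

Under these assumptions, a board with 4 to 8 nodes comes out at roughly 2.46 to 4.92 TF SP, consistent with the 2.5 – 5 TF SP per board quoted in the slide. The 50 to 60 W node budget corresponds to roughly 10 to 12 peak GF/W.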