Advanced Storage Technologies for High Performance Computing
Sorin Faibish, EMC NAS Senior Technologist
IDC HPC User Forum, April 14-16, Norfolk, VA
IDC HPC User Forum 2008
New HPC Storage Intensive Applications
Storage Challenges*
• New algorithms that can scale to search and process massive datasets
• New metadata management of distributed data sources
• New platforms that provide uniform high-speed memory access to multi-terabyte data structures
• Hybrid interconnect architectures to process and filter multi-gigabyte data streams from scientific instruments
• High-performance, high-reliability, petascale distributed file systems
• New approaches to software mobility, so that algorithms can execute on nodes where the data resides
• Flexible and high-performance software integration technologies running on diverse computing platforms
• Data signature generation techniques for data reduction and rapid processing
*Computer Magazine: http://www.computer.org/portal/cms_docs_computer/computer/homepage/0408/R4gei.pdf
New Storage Technologies for HPC
Storage Technologies
• Virtualization to address the multi-core problem
• CDP (continuous data protection) and memory snapshots to address storage failures during computation
• DR (disaster recovery) and distributed cache appliances to address computation across geographies
• SSD technology to address Data Intensive Super Computing tasks and to decrease the power consumption of storage
• pNFS and RDMA technologies to increase I/O speeds and reduce computation cycles
Storage at Previous HPC User Forum
New Concept – Better Utilization of multi-cores
• Current implementation
  • Application split across multiple single-core SMP hardware
  • Use middleware software (Platform)
New Concept – Better Utilization of multi-cores
• Dual-core support added
  • Application modified to support SMP dual core
  • CPU used: 4 x 100% (100%)
  • Licenses paid: 4
  • Licenses used: 4
New Concept – Better Utilization of multi-cores
• Quad-core chips appear
  • CPU used: 4 x 100% (4/8 = 50%)
  • Licenses paid: 8
  • Licenses used: 4
• Application must be modified, or
• Use VM with CPU affinity
  • CPU used: 8 x 80% (80%)
  • Licenses used: 8
New Concept – Better Utilization of multi-cores
• N-core chips are coming
  • Use VM with VT support
  • CPU used: 2 x N x 90% (90%)
  • Licenses paid = used: 2 x N
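The utilization figures on the multi-core slides follow from one ratio: busy cores times per-core load, divided by installed cores. The sketch below reproduces that arithmetic. It assumes one paid license per installed core, as the slide counts imply, and it picks N = 8 purely for illustration because the slides leave N open. The `Stage` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the multi-core progression described on the slides."""
    name: str
    total_cores: int   # cores installed; assumed to equal the licenses paid
    busy_cores: int    # cores the application runs on (licenses used)
    load: float        # average load on each busy core, 0.0 to 1.0

    def utilization(self) -> float:
        """Fraction of the installed CPU capacity doing useful work."""
        return self.busy_cores * self.load / self.total_cores

    def idle_licenses(self) -> int:
        """Licenses paid for but not used."""
        return self.total_cores - self.busy_cores


N = 8  # assumed cores per chip for the "N-core" slide; not a value from the talk

STAGES = [
    Stage("dual-core, SMP-aware application", 4, 4, 1.00),
    Stage("quad-core, unmodified application", 8, 4, 1.00),
    Stage("quad-core, VMs with CPU affinity", 8, 8, 0.80),
    Stage("N-core, VMs with VT support", 2 * N, 2 * N, 0.90),
]

if __name__ == "__main__":
    for s in STAGES:
        print(f"{s.name}: {s.utilization():.0%} utilization, "
              f"{s.idle_licenses()} idle licenses")
```

The unmodified application on quad-core chips is the only stage that leaves paid licenses idle, which is the case the VM approach is meant to remove.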
New Concept – Better Utilization of multi-cores
• Core-agnostic middleware will work with as many cores as are available
• Enabled by pNFS access to shared storage
CDP + Memory Snapshots in HPC applications
• CDP technology will work with real and virtual infrastructures
• VM snapshots on a central storage repository
• Memory snapshots of VM and hardware hosts
• Any SAN or NAS storage
• Recover an HPC job at any point in time (e.g., a last-minute failure after a two-week run)
[Diagram: HPC application platform support; CDP appliance; SAN; CDP journal + memory snapshots; storage from IBM, Sun, HDS, HP, EMC]
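To illustrate what "recover at any point in time" means, here is a toy in-memory journal, not the CDP appliance from the talk. It assumes every write is logged with a timestamp and that recovery replays the log up to a chosen instant. The `CdpJournal` class, its methods, and the sample keys are all hypothetical.

```python
import bisect
from typing import Any, Dict, List, Tuple


class CdpJournal:
    """Toy continuous-data-protection journal: each write is logged with a timestamp."""

    def __init__(self) -> None:
        self._times: List[float] = []             # write timestamps, ascending
        self._writes: List[Tuple[str, Any]] = []  # (key, value), parallel to _times

    def record(self, timestamp: float, key: str, value: Any) -> None:
        """Append one write; writes must arrive in time order."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be recorded in time order")
        self._times.append(timestamp)
        self._writes.append((key, value))

    def recover(self, point_in_time: float) -> Dict[str, Any]:
        """Rebuild the state by replaying all writes made at or before point_in_time."""
        end = bisect.bisect_right(self._times, point_in_time)
        state: Dict[str, Any] = {}
        for key, value in self._writes[:end]:
            state[key] = value
        return state


journal = CdpJournal()
journal.record(10.0, "iteration", 1000)
journal.record(20.0, "iteration", 2000)
journal.record(30.0, "iteration", None)  # hypothetical bad write caused by a failure

# Roll the job state back to just before the failure.
restored = journal.recover(29.9)
```

A periodic checkpoint can only return to its last snapshot, while a journal of this kind can return to any instant between writes, which is what makes a failure late in a long run recoverable.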
Continuous Remote Replication in HPC
• Distributed cache engines allow distributed access to shared storage
• Remote compute nodes access the shared storage
[Diagram: Site A and Site B, each with a SAN and a cache appliance; HPC application platform support; HPC application remote platform; heterogeneous blades (VM + HW); heterogeneous storage (IBM, Sun, HDS, HP, EMC)]
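One way to picture a cache appliance in front of remote shared storage is a read-through cache. The sketch below is only that picture. It assumes an LRU eviction policy and ignores writes and coherence between sites, neither of which the slides describe. `SharedStorage` and `CacheAppliance` are hypothetical names.

```python
from collections import OrderedDict
from typing import Dict


class SharedStorage:
    """Stand-in for the shared storage at the primary site."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self._blocks = blocks
        self.remote_reads = 0  # counts reads that cross the inter-site link

    def read(self, block_id: int) -> bytes:
        self.remote_reads += 1
        return self._blocks[block_id]


class CacheAppliance:
    """Read-through cache at the remote site (LRU eviction is an assumption)."""

    def __init__(self, backend: SharedStorage, capacity: int) -> None:
        self._backend = backend
        self._capacity = capacity
        self._cache: "OrderedDict[int, bytes]" = OrderedDict()

    def read(self, block_id: int) -> bytes:
        if block_id in self._cache:
            self._cache.move_to_end(block_id)  # mark as most recently used
            return self._cache[block_id]
        data = self._backend.read(block_id)    # miss: fetch from the primary site
        self._cache[block_id] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)    # evict the least recently used block
        return data


site_a = SharedStorage({0: b"alpha", 1: b"beta", 2: b"gamma"})
site_b = CacheAppliance(site_a, capacity=2)

site_b.read(0)  # miss, fetched from Site A
site_b.read(0)  # hit, served locally
site_b.read(1)  # miss, fetched from Site A
```

After these three reads only two requests have crossed the link between sites; repeated reads from remote compute nodes are served locally.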
SSD Disks in HPC applications
• Solid state disks will replace disk drives
• Today, HPC workloads are mostly compute intensive
• Data Intensive Super Computing (DISC) applications are starting to appear (see IEEE Computer Magazine, April 2008)
• SSD will balance performance between DISC and compute-intensive HPC applications
• EMC DMX has SSD today (25 SSDs = 800K IOPS or 5 GB/sec)
[Diagram: HPC application platform support; SAN; EMC DMX + SSD]
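The aggregate DMX figures can be broken down per device with simple division. This sketch assumes the load spreads evenly over the 25 SSDs and that GB means decimal gigabytes; the slide states neither. The IOPS and bandwidth numbers are treated as two separate peak figures, as the "or" on the slide suggests.

```python
SSD_COUNT = 25
ARRAY_IOPS = 800_000       # aggregate figure from the slide
ARRAY_BANDWIDTH_GB_S = 5   # aggregate figure from the slide, assumed decimal GB

# Per-device averages, assuming an even spread across all SSDs.
iops_per_ssd = ARRAY_IOPS / SSD_COUNT
mb_s_per_ssd = ARRAY_BANDWIDTH_GB_S * 1000 / SSD_COUNT

print(f"{iops_per_ssd:,.0f} IOPS per SSD")  # 32,000 IOPS per SSD
print(f"{mb_s_per_ssd:.0f} MB/s per SSD")   # 200 MB/s per SSD
```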
pNFS will deliver very high I/O speeds to HPC
Storage must be networked
pNFS addresses the storage access issues
• Removes the server layer between compute engines (CE) and shared storage
• Separates metadata (MD) traffic from data traffic
• Asymmetric storage architectures increase scalability
• SSD increases I/O speed
[Diagram: HPC architecture: HPC jobs; middleware; compute engines; connectivity; NFS servers / pNFS; connectivity; SSD storage]
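The split between metadata and data traffic can be shown with a small in-memory simulation. It is a sketch of the idea only, not the NFSv4.1 wire protocol: the client asks a metadata server for a layout, then reads the stripes directly from the data servers in parallel. All class names, function names, the stripe size, and the file path are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

STRIPE_SIZE = 4  # bytes per stripe unit; tiny so the example is easy to follow


class DataServer:
    """Holds stripe units; clients read and write it directly."""

    def __init__(self) -> None:
        self._stripes: Dict[Tuple[str, int], bytes] = {}

    def write(self, path: str, index: int, chunk: bytes) -> None:
        self._stripes[(path, index)] = chunk

    def read(self, path: str, index: int) -> bytes:
        return self._stripes[(path, index)]


class MetadataServer:
    """Hands out layouts only; file data never passes through it."""

    def __init__(self, data_servers: List[DataServer]) -> None:
        self._data_servers = data_servers
        self._layouts: Dict[str, List[int]] = {}  # path -> server index per stripe

    def create(self, path: str, size: int) -> List[DataServer]:
        stripes = -(-size // STRIPE_SIZE)  # ceiling division
        count = len(self._data_servers)
        self._layouts[path] = [i % count for i in range(stripes)]  # round-robin
        return self.get_layout(path)

    def get_layout(self, path: str) -> List[DataServer]:
        return [self._data_servers[i] for i in self._layouts[path]]


def write_file(mds: MetadataServer, path: str, content: bytes) -> None:
    layout = mds.create(path, len(content))  # control path
    for index, server in enumerate(layout):  # data path, direct to data servers
        start = index * STRIPE_SIZE
        server.write(path, index, content[start:start + STRIPE_SIZE])


def read_file(mds: MetadataServer, path: str) -> bytes:
    layout = mds.get_layout(path)  # control path: one metadata request
    with ThreadPoolExecutor(max_workers=max(1, len(layout))) as pool:
        # Data path: fetch every stripe in parallel; map() preserves stripe order.
        chunks = pool.map(lambda item: item[1].read(path, item[0]), enumerate(layout))
        return b"".join(chunks)


servers = [DataServer() for _ in range(3)]
mds = MetadataServer(servers)
write_file(mds, "/scratch/result.dat", b"0123456789")
data = read_file(mds, "/scratch/result.dat")
```

Because the metadata server answers one small request per file and the stripes come from several data servers at once, adding data servers raises aggregate bandwidth without adding load on the metadata path.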
pNFS with InfiniBand RDMA value added to HPC
• Metadata (MD) is directed to the single MD server
• Data is served by storage servers or storage arrays directly from host to storage
• Storage access controlled by iSCSI
• I/O to native IB or 10G storage redirected via RDMA in hardware
[Diagram labels: CE cache; iSCSI (iSER); NFS (pNFS); control path; data path; metadata cache; NFS/pNFS RDMA; file systems; storage array; native IB storage array cache]