The landscape of high-performance computing (HPC) is evolving rapidly, driven by falling commodity hardware prices, rising core counts per chip, and cheap, high-density RAM. These developments call for a re-evaluation of the compute model in favor of distributed processing on HPC grids and clusters, yielding faster results and shorter time to market. The introduction of flash storage as tier 0 speeds up I/O and helps eliminate the I/O-stack bottleneck, but exploiting it fully requires significant software changes and new data-management paradigms. As enterprises embrace these changes, scalable, efficient storage strategies will redefine performance expectations and workflow management.
Vendor Update: Changes in Industry; Re-evaluate Products
Omer Asad, Architect, Office of the CTO
Industry Trends
• Falling prices for commodity hardware
• More cores per chip
• Cheaper, denser RAM (cache)
• Result: re-evaluate the compute model
  • HPC grids/clusters; distributed processing
  • Quicker results; faster time to market
  • HPC is now more prevalent in the enterprise segment
• Introduction of Flash as tier 0
  • Flash as a victim cache, augmenting RAM (see the sketch below)
  • Flash-based SSDs for high-speed I/O to disks
  • Eliminates the I/O-stack bottleneck
• All of the above require software changes to fully exploit the benefits
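To make the "Flash as a victim cache" bullet concrete, here is a minimal sketch (not from the deck) of the demotion path: blocks evicted from a RAM LRU tier are installed in a larger flash tier instead of being discarded, so a re-read is served from flash rather than disk. The names (VictimCache, disk_read) and the LRU policy are illustrative assumptions, not any vendor API.

from collections import OrderedDict

class VictimCache:
    """Two-tier cache sketch: a small RAM tier backed by a larger flash tier.

    Blocks evicted from RAM are demoted to flash (the "victim" cache)
    rather than dropped, so a subsequent read hits flash, not disk.
    """

    def __init__(self, ram_blocks, flash_blocks, disk_read):
        self.ram = OrderedDict()      # hot tier, kept in LRU order
        self.flash = OrderedDict()    # victim tier, kept in LRU order
        self.ram_blocks = ram_blocks
        self.flash_blocks = flash_blocks
        self.disk_read = disk_read    # fallback: fetch a block from disk

    def read(self, block_id):
        if block_id in self.ram:              # RAM hit: refresh LRU position
            self.ram.move_to_end(block_id)
            return self.ram[block_id]
        if block_id in self.flash:            # flash hit: promote back to RAM
            data = self.flash.pop(block_id)
        else:                                 # miss in both tiers: go to disk
            data = self.disk_read(block_id)
        self._install(block_id, data)
        return data

    def _install(self, block_id, data):
        self.ram[block_id] = data
        if len(self.ram) > self.ram_blocks:
            victim_id, victim = self.ram.popitem(last=False)  # evict coldest
            self.flash[victim_id] = victim                    # demote, don't drop
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)                # flash overflow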
Industry Trends: Ramifications
The compute side is more advanced than the storage side
• The same commodity components sit on both the compute and storage sides (at ratios of 50:1 in some cases)
• Need for scale-out storage
Need for:
• Simplicity of management
  • Grids/clusters generate petabytes of data
• I/O performance
  • Parallel I/O; new metadata-management paradigms (see the striping sketch below)
• Minimal disruption
  • NDU; checkpoint/restart; data movement; QoS
The above require a fundamental rethink of file system design
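The "parallel I/O" bullet rests on simple placement arithmetic: when a file is striped round-robin across storage nodes, any client can compute where each byte lives and issue reads to all nodes in parallel. A minimal sketch, with illustrative parameter names (not taken from the deck):

def stripe_location(offset, stripe_unit, num_nodes):
    """Map a file byte offset to (node, local_offset) for a file striped
    round-robin across num_nodes storage nodes in stripe_unit-byte units."""
    stripe_index = offset // stripe_unit        # which stripe unit overall
    node = stripe_index % num_nodes             # round-robin node choice
    local_stripe = stripe_index // num_nodes    # units already on that node
    local_offset = local_stripe * stripe_unit + offset % stripe_unit
    return node, local_offset

# Byte 5 MiB of a file striped in 1 MiB units over 4 nodes:
print(stripe_location(5 * 1_048_576, 1_048_576, 4))   # (1, 1048576)

Since every client derives placement independently, metadata servers only need to track layout parameters rather than per-block locations, which is one reason new metadata-management paradigms become attractive at this scale.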
Architecture Thoughts
Ability to adjust to changing business needs
• The file system's ability to adapt to business and performance needs
  • Striped and non-striped layouts
Ability to move data between tiers non-disruptively to applications, within a global namespace
• True independence of data paths from data locations
Ability to partition storage workloads between data sets
• Define service levels among multiple jobs (a sketch follows this list)
• Simple integration with checkpoint/restart on the compute side
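One common way to realize "define service levels among multiple jobs" is a per-job token bucket that meters I/O admission. This is a hedged sketch of that general technique, not the mechanism the deck itself describes; the TokenBucket name and the rates are assumptions.

import time

class TokenBucket:
    """Per-job limiter: tokens accrue at the job's service-level rate,
    and each I/O spends tokens, capping that job's share of the system."""

    def __init__(self, rate_iops, burst):
        self.rate = rate_iops          # tokens (I/Os) replenished per second
        self.capacity = burst          # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self, cost=1):
        """Return True if an I/O of the given cost may proceed now."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket per project: the critical job gets a larger share.
limits = {"A": TokenBucket(rate_iops=50_000, burst=5_000),
          "B": TokenBucket(rate_iops=10_000, burst=1_000)}

Because each job has its own bucket, raising one job's service level never requires touching the others, which is what makes per-job partitioning workable alongside checkpoint/restart.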
An Attempt!
[Diagram: projects A, B, and C, with data sets A1–A3, B1–B2, and C1–C3 distributed across pooled primary and secondary storage.]
Non-Disruptive Partitioning
• Match storage to changing priorities
• Increase performance for critical jobs; re-balance as needed
• Save money by avoiding over-provisioning
• No need to touch clients: the namespace is unchanged (see the sketch below)
• Example: optimize response times, with Project A getting dedicated resources
[Diagram: the same projects and data sets re-partitioned so that Project A's data sets land on dedicated resources.]
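The "no need to touch clients" claim follows from keeping path resolution behind a stable namespace while only the data-set-to-pool mapping changes underneath. A minimal sketch under that assumption; GlobalNamespace and the pool names are hypothetical, not the deck's design:

class GlobalNamespace:
    """Illustrative mapping layer: clients resolve the same paths while
    data sets move between storage pools underneath."""

    def __init__(self):
        self.location = {}                    # data set -> pool

    def assign(self, dataset, pool):
        self.location[dataset] = pool         # the only state that changes

    def resolve(self, path):
        dataset = path.split("/")[1]          # e.g. "/A1/file" -> "A1"
        return self.location[dataset], path   # the path itself never changes

ns = GlobalNamespace()
for ds in ("A1", "A2", "A3"):
    ns.assign(ds, "pool-shared")

# Project A becomes critical: move its data sets to a dedicated pool.
# Clients keep using the same paths; only resolve()'s answer changes.
for ds in ("A1", "A2", "A3"):
    ns.assign(ds, "pool-dedicated")

print(ns.resolve("/A1/checkpoint.dat"))   # ('pool-dedicated', '/A1/checkpoint.dat')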