
IDC HPC User Forum 09/08/2009


Presentation Transcript


  1. IDC HPC User Forum 09/08/2009 Manuel Hoffmann, Vice President, Channel Development Manuel@ScaleMP.com +1.408.342.0337

  2. ScaleMP at a Glance • 100+ deployments worldwide • Virtualization for high-end computing, delivering higher performance and lower Total Cost of Ownership through aggregation of multiple off-the-shelf x86 servers into a single large virtual shared-memory system • Founded in 2003 • Product shipping since 2006 • Sold through Tier-1 and Tier-2 OEMs

  3. Virtualization? • “a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications or end users interact with these resources” (Wikipedia) • Partitioning: providing a virtual resource that is a subset of the physical resource. Examples: disk partitioning (utilization), VLANs (flexibility), server virtualization (utilization) • Aggregation: providing a virtual resource that is a concatenation of several physical resources. Examples: RAID and volume managers (flexibility and capability), link aggregation (availability and capability), and for servers: ???

  4. Server Virtualization [diagram] • PARTITIONING: subset of the physical resource; a hypervisor or VMM on one physical server hosts multiple virtual machines, each with its own OS and applications • AGGREGATION: concatenation of physical resources; a hypervisor or VMM on each of several physical servers presents them as a single virtual machine running one OS and application

  5. How Does It Work? • Multiple off-the-shelf x86 servers, with processors and memory; processor speed/count and memory capacity do not have to be the same across all boards • InfiniBand HCAs, cables and switch • vSMP Foundation™ devices: the flash devices plug into the boards and are used as the boot device; vSMP Foundation boots to present an aggregate, coherent view to the OS • HIGH-END X86 SYSTEM, BASED ON STANDARD X86 COMPONENTS

  6. Behind The Scenes SEAMLESS INTEGRATION • One System: a software interception engine creates a uniform execution environment; vSMP Foundation creates the relevant BIOS environment to present the OS (and the software stack above it) with a single coherent system • Coherent Memory: vSMP Foundation maintains cache coherency between boards, using multiple concurrent memory-coherency mechanisms on a per-block basis, selected from real-time memory access patterns, and leverages board-local memory for caching • Shared I/O: vSMP Foundation exposes all available I/O resources to the OS in a unified PCI hierarchy, so no cluster file system is needed • HIGHEST X86 SMP MEMORY BANDWIDTH!
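
The per-block coherence described above can be pictured as a software MESI-like protocol operating at block granularity. The following is a minimal, hypothetical C sketch of that idea only; every identifier (block_t, fetch_from_home, invalidate_remote_copies) and the 4KB block size are assumptions made for illustration, not ScaleMP's actual (proprietary) implementation.

```c
/* Hypothetical sketch of per-block software coherence; all names and the
 * block size are invented for illustration, not ScaleMP's implementation. */
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 4096                    /* assumed coherence granularity */

typedef enum { INVALID, SHARED, EXCLUSIVE } block_state;

typedef struct {
    block_state state;                     /* local caching state of the block     */
    int         home_board;                /* board that physically owns the data  */
    char        data[BLOCK_SIZE];          /* board-local cached copy              */
} block_t;

/* Stand-ins for the backplane (e.g. InfiniBand) transport layer. */
static void fetch_from_home(int home_board, block_t *b) {
    (void)home_board;
    memset(b->data, 0, BLOCK_SIZE);        /* pretend the block arrived over the wire */
}
static void invalidate_remote_copies(const block_t *b) {
    (void)b;                               /* pretend other boards dropped their copies */
}

/* Read: serve from the board-local cache when possible; otherwise fetch
 * the block from its home board and cache it locally as SHARED. */
static void read_block(block_t *b, void *dst) {
    if (b->state == INVALID) {
        fetch_from_home(b->home_board, b);
        b->state = SHARED;
    }
    memcpy(dst, b->data, BLOCK_SIZE);
}

/* Write: first gain exclusive ownership so no stale copy survives on any
 * other board, then update the local copy. */
static void write_block(block_t *b, const void *src) {
    if (b->state != EXCLUSIVE) {
        invalidate_remote_copies(b);
        b->state = EXCLUSIVE;
    }
    memcpy(b->data, src, BLOCK_SIZE);
}

int main(void) {
    static block_t blk = { .state = INVALID, .home_board = 1 };
    char buf[BLOCK_SIZE] = {0};

    read_block(&blk, buf);                 /* miss: fetched from home board, now SHARED   */
    write_block(&blk, buf);                /* upgrade: remote copies dropped, now EXCLUSIVE */
    printf("final state: %d\n", blk.state);
    return 0;
}
```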

  7. Server Virtualization Aggregation [diagram: several hypervisors/VMMs aggregated into a single virtual machine running one OS and application] • SMP: cost savings & performance; high core count / large memory • Cluster: manageability; a new management paradigm for small clusters of 4 to 64 nodes • Cloud: flexibility; on-the-fly provisioning for compute grids: unlimited scaling

  8. vSMP Foundation Aggregation Platform
  SMP (Cost Savings & Performance) • Cost savings: up to 5X cost savings • Performance: leveraging the latest Intel processors; best x86 solution by SPEC CPU2006; 7th-best shared-memory result by STREAM (memory bandwidth) • Reliability: fault detection and component isolation; redundant backplane support • Capabilities: up to 128 cores and 4TB RAM
  Cluster (Manageability) • Manageability: single Operating System (OS) for up to 16 nodes; OS-driven job scheduling and resource management • Performance: InfiniBand performance with zero management and know-how • Storage: built-in cluster file system • Installation: unboxing to production in less than 3 hours
  Cloud (Flexibility) • Flexibility: on-the-fly aggregated VM provisioning and tear-down; scaling memory or CPU • Utilization: resource fragmentation reduction; support for any programming model (serial, throughput, multi-threaded, large-memory) without machine boundaries • Integration: network installation provides seamless integration with any grid management system

  9. Target Environments and Applications • Users seeking to simplify cluster complexities • Applications that use a large memory footprint (even with one processor) • Applications that need multiple processors and shared memory
  Typical end-user applications by environment: Manufacturing: CSM (Computational Structural Mechanics): ABAQUS/Explicit, ABAQUS/Standard, ANSYS Mechanical, LSTC LS-DYNA, ALTAIR Radioss, NASTRAN; CFD (Computational Fluid Dynamics): FLUENT, ANSYS CFX, STAR-CD, AVL FIRE, TGrid; Other: inTrace OpenRT. Life Sciences: Gaussian, VASP, AMBER, Schrödinger Jaguar, Schrödinger Glide, NAMD, DOCK, GAMESS, GOLD, mpiBLAST, GROMACS, MOLPRO, OpenEye FRED, OpenEye OMEGA, SCM ADF, HMMER. Energy: Schlumberger ECLIPSE, Paradigm GeoDepth, 3DGEO 3DPSDM, Norsar 3D. EDA: Mentor, Cadence, Synopsys. Finance: Wombat, KX. Weather Forecasting: MM5, WRF. Others: The MathWorks MATLAB, R, Octave, Wolfram MATHEMATICA, ISC STAR-P

  10. Example: vSMP Foundation Cluster Challenges: • Need to run MPI as well as OpenMP codes • System needs to be deployed remotely, and hence needs to be simple to manage • Data processing flow is complex and requires transferring large amounts of data between steps Applications: MM5, WRF, MAWSIP, home-grown code for data transformation Solution: • 4 Intel Nehalem dual-socket blades, total of 8 sockets (32 cores) and 192GB RAM • Internal storage • Solution was extended to 8 blades, total of 16 sockets (64 cores) and 384GB RAM Benefits: • Performance: 2.5X better performance on the same # of cores (32) • Simpler solution: significantly reduced capital expense, allowing the customer to have a higher # of cores • Simplicity: simple to manage by domain experts (weather forecast scientists) • Dataflow remains within the system, leveraging internal storage WEATHER FORECASTING SERVICE PROVIDER SIMPLE AND FLEXIBLE, COST-EFFECTIVE SOLUTION
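
For context on the "MPI as well as OpenMP" point above: on the aggregated system the single OS image exposes all cores, so an OpenMP code can simply raise its thread count instead of being decomposed for MPI. Below is a minimal, generic OpenMP sketch in C (not the customer's code); the array size and loop body are arbitrary.

```c
/* Minimal, generic OpenMP sketch: one shared array, one parallel loop.
 * On an aggregated system the single OS image exposes all cores (32,
 * later 64 in this example), so the thread count can be raised with
 * OMP_NUM_THREADS, without MPI decomposition or explicit data transfers. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 100000000L                       /* ~800 MB of doubles in shared memory */

int main(void) {
    double *field = malloc(N * sizeof(double));
    if (!field) { perror("malloc"); return 1; }

    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < N; i++) {
        field[i] = 0.5 * (double)i;        /* all threads update the same shared array */
        sum += field[i];
    }

    printf("threads=%d checksum=%g\n", omp_get_max_threads(), sum);
    free(field);
    return 0;
}
```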

  11. Example: vSMP Foundation SMP Challenges: • Need to generate a large mesh as part of pre-processing of a whole-car simulation (FLUENT TGrid). Mesh requirements are ~200GB in size • Expect to grow significantly within 12 months after initial deployment • Would like to standardize on the x86 architecture due to lower costs and open standards Solution: • 12 Intel dual-processor Xeon systems to provide a 384GB RAM single virtual system running Linux with vSMP Foundation Benefits: • Better performance: solution evaluated and found to be faster than alternative systems (x86 and non-x86) • Cost: significant savings compared to alternative systems • Versatility: also being used to run FLUENT (MPI) as part of a large cluster • Investment protection: solution can grow FORMULA 1 TEAM SCALE-UP AT SCALE-OUT PRICING
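
A note on the large-memory aspect above: the benefit of aggregation for a ~200GB mesh is simply that a single process can allocate more memory than any one board physically holds. The trivial C sketch below is illustrative only; the 200GB figure mirrors the mesh size quoted on the slide, and the allocation will fail on any machine without that much (aggregated) RAM.

```c
/* Trivial sketch: one process allocates and touches a working set far larger
 * than a single board's physical memory. On an aggregated system the pages
 * are backed by memory spread across the participating boards. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    size_t bytes = 200ULL << 30;           /* ~200 GB working set (adjust to your system) */
    char *mesh = malloc(bytes);
    if (!mesh) {
        fprintf(stderr, "allocation failed: not enough (aggregated) memory\n");
        return 1;
    }
    memset(mesh, 0, bytes);                /* touch pages so they are physically backed */
    printf("allocated and touched %zu GB\n", bytes >> 30);
    free(mesh);
    return 0;
}
```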

  12. Example: vSMP Foundation Cluster Challenges: • A single 4-socket server did not provide enough performance for the customer's business targets • Multiple 4-socket servers required complex decomposition and introduced challenges in transferring data between processes in a short and deterministic time (low latency and small jitter) • Co-location at exchanges for a solution comprised of multiple systems is complicated Applications: KX, WOMBAT, home-grown code Solution: • 16 Intel dual-processor Xeon systems to provide a 0.5TB RAM, 32-socket (128-core) single virtual system running Linux with vSMP Foundation Benefits: • Reduced latency and latency variance • Simpler solution: deployment and management of a single system • Better utilization: a single system reduces resource fragmentation • Simpler programming model: no need for specific InfiniBand programming FINANCIAL SERVICES SIMPLIFYING INTER-PROCESS COMMUNICATION
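
One concrete reading of "no need for specific InfiniBand programming" above: on a single shared-memory system, processes can exchange data through ordinary shared memory rather than verbs or MPI. The sketch below is a generic, hypothetical example using POSIX shared memory; the segment name "/ticks" and the struct layout are invented for illustration and are not the customer's code.

```c
/* Generic, hypothetical producer side of shared-memory IPC: publish one
 * record into a POSIX shared-memory segment that consumer processes on the
 * same (single) OS image can mmap and read with ordinary loads. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    long   seq;                            /* sequence number written by the producer */
    double price;                          /* payload shared with consumer processes  */
} tick_t;

int main(void) {
    /* Create (or open) a named segment visible to all local processes. */
    int fd = shm_open("/ticks", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(tick_t)) < 0) { perror("ftruncate"); return 1; }

    tick_t *t = mmap(NULL, sizeof(tick_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (t == MAP_FAILED) { perror("mmap"); return 1; }

    /* Publish one update; no message passing or network code involved. */
    t->price = 101.25;
    t->seq   = 1;

    printf("published seq=%ld price=%.2f\n", t->seq, t->price);
    munmap(t, sizeof(tick_t));
    close(fd);
    return 0;
}
```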

  13. Example: vSMP Foundation Cloud Challenges: • Need to provision systems for MPI as well as OpenMP (shared-memory) codes • Large shared-memory jobs currently require dedicated proprietary hardware • Low utilization on shared-memory systems Applications: a variety of commercial and customer codes Solution: • Original: 4 systems, total of 8 sockets (32 cores) and 128GB RAM • Solution was extended to 16 nodes Benefits: • Utilization: rely on standard commodity hardware • Flexibility: using the same system for both shared-memory and cluster benchmarks, resulting in high utilization HOSTED HPC RESOURCE PROVIDER COST-EFFECTIVE, FLEXIBLE SOLUTION WITH HIGH UTILIZATION
