
High Performance Cluster Computing: Architectures and Systems



Presentation Transcript


  1. High Performance Cluster Computing: Architectures and Systems. Book Editor: Rajkumar Buyya. Slides: Hai Jin and Raj Buyya. Internet and Cluster Computing Center

  2. http://www.buyya.com/cluster/ Cluster Computing at a Glance, Chapter 1: by M. Baker and R. Buyya • Introduction • Scalable Parallel Computer Architecture • Towards Low Cost Parallel Computing and Motivations • Windows of Opportunity • A Cluster Computer and its Architecture • Clusters Classifications • Commodity Components for Clusters • Network Service/Communications SW • Cluster Middleware and Single System Image • Resource Management and Scheduling (RMS) • Programming Environments and Tools • Cluster Applications • Representative Cluster Systems • Cluster of SMPs (CLUMPS) • Summary and Conclusions

  3. Resource Hungry Applications • Solving grand challenge applications using computer modeling, simulation and analysis: Aerospace, Internet & E-commerce, Life Sciences, Digital Biology, CAD/CAM, Military Applications

  4. How to Run Applications Faster ? • There are 3 ways to improve performance: • Work Harder • Work Smarter • Get Help • Computer Analogy • Using faster hardware • Optimized algorithms and techniques used to solve computational tasks • Multiple computers to solve a particular task

  5. Scalable HPC: Breaking Administrative Barriers • [figure] performance rises as administrative barriers are crossed: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe • Desktop → SMPs or supercomputers → local cluster → enterprise cluster/grid → global cluster/grid → inter-planet cluster/grid ??

  6. Era of Computing • Rapid technical advances • the recent advances in VLSI technology • software technology • OS, PL, development methodologies, & tools • grand challenge applications have become the main driving force • Parallel computing • one of the best ways to overcome the speed bottleneck of a single processor • good price/performance ratio of a small cluster-based parallel computer

  7. Two Eras of Computing • [figure] the sequential era and the parallel era, each progressing through architectures, system software/compilers, applications, and problem-solving environments (P.S.Es) • each technology moves from R&D to commercialization to commodity • timeline spanning roughly 1940 to 2030

  8. Scalable (Parallel) Computer Architectures • Taxonomy • based on how processors, memory & interconnect are laid out, resources are managed • Massively Parallel Processors (MPP) • Symmetric Multiprocessors (SMP) • Cache-Coherent Non-Uniform Memory Access (CC-NUMA) • Clusters • Distributed Systems – Grids/P2P

  9. Scalable Parallel Computer Architectures • MPP • A large parallel processing system with a shared-nothing architecture • Consists of several hundred nodes with a high-speed interconnection network/switch • Each node consists of a main memory & one or more processors • Each node runs a separate copy of the OS • SMP • 2-64 processors today • Shared-everything architecture • All processors share all the global resources available • A single copy of the OS runs on these systems

  10. Scalable Parallel Computer Architectures • CC-NUMA • a scalable multiprocessor system having a cache-coherent nonuniform memory access architecture • every processor has a global view of all of the memory • Clusters • a collection of workstations / PCs that are interconnected by a high-speed network • work as an integrated collection of resources • have a single system image spanning all its nodes • Distributed systems • considered conventional networks of independent computers • have multiple system images as each node runs its own OS • the individual machines could be combinations of MPPs, SMPs, clusters, & individual computers

  11. Key Characteristics of Scalable Parallel Computers

  12. UMA vs. NUMA • [figure] UMA: several CPUs share common shared-memory (SM) modules over a single bus • NUMA: each CPU has local memory (LM), with a cache in front of the shared memory accessed across the interconnect

  13. UMA vs. NUMA • NUMA directory example: [figure] each node holds CPUs, shared memory (SM), and a directory on its local bus • 256 nodes of 16 MB each (4 GB shared memory in total), 64-byte blocks, so each node's directory has 2^18 entries • A 32-bit physical address splits into an 8-bit node field, an 18-bit block field, and a 6-bit byte offset • Example: address 0x24000108 maps to node 36, block 4, offset 8 bytes
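The field widths above make the decoding easy to check. The short C sketch below is our own illustration (not from the slides); it extracts the node, block, and offset fields from the example address with shifts and masks:

    /* Address decoding for the slide's NUMA example: 256 nodes, 16 MB per node
     * (4 GB shared memory total), 64-byte blocks. A 32-bit physical address is
     * split into an 8-bit node number, an 18-bit block number, and a 6-bit
     * byte offset. Field widths follow the slide; the variable names are ours. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t addr = 0x24000108;                 /* example address from the slide */

        uint32_t offset = addr & 0x3F;              /* low 6 bits   -> byte offset   */
        uint32_t block  = (addr >> 6) & 0x3FFFF;    /* next 18 bits -> block number  */
        uint32_t node   = (addr >> 24) & 0xFF;      /* top 8 bits   -> home node     */

        printf("node %u, block %u, offset %u bytes\n", node, block, offset);
        /* prints: node 36, block 4, offset 8 bytes */
        return 0;
    }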

  14. In Summary • Need more computing power • Improve the operating speed of processors & other components • constrained by the speed of light, thermodynamic laws, & the high financial costs for processor fabrication • Connect multiple processors together & coordinate their computational efforts • parallel computers • allow the sharing of a computational task among multiple processors

  15. Technology Trends... • Performance of PC/workstation components has almost reached that of components used in supercomputers… • Microprocessors (50% to 100% per year) • Networks (Gigabit SANs) • Operating Systems (Linux, ...) • Programming environments (MPI, …) • Applications (.edu, .com, .org, .net, .shop, .bank) • The rate of performance improvement of commodity systems is much more rapid than that of specialized systems.

  16. Technology Trends

  17. Trend • [Traditional Usage] Workstations with UNIX for science & industry vs PC-based machines for administrative work & word processing • [Trend] A rapid convergence in processor performance and kernel-level functionality of UNIX workstations and PC-based machines

  18. Rise and Fall of Computer Architectures • Vector Computers (VC) - proprietary systems: • provided the breakthrough needed for the emergence of computational science, but they were only a partial answer. • Massively Parallel Processors (MPP) - proprietary systems: • high cost and a low performance/price ratio. • Symmetric Multiprocessors (SMP): • suffer from scalability limitations • Distributed Systems: • difficult to use and hard to extract parallel performance. • Clusters - gaining popularity: • High Performance Computing - Commodity Supercomputing • High Availability Computing - Mission Critical Applications

  19. The Dead Supercomputer Society http://www.paralogos.com/DeadSuper/ • Dana/Ardent/Stellar • Elxsi • ETA Systems • Evans & Sutherland Computer Division • Floating Point Systems • Galaxy YH-1 • Goodyear Aerospace MPP • Gould NPL • Guiltech • Intel Scientific Computers • Intl. Parallel Machines • KSR • MasPar • ACRI • Alliant • American Supercomputer • Ametek • Applied Dynamics • Astronautics • BBN • CDC • Convex • Cray Computer • Cray Research (SGI, then Tera) • Culler-Harris • Culler Scientific • Cydrome • Meiko • Myrias • Thinking Machines • Saxpy • Scientific Computer Systems (SCS) • Soviet Supercomputers • Suprenum [photo: Convex C4600]

  20. Computer Food Chain: Causing the demise of specialized systems • Demise of mainframes, supercomputers, & MPPs

  21. Towards Clusters The promise of supercomputing to the average PC User ?

  22. Towards Commodity Parallel Computing • linking together two or more computers to jointly solve computational problems • since the early 1990s, an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards clusters of workstations • Hard to find money to buy expensive systems • the rapid improvement in the availability of commodity high performance components for workstations and networks • Low-cost commodity supercomputing • from specialized traditional supercomputing platforms to cheaper, general purpose systems consisting of loosely coupled components built up from single or multiprocessor PCs or workstations

  23. History: Clustering of Computers for Collective Computing • [figure] a timeline running from the 1960s through the 1980s, 1990, 1995+ and 2000+, ending with PDA clusters

  24. Why PC/WS Clustering Now ? • Individual PCs/workstations are becoming increasingly powerful • Commodity network bandwidth is increasing and latency is decreasing • PC/Workstation clusters are easier to integrate into existing networks • Typical low user utilization of PCs/WSs • Development tools for PCs/WSs are more mature • PC/WS clusters are cheap and readily available • Clusters can be easily grown

  25. What is a Cluster ? • A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource. • A node • a single or multiprocessor system with memory, I/O facilities, & OS • generally 2 or more computers (nodes) connected together • in a single cabinet, or physically separated & connected via a LAN • appear as a single system to users and applications • provide a cost-effective way to gain features and benefits

  26. Cluster Architecture • [figure] Sequential and parallel applications run on top of a parallel programming environment and the cluster middleware (single system image and availability infrastructure) • Each PC/workstation node runs communications software over its network interface hardware • Nodes are connected by the cluster interconnection network/switch

  27. So What’s So Different about Clusters? • Commodity Parts? • Communications Packaging? • Incremental Scalability? • Independent Failure? • Intelligent Network Interfaces? • Complete System on every node • virtual memory • scheduler • files • … • Nodes can be used individually or jointly...

  28. Windows of Opportunities • Parallel Processing • Use multiple processors to build MPP/DSM-like systems for parallel computing • Network RAM • Use memory associated with each workstation as aggregate DRAM cache • Software RAID • Redundant array of inexpensive disks • Use the arrays of workstation disks to provide cheap, highly available, & scalable file storage • Possible to provide parallel I/O support to applications • Multipath Communication • Use multiple networks for parallel data transfer between nodes
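To make the Software RAID item above concrete, here is a minimal sketch (our own illustration; the block size and names are invented) of the XOR-parity idea that lets an array of cheap workstation disks survive the loss of any single block:

    /* Minimal sketch of XOR parity as used in RAID-style software redundancy:
     * the parity block is the XOR of the data blocks, so any single lost block
     * can be rebuilt by XOR-ing the survivors. Block size and names are
     * illustrative, not from the slides. */
    #include <stdio.h>
    #include <string.h>

    #define BLOCK 8   /* bytes per block, tiny for the example */

    static void xor_into(unsigned char *dst, const unsigned char *src) {
        for (int i = 0; i < BLOCK; i++) dst[i] ^= src[i];
    }

    int main(void) {
        unsigned char d0[BLOCK] = {1, 2, 3, 4, 5, 6, 7, 8};
        unsigned char d1[BLOCK] = {9, 8, 7, 6, 5, 4, 3, 2};
        unsigned char parity[BLOCK] = {0};

        xor_into(parity, d0);                /* parity = d0 ^ d1 */
        xor_into(parity, d1);

        /* pretend the disk holding d1 failed: rebuild it from d0 and parity */
        unsigned char rebuilt[BLOCK];
        memcpy(rebuilt, parity, BLOCK);
        xor_into(rebuilt, d0);

        printf("%s\n", memcmp(rebuilt, d1, BLOCK) == 0 ? "d1 recovered" : "mismatch");
        return 0;
    }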

  29. Cluster Design Issues • Enhanced Performance (performance @ low cost) • Enhanced Availability (failure management) • Single System Image (look-and-feel of one system) • Size Scalability (physical & application) • Fast Communication (networks & protocols) • Load Balancing (CPU, Net, Memory, Disk) • Security and Encryption (clusters of clusters) • Distributed Environment (social issues) • Manageability (admin. and control) • Programmability (simple API if required) • Applicability (cluster-aware and non-aware app.)

  30. Scalability vs. Single System Image • [figure]

  31. Common Cluster Modes • High Performance (dedicated). • High Throughput (idle cycle harvesting). • High Availability (fail-over). • A Unified System – HP and HA within the same cluster

  32. High Performance Cluster (dedicated mode)

  33. High Throughput Cluster (Idle Resource Harvesting) • Shared pool of computing resources: processors, memory, disks, linked by an interconnect • Guarantee at least one workstation to many individuals (when active) • Deliver a large % of the collective resources to a few individuals at any one time

  34. High Availability Clusters

  35. HA and HP in the same Cluster • Best of both worlds: the world is heading towards this configuration

  36. Cluster Components

  37. Prominent Components of Cluster Computers (I) • Multiple High Performance Computers • PCs • Workstations • SMPs (CLUMPS) • Distributed HPC Systems leading to Metacomputing

  38. Prominent Components of Cluster Computers (II) • State of the art Operating Systems • Linux (MOSIX, Beowulf, and many more) • Microsoft NT (Illinois HPVM, Cornell Velocity) • SUN Solaris (Berkeley NOW, C-DAC PARAM) • IBM AIX (IBM SP2) • HP UX (Illinois - PANDA) • Mach (microkernel based OS) (CMU) • Cluster Operating Systems (Solaris MC, SCO Unixware, MOSIX (an academic project)) • OS gluing layers (Berkeley GLUnix)

  39. Prominent Components of Cluster Computers (III) • High Performance Networks/Switches • Ethernet (10Mbps) • Fast Ethernet (100Mbps) • Gigabit Ethernet (1Gbps) • SCI (Scalable Coherent Interface, ~12 µsec MPI latency) • ATM (Asynchronous Transfer Mode) • Myrinet (1.2Gbps) • QsNet (Quadrics Supercomputing World, 5 µsec latency for MPI messages) • Digital Memory Channel • FDDI (Fiber Distributed Data Interface) • InfiniBand

  40. Prominent Components of Cluster Computers (IV) • Network Interface Card • Myrinet has NIC • User-level access support

  41. Prominent Components of Cluster Computers (V) • Fast Communication Protocols and Services • Active Messages (Berkeley) • Fast Messages (Illinois) • U-net (Cornell) • XTP (Virginia) • Virtual Interface Architecture (VIA)
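As an illustration of the idea behind active-message-style protocols (a conceptual sketch only, not the actual Berkeley Active Messages, Fast Messages, or VIA API): a message carries the index of a handler plus a small payload, and the receiver dispatches straight to that handler instead of buffering the data.

    /* Conceptual sketch of the active-message idea; all types and names here
     * are illustrative inventions, not a real communication library's API. */
    #include <stdio.h>

    typedef struct { int handler; int arg; } am_msg;     /* hypothetical message */
    typedef void (*am_handler)(int arg);

    static void add_to_counter(int arg)  { printf("counter += %d\n", arg); }
    static void print_value(int arg)     { printf("value = %d\n", arg); }

    static am_handler handler_table[] = { add_to_counter, print_value };

    /* what a receiving node would do on message arrival: run the handler directly */
    static void on_arrival(am_msg m) { handler_table[m.handler](m.arg); }

    int main(void) {
        am_msg m1 = { 0, 5 };    /* "run handler 0 with argument 5" */
        am_msg m2 = { 1, 42 };
        on_arrival(m1);
        on_arrival(m2);
        return 0;
    }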

  42. Comparison of cluster interconnects • Myrinet: bandwidth 140 MBytes/s at 33 MHz / 215 MBytes/s at 66 MHz; MPI latency 16.5 µs at 33 MHz / 11 µs at 66 MHz; list price/port $1.5K; hardware available now; Linux support now; maximum nodes 1000's; protocol in firmware on the adapter; VIA support soon; MPI via 3rd party • QsNet: 208 MBytes/s; 5 µs; $6.5K; hardware now; Linux support late '00; 1000's of nodes; firmware on adapter; no VIA support; MPI from Quadrics/Compaq • Giganet: ~105 MBytes/s; ~20-40 µs; $1.5K; hardware now; Linux now; 1000's of nodes; firmware on adapter; VIA on NT/Linux; MPI via 3rd party • ServerNet2: 165 MBytes/s; 20.2 µs; ~$1.5K; hardware Q2 '00; Linux Q2 '00; up to 64K nodes; protocol implemented in hardware; VIA done in hardware; MPI from Compaq/3rd party • SCI: ~80 MBytes/s; 6 µs; ~$1.5K; hardware now; Linux now; 1000's of nodes; implemented in hardware; VIA in software; MPI via 3rd party • Gigabit Ethernet: 30-50 MBytes/s; 100-200 µs; ~$1.5K; hardware now; Linux now; 1000's of nodes; firmware on adapter (TCP/IP, VIA); VIA on NT/Linux; MPI via MPICH over TCP/IP
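A simple first-order model helps interpret the table: one-way message time is roughly latency plus message size divided by bandwidth. The sketch below applies this standard approximation (the model and the helper function are ours; the Myrinet and Gigabit Ethernet figures are taken, approximately, from the table above):

    /* First-order estimate of one-way message time: T = latency + size/bandwidth.
     * Myrinet (66 MHz column) and Gigabit Ethernet (mid-range) figures come from
     * the comparison table; the model itself is a standard approximation. */
    #include <stdio.h>

    static double msg_time_us(double latency_us, double bw_mbytes_s, double bytes) {
        return latency_us + (bytes / (bw_mbytes_s * 1e6)) * 1e6;   /* microseconds */
    }

    int main(void) {
        double sizes[] = { 64, 4096, 1048576 };   /* message sizes in bytes */
        for (int i = 0; i < 3; i++) {
            printf("%8.0f bytes: Myrinet %9.1f us, GigE %9.1f us\n",
                   sizes[i],
                   msg_time_us(11.0, 215.0, sizes[i]),    /* Myrinet: 11 us, 215 MB/s */
                   msg_time_us(150.0, 40.0, sizes[i]));   /* GigE: ~150 us, ~40 MB/s  */
        }
        return 0;
    }

For small messages the latency term dominates, which is why the low-latency interconnects matter for fine-grained parallel applications; for large messages, bandwidth dominates.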

  43. Prominent Components of Cluster Computers (VI) • Cluster Middleware • Single System Image (SSI) • System Availability (SA) Infrastructure • Hardware • DEC Memory Channel, DSM (Alewife, DASH), SMP Techniques • Operating System Kernel/Gluing Layers • Solaris MC, Unixware, GLUnix, MOSIX • Applications and Subsystems • Applications (system management and electronic forms) • Runtime systems (software DSM, PFS etc.) • Resource management and scheduling (RMS) software • SGE (Sun Grid Engine), LSF, PBS, Libra: Economy Cluster Scheduler, NQS, etc.

  44. Prominent Components of Cluster Computers (VII) • Parallel Programming Environments and Tools • Threads (PCs, SMPs, NOW..) • POSIX Threads • Java Threads • MPI (Message Passing Interface) • Linux, NT, on many Supercomputers • PVM (Parallel Virtual Machine) • Parametric Programming • Software DSMs (Shmem) • Compilers • C/C++/Java • Parallel programming with C++ (MIT Press book) • RAD (rapid application development tools) • GUI based tools for PP modeling • Debuggers • Performance Analysis Tools • Visualization Tools
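For reference, a minimal MPI program in C looks like the sketch below; the compile/run commands in the comment are typical of an MPICH-style installation and may differ on a given cluster.

    /* Minimal MPI sketch: each process reports its rank and the job size.
     * Typical build/run (installation-dependent):
     *   mpicc hello.c -o hello && mpirun -np 4 ./hello */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id           */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes   */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }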

  45. Prominent Components of Cluster Computers (VIII) • Applications • Sequential • Parallel / Distributed (cluster-aware app.) • Grand Challenge applications • Weather Forecasting • Quantum Chemistry • Molecular Biology Modeling • Engineering Analysis (CAD/CAM) • ………………. • PDBs, web servers, data mining

  46. Key Operational Benefits of Clustering • High Performance • Expandability and Scalability • High Throughput • High Availability

  47. Clusters Classification (I) • Application Target • High Performance (HP) Clusters • Grand Challenge Applications • High Availability (HA) Clusters • Mission Critical applications

  48. Clusters Classification (II) • Node Ownership • Dedicated Clusters • Non-dedicated clusters • Adaptive parallel computing • Communal multiprocessing

  49. Clusters Classification (III) • Node Hardware • Clusters of PCs (CoPs) • Piles of PCs (PoPs) • Clusters of Workstations (COWs) • Clusters of SMPs (CLUMPs)

  50. Clusters Classification (IV) • Node Operating System • Linux Clusters (e.g., Beowulf) • Solaris Clusters (e.g., Berkeley NOW) • NT Clusters (e.g., HPVM) • AIX Clusters (e.g., IBM SP2) • SCO/Compaq Clusters (Unixware) • Digital VMS Clusters • HP-UX clusters • Microsoft Wolfpack clusters
