
Cluster Computing Overview


Presentation Transcript


  1. Cluster Computing Overview

  2. What is a Cluster? A cluster is a collection of connected, independent computers that work together to solve a problem.

  3. A Typical Cluster • Many standalone computers • All of the nodes can work together on a single problem at the same time • Portions of the cluster can be working on different problems at the same time • Connected together by a network • Larger clusters have separate high-speed interconnects • Administered as a single “machine”
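
To make the "work together on a single problem" idea concrete, here is a minimal sketch (not from the original slides; file and program names are illustrative) of a job that splits one computation across all nodes. It assumes an MPI implementation such as MPICH or Open MPI is installed on the cluster.

/* pi_cluster.c - minimal sketch: each rank integrates part of
 * 4/(1+x^2) over [0,1] and the pieces are summed to estimate pi.
 * Typical build/run (may vary by site): mpicc pi_cluster.c -o pi_cluster
 *                                       mpirun -np 8 ./pi_cluster
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    const long n = 10000000;            /* total number of integration steps */
    double h, local_sum = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of ranks (nodes)  */

    h = 1.0 / (double)n;
    /* Each rank handles every size-th step: rank, rank+size, rank+2*size, ... */
    for (long i = rank; i < n; i += size) {
        double x = h * ((double)i + 0.5);
        local_sum += 4.0 / (1.0 + x * x);
    }
    local_sum *= h;

    /* Combine the partial sums on rank 0 */
    MPI_Reduce(&local_sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi ~= %.12f (computed by %d ranks)\n", pi, size);

    MPI_Finalize();
    return 0;
}

The cyclic distribution (each rank takes every size-th step) keeps the work balanced without any coordination between nodes; the only communication is the final reduction.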

  4. Some Cluster Acronyms • Node – a machine in a cluster • Sizes • KB – kilobyte – thousand bytes – a small/medium-sized email • MB – megabyte – million bytes – 2/3 of a 3.5” floppy • GB – gigabyte – billion bytes – a good amount of computer memory or a very old disk drive • TB – terabyte – trillion bytes – 1/30th of the Library of Congress • PB – petabyte – quadrillion bytes (1,000 trillion) – 30 Libraries of Congress • SMP – symmetric multi-processing (many processors) • NFS – Network File System • HPC – High Performance Computing

  5. 1984 Computer Food Chain (diagram): Mainframe, Mini Computer, Workstation, PC, Vector Supercomputer

  6. How to Build a Supercomputer: 1980’s (pictured: Cray* 2) • A supercomputer was a vector SMP (symmetric multi-processor) • Custom CPUs • Custom memory • Custom packaging • Custom interconnects • Custom operating system • Costs were extreme: around $5 million/gigaFLOP • Technology evolution tracking: ~1/3 of Moore’s Law predictions

  7. 1994 Computer Food Chain (diagram): Mini Computer (hitting wall soon), PC, Workstation, Mainframe (future is bleak), Vector Supercomputer, MPP

  8. How to Build a Supercomputer: 1990’s (pictured: Intel® processor-based ASCI Red) • A supercomputer was an MPP (massively parallel processor) • COTS1 CPUs • COTS memory • Custom packaging • Custom interconnects • Custom operating system • 1 COTS = Commercial Off The Shelf • Costs were high: around $200K/gigaFLOP • Technology evolution tracking: ~1/2 of Moore’s Law predictions

  9. NCSA 1990’s Former Cluster • ~1,500-processor SGI decommissioned • Too costly to maintain • Software too expensive • Takes up large amounts of floor space • (Great for tours – looks impressive, nice displays) • Gradually being taken out as floor space is required • Now being used as network file servers

  10. Computer Food Chain (Now and Future)

  11. How to Build a Supercomputer: 2000’s (pictured: Loki, an Intel® processor-based cluster at Los Alamos National Laboratory (LANL)) • A supercomputer is a cluster • COTS1 CPUs • COTS memory • COTS packaging • COTS interconnects • COTS operating system • 1 COTS = Commercial Off The Shelf • Costs are modest: around $4K/gigaFLOP • Technology evolution tracks Moore’s Law

  12. Upcoming Teragrid Clusters • Over 4,000 Itanium 2 processors at 4 supercomputer sites • National Center for Supercomputing Applications (NCSA) • San Diego Supercomputer Center (SDSC) • Argonne National Laboratory • California Institute of Technology (Caltech) • 13.6 teraflops of computing power (8 teraflops at NCSA) • 650 terabytes of disk storage • Linked by a cross-country 40 Gbit network (16 times faster than the fastest research network currently in existence) • 16 minutes to transfer the entire Library of Congress • Some uses: • The study of cosmological dark matter • Real-time weather forecasting

  13. Larger Clusters • Japan wants to top the TOP500 with a new cluster • 30,000-node cluster planned • “Black” clusters (classified) • NSA used to receive a large fraction of Cray production • Larger ones planned • Scaling problems • SciDAC – federal mandate to solve scaling problems and enable deployment of very large clusters • Cooperative, widely separated clusters such as SETI

  14. Clustering Today • Clustering gained momentum when 3 technologies converged: • 1. Very high-performance microprocessors • workstation performance = yesterday’s supercomputers • 2. High-speed communication • Communication between cluster nodes >= between processors in an SMP • 3. Standard tools for parallel/distributed computing and their growing popularity
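
As an illustration of the "standard tools" point above, here is a minimal sketch (not from the original slides; the file name is illustrative) using MPI, the de facto standard message-passing interface on clusters: every node runs the same executable, and each rank reports which machine it landed on.

/* hello_cluster.c - each rank identifies itself and its host.
 * Assumes an MPI implementation (e.g., MPICH or Open MPI) is installed.
 * Typical run: mpirun -np 16 ./hello_cluster
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       /* this process's id     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);       /* total number of ranks */
    MPI_Get_processor_name(name, &name_len);    /* host this rank runs on */

    printf("rank %d of %d running on node %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}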

  15. Future Cluster Expansion Directions • Hyper-clusters • Grid computing

  16. Clusters of Clusters (HyperClusters) • Diagram: Cluster 1, Cluster 2, and Cluster 3 linked over a LAN/WAN • Each cluster has its own Scheduler and Master Daemon, plus Execution Daemons and clients for Submit and Graphical Control

  17. Towards Grid Computing….

  18. What is the Grid? • An infrastructure that couples • Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, etc.) • Databases (e.g., transparent access to the human genome database) • Special instruments (e.g., radio telescopes – SETI@Home searching for life in the galaxy, Astrophysics@Swinburne for pulsars) • People (maybe even animals, who knows – frogs already planned?) • across local/wide-area networks (enterprise, organisations, or the Internet) and presents them as a unified, integrated (single) resource.

  19. Network Topologies • Cluster has its own private network • One or a few outside-accessible machines • Most of the cluster machines on a private network • Easier to manage • Better security (only have to secure the entry machines) • Bandwidth limitations (funneling through a few machines) • Appropriate for smaller clusters • Lower latency between nodes • Cluster machines are all on the public network • Academic clusters require this • Some cluster software applications require this • Harder for security (have to secure EVERY machine) • Much higher network bandwidth

  20. Communication Networks • 100 Base T (Fast Ethernet) • 10 MB/sec (100 Mb/sec) • 80-150 microsecond latency • Essentially free • Gigabit Ethernet • Typically delivers 30-60 MB/sec • ~$1500 / node (going down rapidly)
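
A rough way to reason about these figures (a back-of-envelope model, not from the slides): transfer time ≈ latency + message size / delivered bandwidth. The sketch below plugs in the slide's numbers; the Gigabit Ethernet latency is an assumption, since the slide quotes only its bandwidth.

/* xfer_model.c - simple model of network transfer time:
 * time ~= latency + bytes / delivered_bandwidth.
 * Latency/bandwidth values are the rough figures quoted above;
 * the Gigabit Ethernet latency is assumed, not from the slide.
 */
#include <stdio.h>

static double xfer_time(double bytes, double latency_s, double bw_bytes_per_s)
{
    return latency_s + bytes / bw_bytes_per_s;
}

int main(void)
{
    const double fast_eth_lat = 100e-6, fast_eth_bw = 10e6;   /* ~100 us, ~10 MB/s          */
    const double gig_eth_lat  = 100e-6, gig_eth_bw  = 45e6;   /* assumed ~100 us, ~45 MB/s  */

    double sizes[] = { 1e3, 1e6, 100e6 };   /* 1 KB, 1 MB, 100 MB messages */
    for (int i = 0; i < 3; i++) {
        printf("%9.0f bytes: Fast Ethernet %.4f s, Gigabit Ethernet %.4f s\n",
               sizes[i],
               xfer_time(sizes[i], fast_eth_lat, fast_eth_bw),
               xfer_time(sizes[i], gig_eth_lat, gig_eth_bw));
    }
    return 0;
}

For small messages the latency term dominates, which is why the interconnects on the following slides advertise microsecond latencies as prominently as bandwidth.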

  21. Message Passing • Most parallel-computation cluster software requires message passing • The speed of computations often depends on message-passing speed as much as on raw processor speed • Message passing is often done through high-speed interconnects because traditional networks are too slow
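
Here is a minimal sketch (not from the original slides; file name and constants are illustrative) of how message-passing speed is commonly measured: a two-rank MPI ping-pong that times repeated round trips over whatever network or interconnect the cluster uses.

/* pingpong.c - rank 0 and rank 1 bounce a buffer back and forth.
 * Half the average round-trip time approximates one-way latency for
 * tiny messages; bytes / one-way time approximates bandwidth for large ones.
 * Assumes an MPI implementation is installed; run with exactly 2 ranks.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps = 100;
    const int nbytes = 1 << 20;          /* 1 MB message */
    char *buf = malloc(nbytes);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double one_way = (t1 - t0) / (2.0 * reps);   /* seconds per one-way trip */
        printf("one-way time %.1f us, bandwidth %.1f MB/s\n",
               one_way * 1e6, (nbytes / one_way) / 1e6);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}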

  22. High Speed Interconnects • Myrinet from Myricom (most popular in large clusters) • Proprietary, Myrinet 2000 delivers 200 MB/sec • 10-15 microsecond latency • ~$1500 / node (going down) • Scales to 1000’s of nodes • SCI • Proprietary, good for small clusters • 100 MB/sec • ~5 microsecond latency • Quadrics • Proprietary, very expensive • 200 MB/s delivered • 5 microsecond latency

  23. InfiniBand – Future of Interconnects? • Up to 30 Gbits/second in the first specifications • 15 times faster than the fastest high-speed interconnects • Just now starting to become available commercially • Industry standard • Will be available from numerous companies

  24. Cluster Software Operating System Choices • Linux • Redhat – most popular • Mandrake – similar to Redhat, technically superior • FreeBSD, OpenBSD, other BSD’s • Technically superior to Linux’es • Much less popular than Linux’es • Windoze

  25. Pre-Packaged Cluster Software Choices • Pre-packaged cluster software • NCSA cluster-in-a-box • NCSA grid-in-a-box • OSCAR • Score • Scyld/Beowulf • MSC • NPACI Rocks

  26. OSCAR Pre-Packaged Cluster Software • Packaged open source cluster software • Designed to support many Unix operating systems • Currently, Redhat Linux • Soon to be released – Mandrake • Supported and developed by: • NCSA • IBM • Dell • Intel • Oak Ridge National Laboratory • Most popular open source cluster software package

  27. Score Pre-Packaged Cluster Software • Very popular in Japan • Very sophisticated

  28. Scyld/Beowulf Pre-Packaged Cluster Software • Different model – treats a cluster of separate machines like one big machine with a single process space • Oriented towards commercial turn-key clusters • Very slick installation • Not as flexible – separate machines not individually accessible

  29. NPACI Rocks Pre-Packaged Cluster Software • Based on Redhat Linux • Similar to OSCAR • Competitor of OSCAR • Developed by the San Diego Supercomputer Center and others

  30. OSCAR Overview • Open Source Cluster Application Resources • Cluster on a CD – automates the cluster install process • IBM, Intel, NCSA, ORNL, MSC Software, Dell • NCSA “Cluster in a Box” base • Wizard driven • Nodes are built over the network • Initial target: clusters of 64 nodes or fewer • OSCAR will probably be run on two 1,000-node clusters • Works on commodity PC components • RedHat based (for now) • Components: open source and BSD-style license
