Building a Low-Cost Supercomputer Dr. Tim McGuire Sam Houston State University
Acknowledgments • Most treatments of cluster computing (including this one) are heavily based on the seminal work of Greg Pfister (IBM Research, Austin), In Search of Clusters • The concept of Beowulf clusters originated with Donald J. Becker and Thomas Sterling at the Center of Excellence in Space Data and Information Sciences, NASA Goddard Space Flight Center
Introduction • There are three ways to do anything faster: • Work harder • "Crunch Time" is familiar to all of us • Work smarter • Better to find a way to reduce the work needed • Get help • Certainly works, but we all know about committees ...
In a computer ... • Working Harder: get a faster processor • Working Smarter: use a better algorithm • Getting Help: parallel processing
Working Harder -- Faster Processors • The effect of faster processors is astonishing • The effective speed of the x86 family of processors has increased nearly 50% per year • RISC architectures have sustained a 60% annual cumulative growth rate • These trends will likely continue for the foreseeable future
Working Smarter -- Better Algorithms • The increases in speed made possible by better algorithms dwarf the accomplishments of faster hardware • Binary search on 1 billion items takes 30 comparisons, versus a maximum of one billion comparisons using linear search
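The comparison counts on this slide can be checked with a short sketch. The function names below are illustrative and not part of the original talk. Each binary-search "probe" is counted as one three-way comparison, which is the sense in which one billion items need about 30.

```python
def linear_search(items, target):
    """Scan left to right; return (index, comparisons), index -1 if absent."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search(items, target):
    """Search a sorted sequence; return (index, probes), index -1 if absent."""
    lo, hi = 0, len(items) - 1
    probes = 0
    while lo <= hi:
        probes += 1                      # one probe = one three-way comparison
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1                 # discard the lower half
        else:
            hi = mid - 1                 # discard the upper half
    return -1, probes


def max_probes(n):
    """Worst-case probes for binary search on n >= 1 items: floor(log2 n) + 1."""
    return n.bit_length()
```

Because `range` objects support `len()` and indexing, `binary_search(range(10**9), 10**9 - 1)` runs without allocating a billion-element list, and `max_probes(10**9)` gives the worst-case bound of 30.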
Getting Help -- Parallel Processing • Covert parallel processing: pipelining, vector processing, etc.; really equivalent to faster hardware • Overt parallelism: done via software • "Parallelism is the wave of the future -- and always will be"
Early Attempts at Parallelism • Von Neumann thought it was too hard, and gave us the "Von Neumann bottleneck" • The 60's ILLIAC IV project was the first great attempt at parallel processing (as well as trying to advance circuit and software technology) • The Japanese Fifth Generation Project launched another wave, including the Grand Challenge problems
Microprocessor Revolution • Microprocessors have had a superior price/performance ratio • "All you have to do is gang a whole bunch of them together" • The problem is "All you also have to do is program them to work together" • Programming costs much more than hardware
Highly Parallel Computing • Finally, in the early 90's, microprocessors became fast and powerful enough that a practical-sized aggregation of them seemed the only feasible way to exceed supercomputer speeds • Even Cray Research (T3D) got into the act
"Lowly" Parallel Processing • Mid-to-late 90's -- military downsizing (among other things) caused funding to dry up • However … • Microprocessors kept getting faster … a lot faster • With overall performance doubling each year, in 4 years what needed 256 processors can be done with 16 instead. • System availability became a mass market issue • Since computers are so cheap, buy two (or more) for redundancy in case one fails and use them both, interconnected by a network
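The 256-to-16 figure above is simple compounding: four doublings make each processor 16 times faster, and 256 / 16 = 16. A minimal sketch follows, assuming perfect scaling (work divides evenly and there is no communication overhead). The function name and parameters are illustrative only.

```python
def processors_needed(initial_processors, years, annual_speedup=2.0):
    """Processors required after `years` of per-processor speedup.

    Assumes perfect scaling: total work is fixed and divides evenly,
    so the count shrinks by annual_speedup ** years.
    """
    return initial_processors / (annual_speedup ** years)


if __name__ == "__main__":
    print(processors_needed(256, 4))   # doubling each year: 256 / 2**4
```

Under the slower 50% annual growth quoted earlier for x86 processors, `processors_needed(256, 4, annual_speedup=1.5)` comes to about 50.6, so the reduction depends strongly on the assumed growth rate.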
SMP -- One Form of "Cheap" Parallelism • Symmetric multiprocessors have been around for some time and have certain advantages over clusters • Typically, these have been shared memory systems -- few communication problems
The Big Distinction -- Programming • How you program SMP systems is substantially different from programming clusters: Their programming models are different • If you explicitly exploit SMP in an application, it's essentially impossible to efficiently exploit clusters in the same program
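The difference between the two programming models can be sketched in a few lines. This is only an illustration, not how a real cluster is programmed. Python threads stand in for processors, a lock-guarded variable stands in for SMP shared memory, and queues stand in for the network messages a cluster would exchange. All names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(data, workers=4):
    """SMP style: every worker updates one shared total, guarded by a lock."""
    total = [0]                       # shared state visible to all workers
    lock = threading.Lock()

    def work(chunk):
        partial = sum(chunk)
        with lock:                    # coordinate access to shared memory
            total[0] += partial

    threads = [threading.Thread(target=work, args=(data[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(data, workers=4):
    """Cluster style: nodes share nothing and communicate only by messages."""
    results = queue.Queue()           # stands in for the reply channel

    def node(inbox):
        chunk = inbox.get()           # receive a work message
        results.put(sum(chunk))       # send back a result message

    inboxes = [queue.Queue() for _ in range(workers)]
    threads = [threading.Thread(target=node, args=(inbox,))
               for inbox in inboxes]
    for t in threads:
        t.start()
    for i, inbox in enumerate(inboxes):
        inbox.put(data[i::workers])   # explicitly ship each node its data
    for t in threads:
        t.join()
    return sum(results.get() for _ in range(workers))
```

Both functions compute the same sum, but the first relies on every worker seeing the same memory. The second must explicitly ship data out and collect results, which is the part that has to be designed in from the start on a cluster.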
Why Clusters? • The Standard Litany • Why Now? • Why Not Now?
The Standard Litany • Performance • Availability • Price/Performance Ratio • Incremental Growth • Scaling • Scavenging
Performance • No matter what form or measure of performance one is seeking -- throughput, response time, turnaround time, etc. -- it is straightforward to claim that one can get even more of it by using a bunch of machines at the same time. • Only occasionally does one hear the admission that a "tad bit" of new programming will be needed for anything to work correctly.
Availability • Having a computer shrivel up into an expensive paperweight can be a lot less traumatic if it's not unique, but rather one of a herd. • The work done by the dear departed sibling can be redistributed among the others (fail-soft computing)
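Fail-soft redistribution can be sketched as follows. This is a toy model under stated assumptions: the node names, the task lists, and the round-robin policy are all hypothetical. A real cluster would also have to detect the failure and recover any work that was in progress.

```python
def redistribute(assignments, failed_node):
    """Return a new node -> tasks mapping with the failed node's tasks
    spread round-robin over the surviving nodes."""
    survivors = sorted(n for n in assignments if n != failed_node)
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the work")
    # Copy the survivors' task lists so the input mapping is not modified.
    new_assignments = {n: list(assignments[n]) for n in survivors}
    orphaned = assignments.get(failed_node, [])
    for i, task in enumerate(orphaned):
        new_assignments[survivors[i % len(survivors)]].append(task)
    return new_assignments


if __name__ == "__main__":
    herd = {"n1": ["a", "b"], "n2": ["c"], "n3": ["d", "e", "f"]}
    print(redistribute(herd, "n3"))
```

The herd keeps running with one fewer member; each survivor simply carries a somewhat larger share of the load.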
Price/Performance Ratio • Clusters and other forms of computer aggregation are typically collections of machines that individually have very good performance for their price. • The promise is that the aggregate retains the price/performance of its individual members.
Incremental Growth • To the degree that one really does attain greater performance and availability with a group of computers, one should be able to enhance both by merely adding more machines. • Replacing machines should not be necessary.
Scaling • "Scalable" is, unfortunately, a buzzword • What it does deal with is how big a computer system can usably get. • It is a crucial element in the differentiation between clusters and symmetric multiprocessors.
Scavenging • "Look at all those unused CPU cycles spread across all the desktops in our network…" • Unused cycles are free. • However, how do you get and manage them? -- this complicates cluster support very significantly
The Benefits are Real • But, how does one take advantage of them? • The hardware provides the potential. • The fulfillment lies in the software, and unfortunately, software isn't riding the exponential growth curve.
Why Now? • Three Trends • Fat Boxes -- very high performance microprocessors • Fat Pipes -- standard high-speed communication • Thick Glue -- standard tools for distributed computing • One Market Requirement • High Availability
Fat Boxes • Microprocessors have kept, and will keep getting faster. • Supercomputers in the classic style are extinct for practical purposes • Mass-market, inexpensive microprocessors have crawled up the tailpipe of the workstation market just like workstations crawled up the tailpipe of minicomputers and mainframes earlier. • There are no more supercomputers, there is only supercomputing.
Fat Pipes • Commodity off the shelf (COTS) networking parts have achieved communication performance that was previously possible only with expensive, proprietary techniques • Standardized communication facilities such as • ATM -- Asynchronous Transfer Mode • Switched Gigabit Ethernet • FCS -- Fibre Channel Standard • Performance of gigabytes per second is possible.
Thick Glue • Standard tools for distributed computing such as TCP/IP • Intranets, the Internet, and the World Wide Web • Tool sets for distributed system administration • PVM (Parallel Virtual Machine) and MPI (Message Passing Interface)
Requirement for High Availability • Nobody has ever wanted computers to break. • However, never before has high availability been a significant issue in a mass market computer arena. • Clusters are uniquely capable of answering the need of both sides of the spectrum and are much cheaper than hardware based fault-tolerant approaches.
Why Not Now? • If they're so good, why haven't clusters become the most common mode of computation? • Lack of "single system image" software • Limited exploitation
Lack of Single System Image Software • Replacing a single large computer with a cluster means that many systems will have to be managed rather than one. • Their distributed management tools are tools, not turnkey systems • 50% of the cost of a computer system is staffing, rather than hardware, software, or maintenance
Limited Exploitation • Only relatively few types of subsystems now exploit the ability of clusters to provide both scalable performance and high availability. • This is a direct result of substantial difficulties that arise in parallel programming. • The problem is not hardware, it's software
An Exception • For one kind of parallel system, the software issues have been addressed to a large degree: The symmetric multiprocessor (SMP) • It of necessity requires a single system image
Definitions, Distinctions, and Comparisons • Definition • Distinction from Parallel Systems • Distinctions from Distributed Systems • Comparisons and Contrasts
Definition • A cluster is a type of parallel or distributed system that: • consists of a collection of interconnected stand-alone computers, and • is used as a single, unified computing resource • We define them as a subparadigm of distributed (or parallel) systems
Distinction from Parallel Systems • A useful analogy: • This is A Dog • (a single computer)
A Pack of Dogs • And this is a pack of dogs (running in parallel) • (a cluster)
A Savage Multiheaded Pooch • … or, pardon the abbreviation, "SMP" • (This pooch is no relation to Kerberos (Cerberus in Latin), who guards both the gates of Hades and distributed systems -- he has only three heads.)
Dog Packs and SMPs are Similar • Both are more potent than just plain dogs • They can both bring down larger prey than a plain single dog. • They eat more and eat faster than a single dog
Dog Packs and SMPs are Different • Scaling • Availability • System Management • Software Licensing
Scaling Differences • The Savage Multiheaded Pooch can take many bites at once • What happens when it tries to swallow? • It needs a larger throat, stomach, intestines, etc. • Similarly, to scale SMPs, you must beef up the entire machine • When you add another dog to a dog pack, you add a whole dog. You don't have to do anything to the other dogs. • Likewise, clusters
Availability • If an SMP breaks a leg … "that dog won't hunt" … no matter how many heads it has. • If a member of the pack is injured, the rest of the pack can still bring down prey.
System Management • You only have to walk an SMP once. • It takes a good deal more effort to train a pack of dogs to behave. • With the SMP, all you have to do is get the heads to learn basic cooperation (and that should be built into the operating system).
Licensing (Dogs or Software) • If you get a license for an SMP, you'll probably only need one license • For a cluster of dogs, you'll need one per dog
Distinctions from Distributed Systems • The distinction between clusters and distributed systems is not as clear (and a lot of people confuse the two). • We'll try. The salient points are: • Internal Anonymity • Peer Relationship • Clusters as part of a Distributed System
Internal Anonymity • Nodes in a distributed system necessarily retain their own individual identities • The elements of a cluster are usually viewed from outside the cluster as anonymous • Internally, they may be differentiated, but externally the jobs are submitted to the cluster, not, for example, to cluster node #4
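Internal anonymity can be illustrated with a small sketch. The `Cluster` class and its least-loaded dispatch policy are assumptions made for illustration, not a description of any particular cluster software. The point is that callers hand work to the cluster as a whole and never name a node.

```python
class Cluster:
    """Toy front end: jobs go to 'the cluster', never to a named node."""

    def __init__(self, node_count):
        # Internal bookkeeping only; not part of the caller-facing interface.
        self._dispatched = [0] * node_count

    def submit(self, job):
        """Run a zero-argument callable on some node and return its result.

        The node is chosen internally (fewest jobs dispatched so far);
        the caller neither selects nor learns which node ran the job.
        """
        node = min(range(len(self._dispatched)),
                   key=self._dispatched.__getitem__)
        self._dispatched[node] += 1
        return job()


if __name__ == "__main__":
    cluster = Cluster(node_count=3)
    print([cluster.submit(lambda i=i: i * i) for i in range(6)])
```

Internally the nodes are differentiated (each has its own counter), but nothing in `submit`'s interface exposes that. In a distributed system, by contrast, the caller would address a specific machine.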
Peer Relationship • Distributed systems • use an underlying communication layer that is peer-to-peer • at a higher level, they are often organized into a client-server paradigm • Clusters • underlying communication is peer-to-peer • organization is also peer-to-peer (with some minor exceptions)
Clusters as part of a Distributed System • Clusters usually exist in the context of a distributed system • In this case, they are viewed by the distributed system as a single node • For example, the cluster could serve as a compute engine • It also could serve as, say, a DBMS server in the client-server paradigm (but that's not the organization we want to consider in this presentation)
Beowulf Clusters • The Beowulf project was initiated in 1994 under the sponsorship of the NASA HPCC program to explore how computing could be made "cheaper better faster". • They termed this PoPC -- a Pile of PCs
The "Pile of PCs" Approach • Very similar to COW (cluster of workstations) and shares the roots of NOW (network of workstations), but emphasizes: • COTS (commodity off the shelf) components • dedicated processors (rather than scavenging cycles from idle workstations) • a private system area network (enclosed SAN rather than exposed LAN)
What Beowulf Adds • Beowulf adds to the PoPC model by emphasizing • no custom components • easy replication from multiple vendors • scalable I/O • a freely available software base • using freely available distributed computing tools with minimal changes • a collaborative design
Advantages of the Beowulf Approach • No single vendor owns the rights to the product -- not vulnerable to single vendor decisions • Approach permits technology tracking -- using the best, most recent components at the best price • Allows "just in place" configuration -- permits flexible and user driven decisions