Explore the world of supercomputing, data analysis, and advanced computing opportunities. Learn about parallel processing, cluster computing, and the applications of supercomputers in data mining, simulation, and more.
High Performance Computing and CyberGIS Keith T. Weber, GISP GIS Director, ISU
Goal of this presentation • Introduce you to another world of computing, analysis, and opportunity • Encourage you to learn more!
Some Terminology Up-Front • Supercomputing • HPC (High Performance Computing) • HTC (High Throughput Computing) • CI (Cyberinfrastructure)
Acknowledgements • Much of the material presented here was originally developed by Henry Neeman at the University of Oklahoma and OSCER
What is Supercomputing? • Supercomputing is the biggest, fastest computing right this minute. • Likewise, a supercomputer is one of the biggest, fastest computers right this minute. • So, the definition of supercomputing is constantly changing. • Rule of Thumb: A supercomputer is typically 100 times as powerful as a PC.
What is Supercomputing About? • Size • Speed • [Image: a laptop, for scale]
Size… • Many problems that are interesting to scientists and engineers can’t fit on a PC • usually because they need more than a few GB of RAM, or more than a few hundred GB of disk.
Speed… • Many problems that are interesting to scientists and engineers would take a very long time to run on a PC: months or even years. • But a problem that would take 1 month on a PC might take only a few hours on a supercomputer.
What can Supercomputing be used for? • Data Mining • Modeling • Simulation • Visualization [1]
What is a Supercomputer? • A cluster of small computers, each called a node, hooked together by an interconnection network (interconnect for short). • A cluster needs software that allows the nodes to communicate across the interconnect. • But what a cluster really is, is all of these components working together as if they’re one big computer… a super computer.
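To make this concrete, here is a minimal sketch of cluster software in action: an MPI "hello world" in C. MPI (the Message Passing Interface) is the standard library that lets a cluster's nodes talk across the interconnect. The file name and process count below are illustrative.

/* hello_mpi.c: every process (one or more per node) reports in.
   Compile: mpicc hello_mpi.c -o hello_mpi
   Run:     mpirun -np 4 ./hello_mpi                           */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I?       */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many processes total? */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                        /* shut the runtime down     */
    return 0;
}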
For example: Dell Intel Xeon Linux Cluster • 1,076 Intel Xeon CPU chips / 4,288 cores • 8,800 GB RAM • ~130 TB globally accessible disk • QLogic InfiniBand interconnect • Force10 Networks Gigabit Ethernet • Red Hat Enterprise Linux 5 • Peak speed: 34.5 TFLOPs* • *TFLOPs: trillion floating point operations (calculations) per second • sooner.oscer.ou.edu
Quantifying a Supercomputer • Number of cores • Your workstation (4?) • ISU cluster (800) • Blue Waters (300,000) • TFLOPs
How a cluster works together: Parallelism
Parallelism • Parallelism means doing multiple things at the same time • [Image: one person fishing catches less fish… many people fishing catch more fish!]
Understanding Parallel Processing The Jigsaw Puzzle Analogy
Serial Computing • We are very accustomed to serial processing. It can be compared to building a jigsaw puzzle by yourself. • In other words, suppose you want to complete a jigsaw puzzle that has 1,000 pieces. • We can agree this will take a certain amount of time… let’s just say, one hour.
Shared Memory Parallelism • If Scott sits across the table from you, then he can work on his half of the puzzle and you can work on yours. • Once in a while, you’ll both reach into the pile of pieces at the same time (you’ll contend for the same resource), which will cause you to slow down. • And from time to time you’ll have to work together (communicate) at the interface between his half and yours. The speedup will be nearly 2-to-1: together it will take about 35 minutes instead of 60.
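In C, shared memory parallelism is often expressed with OpenMP threads. A minimal sketch of the shared-pile idea (the array, its size, and the file name are invented for illustration):

/* omp_sum.c: several threads share one array in memory, like
   several people sharing one pile of puzzle pieces.
   Compile: gcc -fopenmp omp_sum.c -o omp_sum                  */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;                        /* fill the shared array */

    /* Each thread sums its own chunk; OpenMP combines the partial
       sums, the "communication at the interface" of the analogy. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.0f using up to %d threads\n",
           sum, omp_get_max_threads());
    return 0;
}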
The More the Merrier? • Now let’s put Paul and Charlie on the other two sides of the table. • Each of you can work on a part of the puzzle, but there’ll be a lot more contention for the shared resource (the pile of puzzle pieces) and a lot more communication at the interfaces. • So you will achieve noticeably less than a 4-to-1 speedup. • But you’ll still have an improvement, maybe something like 20 minutes instead of an hour.
Diminishing Returns • If we now put Dave, Tom, Horst, and Brandon at the corners of the table, there’s going to be much more contention for the shared resource, and a lot of communication at the many interfaces. • The speedup will be much less than we’d like; you’ll be lucky to get 5-to-1. • We can see that adding more and more workers onto a shared resource eventually yields diminishing returns.
Amdahl’s Law • [Chart: CPU utilization and speedup versus number of processors] • Source: http://codeidol.com/java/java-concurrency/Performance-and-Scalability/Amdahls-Law/
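In formula form, Amdahl’s Law says that if a fraction P of a program can be parallelized, the speedup on N processors is at most 1 / ((1 - P) + P / N). A small C sketch (the value P = 0.95 is an assumed example, not from the original slides) shows the diminishing returns numerically:

/* amdahl.c: speedup(N) = 1 / ((1 - P) + P/N)
   Even with 95% of the work parallelizable, the speedup flattens
   out as processors are added (the limit here is 20-to-1).    */
#include <stdio.h>

int main(void)
{
    const double P = 0.95;                 /* parallelizable fraction */
    for (int n = 1; n <= 64; n *= 2) {
        double speedup = 1.0 / ((1.0 - P) + P / n);
        printf("N = %2d  speedup = %5.2f\n", n, speedup);
    }
    return 0;
}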
Distributed Parallelism • Let’s try something a little different. • Let’s set up two tables. • You will sit at one table and Scott at the other. • We will put half of the puzzle pieces on your table and the other half of the pieces on Scott’s. • Now you can work completely independently, without any contention for a shared resource. • BUT, the cost per communication is MUCH higher, and you need the ability to split up (decompose) the puzzle correctly, which can be tricky.
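A hedged sketch of the two-table idea in MPI: each process owns its own private half of the pieces, and the halves only meet through an explicit, relatively expensive message. The piece counts and file name are invented.

/* two_tables.c: two processes each assemble a private half, then
   combine results with one message across the interconnect.
   Compile: mpicc two_tables.c -o two_tables
   Run:     mpirun -np 2 ./two_tables                          */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int my_pieces = 500;                   /* my half of the puzzle */

    if (rank == 1) {
        /* Scott's table: report his finished half to process 0. */
        MPI_Send(&my_pieces, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        int other = 0;
        MPI_Recv(&other, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Puzzle done: %d pieces total\n", my_pieces + other);
    }

    MPI_Finalize();
    return 0;
}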
More Distributed Processors • It’s easy to add more processors in distributed parallelism. • But you must be aware of the need to: • decompose the problem and • communicate among the processors. • Also, as you add more processors, it may be harder to load balance the amount of work that each processor gets.
FYI…Kinds of Parallelism • Instruction Level Parallelism • Shared Memory Multithreading • Distributed Memory Multiprocessing • GPU Parallelism • Hybrid Parallelism (Shared + Distributed + GPU)
Why Parallelism Is Good • The Trees: We like parallelism because, as the number of processing units working on a problem grows, we can solve the same problem in less time. • The Forest: We like parallelism because, as the number of processing units working on a problem grows, we can solve bigger problems.
Jargon • Threads are execution sequences that share a single memory area • Processes are execution sequences with their own independent, private memory areas • Multithreading: parallelism via multiple threads • Multiprocessing: parallelism via multiple processes • Shared Memory Parallelism is concerned with threads • Distributed Parallelism is concerned with processes.
Basic Strategies • Data Parallelism: Each processor does exactly the same tasks on its unique subset of the data • jigsaw puzzles or big datasets that need to be processed now! • Task Parallelism: Each processor does different tasks on exactly the same set of data • which algorithm is best?
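To make the contrast concrete, here is a toy C/OpenMP sketch (the data values are invented): the first loop is data parallelism, the two sections are task parallelism.

/* strategies.c: data vs. task parallelism in miniature.
   Compile: gcc -fopenmp strategies.c -o strategies            */
#include <omp.h>
#include <stdio.h>

#define N 8

int main(void)
{
    int data[N] = {3, 1, 4, 1, 5, 9, 2, 6};

    /* Data parallelism: every thread does the SAME task
       (doubling) on its own subset of the data.          */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        data[i] *= 2;

    /* Task parallelism: DIFFERENT tasks (sum vs. max)
       run at the same time on the SAME data.             */
    int sum = 0, max = 0;
    #pragma omp parallel sections
    {
        #pragma omp section
        { for (int i = 0; i < N; i++) sum += data[i]; }
        #pragma omp section
        { for (int i = 0; i < N; i++) if (data[i] > max) max = data[i]; }
    }
    printf("sum = %d, max = %d\n", sum, max);
    return 0;
}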
An Example: Embarrassingly Parallel • An application is known as embarrassingly parallel if its parallel implementation: • Can straightforwardly be broken up into equal amounts of work per processor, AND • Has minimal parallel overhead (i.e., communication among processors) FYI…Embarrassingly parallel applications are also known as loosely coupled.
Monte Carlo Methods • Monte Carlo methods are ways of simulating or calculating actual phenomena based on randomness, within known error limits. • In GIS, we use Monte Carlo simulations to calculate error propagation effects • How? • Monte Carlo simulations are typically embarrassingly parallel applications.
Monte Carlo Methods • In a Monte Carlo method, you randomly generate a large number of example cases (realizations), and then compare the results of these realizations. • When the average of the realizations converges (that is, your answer doesn’t change substantially when new realizations are generated), the Monte Carlo simulation can stop.
Embarrassingly Parallel • Monte Carlo simulations are embarrassingly parallel, because each realization is independent of all other realizations
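As a hedged illustration (estimating pi stands in for an error propagation run, which the slides don't spell out), each realization below is a random dart throw, fully independent of the others, so the loop parallelizes with essentially no communication. rand_r is a POSIX function, available on typical Linux clusters.

/* mc_pi.c: embarrassingly parallel Monte Carlo estimate of pi.
   Each realization is independent; the only communication is
   the final reduction of the hit counts.
   Compile: gcc -fopenmp mc_pi.c -o mc_pi                      */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const long trials = 10000000;
    long hits = 0;

    #pragma omp parallel reduction(+:hits)
    {
        /* each thread gets its own private random seed */
        unsigned int seed = 1234u + (unsigned int)omp_get_thread_num();
        #pragma omp for
        for (long i = 0; i < trials; i++) {
            double x = rand_r(&seed) / (double)RAND_MAX;
            double y = rand_r(&seed) / (double)RAND_MAX;
            if (x * x + y * y <= 1.0)
                hits++;                    /* dart landed in the circle */
        }
    }
    printf("pi is approximately %f\n", 4.0 * (double)hits / trials);
    return 0;
}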
A Quiz… • Q: Is this an example of Data Parallelism or Task Parallelism? • A: Task Parallelism: Each processor does different tasks on exactly the same set of data
A bit more to know… What is a GPGPU? (Or: thank you, gaming industry) • GPGPU stands for general-purpose computing on graphics processing units.
It’s an Accelerator • [Image: no, not this kind of accelerator…]
Accelerators • In HPC, an accelerator is hardware whose role is to speed up some aspect of the computing workload. • In the olden days (the 1980s), PCs sometimes had floating point accelerators (a.k.a. the math coprocessor)
Why Accelerators are Good • They make your code run faster.
Why Accelerators are Bad Because: • They’re expensive (or they were) • They’re harder to program (NVIDIA CUDA) • Your code may not be portable to other accelerators, so the labor you invest in programming may have a very short life.
The King of the Accelerators The undisputed king of accelerators is the graphics processing unit (GPU).
Why GPU? • Graphics Processing Units (GPUs) were originally designed to accelerate graphics tasks like image rendering for gaming. • They became very popular with gamers because they produced better and better images at lightning-fast refresh rates. • As a result, prices have become extremely reasonable, ranging from three figures at the low end to four figures at the high end.
GPUs Do Arithmetic • GPUs render images • Rendering is done through floating point arithmetic • As it turns out, this is the same stuff people use supercomputing for!
Interested? Curious? • To learn more, or to get involved with supercomputing, a host of opportunities awaits you • Get to know your Campus Champions • Ask about internships (BWUPEP) • Learn C (not C++, but C) or Fortran • Learn UNIX