This paper explores the use of abstractions for efficient data-intensive computing on the Condor system. It discusses various approaches for data and task management, abstraction patterns, and implementation examples for common problems such as the All-Pairs problem and Genome Assembly. The paper also addresses performance measurement, job submission, and result handling.
Abstractions for Data-Intensive Computing on Condor
Christopher Moretti, University of Notre Dame
I want to complete big workloads
• 3.6B Hamming distance computations, 0.02 seconds each on a 2GHz dual-core desktop computer, each creating 1 real-number output: 2.3 CPU-years and 29GB of output.
• 85M 1000x1000 dynamic programming tables, 0.04 seconds each on a 3GHz quad-core Xeon server: 39 CPU-days, requiring a total of 77GB of input data.
• A 500x500 recurrence matrix, with recurrence functions taking 7 seconds each on the desktop: 22 CPU-days, not completely independently parallelizable.
… Can these be run this week? How about this afternoon? Can we even turn these into “lunchtime” or “coffee refill” problems?
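The CPU-time figures above follow from multiplying task count by per-task time. The sketch below redoes that arithmetic for the first two workloads; the 8-byte output size is taken from the All-Pairs slide later in the deck, and the helper name `cpu_seconds` is hypothetical.

```python
# Back-of-the-envelope check of the workload sizes quoted on the slide.
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def cpu_seconds(num_tasks, secs_per_task):
    """Total serial CPU time for num_tasks independent tasks."""
    return num_tasks * secs_per_task


# 3.6B Hamming distance computations at 0.02 s each, 8 bytes of output each.
hamming_cpu_years = cpu_seconds(3_600_000_000, 0.02) / SECONDS_PER_YEAR
hamming_output_gb = 3_600_000_000 * 8 / 1e9

# 85M dynamic programming tables at 0.04 s each.
dp_cpu_days = cpu_seconds(85_000_000, 0.04) / SECONDS_PER_DAY

print(f"Hamming: {hamming_cpu_years:.1f} CPU-years, {hamming_output_gb:.0f} GB output")
print(f"Dynamic programming: {dp_cpu_days:.0f} CPU-days")
```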
How?
• On my workstation.
  • Write my program, making sure it is partitionable because it takes a really long time and might crash, then debug it. Now run it for 39 days to 2.3 years.
• On my department’s 128-node research cluster.
  • Learn MPI, determine how I want to move many GBs of data around, rewrite and re-debug my program, then wait until the cluster can give me 8-128 homogeneous nodes at once, or go buy my own. Now run it.
• On a BlueGene.
  • Get $$$ or access, learn a custom MPI-like language for computation and communication, determine how I want to handle communication and data movement, rewrite my program, wait for configuration or access, re-debug my program, and re-run.
So?
• Serially
• Cluster
• Supercomputer
• So I can either run my program as-is and wait forever, or write a custom implementation for one particular architecture and then rewrite and re-debug it every time we upgrade (assuming I’m lucky enough to have a BlueGene in the first place)?
• Well, what about Condor?
Yes, what about Condor?
• What is Condor?
• Which resources? How many?
• What happens when things fail?
• How do I fit my workload into jobs?
• How long will it take?
• What about job input data?
• How can I measure job stats?
• What do I do with the results?
Abstractions Fill in the Gap
• Here is my function: F(x,y) (binary function F)
• Here is a folder of files: set S (set S of files)
• Here is a static library I need: lib.a
[Figure: the same function F is replicated across five targets, shown in order as 1CPU, Multicore, Cluster, Condor, and Supercomputer. The launch mechanisms shown beneath them, in order, are exec, exec, rfork, condor_submit, and submit.]
What is an abstraction?
• Abstraction: a declarative specification of the computation and data of a workload.
• A restricted pattern, not meant to be a general-purpose programming language.
• Uses data structures instead of files.
• Regular structure makes it tractable to model and predict performance.
• Allows a user to repeat the same pattern of work many times, making slight changes to the data and algorithms.
Abstractions: Example Approaches
• Data management:
  • Distribute data only to nodes where it is necessary for computation.
  • Distribute broadcast data via efficient algorithms.
  • Access data in a memory-efficient order.
  • Use data structures instead of flat files.
• Task management:
  • Assign appropriate amounts of data per discrete task.
  • Adapt to the environment by choosing nodes showing good performance.
  • Submit and manage tasks in a way that does not overwhelm the batch system.
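One way to read "assign appropriate amounts of data per discrete task" is to group many short function calls into blocks that each run for a chosen target time. The sketch below is only an illustration under that assumption: the function `block_tasks`, the use of square blocks, and the target-time parameter are all hypothetical and are not taken from the talk.

```python
import math


def block_tasks(n_rows, n_cols, secs_per_call, target_task_secs):
    """Split an n_rows x n_cols result matrix into roughly square blocks.

    Each block is one batch task sized so that running every call in it
    takes about target_task_secs. Returns (row_start, row_end, col_start,
    col_end) tuples with exclusive end indices.
    """
    calls_per_task = max(1, round(target_task_secs / secs_per_call))
    side = max(1, math.isqrt(calls_per_task))
    return [
        (r, min(r + side, n_rows), c, min(c + side, n_cols))
        for r in range(0, n_rows, side)
        for c in range(0, n_cols, side)
    ]


# Toy example: a 10x10 matrix, 0.02 s per call, about 0.5 s of work per task.
tasks = block_tasks(10, 10, 0.02, 0.5)
print(len(tasks), "tasks; first block:", tasks[0])
```

Larger blocks mean fewer jobs for the batch system to track, at the cost of more work lost when a single task fails.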
The All-Pairs Problem
All-Pairs( Set S1, Set S2, Function F ) yields a matrix M: M[i,j] = F(S1[i], S2[j])
• 60K images of 20KB each: >1GB of input
• 3.6B comparisons @ 50/s = 2.3 CPU-years
• 3.6B results x 8B output = 29GB
All-Pairs Abstraction
• binary function F
• set S of files
• invocation: M = AllPairs(F,S)
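As a point of reference, the All-Pairs pattern can be written serially in a few lines. This is a minimal single-machine sketch, not the distributed Condor implementation the talk describes; the names `all_pairs` and `hamming` are illustrative, and strings stand in for the files in set S.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def all_pairs(s1: Sequence[A], s2: Sequence[B], f: Callable[[A, B], R]) -> List[List[R]]:
    """Return the matrix M with M[i][j] = f(s1[i], s2[j])."""
    return [[f(a, b) for b in s2] for a in s1]


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Toy data standing in for the set of files S.
matrix = all_pairs(["0000", "0011"], ["0000", "1111"], hamming)
print(matrix)
```

Because every cell depends only on its own pair of inputs, the cells can be computed in any order, which is what lets an abstraction partition the matrix across many machines.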
Wavefront Recurrence ( R[x,0], R[0,y], F(x,y,d) )
[Figure: a 5x5 grid of cells R[0,0] through R[4,4]. The boundary cells R[x,0] and R[0,y] are given; each remaining cell is computed by F from three inputs labeled x, y, and d.]
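The recurrence can be sketched serially as follows. This assumes that the three inputs x, y, and d of F are the neighboring cells R[x-1,y], R[x,y-1], and R[x-1,y-1]; the slide does not spell out that mapping, so treat it as an assumption. The function name `wavefront` and its parameter names are illustrative.

```python
def wavefront(bottom_edge, left_edge, f):
    """Fill an n x n table R from its boundary values.

    bottom_edge[x] supplies R[x][0] and left_edge[y] supplies R[0][y].
    Every other cell is R[x][y] = f(R[x-1][y], R[x][y-1], R[x-1][y-1]),
    which is an assumed reading of the slide's F(x, y, d).
    """
    n = len(bottom_edge)
    if len(left_edge) != n or (n and bottom_edge[0] != left_edge[0]):
        raise ValueError("edges must have equal length and agree at R[0][0]")
    r = [[None] * n for _ in range(n)]
    for x in range(n):
        r[x][0] = bottom_edge[x]
    for y in range(n):
        r[0][y] = left_edge[y]
    # Sweep anti-diagonals x + y = k. Cells on one anti-diagonal depend only
    # on earlier anti-diagonals, so they could run in parallel.
    for k in range(2, 2 * n - 1):
        for x in range(max(1, k - (n - 1)), min(n - 1, k - 1) + 1):
            y = k - x
            r[x][y] = f(r[x - 1][y], r[x][y - 1], r[x - 1][y - 1])
    return r


# Toy recurrence: sum of the three inputs, with every boundary cell set to 1.
table = wavefront([1, 1, 1], [1, 1, 1], lambda x, y, d: x + y + d)
print(table)
```

The anti-diagonal sweep shows why the workload is "not completely independently parallelizable": only the cells along the current wavefront are ready at any moment.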
Implementing Wavefront
[Figure: a Master exchanges Input and Output with Workers that each run F; arrows are labeled Input, Output, and Complete.]
Genome Assembly
• Bioinformatics sequencers can only extract DNA from samples 50-1000 base pairs (A,C,G,T) at a time.
• Biologists need the DNA together in genome profiles of millions of contiguous base pairs.
• Genome assembly is the process of putting the pieces of the puzzle back together again in the right configuration. A principal step is “overlapping”.
• One way would be a huge All-Pairs problem, but this isn’t necessary. Algorithms exist to extract a sparse matrix of possible candidate pairs (two sequences that might overlap in the right answer), so we need only compute the overlaps for the candidates.
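WorkQueue
[Figure: a Master holds the Input Sequence Data and the Candidate (Work) List (Seq1 Seq2; Seq1 Seq3; Seq2 Seq3; Seq4 Seq5) and exchanges data with a Worker. Labels on the exchange: the task “Align” with input data “>Seq1\nATG*CTAG\n…”; sequence text “>Seq1 ATG*CTAG >Seq2 A*G*CTGA …”; Output: Alignment Results (raw format).]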
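The candidate-driven pattern can be sketched serially as a function applied only to the listed pairs. This is not the Work Queue API or the real alignment program: `overlap` and `suffix_prefix_overlap` are hypothetical names, and the exact suffix-prefix match is a simple stand-in for the alignment function F.

```python
def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b.

    A toy stand-in for a real alignment function.
    """
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def overlap(f, candidates, sequences):
    """Apply f only to the candidate pairs, not to every possible pair."""
    return {(a, b): f(sequences[a], sequences[b]) for a, b in candidates}


# Toy sequences and candidate list; the names mirror the slide's Seq1, Seq2, ...
sequences = {"Seq1": "ATGCTAG", "Seq2": "CTAGGA", "Seq3": "TTTT"}
candidates = [("Seq1", "Seq2"), ("Seq1", "Seq3")]
results = overlap(suffix_prefix_overlap, candidates, sequences)
print(results)
```

With a sparse candidate list, the work grows with the number of candidates rather than with the square of the number of sequences.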
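Purpose of a Suite of Abstractions
• Engineering: Big systems to solve cool problems.
• Science:
  • What are common elements of abstractions?
  • What is an intuitive interface for users to adapt their existing serial solutions to larger problems?
[Figure: each abstraction shown as data, computation, and invocation.]
• All-Pairs: data = Set S; computation = F; invocation: M = AllPairs(F,S)
• Wavefront: data = Initial State; computation = F; invocation: R = Wavefront(F,R)
• Overlap: data = Cands, Seqs; computation = F; invocation: O = Overlap(F,C,S)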
Challenges Remaining with Abstractions
• Exploiting a regular pattern doesn’t mean that nothing can go wrong …
• It’s not just domain scientists who can make mistakes that lead to disastrous consequences.
• Sometimes good solutions to problems beget new and interesting problems.
• A motivating example to finish …
[Figure sequence: a master and four starters; later frames add a shadow and a schedd. Submit file shown: Universe=vanilla … TransferFiles=always …]
• Master: “Our tasks are done, we don’t need workers anymore!” condor_rm; exit();
• Starters: “Okay, I’ll send back the data the workers generated.”
• “Here’s the data the worker generated!”
• “Here’s the data the worker generated!” “ENOSPACE? What?”
• “Ack! We didn’t want those files back, they were temporary. ‘TransferFiles=Always’ was a mistake! Now we’re out of space!”
• “Hrm, I can’t transfer back their files. I guess I’ll hang out and won’t remove the jobs until I can.”
• “I’m waiting to finish my condor_rm until I can transfer files back to the submitting node.” “We’re waiting to clean up local state until you kill your jobs.”
• “ARGH!”
Moral of the Story
• Abstractions are a way to give domain scientists tools that don’t require drastically changing their already-complete solutions, but still allow for efficient HPC/HTC.
• Exploiting a regular pattern doesn’t mean that nothing can go wrong.
• Interesting challenges remain, among these:
  • Predicting performance on an unpredictable system
  • Monitoring and adapting to fit a changing system
  • Dealing with entangling relationships
    • e.g. Remote state requires local state
  • Tying together the lessons learned from these abstractions
    • What do they have in common?
    • Why is one solution right for one problem and wrong for another?
For More Information
• Christopher Moretti: cmoretti@cse.nd.edu
• Douglas Thain: dthain@cse.nd.edu
• Cooperative Computing Lab: http://cse.nd.edu/~ccl