Co-Allocation Efficiency in Multicluster Systems - DAS-2 Workshop
Explore how to optimize job performance in multicluster systems, focusing on co-allocation strategies for better resource utilization and response time. Case study on Cactus Computational Toolkit usage. Study job structure, component sizes, communication, and scheduling policies for enhanced system performance evaluation.
Presentation Transcript
Processor Co-Allocation in Multicluster Systems DAS-2 Workshop Amsterdam June 6, 2002 Anca Bucur and Dick Epema Parallel and Distributed Systems Group Delft University of Technology D.H.J. Epema/PDS/TUD
Introduction (1)
• In multicluster systems (like the DAS, in GRIDs), jobs may use co-allocation (i.e., span multiple clusters):
  • to use available capacity
  • to process geographically spread data
• Single-application performance issues:
  • application restructuring
  • wide-area runtime systems (e.g., optimize collective communication operations)
• Multiple-application performance issues:
  • design/analyze scheduling policies
  • minimize response time, maximize maximal utilization
Introduction (2): Example
• In April 2001, the Cactus Computational Toolkit was used for four-hour astrophysics simulations involving Einstein’s General Relativity equations
• Equipment:
  • At NCSA: 480 CPUs of three SGI Origin2000 systems
  • At SDSC: 1020 CPUs of Blue Horizon
  • OC-12 622-Mbit/s network
Introduction (3): Problems
[Figure: idle processors over time in clusters 1, 2, and 3, together with a multi-component job; annotations mark where the job fits if its request is flexible and where it fits if its request is unordered.]
System Model
• Multicluster system consisting of clusters of processors of equal speed
• Communication speed ratio: the ratio of the wide-area to the local message transfer time
Job Components
• A job consists of job components that each go to a single cluster, one task per processor
• Distributions of job-component sizes:
  • Uniform: U[a,b]
  • Truncated and adapted geometric (favors small sizes and powers of 2): D(q) on [1,b]
[Figure: a job and the system onto which its components are placed.]
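The two size distributions can be sketched as samplers. The slide does not give the formula for D(q), so the shape below is an assumption: size i gets weight q^i, multiplied by a hypothetical boost factor of 3 when i is a power of 2, and the weights are normalized over [1,b]. The function names are illustrative only.

```python
import random

def d_q_probabilities(q, b, power_of_two_boost=3.0):
    """Return P(size = i) for i = 1..b under an ASSUMED shape for D(q).

    Assumption (not stated on the slide): weight q**i, boosted by a
    constant factor when i is a power of 2, then normalized.
    """
    def is_power_of_two(i):
        return i & (i - 1) == 0  # true for 1, 2, 4, 8, ...

    weights = [
        (power_of_two_boost if is_power_of_two(i) else 1.0) * q ** i
        for i in range(1, b + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def sample_component_sizes(q, b, n_components, rng):
    """Draw n_components sizes from the assumed D(q) on [1, b]."""
    probs = d_q_probabilities(q, b)
    return rng.choices(range(1, b + 1), weights=probs, k=n_components)

def sample_uniform_sizes(a, b, n_components, rng):
    """Draw n_components sizes from the discrete uniform U[a, b]."""
    return [rng.randint(a, b) for _ in range(n_components)]
```

With 0 < q < 1 the weights decay with size, which gives the preference for small sizes; the boost gives the preference for powers of 2.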
Job Request Types (1)
• Ordered and unordered requests specify their job-component sizes
[Figure: an ordered request and an unordered request; in the unordered case the target clusters are marked "?".]
Job Request Types (2)
• Flexible and total requests only specify the total number of processors needed
[Figure: a flexible request and a total request.]
Fitting a Job (1)
• It is clear when an ordered or a total request fits
• For an unordered request:
  • order components according to decreasing sizes
  • use First-Fit (FF) or Worst-Fit (WF)
[Figure: Worst-Fit (WF) placement of a job on the system, showing idle and in-use processors.]
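The unordered-request procedure can be sketched as follows. This is a minimal illustration, assuming that each component must go to a different cluster and that WF means "the candidate cluster with the most idle processors"; the function name and the return format are hypothetical.

```python
def place_unordered(component_sizes, idle, policy="WF"):
    """Try to place an unordered request.

    component_sizes: sizes of the job components.
    idle: number of idle processors per cluster.
    Returns {cluster_index: component_size}, or None if the job does not fit.
    Assumption: at most one component of a job per cluster.
    """
    if policy not in ("FF", "WF"):
        raise ValueError("policy must be 'FF' or 'WF'")
    free = list(idle)
    placement = {}
    # Largest components first, as on the slide.
    for size in sorted(component_sizes, reverse=True):
        candidates = [
            c for c in range(len(free))
            if c not in placement and free[c] >= size
        ]
        if not candidates:
            return None
        if policy == "FF":
            chosen = candidates[0]  # first cluster that fits
        else:
            chosen = max(candidates, key=lambda c: free[c])  # most idle
        placement[chosen] = size
        free[chosen] -= size
    return placement
```

For example, with idle counts [5, 8, 6] and components [4, 2], FF yields {0: 4, 1: 2} while WF yields {1: 4, 2: 2}.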
Fitting a Job (2)
• For a flexible request:
  • determine minimal number of clusters needed
  • fill least-loaded clusters completely (CF), or balance load (LB) (variation: LB-A)
[Figure: CF and LB placements of a job, showing idle and in-use processors.]
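A flexible request could be placed along these lines. The slide does not define LB precisely, so this sketch assumes it means leaving the chosen clusters with idle counts that are as equal as possible; LB-A is not covered. The function name and return format are hypothetical.

```python
def place_flexible(total, idle, policy="CF"):
    """Split a flexible request of `total` processors over clusters.

    idle: number of idle processors per cluster.
    Returns {cluster_index: processors_allocated}, or None if it does not fit.
    Assumption: LB equalizes the idle processors left in the chosen clusters.
    """
    if policy not in ("CF", "LB"):
        raise ValueError("policy must be 'CF' or 'LB'")
    # Least-loaded clusters first gives the minimal number of clusters.
    order = sorted(range(len(idle)), key=lambda c: idle[c], reverse=True)
    chosen, capacity = [], 0
    for c in order:
        if capacity >= total:
            break
        chosen.append(c)
        capacity += idle[c]
    if capacity < total:
        return None
    alloc = {c: 0 for c in chosen}
    if policy == "CF":
        # Fill clusters completely; the last one takes the remainder.
        remaining = total
        for c in chosen:
            alloc[c] = min(idle[c], remaining)
            remaining -= alloc[c]
    else:
        # Hand out one processor at a time to the cluster with most idle left.
        for _ in range(total):
            c = max(chosen, key=lambda k: idle[k] - alloc[k])
            alloc[c] += 1
    return alloc
```

With idle counts [10, 4, 8] and a request for 15 processors, two clusters suffice; CF gives {0: 10, 2: 5} and LB gives {0: 9, 2: 6}.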
Scheduling Policies
• First Come First Served
• Fit Processors First Served: search queue for jobs that fit
[Figure: the job queue feeding the system.]
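The difference between the two policies shows up in how the queue is scanned. The sketch below uses ordered requests for simplicity, represented (as an assumption) by a tuple with one component size per cluster, 0 meaning the cluster is unused; the labels "FCFS" and "FPFS" and the function name are illustrative.

```python
def start_jobs(queue, idle, policy="FCFS"):
    """Start as many queued jobs as the policy allows.

    queue: list of ordered requests, head of the queue first.
    idle: idle processors per cluster.
    Both arguments are modified in place; the started jobs are returned.
    """
    if policy not in ("FCFS", "FPFS"):
        raise ValueError("policy must be 'FCFS' or 'FPFS'")

    def fits(job):
        return all(size <= free for size, free in zip(job, idle))

    started = []
    i = 0
    while i < len(queue):
        job = queue[i]
        if fits(job):
            for c, size in enumerate(job):
                idle[c] -= size
            started.append(queue.pop(i))
        elif policy == "FCFS":
            break  # the head of the queue blocks everything behind it
        else:
            i += 1  # FPFS: skip this job and keep searching
    return started
```

With queue [(4, 4), (1, 1), (2, 0)] and idle counts [3, 3], FCFS starts nothing because the head does not fit, whereas FPFS starts (1, 1) and (2, 0) and leaves (4, 4) waiting.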
Interarrival/Service Times
• Poisson arrival process in simulations
• All tasks in a job have the same service time
• Service-time distributions used:
  • Deterministic (mean 1)
  • Exponential (mean 1)
  • Hyperexponential (mean 1, coeff. of var. 3)
  • Derived from the DAS
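The first three service-time distributions can be sampled as sketched below. The slide does not say how the hyperexponential was parameterized; this sketch assumes a two-phase hyperexponential with balanced means, a common textbook choice that matches a given mean and coefficient of variation. The DAS-derived distribution is not reproduced. All function names are hypothetical.

```python
import math
import random

def h2_balanced_params(mean, cv):
    """Two-phase hyperexponential with balanced means (an assumption).

    Returns ((p1, rate1), (p2, rate2)); requires cv >= 1.
    """
    c2 = cv * cv
    p1 = 0.5 * (1.0 + math.sqrt((c2 - 1.0) / (c2 + 1.0)))
    p2 = 1.0 - p1
    return (p1, 2.0 * p1 / mean), (p2, 2.0 * p2 / mean)

def h2_moments(params):
    """Return (mean, coefficient of variation) of the mixture."""
    m1 = sum(p / rate for p, rate in params)
    m2 = sum(2.0 * p / rate ** 2 for p, rate in params)
    return m1, math.sqrt(m2 - m1 * m1) / m1

def sample_service_time(kind, rng, mean=1.0, cv=3.0):
    """Draw one service time; all tasks of a job would share this value."""
    if kind == "deterministic":
        return mean
    if kind == "exponential":
        return rng.expovariate(1.0 / mean)
    if kind == "hyperexponential":
        (p1, rate1), (_, rate2) = h2_balanced_params(mean, cv)
        rate = rate1 if rng.random() < p1 else rate2
        return rng.expovariate(rate)
    raise ValueError("unknown distribution: " + kind)
```

Calling h2_moments(h2_balanced_params(1.0, 3.0)) recovers a mean of 1 and a coefficient of variation of 3 up to rounding, which is a quick check that the parameters match the slide.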
Communication
• We model jobs without and with communication
• With communication:
  • tasks alternate between compute and communication phases
  • communication phase: all-to-all personalized communication
  • time for a single local synchronous message send operation: 0.001
  • communication speed ratios considered: 1-100
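To get a feel for how the speed ratio enters, here is a rough cost sketch for one communication phase. It is not the model from the talk: it assumes each task sends its messages one after another, that a message within a cluster costs the local send time (0.001 on the slide), that a message to another cluster costs the speed ratio times that, and that the phase ends when the slowest task finishes. The function name is hypothetical.

```python
LOCAL_SEND_TIME = 0.001  # local synchronous send time from the slide

def communication_phase_time(component_sizes, speed_ratio,
                             local_send_time=LOCAL_SEND_TIME):
    """Estimate the duration of one all-to-all personalized phase.

    component_sizes: number of tasks of the job in each cluster used.
    Assumption: sends are sequential per task, so a task in a component
    of size s sends (s - 1) local and (total - s) wide-area messages.
    """
    total = sum(component_sizes)
    per_task = [
        local_send_time * ((s - 1) + speed_ratio * (total - s))
        for s in component_sizes
        if s > 0
    ]
    return max(per_task)
```

Under these assumptions a 32-task job on a single cluster needs about 0.031 time units per phase, while the same job split as (8, 8, 8, 8) at ratio 10 needs about 0.247.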
Single-cluster DAS Statistics
[Figure: two histograms of the number of jobs, one against nodes requested and one against service time. Statistics shown: mean 23.34, coeff. of var. 1.11; mean 356.45 (62.66), coeff. of var. 5.37.]
Performance Evaluation
• Parameters we vary:
  • job request structure
  • job-component-size distribution
  • service-time distribution
  • number and sizes of clusters (base case: 4x32)
  • placement of unordered and flexible jobs
  • scheduling policy
  • communication speed ratio
  • co-allocation versus no co-allocation
  • queueing structure (global/local)
• Performance metrics:
  • mean response time (only simulation)
  • maximal utilization (analysis and simulation)
Influence of Structure and Size
[Figure: response time versus utilization for ordered, unordered, and total requests.]
Influence of Communication Speed Ratio
[Figure: response time versus utilization for communication speed ratios 10 and 100. Curves from right to left: total, flexible, unordered, ordered.]
Co-Allocation versus No Co-Allocation (1)
• no communication
• unordered jobs
• job size: 4xD(0.9) on [1,8] (fits on a single cluster)
[Figure: response time versus utilization for flexible requests and for jobs with 1, 2, and 4 components.]
Co-Allocation versus No Co-Allocation (2)
• communication
• flexible jobs
• job size: 4xD(0.9) on [1,8]
[Figure: response time versus utilization for LB-A with ratio 5, LB-A with ratio 50, and no co-allocation with FF.]
An Application on the DAS (1)
• Solves the Poisson equation with a red-black Gauss-Seidel scheme
• Measurements on the DAS (times in ms):
• Time for diffusing local errors and computing the global error: 14 ms
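For readers unfamiliar with the scheme, a serial red-black Gauss-Seidel sweep for the 2D Poisson equation on a uniform grid can be sketched as below. This is a generic textbook version, not the DAS application itself: the grid layout, the fixed boundary values, and the use of the largest pointwise change as the error measure are all assumptions, and the function names are hypothetical.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for laplace(u) = f, updating u in place.

    u: 2D list of values; boundary rows and columns are kept fixed.
    f: 2D list with the right-hand side, h: grid spacing.
    Returns the largest change made to any point (used as the error here).
    """
    n_rows, n_cols = len(u), len(u[0])
    max_change = 0.0
    for color in (0, 1):  # 0: "red" points, 1: "black" points
        for i in range(1, n_rows - 1):
            for j in range(1, n_cols - 1):
                if (i + j) % 2 != color:
                    continue
                new = 0.25 * (u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1]
                              - h * h * f[i][j])
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
    return max_change

def solve_poisson(u, f, h, tol=1e-8, max_iters=10000):
    """Sweep until the largest change drops below tol; return sweeps used."""
    for iteration in range(1, max_iters + 1):
        if red_black_sweep(u, f, h) < tol:
            return iteration
    return max_iters
```

Points of one color depend only on points of the other color, so all points of a color can be updated in parallel. In a distributed version each task would presumably own a subgrid and combine its local error with the others after each sweep, which is the kind of step the 14 ms measurement above refers to.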
An Application on the DAS (2)
[Figure: response time versus utilization for total and ordered requests. Equal mix of jobs of sizes (2,2,2,2) and (4,4,4,4).]
Maximal Utilization (1)
• Assume: constant backlog, ordered jobs, exponential service (no communication)
• Consider: the joint probability distribution of the sizes of jobs in the system
• Result: this distribution is the same
  • when the system runs for a long time
  • when the system is filled from the empty state
• Use the convolution of the job-size distribution to determine the distribution of the numbers of jobs in the system
• Compute the maximal utilization
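The slide describes an analytical route via convolutions. As a simulation stand-in under the same stated assumptions (constant backlog, ordered jobs, exponential service with mean 1, no communication), the maximal utilization can be estimated as sketched below. FCFS is assumed as the policy, and the function name and the sample_job callback are hypothetical.

```python
import heapq
import random

def simulate_max_utilization(n_clusters, cluster_size, sample_job,
                             n_departures, rng):
    """Estimate maximal utilization by simulation (not the analytical method).

    sample_job(rng) must return one positive component size per cluster,
    each no larger than cluster_size. The queue never empties (constant
    backlog), and the head job blocks until it fits (FCFS, an assumption).
    """
    idle = [cluster_size] * n_clusters
    running = []  # heap of (finish_time, sequence_number, job)
    now, busy_area, seq = 0.0, 0.0, 0
    head = sample_job(rng)
    for _ in range(n_departures):
        # Start jobs from the backlog for as long as the head job fits.
        while all(s <= free for s, free in zip(head, idle)):
            for c, s in enumerate(head):
                idle[c] -= s
            heapq.heappush(running, (now + rng.expovariate(1.0), seq, head))
            seq += 1
            head = sample_job(rng)
        # Advance time to the next departure and account for busy processors.
        finish, _seq, job = heapq.heappop(running)
        busy = n_clusters * cluster_size - sum(idle)
        busy_area += busy * (finish - now)
        now = finish
        for c, s in enumerate(job):
            idle[c] += s
    return busy_area / (n_clusters * cluster_size * now)
```

For the 4x32 base case with, say, component sizes drawn uniformly from [1,16], a call such as simulate_max_utilization(4, 32, lambda r: tuple(r.randint(1, 16) for _ in range(4)), 2000, random.Random(1)) returns a value between 0 and 1; one minus that value is the capacity loss.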
Maximal Utilization (2)
• We have an approximation for the maximal utilization for unordered jobs with WF
• We use simulations to validate this approximation
• Capacity loss (1-max. util.) for 4 clusters of size 32, uniform job-component sizes: