
Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison

IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 2, February 2009. Menno Dobber, Student Member, IEEE, Rob van der Mei, and Ger Koole. Presented by Chen, Ting-Wei.


Presentation Transcript


  1. Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison
     IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 2, February 2009
     Menno Dobber, Student Member, IEEE, Rob van der Mei, and Ger Koole
     Presented by Chen, Ting-Wei

  2. Index
     • Introduction
     • Preliminaries
     • Experimental Setup
     • Experimental Results
     • Conclusions

  3. Introduction (cont.)
     • Dynamics of grid environments
     • Dynamic Load Balancing (DLB)
     • Job Replication (JR)
     • Easy-to-measure statistic Y with a corresponding threshold value Y*
     • If Y > Y*, DLB outperforms JR
     • If Y < Y*, JR outperforms DLB
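The threshold rule on this slide can be sketched in a few lines. This is only an illustration: the slides do not define how Y is measured or what value Y* takes, so both are plain inputs here, and the function name `choose_strategy` is hypothetical.

```python
def choose_strategy(y: float, y_star: float) -> str:
    """Pick DLB when the statistic Y exceeds the threshold Y*, otherwise JR."""
    # The slides do not say what happens when Y == Y*;
    # this sketch arbitrarily falls back to JR in that case.
    return "DLB" if y > y_star else "JR"


print(choose_strategy(0.8, 0.5))  # DLB
print(choose_strategy(0.3, 0.5))  # JR
```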

  4. Introduction (cont.)
     • Easy-to-implement approach
     • Makes dynamic decisions about whether to use DLB or JR
     • Two types of investigation can verify the approach accurately
     • Trace-driven simulation
     • Real implementation

  5. Introduction (cont.)
     • Real implementation
     • To acquire more knowledge about DLB
     • By means of trace-driven simulations
     • Require detailed knowledge about the processes
     • Take less time
     • More extensive analyses can be performed

  6. Introduction (cont.)
     • Analyze and compare the effectiveness of ELB (equal load balancing), DLB, and JR
     • Using trace-driven simulations
     • With data gathered from a global-scale grid testbed

  7. Preliminaries (cont.)
     • Bulk Synchronous Processing (BSP)
     • The problem can be divided into subproblems, or jobs
     • I iterations, P jobs, P processors
     • Each processor receives one job per iteration
     • After computing their jobs, all processors send their data and wait for each other's data before the next iteration starts
     • The standard BSP program is implemented according to the ELB principle
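Because every processor waits for the others before the next iteration, the slowest processor sets the length of each iteration. A minimal sketch of that bookkeeping, assuming a trace of per-iteration job times is available; the function name `bsp_runtime` and the optional `sync_time` parameter are illustrative and not taken from the paper.

```python
def bsp_runtime(job_times, sync_time=0.0):
    """Runtime of a standard (ELB) BSP run.

    job_times[i][p] is the time processor p needs for its job in iteration i.
    sync_time is a hypothetical fixed cost for the data exchange per iteration.
    """
    total = 0.0
    for iteration in job_times:
        # All processors wait for the slowest one before the next iteration starts.
        total += max(iteration) + sync_time
    return total


trace = [[10.0, 12.0, 30.0],   # iteration 1: processor 3 is the bottleneck
         [11.0, 40.0, 9.0]]    # iteration 2: processor 2 is the bottleneck
print(bsp_runtime(trace))  # 70.0
```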

  8. Preliminaries (cont.)
     • Implementation of ELB

  9. Preliminaries (cont.)
     • Dynamic Load Balancing (DLB)
     • DLB starts by executing an iteration in the same way as standard BSP
     • At the end of each iteration, the processors predict their processing speed for the next iteration
     • One processor is selected to be the DLB scheduler
     • After every N iterations, the processors send their predictions to this scheduler

  10. Preliminaries (cont.)
     • The scheduler calculates the “optimal” load distribution
     • It sends the relevant information to each processor
     • All processors redistribute the load
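One common reading of the "optimal" distribution is to give each processor a share of the load proportional to its predicted speed, so that all processors are expected to finish together. The sketch below illustrates that reading only; the slides do not state the exact rule, and the names `redistribute` and `iteration_time` are hypothetical.

```python
def redistribute(total_load, predicted_speeds):
    """Split total_load in proportion to each processor's predicted speed."""
    total_speed = sum(predicted_speeds)
    return [total_load * s / total_speed for s in predicted_speeds]


def iteration_time(loads, actual_speeds):
    """The iteration ends when the slowest processor finishes its share."""
    return max(load / speed for load, speed in zip(loads, actual_speeds))


speeds = [1.0, 2.0, 1.0]                 # predicted == actual in this toy case
balanced = redistribute(120.0, speeds)   # [30.0, 60.0, 30.0]
equal = [40.0, 40.0, 40.0]               # ELB: same load for everyone

print(iteration_time(balanced, speeds))  # 30.0
print(iteration_time(equal, speeds))     # 40.0
```

When the predictions are accurate, the balanced split shortens the iteration; when they are off, the processor whose speed was overestimated becomes the bottleneck.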

  11. Preliminaries (cont.)
     • Implementation of DLB

  12. Preliminaries (cont.)
     • Job Replication (JR)
     • Two copies of a job
     • R copies of all P jobs are distributed to the P processors
     • When a processor has finished one of the copies, it sends a message to the other processors
     • The other processors can then kill the job and start the next job
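The effect of replication on a single job can be sketched as "the fastest copy wins". The example below assumes the kill message arrives instantly, which the slides do not claim; the function name `replicated_job` is hypothetical.

```python
def replicated_job(copy_times):
    """Outcome of running R copies of one job on R processors.

    copy_times[q] is the time processor q would need to finish its copy.
    Returns (winner index, effective job time, processor time spent on killed copies).
    """
    winner = min(range(len(copy_times)), key=copy_times.__getitem__)
    effective = copy_times[winner]
    # Assumption: the kill message is instantaneous, so every other copy
    # runs for exactly `effective` time units before it is stopped.
    wasted = effective * (len(copy_times) - 1)
    return winner, effective, wasted


print(replicated_job([50.0, 20.0, 35.0]))  # (1, 20.0, 40.0)
```

The job completes in 20 time units instead of 50, at the price of 40 time units of work that is thrown away.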

  13. Preliminaries (cont.)
     • Implementation of JR

  14. Experimental Setup (cont.)
     • Data-Collection Procedure

  15. Experimental Setup (cont.)
     • On a completely available Pentium 4, 3.0-GHz processor, the computations in a job would take 10,000 ms
     • Set one: average job time of 72,500 ms
       • Distributed within the USA
       • More coherence between the generated data sets
     • Set two: average job time of 65,000 ms
       • Shows more burstiness and larger differences between the average job times on the processors
       • Globally distributed

  16. Experimental Setup (cont.)
     • Trace-driven simulation analyses [parameter settings lost in extraction]

  17. Experimental Setup (cont.)
     • Simulation Details
     • Trace-driven DLB simulations
     • Assume a linear relation between the size of a job and its job time in BSP
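The linearity assumption lets a simulation reuse measured job times for jobs of a different size: twice the load on the same processor is taken to need twice the time. A small sketch, with the function name `scaled_job_time` and the example figures chosen only for illustration:

```python
def scaled_job_time(measured_time, standard_size, new_size):
    """Estimate the time for a job of new_size from a trace measured at standard_size.

    Relies on the assumed linear relation between job size and job time.
    """
    return measured_time * new_size / standard_size


# A job that took 72500 ms at the standard size is assumed to take
# half as long when the processor receives half the load.
print(scaled_job_time(72500.0, 1.0, 0.5))  # 36250.0
```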

  18. Experimental Setup (cont.)
     • DLB simulation
     • Randomly select a resource set
     • Apply the DES-based prediction
     • Derive the iteration time (IT)
     • Derive the runtime of the DLB run
     • Derive the expected runtime of a DLB run
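The steps above can be sketched as a small trace-driven loop. Several simplifications are assumptions of this sketch, not statements from the paper: plain exponential smoothing with a fixed weight `alpha` stands in for the DES-based predictor (which the slides do not spell out), the load is rebalanced every iteration at no cost, and the load split is proportional to predicted speed. All function names are hypothetical.

```python
import random


def smooth(prev_prediction, observation, alpha=0.5):
    """Plain exponential smoothing; a stand-in for the DES-based predictor."""
    return alpha * observation + (1 - alpha) * prev_prediction


def simulate_dlb_run(traces, alpha=0.5):
    """traces[p][i] = measured time of a standard-size job on processor p in iteration i."""
    p = len(traces)
    predictions = [trace[0] for trace in traces]  # bootstrap with the first measurement
    runtime = 0.0
    for i in range(len(traces[0])):
        # Share of the total load per processor, proportional to predicted speed.
        speeds = [1.0 / t for t in predictions]
        total_speed = sum(speeds)
        fractions = [s / total_speed for s in speeds]
        # Linear job-time assumption: a fraction f of the total load
        # (P standard jobs) takes f * P * (standard job time).
        runtime += max(f * p * traces[k][i] for k, f in enumerate(fractions))
        predictions = [smooth(pred, traces[k][i], alpha)
                       for k, pred in enumerate(predictions)]
    return runtime


def expected_dlb_runtime(all_traces, p, runs=100, seed=0):
    """Average runtime over randomly selected resource sets of p processors."""
    rng = random.Random(seed)
    total = sum(simulate_dlb_run(rng.sample(all_traces, p)) for _ in range(runs))
    return total / runs


# Two processors with steady job times of 10 and 20: ELB would need 20 per iteration.
print(simulate_dlb_run([[10.0, 10.0], [20.0, 20.0]]))
```

In this toy case the balanced split brings each iteration down from 20 to roughly 13.3 time units.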

  19. Experimental Setup (cont.)
     • JR simulation
     • Step one is the same as in the DLB simulation
     • Divide the set of processors into execution groups
     • Derive the effective job times for all P processors
     • Derive the IT by repeating step two R times
     • Derive the runtime of the R-JR run by repeating step three
     • Derive the expected runtime of an R-JR run on P processors
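A sketch of how such an R-JR simulation might be organized. It assumes one particular reading of the steps: the P processors are split into groups of R, each group works through R jobs per iteration, every job is run by all R processors in the group, and the fastest copy determines the effective job time. Indexing the traces by job position is a further simplification. The function name `simulate_jr_run` is hypothetical.

```python
def simulate_jr_run(traces, r):
    """traces[p][k] = time processor p would need for the k-th job it executes."""
    p = len(traces)
    if p % r != 0:
        raise ValueError("P must be divisible by R in this sketch")
    # Execution groups of R consecutive processors.
    groups = [range(g, g + r) for g in range(0, p, r)]
    iterations = len(traces[0]) // r
    runtime = 0.0
    for i in range(iterations):
        group_times = []
        for group in groups:
            # Each group handles R jobs in sequence; the fastest copy wins each time.
            t = sum(min(traces[q][i * r + j] for q in group) for j in range(r))
            group_times.append(t)
        # The iteration ends when the slowest group is done.
        runtime += max(group_times)
    return runtime


traces = [[10.0, 30.0], [20.0, 5.0], [8.0, 8.0], [9.0, 9.0]]
print(simulate_jr_run(traces, 2))  # 16.0
print(simulate_jr_run(traces, 1))  # 50.0
```

With `r=1` every processor forms its own group, so the sketch reduces to the ELB case; averaging `simulate_jr_run` over many randomly chosen resource sets would give an estimate of the expected runtime.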

  20. Experimental Setup (cont.)
     • Dynamic Selection Method
     • Analysis

  21. Experimental Results (cont.)
     • Simulate the runtimes of DLB for different numbers of processors with sets one and two
     • Simulate runs of BSP parallel applications that use JR, and analyze the expected speedups for different numbers of processors, replicas, data sets, and CCR values

  22. Experimental Results (cont.)
     • Compare the runtimes and speedups of ELB, DLB, and JR
     • Simulate the speedups of the proposed selection method

  23. Experimental Results (cont.)
     • DLB

  24. Experimental Results (cont.)
     • Job Replication

  25. Experimental Results (cont.)
     • Comparison of ELB, DLB, and JR
     • Runtimes of DLB and JR with CCR 0.01

  26. Experimental Results (cont.)
     • Speedups of DLB and JR with sets of 40 and 90 data sets with CCR 0.01

  27. Experimental Results (cont.)
     • Statistic Y against ITs of DLB and JR

  28. Experimental Results (cont.)
     • Speedup of the selection method, DLB, and JR

  29. Conclusions
     • Made an extensive assessment and comparison of DLB and JR
     • If Y > Y*, DLB outperforms JR
     • If Y < Y*, JR outperforms DLB
     • Proposed the so-called DLB/JR method

  30. Outlook
     • Bring the results to a higher level of realism
     • Make use of mathematical techniques to provide a more solid foundation
     • Determine the optimal number of job replicas needed to obtain the best speedup

  31. Thanks for your attention
