
Online Scheduling in Grids


Presentation Transcript


  1. Online Scheduling in Grids. 1 Uwe Schwiegelshohn, 2 Andrei Tchernykh, 1 Ramin Yahyapour. 1 Technische Universität Dortmund, Germany, uwe.schwiegelshohn@udo.edu, ramin.yahyapour@udo.edu. 2 CICESE Research Center, Ensenada, Baja California, Mexico, chernykh@cicese.mx. CIRM, Marseille-Luminy, May 12-16, 2008

  2. Computational Grid. [Figure: illustration of a computational Grid, by Christophe Jacquet]

  3. Grid Model • An all-encompassing and fully precise representation of a Grid is usually too complex for addressing the various problems that occur in Grids. • Instead, we apply a suitable model that captures the important properties of a Grid.

  4. Grid Model. The Grid contains m machines. Machine Mi has size mi if it comprises mi processors. All processors in the Grid are identical. • Each job Jj is described by a triple (rj, sizej, pj): • release date rj, • size sizej (degree of parallelism), • execution time pj on sizej processors. A job must be executed on sizej processors of a single machine, without interruption (space sharing mode). Scheduling on a single parallel machine, Pm | rj, sizej | Cmax, is referred to as PS, while scheduling on a set of parallel machines, GPm | rj, sizej | Cmax, is referred to as MPS; the offline variant without release dates is GPm | sizej | Cmax.
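To make the model concrete, here is a minimal sketch in Python; the names (`Machine`, `Job`, `can_run`) are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    index: int
    size: int            # m_i: number of identical processors

@dataclass
class Job:
    release_date: float  # r_j
    size: int            # size_j: degree of parallelism (rigid)
    proc_time: float     # p_j: execution time on size_j processors

def can_run(job: Job, machine: Machine) -> bool:
    # A job fits only if the machine has enough processors; it must run on
    # a single machine, without interruption (space sharing mode).
    return job.size <= machine.size

# Example: a Grid of machines with 2, 4, and 8 processors and a 3-processor job.
grid = [Machine(i, s) for i, s in enumerate([2, 4, 8], start=1)]
job = Job(release_date=0.0, size=3, proc_time=5.0)
print([m.index for m in grid if can_run(job, m)])  # -> [2, 3]
```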

  5. List Scheduling (non-clairvoyant scheduling). [Figure: list schedules, processors versus time]

  6. List Scheduling. [Figure: example with Cmax(LIST) = 17 versus the optimum Cmax* = 9]

  7. List Scheduling on Parallel Processors. Cmax(LIST)/Cmax* ≤ 2 − 1/m holds when • all jobs are sequential and have release date 0 (Graham 1966), • jobs have release date 0 and may be parallel (Garey, Graham 1975), • jobs are parallel and submitted over time, i.e., online scheduling (Naroska, Schwiegelshohn 2002). Does the same bound hold for Grids as well?
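As an illustration of the list scheduling rule behind these bounds, here is a small event-driven sketch for rigid jobs on one machine with m identical processors (the PS setting); the function name and the job encoding are ours, and every job is assumed to fit on the machine.

```python
import heapq

def list_schedule(jobs, m):
    # Greedy list scheduling: whenever a job is released or completes, scan
    # the queue in list order and start every job that currently fits.
    # jobs: list of (release, size, proc_time) with size <= m.
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])  # by release
    running = []                      # min-heap of (finish_time, size)
    queue, free, t, cmax, k = [], m, 0.0, 0.0, 0
    while k < len(order) or queue or running:
        nxt = []                      # next event: a release or a completion
        if k < len(order):
            nxt.append(jobs[order[k]][0])
        if running:
            nxt.append(running[0][0])
        t = max(t, min(nxt))
        while running and running[0][0] <= t:             # collect completions
            _, size = heapq.heappop(running)
            free += size
        while k < len(order) and jobs[order[k]][0] <= t:  # collect releases
            queue.append(order[k])
            k += 1
        for i in list(queue):         # start whatever fits, in list order
            _, size, proc_time = jobs[i]
            if size <= free:
                queue.remove(i)
                free -= size
                heapq.heappush(running, (t + proc_time, size))
                cmax = max(cmax, t + proc_time)
    return cmax

# Three sequential jobs on m = 2: LIST yields 3 while the optimum is 2,
# matching the worst-case ratio 2 - 1/m = 1.5 for m = 2.
print(list_schedule([(0, 1, 1), (0, 1, 1), (0, 1, 2)], m=2))  # 3.0
```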

  8. Applicability to Grids. There is no polynomial time algorithm that always produces schedules S with Cmax(S)/Cmax* < 2 for GPm | sizej | Cmax and all input data unless P = NP.

  9. Applicability to Grids • 2 machines with m processors each. • All jobs have processing time 1 and different degrees of parallelism. • Total requirement of all jobs: 2m processors. • Consider an arbitrary algorithm A. [Figure: possible outcomes on machine 1 and machine 2] • Cmax(A) = 1 and Cmax* = 1: A is optimal. • Cmax(A) = 2 and Cmax* = 2: A is optimal. • Cmax(A) = 2 while Cmax* = 1: A misses the optimum by a factor of 2.

  10. Applicability to Grids. How do we know whether Cmax* = 2 applies? • Deciding whether the jobs can instead be completed within a single time unit (Cmax* = 1) amounts to the NP-hard PARTITION problem. • Hence there is no algorithm A with polynomial time complexity guaranteeing Cmax(A)/Cmax* < 2. Scheduling in Grids is inherently more difficult than scheduling on a single parallel processor.
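To see the PARTITION connection concretely: with unit-time jobs whose degrees of parallelism sum to 2m, Cmax* = 1 holds exactly when the sizes can be split into two groups of m processors each. A brute-force check, for illustration only (exponential time; the helper name is ours):

```python
from itertools import combinations

def cmax_opt_is_one(sizes, m):
    # Unit-time jobs with parallelism `sizes` (summing to 2m) on two
    # m-processor machines: Cmax* = 1 iff some subset of the sizes sums to
    # exactly m -- the NP-hard PARTITION problem.
    idx = range(len(sizes))
    return any(sum(sizes[i] for i in s) == m
               for r in range(len(sizes) + 1)
               for s in combinations(idx, r))

print(cmax_opt_is_one([3, 3, 2, 2, 2], m=6))  # True:  {3, 3} and {2, 2, 2}
print(cmax_opt_is_one([4, 4, 4], m=6))        # False: Cmax* = 2
```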

  11. List Scheduling in the Grid. [Figure: list schedule with Cmax(LIST) = 4 on machines with different numbers of processors]

  12. List Scheduling in the Grid. [Figure: optimal schedule with Cmax* = 2 on the same machines]

  13. Problems of List Scheduling • Cmax(LIST)/Cmax* = (k+1)/2 is possible. • Analysis of the problem: • Jobs with little parallelism occupy large machines, which are then not available for highly parallel jobs. • Yet if there are only few highly parallel jobs, it is inefficient to prevent jobs with little parallelism from using these large machines. • Simple approach: • Increase the priority of highly parallel jobs by arranging the jobs in descending order of their parallelism (see the sketch below). • However, fairness is neglected.
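Using the hypothetical `list_schedule` helper from the earlier sketch, the simple approach is a one-line reordering:

```python
# Raise the priority of highly parallel jobs by sorting the job list in
# descending order of parallelism before feeding it to the list scheduler.
jobs = [(0, 1, 4), (0, 8, 2), (0, 3, 5)]            # (release, size, proc_time)
by_parallelism = sorted(jobs, key=lambda j: j[1], reverse=True)
print([j[1] for j in by_parallelism])               # [8, 3, 1]
print(list_schedule(by_parallelism, m=8))           # 7.0
```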

  14. Sorting in Order of Parallelism. [Figure: schedule over time; few processors remain available for parallel jobs, and predominantly sequential jobs execute]

  15. Does Ordering the Jobs Help? • We are interested in an algorithm that does not use a single list of jobs. • Some machines are blocked from executing some jobs under certain circumstances.

  16. Online Job Stealing Scheduling in Grids

  17. Does Ordering the Jobs Help? • We assume a machine indexing such that mi−1 ≤ mi holds. Three sets of jobs are considered: • Set Ai contains all jobs that cannot execute on the previous (next smaller) machine and require more than 50% of the processors of machine Mi. • Set Bi contains all jobs that cannot execute on the previous machine but require at most 50% of the processors of machine Mi. • Set Hi contains all jobs that require more than 50% of the processors of machine Mi but can also be executed on the previous machine.

  18. Grid Scheduling Algorithm 1. The machines are arranged in ascending order of processor numbers. 2. A job is assigned to the first machine that can execute it. Group A: at least half of the processors on this machine are required. Group B: less than half of the processors on this machine are required.

  19. Grid Scheduling Algorithm 3. Every machine applies a priority order when selecting jobs for execution: first the jobs of its own group A, then the jobs of its own group B, and finally the jobs that are enabled for execution on its previous machine. A sketch of these rules follows below.
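A minimal sketch of steps 1-3, assuming a static job list; the names `classify` and `candidates` are illustrative, and the actual algorithm of course interleaves these decisions with job arrivals and completions.

```python
def classify(jobs, machine_sizes):
    # machine_sizes must be sorted in ascending order (step 1), and every
    # job is assumed to fit on the largest machine. Each job is assigned to
    # the first machine that can execute it (step 2) and put into that
    # machine's group A (at least half of its processors are required) or
    # group B (less than half).
    groups = {i: {"A": [], "B": []} for i in range(len(machine_sizes))}
    for j, size in enumerate(jobs):
        i = next(i for i, m in enumerate(machine_sizes) if size <= m)
        group = "A" if 2 * size >= machine_sizes[i] else "B"
        groups[i][group].append(j)
    return groups

def candidates(i, groups):
    # Priority order of machine M_i when selecting a job (step 3): its own
    # group A, then its own group B, then the jobs enabled for execution on
    # the previous (next smaller) machine -- the job stealing step.
    own = groups[i]["A"] + groups[i]["B"]
    stolen = groups[i - 1]["A"] + groups[i - 1]["B"] if i > 0 else []
    return own + stolen

sizes = [2, 8]            # two machines in ascending order of size
jobs = [1, 3, 5, 8]       # degrees of parallelism of four jobs
g = classify(jobs, sizes)
print(g)                  # {0: {'A': [0], 'B': []}, 1: {'A': [2, 3], 'B': [1]}}
print(candidates(1, g))   # [2, 3, 1, 0]: own A, own B, then M0's jobs
```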

  20. Performance of the Algorithm • Theoretical evaluation of the schedules S produced by the algorithm: • Cmax(S)/Cmax* < 3 in the offline case, • Cmax(S)/Cmax* < 5 in the online case. U. Schwiegelshohn, A. Tchernykh, R. Yahyapour. Online Scheduling in Grids. IEEE IPDPS'08, 2008.

  21. Conclusion • Common list scheduling does not work well in Grids. • Jobs should receive priority on the machines that provide the right amount of parallelism. • Jobs with less parallelism are executed on these machines only if better suited jobs are not available. • The presented algorithm has a constant worst-case bound and a relatively small gap.

  22. Adaptive Admissible Allocation

  23. Two-Level Grid Model. [Figure: a Grid broker allocates jobs from the workload to the machines; each machine runs a local scheduler with its own local queue in front of its nodes] We regard MPS as two-stage (two-layer) scheduling: MPS = MPS_Allocation + PS.

  24. Allocation. For each job Jj, let first(Jj) be the minimum i such that machine Mi is able to execute the job, and let last(Jj) be the maximum such i. The set of machines first, first+1, ..., last is called M-available. [Figure: machines m1, m2, ..., mm with first(Jj) = 2 and last(Jj) = m marking the M-available range]
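A small sketch of this allocation step, assuming machines are indexed in ascending order of size (the hypothetical helper name `m_available` is ours):

```python
def m_available(size_j, machine_sizes):
    # machine_sizes is sorted in ascending order, and the job is assumed to
    # fit on the largest machine. Returns the 1-based indices first..last of
    # the M-available machines for a job with parallelism size_j; with
    # identical processors, last(Jj) = m.
    first = next(i for i, m in enumerate(machine_sizes, start=1) if m >= size_j)
    last = len(machine_sizes)
    return list(range(first, last + 1))

print(m_available(3, [2, 4, 4, 8, 16]))  # first = 2, last = 5 -> [2, 3, 4, 5]
```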

  25. Allocation. If last(Jj) is instead the minimum r such that ..., then the machines first, ..., last form the M-admissible set. [Figure: machines m1, ..., mm with first(Jj) = 2 and last(Jj) = 5 delimiting M-admissible, while last(Jj) = m delimits M-available]

  26. [Figure: machine axis with markers 1, f0, l0, l, m; the admissible factor a splits the ranges m(f0, m) and m(f, m) into parts a·m(f0, m), (1−a)·m(f0, m) and a·m(f, m), (1−a)·m(f, m)]

  27. For a set of machines with identical processors, and for a set of rigid jobs with an admissible range, the competitive factor of Min_LB-a + Best_PS is: [Equation: competitive-factor bound]

  28. Competitive Factor. [Figure: competitive factor]

  29. Competitive Factor. [Figure: competitive factor]

  30. Competitive Factor • A. Tchernykh, U. Schwiegelshohn, R. Yahyapour, N. Kuzurin. Online Hierarchical Job Scheduling in Grids. CoreGRID'08, Euro-Par, 2008.

  31. Thank you
