Multiprocessor Scheduling Under Uncertainty


Presentation Transcript


  1. Multiprocessor Scheduling Under Uncertainty
Rajmohan Rajaraman, Northeastern University (joint work with Guolong Lin)
CIRM Workshop, May 31, 2006

  2. Project management
• A project is divided into jobs
• There are dependencies among the jobs
• A worker may not successfully complete the assigned job
• Multiple workers can be assigned to (key) jobs
• Goal: assign the workers to the jobs to minimize the expected completion time of the project
[Figure: bipartite graph of workers and jobs (hill climbing, brute force, local search), with success probabilities 0.9, 0.1, 1.0, 0.3, 0.5, 0.9 on the edges.]

  3. Traditional multiprocessor scheduling
• J: n unit-time jobs
• M: m machines
• C: precedence constraints among the jobs, represented as a dag
• Goal: compute a schedule with minimum makespan
[Figure: machines a, b and jobs 1, 2, 3, with precedence dag C.]

  4. Uncertainty in execution outcome
• p_ij: the probability that machine i successfully completes job j in any single time step
• Multiple machines can be assigned to the same job simultaneously
• Goal: compute a schedule with minimum expected makespan
[Figure: machines a, b and jobs 1, 2, 3, with edge success probabilities 0.01, 0.2, 0.3, 0.5 and precedence dag C.]

  5. Scheduling under uncertainty
Problem (SUU):
• J: n unit-time jobs
• M: m machines
• C: precedence constraints among the jobs, represented as a dag
• p_ij: the probability that machine i successfully completes job j in any single time step
• Multiple machines can be assigned to the same job simultaneously
• Goal: compute a schedule with minimum expected makespan
[Figure: the same example instance as before.]
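
As a concrete sketch of the SUU semantics above, the following Python simulation estimates the expected makespan of a given oblivious schedule by Monte Carlo. The instance values and the encoding (dicts keyed by machine/job names) are hypothetical illustrations, not from the talk:

```python
import random

def expected_makespan(schedule, probs, jobs, prec, trials=2000, seed=1):
    """Monte Carlo estimate of the expected makespan of an oblivious schedule.

    schedule: list of {machine: job} dicts, one per step, repeated cyclically
    probs[(i, j)]: probability machine i completes job j in one step
    prec[j]: set of jobs that must finish before j becomes available
    Assumes every job is eventually run with positive probability."""
    random.seed(seed)
    total = 0
    for _ in range(trials):
        done, t = set(), 0
        while len(done) < len(jobs):
            for i, j in schedule[t % len(schedule)].items():
                # A machine only makes progress on an available, unfinished job.
                if j not in done and prec[j] <= done and random.random() < probs[(i, j)]:
                    done.add(j)
            t += 1
        total += t
    return total / trials

# Hypothetical 2-machine, 3-job instance with chain constraints 1 -> 2 -> 3:
probs = {("a", 1): 0.2, ("a", 2): 0.3, ("a", 3): 0.9,
         ("b", 1): 0.5, ("b", 2): 0.5, ("b", 3): 0.9}
sched = [{"a": 1, "b": 2}, {"a": 2, "b": 3}, {"a": 3, "b": 1}]
prec = {1: set(), 2: {1}, 3: {1, 2}}
est = expected_makespan(sched, probs, {1, 2, 3}, prec)
```

Note that this only *evaluates* a fixed schedule; as the talk stresses, even that is non-trivial to do exactly, which is why the estimate here is by sampling.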

  6. Applications
• Project management:
  • Different tasks of a project are assigned to people with different skill sets
  • Uncertainty in successful execution is expressed using probabilities
• Grid computing:
  • m machines are available
  • A project is divided into n jobs, with dependencies
  • The execution of a job on any machine can be unreliable
  • Multiple machines can be assigned to the same job
• Goal: assign the machines to the jobs to minimize the expected completion time

  7. Outline
• Specification of a schedule
• Related work
• Main results
  • Independent jobs
  • Disjoint chains of jobs
  • Disjoint directed forests of jobs
• Open problems

  8. What is a schedule?
• Schedule: a collection of functions f_{U,t}: M → J ∪ {⊥}, where U is the set of unfinished jobs at time t
  • f_{U,t}(m) ∈ U ⇒ m runs job f_{U,t}(m)
  • f_{U,t}(m) ∉ U ⇒ m is idle
• Example (jobs {1, 2, 3}, machines {a, b}):
  U = {1,2,3}: a→1, b→2    U = {1,2}: a→1, b→2
  U = {1,3}:   a→3, b→1    U = {2,3}: a→3, b→2
  U = {1}:     a→1, b→1    U = {2}:   a→2, b→2
  U = {3}:     a→3, b→3
• Regimen: a schedule whose f_{U,t} depends only on U
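
To make the definition concrete, here is a small Python sketch of a regimen as a lookup table keyed by the set U of unfinished jobs. The assignment rule used below (machine a takes the smallest unfinished job, b the largest) is a hypothetical stand-in, not the table from the slide:

```python
from itertools import chain, combinations

jobs = [1, 2, 3]

def nonempty_subsets(js):
    return [frozenset(s) for s in
            chain.from_iterable(combinations(js, k) for k in range(1, len(js) + 1))]

# A regimen stores one machine -> job map f_U per possible set U of
# unfinished jobs: 2^n - 1 entries, which is why a naive table is large.
regimen = {U: {"a": min(U), "b": max(U)} for U in nonempty_subsets(jobs)}

# At each step, look up the assignment for the current unfinished set:
assert regimen[frozenset({1, 3})] == {"a": 1, "b": 3}
assert len(regimen) == 2 ** len(jobs) - 1
```

The exponential table size is exactly the succinctness problem the next slide raises; an oblivious schedule drops the dependence on U and stores only one assignment per time step.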

  9. Specification of a schedule
• A naive specification of a schedule requires 2^n functions per time step, over an infinite horizon
• Observe that there exists an optimal schedule that is a regimen; however, specifying an arbitrary regimen still requires 2^n functions
• The schedules we compute can be specified much more succinctly, in space polynomial in the size of the input
• Oblivious schedule: f_{U,t} is independent of U
[Figure: an oblivious schedule laid out as a machine × time table, t = 1, 2, 3, 4.]

  10. Why oblivious schedules?
• Oblivious schedules can potentially be specified succinctly
• Since the entire job assignment can be computed in advance, little overhead is incurred at runtime
• In a distributed system, where jobs and processors reside on different machines, an oblivious schedule can be executed independently by each job
• We found them easier to analyze

  11. Main results
• For independent jobs, we can efficiently compute an oblivious schedule with approximation ratio O(log n · log(min{m, n}))
• For disjoint chains, we can efficiently compute an oblivious schedule with approximation ratio O(log n · log n · log(m+n) / log log(m+n))
• For a directed forest, we can efficiently compute an oblivious schedule with approximation ratio O(log² n · log m · log(m+n) / log log(m+n))

  12. Related work
• The problem was introduced in [M 05]:
  • Solvable in polynomial time if the width of the dag (the maximum number of independent jobs) and the number of machines are both constant
  • NP-hard if either parameter is unbounded
  • Cannot be approximated within 5/4 unless P = NP
• Job shop scheduling [LMR 94, SSW 94]
• Scheduling on unrelated machines with precedence constraints [SSW 94, KMPS 05]

  13. High-level approach
• Given an arbitrary schedule, determining its expected makespan is non-trivial
• MASS: compute an oblivious schedule of minimum length in which every job is successfully completed with probability at least 1/4
• Three-step argument:
  • MASS: give an approximation algorithm for MASS
  • MASS → SUU: transform a schedule for MASS into an oblivious schedule for SUU, without a significant blowup in length
  • SUU → MASS: derive a lower bound on the length of an optimal SUU schedule from the length of an optimal schedule for MASS

  14. Independent jobs
• The precedence constraints graph C has no edges
• Mass up to time T: the success probability a job accumulates over its machine assignments in the first T steps [formal definition omitted in transcript]
• Mass is a constant-factor approximation to the success probability
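
The transcript drops the formula, but under one natural reading (mass = the sum of the per-assignment success probabilities, capped at 1, which is an assumption here, not the talk's stated definition), the constant-factor relationship can be checked numerically:

```python
import math

def success_prob(ps):
    """Probability that at least one of the independent attempts succeeds."""
    fail = 1.0
    for p in ps:
        fail *= 1.0 - p
    return 1.0 - fail

def mass(ps):
    """Assumed definition: sum of attempt probabilities, capped at 1."""
    return min(1.0, sum(ps))

# Union bound gives success <= mass, and 1 - p <= e^{-p} gives
# success >= 1 - e^{-mass} >= (1 - 1/e) * mass on [0, 1], so mass
# approximates success probability within the constant 1 - 1/e ~ 0.632.
ps = [0.01, 0.2, 0.3]
m = mass(ps)
assert (1 - 1 / math.e) * m <= success_prob(ps) <= m
```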

  15. Independent jobs: MASS → SUU
• We are given an oblivious schedule Σ of length T in which, for every job, the probability of successful execution is at least 1/4
• We need to transform Σ into a schedule with low expected makespan
• Set the new schedule to be the periodic repetition of Σ
• With probability 1 − 1/poly(n), every job is completed within O(T log n) steps
• The expected makespan is O(T log n)
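
The O(T log n) bound comes from repetition plus a union bound; a quick sketch of the arithmetic (the constant 2 in the exponent is just one convenient choice, not from the talk):

```python
import math

def periods_for_whp(n, fail_per_period=0.75):
    """Smallest k with n * fail_per_period**k <= 1/n.

    Each length-T period finishes any fixed job with probability >= 1/4,
    so after k = O(log n) periods a union bound over the n jobs leaves
    failure probability at most 1/n: makespan O(T log n) w.h.p., and the
    geometric tail also gives expected makespan O(T log n)."""
    k = math.ceil(2 * math.log(n) / math.log(1 / fail_per_period))
    assert n * fail_per_period ** k <= 1 / n
    return k
```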

  16. Independent jobs: SUU → MASS
• Theorem: OPT_MASS = O(log n) · OPT_SUU
• Key Lemma: Consider a schedule Σ with expected makespan T*. For any job j, in a random execution of Σ for 2T* steps, j accumulates a mass of at least 1/4 with probability at least 1/4.
• Proof: Consider a random execution of the schedule.
  • A: the event that j is finished within 2T* steps
  • B: the event that j gathers a mass of at least 1/4
  • Since the expected completion time of the schedule is T*, Markov's inequality yields Pr(A) ≥ 1/2
  • Pr(¬B) ≤ 1 − Pr(A) + Pr(A ∧ ¬B) ≤ 1/2 + Pr(A ∧ ¬B)
  • It remains to bound Pr(A ∧ ¬B)

  17. Tree representation of a schedule
[Figure: a tree whose root is labeled with the full job set {1,2,3}; each level corresponds to one time step, each child to one possible set of unfinished jobs after that step, with transition probabilities (e.g. 0.1, 0.3) on the edges.]
• We need to bound the total probability of those nodes at level 2T* where j is complete, yet has mass ≤ 1/4

  18. Bound on Pr(A ∧ ¬B)
• Consider a random execution of schedule Σ
• Suppose we are at a node N where j is unfinished and has mass ≤ 1/4
• Claim: the probability that j is finished within 2T* time with final mass at most 1/4 is at most (1/4 − mass at N)
• Applying the claim to the root node, we get Pr(A ∧ ¬B) ≤ 1/4
• Thus, the probability that j gathers mass less than 1/4 is at most 3/4
[Figure: subtree rooted at N; leaves where j is finished.]

  19. Independent jobs: SUU → MASS
• Consider a schedule with expected makespan T*
• If we execute the schedule for 2T* steps, then with probability at least 1/4, j accumulates a mass of at least 1/4
• Repeating this process O(log n) times, the probability that job j does not accumulate a mass of at least 1/4 is < 1/n
• Hence there exists an oblivious schedule of length O(log n) · T* in which every job accumulates a mass of at least 1/4

  20. Independent jobs: Solving MASS
• Combinatorial approach:
  • Guess OPT_MASS
  • Find an oblivious schedule that maximizes the sum of the masses gathered within time OPT_MASS
  • Argue that in this schedule, at least a constant fraction of the jobs gather constant mass
  • Repeat O(log n) times
  • Do a binary search over OPT_MASS
  • Yields an O(log n) approximation
• Putting it all together:
  • MASS: O(log n) approximation
  • MASS → SUU: O(log n) blowup in length
  • SUU → MASS: O(log n) blowup in the lower bound
  • An O(log³ n)-approximation algorithm overall

  21. Independent jobs: LP for MASS
• Similar to the LP for scheduling on unrelated machines [LST 90]
• The time for job j on machine i is 1/p_ij
• Difference: we allow a job to be processed simultaneously on multiple machines
• Nevertheless, a similar approach can shave off an O(log n) factor, yielding an O(log² n) approximation for independent jobs

  22. Jobs with chain constraints
• The precedence constraints C form a collection of disjoint chains
• MASS: redefine the problem so that a job does not gather mass until its predecessors have gathered mass at least 1/2
• We compute a schedule S for MASS whose length is O(log² n / log log n) · LP_MASS
• MASS → SUU: compute a schedule S* for SUU of length O(log n) times the length of S
• SUU → MASS: LP_MASS = O(OPT_SUU)
[Figure: the example instance, with chain constraints 1 → 2 → 3.]

  23. Chains of jobs: MASS
• Goal: compute a schedule of smallest length T such that
  • every job accumulates a mass of at least 1/4, and
  • if j1 < j2 in a chain, then j1 must gather 1/4 mass before any machine can be assigned to j2
• While solving MASS, we actually compute an oblivious pseudo-schedule:
  • multiple jobs may be scheduled on the same machine at the same time
• During the MASS → SUU step, we convert this pseudo-schedule into a feasible schedule

  24. Chains of jobs: LP for MASS
[LP omitted in transcript. Variables: x_ij, the number of steps in which machine i runs job j, and the number of steps in which job j runs on some machine. Constraints include a load constraint per machine and a chain-makespan constraint per chain.]

  25. Chains of jobs: Solving MASS
• An integral pseudo-schedule can be obtained with load and length both O(log m) · LP_MASS
• High-level approach:
  • From the fractional solution, for each job j, bucket the p_ij (those with x_ij > 0) into O(log m) buckets
  • Pick the bucket contributing the most mass, an Ω(1/log m) fraction
  • Construct a network-flow instance, and obtain an integral solution (a feasible flow)
[Figure: bipartite flow network between M and J, with source u and sink v.]
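
A sketch of the bucketing step in Python. The exact thresholds (geometric buckets by powers of two, a 1/m² cutoff for negligible probabilities) are assumptions for illustration; the point is that with only O(log m) buckets, the heaviest one must carry an Ω(1/log m) share of the job's fractional mass:

```python
import math

def best_bucket(pxs, m):
    """pxs: list of (p_ij, x_ij) pairs for one job, with x_ij > 0.

    Probabilities are grouped geometrically: bucket k holds p in
    (2^-(k+1), 2^-k].  Dropping p below 1/m^2 (an assumed cutoff)
    leaves O(log m) buckets, so the heaviest bucket carries an
    Omega(1/log m) fraction of the job's total mass sum(p * x)."""
    nbuckets = 2 * max(1, int(math.log2(m))) + 2
    buckets = [[] for _ in range(nbuckets)]
    masses = [0.0] * nbuckets
    for p, x in pxs:
        if p < 1.0 / (m * m):
            continue  # negligible contribution: dropped
        k = min(nbuckets - 1, int(-math.log2(p)))
        buckets[k].append((p, x))
        masses[k] += p * x
    k = max(range(nbuckets), key=masses.__getitem__)
    return buckets[k], masses[k]
```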

  26. From a pseudo-schedule to a feasible schedule
• The relaxed schedule can be viewed as an instance similar to the job shop scheduling problem, with preemptive processing
• For each chain, delay its start time by an amount chosen uniformly at random from [0, maxload]
• With high probability, at most O(log(n+m) / log log(n+m)) jobs are scheduled on any machine at any time
• Hence the relaxed schedule can be converted into a feasible one, losing a factor of O(log(n+m) / log log(n+m))
• The algorithm can be derandomized [LMR 94, SSW 94, SSS 95, KMPS 05]
[Figure: chains C1, C2, C3 laid out on machines over time, before and after the random delays.]
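
A small simulation of the random-delay step (the instance encoding via `chain_loads` and `machine_of` is a hypothetical choice for illustration):

```python
import random

def max_congestion(chain_loads, machine_of, max_delay, seed=0):
    """Delay each chain's start uniformly at random in [0, max_delay) and
    return the largest number of chain-steps landing on the same
    (machine, time) slot -- the congestion the talk bounds by
    O(log(n+m)/loglog(n+m)) w.h.p.

    chain_loads[c]: number of unit steps in chain c
    machine_of(c, s): the machine running step s of chain c"""
    random.seed(seed)
    slots = {}
    for c, load in enumerate(chain_loads):
        d = random.randrange(max_delay)
        for s in range(load):
            key = (machine_of(c, s), d + s)
            slots[key] = slots.get(key, 0) + 1
    return max(slots.values())
```

When the congestion is bounded, the pseudo-schedule can be flattened into a feasible one by serializing the few jobs sharing each slot, which is where the conversion loss comes from.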

  27. Putting it all together
• Obtain an integral oblivious pseudo-schedule for MASS
  • an O(log n)-factor approximation with respect to LP_MASS
• Convert the pseudo-schedule into a feasible schedule
  • O(log n / log log n) blowup
• Convert the feasible schedule for MASS into a feasible schedule for SUU
  • repeat each step O(log n) times to make the success probability high, then repeat ad infinitum
  • O(log n) blowup
• Argue that LP_MASS = O(OPT_SUU)
  • uses a variant of the key lemma proved for independent jobs

  28. Directed forest
• The precedence constraints C form a directed forest
• Lemma [KMPS 05]: every directed forest can be decomposed into O(log n) blocks, each consisting of a collection of chains; the decomposition appropriately preserves the precedence constraints
• Theorem: for directed forests, we can efficiently compute a schedule with expected makespan O(log² n · log m · log(n+m) / log log(n+m)) times optimal
[Figure: the example instance with forest constraints.]

  29. Open problems
• Improved bounds:
  • Conjecture: an O(log n) approximation is achievable for independent jobs using an oblivious schedule
  • We can give an Ω(log n / (log log n · log* n)) lower bound for oblivious schedules
  • Conjecture: an O(1) approximation is achievable for independent jobs using adaptive schedules
  • For chains and directed forests, an improved analysis may yield savings of log n and log² n factors, respectively
• General dag precedence constraints:
  • any result is likely to apply to the unrelated-machines problem as well

  30. Extensions and variants
• Restrictions on the uncertainty: p_ij = p_i or p_j
• Arbitrary job sizes
• Jobs with release times
• Online version
• Other performance measures

  31. Combinatorial algorithm for MASS
• Assume we can solve Problem MaxSumMass with an approximation factor of 1/3; denote the algorithm by MSM-ALG
• We know there exists an assignment in which every job accumulates 1/4 mass
• Let y be the number of jobs that accumulate at least 1/24 mass in our MaxSumMass schedule; then
  y + (1/24)(n − y) ≥ (n/4) × (1/3)  ⇒  y ≥ n/23
• Hence, at most O(log n) invocations are needed so that every job accumulates 1/24 mass
• We can compute a schedule in which every job accumulates 1/24 mass within O(log² n) · T* steps; this schedule has an expected makespan of O(log³ n) · T*
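
The counting step above can be verified with exact rational arithmetic (a worked check of the slide's inequality, not part of the talk):

```python
from fractions import Fraction

def min_heavy_jobs(n):
    """Smallest integer y with y + (n - y)/24 >= (n/4) * (1/3)."""
    y = 0
    while Fraction(y) + Fraction(n - y, 24) < Fraction(n, 12):
        y += 1
    return y

# Solving algebraically: 24y + (n - y) >= 2n, i.e. 23y >= n, so y >= n/23.
```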

  32. Solving MaxSumMass
MSM-ALG:
• Sort the p_ij in decreasing order. Initialize f(i) = nil for all i ∈ M.
• For each p_ij in this order:
  • assign i to j, i.e., f(i) ← j, if f(i) = nil and …
• Assign unused machines to …
[Figure: bipartite graph with edge p_ij between machine i and job j.]
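
The slide's condition is truncated in this transcript; the sketch below fills the gap with one plausible choice, capping each job's gathered mass (the `threshold` parameter is an assumption, not from the talk):

```python
def msm_greedy(probs, threshold=1.0):
    """Greedy sketch of MSM-ALG for MaxSumMass: scan edges (i, j) by
    decreasing p_ij and assign machine i to job j when i is still free
    and -- the assumed completion of the slide's truncated condition --
    job j's gathered mass would stay within `threshold`.

    probs[(i, j)] = p_ij.  Returns f: machine -> job (or None = unused)."""
    f = {i: None for (i, j) in probs}
    gathered = {j: 0.0 for (i, j) in probs}
    for (i, j), p in sorted(probs.items(), key=lambda kv: -kv[1]):
        if f[i] is None and gathered[j] + p <= threshold:
            f[i] = j
            gathered[j] += p
    return f

# Machine b skips job 1 (already saturated by a) and takes job 2 instead:
assignment = msm_greedy({("a", 1): 0.9, ("b", 1): 0.8, ("b", 2): 0.5})
```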

  33. MaxSumMass: Analysis of MSM-ALG (charging argument)
• Let Opt = {(i,j)} be the collection of edges picked by an optimal assignment f*, and let Sol be the solution output by f. For each edge (i,j) ∈ Opt:
  • If (i,j) ∈ Sol, charge p_ij to itself.
  • If (i,j) ∉ Sol:
    • (i,j) was not added because f(i) ≠ nil: let j′ = f(i); then p_ij′ ≥ p_ij, so charge p_ij to p_ij′.
    • (i,j) was not added because … : charge p_ij to …
• Thus 3 · Sol suffices to cover Opt; hence MSM-ALG achieves an approximation factor of 1/3.
[Figure: bipartite graph with edge p_ij between machine i and job j.]
