
Approximation algorithms for stochastic sequencing and scheduling problems



Presentation Transcript


  1. Approximation algorithms for stochastic sequencing and scheduling problems David B. Shmoys Cornell University

  2. Worst-Case vs Stochastic Analysis • Yin and yang of algorithmic analysis • Early results always focus on simple problems for which simple algorithms work • Examples: Smith's ratio rule for 1|| Σ wj Cj; Rothkopf's WSEPT rule for 1|| E[Σ wj Cj] • The stochastic literature focuses on classes of distributions for which nice characterizations can be proved, e.g., Weiss & Pinedo (exponential distributions play a central role in this line of research)
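Smith's rule is easy to state concretely: sequence jobs in nonincreasing order of wj/pj. A minimal Python sketch (my own illustration, not from the slides; Rothkopf's WSEPT rule is the same computation with E[pj] in place of pj):

```python
def smith_order(jobs):
    """Sequence jobs by Smith's ratio rule: nonincreasing w_j / p_j."""
    return sorted(jobs, key=lambda j: j[1] / j[0], reverse=True)  # j = (p, w)

def weighted_completion(jobs):
    """Total weighted completion time of the given sequence on one machine."""
    t, total = 0, 0
    for p, w in jobs:
        t += p            # the job completes at the current makespan
        total += w * t
    return total
```

For instance, jobs (p, w) = (3, 1), (1, 3), (2, 2) are reordered to (1, 3), (2, 2), (3, 1), giving total weighted completion time 3·1 + 2·3 + 1·6 = 15, which no other sequence beats.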

  3. Advent of Competitive Analysis • Sleator & Tarjan for paging problems • Redefined the meaning of on-line, more generally, as computing in the absence of full information • On-line scheduling used to mean just that jobs arrive over time and the existence of a job is known only at its release date • Now many other on-line scheduling models exist • But are these models too pessimistic?

  4. Approximation for Stochastic Models • Trend to make models less distribution-specific • But now the search is for settings in which constant performance guarantees can be found • Several examples from a range of types of scheduling and sequencing problems • Common theme: understand the deterministic version first!

  5. One On-Line Model vs. Its Stochastic Analog • Processing times are not known in advance, but the scheduler learns them by processing each job to completion • Traditional stochastic models are exactly like this – BUT – the adversary does not control the processing time (even as the job is being processed); instead each job's processing time is drawn independently according to a pre-specified probability distribution • Benchmark for competitive analysis: the best off-line solution • Benchmark for stochastic analysis: the optimum expected value over all non-anticipatory policies • The KEY for BOTH: GET GOOD LOWER BOUNDS

  6. Example 1 Using LP to get near-optimal solutions for P | rj | E[Σ wj Cj] [Uetz; Möhring, Schulz, Uetz]

  7. Start with Deterministic Single-Machine Model • Good polyhedral description for the model without release dates [Queyranne; Wolsey]: minimize Σ_j wj Cj subject to Σ_{j∈A} pj Cj ≥ (Σ_{j∈A} pj)²/2 + (Σ_{j∈A} pj²)/2 for each A ⊆ N • View the left-hand side as minimizing Σ wj Cj with wj = pj (and so all ratios in Smith's rule are equal)
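These subset inequalities can be checked numerically. The following sketch (my own illustration, under the convention that jobs run in list order with no idle time) verifies them over every subset A; for A = N the inequality in fact holds with equality for any permutation, since Σ pj Cj = (Σ pj)²/2 + (Σ pj²)/2 is an identity on one machine:

```python
from itertools import combinations

def completion_times(p):
    """Completion times when jobs run in the given order on one machine."""
    C, t = [], 0
    for pj in p:
        t += pj
        C.append(t)
    return C

def queyranne_holds(p):
    """Check sum_{j in A} p_j C_j >= (sum p_j)^2/2 + (sum p_j^2)/2 for all A."""
    C = completion_times(p)
    n = len(p)
    for r in range(1, n + 1):
        for A in combinations(range(n), r):
            lhs = sum(p[j] * C[j] for j in A)
            sp = sum(p[j] for j in A)
            sq = sum(p[j] ** 2 for j in A)
            if lhs < sp * sp / 2 + sq / 2 - 1e-9:
                return False
    return True
```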

  8. Now Extend to Identical Parallel Machines • Add constraints machine by machine to get the following valid inequalities for each feasible schedule: Σ_{j∈A} pj Cj ≥ (Σ_{j∈A} pj)²/2m + (Σ_{j∈A} pj²)/2 for each A ⊆ N • To add release dates, now just add Cj ≥ rj + pj for each j ∈ N

  9. Valid Inequalities for P|rj|E[Σ wj Cj] • Let Cj be the output of a non-anticipatory policy: Σ_{j∈A} E[pj]E[Cj] ≥ (Σ_{j∈A} E[pj])²/2m + (Σ_{j∈A} E[pj]²)/2 − [(m−1)/2m] Σ_{j∈A} Var[pj] for each A ⊆ N • Proof: take any realization pj and let Sj denote the start times • Where does non-anticipatory come in? pj and Sj are independent r.v.'s
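The inequality can be sanity-checked by enumerating all realizations for a simple non-anticipatory policy. A small sketch (the instance and the fixed-order list policy are my own example, not from the slides): m = 2 machines and three i.i.d. jobs with pj ∈ {1, 3}, each with probability 1/2, so E[pj] = 2 and Var[pj] = 1.

```python
from itertools import product

def policy_completions(p, m=2):
    """List-schedule jobs 0,1,... in a fixed order on m machines for one
    realization p; the fixed order makes the policy non-anticipatory."""
    free = [0.0] * m
    C = []
    for pj in p:
        k = free.index(min(free))   # first machine to become available
        free[k] += pj
        C.append(free[k])
    return C

m, Ep, Var = 2, 2.0, 1.0
E_C = [0.0, 0.0, 0.0]
for real in product([1, 3], repeat=3):   # all 8 equally likely realizations
    C = policy_completions(list(real), m)
    for j in range(3):
        E_C[j] += C[j] / 8
lhs = sum(Ep * E_C[j] for j in range(3))
rhs = (3 * Ep) ** 2 / (2 * m) + 3 * Ep ** 2 / 2 - (m - 1) / (2 * m) * 3 * Var
```

Here lhs = 15.0 and rhs = 14.25, so the valid inequality holds with slack on this instance.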

  10. Apply the deterministic inequality to the realization: Σ_{j∈A} pj(Sj + pj) ≥ (Σ_{j∈A} pj)²/2m + (Σ_{j∈A} pj²)/2 for each A ⊆ N • Take expectations over the given distribution (pj and Sj are independent, and by independence of the pj, E[(Σ_{j∈A} pj)²] = (Σ_{j∈A} E[pj])² + Σ_{j∈A} Var[pj]): Σ_{j∈A} E[pj]E[Sj] ≥ (Σ_{j∈A} E[pj])²/2m + (Σ_{j∈A} Var[pj])/2m − (Σ_{j∈A} E[pj²])/2 • Using E[Cj] = E[Sj] + E[pj] and E[pj²] = Var[pj] + E[pj]², we get the desired Σ_{j∈A} E[pj]E[Cj] ≥ (Σ_{j∈A} E[pj])²/2m + (Σ_{j∈A} E[pj]²)/2 − [(m−1)/2m] Σ_{j∈A} Var[pj] for each A ⊆ N

  11. Making Constraints Just Like Deterministic Ones • We start with Σ_{j∈A} E[pj]E[Cj] ≥ (Σ_{j∈A} E[pj])²/2m + (Σ_{j∈A} E[pj]²)/2 − [(m−1)/2m] Σ_{j∈A} Var[pj] for each A ⊆ N • But if we let Δ be an upper bound on Var[pj]/E[pj]² (the squared coefficient of variation), we can rewrite this as Σ_{j∈A} E[pj]E[Cj] ≥ (Σ_{j∈A} E[pj])²/2m + α (Σ_{j∈A} E[pj]²), where α = (Δ(1−m) + m)/2m

  12. Easy to Adapt Release Date Constraints • From Cj ≥ rj + pj for each j ∈ N • Get E[Cj] ≥ rj + E[pj] for each j ∈ N

  13. A Simple LP-based Policy • Solve the LP, viewing the E[Cj] as the variables – this requires knowing both E[pj] and Var[pj] for the input distribution; let C̄j denote the LP-optimal values (a lower bound) • Sort so that C̄1 ≤ C̄2 ≤ … ≤ C̄n • List-based scheduling – wait until the next job on the list is released and schedule it on the first available machine
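The list-based step can be sketched as follows (a hypothetical helper of my own; it assumes the LP-sorted order is given as input and is not code from the paper):

```python
import heapq

def list_schedule(order, r, p, m):
    """Start jobs in the given list order; each waits for its release date
    and for the first machine to free up.  Returns completion times."""
    free = [0.0] * m              # heap of machine-available times
    heapq.heapify(free)
    C = {}
    for j in order:
        t = heapq.heappop(free)   # earliest machine-available time
        start = max(t, r[j])      # also wait for the release date
        C[j] = start + p[j]
        heapq.heappush(free, C[j])
    return C
```

For example, with m = 2, order [0, 1, 2], releases (0, 0, 4), and processing times (3, 3, 1), jobs 0 and 1 finish at time 3 and job 2 waits for its release date, finishing at time 5.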

  14. Analyzing the Performance Guarantee • Goal: show that the Cj(π) values produced by policy π satisfy E[Cj(π)] ≤ ρ·C̄j for a constant ρ • Fix a realization pj and consider job j – let J = {1,2,…,j} • After time R = max_{k∈J} rk there can be no more idle time, and there must exist a machine with at most average load on which to process job j: Cj(π) ≤ R + (1/m) Σ_{k∈J} pk + pj ≤ C̄j + (1/m) Σ_{k∈J} pk + pj • Take expectations (and use that C̄j ≥ E[pj]): E[Cj(π)] ≤ 2C̄j + (1/m) Σ_{k∈J} E[pk]

  15. Analysis (continued) • From before: E[Cj(π)] ≤ 2C̄j + (1/m) Σ_{k∈J} E[pk], so now use the LP to bound the last term: Σ_{k∈A} E[pk]C̄k ≥ (Σ_{k∈A} E[pk])²/2m + α (Σ_{k∈A} E[pk]²), where α = (Δ(1−m) + m)/2m • First case: α > 0 (small Δ); with A = J = {1,2,…,j}: (Σ_{k∈J} E[pk])²/2m ≤ Σ_{k∈J} E[pk] C̄k ≤ C̄j Σ_{k∈J} E[pk] • Cancelling the common factor implies that (1/m) Σ_{k∈J} E[pk] ≤ 2C̄j • Second case (α ≤ 0): (1/m) Σ_{k∈J} E[pk] ≤ 2(1−α) C̄j

  16. Theorem [Möhring, Schulz, Uetz] The LP-based policy π has the guarantee, for each job j, that E[Cj(π)] ≤ (3 + max{1, Δ(m−1)/m}) C̄j, and it is thereby a (3 + max{1, Δ(m−1)/m})-approximation algorithm for P|rj|E[Σ wj Cj]

  17. Example 2 Using the m-fast single-machine relaxation for P|pmtn, rj|E[Σ wj Cj] [Megow & Vredeveld]

  18. The Simplest Preemptive Models • 1| rj, pmtn | Σ_j Cj – Shortest Remaining Processing Time (SRPT) is optimal • 1| rj, pmtn | Σ_j wj Cj – the ratio rule (remaining processing time to weight) is a good approximation algorithm [Schulz & Skutella; Goemans, Wein, & Williamson] • 1| pmtn | E[Σ wj Cj] is solved by the Gittins Index Policy [Sevcik; Gittins; Weiss]
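SRPT is simple enough to simulate exactly. A small event-driven sketch (my own illustration; jobs are (release, processing) pairs and the machine always runs the job with least remaining time among released, unfinished jobs):

```python
import heapq

def srpt_total_completion(jobs):
    """SRPT on one machine with release dates; returns the total
    (unweighted) completion time, the objective SRPT minimizes."""
    jobs = sorted(jobs)                 # (release, processing), by release
    t, i, total = 0, 0, 0
    active = []                         # heap of remaining processing times
    n = len(jobs)
    while i < n or active:
        if not active and i < n and t < jobs[i][0]:
            t = jobs[i][0]              # idle until the next release
        while i < n and jobs[i][0] <= t:
            heapq.heappush(active, jobs[i][1])
            i += 1
        rem = heapq.heappop(active)
        # run the shortest job until it finishes or the next release arrives
        horizon = jobs[i][0] if i < n else float("inf")
        run = min(rem, horizon - t)
        t += run
        if run < rem:
            heapq.heappush(active, rem - run)   # preempted
        else:
            total += t                          # job completes at time t
    return total
```

For instance, with jobs (0, 5) and (1, 1), SRPT preempts the long job at time 1, giving total completion time 2 + 6 = 8, versus 5 + 6 = 11 without preemption.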

  19. A Useful Lower Bound for Σ wj Cj • Consider 1 machine, where the machine runs at speed s • With s constant, used to bound on-line algorithms for single-machine models [Kalyanasundaram & Pruhs; Phillips, Stein, Torng, Wein] • With s = m, used to bound performance for identical parallel machine models [Chekuri, Motwani, Natarajan, & Stein] – to see this, take the optimal m-machine schedule and “slice and dice” each interval without preemptions to yield a single-machine schedule of no greater objective • Still holds for the optimal policy for the stochastic model, with a somewhat more subtle proof

  20. What is the Gittins Index Policy? • Consider deterministic models as a special case • Even with unit weights – Shortest Remaining Processing Time (SRPT) is optimal • The analogous ratio rule is a 2-approximation algorithm with general weights • Why? Get the most completed weight per unit of effort spent processing jobs

  21. What is the Gittins Index Policy? 1| pmtn | E[Σ wj Cj] • Consider a job j that has not been processed yet • Suppose we process j until q time units pass or job j completes, whichever comes first • Expected contribution to the obj. fnc. = wj·Pr[pj ≤ q] • Expected cost of processing = E[min{pj, q}] • Index(j) = max_q wj·Pr[pj ≤ q]/E[min{pj, q}] • Q(j) = argmax_q wj·Pr[pj ≤ q]/E[min{pj, q}] • Policy = always process the job with the biggest Index
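For a processing-time distribution with finite support, the index is easy to compute, since the maximizing quantum q can be taken to be a support point. A sketch (my own illustration, assuming dist is a list of (value, probability) pairs):

```python
def gittins_index(w, dist):
    """Gittins index for 1|pmtn|E[sum w_j C_j]:
    max over quanta q of  w * Pr[p <= q] / E[min(p, q)].
    Only support points are tried as q, where the maximum is attained."""
    best = 0.0
    for q, _ in dist:
        pr = sum(prob for v, prob in dist if v <= q)        # Pr[p <= q]
        e_min = sum(prob * min(v, q) for v, prob in dist)   # E[min(p, q)]
        best = max(best, w * pr / e_min)
    return best
```

As a sanity check, a deterministic job (point mass at p) recovers Smith's ratio w/p; for a two-point distribution on {1, 10} with weight 1, the short quantum q = 1 wins, matching the intuition of getting the most completed weight per unit of processing effort.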

  22. Gittins Index Policy – More Details • Once job j has the biggest Index, it remains the job with the biggest Index until its first quantum Q(j) is done • For a job j already partially processed for t time units, compute the Index from the conditional processing time distribution, conditioned on pj > t • No longer optimal when there exist release dates (not surprising – the deterministic setting is a special case) • Needs a complete specification of the distribution • The same schedule is produced for a machine at speed m

  23. A Simple LB-Based Policy (LB = Lower Bound) For P|rj|E[Σ wj Cj]: • At each point in time, compute the m jobs with the highest Index value (w.r.t. the conditional remaining processing time distribution) • Run those jobs, or as many as still exist (a quantum argument implies that the schedule stays the same until the first quantum completes)

  24. Analysis of Performance Guarantee

  25. Analysis of Performance Guarantee • Not today – too many miles to go before I rest • But one can use a good characterization of the Gittins optimum for the single-machine input to bound the policy cost

  26. Theorem [Megow & Vredeveld] The Gittins Index Policy is a 2-approximation algorithm for P | pmtn | E[Σ wj Cj] • In fact, one can add release dates and get the same bound • Furthermore, the policy does not require knowledge of a job until it is released • So the performance guarantee is a mixture of competitive analysis (relative to knowing the existence of jobs at time 0) and stochastic analysis (the guarantee is relative to what can be attained by a non-anticipatory optimal policy)

  27. What is the Right Performance Guarantee for Stochastic Analysis? • We bound the expected objective function value obtained by a constant times the optimal policy's expected objective function value • But the inputs are random – on some we may do badly, and on some we may do fine • In particular, if the optimal value is “unexpectedly small” we can “afford” to produce a horrible schedule – is this just? • [Coffman & Gilbert; Scharbrodt, Schickinger, & Steger; Souza & Steger] instead bound the worst-case value of the expected ratio of the policy to the optimum

  28. Example 3 • Get rid of the independence assumption for the probabilistic content of jobs • A 2-stage-with-recourse model of max-weight on-time set with release dates [S & Sozio] [Dye, Tomasgard, & Stougie]

  29. Two-Stage Recourse Model Given : Probability distribution over inputs. Stage I : Make some advance decisions – plan ahead or hedge against uncertainty. Observe the actual input scenario. Stage II : Take recourse. Can augment earlier solution paying a recourse cost. Choose stage I decisions to minimize (stage I cost) + (expected stage II recourse cost).

  30. A 2-Stage Variant of 1| rj | Σ wj Uj • We consider the maximization version – or equivalently – the admission-control variant of max-weight on-time set • The probabilistic content is not the processing time, but whether a job exists or not • Potentially 2^n realizations – how do we specify the distribution? Three possible approaches: • Only a polynomial number of subsets A occur with probability pA > 0 • The existences of jobs are independent random events • Black-box model – we can draw independent samples to learn the distribution

  31. Maximum-weight on-time set • Jobs N = {1,2,…,n} – job j has a set of allowed time intervals Sj = {[s1j,e1j),…,[skj,ekj)} with corresponding weights wij • Deterministic problem: pick a maximum-weight collection of intervals, at most 1 per job and at most 1 at each time [figure: intervals over a time axis, drawn with height = weight]

  32. Maximum-weight on-time set [figure: the same instance, with the optimal selection shaded]

  33. Maximum-weight on-time set • Linear programming relaxation – let Tt be the set of intervals (over all jobs) containing time t, and let xij indicate whether [sij,eij) is selected for job j: Maximize Σ_{i,j} wij xij subject to Σ_i xij ≤ 1 for each j = 1,…,n; Σ_{(i,j)∈Tt} xij ≤ 1 for each t; xij ≥ 0 for each i,j • Theorem [Bar-Noy, Bar-Yehuda, Freund, Naor, & Schieber]: there is a primal-dual 2-approximation algorithm for the max-weight schedule
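For intuition about what the LP relaxes, tiny instances of the deterministic problem can be solved exactly by enumeration. A sketch (my own illustration; the input format, jobs[j] as a list of (start, end, weight) options, is hypothetical):

```python
from itertools import product

def overlap(a, b):
    """Do half-open intervals [s, e) overlap?"""
    return a[0] < b[1] and b[0] < a[1]

def best_on_time_set(jobs):
    """Exact max-weight on-time set by enumeration (tiny instances only):
    pick at most one (start, end, weight) option per job so that no two
    chosen intervals overlap; return the maximum total weight."""
    best = 0
    # choice c = -1 means "job not scheduled"; c >= 0 picks option c
    for choice in product(*[range(-1, len(opts)) for opts in jobs]):
        chosen = [jobs[j][c] for j, c in enumerate(choice) if c >= 0]
        if all(not overlap(a[:2], b[:2])
               for i, a in enumerate(chosen) for b in chosen[i + 1:]):
            best = max(best, sum(w for _, _, w in chosen))
    return best
```

For example, with job 1 having only [0, 2) at weight 3, and job 2 having [1, 3) at weight 2 or [3, 5) at weight 1, the optimum takes [0, 2) and [3, 5) for total weight 4.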

  34. 2-Stage Stochastic Variant • A scenario A ⊆ N of active jobs occurs with probability pA • Stage I: choose a set D ⊆ N of jobs to defer to a subcontractor, and receive a small weight πj for each j ∈ D • Stage II: given the realized scenario A, make a selection TA, where (i,j) ∈ TA requires j ∈ A−D and has weight Wij • Goal: maximize the total expected weight scheduled (where the expectation is with respect to the active-subset probabilities)

  35. A Primal-Dual Theorem • Theorem [S & Sozio]: We can (adapt the 2-approximation algorithm for the deterministic setting to) obtain a 2-approximation algorithm for stochastic maximum-weight on-time scheduling • We focus first on the polynomial-scenario model

  36. Dual Linear Program • Let Tt be the set of intervals (over all jobs) containing time t: Minimize Σ_j uj + Σ_t vt subject to uj + Σ_{t:(i,j)∈Tt} vt ≥ wij for each (i,j); uj, vt ≥ 0 • The primal-dual algorithm has two phases: first it constructs a feasible dual solution, while building a stack of possible pairs (i,j) to be selected; next it pops the stack, selecting any pair that doesn’t conflict with those already selected; amortization shows the dual cost is at most twice the value of the primal

  37. Linear Programming Relaxation for the 2-Stage Problem • Let Tt be the set of intervals (over all jobs) containing time t; xj indicates whether job j is deferred in stage I; yij(S) indicates whether [sij,eij) is selected for job j in stage II for scenario S: Maximize Σ_j πj xj + Σ_{i,j,S} pS Wij yij(S) subject to xj + Σ_i yij(S) ≤ 1 for each j,S; Σ_{(i,j)∈Tt} yij(S) ≤ 1 for each t,S; xj, yij(S) ≥ 0 for each i,j,S • DUAL: Minimize Σ_{j,S} uj(S) + Σ_{t,S} vt(S) subject to Σ_S uj(S) ≥ πj for each j; uj(S) + Σ_{t:(i,j)∈Tt} vt(S) ≥ pS Wij for each (i,j),S; uj(S), vt(S) ≥ 0

  38. A Simple 2-Stage Algorithm • For each scenario A ⊆ N with probability pA > 0, run the deterministic algorithm with job set A, where the weight of job j for [sij,eij) is pA Wij; let uj(A) denote the dual values constructed by the algorithm • Stage I: let D be the set of jobs j for which πj > Σ_A uj(A) • Stage II: given the realized scenario A, recompute the first phase of the algorithm (to get the duals), but in the second phase never select (i,j) for j ∈ D

  39. Main Idea of Analysis • What is the 2-stage dual? Block-structured by scenario, with additional linking constraints: each scenario block A1, A2, … is just like the deterministic problem, and the linking constraints Σ_S uj(S) ≥ πj tie the blocks together [figure: block-diagonal constraint matrix with linking rows] • So we can adapt the scenario-by-scenario construction as building a feasible dual solution for the 2-stage linear relaxation

  40. What about the black-box model? Just use sampling to estimate the deferral rule – use M samples • For each sampled scenario A ⊆ N, run the deterministic algorithm with job set A, where the weight of job j for [sij,eij) is Wij, to obtain dual values uj(A) – let Ak be the kth sample • Stage I: let D be the set of jobs j for which (1+ε) πj > (1/M) Σ_k uj(Ak) • Stage II: given the realized scenario A, compute TA by recomputing the first phase of the algorithm (to get the duals), but in the second phase never select (i,j) for j ∈ D • The number of samples needed is polynomial in n, 1/ε, and λ = max_j Wij/πj • Similar to the “sample average approximation” results of [Swamy & S; Shapiro & Nemirovski; Charikar, Chekuri, & Pál]

  41. Example 4 The a priori TSP – a stochastic sequencing model (again, the probabilistic content is the existence of a task) [S & Talwar]

  42. A priori optimization (no recourse) • Given: a probability distribution over inputs • In advance: compute a master plan • Observe the actual input scenario • In real time: adapt the master plan to the scenario • Goal: compute the master plan to minimize the expected real-time cost
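For the a priori TSP below, the objective of a fixed master tour can be computed exactly in the special case of independent node activations (my own illustrative sketch; the slides allow general distributions over active sets). The key observation: arc (i, j) appears in the shortcut tour exactly when i and j are both active and every node strictly between them in tour order is inactive.

```python
from math import hypot, prod

def expected_shortcut_length(points, order, probs):
    """Exact expected length of the master tour `order` after shortcutting,
    assuming point i is active independently with probability probs[i]."""
    n = len(order)
    d = lambda a, b: hypot(points[a][0] - points[b][0],
                           points[a][1] - points[b][1])
    exp = 0.0
    for s in range(n):
        for gap in range(1, n):
            i, j = order[s], order[(s + gap) % n]
            between = [order[(s + k) % n] for k in range(1, gap)]
            # arc i -> j used iff i, j active and everything between inactive
            pr = probs[i] * probs[j] * prod(1 - probs[k] for k in between)
            exp += pr * d(i, j)
    return exp
```

Sanity checks: with all activation probabilities 1 the value is the full tour length, and with at most one possibly-active point the shortcut tour has length 0.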

  43. The Traveling Salesman Problem (TSP) Given input points, compute a tour τ to minimize the total length c(τ)

  44. The Traveling Salesman Problem (TSP) [figure: a tour drawn through the input points]

  45. The A Priori TSP Given input points N and a distribution over active sets A ∈ 2^N – we need to specify the probability that a given set A is active

  46. The A Priori TSP [figure: the active nodes are highlighted]

  47. The A Priori TSP [figure: active nodes A] Given input points N and a distribution over active sets A ∈ 2^N, compute a master tour to minimize the expected length of that tour shortcut to serve only A

  48. The A Priori TSP [figure: the master tour shortcut to visit only the active nodes A]

  49. The A Priori TSP Given input points N and a distribution over active sets A ∈ 2^N, compute a tour τ to minimize the expected length E_A[c(τ_A)], where τ_A is the tour τ shortcut to serve only A

  50. The A Priori TSP [figure: for comparison, the optimal tour on the active set A alone] The shortcut tour τ_A need not be the optimal tour on A
