A Scheduling Service Oriented Approach for Workflow Scheduling

by Conan Fan Li. Supervisor: Dr. Wendy MacCaull. Committee members: Dr. Man Lin, Dr. Iker Gondra.


Presentation Transcript


  1. A Scheduling Service Oriented Approach for Workflow Scheduling by Conan Fan Li Supervisor: Dr. Wendy MacCaull Committee member: Dr. Man Lin Committee member: Dr. Iker Gondra

  2. An SSO approach for workflow scheduling • Initiatives • AIF project: building decision support through dynamic workflow systems, with academia and industry working together for better healthcare • Workflow engines are not naturally built for scheduling problems

  3. What is a workflow? • A workflow or a workflow model is a depiction of a business process composed of a sequence of operations (tasks). • Tasks are connected in the form of a directed graph to provide an abstraction of the real work for further assessment.

  4. What is a workflow?

  5. What is a workflow?

  6. Why workflow? • Abstraction & visualization • Clarity & consistency • Automation

  7. What is scheduling? • The process of deciding how to allocate resources to a number of tasks in order to achieve one or more objectives. • Two main applications

  8. Application of scheduling • Manufacturing (e.g., a car factory)

  9. Application of scheduling Service industry (e.g., Gate Assignments at an Airport)

  10. Basics of Scheduling • A scheduling problem can be described by a triplet α | β | γ. • α describes the machine environment • β describes the processing characteristics and constraints • γ describes the objective • The triplet Jm | prec | Cmax describes a job shop scheduling problem with precedence constraints and an objective to minimize the makespan

  11. A Job Shop Example

  12. The SSO approach • An approach that maps a scheduling problem onto a workflow model that consists of tasks with built-in services for scheduling.

  13. Definitions • A schedule has unforced idleness if some machine is idle while jobs are waiting for processing. A schedule is non-delay if unforced idleness is prohibited. • Examples of possible objective functions to be minimized are: • Makespan: completion time of the last job to leave the system. • Maximum lateness: the worst violation of the due dates. • Total weighted completion time: the sum of the weighted completion times of the n jobs. • ... • A multi-instance task is a task that may have multiple distinct execution instances running concurrently within the same workflow case.
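The three objective functions named on this slide can be sketched in a few lines of Python. This is only an illustration: the `JobResult` record and its field names are assumptions made for the example, and lateness is taken as completion time minus due date.

```python
from dataclasses import dataclass


@dataclass
class JobResult:
    # Hypothetical record for one finished job (names are illustrative).
    completion: float      # completion time C_j
    due: float             # due date d_j
    weight: float = 1.0    # weight w_j


def makespan(jobs):
    """Completion time of the last job to leave the system."""
    return max(j.completion for j in jobs)


def max_lateness(jobs):
    """Worst violation of the due dates, with lateness = C_j - d_j."""
    return max(j.completion - j.due for j in jobs)


def total_weighted_completion(jobs):
    """Sum of w_j * C_j over all jobs."""
    return sum(j.weight * j.completion for j in jobs)


results = [JobResult(completion=7, due=10, weight=2),
           JobResult(completion=16, due=12, weight=1)]
print(makespan(results), max_lateness(results), total_weighted_completion(results))
```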

  14. Definitions • We say a job is waiting when it is neither assigned to a machine nor finished. • We say a job is independent when it has no precedence constraint or its precedence constraint is satisfied (i.e., the preceding job is finished). • We say a job is machine-ready when its required machine is free. • We say a job is enabled when it is waiting, independent and machine-ready.
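The four job states defined above translate directly into predicates. The sketch below assumes a hypothetical `Job` record with at most one predecessor and a set of currently busy machines; none of these names come from the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical job record; at most one predecessor is assumed.
    name: str
    machine: int
    predecessor: Optional[str] = None
    assigned: bool = False
    finished: bool = False


def is_waiting(job):
    """Neither assigned to a machine nor finished."""
    return not job.assigned and not job.finished


def is_independent(job, jobs):
    """No precedence constraint, or the preceding job is finished."""
    return job.predecessor is None or jobs[job.predecessor].finished


def is_machine_ready(job, busy_machines):
    """The required machine is free."""
    return job.machine not in busy_machines


def is_enabled(job, jobs, busy_machines):
    """Waiting, independent and machine-ready at the same time."""
    return (is_waiting(job)
            and is_independent(job, jobs)
            and is_machine_ready(job, busy_machines))


jobs = {"a": Job("a", machine=0), "b": Job("b", machine=1, predecessor="a")}
print(is_enabled(jobs["a"], jobs, set()))  # "a" has no predecessor
print(is_enabled(jobs["b"], jobs, set()))  # "b" must wait for "a"
```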

  15. Definitions • Schedule-flow is a framework designed to model scheduling processes in workflow. Schedule-flow patterns are an extension to the workflow language formalism, Workflow Patterns. By assembling and modifying the existing workflow patterns, schedule-flow introduces new patterns that carry particular responsibilities and services in scheduling systems.

  16. Definitions • Schedule-flow is a framework designed to model scheduling processes in workflow. It is an extension to the workflow language formalism, Workflow Patterns. By assembling and modifying the existing workflow patterns, schedule-flow introduces new patterns that carry particular responsibilities and services in scheduling systems. • A* search uses a distance-plus-cost evaluation function f(x) to determine the order in which the search visits nodes in the fringe. The function is the sum of two parts, f(x) = g(x) + h(x): • the cost function g(x), which is the cost from the starting node to the current node • and a heuristic estimate h(x) of the distance to the goal.
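To make the evaluation function concrete, here is a minimal A* sketch on a tiny graph. The graph, the edge costs and the heuristic table are all invented for illustration; the fringe is a binary heap ordered by f = g + h.

```python
import heapq


def a_star(graph, h, start, goal):
    """Return (cost, path) of a cheapest path, or None if the goal is unreachable."""
    fringe = [(h[start], 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}
    while fringe:
        _f, g, node, path = heapq.heappop(fringe)   # node with the lowest f
        if node == goal:
            return g, path
        for nxt, cost in graph.get(node, {}).items():
            g2 = g + cost                            # cost from start to nxt
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(fringe, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None


# Hypothetical example data: edge costs and a heuristic that never overestimates.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 3}}
H = {"S": 5, "A": 4, "B": 3, "G": 0}

print(a_star(GRAPH, H, "S", "G"))
```

With h set to 0 everywhere, the same loop degenerates to uniform-cost search, which is the "f(x)=g(x)" starting point mentioned later in the presentation.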

  17. Definitions • We say an event e=(m,j,start,end) is a future event of schedule S if e.start >= S.clock
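The future-event definition is a simple filter over a schedule's events. The sketch below assumes events are stored as `(machine, job, start, end)` tuples, as in the definition; the `Event` type and the function name are illustrative. The sample events are the ones listed in the job shop attempt later in the presentation.

```python
from typing import NamedTuple


class Event(NamedTuple):
    machine: int
    job: int
    start: int
    end: int


def future_events(clock, events):
    """Events e with e.start >= clock, in their original order."""
    return [e for e in events if e.start >= clock]


EVENTS = [Event(0, 0, 0, 7), Event(0, 4, 7, 22), Event(0, 2, 22, 43),
          Event(1, 1, 7, 16), Event(1, 5, 22, 52), Event(1, 3, 52, 57)]

print(future_events(22, EVENTS))
```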

  18. Relation • Scheduling is a ___ and a workflow is to represent a ___. • Why use workflow to model scheduling? • Workflow is concise, comprehensive and high-level • Scheduling is diverse, technical and low-level • We want to visualize the scheduling process • Workflow has a wider audience than scheduling does, which is why we need to bridge the two.

  19. Attempt • Case: a simple job shop scheduling problem • Objective: minimize makespan (Cmax)

  20. Attempt Messy What you see is messy What you do not see is messier

  21. Attempt

  22. Attempt 2. Multi-instance tasks are confusing and high-maintenance 3. Need to incorporate smart choices. We may choose Job0 to process first. Why does this seem like a smart choice?

  23. Attempt Finish Job0 Clock+=7 Job0.waiting=False

  24. Attempt 4. Options for unforced idleness We want to assign as many jobs as possible before processing

  25. Attempt We may choose Job4 to process

  26. Attempt Finish Job1 Clock=7+9=16 Job1.waiting=False (Note: Job4 has been processed for 9 time units)

  27. Attempt No other jobs are available to process except Job4

  28. Attempt Finish Job4 Clock=16+(15-9)=22 Job4.waiting=False

  29. Attempt Assign Job2

  30. Attempt • … we get this: (57, {<0,0,0,7>, <0,4,7,22>, <0,2,22,43>, <1,1,7,16>, <1,5,22,52>, <1,3,52,57>}) • The first element (57) is the clock attribute of the schedule • An event: <machine, job, start, end>

  31. Attempt • … we get this: (57, {<0,0,0,7>, <0,4,7,22>, <0,2,22,43>, <1,1,7,16>, <1,5,22,52>, <1,3,52,57>}) Gantt chart Forced idleness
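The resulting schedule can be checked mechanically. The sketch below stores it as a `(clock, events)` pair, confirms that the clock equals the latest end time, and lists the idle gaps on each machine, which is the information the Gantt chart conveys. The function names are illustrative, and the code cannot tell forced from unforced idleness because the precedence constraints are not part of the event list.

```python
from collections import defaultdict

# The schedule from the slide: (clock, events), each event <machine, job, start, end>.
SCHEDULE = (57, [(0, 0, 0, 7), (0, 4, 7, 22), (0, 2, 22, 43),
                 (1, 1, 7, 16), (1, 5, 22, 52), (1, 3, 52, 57)])


def last_end(events):
    """Latest end time over all events, i.e. the makespan of a complete schedule."""
    return max(e[3] for e in events)


def idle_gaps(events):
    """Per machine, the intervals between time 0 and its last event with no job running."""
    by_machine = defaultdict(list)
    for machine, _job, start, end in events:
        by_machine[machine].append((start, end))
    gaps = {}
    for machine, spans in by_machine.items():
        t, found = 0, []
        for start, end in sorted(spans):
            if start > t:
                found.append((t, start))
            t = max(t, end)
        gaps[machine] = found
    return gaps


clock, events = SCHEDULE
print(last_end(events) == clock)
print(idle_gaps(events))
```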

  32. Problems • The size issue • The size of the resulting workflow grows with the size of the problem (number of jobs and number of machines). There are also too many variables to configure inside the workflow. This is neither concise nor comprehensive. • The complication of multi-instance tasks • Users may not understand when to use them • The lack of heuristic incorporation • When several jobs are presented to a machine, there should be a mechanism to decide which one appears to be the best option

  33. Problems • The lack of options for unforced idleness • There should be an easy way of expressing that we do not want to allow unforced idleness, that is, when a machine is free, we always try to assign a job to it if possible. • The lack of comparison and sorting • We do not want the workflow to stop as soon as it finds one feasible schedule. Instead, we need it to compare all the schedules and present the best one. There should be a mechanism to easily compare and sort the schedules.

  34. Proposal • The current workflow components are clearly not sufficient for constructing sophisticated schedulers. Therefore, we need a set of new workflow patterns to provide the services we need in scheduling. We call the extension schedule-flow patterns.

  35. Schedule-flow • Present the data (jobs and machines) in a single file instead of mapping each one of them to a task in the workflow. This way, the size of the workflow will not be proportional to the size of the scheduling problem. More importantly, the same workflow can now work with different sets of data. • Eliminate the usage of multi-instance tasks. Instead, we use a data structure called “fringe” (a collection of ideas, see A*), which is implemented as a priority queue. Different execution instances (schedules) will be evaluated first and then pushed into the fringe. • Users may supply heuristics that express assignment preferences. For example, we may want to assign the jobs with the shortest processing times first (SPT).
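A fringe of this kind can be sketched with Python's `heapq` module. The class below is an assumption about what such a structure might look like, not the implementation used in the presentation: schedules are taken to be `(clock, events)` pairs, and the default priority rule (lowest clock first) is chosen only for the example.

```python
import heapq
from itertools import count


class Fringe:
    """Priority queue of schedules; the priority rule is configurable."""

    def __init__(self, priority=lambda schedule: schedule[0]):
        self._heap = []
        self._priority = priority
        self._tie = count()   # tie-breaker so schedules themselves are never compared

    def push(self, schedules):
        """Evaluate each schedule and insert it according to its priority."""
        for s in schedules:
            heapq.heappush(self._heap, (self._priority(s), next(self._tie), s))

    def pop(self):
        """Remove and return the schedule with the lowest priority value."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


fringe = Fringe()
fringe.push([(5, "late"), (0, "early"), (3, "mid")])
print(fringe.pop())
```

Passing a different `priority` function changes the ordering without touching the rest of the workflow, which is the kind of option the fringe is meant to expose.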

  36. Schedule-flow • By default, we do not allow unforced idleness. We would like to keep the machines as busy as possible. • “For many of the models that have regular objective functions, there are optimal schedules that are non-delay” • The fringe provides options for its priority rule, which is the order in which the schedules are sorted.

  37. Schedule-flow For the same problem, use schedule-flow: Built-in variables: fringe, current

  38. Schedule-flow task components • Creation • Does nothing • Pop • current=fringe.pop() • Selection • By default, select jobs that are enabled (or independent, machine-ready…) • Allocation • Generate schedules by assigning jobs to corresponding machines • Process • Generate one schedule by advancing time until any job is finished • Push • fringe.push(schedules)
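The way these components fit together can be sketched as a single loop. Everything below is a simplified assumption rather than the presenter's implementation: the job data is hypothetical, schedules are `(clock, events)` pairs, the fringe is ordered by clock only (heuristic 0), and the loop stops at the first complete schedule it pops instead of running until the fringe is empty.

```python
import heapq
from itertools import count

# Hypothetical data: job -> (machine, processing time, predecessor or None)
JOBS = {"A": (0, 3, None), "B": (0, 2, None), "C": (1, 4, "A")}


def select(clock, events):
    """Selection: jobs that are waiting, independent and machine-ready."""
    assigned = {job for _, job, _, _ in events}
    finished = {job for _, job, _, end in events if end <= clock}
    busy = {m for m, _, _, end in events if end > clock}
    return [job for job, (m, _, pred) in JOBS.items()
            if job not in assigned
            and (pred is None or pred in finished)
            and m not in busy]


def allocate(clock, events, enabled):
    """Allocation: one new schedule per enabled job, started at the current clock."""
    out = []
    for job in enabled:
        m, p, _ = JOBS[job]
        out.append((clock, events + ((m, job, clock, clock + p),)))
    return out


def process(clock, events):
    """Process: advance the clock until the next running job finishes."""
    ends = [end for _, _, _, end in events if end > clock]
    return (min(ends), events) if ends else None


def run():
    tie = count()
    fringe = [(0, next(tie), ())]                      # Creation: empty schedule
    while fringe:                                      # Continue
        clock, _, events = heapq.heappop(fringe)       # Pop
        if len(events) == len(JOBS) and all(e[3] <= clock for e in events):
            return clock, events                       # every job is finished
        enabled = select(clock, events)                # Selection
        if enabled:                                    # Exist
            successors = allocate(clock, events, enabled)
        else:
            nxt = process(clock, events)
            successors = [nxt] if nxt else []          # dead end: drop the schedule
        for c, ev in successors:                       # Push
            heapq.heappush(fringe, (c, next(tie), ev))
    return None


print(run())
```

Because a job is assigned whenever one is enabled, only non-delay schedules are generated, which matches the default described earlier.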

  39. Schedule-flow condition components • Exist • Tests if the input exists • Continue • By default, tests if the fringe is empty. Other options may be used such as limiting the execution time or the number of schedules generated…

  40. Use Schedule-flow in practice • Design a schedule-flow with a graphic editor • We used YAWL in this case.

  41. Use Schedule-flow in practice • Provide the files • Job file (e.g., “jobs.csv”) • Machine file • Schedule-flow XML file generated by the graphic editor • Parse the schedule-flow file and detect schedule-flow components by matching names. • We used a Python script to parse YAWL’s XML file. • Example task element from the XML file:
      <task id="POP_93">
        <name>POP</name>
        <flowsInto>
          <nextElementRef id="SELECT_87" />
        </flowsInto>
        <join code="xor" />
        <split code="and" />
      </task>
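Detecting components by name can be sketched with the standard `xml.etree.ElementTree` module. The script used in the presentation is not shown, so this is only a guess at the idea: the wrapping `<net>` element and the set of recognised names (other than `POP`) are assumptions, and a complete YAWL file may declare an XML namespace, in which case the tag lookups would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# The task element from the slide, wrapped in a hypothetical root element.
SNIPPET = """
<net>
  <task id="POP_93">
    <name>POP</name>
    <flowsInto>
      <nextElementRef id="SELECT_87" />
    </flowsInto>
    <join code="xor" />
    <split code="and" />
  </task>
</net>
"""

# Assumed component names; only POP appears as a <name> in the source.
KNOWN_COMPONENTS = {"CREATE", "POP", "SELECT", "ALLOCATE",
                    "PROCESS", "PUSH", "EXIST", "CONTINUE"}


def find_components(xml_text):
    """Map task id -> (component name, ids of the tasks it flows into)."""
    root = ET.fromstring(xml_text)
    found = {}
    for task in root.iter("task"):
        name = (task.findtext("name") or "").strip().upper()
        if name in KNOWN_COMPONENTS:
            successors = [ref.get("id") for ref in task.iter("nextElementRef")]
            found[task.get("id")] = (name, successors)
    return found


print(find_components(SNIPPET))
```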

  42. Use Schedule-flow in practice • Configure the components, objective and heuristic • In this case, every component uses the default setting • The objective is set to minimize the makespan. Therefore, the cost function is g(S)=S.clock • The heuristic is initially 0 (f(x)=g(x), see A*). For this case, we set the heuristic function to return the processing time of the future event (see future event) with the earliest end time • Bind: automatically connect the components according to the schedule-flow • Run the schedule-flow

  43. Schedule-flow simulation S0 is the initial schedule where no jobs are assigned. S0={ }

  44. Schedule-flow simulation

  45. Schedule-flow simulation

  46. Schedule-flow simulation

  47. Schedule-flow simulation S1: (0,{<0,0,0,7>}) S2:(0,{<0,2,0,21>}) S3:(0,{<0,4,0,15>})

  48. Schedule-flow simulation f(S1)=0+7=7 f(S3)=0+15=15 f(S2)=0+21=21 S1: (0,{<0,0,0,7>}) S3: (0,{<0,4,0,15>}) S2: (0,{<0,2,0,21>})
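The three evaluations on this slide can be reproduced with a short sketch, assuming the configuration described earlier: the cost g(S) is the schedule's clock, and the heuristic h(S) is the processing time of the future event with the earliest end time. The function names and the tuple layout are illustrative.

```python
def g(schedule):
    """Cost so far: the schedule's clock."""
    clock, _events = schedule
    return clock


def h(schedule):
    """Processing time of the future event with the earliest end time (0 if none)."""
    clock, events = schedule
    future = [e for e in events if e[2] >= clock]   # e = (machine, job, start, end)
    if not future:
        return 0
    _machine, _job, start, end = min(future, key=lambda e: e[3])
    return end - start


def f(schedule):
    return g(schedule) + h(schedule)


CANDIDATES = {
    "S1": (0, [(0, 0, 0, 7)]),
    "S2": (0, [(0, 2, 0, 21)]),
    "S3": (0, [(0, 4, 0, 15)]),
}

ranked = sorted(CANDIDATES, key=lambda name: f(CANDIDATES[name]))
print([(name, f(CANDIDATES[name])) for name in ranked])
```

Sorting by f gives the order S1, S3, S2, which is the order in which the slide lists the schedules in the fringe.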

  49. Schedule-flow simulation

  50. Schedule-flow simulation
