This paper explores the critical aspects of temporal planning and resource allocation while executing reactive, model-based programs in complex, dynamic environments. It introduces a new programming paradigm called RMPL for cooperative autonomous agents, focusing on reasoning about contingencies, scheduling, and system interactions. We examine the synthesis of temporal, causal link, and HTN planning, highlighting a novel planner implementation. The paper emphasizes the significance of quickly finding temporally consistent plans through a generalized temporal planning network (TPN) for effective execution of embedded programs.
Temporal Planning and Resource Allocation Stefanie Chiou, Rob Kochman, and Gary Look
Running Plans in the Real World • Need to account for time and resources when creating plans • Papers featured: • "Executing Reactive, Model-Based Programs through Graph-Based Temporal Planning" by Phil Kim, Brian C. Williams, and Mark Abramson (IJCAI ’01) • "Managing Multiple Tasks in Complex, Dynamic Environments" by Michael Freed (AAAI ’98).
Paper • Executing Reactive, Model-based Programs through Graph-based Temporal Planning by Phil Kim, Brian Williams, and Mark Abramson
Familiar Examples Mars Climate Orbiter: 12/11/98 Mars Polar Lander: 1/3/99
Motivation • Embedded programming is hard • Easier to reason about state when programming
Overview/Contributions • RMPL provides a new programming paradigm for programming robust systems of cooperative autonomous agents • TPN -> synthesis of temporal, causal link, and HTN planning • A “holy grail” for autonomous agents • Planner that implements these ideas
RMPL Intro • RMPL supports four types of reasoning about system interactions • reasoning about contingencies • scheduling • inferring hidden state • controlling hidden state • This paper focuses on the first two interaction types
(Model-based) Embedded Programs • Embedded programs interact with plant sensors/actuators: • Read sensors • Set actuators • Programmer must map between state and sensors/actuators. • Model-based programs interact with plant state: • Read state (getState) • Write state (setState) • Model-based executive maps between sensors/actuators and states. (Figure: an embedded program exchanging observations (Obs) and control (Cntrl) with the plant, beside a model-based embedded program reading and writing plant state S.)
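Model-based Embedded Program Breakdown • The model-based embedded program uses setState and getState. • The model-based executive maps between sensors/actuators and states. (Figure: model-based embedded program, model-based executive, and plant with state S, with sensor data and actuator commands linking the executive and the plant.)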
Example: The model-based program sets engine = thrusting, and the deductive controller . . . • Deduces that thrust is off and the engine is healthy • Plans actions to open six valves • Deduces that a valve failed (stuck closed) • Determines that valves on the backup engine will achieve thrust, and plans the needed actions. (Figure labels: oxidizer tank, fuel tank.)
Time and Contingency Constructs in RMPL • if c thennext A • do A maintaining C • A,B (concurrency) • A;B (serialization) • A[l,u] (temporal bounds) • Choose{A,B} (choose)
RMPL Code Example: choosing a route from A to B (Path 1 or Path 2)
Group-Enroute()[l,u] = {
  choose {
    do {
      Group-Fly-Path(PATH1) [l*90%,u*90%];
    } maintaining PATH1_OK,
    do {
      Group-Fly-Path(PATH2) [l*90%,u*90%];
    } maintaining PATH2_OK
  };
  {
    Group-Transmit(FAC,ARRIVED_TAI)[0,2],
    do {
      Group-Wait(TAI_HOLD1,TAI_HOLD2)[0,u*10%]
    } watching PROCEED_OK
  }
}
RMPL’s Representation of Time and Contingencies • Important to find a plan quickly • Idea: use a plan graph • Generalization of Simple Temporal Network (STN) • TPN defined (STN + conditionals + choices)
STN example (Figure: a simple temporal network running from a Start node to an End node.)
Temporal Planning Networks (TPN) • A temporal planning network is just a generalization of a STN • Includes ability to represent conditionals and choices
TPN Example (Figure: a temporal planning network that includes the condition Ask(Proceed=Ok).)
RMPL -> TPN conversion • A [l,u]: invoke activity A between l and u time units
RMPL -> TPN conversion • c [l,u]: Assert that condition c holds from now for a duration within [l,u]
RMPL -> TPN conversion • if c thennext A [l,u]: Execute A for [l,u] if condition c is currently satisfied
RMPL -> TPN conversion • do A [l,u] maintaining c: Execute A for [l,u], and ensure that condition c holds throughout
RMPL -> TPN conversion • A [l1,u1], B [l2,u2] : Concurrently execute A for [l1,u1], and B for [l2,u2]
RMPL -> TPN conversion • A [l1,u1]; B [l2,u2] : Execute A for [l1,u1], and then B for [l2,u2]
RMPL -> TPN conversion • choose {A [l1,u1], B [l2,u2]} : Reduces non-deterministically to either A [l1,u1] or B [l2,u2]
Kirk • Compiles RMPL program into a TPN • Searches TPN for a temporally consistent plan • Temporally consistent plan is “embedded” into the TPN.
Kirk Phase 1 • Select plan from TPN • Essentially a graph traversal • Check plan for temporal consistency
Selecting the Plan (Figure: plan selection in the TPN, beginning at the Start node.)
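The slides describe plan selection only as "essentially a graph traversal", so the following is a rough sketch rather than Kirk's actual search. It assumes a hypothetical nested-tuple encoding of a program ("act", "seq", "par", "choose" are names invented here) and simply enumerates every candidate plan obtained by committing to one branch at each choose.

```python
from itertools import product
from typing import Iterator

# Hypothetical encoding (not from the paper):
#   ("act", name)            a primitive activity
#   ("seq", [children])      A; B
#   ("par", [children])      A, B
#   ("choose", [children])   choose {A, B}

def candidate_plans(node) -> Iterator[tuple]:
    """Yield every choice-free plan obtainable from `node`."""
    kind = node[0]
    if kind == "act":
        yield node
    elif kind == "choose":
        # Committing to a choice means picking exactly one branch.
        for branch in node[1]:
            yield from candidate_plans(branch)
    else:  # "seq" or "par": combine one candidate from each child
        options = [list(candidate_plans(child)) for child in node[1]]
        for combo in product(*options):
            yield (kind, list(combo))

program = ("seq", [
    ("choose", [("act", "fly-path1"), ("act", "fly-path2")]),
    ("act", "transmit"),
])
plans = list(candidate_plans(program))
```

Each candidate would then be checked for temporal consistency; a real planner would interleave the two steps instead of enumerating everything up front.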
Checking for Temporal Consistency • Convert TPN to a distance graph • Run Bellman-Ford to check for negative cycles (if any found, inconsistent)
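Converting TPNs to Distance Graphs • The interval [a_ij, b_ij] represents the statement: a_ij ≤ T_j - T_i ≤ b_ij • This is equivalent to: T_j - T_i ≤ b_ij and T_i - T_j ≤ -a_ij (Figure: a four-node network with interval labels such as [30,40], [10,20], [40,50], and [60,70], beside its distance graph, in which each interval [a,b] becomes a forward edge of weight b and a reverse edge of weight -a.)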
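The conversion rule above can be sketched in a few lines. This is a minimal illustration, assuming constraints are stored in a dictionary keyed by node pairs; the function name and the node pairs in the example are made up here, and only the rule itself comes from the slide.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]  # (from_node, to_node, weight)

def to_distance_graph(
    constraints: Dict[Tuple[int, int], Tuple[float, float]]
) -> List[Edge]:
    """Each entry (i, j) -> (a, b) encodes a <= T_j - T_i <= b."""
    edges: List[Edge] = []
    for (i, j), (a, b) in constraints.items():
        edges.append((i, j, b))    # T_j - T_i <= b
        edges.append((j, i, -a))   # T_i - T_j <= -a
    return edges

# Illustrative node pairs (assumed, not read off the figure).
example_edges = to_distance_graph({(1, 2): (30, 40), (1, 3): (10, 20)})
```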
Checking for Temporal Consistency • Convert TPN to a distance graph • Run Bellman-Ford algorithm to check for negative cycles:
Bellman-Ford Algorithm
initializeCosts(G, s)
for i = 1 to |V(G)| - 1
    for each edge (u, v) in E(G)
        updateCost(u, v, w)
for each edge (u, v) in E(G)
    if cost(v) > cost(u) + w(u, v)
        return false    (a negative cycle exists: the plan is inconsistent)
return true
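A runnable version of the consistency check might look like the sketch below. It is not the authors' code: the function name is invented, and instead of a single source it starts every node at cost 0 (equivalent to a virtual source with zero-weight edges to all nodes), so a negative cycle anywhere in the graph is detected.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int, float]  # (from_node, to_node, weight)

def is_temporally_consistent(nodes: Iterable[int], edges: List[Edge]) -> bool:
    """Return False if the distance graph contains a negative cycle."""
    cost = {n: 0.0 for n in nodes}  # virtual-source initialization
    for _ in range(len(cost) - 1):
        changed = False
        for u, v, w in edges:
            if cost[u] + w < cost[v]:   # relax edge (u, v)
                cost[v] = cost[u] + w
                changed = True
        if not changed:                 # already converged
            break
    # Any edge that can still be relaxed lies on or after a negative cycle.
    return all(cost[u] + w >= cost[v] for u, v, w in edges)

# [30,40] between nodes 1 and 2: edges 1->2 (40) and 2->1 (-30).
ok = is_temporally_consistent([1, 2], [(1, 2, 40), (2, 1, -30)])
# Contradictory bounds (upper 20, lower 30) form a cycle of weight -10.
bad = is_temporally_consistent([1, 2], [(1, 2, 20), (2, 1, -30)])
```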
Bellman-Ford Example (Figure sequence, eight frames: node costs are updated step by step as Bellman-Ford relaxes the edges of the distance graph outward from the Source node.)
Kirk Phase 2 • Resolve threats and open conditions • Analogous to threats and open conditions in causal link planning • Identify intervals of inconsistent constraints using Floyd-Warshall • Order intervals to resolve threats • Close open conditions by making sure each one is satisfied by some action in the plan
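The slides do not say exactly how Kirk applies Floyd-Warshall, so the sketch below shows only the standard all-pairs computation on a distance graph and how feasible time windows between events can be read off the result. The function names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]  # (from_node, to_node, weight)

def all_pairs_shortest(nodes: List[int], edges: List[Edge]) -> Dict[Tuple[int, int], float]:
    """Floyd-Warshall over a distance graph."""
    d = {(i, j): (0.0 if i == j else math.inf) for i in nodes for j in nodes}
    for u, v, w in edges:
        d[(u, v)] = min(d[(u, v)], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[(i, k)] + d[(k, j)] < d[(i, j)]:
                    d[(i, j)] = d[(i, k)] + d[(k, j)]
    return d

def feasible_window(d: Dict[Tuple[int, int], float], i: int, j: int) -> Tuple[float, float]:
    """Tightest bounds on T_j - T_i implied by all constraints."""
    return (-d[(j, i)], d[(i, j)])

# Two chained activities, each lasting [10,20]: 1 -> 2 -> 3.
chain = [(1, 2, 20), (2, 1, -10), (2, 3, 20), (3, 2, -10)]
dist = all_pairs_shortest([1, 2, 3], chain)
window = feasible_window(dist, 1, 3)
```

Knowing the window in which each activity can occur makes it possible to tell which activities might overlap, which is the kind of information needed before ordering them.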
Why This Paper? • It’s useful for our term project
Vision • "Managing Multiple Tasks in Complex, Dynamic Environments" by Michael Freed (AAAI ’98). • Achieve goals in “task environments” • Complex • Time-pressured • Uncertain • Co-existing/Interacting
APEX Goal: ATC • Goal: simulate human air traffic controllers • Largely routine activity • Complexity due to many simple tasks • Interruptions necessary
Resource Conflicts • Separate tasks make incompatible demands • What to do? • Determine relative priority of tasks • Assign control to winner • Deal with the loser
Conflict Resolution Strategies • Shed • Eliminate low importance tasks • When (Demand > Availability) • Delay/Interrupt • Introduces complications • Circumvent • Select methods that use different resources
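As a rough illustration of the three strategies, the sketch below picks a winner by priority and then decides what happens to the loser. Everything here is hypothetical: the Task fields, the numeric priorities, the shed threshold, and the order in which strategies are tried are assumptions, not APEX's actual mechanism.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Task:
    name: str
    priority: float                     # hypothetical numeric priority
    resource: str                       # resource the preferred method needs
    alt_resource: Optional[str] = None  # resource of an alternative method, if any

def resolve(a: Task, b: Task, shed_below: float = 0.2) -> Tuple[str, str]:
    """Return (winner name, outcome for the loser)."""
    winner, loser = (a, b) if a.priority >= b.priority else (b, a)
    if loser.resource != winner.resource:
        return winner.name, "no conflict"
    if loser.alt_resource and loser.alt_resource != winner.resource:
        return winner.name, "circumvent"   # use a different resource
    if loser.priority < shed_below:
        return winner.name, "shed"         # drop a low-importance task
    return winner.name, "delay"            # wait for the resource

result = resolve(
    Task("read-display", 0.9, "gaze"),
    Task("check-clock", 0.5, "gaze"),
)
```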
APEX Architecture: Two Parts • Resource Architecture • Set of resources • Cognitive • Perceptual • Motor • Action Selection Component (Figure: the Action Selection Component and the Resource Architecture are linked by commands and events; the Resource Architecture is linked to the World by perception and actuators.)
Procedure Definition Language (PDL) Example: Turning on headlights
Procedure Definition Language (PDL) Example: Turning on headlights
(procedure
  (index (turn-on-headlights))
  (step s1 (clear-hand left-hand))
  (step s2 (determine-loc headlight-ctl => ?loc))
  (step s3 (grasp knob left-hand ?loc) (waitfor ?s1 ?s2))
  (step s4 (pull knob left-hand 0.4) (waitfor ?s3))
  (step s5 (ungrasp left-hand) (waitfor ?s4))
  (step s6 (terminate) (waitfor ?s5)))
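The waitfor clauses in the headlights procedure define a partial order over steps: s1 and s2 have no preconditions, while each later step waits for earlier ones. The sketch below is not APEX code; it only illustrates that ordering in Python, using the standard-library graphlib module and a hand-written dictionary of the waitfor dependencies.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Step -> steps it waits for, transcribed from the procedure above.
waits: Dict[str, List[str]] = {
    "s1": [], "s2": [],
    "s3": ["s1", "s2"],
    "s4": ["s3"], "s5": ["s4"], "s6": ["s5"],
}

def execution_waves(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group steps into waves; steps in one wave have all waitfors met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

waves = execution_waves(waits)
```

The first wave contains both s1 and s2, reflecting that clearing the hand and locating the control are not ordered relative to each other.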