Planning and Execution with Phase Transitions

Planning and Execution with Phase Transitions. Håkan L. S. Younes, Carnegie Mellon University. Follow-up paper to Younes & Simmons’ “Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions” (AAAI’04). Planning with Time and Probability. Temporal uncertainty


Presentation Transcript


  1. Planning and Execution with Phase Transitions. Håkan L. S. Younes, Carnegie Mellon University. Follow-up paper to Younes & Simmons’ “Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions” (AAAI’04)

  2. Planning with Time and Probability • Temporal uncertainty • When to schedule service? • [Diagram: states Working, Serviced, and Failed; transitions Service (F1), Return (F2), and Fail (F3)] • Memoryless distributions ⇒ MDP

  3. The Role of Phases • Memoryless property of exponential distribution makes MDPs tractable • Many phenomena are not memoryless • Lifetime of product or computer process • Phases introduce memory into state space • Modeling tool, not part of real world (phases are hidden variables)

  4. Phase-Type Distributions • Time to absorption in Markov chain with n transient states and one absorbing state • [Diagrams: exponential (single phase with rate λ); two-phase Coxian (phase 1 moves to phase 2 at rate pλ1 or is absorbed at rate (1 – p)λ1; phase 2 is absorbed at rate λ2); n-phase Erlang (phases 1, 2, …, n in series, each with rate λ)]
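
The "time to absorption" definition on slide 4 can be made concrete by simulating the two chains directly. The following is a minimal sketch, not code from the talk: the function names and the numeric parameters are illustrative assumptions, and the two-phase Coxian is read as "leave phase 1 at rate λ1, then continue to phase 2 with probability p".

```python
import random


def sample_coxian2(p, rate1, rate2, rng):
    """Draw one absorption time from a two-phase Coxian distribution.

    Phase 1 lasts an exponential time with rate `rate1`. With probability
    `p` the chain continues to phase 2 (exponential with rate `rate2`);
    otherwise it is absorbed right after phase 1.
    """
    t = rng.expovariate(rate1)
    if rng.random() < p:
        t += rng.expovariate(rate2)
    return t


def sample_erlang(n, rate, rng):
    """Draw one absorption time from an n-phase Erlang distribution,
    i.e. n exponential phases with the same rate visited in series."""
    return sum(rng.expovariate(rate) for _ in range(n))


def coxian2_mean(p, rate1, rate2):
    """Analytic mean absorption time of the two-phase Coxian."""
    return 1.0 / rate1 + p / rate2


# Illustrative parameters (assumptions, not values from the slides).
rng = random.Random(0)
coxian_samples = [sample_coxian2(0.5, 2.0, 1.0, rng) for _ in range(100_000)]
coxian_empirical_mean = sum(coxian_samples) / len(coxian_samples)

erlang_samples = [sample_erlang(4, 2.0, rng) for _ in range(20_000)]
erlang_empirical_mean = sum(erlang_samples) / len(erlang_samples)
```

With these parameters the analytic Coxian mean is 1/2 + 0.5/1 = 1.0 and the Erlang mean is 4/2 = 2.0, so the empirical means should land close to those values.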

  5. Phase-Type Approximation • [Plot: cumulative distribution F(x) for x from 0 to 12, comparing Weibull(8,4.5) with the approximations Erlang(7.3/8,8), Erlang(7.3/4,4), and Exponential(1/7.3)]
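
The numbers on slide 5 are consistent with matching the mean of the target distribution: a Weibull with scale 8 and shape 4.5 has mean of about 7.3. The sketch below assumes that parametrization (scale first, shape second) and assumes that the first Erlang parameter on the slide is the mean time per phase; neither is stated in the transcript. It compares the CDFs at a single point, x = 4, to show how adding phases tightens the fit.

```python
import math


def weibull_mean(scale, shape):
    """Mean of a Weibull distribution with the given scale and shape."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def weibull_cdf(x, scale, shape):
    """CDF of a Weibull distribution, zero for non-positive x."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def erlang_cdf(x, n, rate):
    """CDF of an n-phase Erlang distribution with per-phase rate `rate`.

    Equals the probability that a Poisson(rate * x) count is at least n.
    With n = 1 this is the exponential CDF.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0   # (rate*x)^k / k! for k = 0
    total = 1.0
    for k in range(1, n):
        term *= rate * x / k
        total += term
    return 1.0 - math.exp(-rate * x) * total


# Match the mean of the target distribution with 1, 4, and 8 phases.
target_mean = weibull_mean(8.0, 4.5)
errors_at_4 = {
    n: abs(erlang_cdf(4.0, n, n / target_mean) - weibull_cdf(4.0, 8.0, 4.5))
    for n in (1, 4, 8)
}
```

Matching only the mean is the simplest fitting rule; the slide does not say which moments the author matched, so treat this as one plausible reading.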

  6. Solving MDPs with Phase Transitions • Solve MDP as if phases were observable • Maintain belief distribution over phase configurations during execution • AAAI’04 paper: Simulate phase transitions • Use QMDP value method to select actions • OK, because there are typically no actions that are useful for observing phases
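
The QMDP rule on slide 6 picks the action that maximizes the expected Q-value under the current belief over hidden phase configurations. A minimal sketch follows; the dictionary layout, the action names, and the numbers are hypothetical and chosen only for illustration.

```python
def qmdp_action(belief, q_values, actions):
    """Select an action with the QMDP rule.

    belief   : dict mapping phase configuration -> probability
    q_values : dict mapping (phase configuration, action) -> Q-value,
               taken from the MDP solved as if phases were observable
    actions  : iterable of candidate actions
    """
    def expected_q(action):
        return sum(prob * q_values[(phase, action)]
                   for phase, prob in belief.items())

    return max(actions, key=expected_q)


# Hypothetical two-phase example (all numbers are made up).
q_values = {
    (1, "wait"): 5.0, (1, "service"): 4.0,
    (2, "wait"): 1.0, (2, "service"): 4.5,
}
actions = ["wait", "service"]

early_choice = qmdp_action({1: 1.0, 2: 0.0}, q_values, actions)
later_choice = qmdp_action({1: 0.7, 2: 0.3}, q_values, actions)
```

In this example the choice flips from "wait" to "service" as belief mass moves to the later phase, since 0.7 · 5.0 + 0.3 · 1.0 = 3.8 is less than 0.7 · 4.0 + 0.3 · 4.5 = 4.15.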

  7. Factored Event Model • Asynchronous events and actions • Each event/action e has: • Enabling condition φ_e • (Phase-type) distribution G_e governing the time e must remain enabled before it triggers • A distribution p_e(s′; s) determining the probability that the next state is s′ if e triggers in state s

  8. Event Example • Computer cluster with crash event and reboot action for each computer • Crash event for computer i: • Enabling condition φ_i = up_i; p(s′; s) = 1 iff s ⊨ up_i and s′ ⊨ ¬up_i • Reboot action for computer i: • Enabling condition φ_i = ¬up_i; p(s′; s) = 1 iff s ⊨ ¬up_i and s′ ⊨ up_i
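
The crash and reboot definitions on slide 8 map naturally onto a small data structure: an enabling condition plus a deterministic next-state function. This is a sketch under assumptions, with hypothetical class and function names; it reads "crash" as enabled while the machine is up and "reboot" as enabled while it is down, and it leaves out the trigger-time distribution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]  # e.g. {"up1": True, "up2": False}


@dataclass(frozen=True)
class Event:
    """An event or action with an enabling condition and a next-state map.

    The trigger-time distribution is omitted to keep the sketch short.
    """
    name: str
    enabled: Callable[[State], bool]
    apply: Callable[[State], State]


def crash(i: int) -> Event:
    """Crash event for computer i: enabled while it is up, takes it down."""
    var = f"up{i}"
    return Event(f"crash{i}", lambda s: s[var], lambda s: {**s, var: False})


def reboot(i: int) -> Event:
    """Reboot action for computer i: enabled while it is down, brings it up."""
    var = f"up{i}"
    return Event(f"reboot{i}", lambda s: not s[var], lambda s: {**s, var: True})


def enabled_events(events: List[Event], state: State) -> List[str]:
    """Names of all events whose enabling condition holds in `state`."""
    return [e.name for e in events if e.enabled(state)]


events = [crash(1), crash(2), reboot(1), reboot(2)]
s0 = {"up1": True, "up2": True}
s1 = crash(2).apply(s0)      # computer 2 crashes
s2 = crash(1).apply(s1)      # computer 1 crashes
```

Because each event touches only its own state variable, the model stays factored: adding a machine adds one variable and two events rather than multiplying an explicit transition table.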

  9. Event Example • Cluster with two computers • [Timeline: up1 ∧ up2 at t = 0; crash2 at t = 0.6 gives up1 ∧ ¬up2; crash1 at t = 0.9 gives ¬up1 ∧ ¬up2]

  10. Exploiting Structure • Value iteration with ADDs: • Factored state representation means that some variable assignments may be invalid • Phase is one for disabled events • Binary encoding of phases with n ≠ 2^k • Value iteration with state filter f:

  11. Phase Tracking • Infer phase configurations from observable features of the environment • Physical state of process • Current time • Transient analysis for Markov chains • The probability of being in state s at time t
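
Transient analysis asks for the probability of being in each state of a continuous-time Markov chain at time t. One standard way to compute it is uniformization, sketched below in plain Python. The transcript does not say which numerical method the author used, so this is an illustrative choice, and the function name and tolerance are assumptions.

```python
import math


def transient_distribution(Q, pi0, t, tol=1e-12, max_terms=10_000):
    """State distribution at time t of a CTMC with generator matrix Q.

    Uses uniformization: pi(t) = sum_k Poisson(q*t; k) * pi0 * P^k,
    where q bounds the exit rates and P = I + Q/q. Intended for small
    chains and moderate q*t (exp(-q*t) must not underflow).
    """
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))
    if q == 0.0 or t == 0.0:
        return list(pi0)

    # Discrete-time transition matrix of the uniformized chain.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]

    weight = math.exp(-q * t)          # Poisson probability of 0 jumps
    vec = list(pi0)                    # pi0 * P^k, starting with k = 0
    result = [weight * v for v in vec]
    accumulated = weight

    k = 0
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k            # Poisson probability of k jumps
        result = [r + weight * v for r, v in zip(result, vec)]
        accumulated += weight
    return result


# Two-phase Erlang with rate 2: phase 1 -> phase 2 -> absorbed.
Q = [[-2.0, 2.0, 0.0],
     [0.0, -2.0, 2.0],
     [0.0, 0.0, 0.0]]
pi_half = transient_distribution(Q, [1.0, 0.0, 0.0], 0.5)
```

For this chain the exact answer at t = 0.5 is (e⁻¹, e⁻¹, 1 − 2e⁻¹), which gives a quick check on the result.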

  12. Phase Tracking • Belief distribution changes continuously • Use fixed update interval Δ • Independent phase tracking for each event • Q matrix is n × n for n-phase distribution • Phase tracking is O(n) for Erlang distribution
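
For an Erlang distribution the phase belief has a closed form, which is one way to obtain the O(n) cost mentioned on slide 12. If the event has been enabled for time t without triggering, the number of completed phase transitions is a Poisson(λt) count conditioned on being less than n. The sketch below uses that fact; the function name is hypothetical and the slide does not confirm that this is the author's exact procedure.

```python
def erlang_phase_belief(n, rate, t):
    """Belief over phases 1..n of an n-phase Erlang event.

    Conditions on the event having been enabled for time `t` without
    triggering. The weight of phase k+1 is proportional to
    (rate*t)^k / k!; the common factor exp(-rate*t) cancels when the
    weights are normalized. Runs in O(n) time.
    """
    x = rate * t
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * x / k)
    total = sum(weights)
    return [w / total for w in weights]


# Belief snapshots at multiples of a fixed update interval (here 0.5).
belief_start = erlang_phase_belief(4, 2.0, 0.0)
belief_two_phase = erlang_phase_belief(2, 2.0, 0.5)
belief_late = erlang_phase_belief(4, 2.0, 3.0)
```

At t = 0 all mass sits on phase 1, and as t grows the mass shifts toward the last phase, which is what makes a late-enabled action such as servicing look increasingly attractive.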

  13. The Foreman’s Dilemma • When to enable “Service” action in “Working” state? • [Diagram: states Working (c = 1), Serviced (c = –0.1), and Failed (c = 0); transitions Service with delay Exp(10), Fail with delay G, and Return with delay Exp(1)]

  14. Foreman’s Dilemma: Policy Performance • Failure-time distribution: W(1.6x,4.5) • [Plot: percent of optimal (50 to 100) versus x (0 to 40); curves for n = 8, Δ = 0.5; n = 4, Δ = 0.5; n = 8, simulate; n = 4, simulate; and SMDP]

  15. Foreman’s Dilemma: Enabling Time for Action • [Plot: service enabling time (0 to 30) versus x (0 to 40); curves for optimal; n = 8, Δ = 0.5; n = 8, simulate; and SMDP]

  16. System Administration • Network of n machines • Reward rate c(s) = k in states where k machines are up • One crash event and one reboot action per machine • At most one action enabled at any time (single agent)

  17. System Administration: Policy Performance • Reboot-time distribution: U(0,1) • [Plot: expected discounted reward (20 to 45) versus number of machines (2 to 10); curves for n = 4 and n = 2 at a fixed Δ (value illegible in the transcript); n = 4, simulate; n = 2, simulate; and n = 1]

  18. System Administration: Planning Time • [Plot: planning time (log scale, 10^–1 to 10^4) versus number of machines (2 to 10); curves for no filter, pre-filter, and filter, each with n = 4 and n = 5]

  19. Discussion • Phase-type approximations add state variables, but structure can be exploited • Use approximate solution methods (e.g., Guestrin et al., JAIR 19) • Improved phase tracking using transient analysis instead of phase simulation • Recent CTBN work with phase-type distributions: modeling, not planning

  20. Tempastic-DTP • A tool for GSMDP planning: http://sweden.autonomy.ri.cmu.edu/tempastic/
