
HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot

Mohan Sridharan, joint work with Jeremy Wyatt and Richard Dearden, University of Birmingham, UK. CoSy – Description. Focus: systems that can perceive, understand and interact with the environment.


Presentation Transcript


  1. HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot • Mohan Sridharan • Joint work with Jeremy Wyatt and Richard Dearden • University of Birmingham, UK • CS5331: Autonomous Mobile Robots

  2. CoSy – Description • Focus: • Systems that can perceive, understand and interact with the environment. • Sense/manipulate objects on a tabletop. • Back to blocks-world? • Dynamic response, reliability. • Components: • Images with objects segmented to form regions of interest (ROIs), speech commands. • Bind information across different modalities like speech, vision, touch. • Actuator: 5 DOF “Katana” arm.

  3. Linguistically-driven Manipulation • [Architecture diagram: Communication, Visual, Planning, Manipulation, Binding and Coordinator subarchitectures (SAs).] • Goals raised by language. • Refer to objects by learned features. • Plan intentional actions using planner. • Intention shifting handled by monitoring and replanning.

  4. Questions/Problems in CAS • Binding: How do we match information from one component with information from another? • Filtering: How does an architecture pick where a piece of information should go? • Processing Management: How does the robot decide which bits of information should be processed, and what processing should be performed? • Action Fusion: How should low-level actions be coordinated at a fine-grained level?

  5. Sample Video – CoSy

  6. Visual Processing Management • Robot and human manipulate and converse about objects. • Example query and command: “Is there a red triangle?”, “Move the mug to the right of the blue circle.” • Features: • State is not directly observable; actions modify the belief. • Non-deterministic actions: color, shape etc. • Computational complexity. • Constraints: • Dynamic response, reliability! • Approach: plan visual processing: where to look? what to look for?

  7. Related Work • Planning sequences of visual operations: • Image interpretation (POMDP: Darrell 97, MDP: Li et al. IIS03), image processing (Borg: Clouard et al. PAMI99, astronomy: Chien et al. ProcSoft00). • Classical planning schemes: • Layered architecture (Brooks, RA86), SOAR (Laird et al. AI87), ACT-R (Anderson et al. PR04), FF (Hoffmann and Nebel, JAIR01). • Observation planners: • C-BURIDAN (Draper et al. UAI94), PKS (Petrick and Bacchus, ICAPS04), CP (Brenner and Nebel, PCAR06). • Hierarchical planning: • MAXQ (Dietterich, ICML98), Nursebot (Pineau et al. RAS03), RN-POMDP (Foka et al. IJCAI05). • Imposing/learning structure in POMDPs: • FSC (Hansen et al. ICAPS03), DBN (Toussaint et al. UAI08).

  8. POMDP for one ROI • States: Cartesian product of individual state vectors. • Actions: visual + “special”. • Observations: red, green, blue, circle, triangle, square, empty, unknown.
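The single-ROI model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-feature class labels (including a "multiple" class), the single terminal state, and the names of the "special" actions are hypothetical and not given on the slide; only the observation list is taken from it.

```python
from itertools import product

# Assumed per-feature class labels; the slide lists observations, not these.
COLOR_CLASSES = ("empty", "red", "green", "blue", "multiple")
SHAPE_CLASSES = ("empty", "circle", "triangle", "square", "multiple")

# States: Cartesian product of the per-feature state vectors, plus one
# assumed absorbing terminal state reached by the "special" actions.
STATES = [*product(COLOR_CLASSES, SHAPE_CLASSES), "terminal"]

# Actions: the visual operators plus hypothetical "special" answer actions.
ACTIONS = ("color", "shape", "say_found", "say_not_found")

# Observations exactly as listed on the slide.
OBSERVATIONS = ("red", "green", "blue", "circle", "triangle",
                "square", "empty", "unknown")

print(len(STATES), len(ACTIONS), len(OBSERVATIONS))
```

With these assumed labels, each state is a (color, shape) pair such as ("blue", "circle"), which is what a query like "Where is the blue circle?" would be testing for.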

  9. POMDP for one ROI (continued) • Transition function. • Observation function. • Reward specification. • Excellent mathematical machinery to model desired features: probabilistic representation for uncertainty in action outcomes and states. • Drawback: exponential state explosion with several ROIs and actions: 25^n + 1 states for n ROIs with just two visual actions!
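To make the growth concrete, the short sketch below tabulates the joint state count from the slide for a few values of n. It assumes the 25 arises as 5 classes for each of the two features and that the extra state is a single terminal state; the slide gives only the total. The second column counts the states of n separate single-ROI models for comparison and deliberately leaves out any higher-level model.

```python
def joint_states(num_rois, classes_per_feature=5, num_features=2):
    """State count of one joint POMDP over all ROIs (25^n + 1 by default)."""
    per_roi = classes_per_feature ** num_features
    return per_roi ** num_rois + 1


def separate_states(num_rois, classes_per_feature=5, num_features=2):
    """Total states across n independent single-ROI POMDPs."""
    per_roi = classes_per_feature ** num_features
    return num_rois * (per_roi + 1)


for n in (1, 2, 3, 4):
    print(n, joint_states(n), separate_states(n))
```

The joint count grows exponentially in n, while the total over separate per-ROI models grows linearly.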

  10. Hierarchical POMDP Formulation • Proposed solution: Hierarchical Planning in POMDPs – HiPPo. • One POMDP for planning the processing actions on each ROI. • Higher-level POMDP (HL-POMDP) to choose one of the lower-level POMDPs (LL-POMDPs) at each step. • Significantly reduces complexity of the state-action-observation space. • Model creation and policy generation are completely autonomous, based on the input query. • [Diagram: the HL-POMDP decides which region to process; the LL-POMDP decides how to process it.]

  11. HiPPo – LL Formulation • Operates on a single ROI. • Key points: • Observation functions learned. • Transition function is an identity matrix, except for special actions and actions that change the state. • Reward function trade-off: time-based cost for actions and answer quality. • LL-policy is terminated after N levels.
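Because the transition function is the identity for information-gathering actions, the belief update after such an action reduces to Bayes' rule with the observation function. The sketch below shows this for a color-only toy model; the observation probabilities are made-up placeholders, not the learned values from the talk.

```python
def update_belief(belief, obs_model, observation):
    """Bayes update for an action whose transition function is the identity.

    belief:    dict state -> probability
    obs_model: dict state -> dict observation -> P(observation | state)
    """
    unnormalized = {
        state: obs_model[state].get(observation, 0.0) * prior
        for state, prior in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {state: p / total for state, p in unnormalized.items()}


# Placeholder observation function for a hypothetical color operator.
COLOR_OBS = {
    "red":   {"red": 0.8, "green": 0.1, "blue": 0.1},
    "green": {"red": 0.1, "green": 0.8, "blue": 0.1},
    "blue":  {"red": 0.1, "green": 0.1, "blue": 0.8},
}

belief = {state: 1.0 / 3.0 for state in COLOR_OBS}
belief = update_belief(belief, COLOR_OBS, "red")    # first application
belief = update_belief(belief, COLOR_OBS, "red")    # second application
print(belief)
```

Repeating an operator sharpens the belief at the price of more processing time, which is the trade-off the reward function has to balance.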

  12. HiPPo – HL Formulation • HL-POMDP: • State space: object presence in different combinations of regions. • Action u_i means process region R_i. • Observation F_Ri means desired object found in R_i. • Key points: • Observation functions and costs derived from the policy trees of LL-POMDPs. • LL-POMDPs are black boxes that return definite labels (not belief densities).
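A minimal sketch of the HL state space and belief update follows. It assumes that processing a region does not change where the object is (identity transition at the high level) and that the LL-POMDP for a region returns a definite found/not-found label; the two probabilities passed in are illustrative placeholders, and the function names are hypothetical.

```python
from itertools import product


def hl_states(num_rois):
    """One boolean per ROI: is the target object present in that region?"""
    return list(product((False, True), repeat=num_rois))


def hl_update(belief, roi, found, p_found_present, p_found_absent):
    """Bayes update after processing region `roi` and getting a definite label."""
    posterior = {}
    for state, prior in belief.items():
        p_found = p_found_present if state[roi] else p_found_absent
        likelihood = p_found if found else 1.0 - p_found
        posterior[state] = likelihood * prior
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("label has zero probability under this belief")
    return {state: p / total for state, p in posterior.items()}


states = hl_states(2)                                # four presence combinations
belief = {state: 1.0 / len(states) for state in states}
belief = hl_update(belief, roi=0, found=True,
                   p_found_present=0.72, p_found_absent=0.08)
p_in_first = sum(p for state, p in belief.items() if state[0])
print(round(p_in_first, 3))
```

With a uniform prior and these placeholder numbers, a "found" label from the first region raises the probability that the target is there to about 0.9, while the belief about the second region is unchanged.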

  13. Illustrative Example • Consider the scene with two ROIs extracted. • Query: Where is the blue circle? • Available operators: Color, Shape, SIFT.

  14. Example – Where is the Blue Circle?

  15. Example – Where is the Blue Circle?

  16. Example – Where is the Blue Circle?

  17. Estimating O_H and R_H • Condition LL observation probabilities by high-level states. • Determine expected cost of running the tree and likelihood of finding target object, conditioned on the high-level state.
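One way to picture this estimation is a recursive walk over an LL policy tree, sketched below. The tree shape, action names, costs and observation probabilities are all hypothetical. For simplicity the walk conditions on a single underlying LL state; conditioning on a high-level state would additionally average over the LL states consistent with it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Policy-tree node: an action, its cost, and one child per observation."""
    action: str
    cost: float = 0.0
    children: dict = field(default_factory=dict)


def evaluate(node, state, obs_model):
    """Return (P(tree answers 'found'), expected cost) given the true state."""
    if not node.children:                      # leaf: terminal answer action
        return (1.0 if node.action == "say_found" else 0.0), 0.0
    p_found, expected_cost = 0.0, node.cost
    for obs, child in node.children.items():
        p_obs = obs_model[node.action][state].get(obs, 0.0)
        child_found, child_cost = evaluate(child, state, obs_model)
        p_found += p_obs * child_found
        expected_cost += p_obs * child_cost
    return p_found, expected_cost


# Hypothetical tree for "Where is the blue circle?": color first, then shape.
shape = Node("shape", 2.0, {"circle": Node("say_found"),
                            "other": Node("say_not_found")})
tree = Node("color", 1.0, {"blue": shape, "other": Node("say_not_found")})

# Placeholder observation probabilities P(obs | state) per action.
OBS = {
    "color": {"blue_circle": {"blue": 0.9, "other": 0.1},
              "red_circle":  {"blue": 0.1, "other": 0.9}},
    "shape": {"blue_circle": {"circle": 0.8, "other": 0.2},
              "red_circle":  {"circle": 0.8, "other": 0.2}},
}

for s in ("blue_circle", "red_circle"):
    print(s, evaluate(tree, s, OBS))
```

The first number of each pair would feed the high-level observation function and the second the high-level cost of choosing that region.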

  18. Experimental Setup • The HL-POMDP and LL-POMDPs are query-specific. • LL-POMDPs for each ROI written in ZMDP format. • Solved using point-based value iteration (VI) [Smith & Simmons, 05]. • Generate observation probabilities, costs for HL-POMDP, which is solved in a similar manner. • Performed ~60 queries of four types, with multiple trials of each: • Occurrence: “Is there a red cup in the scene?” • Location: “Where is the blue circle?” • Property: “What colour is the box?” • Global Scene: “How many green squares are there?”

  19. Joint POMDP vs. HiPPo

  20. A ‘Modern’ Classical Planner • Continual Planning (CP) [Brenner & Nebel, 06] provides a solution to this problem. • CP allows actions with non-deterministic effects: • Use these to represent information gathering actions. • Assumes that actions are reliable. • At plan time, the planner asserts that the effect it wants will actually occur. • If the effect doesn’t occur at execution time, replan. • Intuitively: build a contingent plan, but replanning ensures you only build the branches you need.
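The plan, execute, replan cycle described above can be sketched as a small loop. This is a toy illustration of the idea, not the CP planner itself: the two-ROI world, the `("look", r)` action and the helper names are hypothetical, and the stand-in planner simply asserts that the target will be found in the next unchecked region.

```python
def run_continual_planning(state, make_plan, execute, is_goal, max_replans=10):
    """Plan optimistically, execute, and replan when an asserted effect fails."""
    replans = 0
    plan = make_plan(state)
    while not is_goal(state):
        if not plan:
            raise RuntimeError("planner returned no steps")
        action, expected_state = plan.pop(0)
        state = execute(state, action)
        if state != expected_state:            # asserted effect did not occur
            replans += 1
            if replans > max_replans:
                raise RuntimeError("too many replans")
            plan = make_plan(state)
    return state, replans


NUM_ROIS = 2
TRUE_ROI = 1          # hidden ground truth: the target is in region 1


def make_plan(state):
    """Assert that looking at the next unchecked ROI will find the target."""
    checked, _ = state
    unchecked = [r for r in range(NUM_ROIS) if r not in checked]
    if not unchecked:
        return []
    r = unchecked[0]
    return [(("look", r), (checked + (r,), r))]


def execute(state, action):
    """Sensing outcome is taken at face value (actions assumed reliable)."""
    checked, _ = state
    _, r = action
    return (checked + (r,), r if r == TRUE_ROI else None)


def is_goal(state):
    return state[1] is not None


final_state, replans = run_continual_planning(((), None), make_plan,
                                              execute, is_goal)
print(final_state, replans)
```

Only the branch that is actually needed gets planned: the failed assertion for the first region triggers a single replan that targets the second one. Note that nothing in this loop models a sensing action that reports the wrong label.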

  21. Comparison of Planning Time

  22. Comparison of Planning + Execution Time

  23. Reliability Analysis • Modern planners that do not model uncertainty cannot do much better than naïve visual processing. • HiPPo exploits models of action outcomes to provide higher reliability.

  24. Summary and Future Work • Visual processing management posed as a planning problem. • HiPPo models uncertainty well, provides efficient and reliable performance. • Takes slightly more time than CP but is significantly more reliable. • Lots of other operators to integrate: Viewpoint Change, Zoom, … • Object interaction: • Push, poke object? • Learn object (epistemic) affordances.

  25. Summary and Future Work (continued) • From image analysis to scene analysis: • Should I look somewhere else or analyse the ROIs I have now? • Use information maximization principles. • Deploy on a mobile robot, and on a team of mobile robots. • Model human feedback to learn from and interact with humans. • Collaborate with humans in real-world tasks. • Joint project with UT-Austin and University of Arizona (3-5 years).

  26. That’s all folks

  27. We really are done
