This presentation explores the use of Partially Observable Markov Decision Processes (POMDP) to enhance wheelchair navigation by predicting long-term user intentions. With the increasing aging population and neurological conditions affecting motor control, smart wheelchairs are vital for improving mobility. We discuss the formulation of POMDP in assistive applications, online navigation mechanisms, and test results showcasing a 100% success rate in destination prediction. Future directions aim to further automate activity monitoring for smarter navigation systems.
COMP 650: Real-life applications of POMDPs • Rahul Kumar • Department of Computer Science • Rice University • April 18, 2013
Long-term user intention prediction for wheelchair navigation using POMDP • Reference: Taha et al., "POMDP-based long-term user intention prediction for wheelchair navigation." • *Image taken from http://robotzeitgeist.com/tag/dementia
Outline • Motivation • Introduction • POMDP quick review • Problem specification / formulation • Online assistance • Experimental results • Conclusions
Motivation Why make wheelchairs smart? • Growing aging population. • Increase in accidents and other calamities. • Diseases that affect motor control.
Reactive Wheelchairs • Reactive systems do not use a representation of the environment. • Most popular approach among intention-recognition wheelchairs. • Rely on local or temporal information collected online. • Used by systems with limited power or processing capacity. • Examples: Rolland-III, NavChair, etc.
POMDP - 1 • General framework for sequential decision making where states are hidden and actions are stochastic. • Widely used in assistive applications.
POMDP - 2 • S – set of states • A – set of actions • Z – set of observations • T – conditional transition probabilities S x A x S -> [0,1] • O – conditional observation probabilities A x S x Z -> [0,1] • R – reward function A x S -> real numbers
POMDP agent overview • [Figure: agent loop with the components Environment, Observation, State Estimator, Belief State, Policy, Action]
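The state estimator in the agent loop can be sketched as a discrete Bayes filter. This is a minimal illustration, not the implementation used in the paper: the nested-dictionary layout of `T` and `O`, the function name `update_belief`, and the two-destination toy numbers are all assumptions made for the example.

```python
def update_belief(belief, action, observation, T, O):
    """Discrete Bayes filter: b'(s') is proportional to
    O[a][s'][z] * sum over s of T[s][a][s'] * b(s)."""
    states = list(belief)
    new_belief = {}
    for s_next in states:
        # Prediction step: probability of landing in s_next after the action.
        predicted = sum(T[s][action].get(s_next, 0.0) * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        new_belief[s_next] = O[action][s_next].get(observation, 0.0) * predicted
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in new_belief.items()}


# Hypothetical toy model: the hidden state is the intended destination,
# the "Stop" action leaves it unchanged, and the joystick reading "Up"
# is more likely when the user is heading to d1.
T = {"d1": {"Stop": {"d1": 1.0}}, "d2": {"Stop": {"d2": 1.0}}}
O = {"Stop": {"d1": {"Up": 0.8, "Down": 0.2},
              "d2": {"Up": 0.2, "Down": 0.8}}}
prior = {"d1": 0.5, "d2": 0.5}
posterior = update_belief(prior, "Stop", "Up", T, O)
```

With a uniform prior, a single "Up" reading shifts most of the belief mass onto d1, which is how repeated joystick inputs can gradually reveal the intended destination.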
POMDP Generation For an efficient POMDP system, we need a proper • State space • Transition model • Observation model
State Space • Spatial States : Wheelchair location = {s1,s2,s3,… } • Destination states : Places of Interest = {d1,d2,..} • Joint representation of both of them = {s1d1,s2d1,…}
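The joint representation is the Cartesian product of spatial states and destination states. A minimal sketch, assuming three locations and two destinations (the specific names and counts are illustrative, not taken from the paper):

```python
from itertools import product

# Hypothetical spatial and destination states.
locations = ["s1", "s2", "s3"]
destinations = ["d1", "d2"]

# Each joint state pairs a wheelchair location with an intended destination.
joint_states = [loc + dest for loc, dest in product(locations, destinations)]
```

The joint space grows as the product of the two sets, so the number of places of interest directly affects how large the POMDP becomes.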
Transition model • The transition model specifies the probability of moving from one state to another when a certain action is executed. • Actions = global navigation commands = {North, South, East, West, Stop} • Observations = joystick movements = {Up, Down, Right, Left, NoInput} • Directly calculated from the map topology.
Observation model • We use training data from a particular user. • In indoor settings, a wheelchair user usually performs repetitive tasks. • For example, a task can be going from the living room to the kitchen.
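One way to derive transition probabilities from a map topology is sketched below. The adjacency-dictionary format, the `p_success` parameter, and the rule that a failed move leaves the wheelchair in place are assumptions for illustration; the slides do not state how the paper handles motion uncertainty.

```python
def build_transitions(topology, destinations, actions, p_success=0.9):
    """Return T[((location, destination), action)] -> {next_state: probability}.

    topology maps each location to {action: neighbouring location}.
    The destination component of the state never changes.
    """
    T = {}
    for loc, moves in topology.items():
        for dest in destinations:
            for a in actions:
                target = moves.get(a, loc)  # no edge in that direction: stay put
                if target == loc:
                    dist = {(loc, dest): 1.0}
                else:
                    # Assumed noise model: the move succeeds with p_success,
                    # otherwise the wheelchair stays where it is.
                    dist = {(target, dest): p_success,
                            (loc, dest): 1.0 - p_success}
                T[((loc, dest), a)] = dist
    return T


# Hypothetical two-room corridor.
topology = {"s1": {"East": "s2"}, "s2": {"West": "s1"}}
actions = ["North", "South", "East", "West", "Stop"]
T = build_transitions(topology, ["d1", "d2"], actions)
```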
Reward function • -1 for each action • +100 for an action that leads to the destination.
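The reward scheme can be written as a small function. This sketch assumes a deterministic topology lookup and assumes the +100 replaces the -1 step cost on the final move (the slide does not say whether the two are added); the function name and arguments are hypothetical.

```python
def reward(location, destination, action, topology):
    """-1 per action, +100 when the action moves the wheelchair onto the destination."""
    next_location = topology.get(location, {}).get(action, location)
    if next_location == destination and location != destination:
        return 100.0
    return -1.0


# Hypothetical three-room corridor; destinations are named by location here.
topology = {"s1": {"East": "s2"}, "s2": {"East": "s3", "West": "s1"}}
final_step = reward("s2", "s3", "East", topology)
ordinary_step = reward("s1", "s3", "East", topology)
```

The per-step penalty favours short routes, while the large terminal reward makes reaching the intended destination the dominant objective.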
Experimental results - 1 • Artificial data was generated based on the activity of the user in the environment. • The ZMDP software package was used. ZMDP provides several heuristic search algorithms for POMDPs and MDPs. • Known starting points but unknown destinations. • 100% success in predicting the destination.
Conclusion • POMDPs were employed for long-term user intention prediction in wheelchair navigation. • No behavior selection, unlike other papers.
Future work • Enhance the capabilities and the intelligence of the system through automated activity monitoring and task extraction.
POMDP Hands * Image taken from http://matanyahorowitz.com/index.php
Overview • Motivation • Approach / Big Picture • Example / Intuition • Model Construction • Results
Motivation If you know all shapes and positions exactly, you can generate a trajectory that will work. *Slide taken from Hsiao et al.
Problem at hand • How to determine the configuration of an object when the robot has to manipulate it.
Approach/Big Picture • Partition the space: identify and separate regions that have similar properties. • Reduce uncertainty in the configuration by taking actions that act as "funnels", i.e. that map large sets of initial states to smaller sets of resulting states. • We work with a set of guarded compliant motions. These actions act as funnels.
Example Partial policy graph for robot
Abstract model construction • Action space: two guarded compliant move commands for each degree of freedom. • Transition probability: sample a large number of triplets from given initial states. • Observation probability: contact sensors have some uncertainty in determining contact. • Reward: 15 for reaching the goal, -50 for lifting in the wrong configuration, -1 for each motion, -5 for being in unstable states or boundary states.
Experiment • Similar to the previous problem, except that the block is stepped. • High-fidelity simulation: 92% success, average reward: -1.59 • Fixed policy: 81% success, average reward: -10.632
Future work • Address problems with shape uncertainty. • Handle interaction with other objects.
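Estimating transition probabilities from sampled triplets amounts to counting outcomes per (state, action) pair. The sketch below assumes the triplets are (state, action, next state) and uses a made-up `toy_simulate` function with invented state names and an invented 0.8 contact probability; in practice the samples would come from a physics simulation or the real robot.

```python
import random
from collections import Counter, defaultdict


def estimate_transitions(simulate, states, actions, samples_per_pair=1000, seed=0):
    """Estimate P(s' | s, a) as relative frequencies of sampled outcomes."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for s in states:
        for a in actions:
            for _ in range(samples_per_pair):
                counts[(s, a)][simulate(s, a, rng)] += 1
    return {
        key: {s_next: n / sum(c.values()) for s_next, n in c.items()}
        for key, c in counts.items()
    }


def toy_simulate(state, action, rng):
    """Hypothetical stand-in for a simulator of a guarded move."""
    if state == "free" and action == "move_down":
        return "contact" if rng.random() < 0.8 else "free"
    return state


T_hat = estimate_transitions(toy_simulate, ["free", "contact"], ["move_down"])
```

More samples per pair tighten the estimates, at the cost of more simulator calls.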
References • Hsiao et al., "Grasping POMDPs."