
Learning Prospective Robot Behavior

Shichao Ou and Roderic Grupen, Laboratory for Perceptual Robotics, University of Massachusetts Amherst


Presentation Transcript


  1. Learning Prospective Robot Behavior Shichao Ou and Roderic Grupen Laboratory for Perceptual Robotics University of Massachusetts Amherst

  2. A Developmental Approach • Infant Learning • In stages • Maturation processes • Parents provide constrained learning contexts • Protect • Easy → Complex • Motion mobile for newborns • Use brightly colored, easy to pick up objects • Use building blocks • Association of words and objects

  3. Application in Robotics • Framework for Robot Developmental Learning • Role of teacher: set up learning contexts that make the target concept conspicuous • Role of robot: acquire concepts, generalize to new contexts by autonomous exploration, provide feedback • Control Basis • Robot actions are created using combinations of <σ, φ, τ> • Establish stages of learning by time-varying constraints on resources • Easy → Complex

  4. Example • Learning to Reach for Objects • Stage 1: SearchTrack • Focus attention using a single brightly colored object (σ) • Limit DOF (τ) to use the head ONLY • Stage 2: ReachGrab • Limit DOF (τ) to use one arm ONLY • Stage 3: Handedness, Scale-Sensitive • (Hart et al., 2008)

  5. Prospective Learning • Infants adapt to new situations by prospectively looking ahead, predicting failure, and then learning a repair strategy

  6. Robot Prospective Learning with Human Guidance • [Figure: three copies of a state-action chain S0, S1, …, Si, …, Sj, …, Sn with actions a0, a1, …, an-1. Labels: "Challenge", g(f)=0, g(f)=1, and a sub-task branch with states Si1, …, Sij, …, Sin]

  7. A 2D Navigation Domain Problem • 30x30 map • 6 doors, randomly closed • 6 buttons • 1 start and 1 goal • 3-bit door sensor on robot
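The slide does not say how a 3-bit sensor reports on 6 doors. One plausible reading, stated here purely as an assumption, is that at most one door is closed at a time and the three bits encode its index (0 meaning "all doors open"). The function names below are hypothetical and only illustrate that encoding.

```python
import random

N_DOORS = 6  # from the slide

def sample_closed_door(rng):
    """Assumption: at most one door is closed. 0 = all open, 1..6 = that door is closed."""
    return rng.randint(0, N_DOORS)

def door_sensor_bits(closed_door):
    """Encode the closed-door index as three bits, most significant bit first."""
    if not 0 <= closed_door <= N_DOORS:
        raise ValueError("closed_door must be between 0 and 6")
    return tuple((closed_door >> shift) & 1 for shift in (2, 1, 0))

if __name__ == "__main__":
    rng = random.Random(0)
    door = sample_closed_door(rng)
    print(door, door_sensor_bits(door))
```

Under this reading the seven possible situations (no door closed, or one of six closed) fit in three bits with one code left unused.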

  8. Flat Learning Results • Flat Q-Learning • 5-dimensional state • (x, y, door-bit1, door-bit2, door-bit3) • 4 actions • up, down, left, right • Reward • 1 for reaching the goal • -0.01 for every step taken • Learning parameters • α=0.1, γ=1.0, ε=0.1 • Learned solutions after 30,000 episodes
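The flat learner described on this slide can be sketched as standard tabular Q-learning with ε-greedy exploration. The values α=0.1, γ=1.0, ε=0.1, the four actions, and the reward scheme come from the slide. Everything else is an assumption for illustration: a small open 5x5 grid with no doors or buttons, the helper names, and the choice to pay 1.0 (without the step penalty) on the final move into the goal.

```python
import random
from collections import defaultdict

ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
ALPHA, GAMMA, EPSILON = 0.1, 1.0, 0.1  # learning parameters from the slide

def step(state, action, size, goal):
    """Hypothetical toy dynamics: move one cell, clipped at the grid border."""
    dx, dy = ACTIONS[action]
    nxt = (min(max(state[0] + dx, 0), size - 1),
           min(max(state[1] + dy, 0), size - 1))
    done = nxt == goal
    return nxt, (1.0 if done else -0.01), done

def q_update(q, state, action, reward, nxt, done):
    """One Q-learning backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

def choose_action(q, state, rng):
    """Epsilon-greedy choice over the four actions."""
    if rng.random() < EPSILON:
        return rng.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_learning(size=5, start=(0, 0), goal=(4, 4), episodes=500, max_steps=200, seed=0):
    """Train on the toy grid and return the Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            action = choose_action(q, state, rng)
            nxt, reward, done = step(state, action, size, goal)
            q_update(q, state, action, reward, nxt, done)
            state = nxt
            if done:
                break
    return q

if __name__ == "__main__":
    table = q_learning()
    print(len(table), "state-action values learned")
```

In the domain on the slides the state would also carry the three door bits, which multiplies the size of the table by up to eight; that growth is one reason a flat learner may need many more episodes than a staged one.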

  9. Prospective Learning • Stage 1 • All doors open • Constrain resources to use only (x,y) sensors • Allow the agent to learn a policy from start to goal • [Figure: learned path Down, Right, Right, Up, Right, Right, Right over states S0, S1, …, Si, …, Sj, …, Sn]
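Constraining resources to the (x,y) sensors can be pictured as projecting the full sensor reading onto a stage-specific subset of features before it reaches the learner. The sketch below is a minimal illustration under that assumption. The dictionary layout, the feature names, and the choice of features for stage 2 are hypothetical and not taken from the slides.

```python
# Hypothetical per-stage feature sets; only stage 1's (x, y) restriction is from the slide.
STAGE_FEATURES = {
    1: ("x", "y"),
    2: ("x", "y", "door_bit1", "door_bit2", "door_bit3"),
}

def observe(full_reading, stage):
    """Return only the features the learner is allowed to use at this stage."""
    return tuple(full_reading[name] for name in STAGE_FEATURES[stage])

if __name__ == "__main__":
    reading = {"x": 3, "y": 7, "door_bit1": 1, "door_bit2": 0, "door_bit3": 1}
    print(observe(reading, 1))
    print(observe(reading, 2))
```

With all doors open, the door bits carry no information, so hiding them in stage 1 shrinks the state space without hiding anything the start-to-goal policy needs.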

  10. Prospective Learning • Stage 2 • Close 1 door • Robot learns the cause of the failure • Robot backtracks and finds an earlier indicator of this cause

  11. Prospective Learning • Stage 2 • Close 1 door • Robot learns the cause of the failure • Robot backtracks and finds an earlier indicator of this cause • Create a sub-task • Learn a new policy for the sub-task

  12. Prospective Learning • Stage 2 • Close 1 door • Robot learns the cause of the failure • Robot backtracks and finds an earlier indicator of this cause • Create a sub-task • Learn a new policy for the sub-task • Resume original policy
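The Stage 2 steps (backtrack to an earlier indicator, create a sub-task, resume the original policy) can be sketched as follows. This is an illustrative reading, not the authors' algorithm: it assumes the robot has one sensor trace from a successful run and one from a failed run along the same path, and it treats the earliest step of the final run of differing readings as the indicator. All function names are hypothetical.

```python
def find_earliest_indicator(success_obs, failure_obs, failure_step):
    """Walk backward from the failure while the readings still differ from the
    successful run; return the earliest such step, or None if none differ."""
    earliest = None
    for t in range(failure_step, -1, -1):
        if failure_obs[t] != success_obs[t]:
            earliest = t
        else:
            break
    return earliest

def make_repaired_policy(base_policy, sub_policy, indicator, sub_task_done):
    """Compose policies: run the sub-task while the indicator fires and the
    sub-task is unfinished, otherwise follow (or resume) the original policy."""
    def policy(state, obs):
        if indicator(obs) and not sub_task_done(state, obs):
            return sub_policy(state, obs)
        return base_policy(state)
    return policy

if __name__ == "__main__":
    # Door reading 0 = all open; 5 = a hypothetical code for one closed door.
    success = [0, 0, 0, 0, 0]
    failure = [0, 0, 5, 5, 5]
    print(find_earliest_indicator(success, failure, failure_step=4))

    policy = make_repaired_policy(
        base_policy=lambda state: "right",
        sub_policy=lambda state, obs: "go_to_button",
        indicator=lambda obs: obs != 0,
        sub_task_done=lambda state, obs: False,
    )
    print(policy((0, 0), 0), policy((0, 0), 5))
```

Because the original policy is reused unchanged outside the sub-task, only the repair has to be learned from scratch.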

  13. Prospective Learning Results • Learned solutions in fewer than 2,000 episodes

  14. Humanoid Robot Manipulation Domain • Benefits of Prospective Learning • Adapts to new contexts while maintaining the majority of the existing policy • Automatically generates sub-goals • Sub-tasks can be learned in a completely different state space • Supports interactive learning

  15. Conclusion • A developmental view of robot learning • A framework that enables interactive, incremental learning in stages • Extension of the control basis learning framework using the idea of prospective learning
