
From Motor Babbling to Planning

Presentation Transcript


  1. From Motor Babbling to Planning Cornelius Weber Frankfurt Institute for Advanced Studies Johann Wolfgang Goethe University, Frankfurt, Germany Bio-Inspired Autonomous Systems Workshop 26th - 28th March 2008, Southampton

  2. Reinforcement Learning: Trained Weights (figure labels: actor units, value). The trained agent is a fixed reactive system that always strives for the same goal.
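The slide's point can be made concrete with a minimal sketch. Assumptions not in the original: a 1-D grid of 5 states, a fixed goal state, and tabular Q-learning standing in for the trained actor/value weights. After training, the greedy policy always drives toward the one goal it was trained on, illustrating the "fixed reactive system".

```python
import random

# Minimal sketch (assumed setup): 1-D grid of 5 states, goal fixed at
# state 4, actions -1/+1. Tabular Q-learning yields a reactive policy
# that always strives for the single goal it was trained on.
N, GOAL = 5, 4
Q = {(s, a): 0.0 for s in range(N) for a in (-1, 1)}
random.seed(0)
for _ in range(2000):
    s = random.randrange(N)
    a = random.choice((-1, 1))
    s2 = min(max(s + a, 0), N - 1)           # clamped move on the grid
    r = 1.0 if s2 == GOAL else 0.0           # reward only at the goal
    Q[(s, a)] += 0.5 * (r + 0.9 * max(Q[(s2, b)] for b in (-1, 1)) - Q[(s, a)])

# Greedy policy: in every state the action points toward GOAL.
policy = {s: max((-1, 1), key=lambda a: Q[(s, a)]) for s in range(N)}
print(policy)
```

Retrain with a different `GOAL` and the whole policy must be relearned; the weights encode one goal, not a model of the world.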

  3. Reinforcement learning does not use the exploration phase to learn a general model of the environment, one that would allow the agent to plan a route to any goal. So let's do exactly that.

  4. Learning (figure labels: actor, state space). Randomly move around the state space and learn world models from the experience: an associative model, an inverse model, and a forward model.
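Motor babbling can be sketched as follows. The grid world, the action set, and the `step` function are illustrative assumptions, not part of the original: the agent takes random actions and records (state, action, next state) transitions, the raw material for all three world models.

```python
import random

# Sketch of motor babbling (assumed setup): random actions in a 4x4
# grid world, recording (state, action, next_state) transitions.
random.seed(1)
W = 4                                        # states are (x, y) tuples
ACTIONS = {'N': (0, -1), 'S': (0, 1), 'E': (1, 0), 'W': (-1, 0)}

def step(state, action):
    """Move on the grid, clamping at the walls."""
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), W - 1)
    y = min(max(state[1] + dy, 0), W - 1)
    return (x, y)

transitions = []
s = (0, 0)
for _ in range(5000):
    a = random.choice(list(ACTIONS))
    s2 = step(s, a)
    transitions.append((s, a, s2))
    s = s2
print(len(transitions))                      # 5000 babbled transitions
```

No reward is involved at this stage; the transitions alone are enough to train the three models on the following slides.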

  5. Learning: Associative Model. Weights associate neighbouring states; use them to find any possible route between agent and goal.
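A tabular stand-in for the associative model, assuming a small hand-written `transitions` list in place of real babbling data: neighbouring states are linked symmetrically, and spreading activation over the links tells us which states are connected by some route.

```python
from collections import defaultdict

# Sketch of the associative model (assumed tabular form). The
# `transitions` triples are illustrative babbling data.
transitions = [((0, 0), 'E', (1, 0)), ((1, 0), 'E', (2, 0)),
               ((2, 0), 'S', (2, 1)), ((1, 0), 'S', (1, 1))]

assoc = defaultdict(set)                     # assoc[s] = neighbouring states
for s, _, s2 in transitions:
    assoc[s].add(s2)
    assoc[s2].add(s)                         # neighbour links are symmetric

def reachable(start):
    """Spread activation from `start` over the associative links."""
    active, frontier = {start}, [start]
    while frontier:
        nxt = []
        for s in frontier:
            for s2 in assoc[s] - active:
                active.add(s2)
                nxt.append(s2)
        frontier = nxt
    return active

print((2, 1) in reachable((0, 0)))           # True: a route exists
```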

  6. Learning: Inverse Model. Weights "postdict" the action given a pair of states; use them to identify the action that leads to a desired state (Sigma-Pi neuron model).
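A tabular stand-in for the inverse model, with an illustrative `babble` list assumed: given a (state, next state) pair, "postdict" the action that caused the transition. Where the slide's Sigma-Pi neuron pairs the two states multiplicatively, the table below simply indexes by the pair itself.

```python
from collections import defaultdict, Counter

# Sketch of the inverse model (assumed tabular stand-in for the
# Sigma-Pi network): count which action links each state pair.
counts = defaultdict(Counter)
babble = [((0, 0), 'E', (1, 0)), ((0, 0), 'E', (1, 0)),
          ((1, 0), 'S', (1, 1)), ((1, 0), 'W', (0, 0))]
for s, a, s2 in babble:
    counts[(s, s2)][a] += 1

def inverse(s, s2):
    """Return the most frequent action leading from s to s2."""
    return counts[(s, s2)].most_common(1)[0][0]

print(inverse((0, 0), (1, 0)))               # 'E'
```

During planning this is the model that turns a desired next state into a motor command.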

  7. Learning: Forward Model. Weights predict the next state given a state-action pair; use them to predict the next state that the chosen action will produce.
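The forward model in the same tabular style, again with an illustrative `babble` list assumed: from the current state and a chosen action, predict the most likely successor state.

```python
from collections import defaultdict, Counter

# Sketch of the forward model (assumed tabular form): predict the next
# state from a (state, action) pair, learned from babbled transitions.
counts = defaultdict(Counter)
babble = [((0, 0), 'E', (1, 0)), ((1, 0), 'E', (2, 0)),
          ((1, 0), 'S', (1, 1)), ((0, 0), 'E', (1, 0))]
for s, a, s2 in babble:
    counts[(s, a)][s2] += 1

def forward(s, a):
    """Predict the most likely successor of taking action a in state s."""
    return counts[(s, a)].most_common(1)[0][0]

print(forward((0, 0), 'E'))                  # (1, 0)
```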

  8.-26. Planning (animated slide sequence; the only recoverable figure labels, from slide 22: actor units, goal, agent)
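The planning animation above can be sketched in code. Everything concrete here is an assumption (a fully babbled 3x3 grid, a 0.9 decay factor): activation spreads from the goal through the learned state links, forming a "goal hill", and the agent repeatedly steps to its most active neighbour while the inverse-model table names the action for each step.

```python
from collections import defaultdict

# Sketch of the planning phase (assumed tabular models on a 3x3 grid).
links = defaultdict(dict)                    # links[s][s2] = action s -> s2
for x in range(3):
    for y in range(3):
        for a, (dx, dy) in {'N': (0, -1), 'S': (0, 1),
                            'E': (1, 0), 'W': (-1, 0)}.items():
            s2 = (min(max(x + dx, 0), 2), min(max(y + dy, 0), 2))
            if s2 != (x, y):
                links[(x, y)][s2] = a

def plan(agent, goal):
    # Spread decaying activation outward from the goal ("goal hill").
    activation, frontier = {goal: 1.0}, [goal]
    while frontier:
        nxt = []
        for s in frontier:
            for s2 in links[s]:
                if s2 not in activation:
                    activation[s2] = 0.9 * activation[s]
                    nxt.append(s2)
        frontier = nxt
    # Climb the hill: step to the most active neighbour, read off actions.
    actions = []
    while agent != goal:
        s2 = max(links[agent], key=activation.get)
        actions.append(links[agent][s2])
        agent = s2
    return actions

print(plan((0, 0), (2, 2)))                  # ['S', 'S', 'E', 'E']
```

Because activation decays with distance from the goal, far-away regions of a wide goal hill have nearly flat slopes, which is exactly the noise issue raised in the discussion slide.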

  26. Discussion
  - reinforcement learning ... fails if there is no access to the full state space
  - previous work ... AI-like planners assume links between states
  - noise ... wide "goal hills" will have flat slopes
  - shortest path ... not taken; how to define it?
  - biological plausibility ... Sigma-Pi neurons; winner-take-all
  - to do: embedding ... learn the state space from sensor input
  - to do: embedding ... let the goal be assigned naturally
  - to do: embedding ... replace the hand-designed planning phases

  27. Acknowledgments. Collaborators: Jochen Triesch (FIAS, J. W. Goethe University Frankfurt), Stefan Wermter (University of Sunderland), Mark Elshaw (University of Sheffield)
