Latent Learning in Agents

Latent Learning in Agents. ICML 03 Robotics/Vision Workshop. Rati Sharma. Problem statement. Determine optimal paths in spatial navigation tasks. We use a deterministic grid environment as our world model. Various approaches have been used: ANNs, Q-learning, Dyna. Latent Learning.

Presentation Transcript


  1. Latent Learning in Agents • ICML 03 Robotics/Vision Workshop • Rati Sharma

  2. Problem statement • Determine optimal paths in spatial navigation tasks. • We use a deterministic grid environment as our world model. • Various approaches have been used: ANNs, Q-learning, Dyna.

  3. Latent Learning • Tolman proposed the idea of cognitive maps based on experiments on rats in a T-maze. • Latent learning is learning that is not evident to the observer at the time it occurs but becomes apparent once a reinforcement is introduced.

  4. Algorithms used: Q-Learning • Q-learning eventually converges to an optimal policy without ever having to learn and use an internal model of the environment. • Update rule: $Q(s,a) = r(s,a) + \gamma \max_{a'} Q(\delta(s,a), a')$, where $r(s,a)$ is the reward function, $\delta(s,a)$ is the new state, and $\gamma$ is the discount factor. • Disadvantage: convergence to the optimal policy can be very slow.

  5. Model-based learning: Dyna • A form of planning is performed in addition to learning. • Learning updates the appropriate value function estimates according to real experience. • Planning updates the same value function estimates for simulated transitions chosen from the world model.

  6. Problems considered • Blocking problem • Shortcut problem

  7. Results

  8. Results

  9. Conclusion • Model-based learning performs significantly better than Q-learning. • On the blocking and shortcut problems, the agent demonstrates latent learning.

  10. Acknowledgements • Prof. Littman • Prof. Horatiu Voicu
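
The learning and planning steps described on slides 4 and 5 can be illustrated with a minimal sketch. This is not the presenter's implementation: the 4x4 grid, the reward of 1 for entering the goal, the discount factor of 0.9, the epsilon-greedy exploration, and every name below (`step`, `backup`, `dyna_q`, `greedy_path`) are assumptions chosen for illustration. Setting `planning_steps=0` reduces the loop to plain one-step Q-learning, while a positive value adds Dyna-style planning over transitions replayed from the learned model.

```python
import random

# Hypothetical 4x4 deterministic grid; all constants below are assumptions.
SIZE = 4
START, GOAL = (0, 0), (3, 3)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9  # discount factor

def step(state, action):
    """Deterministic world: move one cell, or stay put when hitting a wall."""
    row, col = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    nxt = (row, col) if 0 <= row < SIZE and 0 <= col < SIZE else state
    return nxt, (1.0 if nxt == GOAL else 0.0)

def backup(q, s, a, r, s2):
    """Deterministic update: Q(s,a) = r + gamma * max over a' of Q(s2, a')."""
    future = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in range(4))
    q[(s, a)] = r + GAMMA * future

def dyna_q(episodes=50, planning_steps=20, epsilon=0.1, max_steps=1000, seed=0):
    """Return the Q-table and the number of real environment steps taken."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(SIZE) for c in range(SIZE) for a in range(4)}
    model = {}  # learned world model: (state, action) -> (reward, next state)
    real_steps = 0
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            # Epsilon-greedy action choice with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(4)
            else:
                best = max(q[(s, b)] for b in range(4))
                a = rng.choice([b for b in range(4) if q[(s, b)] == best])
            s2, r = step(s, a)
            real_steps += 1
            backup(q, s, a, r, s2)       # learning from real experience
            model[(s, a)] = (r, s2)      # record the observed transition
            for _ in range(planning_steps):
                # Planning: replay a previously seen transition from the model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                backup(q, ps, pa, pr, ps2)
            s = s2
            if s == GOAL:
                break
    return q, real_steps

def greedy_path(q, limit=50):
    """Follow the highest-valued action from START until GOAL or the limit."""
    path, s = [START], START
    while s != GOAL and len(path) <= limit:
        a = max(range(4), key=lambda b: q[(s, b)])
        s, _ = step(s, a)
        path.append(s)
    return path

if __name__ == "__main__":
    for k in (0, 20):
        q, n = dyna_q(planning_steps=k)
        print(f"planning_steps={k}: real steps={n}, path={greedy_path(q)}")
```

Comparing the real-step counts printed for `planning_steps=0` and `planning_steps=20` gives a rough feel for how much real experience planning can save on this toy grid; the numbers depend on the seed and say nothing about the results reported in the presentation.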
