
Lecture about Agents that Learn



Presentation Transcript


  1. Lecture about Agents that Learn • 3rd April 2000 • INT4/2I1235

  2. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  3. Introduction • Today's topic • Who is the lecturer • Why do we have this lecture

  4. Today's topic • How do agents learn? • What are the benefits of learning agents? • Learning in isolation, or in cooperation?

  5. Who is the lecturer • Johan Kummeneje • Doctoral Student • RoboCup, Social Decisions, and Java

  6. Why do we have this lecture • Beats me… you tell me. • Take two minutes to think about why this is interesting, and then I will ask two or three of you what you think.

  7. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  8. Centralized vs Decentralized • Introduction • The Degree of Decentralization • Interaction-specific features • Involvement-specific features • Goal-specific features • The learning method • The learning feedback

  9. Introduction • Learning process => planning, inference, decision steps etc. • Centralized learning or isolated learning • Decentralized learning or interactive learning

  10. The Degree of Decentralization • Distributedness • Parallelism

  11. Interaction-specific features • Level of interaction (from "simple" observation to complex negotiations and dialogues) • Persistence of interaction (short to long) • Frequency (low to high) • Pattern (unstructured to hierarchical) • Variability (fixed to dynamic)

  12. Involvement-specific features • Relevance to the learning process • Role in the learning process • Generalist vs Specialist

  13. Goal-specific features • Improvement (Individual vs Social) • Conflict vs Compatible Goals

  14. The learning method • Rote learning (Swedish "korvstoppning", i.e. cramming) • Instructed and advised • Examples and practice (learning by doing, Baden-Powell) • Analogy • Discovery • Effort increases from top to bottom.

  15. The learning feedback • Supervised (the feedback specifies which action is the best) • Reinforcement (feedback is a reward; the agent maximizes the utility of its actions) • Unsupervised (no explicit feedback)

  16. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  17. Credit Assignment Problem • Inter-agent CAP (how to divide credit among the different agents) • Intra-agent CAP (how to divide credit among the different actions performed within a single agent)

  18. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  19. Learning and Activity Coordination • Introduction • Reinforcement Learning • Q-Learning and Learning Classifier Systems • Isolated, Concurrent Reinforcement Learners • Interactive Reinforcement Learning of Coordination • ACE and AGE

  20. Introduction • Activity coordination • Adaptation to differences in the coordination process • Effectively utilizing opportunities and avoiding pitfalls.

  21. Reinforcement Learning • Optimise the feedback (reinforcement) • Modeled by a Markov decision process ⟨S, A, P, r⟩, where P: S × S × A → [0, 1] gives the transition probabilities and r the reward

  22. Q-Learning • On receiving feedback, update the Q-value: • Q(s,a) ← (1 − β)Q(s,a) + β(R + γ max_a′ Q(s′,a′)) • where β is a small constant called the learning rate and γ is the discount factor
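The update rule above can be sketched as a one-line tabular update. This is a minimal illustration, not code from the lecture; the dictionary representation of the Q-table and the default parameter values are assumptions.

```python
def q_update(q, s, a, r, s_next, actions, beta=0.1, gamma=0.9):
    """Apply Q(s,a) <- (1 - beta) Q(s,a) + beta (r + gamma max_a' Q(s',a')).

    q is a dict mapping (state, action) pairs to Q-values; missing
    entries are treated as 0.0.
    """
    # Value of the best action available in the successor state.
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    # Blend the old estimate with the new reinforcement signal.
    q[(s, a)] = (1 - beta) * q.get((s, a), 0.0) + beta * (r + gamma * best_next)
    return q[(s, a)]

q = {}
v = q_update(q, "s0", "right", 1.0, "s1", ["left", "right"])
# With an empty table, the update reduces to beta * r = 0.1
```

Repeated calls from the same state propagate the reward estimate backwards, which is what lets the reinforcement feedback on the previous slide drive coordination.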

  23. Learning Classifier Systems • A classifier is a pair (condition, action) • Each classifier has a strength S(c,a) at any given time • At each time step a classifier is chosen from the match set (the classifiers whose conditions match the environment) • Feedback is received and the strength S is updated accordingly.
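The classifier loop above can be sketched as follows. This is a hedged illustration of the general scheme only: the strength-proportional selection, the learning-rate name `beta`, and the predicate-based condition test are assumptions, not the lecture's exact formulation.

```python
import random

class Classifier:
    """A (condition, action) pair with an associated strength S(c, a)."""

    def __init__(self, condition, action, strength=1.0):
        self.condition = condition  # predicate over the observed state
        self.action = action
        self.strength = strength

def step(classifiers, state, feedback_fn, beta=0.2):
    # Match set: classifiers whose condition matches the environment.
    match_set = [c for c in classifiers if c.condition(state)]
    if not match_set:
        return None
    # Choose one classifier, weighted by current strength.
    chosen = random.choices(match_set, weights=[c.strength for c in match_set])[0]
    # Receive feedback and update the strength accordingly.
    reward = feedback_fn(state, chosen.action)
    chosen.strength += beta * (reward - chosen.strength)
    return chosen
```

Over many steps, classifiers whose actions earn feedback accumulate strength and are selected more often.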

  24. Isolated, Concurrent Reinforcement Learners • Agent coupling • Agent relationships • Feedback timing • Optimal behaviour combinations • CIRL • No modelling of other agents • In cooperative situations, complementary policies can be developed • Adapts to similar situations.

  25. Interactive Reinforcement Learning of Coordination • Eliminates incompatible actions • Agents can observe the set of actions considered by other agents • Two alternatives are ACE and AGE

  26. Action Estimate Algorithm (ACE) • Each agent computes its set of performable actions • For each of these, the agent computes the goal relevance • For every action with a goal relevance above a threshold, the agent calculates and announces a bid with a risk factor and a noise term: • B(S) = (a + b)E(S) • Incompatible actions are removed; the agent then executes the action with the highest bid • The feedback increases the probability that successful actions are performed in the future.
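The ACE selection step can be sketched roughly as below. The field names, the deterministic noise parameter, and the per-action compatibility flag are illustrative assumptions standing in for ACE's actual data structures; this shows only the filter-bid-prune-execute shape of the algorithm.

```python
def ace_select(actions, threshold, risk, noise):
    """Pick an action the ACE way: goal-relevance filter, bid
    B = (risk + noise) * E, prune incompatible actions, take the
    highest bid.

    actions: list of dicts with 'relevance' (goal relevance),
    'estimate' (E), and a 'compatible' flag.
    """
    bids = []
    for act in actions:
        if act["relevance"] >= threshold:        # goal-relevance filter
            bid = (risk + noise) * act["estimate"]  # B(S) = (a + b) E(S)
            bids.append((bid, act))
    # Remove incompatible actions, then execute the highest bid.
    bids = [(b, a) for b, a in bids if a["compatible"]]
    if not bids:
        return None
    return max(bids, key=lambda pair: pair[0])[1]
```

AGE differs in that it collects actions into mutually compatible activity contexts first and compares the *sums* of bids per context, at higher computational cost.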

  27. Action Group Estimate Algorithm (AGE) • All applicable actions from each agent are collected into all possible activity contexts, in which all actions are mutually compatible • Using the same bidding strategy as ACE, the activity context with the highest sum of bids is chosen for execution • Credit assignment depends on the actions performed and their relevance • Requires more computational effort than ACE.

  28. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  29. Learning about and from other agents • Introduction • Learning Organizational Roles • Learning in Market Environments

  30. Introduction • Learning to improve the individual performance • At the expense of other agents • Anticipatory Agents, RMM

  31. Learning Organizational Roles • Agents learn roles so as to better complement each other • Each agent can take on one role at a time from a set of roles; the task is to choose the most appropriate role (minimise costs) • f(U, P, C, Potential)

  32. Learning in Market Environments • Agents sell/buy information from each other. • 0-level agents do not model other agents • 1-level agents model other agents as 0-level agents • 2-level agents model other agents as 1-level agents
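The 0/1/2-level hierarchy above is recursive: a k-level agent models its counterparts as (k-1)-level agents, bottoming out at 0-level agents that model no one. A minimal sketch, assuming a toy pricing game that is my own invention (the base price and the undercutting rule are not from the lecture):

```python
BASE_PRICE = 10.0  # assumed price a model-free agent quotes

def price(level):
    """Quote a price for an information good, modelling the opponent
    recursively one level down."""
    if level == 0:
        return BASE_PRICE        # 0-level: no model of other agents
    opponent = price(level - 1)  # model the other as a (level-1) agent
    return opponent - 1.0        # undercut the predicted quote

# price(0) -> 10.0, price(1) -> 9.0, price(2) -> 8.0
```

The point of the sketch is the structure, not the numbers: each extra modelling level buys a prediction of the opponent's behaviour at the cost of one more recursive evaluation.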

  33. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  34. Learning and Communication • Introduction • Reducing Communication by Learning • Improving Learning by Communication

  35. Introduction • Learning to communicate • Communicating as learning • What to communicate? • When to communicate? • With whom to communicate? • How to communicate?

  36. Reducing Communication by Learning • Learning about the abilities of other agents. • Learning which agents to ask, instead of broadcasting • Problem similarities

  37. Improving Learning by Communication • Communicating beliefs and pieces of information • Explanation • Ontologies • Finding out complex relationships between different agents and actions.

  38. Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary

  39. Summary • We have seen the focus shift from isolated (individual, centralized) learning to a more diverse range of learning approaches • Besides standard (older) ML methods, several new ML algorithms have been proposed • Agents learn to improve communication and cooperation.

  40. Further reading • Peter Stone, PhD thesis • Weiss (course material), chapter 6 • Russell and Norvig, Artificial Intelligence: A Modern Approach

  41. THE END
