
Evolution of Teamwork in Multiagent Systems

Evolution of Teamwork in Multiagent Systems. Research Preparation Examination by Jacob Schrum.


Presentation Transcript


  1. Evolution of Teamwork in Multiagent Systems Research Preparation Examination by Jacob Schrum

  2. Why Multiple Agents? • Many applications • Physical World • Robotics • Autonomous automobiles • Military applications • Network Systems • Artificial World • Games • Graphics • Entertainment • Artificial Life

  3. Why Multiagent Perspective? • Decentralized control • Failure recovery • Individual agents simpler than whole • Some environments don’t support central control • Human interaction • Humans are also agents • Agents interacting with humans are in MAS

  4. Teamwork in Multiagent Systems • Problem divided amongst many agents • Teamwork often required for success • Communication sometimes an issue • How to learn teamwork: open question

  5. Direct Approach: Careful Design • Hand code everything • Benefits: • Understand end product • Drawbacks: • Not general • Difficult • Programmer time • Common in: • Robotics • Video games • Most deployed systems • What if no one knows how to program it?

  6. Learn it: Reinforcement Learning • Environment is Markov Decision Process • Learn optimal policy • Depends on value function (TD methods) • Proven convergence in tabular case • Function approximation needed for bigger problems • Problems with Partially Observable MDPs • Successes in • Predator/prey scenarios (Tan 1993) • Soccer keepaway (Kalyanakrishnan, Stone 2009) • RoboCup soccer (many…)

  7. Breed it: Evolution • Based on evolution via natural selection • Benefits: • Less restrictive policy representation • Demonstrated success in POMDP domains • Drawbacks: • Computationally intensive • Time intensive • Focus of talk

  8. Evolution Basics • Initialize population P • Evaluate all p in P (assign fitness) • Derive P’ by selecting/modifying members of P based on their fitness scores • Repeat from step 2 with P’ as P until done • P’ is usually similar to P, but slightly better • Many variations: • Genetic Algorithms, Evolution Strategies, etc.
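The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the truncation selection, the Gaussian mutation, and the names `evolve` and `toy_fitness` are all assumptions chosen to keep the example short.

```python
import random

def evolve(fitness, init, mutate, pop_size=20, generations=50, seed=0):
    """Generic generational loop: initialize, evaluate, derive P', repeat."""
    rng = random.Random(seed)
    # Initialize population P.
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate all p in P and rank them by fitness (higher is better).
        ranked = sorted(population, key=fitness, reverse=True)
        # Assumption: truncation selection keeps the better half unchanged.
        parents = ranked[: pop_size // 2]
        # Fill the rest of P' with mutated copies of selected parents.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        # P' becomes P for the next generation.
        population = parents + children
    return max(population, key=fitness)

# Hypothetical toy problem: a single real number, best value at 3.0.
def toy_fitness(x):
    return -(x - 3.0) ** 2

best = evolve(
    toy_fitness,
    init=lambda rng: rng.uniform(-10.0, 10.0),
    mutate=lambda x, rng: x + rng.gauss(0.0, 0.5),
)
```

Because the parents survive unchanged, the best fitness never decreases from one generation to the next, which matches the remark that P’ is usually similar to P but slightly better.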

  9. Evolution in Multiagent Systems • Team Composition • Homogeneous • Heterogeneous • Heterogeneous from Subpopulations (pick one member from each subpopulation to make a team) • Entire population • Type of Selection • Individual • Team • Self-Selection • Multiple Objectives

  10. 1.A. Homogeneous Teams • Team members share same policy • Members know what to expect from team members • One individual evaluated per trial • Evaluations reliable because of consistent team composition

  11. 1.B. Heterogeneous Teams • Team composed of several policies • Uncertainty as to who teammates will be • Multiple individuals evaluated per trial • Evaluation differs depending on choice of team members

  12. 1.C. Subpopulations • Each slot filled by representative from specific subpopulation • Subpopulations specialize • Individuals know what to expect of members in each slot • Team composition is still heterogeneous
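The team compositions on slides 10 to 12 differ only in how a team is assembled for a trial. A rough sketch, in which policies are stand-in strings and all function names are hypothetical:

```python
import random

def homogeneous_team(policy, size):
    """Every slot runs the same policy (slide 10)."""
    return [policy] * size

def heterogeneous_team(population, size, rng):
    """Distinct members drawn from one shared population (slide 11)."""
    return rng.sample(population, size)

def subpopulation_team(subpopulations, rng):
    """One representative per slot, each from its own subpopulation (slide 12)."""
    return [rng.choice(subpop) for subpop in subpopulations]

rng = random.Random(0)
population = ["p0", "p1", "p2", "p3", "p4", "p5"]
subpops = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]

team_h = homogeneous_team("p0", 3)
team_x = heterogeneous_team(population, 3, rng)
team_s = subpopulation_team(subpops, rng)
```

In the homogeneous case one trial evaluates a single individual, while the other two cases evaluate several individuals at once, so a member's score depends on who else was drawn.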

  13. 1.D. Entire Population • The entire population is seen as a cooperating team • Team level selection not possible • Population may divide into competing subpopulations • Mating restrictions • Genetic/Tag-based recognition

  14. 2.A. Individual Selection • Individuals selected based on own fitness • Commonly used with heterogeneous teams • Can result in selfish behaviors • Altruism relevant • sacrificing own fitness to raise fitness of another • Reciprocity relevant • helping another to get help in return

  15. 2.B. Team Selection • Individuals selected based on team fitness • Common fitness, sum, average, etc. • Commonly used with homogeneous teams • Enables slackers in heterogeneous teams • Altruism and reciprocity have no meaning • No credit assignment problems between members
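The difference between the two selection levels comes down to which score each member carries into selection. A small sketch, assuming each member's contribution in a trial is already known as a number (the function names are hypothetical):

```python
def individual_selection_fitness(contributions):
    """Each member is selected on its own score."""
    return list(contributions)

def team_selection_fitness(contributions, aggregate=sum):
    """Every member receives the same team-level score (sum by default)."""
    shared = aggregate(contributions)
    return [shared] * len(contributions)

# A heterogeneous team in which the third member contributes nothing.
contributions = [5, 3, 0]
per_member = individual_selection_fitness(contributions)
per_team = team_selection_fitness(contributions)
per_team_avg = team_selection_fitness(
    contributions, aggregate=lambda xs: sum(xs) / len(xs)
)
```

Under team selection the idle member is scored exactly like its teammates, which is how slackers can persist in heterogeneous teams.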

  16. 2.C. Self-Selection • Individuals choose when and with whom to mate • Common in Artificial Life simulations • AL studies emergence of biological phenomena • Usually involves a spatial component • Extinction is possible • Auto restart • Spawn new members

  17. 3. Multiple Objectives • Assume individual has fitness scores: • F = (f1,…,fN) in objectives 1 through N • Which values of F are best? • Traditional approach • fitness(F) = f1*w1 + … + fN*wN for weights w1,…,wN • Pareto-based approach • Partition population into non-dominated Pareto fronts • Assign fitness based on Pareto-front
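Both approaches on this slide can be written down directly. The sketch below assumes every objective is maximized and uses a simple quadratic-time partition into fronts; the function names and the sample points are illustrative only.

```python
def weighted_sum(scores, weights):
    """Traditional approach: fitness(F) = f1*w1 + ... + fN*wN."""
    return sum(f * w for f, w in zip(scores, weights))

def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_fronts(points):
    """Peel off successive non-dominated fronts until no points remain."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# Hypothetical scores of six individuals in two objectives.
points = [(1, 5), (3, 3), (5, 1), (1, 3), (2, 2), (1, 1)]
fronts = pareto_fronts(points)
# fronts[0] holds the non-dominated points; later fronts are progressively worse.
```

A Pareto-based fitness can then be assigned from the index of the front an individual falls in, with lower indices being better.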

  18. Pareto Front Example • Each point represents an individual’s scores • Point dominates other points in its box • 3 Pareto fronts of non-dominated points

  19. Case Studies • Review State of the Art • For each study: • Classify type of selection • Classify team composition • Identify unanswered questions • Future research directions

  20. AntFarm • Evolve foraging behavior • Pheromones to communicate • Individual selection • Entire population as a team • No cooperative foraging! • Likely cause: individual selection • Individual selection offers less incentive for teamwork • Teamwork especially difficult when there is only one team * AntFarm: Towards Simulated Evolution. Collins, Jefferson. 1991

  21. Evolving Communication • Exploration task • Pheromones to communicate • Team selection • Homogeneous teams vs. static bots • Pairs of objectives, Pareto-based • Different behaviors in different runs • Compromise strategy • Blocking strategy • Teamwork possible with homogeneous teams • Need to move beyond grid-worlds • Move beyond two objectives * Emergence of Communication in Competitive Multi-Agent Systems: A Pareto Multi-Objective Approach. McPartland, Nolfi, Abbass. 2005

  22. SwarmEvolveTags • Birds visit food stations • Energy can be shared • Sharing based on tags • Self-selection • Entire population as team • Competing subpopulations emerged • Cooperation in entire population without team selection • Altruism via aiding similar individuals • Teamwork as a result of subpopulation homogeneity * Evolution of cooperation without reciprocity. Riolo, Cohen, Axelrod. 2001 * Tags and the Evolution of Cooperation in Complex Environments. Spector, Klein, Perry. 2004

  23. Legion-I • Roman legions defend countryside and cities • Team level selection • Homogeneous teams • Multi-modal behavior • Defend city • Pursue barbarians • Homogeneous team members must fill all roles • Could not learn more complicated/strategic tasks • Example: building roads to speed up travel * Neuroevolution for Adaptive Teams. Bryant, Miikkulainen. 2003

  24. Role-Based Cooperation • Toroidal predator/prey grid world • Individual selection • Team fitness shared by team members • Multi-Agent ESP: subpopulation based • Simple non-communicating method outperforms communicating method • Teamwork without homogeneity • Communication not always needed • May only apply to simple domains • Still need to scale up complexity • Get away from grid worlds * Coevolution of Role-Based Cooperation in Multi-Agent Systems. Yong, Miikkulainen. 2007

  25. NERO • Machine Learning game • Human interaction via fitness function • Individual selection • Entire population is team • Multiple objectives • User defines weights dynamically • Maintenance of fitness function • Old behaviors can be forgotten when learning new ones • Need to learn multiple tasks simultaneously * Evolving Neural Network Agents in the NERO Videogame. Stanley, Bryant, Miikkulainen. 2005

  26. Pareto Multi-objective NPCs • Evolved monsters vs. bot with stick • Individual selection • Large heterogeneous teams of 15 • Third of entire population • Multiple objectives, Pareto-based • Credit assignment trick • Learns multiple objectives simultaneously • Different runs can lead to very different results • Different areas of trade-off surface • Population becomes mostly homogeneous * Constructing Complex NPC Behavior via Multi-Objective Neuroevolution. Schrum, Miikkulainen. 2008

  27. Dead End Game • Human prey vs. predators • Offline evolution vs. bot • Team level selection • Homogeneous teams • Online evolution vs. human • Individual selection • Small heterogeneous team • Different configurations appropriate at different levels • Sometimes the domain leaves no choice * Interactive Opponents Generate Interesting Games. Yannakakis, Hallam. 2004

  28. Cooperating Robots • Retrieve tokens • Simulation → Robots • Compared selection levels • Individual vs. Team • Compared team compositions • Homogeneous vs. heterogeneous • Homogeneous better with teamwork and altruism • Homogeneous best with team selection • Heterogeneous best with individual selection • Did not consider subpopulations • Tasks only involved foraging (no other objectives) * Genetic Team Composition and Level of Selection in the Evolution of Cooperation. Waibel, Keller, Floreano. 2008

  29. Summary of Issues • More complexity • Move beyond grid worlds • Need multiple contradictory objectives • Act in continuous, real-time world • Best evolutionary configuration • More comparisons between team compositions • Especially subpopulation-based method • Task/configuration pairings? • Credit assignment issues • Multi-modal behavior • What to do and when

  30. Experiment • Four monsters vs. bot with stick • Smaller team makes task harder • Compare homogeneous, heterogeneous and subpopulation • Homogeneous uses team selection • Others use individual selection • Multiple objectives: • Group damage • Individual injury • Individual time alive

  31. Heterogeneous Results • Many generations (600+) • Not that long in real time • Mostly selfish • Good teamwork can arise though (Baiting) • Teamwork depends on population being homogeneous (Figure panels: Selfish, Teamwork)

  32. Homogeneous Results • Fewer generations (100-200) • Actually longer in real time • Always some form of teamwork • Baiting • Timed Assault (Figure panels: Timed Assault, Baiting)

  33. Subpopulations Results • Many generations (400+) • Each generation takes a lot of real time • Easy for slacker subpopulation to persist • Limited teamwork • Only some members participate (Figure panel: Cooperating Pair)

  34. Discussion • Can subpopulation method do better? • Better credit assignment • Team level selection (how?) • Speed up homogeneous and subpopulations • Heterogeneous: discourage selfishness

  35. Future Research Questions • Credit assignment issues • Cooperating individuals cannot be identified • Objectives define best evolutionary configuration? • Complex domains/real problems • Many objectives • Continuous, real-time • Potential challenge domains • Robocup Soccer • Unreal Tournament

  36. Conclusion • Teamwork in Multiagent Systems important area • Evolution has been successful • Better understand why • Team configuration • Level of selection • Presence/absence of credit assignment problems • Apply to harder domains • Real-time • Continuous/noisy • Multiple contradictory objectives

  37. Questions? schrum2@cs.utexas.edu

  38. Auxiliary Slides

  39. Cooperation Without Reciprocity • Abstract study of the evolution of cooperation • Donor/recipient model • 3 random pairings with option of donating fitness c so that recipient can gain fitness b • Choice to donate based on similarity of tags • Individual selection with entire population as team • Subpopulations emerged based on tags • Donation rate changes cyclically, but generally stays high (73%) for c < b • Need to apply in actual domain requiring teamwork * Evolution of cooperation without reciprocity. Riolo, Cohen, Axelrod. 2001
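The donor/recipient step can be sketched as follows. The slide only says that donation depends on tag similarity; the per-agent tolerance threshold, the `Agent` fields, and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    tag: float          # value used to judge similarity to others
    tolerance: float    # assumption: donate if tags differ by at most this
    fitness: float = 0.0

def maybe_donate(donor, recipient, cost, benefit):
    """Donor pays cost c and recipient gains benefit b if tags are similar."""
    if abs(donor.tag - recipient.tag) <= donor.tolerance:
        donor.fitness -= cost
        recipient.fitness += benefit
        return True
    return False

a = Agent(tag=0.50, tolerance=0.10)
b = Agent(tag=0.55, tolerance=0.00)
c = Agent(tag=0.90, tolerance=0.10)

helped_similar = maybe_donate(a, b, cost=0.1, benefit=1.0)
helped_distant = maybe_donate(a, c, cost=0.1, benefit=1.0)
```

With c < b, a group of agents with similar tags gains more from donations than it pays, which is consistent with the emergence of tag-based subpopulations noted on the slide.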

  40. Cooperation Without Reciprocity Results

  41. Team Composition in MAS • Taxonomy proposed by Stone*: • Definition of communication is broad: • Message passing, blackboard, information sharing, etc. * Multiagent Systems: A Survey from a Machine Learning Perspective. Stone. 2000
