
History-Dependent Graphical Multiagent Models


Presentation Transcript


  1. History-Dependent Graphical Multiagent Models Quang Duong Michael P. Wellman Satinder Singh Computer Science and Engineering University of Michigan, USA Yevgeniy Vorobeychik Computer and Information Sciences University of Pennsylvania, USA

  2. Modeling Dynamic Multiagent Behavior
  • Design a representation that:
    • expresses a joint probability distribution over agent actions over time
    • supports inference (e.g., prediction)
    • exploits locality of interaction
  • Our solution: history-dependent graphical multiagent models (hGMMs)

  3. Example: Consensus Voting [Kearns et al. '09]
  [Figure: a six-agent voting network shown from agent 1's perspective, unfolding over time; t = 10s]

  4. Graphical Representations
  • Exploit locality in agent interactions
  • MAIDs [Koller & Milch '01], NIDs [Gal & Pfeffer '08], action-graph games [Jiang et al. '08]
  • Graphical games [Kearns et al. '01] and Markov random fields for graphical games [Daskalakis & Papadimitriou '06]

  5. Graphical Multiagent Models (GMMs) [Duong, Wellman & Singh UAI-08]
  • Nodes: agents; edges: dependencies between agents
  • Neighborhood N_i includes i and its neighbors
  • Accommodates multiple sources of belief about agent behavior for static (one-shot) scenarios
  • Joint probability distribution of the system's actions: Pr(a) = Π_i π_i(a_Ni) / Z, where π_i(a_Ni) is the potential of neighborhood N_i's joint actions and Z is the normalization constant
  [Figure: six-agent interaction graph]
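  A minimal sketch of how a static GMM assigns joint action probabilities, assuming a toy three-agent line graph and an invented agreement-favoring potential (the paper's potentials differ):

    import itertools

    # Toy line graph 0-1-2; each neighborhood N_i includes i itself.
    neighbors = {0: (0, 1), 1: (0, 1, 2), 2: (1, 2)}

    def potential(i, a):
        # pi_i(a_Ni): toy potential weighting agreement within N_i higher.
        return 2.0 if len({a[j] for j in neighbors[i]}) == 1 else 1.0

    def unnormalized(a):
        p = 1.0
        for i in neighbors:
            p *= potential(i, a)
        return p

    joints = list(itertools.product((0, 1), repeat=3))
    Z = sum(unnormalized(a) for a in joints)        # normalization constant
    pr = {a: unnormalized(a) / Z for a in joints}   # Pr(a) = prod_i pi_i(a_Ni) / Z
    print(pr[(1, 1, 1)])  # consensus profiles receive the most probability mass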

  6. Contribution
  • Extend the static GMM to model dynamic joint behavior by conditioning on local history

  7. History-Dependent GMM (hGMM)
  • Extends the static GMM by conditioning joint agent behavior on an abstracted history of actions
  • Directly captures joint behavior using limited action history
  • Joint probability distribution of the system's actions at time t: Pr(a^t | H^t) = Π_i π_i(a^t_Ni | H^t_Ni) / Z(H^t), where H^t is the abstracted history, H^t_Ni is its neighborhood-relevant restriction, π_i is the potential of neighborhood N_i's joint actions at t, and Z(H^t) is the normalization constant
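  A sketch of the history-conditioned version on the same toy graph: the potential now depends on an abstracted (horizon-limited, frequency-summarized) local history. The abstraction function and potential form here are illustrative assumptions, not the paper's exact definitions:

    import itertools, math

    neighbors = {0: (0, 1), 1: (0, 1, 2), 2: (1, 2)}

    def abstracted(history, i, h=2):
        # f(H^t_Ni): per-action frequencies over the last h joint actions,
        # restricted to neighborhood N_i (add-one smoothed to stay positive).
        counts = {0: 1, 1: 1}
        for past in history[-h:]:
            for j in neighbors[i]:
                counts[past[j]] += 1
        total = sum(counts.values())
        return {act: c / total for act, c in counts.items()}

    def potential(i, a, history):
        # pi_i(a^t_Ni | H^t_Ni): weight neighborhood actions by how often
        # each action appeared in the abstracted local history.
        freq = abstracted(history, i)
        return math.prod(freq[a[j]] for j in neighbors[i])

    def conditional(history):
        # Pr(a^t | H^t) = prod_i pi_i(a^t_Ni | H^t_Ni) / Z(H^t)
        joints = list(itertools.product((0, 1), repeat=3))
        w = {a: math.prod(potential(i, a, history) for i in neighbors)
             for a in joints}
        Z = sum(w.values())
        return {a: v / Z for a, v in w.items()}

    dist = conditional([(0, 0, 1), (0, 0, 0)])
    print(max(dist, key=dist.get))  # locally frequent actions score highest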

  8. Joint vs. Individual Behavior Models
  • Individual behavior models (IBMMs): autonomous agents' behaviors are conditionally independent given the complete history. Agent i's actions depend on past observations through a strategy function σ_i(H^t), so Pr(a^t | H^t) = Π_i σ_i(H^t)
  • History is often abstracted/summarized (limited horizon h, frequency function f, etc.), which induces correlations in observed behavior
  • Joint behavior models (hGMMs): no independence assumption
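  For contrast, a minimal IBMM sketch: given the history, the joint probability factors into per-agent strategy terms. The "repeat your last action with probability 0.8" strategy is an invented stand-in for σ_i:

    def sigma(i, history):
        # sigma_i(H^t): toy strategy -- repeat own last action w.p. 0.8.
        last = history[-1][i]
        return {last: 0.8, 1 - last: 0.2}

    def ibmm_joint_prob(a, history):
        # Pr(a^t | H^t) = prod_i sigma_i(H^t)(a_i): independence given history.
        p = 1.0
        for i in range(len(a)):
            p *= sigma(i, history)[a[i]]
        return p

    print(ibmm_joint_prob((0, 0, 1), [(0, 0, 1)]))  # 0.8 ** 3 = 0.512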

  9. Voting Consensus Simulation
  • Simulation (treated as the true model): smooth fictitious play [Camerer & Ho '99]
  • Agents respond probabilistically in proportion to expected rewards, given the reward function and their beliefs about others' behavior
  • Note: this generative model is an individual behavior model; given abstracted history, joint behavior models may capture behavior better even when it is generated by an individual behavior model
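  A sketch of the logit response at the heart of smooth fictitious play, with invented beliefs, payoffs, and smoothing parameter λ (the paper follows Camerer & Ho '99):

    import math, random

    def smooth_fp_choice(expected_reward, lam=2.0):
        # Logit response: Pr(a) proportional to exp(lam * E[reward | a]),
        # a standard smoothing of fictitious play's best response.
        acts = list(expected_reward)
        weights = [math.exp(lam * expected_reward[a]) for a in acts]
        return random.choices(acts, weights=weights)[0]

    # Toy beliefs: empirical frequency with which neighbors chose 'red'.
    belief_red = 0.7
    payoff = {"red": 0.6, "blue": 0.9}  # reward if consensus on that color
    expected = {"red": payoff["red"] * belief_red,
                "blue": payoff["blue"] * (1 - belief_red)}
    print(smooth_fp_choice(expected))   # both actions keep probability mass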

  10. Voting Consensus Models
  • Individual Behavior Multiagent Model (IBMM): conditions on the reward for action a_i (regardless of neighbors' actions) and the frequency with which a_i was previously chosen by each of i's neighbors, with a normalization term
  • Joint Behavior Multiagent Model (hGMM): conditions on the frequency with which a_Ni was previously chosen by neighborhood N_i and the expected reward for a_Ni, discounted by the number of dissenting neighbors
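  The key contrast on this slide is which history statistic each model conditions on. A hedged sketch of the two statistics (toy definitions; the paper's exact functional forms combine these with rewards and normalization):

    def neighbor_action_freq(history, i, ai, neighbors):
        # IBMM-style statistic: how often action ai was previously chosen
        # by i's neighbors, averaged over the history.
        hits = sum(past[j] == ai for past in history for j in neighbors[i])
        return hits / (len(history) * len(neighbors[i]))

    def neighborhood_joint_freq(history, i, a_ni, neighbors):
        # hGMM-style statistic: how often neighborhood N_i jointly played
        # the configuration a_ni.
        hits = sum(tuple(past[j] for j in neighbors[i]) == a_ni
                   for past in history)
        return hits / len(history)

    history = [(0, 0, 1), (0, 1, 1), (0, 0, 1)]
    nbrs = {0: (0, 1), 1: (0, 1, 2), 2: (1, 2)}
    print(neighbor_action_freq(history, 1, 0, nbrs))             # marginal statistic
    print(neighborhood_joint_freq(history, 1, (0, 0, 1), nbrs))  # joint statistic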

  11. Model Learning and Evaluation
  • Given a sequence of joint actions over m time periods X = {a^0, …, a^m}, the log likelihood induced by model M with parameters θ is L_M(X; θ)
  • Potential function learning: assumes a known graphical structure and employs gradient descent
  • Evaluation: compute L_M(X; θ) on test data to evaluate M
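  A minimal sketch of potential learning by gradient ascent on the log likelihood, assuming a toy one-parameter exponential-family model with a single "full agreement" feature (invented for illustration; the paper learns richer potentials over a known graph):

    import itertools, math

    joints = list(itertools.product((0, 1), repeat=3))

    def feature(a):
        # phi(a): 1 if all agents agree, else 0.
        return 1.0 if len(set(a)) == 1 else 0.0

    def log_likelihood(data, theta):
        # L_M(X; theta) = sum_t [theta * phi(a^t) - log Z(theta)]
        Z = sum(math.exp(theta * feature(a)) for a in joints)
        return sum(theta * feature(a) - math.log(Z) for a in data)

    def fit(data, steps=200, lr=0.5):
        theta = 0.0
        for _ in range(steps):
            Z = sum(math.exp(theta * feature(a)) for a in joints)
            model_mean = sum(feature(a) * math.exp(theta * feature(a)) / Z
                             for a in joints)
            data_mean = sum(feature(a) for a in data) / len(data)
            theta += lr * (data_mean - model_mean)  # gradient of L_M w.r.t. theta
        return theta

    X = [(1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 1, 1)]
    theta = fit(X)
    print(theta, log_likelihood(X, theta))  # evaluate the fitted model on X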

  12. Experiments
  • 10 agents
  • Payoffs for red and blue consensus outcomes drawn i.i.d. between 0 and 1; payoff 0 if no consensus is reached
  • Maximum node degree d
  • T = 100 time steps, or fewer if the vote converges earlier
  • 20 smooth fictitious play game runs generated for each game configuration (10 for training, 10 for testing)

  13. Results
  • Evaluation metric: log likelihood for hGMM / log likelihood for IBMM (green: hGMM > IBMM; yellow: hGMM < IBMM)
  • hGMMs outperform IBMMs in predicting outcomes for shorter history lengths: a shorter history horizon means more abstraction of history, which induces more behavior correlation, so hGMM > IBMM
  • hGMMs outperform IBMMs in predicting outcomes across different values of d

  14. Asynchronous Belief Updates
  • hGMMs outperform IBMMs by a wider margin for longer summarization intervals v, which induce more behavior correlation

  15. Direct Sampling
  • Compute the joint distribution of actions as the empirical distribution of the training data
  • Evaluation metric: log likelihood for hGMM / log likelihood for direct sampling
  • Direct sampling is computationally more expensive, yet less powerful, than the hGMM
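  A sketch of this baseline as described: estimate the joint distribution as the empirical distribution of training joint actions. Laplace smoothing is added here so test log likelihood stays finite (an assumption, not necessarily the paper's choice); the table's 2^n size is one way to see why the baseline is expensive yet weak:

    import itertools, math
    from collections import Counter

    def empirical_model(train, n_agents=3):
        # One probability per joint action profile: 2 ** n_agents entries.
        joints = list(itertools.product((0, 1), repeat=n_agents))
        counts = Counter(train)
        total = len(train) + len(joints)  # +1 smoothing per profile
        return {a: (counts[a] + 1) / total for a in joints}

    def log_likelihood(model, test):
        return sum(math.log(model[a]) for a in test)

    train = [(1, 1, 1), (1, 1, 1), (0, 0, 0), (1, 0, 1)]
    model = empirical_model(train)
    print(log_likelihood(model, [(1, 1, 1), (0, 0, 0)]))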

  16. Conclusions
  • hGMMs support efficient and effective inference about system dynamics, using abstracted history, in scenarios exhibiting locality of interaction
  • hGMMs provide better predictions of dynamic behavior than IBMMs and direct sampling of fictitious play
  • Approximation does not degrade performance
  • Future work:
    • more domain applications: authentic voting experiment data and other scenarios
    • a (fully) dynamic GMM that allows reasoning about unobserved past states
