Explore agent perspectives, decision-making challenges, and robust algorithms for multi-agent systems using game theory. Discover C-competitive strategies for decentralized load balancing and position auctions. Learn how equilibrium concepts impact agent behaviors.
Game-Theoretic Recommendations: Some Progress in an Uphill Battle • Moshe Tennenholtz, Technion - Israel Institute of Technology
When CS meets GT • Game theory has become a central tool for the analysis of multi-agent systems in many areas of computer science. • The question is whether game theory provides answers to the two central challenges of multi-agent systems: 1. Providing a recommendation to an agent on its course of action in a multi-agent encounter (the agent perspective). 2. Leading a group of rational agents to perform a socially desired behavior (the mediator perspective).
When CS meets GT: The agent’s challenge • We are given a multi-agent environment. • How should an agent choose its action?
Decision making in multi-agent systems: load balancing • Which route should an agent take? • The route through s runs at a fraction α of the speed of the route through f, but service is split when a route is shared among agents. • [Figure: Agent 1 and Agent 2 each choose route f or route s to reach the Target.]
Decision making in multi-agent systems: the game-theoretic approach • Agents behave according to equilibrium. • When α ≥ 0.5, selecting f with probability (2 - α)/(1 + α) is an equilibrium.
Decision making in multi-agent systems: the agent perspective • Equilibrium may be of interest from a descriptive perspective, but would you recommend to your agent the strategy prescribed by a Nash equilibrium?
Decision making in multi-agent systems: a robust normative approach • Safety-level strategy: choose f with probability α/(1 + α). • The safety-level strategy does not define a Nash equilibrium, but its expected payoff equals the expected payoff in the Nash equilibrium!
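The equilibrium and safety-level claims for the two-agent example can be checked numerically. The sketch below is illustrative only and assumes a payoff model that the slides leave implicit: route f has speed 1, route s has speed α, and an agent receives half of a route's speed when both agents pick that route. The function names and the choice α = 3/4 are hypothetical.

```python
from fractions import Fraction

def payoff(mine, other, alpha):
    """Service rate for an agent choosing `mine` when the other agent chooses `other`.
    Assumed model: speed 1 on f, speed alpha on s, halved when the route is shared."""
    speed = Fraction(1) if mine == "f" else alpha
    return speed / 2 if mine == other else speed

def expected_payoff(p_mine, p_other, alpha):
    """Expected payoff when I pick f with probability p_mine and the other with p_other."""
    total = Fraction(0)
    for a, pa in (("f", p_mine), ("s", 1 - p_mine)):
        for b, pb in (("f", p_other), ("s", 1 - p_other)):
            total += pa * pb * payoff(a, b, alpha)
    return total

alpha = Fraction(3, 4)                  # any value with 1/2 <= alpha < 1
p_nash = (2 - alpha) / (1 + alpha)      # equilibrium probability of f
p_safe = alpha / (1 + alpha)            # safety-level probability of f

# In the mixed equilibrium the agent is indifferent between f and s.
nash_value = expected_payoff(p_nash, p_nash, alpha)
indifferent = (expected_payoff(Fraction(1), p_nash, alpha)
               == expected_payoff(Fraction(0), p_nash, alpha))

# Safety level: the worst case over the other agent's pure choices.
safe_value = min(expected_payoff(p_safe, Fraction(1), alpha),
                 expected_payoff(p_safe, Fraction(0), alpha))

print(p_nash, p_safe, nash_value, safe_value, indifferent)
```

Under these assumptions both values come out to 3α/(2(1 + α)), which is 9/14 for α = 3/4, even though the two mixing probabilities differ.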
The Game Theory Literature and Robust Algorithms • One powerful attack on Nash equilibrium is given in an example by Aumann: • The maximin value is as high as what we obtain in the Nash equilibria. Why, therefore, should one use the strategy that corresponds to a Nash equilibrium?
The Game Theory Literature and Robust Algorithms • Our example of robust decision-making in the context of decentralized load balancing is similar in flavor to Aumann’s criticism. • Our aim, however, is a normative one: our findings serve as a useful approach for agent design in multi-agent systems! • We will also extend the concept to C-competitive strategies, allowing surprisingly positive results for wide settings.
C-competitive strategies • Given a game G, and a set of strategies S for the agent. • A mixed strategy t ∈ Δ(S) will be called a C-competitive safety-level strategy if the ratio of the agent's expected payoff in a Nash equilibrium to its expected payoff under t is bounded by C. • In most cases we refer to the best Nash equilibrium. • If C is small we have a good suggestion for our agent!
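One way to read this definition is sketched below. It assumes that the payoff "under t" means the payoff t guarantees against every pure action of the opponent, and that the benchmark Nash payoff is supplied by the caller. The function names and the example numbers (the two-agent load-balancing game with an assumed speed ratio of 3/4 and the symmetric equilibrium payoff 9/14) are illustrative, not taken from the slides.

```python
from fractions import Fraction

def safety_value(payoffs, t):
    """Payoff that mixed strategy t guarantees: the minimum, over the opponent's
    pure actions j, of the agent's expected payoff.
    payoffs[i][j] is the agent's payoff when it plays i and the opponent plays j."""
    n_cols = len(payoffs[0])
    return min(sum(t[i] * payoffs[i][j] for i in range(len(t)))
               for j in range(n_cols))

def competitive_ratio(nash_payoff, payoffs, t):
    """Ratio of the benchmark Nash payoff to the payoff guaranteed by t."""
    return nash_payoff / safety_value(payoffs, t)

# Assumed example: rows and columns are (f, s), speeds are 1 and 3/4.
payoffs = [[Fraction(1, 2), Fraction(1)],
           [Fraction(3, 4), Fraction(3, 8)]]
nash_payoff = Fraction(9, 14)

ratio_safe = competitive_ratio(nash_payoff, payoffs, [Fraction(3, 7), Fraction(4, 7)])
ratio_pure_f = competitive_ratio(nash_payoff, payoffs, [Fraction(1), Fraction(0)])
print(ratio_safe, ratio_pure_f)
```

In this example the mixed strategy is 1-competitive, while always choosing f is only 9/7-competitive, because f guarantees no more than 1/2.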
C-competitive strategies for decentralized load balancing • Many agents (rather than only two) attempt to reach the target. • [Figure: Agent 1, ..., Agent i, ..., Agent j, ..., Agent n each choose route f or route s to reach the Target.]
C-competitive strategies for decentralized load balancing: a 9/8 ratio • Theorem: There exists a 9/8-competitive safety strategy for the extended decentralized load balancing setting. • The 9/8-competitive strategy: choose f with probability α/(1 + α).
C-competitive strategies for decentralized load balancing: many links, arbitrary speeds • m links to the target, normalized to have speeds $1 = \alpha_1 \ge \alpha_2 \ge \dots \ge \alpha_m > 0$ • Theorem: There exists a $\left(\sum_{i=1}^{m} \alpha_i\right)\left(\sum_{i=1}^{m} \prod_{j \ne i} \alpha_j\right) / \prod_{j=1}^{m} \alpha_j$-competitive safety strategy for the extended load balancing setting, when we allow m (rather than only 2) parallel communication lines and arbitrary $\alpha_i$.
Decentralized load balancing: many parallel links • The average network quality is $Q = \left(\sum_{i=1}^{m} \alpha_i\right)/m$ • A network is k-regular if $Q / \alpha_m \le k$ • Theorem: Given a k-regular network, there exists a k-competitive safety strategy for the extended load balancing setting, where we allow m (rather than just 2) parallel links.
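The regularity condition is easy to evaluate for a concrete network. The sketch below assumes the condition reads Q / α_m ≤ k, with α_m the slowest link speed, and uses a hypothetical four-link network. The function names are illustrative.

```python
from fractions import Fraction

def average_quality(speeds):
    """Average network quality Q: the mean of the link speeds."""
    return sum(speeds) / len(speeds)

def regularity(speeds):
    """Smallest k for which the network is k-regular, i.e. Q divided by the slowest speed."""
    return average_quality(speeds) / min(speeds)

# Hypothetical network: speeds normalized so that the fastest link has speed 1.
speeds = [Fraction(1), Fraction(3, 4), Fraction(1, 2), Fraction(1, 4)]
Q = average_quality(speeds)
k = regularity(speeds)
print(Q, k)
```

For these speeds Q = 5/8 and the network is k-regular for every k ≥ 5/2, so the theorem would promise a 5/2-competitive safety strategy.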
Position Auctions • [Figure: a search results page, with the position auction slots shown next to the search results.]
Position Auctions - Model • k: number of positions; n: number of players, n > k • $v_i$: player i’s valuation per click • $\alpha_j$: position j’s click-through rate, $\alpha_1 > \alpha_2 > \dots > \alpha_k$ • Allocation rule: the jth highest bid gets the jth highest position • Tie breaks: fixed-order priority rule • Payment scheme: $p_j(b_1, \dots, b_n)$ is position j’s payment under bid profile $(b_1, \dots, b_n)$ • Quasi-linear utilities: the utility for player i if assigned to position j and paying $q_i$ per click is $\alpha_j (v_i - q_i)$ • Outcome(b) = (allocation(b), position payment vector(b))
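Some Position Auctions • VCG: $p_j(b) = \sum_{l \ge j+1} b_{(l)} (\alpha_{l-1} - \alpha_l) / \alpha_j$, where $b_{(l)}$ is the l-th highest bid and $\alpha_j$ is position j’s click-through rate • Self-price: $p_j(b) = b_{(j)}$ • Next-price: $p_j(b) = b_{(j+1)}$ • A variant of next-price is what is used in practice.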
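The three payment rules can be compared on a small example. The sketch below is a hedged illustration: it assumes the usual conventions that the click-through rate of any position beyond k is 0 and that the VCG per-click payment sums over all lower-ranked bids. The function name `position_payments`, the rule labels, and the bid and click-through values are hypothetical.

```python
from fractions import Fraction

def position_payments(bids, ctrs, rule):
    """Per-click payment for each of the k positions under the given rule.
    bids: one bid per player (more players than positions).
    ctrs: click-through rates of the k positions, in decreasing order."""
    k = len(ctrs)
    ranked = sorted(bids, reverse=True)          # ranked[l-1] is the l-th highest bid

    def bid(l):                                  # assumed 0 past the last bid
        return ranked[l - 1] if l <= len(ranked) else 0

    def ctr(l):                                  # assumed 0 past the last position
        return ctrs[l - 1] if l <= k else 0

    if rule == "self":
        return [bid(j) for j in range(1, k + 1)]
    if rule == "next":
        return [bid(j + 1) for j in range(1, k + 1)]
    if rule == "vcg":
        return [sum(bid(l) * (ctr(l - 1) - ctr(l)) for l in range(j + 1, k + 2)) / ctr(j)
                for j in range(1, k + 1)]
    raise ValueError(f"unknown rule: {rule}")

bids = [10, 4, 6, 2]                             # four players
ctrs = [Fraction(1, 2), Fraction(1, 4)]          # two positions
self_price = position_payments(bids, ctrs, "self")
next_price = position_payments(bids, ctrs, "next")
vcg = position_payments(bids, ctrs, "vcg")
print(self_price, next_price, vcg)
```

With these numbers the self-price payments are 10 and 6, the next-price payments are 6 and 4, and the VCG payments are 5 and 4. The last position pays the same under next-price and VCG, while higher positions pay less under VCG.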
C-competitive Strategies for Next-Price Position Auctions • While no C-competitive strategy exists for any constant C in the complete-information setting, such strategies do exist in realistic incomplete-information settings! • We devise a robust strategy that competes with the best equilibrium expected payoff for any given agent valuation. • For example, with a uniform distribution on valuations and linear click-rates, we get a 2-competitive strategy. • For an agent with small valuations, a “close to 1” competitive safety strategy is provided!
From Robust Agents to Prediction-Based Agents • What should we do if the robust approach is not satisfactory, and there are no useful competitive safety strategies? • We suggest the use of ML techniques to predict opponent behaviors, and contrast them with existing techniques in cognitive psychology and experimental economics. • Surprisingly positive results are obtained, as validated in experiments with human subjects, and compared to leading approaches in cognitive psychology and experimental economics.
Prediction rules • We are trying to predict the behavior of player p in game g. • A prediction rule maps the strategies chosen by all other players in all games and the strategies player p chose in all games but g, to a probability distribution over all strategies in game g.
Existing Approaches • Population statistics • Cognitive methods (agent modelling): • Cognitive hierarchy (Camerer et al.) • Agent behavioral types (Costa-Gomes et al.) • We offer: a new ML approach
Population Statistics • The probability that an action will be selected in a game is the relative frequency with which it has been selected by the other agents in that game.
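As a minimal sketch of this baseline, the predicted distribution is simply the empirical frequency of each action among the other agents. The function name and the observed choices below are hypothetical.

```python
from collections import Counter

def population_statistics(observed_actions, actions):
    """Predict each action with the relative frequency observed among other agents."""
    counts = Counter(observed_actions)
    total = len(observed_actions)
    return {action: counts[action] / total for action in actions}

# Hypothetical choices of five other agents in one game.
observed = ["C", "D", "D", "C", "D"]
prediction = population_statistics(observed, actions=["C", "D"])
print(prediction)
```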
Cognitive hierarchy (Camerer et al.) • The cognitive hierarchy model defines a player’s type by the number of reasoning levels the player uses. • A "type 0" player does not use any level of reasoning: it chooses a strategy at random. • Players doing one or more steps of thinking assume that other players use fewer thinking steps. • They have an accurate guess about the proportions of players who use fewer steps than they do. • They best respond to the probability distribution induced by these proportions.
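The level-by-level reasoning above can be sketched for a symmetric two-player matrix game. This is an illustration under stated assumptions, not the authors' implementation: type 0 is taken to mix uniformly, the type proportions are passed in explicitly rather than drawn from a fitted distribution, and ties are broken toward the first action. The function name, the game, and the proportions are hypothetical.

```python
from fractions import Fraction

def cognitive_hierarchy(payoffs, type_weights):
    """payoffs[i][j]: payoff for playing i against j.
    type_weights[k]: proportion of level-k players.
    Returns (strategy of each level, predicted population distribution)."""
    n = len(payoffs)
    strategies = [[Fraction(1, n)] * n]              # level 0: uniform (assumed)
    for k in range(1, len(type_weights)):
        lower_mass = sum(type_weights[:k])
        # Belief: lower levels only, with their true proportions renormalized.
        belief = [sum(type_weights[h] / lower_mass * strategies[h][j] for h in range(k))
                  for j in range(n)]
        values = [sum(payoffs[i][j] * belief[j] for j in range(n)) for i in range(n)]
        best = values.index(max(values))             # first best response
        strategies.append([Fraction(int(i == best)) for i in range(n)])
    prediction = [sum(w * s[j] for w, s in zip(type_weights, strategies))
                  for j in range(n)]
    return strategies, prediction

# Hypothetical game: rock-paper-scissors where rock beating scissors pays 2.
payoffs = [[0, -1, 2],
           [1, 0, -1],
           [-1, 1, 0]]
weights = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
strategies, prediction = cognitive_hierarchy(payoffs, weights)
print(strategies[1], strategies[2], prediction)
```

Here level 1 plays rock (the best response to uniform play), level 2 plays paper (the best response to a population that mostly plays rock), and the predicted population distribution is (7/12, 1/3, 1/12).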
Agent Behavioral Types (Costa-Gomes et al.) • In this model each participant's behavior is determined by one of nine decision rules or types: • Altruistic - maximize the sum of payoffs • Pessimistic - maximin decision • Naïve - BR to uniform distribution • Optimistic - maximax decision • L2 - BR to Naïve • D1 - 1 round of deletion of dominated strategies • D2 - 2 rounds of deletion of dominated strategies • Equilibrium • Sophisticated - BR to population distribution
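Five of these rules are simple enough to sketch directly for a two-player matrix game. The code below is an illustration under assumptions: "Altruistic" is read as picking the row of the cell with the largest joint payoff, ties go to the first action, and the remaining types (D1, D2, Equilibrium, Sophisticated) are omitted. The function name and both payoff matrices are hypothetical.

```python
def decision_rules(own, other):
    """own[i][j] / other[i][j]: payoffs to the decision maker / the opponent
    when the decision maker plays row i and the opponent plays column j.
    Returns the row chosen by each decision rule."""
    rows = range(len(own))
    cols = range(len(own[0]))

    def argmax(indices, score):
        return max(indices, key=score)               # first maximizer on ties

    # Column a Naive opponent would pick: best response to uniform row play.
    naive_col = argmax(cols, lambda j: sum(other[i][j] for i in rows))
    return {
        "altruistic": argmax(rows, lambda i: max(own[i][j] + other[i][j] for j in cols)),
        "pessimistic": argmax(rows, lambda i: min(own[i])),    # maximin
        "naive": argmax(rows, lambda i: sum(own[i])),          # BR to uniform
        "optimistic": argmax(rows, lambda i: max(own[i])),     # maximax
        "L2": argmax(rows, lambda i: own[i][naive_col]),       # BR to Naive
    }

own = [[6, 0, 0],
       [2, 2, 2],
       [4, 4, 0]]
other = [[1, 2, 3],
         [0, 3, 4],
         [2, 1, 4]]
choices = decision_rules(own, other)
print(choices)
```

In this example the rules disagree: Altruistic and Optimistic choose row 0, Pessimistic and L2 choose row 1, and Naive chooses row 2.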
A Machine Learning Approach • In order to predict the play of the query agent, we suggest learning association rules from the training set. • Association rules map a strategy in a known game to a strategy in the unknown game. • Notice that what we suggest is learning in an ensemble of games.
Prediction using Association Rules • In order to predict the play in the unknown game, association rules are learned from the training data. • The estimation of probabilities is done using confidence levels of association rules combined with population statistics.
Boosting Technique • We calculate for each action i and game g: • The frequency that action i was played • The confidence of the best applicable association rule • The average confidence of the 10 best applicable association rules
Boosting Technique (cont.) • Let an indicator variable be 1 if strategy i was played in g and 0 otherwise. • We use linear regression to estimate constants B so that the B-weighted combination of the three quantities computed for each action i and game g is as close as possible to this indicator over all other players. • We use the fitted formula to estimate the probability distribution over actions in the unknown game.
Data Sets • The Costa-Gomes data set we used is the same data set used by Costa-Gomes et al. in their paper. • The Camerer data set we used is the same data set used by Camerer et al. in their paper. • We have conducted an experiment with 96 subjects.
Evaluation Criteria • Absolute prediction. • MSE • MLE • Best response
Analysis • Our method of using Machine Learning techniques provides better prediction than the related cognitive theories. • This shows that our approach of assuming a relation without defining its nature can be useful in prediction. • Furthermore, the most useful association rules can now be explained.
Some Useful Rules • [Figure: association-rule diagram. Recoverable labels: "cooperate in Prisoner's Dilemma", "trust in the Trust Game", "Best Sum".]
Prediction-Based Agents: Concluding Remarks • Novel application of machine learning to one-shot games. • Positive experimental results. • Can be used as a technique for predicting behaviour in e.g. auctions, and other complex mechanisms, based on observing behavior in a set of basic games.
Conclusion: re-considering the agent’s perspective • The agent is not hopeless! • We introduced competitive safety analysis and showed that it may go quite far in non-trivial settings. • Prediction-based agents employing the “learning in game ensembles” technique can outperform existing techniques from the cognitive psychology and experimental economics literature.
From an agent perspective to a mediator perspective So far we were interested in recommendations for an agent acting in a multi-agent environment. In many systems there exists a reliable entity, such as a router, broker, or system administrator, who may wish to lead the agents to desired behaviors. Hence, we are now interested in: recommendations provided by a mediator.
From an agent perspective to a mediator perspective • When there is a mediator that attempts to lead rational agents to desired behavior, the Nash equilibrium is a sound concept. It provides a solution from which no single agent will want to deviate, assuming all other agents stick to it. • However, it does not provide an answer to two major challenges: a. What about stability against deviations by coalitions? b. What happens if agents have “minimal rationality”, and all that we can assume is that an agent will use a strategy if it is a dominant strategy, maximizing its payoff regardless of other agents’ actions?
Multi-Agent Systems: A Central Challenge • Providing agents with strategy profiles that are stable against deviations by subsets of the agents is a most desired property. • In game theoretic terms: the existence of strong equilibrium is a most desired property. Unfortunately, such situations rarely exist.
Example: the Prisoner's Dilemma • In the only equilibrium both agents will defect, yielding both of them a payoff of 1. • If both agents deviate from defection to cooperation then both of them will gain: mutual defection is not a strong equilibrium. • More generally, in larger games, strong equilibrium requires stability against deviations by subsets of the agents.
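The gap between Nash equilibrium and strong equilibrium in this example can be checked by brute force. The sketch below takes the payoff of 1 for mutual defection from the slide and the payoff of 4 for mutual cooperation from the later mediator slides. The off-diagonal payoffs (0 for a lone cooperator, 6 for a lone defector) are assumptions, since the payoff matrix itself is not preserved. A coalition deviation counts as profitable only if every member gains strictly.

```python
from itertools import combinations, product

ACTIONS = ("C", "D")
# (C, C) and (D, D) follow the slides; the 0 and 6 entries are assumed.
PAYOFF = {("C", "C"): (4, 4), ("C", "D"): (0, 6),
          ("D", "C"): (6, 0), ("D", "D"): (1, 1)}

def has_profitable_deviation(profile, coalition):
    """True if the coalition can jointly switch so that every member gains strictly."""
    for joint in product(ACTIONS, repeat=len(coalition)):
        new = list(profile)
        for player, action in zip(coalition, joint):
            new[player] = action
        if all(PAYOFF[tuple(new)][p] > PAYOFF[profile][p] for p in coalition):
            return True
    return False

def is_nash(profile):
    """Stable against deviations by single agents."""
    return not any(has_profitable_deviation(profile, (p,)) for p in range(2))

def is_strong(profile):
    """Stable against deviations by every non-empty coalition."""
    return not any(has_profitable_deviation(profile, coalition)
                   for size in (1, 2)
                   for coalition in combinations(range(2), size))

print(is_nash(("D", "D")), is_strong(("D", "D")), is_nash(("C", "C")))
```

Mutual defection passes the single-agent test but fails the coalition test, because the pair can jointly move to mutual cooperation. Mutual cooperation is not even a Nash equilibrium.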
Mediators • A mediator is a reliable entity that can interact with the agents, in order to try and lead them towards some useful/rational behavior in a game. • A mediator can not enforce behavior. • The classical example: a mediator can flip coins and recommend behavior to a set of agents, in a way which may improve the social surplus (correlated equilibrium and communication equilibrium).
Action Mediators • An action mediator can play in the given game on behalf of agents that give it the right of play. • Agents may instead decide to participate in the game directly.
Action Mediators: the Prisoner's Dilemma • The mediator offers the agents the following protocol: • If both agents select to use the mediator’s services then the mediator will perform cooperate on behalf of both agents. • If only one agent selects to use the mediator’s services then the mediator will perform defect on behalf of that agent. • Notice that when accepting the mediator’s services the agent is committed to actual behavior as determined by the above protocol. However, there is no way to force the agents to accept the suggested protocol, and each agent is free to cooperate or defect without using the mediator’s services.
Action Mediators: the Prisoner's Dilemma • The mediated game has a most desirable property: in this game there is a strong equilibrium; that is, an equilibrium which is stable against deviations by coalitions. • In this equilibrium both agents will use the mediator's services, which will lead them to a payoff of 4 each! • We call a strong equilibrium in a mediated game a strong mediated equilibrium. Hence, we get cooperation in the Prisoner's Dilemma as the outcome of a strong mediated equilibrium.
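The mediated game can be written out and checked exhaustively. In the sketch below each agent chooses among "M" (use the mediator), "C" and "D". The mediator protocol follows the slides, as do the payoffs 4 (mutual cooperation) and 1 (mutual defection). The payoffs 0 and 6 for a lone cooperator and a lone defector are assumptions, and all names are hypothetical.

```python
from itertools import combinations, product

# Base game: (C, C) and (D, D) follow the slides; the 0 and 6 entries are assumed.
BASE = {("C", "C"): (4, 4), ("C", "D"): (0, 6),
        ("D", "C"): (6, 0), ("D", "D"): (1, 1)}
STRATEGIES = ("M", "C", "D")

def payoff(profile):
    """Payoffs in the mediated game: the mediator cooperates for both agents if
    both use it, and defects on behalf of an agent that uses it alone."""
    if profile == ("M", "M"):
        return BASE[("C", "C")]
    return BASE[tuple("D" if s == "M" else s for s in profile)]

def has_profitable_deviation(profile, coalition):
    """True if the coalition can jointly switch so that every member gains strictly."""
    for joint in product(STRATEGIES, repeat=len(coalition)):
        new = list(profile)
        for player, strategy in zip(coalition, joint):
            new[player] = strategy
        if all(payoff(tuple(new))[p] > payoff(profile)[p] for p in coalition):
            return True
    return False

def is_strong(profile):
    """Stable against deviations by every non-empty coalition."""
    return not any(has_profitable_deviation(profile, coalition)
                   for size in (1, 2)
                   for coalition in combinations(range(2), size))

print(payoff(("M", "M")), is_strong(("M", "M")), is_strong(("D", "D")))
```

Under these assumptions, both agents using the mediator is a strong equilibrium with payoff 4 each: an agent who leaves the mediator faces a defecting opponent, and no joint move gives both agents more than 4. Mutual defection remains unstable against the coalition of both agents.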
The Power of Mediators Proposition: Every 2-person game has a strong mediated equilibrium. We show the existence of strong mediated equilibrium in a variety of other basic settings.