
Concepts of Game Theory II



  1. Concepts of Game Theory II

  2. The prisoner’s reasoning…
  • Put yourself in the place of prisoner i (or j)…
  • Reason as follows:
  • Suppose I cooperate…
  • If j cooperates, we both get a payoff of 3.
  • If j defects, then I will get a payoff of 0. The best payoff I can be guaranteed to get if I cooperate is 0.
  • Suppose I defect…
  • If j cooperates, I get a payoff of 5.
  • If j defects, then I will get a payoff of 2. The best payoff I can be guaranteed to get if I defect is 2.
  • In summary:
  • If I cooperate, the worst case is that I will get a payoff of 0.
  • If I defect, the worst case is that I will get a payoff of 2.
  • I’d prefer a guaranteed payoff of 2 to a payoff of 0!
  (Payoff matrix for players i and j: both cooperate → 3/3; i cooperates, j defects → 0/5; i defects, j cooperates → 5/0; both defect → 2/2.)
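
A minimal sketch of this worst-case comparison in Python, using the payoff values stated above (the dictionary layout is just a convenient encoding, not anything prescribed by the slides):

```python
# Worst-case ("guaranteed") payoff for prisoner i under each of its own moves,
# using the payoffs from the slide: (C,C)=3, (C,D)=0, (D,C)=5, (D,D)=2.
PAYOFF_I = {
    ("C", "C"): 3,  # both cooperate
    ("C", "D"): 0,  # i cooperates, j defects
    ("D", "C"): 5,  # i defects, j cooperates
    ("D", "D"): 2,  # both defect
}

for my_move in ("C", "D"):
    worst = min(PAYOFF_I[(my_move, their_move)] for their_move in ("C", "D"))
    print(f"If i plays {my_move}, the worst payoff i can end up with is {worst}")

# Worst case is 0 when cooperating and 2 when defecting, so the
# "guaranteed payoff" (maximin) choice is to defect.
```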

  3. Features of Prisoner’s Dilemma (1)
  • The individually rational action is to defect.
  • This guarantees a pay-off of no worse than 2.
  • Whereas cooperating can leave you with a pay-off as low as 0.
  • So, defection is the best response to all possible strategies:
  • Both agents defect and get a pay-off of 2.
  • But naïve intuition says this is not the best outcome:
  • They could both cooperate and each get a pay-off of 3!
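
The dominance claim can be checked mechanically: defection gives player i a strictly higher payoff than cooperation against either move j might make. A minimal sketch with the same payoffs as above:

```python
# Defection strictly dominates cooperation for player i if it does better
# against every possible move of the other player.
PAYOFF_I = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 2}

dominates = all(
    PAYOFF_I[("D", their_move)] > PAYOFF_I[("C", their_move)]
    for their_move in ("C", "D")
)
print("Defection strictly dominates cooperation:", dominates)  # True
# Yet mutual defection pays (2, 2), while mutual cooperation would pay (3, 3).
```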

  4. Features of Prisoner’s Dilemma (2)
  • This apparent paradox is the fundamental problem of multi-agent interactions.
  • It seems to imply that cooperation will not occur in societies of self-interested agents.
  • A real world example: nuclear arms reduction.
  • The prisoner’s dilemma is ubiquitous (very common!).
  • Can we recover cooperation?

  5. Arguments for Recovering Cooperation
  • Some conclusions that have been drawn from this analysis:
  • The game theory notion of rational action is wrong!
  • Somehow the dilemma is being formulated incorrectly.
  • Arguments to recover cooperation:
  • We are not all Machiavellian!
  • The other prisoner is my twin!
  • People are not (always) rational!
  • The shadow of the future…

  6. The Iterated Prisoner’s Dilemma
  • One answer: play the game more than once.
  • Let’s use an applet:
  • If you know you will be meeting your opponent again,
  • Then the incentive to defect appears to evaporate.
  • Cooperation is the rational choice in the infinitely repeated prisoner’s dilemma.
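
One way to see the effect of repetition is to play against a reciprocating opponent. The sketch below assumes the opponent plays a "grim trigger" style strategy (cooperate until the other side defects, then defect forever); that strategy, and the 10-round horizon, are assumptions introduced here for illustration, not something specified on the slide:

```python
# Compare total payoffs over repeated play against a reciprocating opponent.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 2}

def play(my_move_on_round, rounds=10):
    """Play against a grim-trigger opponent; return my total payoff."""
    total, opponent_triggered = 0, False
    for r in range(rounds):
        my_move = my_move_on_round(r)
        their_move = "D" if opponent_triggered else "C"
        total += PAYOFF[(my_move, their_move)]
        if my_move == "D":
            opponent_triggered = True  # the opponent never forgives a defection
    return total

print("Always cooperate:", play(lambda r: "C"))  # 10 * 3 = 30
print("Always defect:   ", play(lambda r: "D"))  # 5 + 9 * 2 = 23
```

Against such an opponent, defecting buys one round of temptation payoff at the cost of every later round, which is the sense in which the incentive to defect evaporates.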

  7. Backwards Induction
  • Suppose you both know that you will play the game exactly n times.
  • On round n, you have an incentive to defect, to gain that extra bit of pay-off.
  • This makes round n-1 the last “real” game, and so you have an incentive to defect there too.
  • And so on…
  • When playing the prisoner’s dilemma with a fixed, finite, pre-determined, and commonly known number of rounds, defection is the best strategy.
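
A minimal sketch of how the argument unravels round by round, using the payoff values from the earlier slides. Because the already-solved later rounds play out the same way whatever is chosen now, the continuation payoff cancels out and only the one-shot dominance check matters in each round:

```python
# Backward induction over a PD repeated exactly n times: solve the last round
# first, then the one before it, and so on.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 2}

def backward_induction(n):
    plan = []
    for _ in range(n):  # rounds n, n-1, ..., 1
        # In the current round, defect beats cooperate against either opponent
        # move (5 > 3 against "C", 2 > 0 against "D"), and the fixed future
        # play adds the same amount to both options.
        defect_dominates = all(
            PAYOFF[("D", other)] > PAYOFF[("C", other)] for other in ("C", "D")
        )
        plan.append("D" if defect_dominates else "C")
    return plan

print(backward_induction(5))  # ['D', 'D', 'D', 'D', 'D'] -- defect in every round
```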

  8. Axelrod’s Tournament
  • Suppose you play the prisoner’s dilemma game against a range of opponents.
  • What single strategy should you use to play against all these opponents so that you maximise your overall pay-off?
  • Axelrod (1984) investigated this problem with a tournament for computer programs playing the prisoner’s dilemma.
  Robert Axelrod: http://www-personal.umich.edu/~axe/

  9. Strategies
  • ALL-D
  • Always defect — the hawk strategy.
  • TIT-FOR-TAT
  • On round u = 0, cooperate.
  • On round u > 0, copy the opponent’s round u-1 move.
  • TESTER
  • On round u = 0, defect.
  • If the opponent retaliated, then play TIT-FOR-TAT.
  • Otherwise intersperse cooperation and defection.
  • JOSS
  • As for TIT-FOR-TAT, except periodically defect.
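
These strategies can be written as small functions of the game history. Below is a minimal sketch of ALL-D, TIT-FOR-TAT, and JOSS, plus a helper that plays two strategies against each other. TESTER is left out because the slide does not pin down its exact pattern of interspersed moves, and JOSS’s defection period (every 10th round here) is likewise an assumption:

```python
# Each strategy maps the histories seen so far to its next move ("C" or "D").
# my_history / their_history are lists of past moves, oldest first.

def all_d(my_history, their_history):
    """ALL-D: always defect (the hawk strategy)."""
    return "D"

def tit_for_tat(my_history, their_history):
    """TIT-FOR-TAT: cooperate on round 0, then copy the opponent's last move."""
    return "C" if not their_history else their_history[-1]

def joss(my_history, their_history):
    """JOSS: like TIT-FOR-TAT, but periodically defects anyway (period assumed)."""
    round_no = len(my_history)
    if round_no == 0:
        return "C"
    if round_no % 10 == 0:
        return "D"
    return their_history[-1]

def play_match(strategy_a, strategy_b, rounds=200):
    """Play two strategies against each other; return their total payoffs."""
    payoff = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (2, 2)}
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pa, pb = payoff[(move_a, move_b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, all_d))  # (398, 403): one sucker round, then mutual defection
```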

  10. How to succeed in Axelrod’s Tournament
  Axelrod suggests the following:
  • Don’t be envious
  • Don’t play as if it were a zero sum game
  • You don’t have to beat your opponent for you to do well
  • Be nice (don’t be the first to defect)
  • Start by cooperating, and reciprocate cooperation
  • Retaliate appropriately
  • Always punish defection immediately,
  • But use “measured” force — don’t overdo it
  • Don’t hold grudges
  • Always reciprocate cooperation immediately

  11. Who wins?
  • In the 1980s tournament, TIT-FOR-TAT won.
  • But, when paired with a mindless strategy like RANDOM, TIT-FOR-TAT sinks to its opponent's level.
  • So, it can’t be seen as a “best” strategy.
  • The tournament was run again in 2004, and TIT-FOR-TAT did not win.
  • What strategy won, and why?

  12. Game of Chicken
  • Difference from the prisoner’s dilemma:
  • Mutual defection is the most feared outcome.
  • Strategies (C, D) and (D, C) are in Nash equilibrium.
  (Payoff matrix for players i and j.)
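
The equilibrium claim can be verified by brute force over the four pure-strategy profiles. The payoff numbers below are illustrative assumptions (the slide only fixes the ordering, with mutual defection the worst outcome for both players):

```python
from itertools import product

# (i's move, j's move) -> (i's payoff, j's payoff); values assumed for illustration.
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (1, 3),
    ("D", "C"): (3, 1),
    ("D", "D"): (0, 0),  # mutual defection: the most feared outcome
}

def is_nash(mi, mj):
    """A profile is a pure-strategy Nash equilibrium if neither player can
    gain by unilaterally switching their own move."""
    pi, pj = PAYOFFS[(mi, mj)]
    i_can_improve = any(PAYOFFS[(alt, mj)][0] > pi for alt in "CD")
    j_can_improve = any(PAYOFFS[(mi, alt)][1] > pj for alt in "CD")
    return not (i_can_improve or j_can_improve)

for mi, mj in product("CD", repeat=2):
    if is_nash(mi, mj):
        print("Nash equilibrium:", (mi, mj))  # prints ('C', 'D') and ('D', 'C')
```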

  13. The Stag Hunt (1)
  • You can hunt deer (cooperate) or hare (defect).
  • Only if both players cooperate will they succeed in catching the deer and receive the maximum pay-off.
  (Payoff matrix for players i and j.)

  14. The Stag Hunt (2)
  • A pessimist would always hunt hare.
  • A cautious player who is uncertain about what the other player will choose to do would also hunt hare.
  • For agents to cooperate in the Stag Hunt, there must be a measure of trust between them.
  • This measure of trust is a kind of social contract between the players; a contract that requires prior agreement.

  15. A Variation of the Prisoner’s Dilemma
  • A spatial variant of the iterated prisoner's dilemma.
  • A model for cooperation vs. conflict in groups.
  • It shows the spread of altruism and of exploitation for personal gain in an interacting population of agents learning from each other.
  • Initially, the population consists of cooperators and a certain number of defectors.
  • The advantage of defection is determined by the value of b in the 'payoff matrix'.
  • A player determines its new strategy by selecting the most favourable strategy from itself and its direct neighbours.
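
A minimal sketch of such a spatial model on a grid, in the spirit described above. The grid size, neighbourhood, initial fraction of defectors, and the pairwise payoffs (1 for mutual cooperation, b for exploiting a cooperator, 0 otherwise) are all assumptions chosen for illustration rather than the applet's exact settings:

```python
import random

SIZE = 20    # 20 x 20 toroidal grid of agents
B = 1.8      # advantage of defection: payoff for defecting against a cooperator
STEPS = 50

def neighbours(x, y):
    """The four direct neighbours, wrapping around the edges."""
    return [((x + dx) % SIZE, (y + dy) % SIZE)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]

def payoff(me, other):
    """Pairwise payoff: C vs C -> 1, D vs C -> B, everything else -> 0."""
    if me == "C" and other == "C":
        return 1.0
    if me == "D" and other == "C":
        return B
    return 0.0

# Initially the population is mostly cooperators with some defectors mixed in.
grid = {(x, y): ("D" if random.random() < 0.1 else "C")
        for x in range(SIZE) for y in range(SIZE)}

for _ in range(STEPS):
    # 1. Every agent plays the one-shot game with each direct neighbour.
    score = {cell: sum(payoff(grid[cell], grid[n]) for n in neighbours(*cell))
             for cell in grid}
    # 2. Every agent adopts the strategy of the highest-scoring agent among
    #    itself and its direct neighbours (synchronous update).
    grid = {cell: grid[max([cell] + neighbours(*cell), key=lambda c: score[c])]
            for cell in grid}

print("Cooperators:", sum(v == "C" for v in grid.values()),
      "Defectors:", sum(v == "D" for v in grid.values()))
```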

  16. Variation of the Prisoner’s Dilemma • Applet:

  17. Recommended Reading
  • An Introduction to Multi-Agent Systems, M. Wooldridge, John Wiley & Sons, 2002. Chapter 6.
  Also check:
  • Various applets for the prisoner’s dilemma: http://www.gametheory.net/applets/prisoners.html
  • Spatial variant of the iterated prisoner’s dilemma: http://prisonersdilemma.groenefee.nl/
  • Software for Axelrod’s Tournament: http://www.econ.iastate.edu/tesfatsi/demos/axelrod/axelrodt.htm
