Chapter 2

Chapter 2 Decisions and Games

“Доверяй, но проверяй” (“Trust, but verify”): Russian proverb, popularized by Ronald Reagan

Criteria for evaluating systems • Computational efficiency • Distribution of computation • Communication efficiency • Social welfare: $\max_{o} \sum_i u_i(o)$, where $u_i(o)$ is the utility of outcome $o$ for player $i$ • Surplus: social welfare of the outcome minus social welfare of the status quo • Constant-sum games have zero surplus; markets are not constant sum • Pareto efficiency: an outcome $o$ is Pareto efficient if there is no other outcome $o'$ such that some agent has higher utility in $o'$ than in $o$ and no agent has lower utility • Implied by social welfare maximization • Individual rationality: participating in the negotiation (or individual deal) is no worse for an agent than not participating • Stability: no agent can increase its utility by changing its strategy (given that everyone else keeps the same strategy) • Symmetry: no agent should be inherently preferred (a dictator, for example, violates symmetry)
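
The criteria above can be checked mechanically on a small set of outcomes. The sketch below is only an illustration: the outcome names and utility numbers are made up for this example, and the helper names (`social_welfare`, `pareto_efficient`, `individually_rational`) are hypothetical, not from the slides.

```python
from typing import Dict, List, Tuple

Utilities = Tuple[float, ...]  # one utility per player

# Hypothetical outcomes for two players (numbers invented for illustration).
OUTCOMES: Dict[str, Utilities] = {
    "status quo": (1, 1),
    "deal A": (3, 3),
    "deal B": (5, 0),
    "deal C": (0, 5),
}


def social_welfare(utilities: Utilities) -> float:
    """Sum of all players' utilities for one outcome."""
    return sum(utilities)


def pareto_dominates(a: Utilities, b: Utilities) -> bool:
    """True if nobody is worse off in a than in b and somebody is strictly better off."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_efficient(outcomes: Dict[str, Utilities]) -> List[str]:
    """Outcomes that no other outcome Pareto dominates."""
    return [
        name
        for name, u in outcomes.items()
        if not any(pareto_dominates(v, u) for v in outcomes.values())
    ]


def welfare_maximizers(outcomes: Dict[str, Utilities]) -> List[str]:
    """Outcomes with the highest social welfare."""
    best = max(social_welfare(u) for u in outcomes.values())
    return [name for name, u in outcomes.items() if social_welfare(u) == best]


def individually_rational(outcome: Utilities, status_quo: Utilities) -> bool:
    """True if every player does at least as well as by not participating."""
    return all(u >= q for u, q in zip(outcome, status_quo))


surplus_of_a = social_welfare(OUTCOMES["deal A"]) - social_welfare(OUTCOMES["status quo"])
print(pareto_efficient(OUTCOMES), welfare_maximizers(OUTCOMES), surplus_of_a)
```

With these numbers the welfare-maximizing outcome is also Pareto efficient, as the slide states, while "deal B" is Pareto efficient but not individually rational for the second player.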

The term Pareto efficient • The term "Pareto efficient" is named after Vilfredo Pareto, an Italian economist who used the concept in his studies of economic efficiency and income distribution. • If an economic system is not Pareto efficient, then some individual can be made better off without anyone being made worse off. It is commonly accepted that such inefficient outcomes are to be avoided, and therefore Pareto efficiency is an important criterion for evaluating economic systems and political policies. • Pareto is also credited with the 80/20 rule, which describes the unequal distribution of wealth in his country: he observed that twenty percent of the people owned eighty percent of the wealth.

Strategic Form Game • A game: formal representation of a situation of strategic interdependence • Set of players, $I$, with $|I| = n$ • Each agent $j$ has a set of actions, $A_j$ • AKA strategy set • Actions define outcomes • AKA strategic combination • For each possible combination of actions, there is an outcome • Outcomes define payoffs • Agents derive utility from different outcomes

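Normal form game* (matching pennies)

                 Agent 2
                 H         T
Agent 1    H    -1, 1      1, -1
           T     1, -1    -1, 1

Each row or column is an action, each cell is the outcome of one pair of actions, and the entries in a cell are the payoffs (Agent 1, Agent 2).
*aka strategic form, matrix form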

Extensive form game (matching pennies) • Player 1 acts first, choosing H or T; Player 2 then chooses H or T • Player 2 doesn’t know what has been played, so he doesn’t know which node he is at • Terminal nodes (outcomes) with payoffs (Player 1, Player 2): (H,H) → (-1,1), (H,T) → (1,-1), (T,H) → (1,-1), (T,T) → (-1,1) • How fair would it be to say, “Let’s play matching pennies. You go first.”?

Strategies • Strategy: a strategy, $s_j$, is a complete contingency plan; it defines the action agent $j$ should take in every possible state of the world • Strategy profile: $s = (s_1, \dots, s_n)$ • $s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_n)$, the strategies of everyone except agent $i$ • Utility function: $u_i(s)$ • Note that the utility of an agent depends on the whole strategy profile, not just its own strategy • We assume agents are expected utility maximizers

Normal form game* (matching pennies) • Strategy for agent 1: H • Strategy for agent 2: T • Strategy profile: (H,T) • $U_1((H,T)) = 1$, $U_2((H,T)) = -1$

                 Agent 2
                 H         T
Agent 1    H    -1, 1      1, -1
           T     1, -1    -1, 1

*aka strategic form, matrix form

Extensive form game (matching pennies, sequential moves) • Game tree: Player 1 chooses H or T; Player 2 observes the move and then chooses H or T. Payoffs (Player 1, Player 2): (H,H) → (-1,1), (H,T) → (1,-1), (T,H) → (1,-1), (T,T) → (-1,1) • Recall: a strategy is a contingency plan for all states of the game. Now we have different states to worry about. • Strategy for agent 1: T • Strategy for agent 2: H if 1 plays H, T if 1 plays T. This is written (H,T): the first value is the response to H by the other player, the second is the response to T. • Strategy profile: (T,(H,T)) • $U_1((T,(H,T))) = -1$, $U_2((T,(H,T))) = 1$

Dominant Strategies • Recall that • Agents’ utilities depend on what strategies other agents are playing • Agents are expected utility maximizers • Agents will play best-response strategies (if they exist) • $s_i^*$ is a best response to $s_{-i}$ if $u_i(s_i^*, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $s_i'$ • A dominant strategy is a strategy for player $i$ that is a best response to every $s_{-i}$ • Dominant strategies do not always exist • Inferior strategies are called dominated

Dominant Strategy Equilibrium • A dominant strategy equilibrium is a strategy profile $s^* = (s_1^*, \dots, s_n^*)$ in which the strategy of each player is dominant (so no player wants to change): $u_i(s_i^*, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $i$, for all $s_i'$, for all $s_{-i}$ • Known as the “DUH” strategy • Nice: agents do not need to counterspeculate (reciprocally reason about what others will do)!

Prisoners’ dilemma • Two people are arrested for a crime. If neither suspect confesses, both get a light sentence. If both confess, then they get sent to jail. If one confesses and the other does not, then the confessor gets no jail time and the other gets a heavy sentence. • Payoff matrix (entries not recovered): rows are Kelly's actions, columns are Ned's actions; each chooses Confess or Don’t Confess

Prisoners’ dilemma • Note that no matter what Ned does, Kelly is better off if she confesses than if she does not confess. So ‘confess’ is a dominant strategy from Kelly’s perspective. We can predict that she will always confess.

Prisoners’ dilemma • The same holds for Ned: ‘confess’ is a dominant strategy from his perspective too.

Prisoners’ dilemma • So the only outcome in which each player chooses their dominant strategy is the one where they both confess. • Solve by iterated elimination of dominated strategies: ‘don’t confess’ is dominated for both players.

Dominant strategy equilibrium is not Pareto optimal • Example: Prisoner’s Dilemma • The dominant strategy equilibrium is (Confess, Confess), yet (Don’t Confess, Don’t Confess) is Pareto optimal and gives both players a better outcome. • (Actual numbers vary in different versions of the problem, but relative values are the same.)
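
The reasoning in the prisoners' dilemma slides can be reproduced with a few lines of code. The payoff numbers below are an assumption (the slides' matrix entries did not survive); they only respect the relative ordering described in the text, with utilities written as negative years in jail. The function names are hypothetical.

```python
ACTIONS = ["Confess", "Don't Confess"]

# PAYOFFS[(kelly_action, ned_action)] = (kelly_utility, ned_utility)
# Assumed numbers: only their relative order follows the slides.
PAYOFFS = {
    ("Confess", "Confess"): (-5, -5),
    ("Confess", "Don't Confess"): (0, -10),
    ("Don't Confess", "Confess"): (-10, 0),
    ("Don't Confess", "Don't Confess"): (-1, -1),
}


def utility(player, own, other):
    """Utility of `player` (0 = Kelly, 1 = Ned) when playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def dominant_strategy(player):
    """Return an action that is a best response to every opponent action, or None."""
    for candidate in ACTIONS:
        if all(
            utility(player, candidate, other) >= utility(player, alternative, other)
            for alternative in ACTIONS
            for other in ACTIONS
        ):
            return candidate
    return None


def pareto_dominated(profile):
    """True if some other profile is at least as good for both and better for one."""
    u = PAYOFFS[profile]
    return any(
        all(v[i] >= u[i] for i in range(2)) and any(v[i] > u[i] for i in range(2))
        for v in PAYOFFS.values()
    )


equilibrium = (dominant_strategy(0), dominant_strategy(1))
print(equilibrium, pareto_dominated(equilibrium))
```

Under these assumed payoffs both players have Confess as a dominant strategy, and the resulting profile is Pareto dominated by (Don't Confess, Don't Confess).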

Iterated Elimination of Dominated Strategies • Let $R_i \subseteq S_i$ be the set of removed strategies for agent $i$ • Initially $R_i = \emptyset$ • Choose an agent $i$ and a strategy $s_i \in S_i \setminus R_i$ ($S_i$ minus $R_i$) such that there exists $s_i' \in S_i \setminus R_i$ with $u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})$ for all $s_{-i} \in S_{-i} \setminus R_{-i}$ • Add $s_i$ to $R_i$, continue • Thm: If a unique strategy profile, $s^*$, survives iterated elimination, then it is a Nash equilibrium. • Thm: If a profile, $s^*$, is a Nash equilibrium, then it must survive iterated elimination.

A simple competition game • Note: no player has a dominant strategy. But low is dominated for both players, so we can predict that neither will play low. • Payoff matrix (entries not recovered): rows are Donna's actions, columns are Pierce's actions; each chooses High, Medium or Low

A simple competition game • Once we have removed low, medium is a dominant strategy in the reduced game. So we predict that both Pierce and Donna will play medium.
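
Iterated elimination of strictly dominated strategies is easy to sketch for a two-player matrix game. The payoff numbers below are hypothetical (the slide's matrix did not survive); they were chosen so that no strategy is dominant at the start, Low is dominated, and Medium is the only survivor. `iterated_elimination` and `strictly_dominated` are illustrative names.

```python
NAMES = ["High", "Medium", "Low"]

# U_ROW[r][c]: Donna's payoff when Donna plays r and Pierce plays c (assumed numbers).
U_ROW = [
    [6, 2, 9],
    [8, 4, 5],
    [3, 1, 2],
]
# The assumed game is symmetric: Pierce's payoff is U_COL[r][c] = U_ROW[c][r].
U_COL = [[U_ROW[c][r] for c in range(3)] for r in range(3)]


def strictly_dominated(payoff, own, others, candidate):
    """True if another surviving strategy beats `candidate` against every surviving opponent strategy."""
    return any(
        all(payoff(alt, o) > payoff(candidate, o) for o in others)
        for alt in own
        if alt != candidate
    )


def iterated_elimination(u_row, u_col):
    """Remove strictly dominated strategies one at a time until none remain."""
    rows = list(range(len(u_row)))
    cols = list(range(len(u_row[0])))
    while True:
        bad_row = next(
            (r for r in rows
             if strictly_dominated(lambda a, o: u_row[a][o], rows, cols, r)),
            None,
        )
        if bad_row is not None:
            rows.remove(bad_row)
            continue
        bad_col = next(
            (c for c in cols
             if strictly_dominated(lambda a, o: u_col[o][a], cols, rows, c)),
            None,
        )
        if bad_col is not None:
            cols.remove(bad_col)
            continue
        return rows, cols


surviving_rows, surviving_cols = iterated_elimination(U_ROW, U_COL)
print([NAMES[r] for r in surviving_rows], [NAMES[c] for c in surviving_cols])
```

The order in which dominated strategies are removed here differs from the order on the slides, but for strict dominance the surviving set does not depend on that order.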

Example – Zero Sum (we divide the same cake; if I lose, you win), in bimatrix form • Cake slicing • Two players • Cutter • Chooser

Rationality • Each player will take the highest-utility option • taking into account the other player's likely behavior • In the example • if the cutter cuts unevenly • he might like to end up in the lower right • but the other player would never do that • so the cutter gets -10 • if the cutter cuts evenly • he will end up in the upper left • with -1 • this is a stable outcome • neither player has an incentive to deviate

Classic Examples • Car Dealers • Why are they always next to each other? • Why aren't they spaced equally around town? • Optimal in the sense of not drawing customers to the competition • Equilibrium • because to move away from the competitor is to cede some customers to it

Decision Tree • Examines game interactions over time • Each node • Is a unique game state • Player choices • create branches • Leaves • end of game (win/lose) • Important concept for design • usually at abstract level • Example • tic-tac-toe

Example: Bach or Stravinsky • A couple likes going to concerts together. One loves Bach but not Stravinsky. The other loves Stravinsky but not Bach. However, they prefer being together to being apart. • No dominant strategy equilibrium • Payoff matrix (entries not recovered): each player chooses B or S

Nash Equilibrium • Sometimes an agent’s best response depends on the strategies other agents are playing • In that case there is no dominant strategy equilibrium • A strategy profile is a Nash equilibrium if no player has an incentive to deviate from his strategy given that the others do not deviate • Each player needs to know that the others are playing a fixed choice • Formally, $s^*$ is a Nash equilibrium if for every agent $i$, $u_i(s_i^*, s_{-i}^*) \ge u_i(s_i', s_{-i}^*)$ for all $s_i'$

Example: Mozart Mahler • A couple likes going to concerts together. Both prefer Mozart. • Two Nash equilibria: (Mozart, Mozart) is better for both, but a Nash equilibrium also exists at (Mahler, Mahler). • Payoff matrix (entries not recovered): each player chooses Mozart or Mahler

Example – Rock, scissors, paper • Players: Ernie and Bert • Strategies: Rock, Scissors, Paper • Payoffs • If both choose the same strategy, neither wins. • If one chooses rock and the other chooses scissors, then rock wins $1 from the other. • If one chooses rock and the other chooses paper, then paper wins $1 from the other. • If one chooses paper and the other chooses scissors, then scissors wins $1 from the other.

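Example – Rock, scissors, paper • No pure-strategy Nash equilibrium • Payoffs (Ernie, Bert), following the rules of the game:

                       Bert
                 Rock      Scissors   Paper
Ernie  Rock       0, 0      1, -1     -1, 1
       Scissors  -1, 1      0, 0       1, -1
       Paper      1, -1    -1, 1       0, 0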
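
The claim that no cell of this game is a pure-strategy Nash equilibrium can be verified by brute force: for every pair of moves, check whether either player could gain by switching. This is a small sketch; the payoffs come from the rules stated for the game, and the function names are illustrative.

```python
from itertools import product

MOVES = ["Rock", "Scissors", "Paper"]
BEATS = {"Rock": "Scissors", "Scissors": "Paper", "Paper": "Rock"}


def payoff(ernie, bert):
    """Return (Ernie's payoff, Bert's payoff): the winner takes $1 from the loser."""
    if ernie == bert:
        return (0, 0)
    return (1, -1) if BEATS[ernie] == bert else (-1, 1)


def pure_nash_equilibria():
    """All move pairs where neither player gains by deviating alone."""
    equilibria = []
    for ernie, bert in product(MOVES, MOVES):
        u_ernie, u_bert = payoff(ernie, bert)
        ernie_stays = all(payoff(alt, bert)[0] <= u_ernie for alt in MOVES)
        bert_stays = all(payoff(ernie, alt)[1] <= u_bert for alt in MOVES)
        if ernie_stays and bert_stays:
            equilibria.append((ernie, bert))
    return equilibria


print(pure_nash_equilibria())
```

The list comes back empty because whoever is not currently winning can always switch to the move that beats the opponent's choice.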

Example: Hawk Dove • Two animals fight over prey. The best outcome is for one to act like a Hawk and the other to act like a Dove. • Two Nash equilibria • Payoff matrix (entries not recovered): each animal chooses Hawk or Dove

Solutions to simultaneous games • If there is no unique solution in dominant/dominated strategies, then we use ‘mutual best response analysis’ to find a Nash equilibrium. • An outcome is a Nash equilibrium if each player, holding the choices of all other players constant, cannot do better by changing their own choice. • So an outcome where all players are playing their ‘best response’ is a Nash equilibrium.

How much will we clean? Payoffs (Roommate 1, Roommate 2):

                          Roommate 2
                   3 hours    6 hours    9 hours
Roommate 1
   3 hours          1, 1       2, -4      3, -8
   6 hours         -4, 2       4, 4       6, -2
   9 hours         -8, 3      -2, 6       3, 3

How much will we clean? • Best responses for Roommate 1 (the best first value in each column): 3 hours if Roommate 2 cleans 3 hours (payoff 1), 6 hours if Roommate 2 cleans 6 hours (payoff 4), 6 hours if Roommate 2 cleans 9 hours (payoff 6)

How much will we clean? • Best responses for Roommate 2 (the best second value in each row): 3 hours if Roommate 1 cleans 3 hours (payoff 1), 6 hours if Roommate 1 cleans 6 hours (payoff 4), 6 hours if Roommate 1 cleans 9 hours (payoff 6)

How much will we clean? • Best response for both (mutual best response): (3 hours, 3 hours) with payoffs 1, 1 and (6 hours, 6 hours) with payoffs 4, 4 • Two Nash equilibria
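
Mutual best response analysis on the roommate game can be written as two passes over the payoff matrix followed by an intersection. The payoffs are the ones from the slide; the function and variable names are illustrative.

```python
HOURS = ["3 hours", "6 hours", "9 hours"]

# PAYOFFS[r][c] = (Roommate 1's payoff, Roommate 2's payoff)
# when Roommate 1 plays row r and Roommate 2 plays column c.
PAYOFFS = [
    [(1, 1), (2, -4), (3, -8)],
    [(-4, 2), (4, 4), (6, -2)],
    [(-8, 3), (-2, 6), (3, 3)],
]
N = len(HOURS)


def best_response_cells_roommate1():
    """Cells holding the best first value in each column."""
    cells = set()
    for c in range(N):
        best = max(PAYOFFS[r][c][0] for r in range(N))
        cells |= {(r, c) for r in range(N) if PAYOFFS[r][c][0] == best}
    return cells


def best_response_cells_roommate2():
    """Cells holding the best second value in each row."""
    cells = set()
    for r in range(N):
        best = max(PAYOFFS[r][c][1] for c in range(N))
        cells |= {(r, c) for c in range(N) if PAYOFFS[r][c][1] == best}
    return cells


# A cell that is a best response for both players is a Nash equilibrium.
nash_cells = sorted(best_response_cells_roommate1() & best_response_cells_roommate2())
print([(HOURS[r], HOURS[c]) for r, c in nash_cells])
```

Ties are kept on purpose: if two rows share the best value in a column, both cells count as best responses.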

Concepts of rationality [doing the rational thing] • Undominated strategy (problem: too weak; can’t always narrow the choice to a single one) • (Weakly) dominating strategy (alias “duh?”) (problem: too strong, rarely exists) • Nash equilibrium (or double best response) (problem: may not exist in pure strategies) • Randomized (mixed) Nash equilibrium: players choose among their options at random, each option with an assigned probability • Theorem [Nash 1950]: in a finite game, a randomized Nash equilibrium always exists.

Why is a Nash equilibrium a sensible solution? • A Nash equilibrium can be viewed as a self-reinforcing agreement (e.g. what is reasonable if players can talk before the game but cannot sign binding contracts). • A Nash equilibrium can be viewed as a consistent set of conjectures by all players recognising their strategic interdependence. • A Nash equilibrium can be viewed as the result of ‘learning’ over time

Nash Equilibrium • Interpretations: • Focal points, self-enforcing agreements, stable social convention, consequence of rational inference • Criticisms • They may not be unique (Bach or Stravinsky) • Ways of overcoming this • Refinements of the equilibrium concept, mediation, learning • Pure-strategy equilibria do not exist in all games • They may be hard to find (if there are lots of choices) • People don’t always behave as equilibria would predict (ultimatum games and notions of fairness, …)

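Nash Equilibrium Test (for continuous choices) • If utilities can be represented as a function $u_i : S_1 \times S_2 \times \dots \times S_n \to \mathbb{R}$ • Can find a Nash equilibrium if each $s_i^*$ is selected to make the partial derivative with respect to $s_i$ equal to zero. In other words: $\partial u_i(s^*) / \partial s_i = 0$ for every $i$ • If each $s_i^*$ is the only solution • And … (condition not recovered)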

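Example • $u_1(x,y,z) = 2xz - x^2 y$ • $u_2(x,y,z) =$ (expression not recovered) • $u_3(x,y,z) = 2z - x y z^2$ • $\partial u_1 / \partial x = 2z - 2xy = 0$ • $\partial u_2 / \partial y = 0$ (expression not recovered) • $\partial u_3 / \partial z = 2 - 2xyz = 0$ • Solution: $(x, y, z) = (1, 1, 1)$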
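
The two first-order conditions that survive in this example can be checked numerically at the stated solution. This is only a sanity check under assumptions: it uses central finite differences (my choice, not the slides' method), it skips $u_2$ because its expression is missing, and the extra second-derivative check is an added illustration that each stationary point is a maximum in the player's own variable.

```python
def u1(x, y, z):
    return 2 * x * z - x**2 * y


def u3(x, y, z):
    return 2 * z - x * y * z**2


def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the first partial derivative of f at point."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def second_partial(f, point, index, h=1e-4):
    """Central-difference estimate of the second partial derivative of f at point."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - 2 * f(*point) + f(*down)) / h**2


solution = (1.0, 1.0, 1.0)
# Player 1 controls x (index 0); player 3 controls z (index 2).
print(partial(u1, solution, 0), partial(u3, solution, 2))
print(second_partial(u1, solution, 0), second_partial(u3, solution, 2))
```

Both first derivatives vanish at (1, 1, 1) up to rounding, matching $2z - 2xy = 0$ and $2 - 2xyz = 0$, and both second derivatives are negative there.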

How do we tell if a Nash Equilibrium exists? • In a zero-sum game, we say player 1 maxminimizes if he chooses an action that is best for him on the assumption that player 2 will choose her action to hurt him as much as possible. • If a Nash equilibrium exists, the action of each player in it is a maxminimizer; conversely, when player 1's max-min value equals his min-max value, any pair of maxminimizers is a Nash equilibrium.
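
For a small zero-sum game the max-min and min-max values over pure actions are quick to compute. The sketch below uses the matching pennies payoffs for player 1 from earlier in the chapter; the function names are illustrative, and only pure actions are considered.

```python
ACTIONS = ["H", "T"]

# Player 1's payoffs in matching pennies; player 2 receives the negative.
U1 = {("H", "H"): -1, ("H", "T"): 1, ("T", "H"): 1, ("T", "T"): -1}


def maxmin_player1():
    """Best payoff player 1 can guarantee if player 2 then hurts him as much as possible."""
    return max(min(U1[(a1, a2)] for a2 in ACTIONS) for a1 in ACTIONS)


def minmax_player1():
    """Lowest payoff player 2 can hold player 1 to if player 1 then responds optimally."""
    return min(max(U1[(a1, a2)] for a1 in ACTIONS) for a2 in ACTIONS)


print(maxmin_player1(), minmax_player1())
```

The two values differ (-1 versus 1), which is consistent with matching pennies having no Nash equilibrium in pure strategies.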

Fixed Points • Let $a^*$ be a profile of actions such that $a_i^* \in B_i(a_{-i}^*)$ for every player $i$, where $B_i$ is the “best response” function. In other words, $B_i$ says that if the other players' actions are known, $a_i^*$ is the best for player $i$. • Fixed point theorems give conditions on $B$ under which there exists a value of $a^*$ such that $a^* \in B(a^*)$. In other words, given what other people will do, no one will change.

Intuition behind Brouwer’s fixed point theorem • Take two sheets of paper, one lying directly above the other. Draw a grid on the paper, number the gridboxes, then xerox that sheet of paper. Crumple the top sheet, and place it on top of the other sheet. You will see that at least one number is on top of the corresponding number on the lower sheet of paper. Brouwer's theorem says that there must be at least one point on the top sheet that is directly above the corresponding point on the bottom sheet. • In dimension three, Brouwer's theorem says that if you take a cup of coffee, and slosh it around, then after the sloshing there must be some point in the coffee which is in the exact spot that it was before you did the sloshing (though it might have moved around in between). Moreover, if you tried to slosh that point out of its original position, you can't help but slosh another point back into its original position.

Brouwer’s fixed point theorem in dimension one • Theorem: Let $f : [0,1] \to [0,1]$ be a continuous function. Then there exists a fixed point, i.e. there is an $x^*$ in $[0,1]$ such that $f(x^*) = x^*$. • Proof: There are two essential possibilities. (i) If $f(0) = 0$ or $f(1) = 1$, then we are done. (ii) If $f(0) \ne 0$ and $f(1) \ne 1$, then define $F(x) = f(x) - x$. In this case $F(0) = f(0) - 0 = f(0) > 0$ and $F(1) = f(1) - 1 < 0$, so $F : [0,1] \to \mathbb{R}$ with $F(0) \cdot F(1) < 0$. As $f$ is continuous, $F$ is also continuous. Then, by the Intermediate Value Theorem, there is an $x^*$ in $[0,1]$ such that $F(x^*) = 0$. By the definition of $F$, $F(x^*) = f(x^*) - x^* = 0$, thus $f(x^*) = x^*$.
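
The proof is constructive enough to turn into a search: bisect on the sign of $F(x) = f(x) - x$. The sketch below is an illustration, with $f(x) = \cos x$ chosen as an assumed example of a continuous map from $[0,1]$ into itself; `fixed_point` is a hypothetical helper name.

```python
import math


def fixed_point(f, lo=0.0, hi=1.0, tol=1e-12):
    """Locate x in [lo, hi] with f(x) = x by bisecting on F(x) = f(x) - x.

    Assumes f is continuous and maps [lo, hi] into itself, so that
    F(lo) >= 0 and F(hi) <= 0 as in the proof.
    """
    def F(x):
        return f(x) - x

    # Case (i) of the proof: an endpoint is already a fixed point.
    if F(lo) == 0:
        return lo
    if F(hi) == 0:
        return hi
    # Case (ii): F(lo) > 0 > F(hi); keep the half-interval where the sign changes.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value = F(mid)
        if value > 0:
            lo = mid
        elif value < 0:
            hi = mid
        else:
            return mid
    return (lo + hi) / 2


x_star = fixed_point(math.cos)
print(x_star, math.cos(x_star))
```

Bisection relies on the ordering of the real line, so this approach does not carry over to the higher-dimensional statement of the theorem.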

General statement of Brouwer’s fixed point theorem • Theorem: Any continuous function from a closed n-dimensional ball into itself must have at least one fixed point. • Continuity of the function is essential (if you rip the paper or if you slosh discontinuously, then there may not be a fixed point). • The closure of the ball is also essential; there exist continuous mappings $f : (0,1) \to (0,1)$ with no fixed points, for example $f(x) = x^2$, whose only fixed points 0 and 1 lie outside the open interval. • The round shape of the ball is not essential; one can replace it by any shape obtained by a continuous deformation of the ball. However, one cannot replace it by something with `holes', like a donut shape.

Applications of Brouwer’s fixed point theorem • Topology is a branch of pure mathematics devoted to the shape of objects. It ignores issues like size and angle, which are important in geometry. • For this reason, it is sometimes called rubber-sheet geometry. • One important problem in topology is the study of the conditions under which any transformation of a certain domain has a point that remains fixed. • Fixed point theorems are some of the most important theorems in all of mathematics. Among other applications, they are used to show the existence of solutions to differential equations, as well as the existence of equilibria in game theory. • The Brouwer fixed point theorem was a main mathematical tool in John Nash’s papers, for which he won the Nobel Memorial Prize in Economic Sciences.

History • Brouwer was a major contributor to the theory of topology. He did almost all his work in topology between 1909 and 1913. He discovered characterizations of topological mappings of the Cartesian plane and a number of fixed point theorems. • He later rejected many of his results as being “non-constructive”. • Brouwer founded the doctrine of mathematical intuitionism, in which a nonconstructive argument cannot be accepted as proof of existence. • He gave grounds to reject the law of excluded middle (which underlies proof by contradiction), a law many logicians had taken to be true for all statements, going back a millennium or two. • Intuitionistic logic does not permit the inference not(not(p)) => p • Luitzen Egbertus Jan Brouwer • Born: Feb 27, 1881 in the Netherlands • Died: Dec 2, 1966 in the Netherlands

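Mixed strategy equilibria • $\sigma_i$ defines a probability distribution over $S_i$ • $\sigma_i(s_j)$ is the probability that player $i$ selects strategy $s_j$ • $(0, 0, \dots, 1, 0, \dots, 0)$ is a pure strategy • Strategy profile: $\sigma = (\sigma_1, \dots, \sigma_n)$ • Expected utility: $u_i(\sigma) = \sum_{s \in S} \left( \prod_j \sigma_j(s_j) \right) u_i(s)$ • (the chance that the combination occurs, times its utility) • Nash Equilibrium: • $\sigma^*$ is a (mixed) Nash equilibrium if $u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*)$ for all $\sigma_i \in \Sigma_i$, for all $i$, where $\Sigma_i$ is the set of mixed strategies of player $i$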

Example: Matching Pennies • No pure strategy Nash equilibrium • So far we have talked only about pure strategy equilibria (I make one choice). Not all games have pure strategy equilibria. Some equilibria are mixed strategy equilibria.

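Example: Matching Pennies • Player 1 plays H with probability p and T with probability 1-p; player 2 plays H with probability q and T with probability 1-q • We want to play each strategy with a certain probability. • If player 1 is optimally mixing strategies, player 2 is indifferent between her pure strategies (otherwise she would exploit the mix); the same holds with the roles reversed. • Compute the expected utility against each pure strategy of the other player.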
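
The indifference argument can be worked out exactly for a 2x2 game. The sketch below assumes the matching pennies payoffs used earlier in the chapter (player 1 wins on a mismatch) and solves the two indifference equations with exact fractions; the function names are illustrative, and the closed-form expressions divide by zero for degenerate games in which a player's payoffs make the mix irrelevant.

```python
from fractions import Fraction

# Payoffs indexed by (player 1's action, player 2's action).
U1 = {("H", "H"): -1, ("H", "T"): 1, ("T", "H"): 1, ("T", "T"): -1}
U2 = {profile: -value for profile, value in U1.items()}  # zero-sum


def p_making_player2_indifferent(u2):
    """Solve p*u2[H,H] + (1-p)*u2[T,H] == p*u2[H,T] + (1-p)*u2[T,T] for p."""
    numerator = u2[("T", "T")] - u2[("T", "H")]
    denominator = u2[("H", "H")] - u2[("T", "H")] - u2[("H", "T")] + u2[("T", "T")]
    return Fraction(numerator, denominator)


def q_making_player1_indifferent(u1):
    """Solve q*u1[H,H] + (1-q)*u1[H,T] == q*u1[T,H] + (1-q)*u1[T,T] for q."""
    numerator = u1[("T", "T")] - u1[("H", "T")]
    denominator = u1[("H", "H")] - u1[("H", "T")] - u1[("T", "H")] + u1[("T", "T")]
    return Fraction(numerator, denominator)


def expected_utility(u, p, q):
    """Sum over profiles of (probability of the profile) times (utility of the profile)."""
    prob1 = {"H": p, "T": 1 - p}
    prob2 = {"H": q, "T": 1 - q}
    return sum(prob1[a] * prob2[b] * u[(a, b)] for a in "HT" for b in "HT")


p = p_making_player2_indifferent(U2)
q = q_making_player1_indifferent(U1)
print(p, q, expected_utility(U1, p, q), expected_utility(U2, p, q))
```

Both players mix half and half, and each player's expected utility at the equilibrium is zero.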
