Unit III: The Evolution of Cooperation

Unit III: The Evolution of Cooperation • Can Selfishness Save the Environment? • Repeated Games: the Folk Theorem • Evolutionary Games • A Tournament • How to Promote Cooperation/Unit Review 4/9 7/28 4/14

Repeated Games Some Questions: • What happens when a game is repeated? • Can threats and promises about the future influence behavior in the present? • Cheap talk • Finitely repeated games: Backward induction • Indefinitely repeated games: Trigger strategies

Repeated Games
Can threats and promises about future actions influence behavior in the present? Consider the following game, played twice (payoffs listed as row player, column player):

      C      D
C    3,3    0,5
D    5,0    1,1

See Gibbons: 82-104.

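Repeated Games
Draw the extensive form game. The one-round payoffs are (3,3), (0,5), (5,0), (1,1). The two-round totals at the terminal nodes are:

First round    Second round: (C,C)    (C,D)     (D,C)     (D,D)
(C,C)                        (6,6)    (3,8)     (8,3)     (4,4)
(C,D)                        (3,8)    (0,10)    (5,5)     (1,6)
(D,C)                        (8,3)    (5,5)     (10,0)    (6,1)
(D,D)                        (4,4)    (1,6)     (6,1)     (2,2)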

Repeated Games
Now, consider three repeated game strategies:
• D (ALWAYS DEFECT): Defect on every move.
• C (ALWAYS COOPERATE): Cooperate on every move.
• T (TRIGGER): Cooperate on the first move, then cooperate as long as the other cooperates. If the other defects, then defect forever.

Repeated Games
If the game is played twice, the V(alue) to a player using ALWAYS DEFECT (D) against an opponent using ALWAYS DEFECT (D) is V(D/D) = 1 + 1 = 2. The values for all pairings, written V(own strategy/opponent's strategy), are:
V(D/D) = 1 + 1 = 2
V(C/C) = 3 + 3 = 6
V(T/T) = 3 + 3 = 6
V(D/C) = 5 + 5 = 10
V(D/T) = 5 + 1 = 6
V(C/D) = 0 + 0 = 0
V(C/T) = 3 + 3 = 6
V(T/D) = 0 + 1 = 1
V(T/C) = 3 + 3 = 6

Repeated Games
And 3x, with time average payoffs (total divided by n = 3):
V(D/D) = 1 + 1 + 1 = 3; average 3/3 = 1
V(C/C) = 3 + 3 + 3 = 9; average 9/3 = 3
V(T/T) = 3 + 3 + 3 = 9; average 9/3 = 3
V(D/C) = 5 + 5 + 5 = 15; average 15/3 = 5
V(D/T) = 5 + 1 + 1 = 7; average 7/3
V(C/D) = 0 + 0 + 0 = 0; average 0/3 = 0
V(C/T) = 3 + 3 + 3 = 9; average 9/3 = 3
V(T/D) = 0 + 1 + 1 = 2; average 2/3
V(T/C) = 3 + 3 + 3 = 9; average 9/3 = 3
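The totals and time averages above can be reproduced by playing the three strategies against each other. The following is a minimal sketch, assuming the payoffs of the example game (5, 3, 1, 0); the function and variable names are illustrative and not taken from the slides.

```python
from fractions import Fraction

# One-round payoff to "me": (my move, other's move) -> payoff
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def always_defect(other_history):
    return "D"

def always_cooperate(other_history):
    return "C"

def trigger(other_history):
    # Cooperate until the other player has defected once, then defect forever.
    return "D" if "D" in other_history else "C"

STRATEGIES = {"D": always_defect, "C": always_cooperate, "T": trigger}

def total_payoff(me, other, n):
    """Total payoff V(me/other) over n rounds."""
    mine, theirs, total = [], [], 0
    for _ in range(n):
        a = STRATEGIES[me](theirs)      # my move, given the other's past moves
        b = STRATEGIES[other](mine)     # the other's move, given my past moves
        total += PAYOFF[(a, b)]
        mine.append(a)
        theirs.append(b)
    return total

for n in (2, 3):
    for me in "DCT":
        for other in "DCT":
            v = total_payoff(me, other, n)
            print(f"n={n}  V({me}/{other}) = {v}  time average = {Fraction(v, n)}")
```

Raising n in the loop shows the time averages of V(D/T) and V(T/D) both approaching 1.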

Repeated Games
Time average payoffs for n rounds:
V(D/D) = (1 + 1 + 1 + ...)/n = 1
V(C/C) = (3 + 3 + 3 + ...)/n = 3
V(T/T) = (3 + 3 + 3 + ...)/n = 3
V(D/C) = (5 + 5 + 5 + ...)/n = 5
V(D/T) = (5 + 1 + 1 + ...)/n = 1 + e
V(C/D) = (0 + 0 + 0 + ...)/n = 0
V(C/T) = (3 + 3 + 3 + ...)/n = 3
V(T/D) = (0 + 1 + 1 + ...)/n = 1 - e
V(T/C) = (3 + 3 + 3 + ...)/n = 3
Here e stands for a small positive amount that shrinks to zero as n grows (4/n for V(D/T), 1/n for V(T/D)).

Repeated Games
Now draw the matrix form of this game, played 1x:

      C      D      T
C    3,3    0,5    3,3
D    5,0    1,1    5,0
T    3,3    0,5    3,3

Repeated Games
Time Average Payoffs

      C      D          T
C    3,3    0,5        3,3
D    5,0    1,1        1+e,1-e
T    3,3    1-e,1+e    3,3

If the game is repeated, ALWAYS DEFECT is no longer dominant.

Repeated Games
… and TRIGGER achieves “a NE with itself”: against T, playing T earns a time average of 3, while D earns only 1+e and C earns 3.

Repeated Games
Time Average Payoffs, with T(emptation) > R(eward) > P(unishment) > S(ucker):

      C      D          T
C    R,R    S,T        R,R
D    T,S    P,P        P+e,P-e
T    R,R    P-e,P+e    R,R

Discounting
The discount parameter, d, is the weight of the next payoff relative to the current payoff. In an indefinitely repeated game, d can also be interpreted as the likelihood of the game continuing for another round (so that the expected number of moves per game is 1/(1-d)). This is “The Shadow of the Future.”
The V(alue) to someone using ALWAYS DEFECT (D) when playing with someone using TRIGGER (T) is the sum of T for the first move, dP for the second, d^2P for the third, and so on (Axelrod: 13-4):
V(D/T) = T + dP + d^2P + …

Discounting
Writing this as V(D/T) = T + dP + d^2P + ..., we have the following:
V(D/D) = P + dP + d^2P + … = P/(1-d)
V(C/C) = R + dR + d^2R + … = R/(1-d)
V(T/T) = R + dR + d^2R + … = R/(1-d)
V(D/C) = T + dT + d^2T + … = T/(1-d)
V(D/T) = T + dP + d^2P + … = T + dP/(1-d)
V(C/D) = S + dS + d^2S + … = S/(1-d)
V(C/T) = R + dR + d^2R + … = R/(1-d)
V(T/D) = S + dP + d^2P + … = S + dP/(1-d)
V(T/C) = R + dR + d^2R + … = R/(1-d)
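As a quick numerical check of these closed forms, a long truncated sum can be compared with the geometric-series formula. This is a sketch only: the payoff numbers come from the earlier example game, d = 0.9 is an arbitrary choice, and the helper names are hypothetical.

```python
def truncated_sum(first, rest, d, rounds=2000):
    """first + d*rest + d^2*rest + ... cut off after `rounds` moves."""
    return first + sum(rest * d**t for t in range(1, rounds))

def closed_form(first, rest, d):
    """Geometric-series value: first + d*rest/(1-d)."""
    return first + d * rest / (1 - d)

T, R, P, S = 5, 3, 1, 0   # payoffs of the example game (an assumption for this sketch)
d = 0.9                   # illustrative discount parameter

pairs = {
    "V(D/D)": (P, P),     # P/(1-d)
    "V(T/T)": (R, R),     # R/(1-d)
    "V(D/T)": (T, P),     # T + dP/(1-d)
    "V(T/D)": (S, P),     # S + dP/(1-d)
}
for name, (first, rest) in pairs.items():
    print(name, truncated_sum(first, rest, d), closed_form(first, rest, d))
```

When first and rest are equal, closed_form reduces to first/(1-d), which matches the P/(1-d), R/(1-d), T/(1-d), and S/(1-d) entries in the list.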

Discounting
Discounted Payoffs (row player's payoff, column player's payoff), with T > R > P > S and 0 < d < 1:

      C                    D                             T
C    R/(1-d), R/(1-d)     S/(1-d), T/(1-d)              R/(1-d), R/(1-d)
D    T/(1-d), S/(1-d)     P/(1-d), P/(1-d)              T + dP/(1-d), S + dP/(1-d)
T    R/(1-d), R/(1-d)     S + dP/(1-d), T + dP/(1-d)    R/(1-d), R/(1-d)

Discounting
In the discounted payoff matrix (T > R > P > S, 0 < d < 1), TRIGGER (T) weakly dominates ALWAYS COOPERATE (C): both earn R/(1-d) against C and against T, but against D, TRIGGER earns S + dP/(1-d) while ALWAYS COOPERATE earns only S/(1-d).

Discounting
Now consider what happens to these values as d varies (from 0 to 1):
V(D/D) = P + dP + d^2P + … = P + dP/(1-d) = P/(1-d)
V(T/D) = S + dP + d^2P + … = S + dP/(1-d)
Since P > S, V(D/D) > V(T/D): D is a best response to D.
V(D/T) = T + dP + d^2P + … = T + dP/(1-d)
V(T/T) = R + dR + d^2R + … = R/(1-d)
Ranking so far: (1) V(D/T), (2) V(D/D), (3) V(T/D). Where does V(T/T) fall?

Discounting
Now consider what happens to these values as d varies (from 0 to 1):
For all values of d:
V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)
Is there a value of d such that V(D/T) = V(T/T)? Call this d*.
V(D/T) = V(T/T)
T + dP/(1-d) = R/(1-d)
T - dT + dP = R
T - R = d(T - P)
d* = (T-R)/(T-P)
If d < d*, the following ordering holds:
V(D/T) > V(T/T) > V(D/D) > V(T/D)
D is dominant: GAME SOLVED.

Discounting
With d* = (T-R)/(T-P): if d > d*, the following ordering holds:
V(T/T) > V(D/T) > V(D/D) > V(T/D)
D is a best response to D; T is a best response to T; multiple NE.
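The threshold d* and the two orderings can be checked with exact arithmetic. The sketch below assumes the payoffs of the earlier example game (T = 5, R = 3, P = 1, S = 0), for which d* = 1/2; the function names are illustrative.

```python
from fractions import Fraction

def critical_discount(T, R, P):
    """d* = (T-R)/(T-P), the d at which V(D/T) = V(T/T)."""
    return Fraction(T - R, T - P)

def values(T, R, P, S, d):
    """Discounted values of the four pairings compared on the slide."""
    return {
        "D/T": T + d * P / (1 - d),
        "T/T": R / (1 - d),
        "D/D": P / (1 - d),
        "T/D": S + d * P / (1 - d),
    }

T, R, P, S = 5, 3, 1, 0                      # example payoffs (assumption)
d_star = critical_discount(T, R, P)

for d in (Fraction(1, 4), d_star, Fraction(3, 4)):
    v = values(T, R, P, S, d)
    # Sort pairings from highest to lowest value to display the ordering.
    ranking = sorted(v, key=v.get, reverse=True)
    print(f"d = {d}: " + ", ".join(f"V({k}) = {v[k]}" for k in ranking))
```

Below d* the list starts with V(D/T); above d* it starts with V(T/T); at d* the two are equal.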

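Discounting
Graphically: The V(alue) to a player using ALWAYS DEFECT (D) against TRIGGER (T), and the V(T/T), as a function of the discount parameter (d).
[Figure: V on the vertical axis, with T and R marked; d on the horizontal axis, with d* and 1 marked. Two curves: V(D/T) = T + dP/(1-d) and V(T/T) = R/(1-d).]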

The Folk Theorem
The payoff set of the repeated PD is the convex closure of the points [(T,S); (R,R); (S,T); (P,P)].
[Figure: the payoff set, with corners labeled (S,T), (R,R), (P,P), and (T,S).]

The Folk Theorem
The shaded area is the set of payoffs that Pareto-dominate the one-shot NE (P,P).
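Membership in the shaded area can be tested point by point. The sketch below is illustrative only: it assumes the example payoffs T = 5, R = 3, P = 1, S = 0 (for which the four corner points form a convex quadrilateral), and it assumes the usual reading of Pareto dominance (no player worse off than P, at least one strictly better off). All names are hypothetical.

```python
T, R, P, S = 5, 3, 1, 0   # example payoffs (assumption for this sketch)

# Corners of the payoff set, listed counter-clockwise for these numbers.
VERTICES = [(T, S), (R, R), (S, T), (P, P)]

def in_payoff_set(point):
    """True if point lies inside or on the convex polygon spanned by VERTICES."""
    x, y = point
    edges = zip(VERTICES, VERTICES[1:] + VERTICES[:1])
    for (x1, y1), (x2, y2) in edges:
        # A negative cross product puts the point to the right of a
        # counter-clockwise edge, i.e. outside the polygon.
        if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) < 0:
            return False
    return True

def pareto_dominates_one_shot_ne(point):
    """True if neither player gets less than P and at least one gets more."""
    x, y = point
    return x >= P and y >= P and (x > P or y > P)

def in_shaded_area(point):
    return in_payoff_set(point) and pareto_dominates_one_shot_ne(point)

for point in [(3, 3), (2, 2), (4, 1), (5, 0), (1, 1), (4, 4)]:
    print(point, in_shaded_area(point))
```

For other payoff values the corner points need not be in counter-clockwise order, so the vertex list would have to be reordered before reusing the test.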

The Folk Theorem
Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.

The Folk Theorem
In other words, in the repeated game, if the future matters “enough” (i.e., d > d*), there are zillions of equilibria!

The Folk Theorem
• The theorem tells us that in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not.
• In this context, evolution can be seen as a process that selects for repeated game strategies with efficient payoffs: “Survival of the Fittest.”

Evolutionary Games Fifteen months after I had begun my systematic enquiry, I happened to read for amusement ‘Malthus on Population’ . . . It at once struck me that . . . favorable variations would tend to be preserved, and unfavorable ones to be destroyed. Here then I had at last got a theory by which to work. Charles Darwin

Evolutionary Games • Evolutionary Stability (ESS) • Hawk-Dove: an example • The Replicator Dynamic • The Trouble with TIT FOR TAT • Designing Repeated Game Strategies • Finite Automata

Evolutionary Games
Biological Evolution: Under the pressure of natural selection, any population (capable of reproduction and variation) will evolve so as to become better adapted to its environment, i.e., will develop in the direction of increasing “fitness.”
Economic Evolution: Firms that adopt efficient “routines” will survive, expand, and multiply, whereas others will be “weeded out” (Nelson and Winter, 1982).

Evolutionary Stability
Evolutionarily Stable Strategy (ESS): A strategy is evolutionarily stable if it cannot be invaded by a mutant strategy (Maynard Smith & Price, 1973).
A strategy, A, is ESS if
i) V(A/A) ≥ V(B/A), for all B
ii) either V(A/A) > V(B/A) or V(A/B) > V(B/B), for all B

Hawk-Dove: an example
Imagine a population of Hawks and Doves competing over a scarce resource (say, food in a given area). The share of each type in the population changes according to the payoff matrix, so that payoffs determine the number of offspring left to the next generation.
v = value of the resource
c = cost of fighting
H/D: Hawk gets resource; Dove flees (v, 0)
D/D: Share resource (v/2, v/2)
H/H: Share resource less cost of fighting ((v-c)/2, (v-c)/2)
(See Hargreaves Heap and Varoufakis: 195-214; Casti: 71-75.)

Hawk-Dove: an example
v = value of resource; c = cost of fighting

      H                     D
H    (v-c)/2, (v-c)/2      v, 0
D    0, v                  v/2, v/2

Hawk-Dove: an example
v = value of resource = 4; c = cost of fighting = 6

      H         D
H    -1,-1     4,0
D    0,4       2,2

Hawk-Dove: an example

      H         D
H    -1,-1     4,0
D    0,4       2,2

NE = {(1,0); (0,1); (2/3,2/3)}. The two pure-strategy equilibria are marked unstable and the mixed equilibrium stable.
The mixed NE corresponds to a population that is 2/3 Hawks and 1/3 Doves.
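The 2/3 figure follows from setting the expected Hawk and Dove payoffs equal, which gives a Hawk share of v/c. One common way to formalize how the shares change is a replicator-style update, in which the Hawk share grows when Hawks earn more than Doves. The sketch below is an assumption-laden illustration: the update rule, the step size dt, the number of steps, and all names are choices made here, not taken from the slides.

```python
from fractions import Fraction

def mixed_ne_hawk_share(v, c):
    """Hawk share p at which Hawk and Dove earn the same: p = v/c (requires c > v)."""
    return Fraction(v, c)

def fitness(p, v, c):
    """Expected payoffs to a Hawk and to a Dove when a share p of the population is Hawk."""
    hawk = p * (v - c) / 2 + (1 - p) * v
    dove = (1 - p) * v / 2
    return hawk, dove

def replicator_path(p0, v, c, dt=0.1, steps=500):
    """Euler steps of dp/dt = p(1-p)(hawk fitness - dove fitness)."""
    p = p0
    for _ in range(steps):
        hawk, dove = fitness(p, v, c)
        p += dt * p * (1 - p) * (hawk - dove)
    return p

v, c = 4, 6
print("mixed NE Hawk share:", mixed_ne_hawk_share(v, c))
for start in (0.1, 0.5, 0.95):
    print(f"start {start} -> {replicator_path(start, v, c):.6f}")
```

Under this update rule, interior starting shares move toward 2/3, which is consistent with the mixed equilibrium being the stable one.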

Hawk-Dove: an example
With NE = {(1,0); (0,1); (2/3,2/3)}: is any strategy ESS?

Hawk-Dove: an example
A strategy, A, is ESS if
i) V(A/A) ≥ V(B/A), for all B
ii) either V(A/A) > V(B/A) or V(A/B) > V(B/B), for all B
In other words, to be ESS, a strategy must be a NE with itself. Neither H nor D is ESS (for these payoffs): V(H/H) = -1 < V(D/H) = 0, and V(D/D) = 2 < V(H/D) = 4.
What about the mixed NE strategy?

Hawk-Dove: an example
Where M is the mixed strategy 2/3 Hawk, 1/3 Dove:
V(H/H) = -1
V(H/D) = 4
V(D/H) = 0
V(D/D) = 2
V(H/M) = 2/3 V(H/H) + 1/3 V(H/D) = 2/3(-1) + 1/3(4) = 2/3
V(M/H) = 2/3 V(H/H) + 1/3 V(D/H) = -2/3
V(D/M) = 2/3 V(D/H) + 1/3 V(D/D) = 2/3
V(M/D) = 2/3 V(H/D) + 1/3 V(D/D) = 10/3
V(M/M) = 4/9 V(H/H) + 2/9 V(H/D) + 2/9 V(D/H) + 1/9 V(D/D) = 2/3

Hawk-Dove: an example
To be an ESS, M must satisfy:
i) V(M/M) ≥ V(B/M), for all B. Here V(M/M) = V(H/M) = V(D/M) = 2/3, so (i) holds with equality.
ii) either V(M/M) > V(B/M) or V(M/B) > V(B/B), for all B. Because (i) holds only with equality, the second inequality is the one that must hold:
V(M/D) > V(D/D): 10/3 > 2
V(M/H) > V(H/H): -2/3 > -1
Both inequalities hold, so M is an ESS.
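The two ESS conditions can be checked mechanically with exact fractions. This is a minimal sketch for the v = 4, c = 6 payoffs; strategies are represented by their Hawk probability, and the names V and resists are illustrative choices.

```python
from fractions import Fraction

# Payoff to the first strategy: (own move, other's move) -> payoff
PAYOFF = {("H", "H"): -1, ("H", "D"): 4, ("D", "H"): 0, ("D", "D"): 2}

def V(a, b):
    """Expected payoff V(a/b), where a and b are probabilities of playing Hawk."""
    a, b = Fraction(a), Fraction(b)
    return (a * b * PAYOFF[("H", "H")]
            + a * (1 - b) * PAYOFF[("H", "D")]
            + (1 - a) * b * PAYOFF[("D", "H")]
            + (1 - a) * (1 - b) * PAYOFF[("D", "D")])

def resists(a, b):
    """True if incumbent a meets conditions (i) and (ii) against mutant b."""
    if V(a, a) > V(b, a):
        return True
    # Equality in (i): the mutant must do worse against itself than a does against it.
    return V(a, a) == V(b, a) and V(a, b) > V(b, b)

H, D, M = Fraction(1), Fraction(0), Fraction(2, 3)

print("V(M/M) =", V(M, M), " V(H/M) =", V(H, M), " V(D/M) =", V(D, M))
print("M resists H:", resists(M, H), " M resists D:", resists(M, D))
print("H resists D:", resists(H, D), " D resists H:", resists(D, H))
```

Other mutants, such as a 1/2 Hawk mix, can be tried by passing a different probability as the second argument to resists.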

Evolutionary Stability in IRPD?
Evolutionarily Stable Strategy (ESS): A strategy is evolutionarily stable if it cannot be invaded by a mutant strategy (Maynard Smith & Price, 1973).
Is D an ESS? Consider a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT). STFT defects on the first round, then plays like TFT (it repeats the other player's previous move).
i) V(D/D) ≥ V(STFT/D)? Only with equality: V(D/D) = V(STFT/D).
ii) V(D/D) > V(STFT/D) or V(D/STFT) > V(STFT/STFT)? Neither: V(D/D) = V(STFT/D) and V(D/STFT) = V(STFT/STFT).
D and STFT are “neutral mutants.”
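The equalities above can be seen by playing the two strategies against each other: in every pairing, both players defect in every round. The sketch below assumes the payoffs of the earlier example game, an arbitrary discount parameter d = 0.9, and a truncation after 200 rounds; the names are hypothetical.

```python
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def always_defect(other_history):
    return "D"

def suspicious_tft(other_history):
    # Defect on the first round, then repeat the other player's previous move.
    return other_history[-1] if other_history else "D"

def discounted_value(me, other, d, rounds=200):
    """Discounted payoff to `me` against `other`, truncated after `rounds` moves."""
    mine, theirs, total = [], [], 0.0
    for t in range(rounds):
        a, b = me(theirs), other(mine)
        total += d**t * PAYOFF[(a, b)]
        mine.append(a)
        theirs.append(b)
    return total

d = 0.9
D, STFT = always_defect, suspicious_tft
print("V(D/D)       =", discounted_value(D, D, d))
print("V(STFT/D)    =", discounted_value(STFT, D, d))
print("V(D/STFT)    =", discounted_value(D, STFT, d))
print("V(STFT/STFT) =", discounted_value(STFT, STFT, d))
```

All four values coincide, so neither strict inequality in condition (ii) can hold.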

Evolutionary Stability in IRPD? • Axelrod & Hamilton (1981) demonstrated that D is not an ESS, opening the way to subsequent tournament studies of the game. • This is a sort-of Folk Theorem for evolutionary games: In the one-shot Prisoner’s Dilemma, DEFECT is strictly dominant. But in the repeated game, ALWAYS DEFECT (D) can be invaded by a mutant strategy, e.g., SUSPICIOUS TIT FOR TAT (STFT). • Many cooperative strategies do better than D, thus they can gain a foothold and grow as a share of the population. • Depending on the initial population, the equilibrium reached can exhibit any amount of cooperation. • Is STFT an ESS?

Evolutionary Stability in IRPD?
It can be shown that there is no ESS in IRPD (Boyd & Lorberbaum, 1987; Lorberbaum, 1994). There can be stable polymorphisms among neutral mutants, whose realized behaviors are indistinguishable from one another. (This is the case, for example, of a population of C and TFT.)
Noise: If the system is perturbed by “noise,” these behaviors become distinct and differences in their reproductive success rates are amplified. As a result, interest has shifted from the proof of the existence of a solution to the design of repeated game strategies that perform well against other sophisticated strategies.
