Repeated Games
MICROECONOMICS: Principles and Analysis
Frank Cowell, January 2007
Prerequisites (almost essential): Game Theory: Dynamic
Overview... Repeated Games • Basic structure (embedding the game in context…) • Equilibrium issues • Applications
Introduction • Another examination of the role of time • Dynamic analysis can be difficult • more than a few stages… • …can lead to complicated analysis of equilibrium • We need an alternative approach • but one that preserves basic insights of dynamic games • such as subgame-perfect equilibrium • Build on the idea of dynamic games • introduce a jump • from the case of comparatively few stages… • …to the case of arbitrarily many
Repeated games • The alternative approach • take a series of the same game • embed it within a time-line structure • Basic idea is simple • by connecting multiple instances of an atemporal game… • …model a repeated encounter between the players in the same situation of economic conflict • Raises important questions • how does this structure differ from an atemporal model? • how does the repetition of a game differ from a single play? • how does it differ from a collection of unrelated games of identical structure with identical players?
History • Why is the time-line different from a collection of unrelated games? • The key is history • history at any point on the time-line… • …is the information about actual play… • …accumulated up to that point • History can affect the nature of the game • at any stage all players can know all the accumulated information • strategies can be conditioned on this information • History can play a role in the equilibrium • some outcomes that aren’t equilibria in a single encounter… • …may yet be equilibrium outcomes in the repeated game • the game’s history is used to support such outcomes
Repeated games: Structure • The stage game • take an instant in time • specify a simultaneous-move game • payoffs completely specified by actions within the game • Repeat the stage game indefinitely • there’s an instance of the stage game at time 0, 1, 2, …, t, … • the possible payoffs are also repeated for each t • payoffs at t depend on actions in the stage game at t • A modified strategic environment • all previous actions are assumed to be common knowledge • so agents’ strategies can be conditioned on this information • Does this modify equilibrium behaviour and outcomes?
Equilibrium • The simplified structure has potential advantages • whether these are significant depends on the nature of the stage game… • …and on the concept of equilibrium used • Possibilities for equilibrium • are new strategy combinations supportable as equilibria? • long-term cooperative outcomes… • …absent from a myopic analysis of a simple game • Refinements of subgame perfection simplify the analysis: • they rule out empty threats… • …and incredible promises • and disregard irrelevant “might-have-beens”
Overview... Repeated Games • Basic structure • Equilibrium issues (developing the basic concepts…) • Applications
Equilibrium: an approach • Focus on the key question in repeated games: • how can rational players use the information from history? • we need to address this to characterise equilibrium • Illustrate the method with an argument by example • Outline, for the Prisoner's Dilemma game: • the same players face the same outcomes from the actions they may choose in periods 1, 2, …, t, … • The Prisoner's Dilemma is particularly instructive given: • its importance in microeconomics • the pessimistic outcome of an isolated round of the game
Prisoner’s dilemma: Reminder
• Payoffs in the stage game (Alf’s payoff first, Bill’s second):

                 Bill [left]   Bill [right]
   Alf [LEFT]       2,2            0,3
   Alf [RIGHT]      3,0            1,1

• If Alf plays [RIGHT] then Bill’s best response is [right] • If Bill plays [right] then Alf’s best response is [RIGHT] • So ([RIGHT], [right]) with payoffs (1,1) is the Nash equilibrium • The outcome ([LEFT], [left]) with payoffs (2,2) Pareto-dominates the NE • The NE is therefore inefficient • Could the Pareto-efficient outcome be an equilibrium in the repeated game? • Look at the structure…
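As a quick cross-check of this stage game, here is a minimal sketch in Python (the payoff dictionary simply hard-codes the matrix above; the helper is_nash is illustrative and not part of the original slides) that searches for action pairs from which neither player gains by a unilateral deviation:

```python
# Stage-game payoffs: (Alf's payoff, Bill's payoff) for each action pair
payoffs = {
    ("LEFT", "left"):   (2, 2),
    ("LEFT", "right"):  (0, 3),
    ("RIGHT", "left"):  (3, 0),
    ("RIGHT", "right"): (1, 1),
}
alf_actions = ["LEFT", "RIGHT"]
bill_actions = ["left", "right"]

def is_nash(a, b):
    """True if neither player can gain by a unilateral deviation from (a, b)."""
    ua, ub = payoffs[(a, b)]
    alf_ok = all(payoffs[(a2, b)][0] <= ua for a2 in alf_actions)
    bill_ok = all(payoffs[(a, b2)][1] <= ub for b2 in bill_actions)
    return alf_ok and bill_ok

print([pair for pair in payoffs if is_nash(*pair)])   # [('RIGHT', 'right')]
```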
Repeated Prisoner's dilemma • [Figure: the extensive-form tree of the repeated game] • The stage game (t=1): Alf chooses [LEFT] or [RIGHT], Bill chooses [left] or [right], giving payoffs (2,2), (0,3), (3,0) or (1,1) • The stage game (t=2) follows at each of the four terminal nodes… • Repeat this structure indefinitely…?
Repeated Prisoner's dilemma • [Figure: the stage game, with payoffs (2,2), (0,3), (3,0), (1,1)…] • …repeated through time t • Let's look at the detail
Repeated PD: payoffs • To represent possibilities in the long run: • first consider payoffs available in the stage game • then those available through mixtures • In the one-shot game payoffs were simply represented • it was enough to denote them as 0, …, 3 • purely ordinal… • …arbitrary monotonic changes of the payoffs have no effect • Now we need a generalised notation • cardinal values of utility matter • we need to sum utilities and compare utility differences • Evaluation of a payoff stream: • suppose the payoff to agent h in period t is u^h(t) • the value of (u^h(1), u^h(2), …, u^h(t), …) is given by ∑_{t=1}^{∞} d^{t−1} u^h(t) • where d is a discount factor, 0 < d < 1
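As an illustration of this valuation, a minimal sketch (Python; the discount factor d = 0.9 and the constant stream are made-up values) that evaluates a truncated stream and compares it with the closed form u/(1 − d) for a constant payoff u:

```python
def discounted_value(stream, d):
    """Value of a payoff stream (u(1), u(2), ...): sum of d**(t-1) * u(t)."""
    return sum(d ** t * u for t, u in enumerate(stream))   # enumerate starts at 0, so weight d**0 on u(1)

d = 0.9                # assumed discount factor
stream = [2] * 500     # constant payoff of 2 per period, truncated at 500 periods
print(discounted_value(stream, d))   # ~20.0
print(2 / (1 - d))                   # closed form for an infinite constant stream: 20.0
```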
PD: stage game • A generalised notation for the stage game • consider actions and payoffs… • …in each of four fundamental cases • Both socially irresponsible: • they play [RIGHT], [right] • get (ua, ub) where ua > 0, ub > 0 • Both socially responsible: • they play [LEFT], [left] • get (u*a, u*b) where u*a > ua, u*b > ub • Only Alf socially responsible: • they play [LEFT], [right] • get (0, ūb) where ūb > u*b • Only Bill socially responsible: • they play [RIGHT], [left] • get (ūa, 0) where ūa > u*a • A diagrammatic view
Repeated Prisoner’s dilemma payoffs • [Figure: the space of utility payoffs, ua on the horizontal axis and ub on the vertical axis, showing the set U*] • The payoff points of the Prisoner's Dilemma: (ua, ub), (u*a, u*b), (0, ūb), (ūa, 0) • The Nash-equilibrium payoffs (ua, ub) • Payoffs Pareto-superior to the NE • Payoffs available through mixing • Feasible, superior points • "Efficient" outcomes
Choosing a strategy: setting • There is a long-run advantage in the Pareto-efficient outcome • payoffs (u*a, u*b) in each period… • …are clearly better than (ua, ub) in each period • Suppose the agents recognise this advantage • what actions would guarantee it to them? • clearly they need to play [LEFT], [left] in every period • The problem is lack of trust: • they cannot trust each other… • …nor indeed themselves: • Alf is tempted to be antisocial and get the payoff ūa by playing [RIGHT] • Bill has a similar temptation
Choosing a strategy: formulation • Will a dominated outcome still be inevitable? • Suppose each player adopts a strategy that • rewards the other party's responsible behaviour by responding with the action [left] • punishes antisocial behaviour with the action [right], thus generating the minimax payoffs (ua, ub) • This is known as a trigger strategy • Why the strategy is powerful: • the punishment applies to every period after the one in which the antisocial action occurred • if the punishment is invoked the offender is “minimaxed for ever” • Look at it in detail
Repeated PD: trigger strategies • Take the situation at t • Alf's trigger strategy sTa: if Bill's actions in 0, …, t have been [left], [left], …, [left] (the first type of history), Alf plays [LEFT] at t+1 to continue this history; if the history is anything else (the second type), Alf plays [RIGHT] at t+1 as the punishment response • Bill's trigger strategy sTb: if Alf's actions in 0, …, t have been [LEFT], [LEFT], …, [LEFT], Bill plays [left] at t+1; for anything else, Bill plays [right] • The trigger strategies [sTa, sTb]: will they work?
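Before answering, here is a minimal simulation sketch (Python; the "Alf defects from t = 3" scenario and the C/D labels are illustrative assumptions, not part of the original slides) showing how the trigger strategy conditions play at t+1 on the opponent's history up to t:

```python
# Stage-game payoffs (Alf, Bill): C = cooperate ([LEFT]/[left]), D = defect ([RIGHT]/[right])
PAYOFFS = {("C", "C"): (2, 2), ("C", "D"): (0, 3), ("D", "C"): (3, 0), ("D", "D"): (1, 1)}

def trigger(opponent_history):
    """Trigger strategy: play C iff the opponent has played C in every past period."""
    return "C" if all(a == "C" for a in opponent_history) else "D"

def simulate(alf_strategy, bill_strategy, periods=6):
    alf_hist, bill_hist, outcomes = [], [], []
    for t in range(periods):
        a = alf_strategy(bill_hist, t)
        b = bill_strategy(alf_hist, t)
        alf_hist.append(a)
        bill_hist.append(b)
        outcomes.append(PAYOFFS[(a, b)])
    return outcomes

# Alf defects from t = 3 onwards; Bill follows the trigger strategy
alf = lambda hist, t: "D" if t >= 3 else trigger(hist)
bill = lambda hist, t: trigger(hist)
print(simulate(alf, bill))
# [(2, 2), (2, 2), (2, 2), (3, 0), (1, 1), (1, 1)]
```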
Will the trigger strategy “work”? • Utility gain from “misbehaving” at t: ūa − u*a • What is the value at t of the punishment from t+1 onwards? • Difference in utility per period: u*a − ua • Discounted value of this in period t+1: V := [u*a − ua] / [1 − d] • Value of this in period t: dV = d[u*a − ua] / [1 − d] • So the agent chooses not to misbehave if • ūa − u*a ≤ d[u*a − ua] / [1 − d] • But this is only going to work for specific parameters • the value of d… • …relative to ūa, ua and u*a • What values of the discount factor will allow an equilibrium?
Discounting and equilibrium • For an equilibrium the condition must be satisfied for both a and b • Consider the situation of a • Rearranging the condition from the previous slide: • d[u*a − ua] ≥ [1 − d][ūa − u*a] • d[ūa − ua] ≥ ūa − u*a • Simplifying, the condition must be • d ≥ da • where da := [ūa − u*a] / [ūa − ua] • A similar result must also apply to agent b • Therefore we must have the condition: • d ≥ d* • where d* := max {da, db}
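Using the payoff numbers from the reminder slide (temptation 3, cooperation 2, Nash/minimax 1, the same for both players), a minimal sketch of this critical-discount-factor calculation:

```python
def critical_delta(u_bar, u_star, u_nash):
    """Smallest discount factor at which the trigger strategy deters deviation:
    the one-period gain (u_bar - u_star) must not exceed the discounted
    per-period loss (u_star - u_nash) suffered forever afterwards."""
    return (u_bar - u_star) / (u_bar - u_nash)

d_a = critical_delta(u_bar=3, u_star=2, u_nash=1)   # Alf
d_b = critical_delta(u_bar=3, u_star=2, u_nash=1)   # Bill (the game is symmetric)
print(d_a, d_b, max(d_a, d_b))   # 0.5 0.5 0.5 -> cooperation sustainable for any d >= 1/2
```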
Repeated PD: SPNE • Assuming d ≥ d*, take the strategies [sTa, sTb] prescribed by the trigger-strategy table • If there were antisocial behaviour at t, consider the subgame that would then start at t+1 • Alf could not increase his payoff by switching from [RIGHT] to [LEFT], given that Bill is playing [right] • a similar remark applies to Bill • so the strategies imply a NE for this subgame • likewise for any subgame starting after t+1 • But if [LEFT], [left] has been played in every period up to t: • Alf would not wish to switch to [RIGHT] • a similar remark applies to Bill • again we have a NE • So, if d is large enough, [sTa, sTb] is a subgame-perfect equilibrium • it yields the payoffs (u*a, u*b) in every period
Folk Theorem • The outcome of the repeated PD is instructive • illustrates an important result • …the Folk Theorem • Strictly speaking a class of results • finite/infinite games • different types of equilibrium concepts • A standard version of the Theorem: • In a two-person infinitely repeated game: • if the discount factor is sufficiently close to 1 • any combination of actions observed in any finite number of stages… • …is the outcome of a subgame-perfect equilibrium
Assessment • The Folk Theorem central to repeated games • perhaps better described as Folk Theorems • a class of results • Clearly has considerable attraction • Put its significance in context • makes relatively modest claims • gives a possibility result • Only seen one example of the Folk Theorem • let’s apply it… • …to well known oligopoly examples
Overview... Repeated Games • Basic structure • Equilibrium issues • Applications (some well-known examples…)
Cournot competition: repeated • Start by reinterpreting the PD as a Cournot duopoly • two identical firms • each firm can choose one of two levels of output: [high] or [low] • can the firms sustain a low-output (i.e. high-profit) equilibrium? • Possible actions and outcomes in the stage game: • [HIGH], [high]: both firms get the Cournot-Nash payoff PC > 0 • [LOW], [low]: both firms get the joint-profit-maximising payoff PJ > PC • [HIGH], [low]: payoffs are (P̄, 0) where P̄ > PJ • Folk theorem: get an SPNE with payoffs (PJ, PJ) if d is large enough • Critical value of the discount factor: d* = [P̄ − PJ] / [P̄ − PC] • But we should say more • Let's review the standard Cournot diagram
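For a concrete number, here is a minimal sketch under an assumed linear Cournot duopoly (inverse demand P(Q) = a − Q with a = 12 and zero marginal cost, both made-up values); unlike the two-action table above, the deviation profit here is the best response to the rival's collusive output rather than the payoff P̄ that drives the rival to zero:

```python
# Assumed linear Cournot duopoly: inverse demand P(Q) = a - Q, zero marginal cost
a = 12.0

q_C = a / 3                          # Cournot-Nash output per firm
q_J = a / 4                          # each firm's share of the joint-profit-maximising output
P_C = (a - 2 * q_C) * q_C            # per-firm Cournot-Nash profit
P_J = (a - 2 * q_J) * q_J            # per-firm profit under joint maximisation

q_dev = (a - q_J) / 2                # one-period deviation: best response to the rival's q_J
P_dev = (a - q_dev - q_J) * q_dev    # profit earned in the deviation period

d_star = (P_dev - P_J) / (P_dev - P_C)
print(P_C, P_J, P_dev, round(d_star, 3))   # 16.0 18.0 20.25 0.529
```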
Cournot stage game • [Figure: output space (q1, q2) with iso-profit curves and reaction functions] • Firm 1's iso-profit curves • Firm 2's iso-profit curves • Firm 1's reaction function c1(·) • Firm 2's reaction function c2(·) • The Cournot-Nash equilibrium (q1C, q2C) • Outputs with higher profits for both firms • The joint profit-maximising solution (q1J, q2J) • The output (q̄1 or q̄2) that forces the other firm's profit to 0
Repeated Cournot game: Punishment • The standard Cournot model is richer than the simple PD: • the action space for the PD stage game has just the two output levels • a continuum of output levels introduces further possibilities • The minimax profit level for firm 1 in a Cournot duopoly • is zero, not the NE outcome PC • it arises where firm 2 sets its output to q̄2, such that firm 1 can make no profit • Imagine a deviation by firm 1 at time t • it raises q1 above the joint profit-maximising level • Would minimax be used as punishment from t+1 to ∞? • clearly (0, q̄2) is not on firm 2's reaction function • so it cannot be a best response by firm 2 to an action by firm 1 • so it cannot belong to a NE of the subgame • everlasting minimax punishment is not credible in this case
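A small sketch (same assumed linear setup as above) of why everlasting minimax punishment is not credible: the output that drives firm 1's profit to zero is never a best response for firm 2, so it cannot be part of a Nash equilibrium of any subgame:

```python
# Same assumed linear setup: P(Q) = a - Q, zero marginal cost
a = 12.0

def best_response(q_other):
    """Firm 2's profit-maximising output given firm 1's output (and vice versa)."""
    return max((a - q_other) / 2, 0.0)

q2_minimax = a    # with q2 = a the price cannot exceed zero, so firm 1 can earn no profit
for q1 in [0.0, 3.0, 4.0, 6.0]:
    print(q1, best_response(q1))   # 6.0, 4.5, 4.0, 3.0 -- never the minimax output of 12
```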
Repeated Cournot game: Payoffs • [Figure: the space of profits (P1, P2) for the two firms] • The Cournot-Nash outcome (PC, PC) • The joint-profit maximisation point (PJ, PJ) • The minimax outcomes • The payoffs available in the repeated game • Now review Bertrand competition
Bertrand stage game • [Figure: price space (p1, p2), with marginal cost c and monopoly price pM marked on each axis] • Marginal cost pricing • Monopoly pricing • Firm 1's reaction function • Firm 2's reaction function • The Nash equilibrium
Bertrand competition: repeated • NE of the stage game: • set price equal to marginal cost c • results in zero profits • The NE outcome is the minimax outcome • the minimax outcome is implementable as a Nash equilibrium… • …in all the subgames following a defection from cooperation • In repeated Bertrand competition • firms set pM if acting “cooperatively” • and split the profits between them • if one firm deviates from this… • …the other then sets price to c • Repeated Bertrand: result • joint profit maximisation can be enforced through a trigger strategy… • …provided the discount factor is large enough
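A minimal sketch of the corresponding condition for repeated Bertrand (the monopoly profit level of 100 is an arbitrary assumed number): a deviator undercuts slightly, captures roughly the whole monopoly profit for one period and then earns zero forever, which gives a critical discount factor of 1/2:

```python
# Assumed per-period Bertrand profits (the monopoly profit level is arbitrary)
P_M = 100.0
collude = P_M / 2    # each firm's share when both price at p_M
deviate = P_M        # (approximately) the whole monopoly profit from one period of undercutting
punish = 0.0         # Bertrand-Nash profit once both firms revert to price = c

d_star = (deviate - collude) / (deviate - punish)
print(d_star)        # 0.5: joint-profit maximisation is sustainable for any d >= 1/2

def cooperation_pays(d):
    """True if colluding forever is worth at least as much as a one-period deviation."""
    return collude / (1 - d) >= deviate + d * punish / (1 - d)

print(cooperation_pays(0.4), cooperation_pays(0.6))   # False True
```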
Repeated Bertrand game: Payoffs • [Figure: the space of profits (P1, P2) for the two firms] • The Bertrand-Nash outcome (0, 0) • Firm 1 as a monopoly: (PM, 0) • Firm 2 as a monopoly: (0, PM) • The payoffs available in the repeated game
Repeated games: summary • New concepts: • Stage game • History • The Folk Theorem • Trigger strategy • What next? • Games under uncertainty