Graphical Models

Graphical Models Michael Kearns, Michael L. Littman, Satinder Singh. Presenter: Shay Cohen

So far we have seen… • Players' payoffs and the games are represented in tabular form • n agents with 2 actions: n matrices of exponential size, $2^n$ entries each • Needed: more compact representations and algorithms for manipulating them

Graphical models (informal) • An n-player game is given by an undirected graph with n vertices and n matrices • A player's payoff is determined only by its own action and the actions of its neighbors • “Local games” compose the “global game”

Examples • Games with geographical aspects involved (salespersons) • Topology of computer networks with a limited set of neighbors • … and so on

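Reminder… • n-player two-action game: n matrices $M_i$, each of size $2^n$ • $M_i(x)$ specifies the payoff to player i for the pure strategy $x \in \{0,1\}^n$ • Nash equilibrium: a mixed strategy p with $M_i(p) \ge M_i(p[i : p'_i])$ (for all i and for all $p'_i$), where $p[i : p'_i]$ is p with player i's strategy replaced by $p'_i$ • $\epsilon$-Nash equilibrium: $M_i(p) + \epsilon \ge M_i(p[i : p'_i])$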
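The equilibrium condition can be checked mechanically for small games. Below is a minimal sketch, assuming each payoff matrix is stored as a Python dict from pure profiles (tuples in {0,1}^n) to payoffs and each mixed strategy as the probability of playing action 1. The names `expected_payoff` and `is_eps_nash` and the matching-pennies example are illustrative choices, not taken from the slides.

```python
from itertools import product

def expected_payoff(M_i, p):
    """Expected payoff M_i(p) when player j plays action 1 with probability p[j].

    M_i maps each pure profile x in {0,1}^n (a tuple) to player i's payoff.
    """
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        prob = 1.0
        for xj, pj in zip(x, p):
            prob *= pj if xj == 1 else 1.0 - pj
        total += prob * M_i[x]
    return total

def is_eps_nash(M, p, eps=0.0, tol=1e-12):
    """True if no player gains more than eps by a unilateral deviation."""
    for i, M_i in enumerate(M):
        current = expected_payoff(M_i, p)
        # The payoff is linear in p[i], so checking the two pure deviations suffices.
        for a in (0.0, 1.0):
            deviation = list(p)
            deviation[i] = a
            if expected_payoff(M_i, deviation) > current + eps + tol:
                return False
    return True

# Illustrative game (an assumption): matching pennies.
M0 = {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1}
M1 = {x: -value for x, value in M0.items()}
pennies = [M0, M1]

print(is_eps_nash(pennies, [0.5, 0.5]))           # uniform mixing
print(is_eps_nash(pennies, [1.0, 1.0]))           # a pure profile
print(is_eps_nash(pennies, [1.0, 1.0], eps=2.0))  # same profile, large eps
```

The loop over all $2^n$ pure profiles is exactly the exponential cost that the graphical representation is meant to avoid.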

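Graphical Games • Graphical game: (G,M) • G is an undirected graph on n vertices • M is a set of n matrices $M_i$, each representing the payoff of player i as a function of its own action and its neighbors' actions • Size of $M_i$ is $2^k$ when $|N_G(i)| = k$, where the neighborhood $N_G(i)$ includes i itself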
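To see how much the local representation saves, the two sizes can be compared directly. This is a small sketch under the convention that a neighborhood counts the player itself; the function names and the 20-player path graph are hypothetical examples.

```python
def tabular_size(n: int) -> int:
    """Total payoff entries for n two-action players in tabular form."""
    return n * 2 ** n

def graphical_size(neighborhood_sizes: list[int]) -> int:
    """Total payoff entries when player i's matrix covers only its neighborhood.

    neighborhood_sizes[i] counts player i together with its neighbors.
    """
    return sum(2 ** k for k in neighborhood_sizes)

# Hypothetical example: 20 players on a path graph. Interior players have
# neighborhoods of size 3 (self + 2 neighbors), the two end players size 2.
n = 20
path_sizes = [2] + [3] * (n - 2) + [2]

print(tabular_size(n))             # 20 * 2**20 entries
print(graphical_size(path_sizes))  # 2 * 4 + 18 * 8 entries
```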

Algorithm TreeNash • Works in two passes: the downstream pass and the upstream pass • Downstream: passes indicator tables (with witnesses) from the leaves to the root • Upstream: selects witnesses from the root to the leaves (see the attached appendix)

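TreeNash – more details • Downstream: a parent U sends to its child V a binary-valued table T(v,u) s.t. T(v,u)=1 ⟺ there is an NE for the subgame upstream of U, with V fixed to v, in which U=u (v, u are mixed strategies) • Upstream: a child V is assigned V=v and instructs its parents to play values $u_i$ s.t. for all its parents $U_i$: $T(v,u_i)=1$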

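Downstream in general • W – child, V – current node, $U_1, \dots, U_k$ – parents • T(w,v)=1 ⟺ for some $(u_1, \dots, u_k)$: $T(v,u_i)=1$ for all i, and V=v is a b.r. to $U_i=u_i$, W=w (b.r. – best response)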

How? - Downstream (example tree: leaves U and V are the parents of W, and W is the parent of the root Z) • T(w,u)=1 ⟺ u b.r. to w • T(w,v)=1 ⟺ v b.r. to w • T(z,w)=1 ⟺ for some (u,v): T(w,u)=1, T(w,v)=1, and W=w b.r. to U=u, V=v, Z=z • T(z)=1 ⟺ for some w: T(z,w)=1 and Z=z b.r. to W=w (b.r. – best response)

How? – Upstream (same tree: U, V parents of W; W parent of the root Z) • First choose Z=z, W=w s.t. T(z)=1 and T(z,w)=1 • Then choose U=u, V=v s.t. T(w,u)=1 and T(w,v)=1

TreeNash • Theorem: TreeNash computes a Nash equilibrium for the tree game (G,M) • Non-deterministic choices: select all of them, and all NE will be found • But the tables are continuous… How do we compute them?

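Approximate TreeNash • Tables will be of finite size: each mixed strategy is restricted to the $\tau$-grid $\{0, \tau, 2\tau, \dots, 1\}$ • All computations of best responses become computations of $\epsilon$-best responses on the grid • Each table has about $(1/\tau)^2$ entries, and filling one entry searches over the grid settings of the k parents, so the running time is polynomial in $1/\tau$ for fixed k but exponential in k (k parents)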
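The grid version of the two passes can be sketched on the four-node example tree (leaves U and V feed W, and W feeds the root Z). This is a minimal illustration, not the general algorithm: the tree shape is hard-coded, payoff matrices are dicts keyed by pure profiles with the owning player first, and the function names, grid resolution, eps value and example payoffs are all assumptions made for the example.

```python
from itertools import product

def payoff(M, probs):
    """Expected local payoff; probs[j] = P(player j of the neighborhood plays 1)."""
    total = 0.0
    for x in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for xj, pj in zip(x, probs):
            weight *= pj if xj else 1.0 - pj
        total += weight * M[x]
    return total

def is_eps_br(M, own, others, eps):
    """True if the mixed strategy `own` is an eps-best response to `others`."""
    best = max(payoff(M, [a, *others]) for a in (0.0, 1.0))
    return payoff(M, [own, *others]) >= best - eps

def approx_tree_nash(M_U, M_V, M_W, M_Z, grid_steps=4, eps=0.01):
    """Grid TreeNash on the fixed tree: U and V are parents of W, W is parent of root Z."""
    grid = [i / grid_steps for i in range(grid_steps + 1)]
    # Downstream, leaves: T(w,u)=1 iff u is an eps-b.r. to w (same for v).
    T_wu = {(w, u) for w in grid for u in grid if is_eps_br(M_U, u, [w], eps)}
    T_wv = {(w, v) for w in grid for v in grid if is_eps_br(M_V, v, [w], eps)}
    # Downstream, W: T(z,w)=1 iff some witness (u,v) makes w an eps-b.r.
    witness_zw = {}
    for z, w in product(grid, grid):
        for u, v in product(grid, grid):
            if ((w, u) in T_wu and (w, v) in T_wv
                    and is_eps_br(M_W, w, [u, v, z], eps)):
                witness_zw[(z, w)] = (u, v)  # keep the first witness found
                break
    # Root: T(z)=1 iff some w has T(z,w)=1 and z is an eps-b.r. to w.
    for z in grid:
        for w in grid:
            if (z, w) in witness_zw and is_eps_br(M_Z, z, [w], eps):
                # Upstream: follow the stored witnesses back to the leaves.
                u, v = witness_zw[(z, w)]
                return {"U": u, "V": v, "W": w, "Z": z}
    return None

# Illustrative payoffs (assumptions): U and V want to match W, W wants to
# match Z, and Z wants to mismatch W. Keys list the owning player first.
match = {(a, b): 1.0 if a == b else 0.0 for a, b in product((0, 1), repeat=2)}
mismatch = {key: 1.0 - value for key, value in match.items()}
M_W = {(w, u, v, z): 1.0 if w == z else 0.0
       for w, u, v, z in product((0, 1), repeat=4)}

equilibrium = approx_tree_nash(match, match, M_W, mismatch)
print(equilibrium)
```

Returning the first witness found yields one equilibrium; keeping every witness instead would enumerate all grid equilibria, matching the "select all of them" remark for TreeNash.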

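Approximate TreeNash (2) • Lemma: Let p be an NE for (G,M) and let q be the nearest (in $L_1$ metric) mixed strategy on the $\tau$-grid. Then, provided $\tau$ is sufficiently small, q is an $\epsilon$-NE for (G,M), with $\epsilon$ depending on $\tau$ and on the number of neighbors k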

Approximate TreeNash (3) • Theorem: For any $\epsilon > 0$, let $\tau$ be sufficiently small (as a function of $\epsilon$ and k). Then ApproximateTreeNash computes an $\epsilon$-NE for the tree game (G,M).

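Exact TreeNash • Tables will be made of finite unions of rectangles • Each table T(v,u) will be represented by a v-list: a sorted list of v-breakpoints splitting [0,1] into v-intervals • For each v-interval $[v_\ell, v_{\ell+1}]$ there is a subset of [0,1] made of disjoint u-intervals $I_1 \cup \dots \cup I_t$ on which T(v,u)=1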
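One way to picture the v-list is as a small lookup structure. The sketch below assumes closed intervals and stores, for each v-interval, a list of disjoint u-intervals; the class name `VListTable` and the sample numbers are hypothetical, chosen only to show the lookup.

```python
from bisect import bisect_right

class VListTable:
    """T(v,u) as a v-list: sorted v-breakpoints plus disjoint u-intervals per v-interval."""

    def __init__(self, v_points, u_intervals):
        # v_points: sorted breakpoints, starting at 0.0 and ending at 1.0.
        # u_intervals[l]: closed (lo, hi) pairs valid for v in [v_points[l], v_points[l+1]].
        assert len(u_intervals) == len(v_points) - 1
        self.v_points = v_points
        self.u_intervals = u_intervals

    def __call__(self, v, u):
        """Return 1 if (v, u) lies in one of the stored rectangles, else 0."""
        l = min(bisect_right(self.v_points, v) - 1, len(self.u_intervals) - 1)
        cells = [l]
        # An interior breakpoint belongs to both adjacent closed v-intervals.
        if l > 0 and v == self.v_points[l]:
            cells.append(l - 1)
        return int(any(lo <= u <= hi
                       for c in cells
                       for lo, hi in self.u_intervals[c]))

# Hypothetical table: for v in [0, 0.5] only u = 0 is allowed;
# for v in [0.5, 1] u may lie in [0, 0.2] or [0.8, 1].
T = VListTable([0.0, 0.5, 1.0], [[(0.0, 0.0)], [(0.0, 0.2), (0.8, 1.0)]])
print(T(0.25, 0.0), T(0.25, 0.1), T(0.75, 0.9), T(1.0, 0.5))
```

Because the structure is a finite list of rectangles, a witness for any entry with value 1 can be read off directly from the stored intervals.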

Exact TreeNash (2) • Assume the parents' tables $T(v,u_1), \dots, T(v,u_k)$ share a common v-list (by merging their breakpoints) • Downstream: how do we find T(w,v) using them, and keep such a representation by rectangles?

Exact TreeNash (3) • Fix a v-interval I and, for each parent $U_i$, one u-interval $I_i$ from the set of intervals attached to that v-interval • The region where T(w,v)=1 is then of the form W × I. Why? • What is the region W for which some v in the interval is a b.r. to u, w?

Exact TreeNash (4) • Denote by $M_V$ the expected payoff of V • Lemma: if each parent's strategy $u_i$ ranges over its fixed interval $I_i$, then W is either empty, a continuous interval in [0,1], or the union of two intervals.

Exact TreeNash (5) • It can be shown that the tables at the leaves can be represented using at most 3 rectangles • Therefore the rectangle representation can be maintained throughout, but its size is exponential in the number of vertices • Witnesses can be found easily, because the representation is finite

ExactTreeNash • Theorem: ExactTreeNash computes a Nash equilibrium for the tree game (G,M). The algorithm runs in time exponential in the number of vertices of G

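Polynomial algorithm • Uses a downstream pass and an upstream pass as well • Passes breakpoint policies instead of tables (W child of V): an ordered set of W-breakpoints $0 = w_0 < w_1 < \dots < w_t = 1$ with associated V-values $v_1, \dots, v_t$ • Interpretation (“b.p. for V”): if $w_{i-1} < w < w_i$ then V plays $v_i$; if $w = w_i$ then V may play any value between $v_i$ and $v_{i+1}$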
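A breakpoint policy can be read as a piecewise-constant lookup. The sketch below assumes the following reading: W-breakpoints $0 = w_0 < w_1 < \dots < w_t = 1$ with one V-value per open W-interval, and at an interior breakpoint V may play anything between the two adjacent values. The function name `policy_range` and the example policy are hypothetical.

```python
from bisect import bisect_left

def policy_range(w_breaks, v_values, w):
    """Return the (lo, hi) range of values V may play when its child W plays w.

    w_breaks = [0.0, w_1, ..., 1.0], strictly increasing.
    v_values[i - 1] is V's value on the open interval (w_breaks[i - 1], w_breaks[i]).
    """
    assert len(v_values) == len(w_breaks) - 1 and 0.0 <= w <= 1.0
    i = bisect_left(w_breaks, w)  # smallest index with w_breaks[i] >= w
    if w_breaks[i] == w and 0 < i < len(v_values):
        # Interior breakpoint: any value between the two adjacent V-values.
        a, b = v_values[i - 1], v_values[i]
        return (min(a, b), max(a, b))
    idx = min(max(i - 1, 0), len(v_values) - 1)
    return (v_values[idx], v_values[idx])

# Hypothetical policy for a V that wants to match W: play 0 while w < 0.5,
# play 1 while w > 0.5, and play anything at the breakpoint w = 0.5.
w_breaks = [0.0, 0.5, 1.0]
v_values = [0.0, 1.0]
print(policy_range(w_breaks, v_values, 0.2))
print(policy_range(w_breaks, v_values, 0.5))
print(policy_range(w_breaks, v_values, 0.9))
```

Unlike a table, the policy prescribes a single value of V for almost every w, which is what keeps the representation small.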

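How? - Downstream • Denote: - the ordered set of V-breakpoints collected from the policies of V's parents - for each pair of consecutive breakpoints, the set of values W can play that allow V to play any strategy between them, given that V's parents follow their policies - for each b in {0,1}, the set of values W can play such that, if V's parents play according to their policies for V=b, then V=b is a best response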

How? - Downstream • Lemma: each of these sets of W-values is either empty, a single interval, or the union of two intervals • Lemma: together these sets cover [0,1] • Construct the policy for V by covering [0,1] with them; this produces at most 2+l breakpoints, where l is the number of breakpoints of V's parents • How do we start at the leaves?

How? - Upstream • Add a dummy root with constant payoff and no influence on the real root • Once we select a value for the child, the values for its parents are determined by the policies

Running time • Sorting and computing a new breakpoint policy: $O(t \log t)$ (t – number of breakpoints) • The number of breakpoints is bounded by 2n, therefore the total running time over n vertices is $O(n^2 \log n)$

Summary • The first framework gave us: 1. An approximate NE for graphical games that are trees, in polynomial time 2. Exact NE for trees in exponential time (with a representation of ALL NE) • The second algorithm: finds an exact NE for trees in polynomial time
