Best Reply Mechanisms


Presentation Transcript


  1. Best Reply Mechanisms • Justin Thaler and Victor Shnayder

  2. What are best-reply dynamics? • Start with an arbitrary strategy profile • In each step let some player switch his strategy to be a best reply to the current strategies of the others.

  3. What are best-reply dynamics? • Definition: • A repeated-reply mechanism for a private-information game G: • Extensive-form game with perfect recall (same players) • At most M steps. In each step: • A single player announces an element of A_i (player i's action set) • Players play in round-robin order • Stop when all players “pass” in n consecutive steps. • Enforce the action profile of the most recently announced actions • If M steps go by without stopping, penalize the players.

  4. What are best-reply dynamics? • A penalty is needed to ensure that non-convergence is not in the best interest of any player. • Realistic modeling assumption for BGP, TCP, etc. • Best-reply dynamics is the strategy profile of a repeated-reply mechanism in which each player i switches to i’s best reply to the other players’ current strategies each time it is i’s turn.
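The mechanism on slides 3 and 4 can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function names, the payoff-dictionary encoding, the tie-breaking rule (keep the current action) and both example games are assumptions made here. The penalty itself is not modeled; the function only reports whether the run stopped within M steps.

```python
def best_reply(payoffs, actions, player, profile):
    """Best action for `player` against the others' current actions."""
    def value(action):
        trial = profile[:player] + (action,) + profile[player + 1:]
        return payoffs[trial][player]
    best = max(actions[player], key=value)
    # Assumption: on ties the player keeps its current action, i.e. "passes".
    return profile[player] if value(profile[player]) == value(best) else best


def repeated_reply(payoffs, actions, start, max_steps):
    """Round-robin best replies; stop after n consecutive passes or max_steps."""
    n = len(actions)
    profile, passes = tuple(start), 0
    for step in range(max_steps):
        player = step % n
        reply = best_reply(payoffs, actions, player, profile)
        if reply == profile[player]:
            passes += 1
        else:
            profile = profile[:player] + (reply,) + profile[player + 1:]
            passes = 0
        if passes == n:
            return profile, True   # converged: enforce this profile
    return profile, False          # M steps elapsed: the penalty would apply


# Hypothetical game in which A and L are dominant: the dynamics converge.
easy = {("A", "L"): (3, 3), ("A", "R"): (1, 2),
        ("B", "L"): (2, 1), ("B", "R"): (0, 0)}
converged = repeated_reply(easy, [("A", "B"), ("L", "R")], ("B", "R"), 10)

# Matching pennies has no pure NE, so best replies cycle until M runs out.
pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
cycling = repeated_reply(pennies, [("H", "T"), ("H", "T")], ("H", "H"), 20)
```

In the first game the run stops at (A, L) after two consecutive passes; in matching pennies it never stops, which is the situation the penalty is meant to discourage.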

  5. Why best reply dynamics? • If convergence occurs, we have a highly justifiable Nash Equilibrium • Computationally simple • Players only need private information • Feasible in distributed, asynchronous settings • Prescribed by existing protocols (Ex: BGP)

  6. Why best reply dynamics? • In light of Theorems 1 and 2 (which we’ll see soon): • Often gives a non-VCG way of creating incentive-compatible mechanisms (?), sometimes even without money. • Often get collusion-proofness, Pareto-efficiency

  7. Outline • When do best reply dynamics work? • Universal max-solvability (UMS) • Thm: UMS implies convergence to unique NE, collusion-proofness • Example applications (correlated markets, BGP, etc) • Connections to strategy-proofness • Discussion

  8. Universal max-dominance • A subset T of S is universally max-dominated if: [formula not preserved in the transcript] • Very strong condition! • Existence of a universally max-dominated set is strictly stronger than existence of a dominated strategy. • (Dominated strategy: there exist s_i, s_i' such that u_i(s_i, s_-i) < u_i(s_i', s_-i) for all s_-i.)

  9. Universal max-solvability (UMS) • A game G is universally max-solvable if we can iteratively remove universally max-dominated strategy sets until a single strategy remains for each player. • Stronger condition than solvability by iterated removal of strictly dominated strategies (IRSDS)
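For comparison, the weaker notion mentioned on this slide, IRSDS, can be sketched directly. Note that this is plain iterated removal of strictly dominated strategies, not the UMS elimination procedure (whose defining formula is not preserved in the transcript). The function names and the 2×2 example game are assumptions chosen for illustration.

```python
from itertools import product


def strictly_dominated(payoffs, actions, player, a):
    """True if some other remaining action beats `a` against every
    remaining combination of the other players' actions."""
    others = [acts for j, acts in enumerate(actions) if j != player]

    def u(x, rest):
        profile = rest[:player] + (x,) + rest[player:]
        return payoffs[profile][player]

    return any(
        all(u(b, rest) > u(a, rest) for rest in product(*others))
        for b in actions[player] if b != a
    )


def irsds(payoffs, actions):
    """Iteratively remove strictly dominated strategies until none is left."""
    actions = [list(acts) for acts in actions]
    changed = True
    while changed:
        changed = False
        for i in range(len(actions)):
            for a in list(actions[i]):
                if len(actions[i]) > 1 and strictly_dominated(payoffs, actions, i, a):
                    actions[i].remove(a)
                    changed = True
    return actions


# Hypothetical game: U strictly dominates D for the row player; only after D
# is removed does L strictly dominate R for the column player.
game = {("U", "L"): (2, 1), ("U", "R"): (3, 0),
        ("D", "L"): (1, 0), ("D", "R"): (0, 1)}
survivors = irsds(game, [("U", "D"), ("L", "R")])
```

A game is solvable by IRSDS when every list in `survivors` has length one, as it does here.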

  10. Example 1 • [Payoff matrix not preserved in the transcript] • Solvable by IRSDS, but not UMS. • Neither player has a universally max-dominated set. • Note that the unique NE is not PE, and best-reply dynamics are not incentive compatible for the row player.

  11-13. Example 2 (UMS) • [Payoff matrix not preserved in the transcript]

  14-18. Example 3 (UMS) • [Payoff matrix not preserved in the transcript; strategy labels: L, M, R and A, B, C]

  19. Theorems • Theorem 1: G is UMS ⇒ G has a unique, pure NE, and it is collusion-proof. • Corollary: Collusion-proof NE ⇒ NE is Pareto optimal. • Note that solvability by IRSDS suffices for a unique, pure NE; UMS is needed for collusion-proofness and PE.
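The remark that IRSDS alone gives a unique pure NE but not Pareto efficiency can be checked on a small game. The sketch below assumes prisoner's-dilemma payoffs as a stand-in; it is not the payoff matrix of Example 1, which the transcript does not preserve. The function names are hypothetical, and "Pareto dominated" is taken here in the strict sense that every player is strictly better off, which corresponds to a profitable deviation by the coalition of all players.

```python
from itertools import product


def pure_nash(payoffs, actions):
    """All pure profiles from which no single player gains by deviating."""
    equilibria = []
    for profile in product(*actions):
        stable = all(
            payoffs[profile][i]
            >= payoffs[profile[:i] + (a,) + profile[i + 1:]][i]
            for i in range(len(actions))
            for a in actions[i]
        )
        if stable:
            equilibria.append(profile)
    return equilibria


def pareto_dominated(payoffs, profile):
    """True if some other profile makes every player strictly better off."""
    return any(
        all(v > w for v, w in zip(payoffs[other], payoffs[profile]))
        for other in payoffs
    )


# Assumed prisoner's-dilemma payoffs: D strictly dominates C for both players.
pd = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
      ("D", "C"): (4, 0), ("D", "D"): (1, 1)}
equilibria = pure_nash(pd, [("C", "D"), ("C", "D")])
dominated = pareto_dominated(pd, equilibria[0])
```

Here the only pure NE is (D, D), and both players jointly prefer (C, C), so this equilibrium is neither Pareto optimal nor collusion-proof.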

  20. Proof of Theorem 1: • By contradiction: G is UMS, so fix an elimination sequence of dominated strategy sets. • Let s* be the final strategy profile. • If s* is not a collusion-proof NE, some set of players T can deviate and be better off. • Let s be the new strategy profile, in which the players in T change strategy from s*. • Let s_i be the first of the deviating strategies to be eliminated. It was max-dominated when eliminated, so s_i* is strictly better for i, and i cannot be better off: a contradiction.

  21. Example 1 • [Payoff matrix not preserved in the transcript] • Solvable by IRSDS, but not UMS. • Neither player has a universally max-dominated set. • Note that the unique NE is not PE, and best-reply dynamics are not incentive compatible for the row player.

  22. Theorems • Theorem 2: If G is UMS with private information, then best-reply dynamics are incentive-compatible in ex-post NE and converge to the unique NE of the induced full-information game. • Proof: Similar to Theorem 1. The main idea is that a strategy eliminated in step t of the UMS elimination process can never be used after step nt of the best-reply mechanism.

  23. Correlated two-sided markets • Agents: buyers and sellers • Game: weighted bipartite graph -- buyers on one side, sellers on the other • Buyers have preference order over sellers (higher edge weight = higher preference) • Sellers prefer buyers connected by heavier edges

  24. Correlated two-sided markets are UMS • Let e be the maximum-weight edge. Choosing it universally max-dominates all other strategies of both endpoints. • Remove the two endpoints of e and all incident edges, then repeat. • Therefore, best-reply dynamics converge to an ex-post NE.
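The elimination argument above corresponds to a simple greedy procedure: repeatedly match the heaviest remaining edge and delete its endpoints. A minimal sketch follows, assuming distinct edge weights; the function name, the edge-dictionary encoding and the example weights are hypothetical.

```python
def greedy_matching(weights):
    """weights maps (buyer, seller) -> edge weight.
    Repeatedly match the heaviest remaining edge, then drop its endpoints."""
    remaining = dict(weights)
    matching = []
    while remaining:
        buyer, seller = max(remaining, key=remaining.get)
        matching.append((buyer, seller))
        # Remove both endpoints together with all their incident edges.
        remaining = {
            (b, s): w for (b, s), w in remaining.items()
            if b != buyer and s != seller
        }
    return matching


# Hypothetical market with two buyers and two sellers.
market = {("b1", "s1"): 5, ("b1", "s2"): 9,
          ("b2", "s1"): 4, ("b2", "s2"): 7}
matching = greedy_matching(market)
```

In this market b1 and s2 are each other's top choice, so they are matched first; b2 and s1 are then matched from what is left.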

  25. Extended Example: BGP

  26. Internet routing: BGP • Receive update messages from neighbors announcing routes to d. • Choose a single neighbor, whose route you prefer most, to send traffic through. • Announce your new route to all your neighbors. • [Figure: nodes 1 and 2 and destination d; route lists 12d, 1d (node 1) and 21d, 2d (node 2)]

  27. Internet routing: BGP • BGP is asynchronous, distributed • Prescribes best-reply dynamics • But does BGP converge? • And is BGP “incentive compatible”? Do ASes have an incentive to deviate from the protocol?

  28. Does BGP Converge? • We can break this into two questions: • Does a stable solution even exist in the static game? • If so, will BGP find such a solution? • But we only need one answer.

  29. Does a Stable Solution Exist? • [Figure: nodes 1, 2, 3 and destination d; route lists 13d, 1d (node 1), 21d, 2d (node 2) and 32d, 3d (node 3)] • No stable solution exists! • It is actually NP-complete to determine existence in general networks.
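The three-node example can be checked by brute force. The sketch below is a simplified model, not BGP itself: it assumes each route list on the slide is ordered from most to least preferred, that a route such as "13d" is usable by node 1 only while node 3 currently routes along "3d", and that the direct route is always usable. The function names and the string encoding of routes are hypothetical.

```python
from itertools import product


def available(route, state):
    """Route "13d" is usable by node 1 only if next hop 3 currently uses "3d".
    The direct route (suffix "d") is always usable."""
    suffix = route[1:]
    return suffix == "d" or state[suffix[0]] == suffix


def stable_states(prefs):
    """All assignments in which every node uses its most preferred usable route."""
    nodes = sorted(prefs)
    result = []
    for choice in product(*(prefs[v] for v in nodes)):
        state = dict(zip(nodes, choice))
        if all(
            state[v] == next(r for r in prefs[v] if available(r, state))
            for v in nodes
        ):
            result.append(state)
    return result


# Route lists from the slide, assumed to be in decreasing order of preference.
gadget = {"1": ["13d", "1d"], "2": ["21d", "2d"], "3": ["32d", "3d"]}
solutions = stable_states(gadget)
```

Under these assumptions `solutions` is empty: whichever node routes directly, its counterclockwise neighbor prefers to route through it, which in turn invalidates some other node's choice.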

  30. Does BGP Converge When A Stable Solution Exists? • [Figure: nodes 1 and 2 and destination d; route lists 12d, 1d (node 1) and 21d, 2d (node 2)] • Notice that multiple NE exist. • And asynchronous best-reply dynamics do not necessarily converge. • So the game must not be UMS.
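Both observations on this slide can be reproduced in a toy model. The sketch assumes the route lists are ordered from most to least preferred and that a node can use a route through its neighbor only while the neighbor currently routes directly; real BGP message timing is not modeled. The function names and the `schedule` parameter (which nodes update at each step) are hypothetical.

```python
def best_available(node, prefs, state):
    """Most preferred route whose suffix matches the next hop's current route."""
    for route in prefs[node]:
        suffix = route[1:]
        if suffix == "d" or state[suffix[0]] == suffix:
            return route


def run(prefs, state, schedule):
    """Apply best replies; each schedule entry is the set of nodes that update
    simultaneously, all reacting to the state before that step."""
    history = [dict(state)]
    for active in schedule:
        new_state = dict(state)
        for node in active:
            new_state[node] = best_available(node, prefs, state)
        state = new_state
        history.append(dict(state))
    return history


prefs = {"1": ["12d", "1d"], "2": ["21d", "2d"]}
start = {"1": "1d", "2": "2d"}

# One node at a time: the outcome depends on who moves first (two stable states).
first_1 = run(prefs, start, [{"1"}, {"2"}, {"1"}, {"2"}])[-1]
first_2 = run(prefs, start, [{"2"}, {"1"}, {"2"}, {"1"}])[-1]

# Both nodes always update at once: the state oscillates forever.
together = run(prefs, start, [{"1", "2"}] * 4)
```

With sequential updates the run settles in one of two different stable states, and with simultaneous updates it alternates between (1d, 2d) and (12d, 21d) without settling.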

  31. So What Do We Do? • Approach #1: Use mechanism design to achieve IC convergence, but solution must be distributed. • Approach #2: Identify conditions (on network topology and/or AS preferences) under which BGP converges and is IC. • Both approaches are canonical problems in Distributed Algorithmic Mechanism Design.

  32. Approach #2 for Convergence • Griffin et al. (1999): If BGP fails to converge, then there exists a Dispute Wheel. • Each u_i would rather route clockwise through u_{i+1} than through Q_i. • Image source: Levin et al., “Internet Routing and Games,” 2008.

  33. Approach #2 for Convergence • Gao and Rexford (2001): Identified reasonable conditions based on economic structure of the Internet that guarantee No Dispute Wheel and hence convergence. (No bounds on convergence rate given). • But limited progress made until recently on conditions for guaranteeing that BGP is IC.

  34. Approach #2 for Incentive Compatibility • Theorem 3: Assuming non-convergence after n^3 rounds is penalized, and No Dispute Wheel holds, routing games are UMS. • Corollary: Under the above conditions, best-reply strategies are IC in collusion-proof ex-post NE. • Corollary: Under the Gao-Rexford conditions, BGP converges in O(n^3) time and is IC.

  35. Theorem 3 • Proof sketch: It suffices to show how to find the first universally max-dominated action set; later steps are analogous. • Find a node a_1 with at least 2 actions. Let R be a_1’s most preferred existing route. One of two cases must occur:

  36. Theorem 3 • 1. Every node a_2 on R prefers the suffix of R leading from a_2 to d. In this case, if u is the closest node to d on R with at least two actions, then (u, d) universally max-dominates all other actions of u, and we’re done. • 2. Some node a_2 on R prefers some other path over the suffix of R leading from a_2 to d. In this case, we repeat the analysis at a_2. Eventually we either form a dispute wheel or find ourselves in Case 1.

  37. What’s left in Routing? • Complete characterization of BGP convergence (No Dispute Wheel is sufficient, not necessary). • Conditions for convergence to a globally optimal solution. Can it even be efficiently found? • Do mechanism design and/or money have a role to play? • Changes in network topology?

  38. Other applications • Congestion control • Criticism: Best-reply dynamics are only somewhat descriptive of how TCP works in practice. • Cost sharing games • Matching games (stable-roommate, intern assignment) • Auctions (unit demand bidders, GSP) • Relies a lot on VCG results • Main contribution is proof of convergence! (opposite of BGP)

  39. Relationship to DSIC • [Diagram labels: θ, Play s(θ), Ex-post NE, Outcome] • Given a UMS game, best-replying is a strategy that gives an ex-post NE. • Get a direct-revelation, dominant-strategy IC (DSIC) mechanism. • Good: New way to create DSIC mechanisms. • Bad: Impossibility results limit the class of problems amenable to this approach (at least without money or limits on preferences).

  40. Discussion • What is the main contribution? • 1. Sufficient conditions for IC convergence of best-reply dynamics. General enough to encompass many applications, esp. BGP. • 2. Bounds on time to convergence. • 3. New framework for developing IC mechanisms?

  41. Next Steps • Necessary conditions for best-reply dynamics to converge? To be IC (under what definition?)? • Better-reply dynamics? Other types of dynamics aka algorithms? What types of dynamics are reasonable or “natural”?

  42. Economists and Complexity • See recent blog post by Noam Nisan: Does complexity of equilibria matter? • Kamal Jain: “If your laptop can’t find it then neither can the market.” • Jeff Ely: “Solving the n-body problem is beyond the capabilities of the world’s smartest mathematicians. How do those rocks-for-brains planets manage to pull it off?”
