Pruning in Artificial Intelligence: Efficient Minimax Value Computation in Game Trees

This lecture on pruning in artificial intelligence discusses the concept of alpha-beta pruning and its application in computing the minimax value of a game tree or specific game states. It explains the process of recording and updating lower and upper bounds, and when and how to prune branches of the tree. The lecture also covers the basics of game theory, including dominant strategy, Nash Equilibrium, and minimax strategy.

Presentation Transcript


  1. Artificial Intelligence: Representation and Problem Solving • Multi-agent Systems (2): Basic Concepts in Game Theory • 15-381 / 681 • Instructors: Fei Fang (This Lecture) and Dave Touretzky • feifang@cmu.edu • Wean Hall 4126

  2. Recap: Pruning

  3. Recap: Pruning • Alpha-Beta ($\alpha$-$\beta$) pruning: compute the minimax value of a game tree (or a specific state) with minimal exploration • During the search, at state $s$, record the min (for a MIN node) or max (for a MAX node) value $v$ of its successors that have been explored • $v$ is a lower bound of the minimax value for a MAX node (initialized as $-\infty$) and an upper bound for a MIN node (initialized as $+\infty$) • During the search, at state $s$, record the lower bound ($\alpha$, initialized as $-\infty$) and upper bound ($\beta$, initialized as $+\infty$) of the minimax value based on what has been searched so far (not only its explored successors, but also the other explored branches of the tree) • As more successors of $s$ are explored, update the values of $v$, $\alpha$, $\beta$ • Prune a subtree starting at a node if $v$ is outside of the range $(\alpha, \beta)$ • For the MAX player, prune if $v \ge \beta$ • For the MIN player, prune if $v \le \alpha$ • $\alpha$ and $\beta$ are bounds determined globally, $v$ is a bound determined locally; if there is a conflict, it only means the local branch is useless and can be pruned

  4. Recap: Pruning • $\alpha$: lower bound of the minimax value • $\beta$: upper bound of the minimax value
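
The pruning rules in the recap can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: the tree encoding (a leaf is a number, an internal node is a list of children) and the example tree are assumptions made for the sketch.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `node`, pruning with the bounds (alpha, beta).

    Assumed encoding: a leaf is a number (utility for MAX),
    an internal node is a list of child nodes.
    """
    if not isinstance(node, list):           # leaf
        return node
    v = -math.inf if is_max else math.inf    # local bound v
    for child in node:
        child_value = alphabeta(child, not is_max, alpha, beta)
        if is_max:
            v = max(v, child_value)
            if v >= beta:                    # MAX node: prune if v >= beta
                return v
            alpha = max(alpha, v)            # tighten the global lower bound
        else:
            v = min(v, child_value)
            if v <= alpha:                   # MIN node: prune if v <= alpha
                return v
            beta = min(beta, v)              # tighten the global upper bound
    return v

# Hypothetical two-ply tree: MAX at the root, three MIN children.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = alphabeta(tree, is_max=True)
print(root_value)  # 3
```

In this example the second MIN child is abandoned after its first leaf (2 is already at most alpha = 3), which is the "local branch is useless" case described above.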

  5. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  6. From Games To Game Theory • Game theory is the study of strategic decision making involving more than one player • Used in economics, political science, etc. • Pictured: John von Neumann, John Nash, Heinrich Freiherr von Stackelberg • Pictured: winners of the Nobel Memorial Prize in Economic Sciences

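  7. Normal-Form Games • A game in normal form (also called matrix form, strategic form, or standard form) consists of • Set of players $N = \{1, \dots, n\}$ • Set of strategies $S_i$ for each player $i$ • Payoffs / Utility functions $u_i: S_1 \times \dots \times S_n \to \mathbb{R}$ • Players move simultaneously and the game ends immediately afterwards • Strategy profile $\mathbf{s} = (s_1, \dots, s_n)$, $s_i \in S_i$ • Outcome / Utility profile $\mathbf{u} = (u_1(\mathbf{s}), \dots, u_n(\mathbf{s}))$ • Zero-Sum Game: $\sum_{i} u_i(\mathbf{s}) = 0$ for all $\mathbf{s}$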

  8. Example Normal-Form Games • Prisoner’s Dilemma (PD) • Two suspects are charged with a crime • If both Cooperate: 1 year in jail each • If one Defects (rats out the other person) and the other Cooperates: 0 years for the defector (D), 3 years for the cooperator (C) • If both Defect: 2 years in jail each • Variation: Split or Steal https://www.youtube.com/watch?v=p3Uos2fzIJ0

  9. Example Normal-Form Games • Rock-Paper-Scissors (RPS) • Rock beats Scissors • Scissors beats Paper • Paper beats Rock

  10. Some Example Games • Football vs Concert (FvsC) • Historically known as Battle of the Sexes • If football together: Alex , Berry  • If concert together: Alex , Berry  • If not together: Alex , Berry 

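  11. Normal-Form Games • In many cases, each player $i$ has a finite set of actions $A_i$, and player $i$’s strategy set $S_i$ is the set of probability distributions over $A_i$ • Action profile $\mathbf{a} = (a_1, \dots, a_n)$, $a_i \in A_i$ • Set of action profiles $A = A_1 \times \dots \times A_n$ • Utility function can be represented as $u_i: A \to \mathbb{R}$ • or simply written as $u_i(\mathbf{a})$ • Let $s_i(a_i)$ be the probability of choosing action $a_i$, then Expected utility $u_i(\mathbf{s}) = \sum_{\mathbf{a} \in A} u_i(\mathbf{a}) \prod_{j=1}^{n} s_j(a_j)$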

  12. Payoff Matrix • A two-player normal-form game with finite actions can be represented by a (bi)matrix • Player 1: Row player, Player 2: Column player • Often the first number is for the row player, the second for the column player • [Figure: payoff matrices for Player 1 vs Player 2 and for Alex vs Berry]

  13. Pure Strategy, Mixed Strategy, Support • Consider a two-player normal-form game with finite action sets $A_1$, $A_2$ and strategy sets $S_1$, $S_2$ • Pure strategy: choose an action deterministically • Mixed strategy: choose an action randomly • Support: set of actions chosen with non-zero probability • Let $s_i = (x_i^1, \dots, x_i^{|A_i|})$, where $x_i^j$ is the probability of choosing the $j$-th action of player $i$, then • Pure strategy: $x_i^j \in \{0, 1\}$ with $\sum_j x_i^j = 1$ • Mixed strategy: $x_i^j \in [0, 1]$ with $\sum_j x_i^j = 1$ • Support: $\{j : x_i^j > 0\}$
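
To make mixed strategies, supports, and expected utility concrete, here is a small sketch for Rock-Paper-Scissors. The payoff scale (win = 1, tie = 0, loss = -1) and the helper names `expected_utility` and `support` are assumptions introduced for illustration; the slides do not fix them.

```python
from fractions import Fraction as F

# Row player's payoffs in Rock-Paper-Scissors, action order Rock, Paper, Scissors.
# Assumed scale: win = 1, tie = 0, loss = -1.
U1 = [[0, -1, 1],
      [1, 0, -1],
      [-1, 1, 0]]

def expected_utility(U, x, y):
    """Sum over action profiles (i, j) of U[i][j] * x[i] * y[j]."""
    return sum(U[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def support(x):
    """Indices of the actions played with non-zero probability."""
    return [i for i, p in enumerate(x) if p > 0]

uniform = [F(1, 3)] * 3
always_rock = [1, 0, 0]              # a pure strategy is a degenerate mixed strategy
mostly_paper = [0, F(3, 4), F(1, 4)]

print(expected_utility(U1, uniform, uniform))           # 0
print(expected_utility(U1, mostly_paper, always_rock))  # 1/2
print(support(mostly_paper))                            # [1, 2]
```

Exact fractions are used instead of floats so that equalities between expected utilities can be checked without rounding error.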

  14. Quiz 1 • In Rock-Paper-Scissors, if , , what is ? • A: • B: • C: • D:

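  15. Best Response • Let $a_{-i} = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n)$. Action profile can be denoted as $\mathbf{a} = (a_i, a_{-i})$ • Similarly, define $u_{-i}$ and $s_{-i}$ • Best Response: set of actions or strategies leading to the highest expected utility given the strategies or actions of the other players • $a_i^* \in BR(a_{-i})$ iff $u_i(a_i^*, a_{-i}) \ge u_i(a_i, a_{-i})$ for all $a_i \in A_i$ • $s_i^* \in BR(s_{-i})$ iff $u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i})$ for all $s_i \in S_i$ • Theorem (Nash 1951): A mixed strategy is BR iff all actions in the support are BR • $s_i \in BR(s_{-i})$ iff $a_i \in BR(s_{-i})$ for all $a_i$ with $s_i(a_i) > 0$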

  16. Pareto Optimality • An outcome is Pareto optimal if there is no other outcome that all players would prefer, i.e., in which each player gets higher utility • An outcome is Pareto dominated by another outcome if all the players would prefer the other outcome

  17. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  18. Solution Concepts in Normal-Form Games • In normal-form games, how should one player play and what should we expect all the players to play? • Dominant strategy and dominant strategy equilibrium / solution • Nash Equilibrium • Minimax strategy • Maximin strategy • Correlated Equilibrium

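  19. Dominant Strategy • Dominant Strategy • One strategy is always better / never worse / never worse and sometimes better than any other strategy • Focus on a single player’s strategy • Does not always exist • $s_i$ strictly dominates $s_i'$ if $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$ for all $s_{-i}$ • $s_i$ very weakly dominates $s_i'$ if $u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $s_{-i}$ • $s_i$ weakly dominates $s_i'$ if $u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $s_{-i}$ and the inequality is strict for some $s_{-i}$ • $s_i$ is a (strictly / very weakly / weakly) dominant strategy if it dominates every other strategy $s_i' \in S_i$ in the corresponding sense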

  20. Dominant Strategy Equilibrium or Solution • Dominant strategy equilibrium/solution • Every player plays a dominant strategy • Focus on the strategy profile for all players • Does not always exist

  21. Find Dominant Strategy • Only need to enumerate pure strategies • Pure strategy $a_i$ is a strictly dominant strategy if $u_i(a_i, a_{-i}) > u_i(a_i', a_{-i})$ for all $a_i' \ne a_i$ and all $a_{-i}$ • If a strategy is a strictly/weakly dominant strategy, it has to be a pure strategy • A mixed strategy is a very weakly dominant strategy iff all actions in its support are very weakly dominant strategies
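
The enumeration over pure strategies can be sketched as follows for the Prisoner's Dilemma of slide 8. Treating utility as negative years in jail is an assumption of this sketch, and the helper names are hypothetical.

```python
# Prisoner's Dilemma from slide 8; actions: 0 = Cooperate, 1 = Defect.
# Assumed utility scale: utility = -(years in jail).
U1 = [[-1, -3],
      [0, -2]]    # row player's payoffs
U2 = [[-1, 0],
      [-3, -2]]   # column player's payoffs

def strictly_dominant_row_action(U):
    """Return the row that strictly beats every other row against every
    column, or None if no such row exists."""
    rows, cols = range(len(U)), range(len(U[0]))
    for a in rows:
        if all(U[a][c] > U[b][c] for b in rows if b != a for c in cols):
            return a
    return None

def transpose(M):
    """Swap rows and columns so the column player can be treated as a row player."""
    return [list(col) for col in zip(*M)]

row_choice = strictly_dominant_row_action(U1)
col_choice = strictly_dominant_row_action(transpose(U2))
print(row_choice, col_choice)  # 1 1, i.e. (Defect, Defect)
```

Both players have Defect as a strictly dominant strategy, so (Defect, Defect) is a dominant strategy equilibrium under these assumed payoffs.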

  22. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  23. Nash Equilibrium • Nash Equilibrium (NE) • Every player’s strategy is a best response to the others’ strategy profile • Focus on the strategy profile for all players • One cannot gain by unilateral deviation • Pure Strategy Nash Equilibrium (PSNE) • Mixed Strategy Nash Equilibrium • Formally • $\mathbf{a} = (a_1, \dots, a_n)$ is a PSNE if $a_i \in BR(a_{-i})$ for all $i$ • $\mathbf{s} = (s_1, \dots, s_n)$ is an NE if $s_i \in BR(s_{-i})$ for all $i$

  24. Nash Equilibrium • What are the PSNEs in the following games? • In FvsC, is Alex: (2/3, 1/3), Berry: (1/3, 2/3) a mixed strategy NE? • [Figure: payoff matrices of the example games]

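  25. Nash Equilibrium • Theorem (Nash 1951): NE always exists in finite games • Finite game: finitely many players and finitely many actions per player, i.e., $n < \infty$ and $|A_i| < \infty$ for all $i$ • NE: pure or mixed • Proof: through Brouwer's fixed point theorem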

  26. Find PSNE • Find pure strategy Nash Equilibrium (PSNE) • Enumerate all action profiles • For each action profile, check for each player whether there is no incentive to deviate, i.e., there exists no other action of this player that leads to a higher payoff, given the actions of the other players • Can we do better?
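
The brute-force check described on this slide can be sketched as below. The Prisoner's Dilemma payoffs follow slide 8 with utility taken as negative jail years; the Football vs Concert payoffs (2 and 1 when together, 0 otherwise) are the standard Battle of the Sexes values and are an assumption here.

```python
from itertools import product

def pure_nash_equilibria(U1, U2):
    """Return all (row, col) profiles where neither player gains by a
    unilateral deviation. U1/U2 are the row/column player's payoffs."""
    m, n = len(U1), len(U1[0])
    equilibria = []
    for i, j in product(range(m), range(n)):
        row_ok = all(U1[i][j] >= U1[k][j] for k in range(m))  # row cannot improve
        col_ok = all(U2[i][j] >= U2[i][l] for l in range(n))  # column cannot improve
        if row_ok and col_ok:
            equilibria.append((i, j))
    return equilibria

# Assumed Football vs Concert payoffs; actions: 0 = Football, 1 = Concert.
fvsc = ([[2, 0], [0, 1]], [[1, 0], [0, 2]])
# Prisoner's Dilemma; actions: 0 = Cooperate, 1 = Defect.
pd = ([[-1, -3], [0, -2]], [[-1, 0], [-3, -2]])

print(pure_nash_equilibria(*fvsc))  # [(0, 0), (1, 1)]
print(pure_nash_equilibria(*pd))    # [(1, 1)]
```

The cost is one pass over every action profile with a deviation check per player, which is why the slide asks whether dominated actions can be removed first.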

  27. Find PSNE • Strictly dominated strategies cannot be part of an NE • $a_i$ is strictly dominated if $\exists s_i'$ such that $u_i(a_i, s_{-i}) < u_i(s_i', s_{-i})$ for all $s_{-i}$ • $s_i'$ can be a mixed strategy • Only need to check pure strategies of other players, i.e., $u_i(a_i, a_{-i}) < u_i(s_i', a_{-i})$ for all $a_{-i}$ • Such a strategy can never be BR, thus not part of NE • A weakly dominated strategy can be part of an NE • Remove strictly dominated actions (pure strategies) and then find PSNE in the remaining game • Can we do better?

  28. Find PSNE • Iterative Elimination of Strictly Dominated Strategies • In each step, eliminate strictly dominated strategies (pure strategies, i.e., actions) from each player’s strategy space. Repeat until no more actions can be removed • When the remaining game has only one action for each player, that action profile is the unique Nash Equilibrium of the game and the game is called dominance solvable • It may not be a dominant strategy equilibrium • When the remaining game has more than one action for some players, find PSNE in the remaining game • Order of removal does not matter
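
A simplified sketch of the elimination loop follows. It only removes actions that are strictly dominated by another pure action; domination by a mixed strategy, which the lecture also allows, would need an LP and is left out. The 2x3 example game is hypothetical.

```python
def iesds(U1, U2):
    """Iteratively remove actions strictly dominated by another *pure* action.

    Returns the surviving (row indices, column indices). Domination by mixed
    strategies is deliberately not checked in this sketch.
    """
    rows = list(range(len(U1)))
    cols = list(range(len(U1[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:   # iterate over a copy while removing
            if any(all(U1[o][c] > U1[r][c] for c in cols) for o in rows if o != r):
                rows.remove(r)
                changed = True
        for c in cols[:]:
            if any(all(U2[r][o] > U2[r][c] for r in rows) for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Hypothetical game: rows Up/Down, columns Left/Middle/Right.
U1 = [[1, 1, 0],
      [0, 0, 2]]
U2 = [[0, 2, 1],
      [3, 1, 0]]
print(iesds(U1, U2))  # ([0], [1]): Right, then Down, then Left are removed
```

Here neither row is dominated at first; only after Right is removed does Down become dominated, which illustrates why the elimination has to be repeated. The surviving profile (Up, Middle) makes this example dominance solvable.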

  29. Find PSNE • If you iteratively eliminate very weakly dominated strategies, at least one equilibrium is preserved • $a_i$ is very weakly dominated if $\exists s_i' \ne a_i$ such that $u_i(a_i, a_{-i}) \le u_i(s_i', a_{-i})$ for all $a_{-i}$ • Order of removal can matter

  30. Find PSNE • To summarize • To find all PSNE • Iterative Elimination of Strictly Dominated Strategies • Enumerate all action profiles in the remaining game, and for each action profile, check that none of the players has an incentive to deviate • To find a PSNE • Iterative Elimination of (Very Weakly) Dominated Strategies • Search the action profiles in the remaining game until a PSNE is found

  31. Find All NEs (PSNE and Mixed Strategy NE) • Special case: two-player, zero-sum game • NE = Minimax = Maximin, solved by LP (will introduce later) • In practice, available solvers/packages: nashpy (Python), Gambit project (http://www.gambit-project.org/) • Two-player, general-sum bimatrix game: Support Enumeration Method

  32. Find All NEs • Recall: A mixed strategy is BR iff all actions in the support are BR • To find all NEs, think from the inverse direction: enumerate supports • If we know that in the NE, for player $i$, actions $a_i^1$, $a_i^2$, and $a_i^3$ are in the support of $s_i$, what does it mean? • They are all BR to the other player’s strategy, and therefore • 1) Actions $a_i^1$, $a_i^2$, and $a_i^3$ are chosen with non-zero probability, and the probabilities of choosing them sum up to 1 • 2) Actions $a_i^1$, $a_i^2$, and $a_i^3$ lead to exactly the same expected utility • This gives us a number of equations! • 3) The expected utility of taking actions $a_i^1$, $a_i^2$, and $a_i^3$ is not lower than that of any other action • These are necessary conditions for $s_i$ with support $\{a_i^1, a_i^2, a_i^3\}$ being part of an NE

  33. Find All NEs • If the support for Alex is (Football, Concert) and for Berry is (Football, Concert), i.e., each action is chosen with non-zero probability, then actions F and C lead to exactly the same expected utility for Alex when fixing Berry’s strategy, and actions F and C lead to exactly the same expected utility for Berry when fixing Alex’s strategy • Assume Alex’s strategy is $(p, 1-p)$ and Berry’s strategy is $(q, 1-q)$, where $p$ and $q$ are the probabilities of choosing Football; then the two indifference conditions determine $p$ and $q$ • Now check that $0 < p < 1$ and $0 < q < 1$. It is indeed a valid NE with the specified support.

  34. Quiz 2 • What is the probability of Berry choosing Football in NE with support size=2? • A: • B: • C: • D: No such NE

  35. Find All NEs • Support Enumeration Method (for bimatrix games) • Enumerate all support pairs with the same size, for size = 1 to $\min(|A_1|, |A_2|)$ • For each possible support pair • Compute the probabilities so that (1) the other player is indifferent among the actions in their support, i.e., the expected utility (EU) of choosing any action in the support is the same, and (2) the probabilities of taking actions in the support sum up to 1 • Check if the resulting probabilities are consistent with our assumption: all actions in the support are chosen with positive probability • Check that there is no incentive to deviate, i.e., no action outside the support leads to higher expected utility

  36. Find All NEs • Support size = 1 • Alex: Football, Berry: Football: is an NE • Alex: Football, Berry: Concert: is not an NE • Berry’s action Football, which is not in the support, leads to higher utility for Berry • Alex: Concert, Berry: Football: is not an NE • Alex’s action Football, which is not in the support, leads to higher utility for Alex • Alex: Concert, Berry: Concert: is an NE • Support size = 2: • Alex: (Football, Concert), Berry: (Football, Concert) • Alex: $(2/3, 1/3)$, Berry: $(1/3, 2/3)$ is an NE
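
For a 2x2 bimatrix game, the size-2 step of support enumeration reduces to solving two indifference equations. The sketch below assumes the standard Battle of the Sexes payoffs for Football vs Concert (2 and 1 when together, 0 otherwise); these values are an assumption, chosen because they reproduce the mixed strategy quoted on slide 24. The function name is hypothetical.

```python
from fractions import Fraction

def mixed_ne_2x2(U1, U2):
    """Full-support NE of a 2x2 bimatrix game, or None if there is none.

    p = P(row player picks action 0) makes the column player indifferent;
    q = P(column player picks action 0) makes the row player indifferent.
    """
    # Column indifferent: p*U2[0][0] + (1-p)*U2[1][0] == p*U2[0][1] + (1-p)*U2[1][1]
    den_p = U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1]
    # Row indifferent: q*U1[0][0] + (1-q)*U1[0][1] == q*U1[1][0] + (1-q)*U1[1][1]
    den_q = U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1]
    if den_p == 0 or den_q == 0:
        return None                  # degenerate: no unique solution
    p = Fraction(U2[1][1] - U2[1][0], den_p)
    q = Fraction(U1[1][1] - U1[0][1], den_q)
    if 0 < p < 1 and 0 < q < 1:      # both actions really are in the support
        return (p, 1 - p), (q, 1 - q)
    return None

# Assumed payoffs; rows = Alex, columns = Berry; actions: 0 = Football, 1 = Concert.
alex = [[2, 0], [0, 1]]
berry = [[1, 0], [0, 2]]
print(mixed_ne_2x2(alex, berry))  # Alex (2/3, 1/3), Berry (1/3, 2/3)
```

With both actions of each player in the support there is no action outside the support, so the final "no incentive to deviate" check is vacuous in the 2x2 case; larger games need it, plus a linear solve per support pair.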

  37. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  38. Maximin Strategy • Maximin Strategy (applicable to multiplayer games) • Maximize worst-case expected utility • Maximin strategy for player $i$ is $\arg\max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i})$ • Maximin value for player $i$ is $\max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i})$ (also called safety level) • Focus on a single player’s strategy

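  39. Compute Maximin Strategy • For bimatrix games, the maximin strategy can be computed through linear programming • Let $u^1_{ij}$ be player 1’s payoff when player 1 chooses action $i$ and player 2 chooses action $j$ • To get the maximin strategy of player 1, denote $s_1 = \mathbf{x} = (x_1, \dots, x_{|A_1|})$, where $x_i$ is the probability of choosing the $i$-th action of player 1 • Now we need to find the value of $\mathbf{x}$ that solves $\max_{\mathbf{x}} \min_{s_2} u_1(\mathbf{x}, s_2)$ s.t. $\sum_i x_i = 1$, $x_i \ge 0$ • Only need to check pure strategies of player 2, i.e., $\max_{\mathbf{x}} \min_{j} \sum_i x_i u^1_{ij}$. Recall the theorem of BR: A mixed strategy is BR iff all actions in the support are BR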

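  40. Compute Maximin Strategy • Convert to LP • Claim: $\mathbf{x}^*$ is an optimal solution for $P_1$ iff it is an optimal solution for $P_2$, which is an LP • $P_1$: $\max_{\mathbf{x}} \min_j \sum_i x_i u^1_{ij}$ s.t. $\sum_i x_i = 1$, $x_i \ge 0\ \forall i$, where $u^1_{ij}$ is player 1’s payoff for the action pair $(i, j)$ • $P_2$: $\max_{\mathbf{x}, v} v$ s.t. $v \le \sum_i x_i u^1_{ij}\ \forall j$, $\sum_i x_i = 1$, $x_i \ge 0\ \forall i$ • Let $U$ be the payoff matrix for player 1 (row player). Then $P_2$ can be rewritten in matrix form: $\max_{\mathbf{x}, v} v$ s.t. $v \mathbf{1} \le U^{\top} \mathbf{x}$, $\mathbf{1}^{\top} \mathbf{x} = 1$, $\mathbf{x} \ge \mathbf{0}$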

  41. Compute Maximin Strategy • [Worked example: the maximin LP written out for the Alex vs Berry payoff matrix]
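
As a small numerical companion to the maximin computation, the sketch below handles a row player with exactly two actions. It does not call an LP solver: it swaps in a simpler technique that is valid only in this special case, namely that the worst-case payoff is a concave piecewise-linear function of one probability, so the optimum lies at an endpoint or where two column lines cross. The Football vs Concert payoffs for Alex (2, 0 / 0, 1) are assumed standard values, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def maximin_two_actions(U):
    """Maximin strategy (p, 1-p) and maximin value for a row player with two actions.

    U[i][j] is the row player's payoff. The worst case
    min_j (p*U[0][j] + (1-p)*U[1][j]) is maximised at p = 0, p = 1,
    or at a crossing point of two column lines.
    """
    cols = range(len(U[0]))
    candidates = {Fraction(0), Fraction(1)}
    for j, k in combinations(cols, 2):
        # Solve p*U[0][j] + (1-p)*U[1][j] == p*U[0][k] + (1-p)*U[1][k] for p.
        den = (U[0][j] - U[1][j]) - (U[0][k] - U[1][k])
        if den != 0:
            p = Fraction(U[1][k] - U[1][j], den)
            if 0 <= p <= 1:
                candidates.add(p)

    def worst_case(p):
        return min(p * U[0][j] + (1 - p) * U[1][j] for j in cols)

    best_p = max(candidates, key=worst_case)
    return (best_p, 1 - best_p), worst_case(best_p)

# Assumed payoffs for Alex (row player); actions: 0 = Football, 1 = Concert.
strategy, value = maximin_two_actions([[2, 0], [0, 1]])
print(strategy, value)  # strategy (1/3, 2/3) with safety level 2/3
```

Under these assumed payoffs Alex's maximin strategy (1/3, 2/3) differs from Alex's mixed NE strategy (2/3, 1/3), which is expected in a general-sum game; the two coincide only in the zero-sum case.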

  42. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  43. Minimax Strategy • Minimax Strategy (makes sense in two-player games) • Minimize the best-case expected utility of the other player (just want to harm your opponent) • Minimax strategy for player $i$ is $\arg\min_{s_i} \max_{s_{-i}} u_{-i}(s_i, s_{-i})$ • Minimax value for player $-i$ is $\min_{s_i} \max_{s_{-i}} u_{-i}(s_i, s_{-i})$ • Focus on a single player’s strategy • Can be computed through linear programming

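  44. Compute Minimax Strategy • For bimatrix games, the minimax strategy can be computed through linear programming • Let $u^2_{ij}$ be player 2’s payoff when player 1 chooses action $i$ and player 2 chooses action $j$. Denote $s_1 = \mathbf{x} = (x_1, \dots, x_{|A_1|})$, where $x_i$ is the probability of choosing the $i$-th action of player 1. Then the minimax strategy of player 1 can be found by solving the following LP: $\min_{\mathbf{x}, v} v$ s.t. $v \ge \sum_i x_i u^2_{ij}\ \forall j$, $\sum_i x_i = 1$, $x_i \ge 0\ \forall i$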

  45. Compute Minimax Strategy • [Worked example: the minimax LP written out for the Alex vs Berry payoff matrix]

  46. Outline • Overview • Notations • Basic Concepts • Solution Concepts • Dominant Strategy • Nash Equilibrium (NE) • Maximin Strategy • Minimax Strategy • Minimax Theorem

  47. Minimax Theorem • Theorem (von Neumann 1928, Nash 1951): • Minimax = Maximin = NE in two-player zero-sum games • Formally, every two-player zero-sum game has a unique value $v$ such that • Player 1 can guarantee a value of at least $v$ • Player 2 can guarantee a loss of at most $v$ • $v$ is called the value of the game • All NEs lead to the same utility profile in a two-player zero-sum game
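
The two guarantees in the theorem can be checked numerically for Rock-Paper-Scissors. The sketch assumes the payoff scale win = 1, tie = 0, loss = -1 and tests the uniform strategy for both players; it verifies the value for this one game and is not a general solver.

```python
from fractions import Fraction

# Player 1's payoffs in Rock-Paper-Scissors (assumed scale: win 1, tie 0, loss -1).
# The game is zero-sum, so player 2's payoff is the negative of each entry.
U1 = [[0, -1, 1],
      [1, 0, -1],
      [-1, 1, 0]]
x = [Fraction(1, 3)] * 3   # candidate strategy for player 1
y = [Fraction(1, 3)] * 3   # candidate strategy for player 2

# What player 1 guarantees with x: the worst case over player 2's pure actions.
guaranteed = min(sum(x[i] * U1[i][j] for i in range(3)) for j in range(3))
# What player 2 concedes at most with y: the best case over player 1's pure actions.
conceded = max(sum(U1[i][j] * y[j] for j in range(3)) for i in range(3))

print(guaranteed, conceded)  # 0 0
```

Any strategy of player 1 gives a lower bound on the value and any strategy of player 2 gives an upper bound, so the two numbers meeting at 0 shows that the value of this game is 0 and that the uniform strategies are optimal for both players.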

  48. Summary • A game in normal form consists of • Set of players, Set of strategies, Payoffs / Utility functions • Players move simultaneously • For a bimatrix game, we expect you to be able to find:

  49. Reading • Textbook Chapter 17.5

  50. Additional Resources (optional) • Online course • https://www.youtube.com/user/gametheoryonline
