Adversarial Search

This outline gives an overview of adversarial search in game playing, including game trees, the minimax algorithm, alpha-beta pruning, and different types of games. It covers deterministic and chance games, perfect and imperfect information, and utility values.

cpace

Presentation Transcript


  1. Adversarial Search CMPT 420 / CMPG 720

  2. Outline • Game playing • Game trees • Minimax • Alpha-beta pruning

  3. Games vs. search problems • competitive environments: agents’ goals are in conflict • adversarial search problems (games)

  4. Types of Games • deterministic vs. chance • perfect information vs. imperfect information

  5. Games • deterministic, fully observable, turn-taking, two-player, zero-sum games • Utility values at the end are equal and opposite • Example: Tic-tac-toe

  6. Game Search Formulation • Two players, MAX and MIN, take turns (with MAX playing first) • A game is formulated as a search problem with six elements: S0, Player(s), Actions(s), Result(s,a), Terminal-test(s), Utility(s,p)

  12. Game Search Formulation • S0: initial state • Player(s): which player has the move in state s • Actions(s): set of legal moves in state s • Result(s,a): transition model (the state that results from move a in state s) • Terminal-test(s): true/false (true exactly in terminal states) • Utility(s,p): utility function; defines the final value of a game that ends in terminal state s for a player p • zero-sum games: the total payoff to all players is the same for every outcome of the game
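
The six elements above can be sketched for tic-tac-toe as follows. This is only an illustrative sketch: the board encoding (a tuple of 9 cells), the function names, and the utility values +1 / 0 / -1 are assumptions made here, not part of the slides.

```python
from typing import List, Optional, Tuple

# Assumed encoding: a state is a tuple of 9 cells, each "X", "O" or " ".
State = Tuple[str, ...]

# The eight winning lines of the 3x3 board (rows, columns, diagonals).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

S0: State = (" ",) * 9  # initial state: empty board


def player(s: State) -> str:
    """Player(s): MAX ("X") moves first, so X moves when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """Actions(s): indices of the empty cells."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: int) -> State:
    """Result(s,a): transition model, places the mover's mark on cell a."""
    return s[:a] + (player(s),) + s[a + 1:]


def winner(s: State) -> Optional[str]:
    """Helper (not one of the six elements): the mark that owns a full line."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """Terminal-test(s): someone has won or the board is full."""
    return winner(s) is not None or " " not in s


def utility(s: State, p: str) -> int:
    """Utility(s,p): assumed values +1 win, -1 loss, 0 draw (equal and opposite)."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

With these assumed values, utility(s, "X") + utility(s, "O") is 0 in every terminal state, which is the zero-sum property in its simplest form.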

  13. Game tree (1-player)

  14. Partial Game Tree for Tic-Tac-Toe

  15. Optimal strategies • MAX uses the search tree to determine its next move. • Assumption: both players play optimally! • Given a game tree, the optimal strategy can be determined by using the minimax value of each node

  17. Minimax • The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally. • Minimax(s) = • Utility(s) if Terminal-test(s) • max over a in Actions(s) of Minimax(Result(s,a)) if Player(s) = MAX • min over a in Actions(s) of Minimax(Result(s,a)) if Player(s) = MIN

  18. Optimal Play [Figure: game tree with alternating MAX and MIN levels and minimax values backed up from the leaves; the highlighted path is the optimal play]

  19. Two-Ply Game Tree

  22. Two-Ply Game Tree • The minimax decision • Minimax maximizes the worst-case outcome for MAX.

  23. What if MIN does not play optimally? • Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. • But if MIN does not play optimally, MAX can do even better.
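
A small check of this claim on a two-ply tree. The leaf values are assumptions chosen for illustration (each inner list is one MIN node); the point is only that every reply available to MIN after MAX's minimax move yields at least the minimax value.

```python
# Assumed two-ply tree: MAX chooses a MIN node, MIN then chooses a leaf.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

# Minimax value of the root: MAX picks the MIN node with the largest minimum.
minimax_value = max(min(leaves) for leaves in tree)

# The minimax decision: index of that MIN node.
best_move = max(range(len(tree)), key=lambda i: min(tree[i]))

# Whatever MIN answers after best_move, MAX gets at least the minimax value.
outcomes_after_best_move = tree[best_move]
guaranteed = all(o >= minimax_value for o in outcomes_after_best_move)

print(minimax_value, best_move, outcomes_after_best_move, guaranteed)
```

If MIN answers optimally MAX receives exactly the minimax value; any other answer from MIN can only leave MAX better off.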

  24. Minimax Algorithm

      function MINIMAX-DECISION(state) returns an action
        inputs: state, current state in game
        v ← MAX-VALUE(state)
        return the action in SUCCESSORS(state) with value v

      function MAX-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← -∞
        for a, s in SUCCESSORS(state) do
          v ← MAX(v, MIN-VALUE(s))
        return v

      function MIN-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← +∞
        for a, s in SUCCESSORS(state) do
          v ← MIN(v, MAX-VALUE(s))
        return v
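
A minimal runnable sketch of the same algorithm, assuming the game tree is given explicitly as nested Python lists: an inner list is a node whose children are its elements, and a bare number is a terminal state with that utility for MAX. The tree representation and the function names are assumptions for illustration only.

```python
import math
from typing import List, Union

# Assumed representation: a number is a terminal utility, a list is a node.
Tree = Union[int, List["Tree"]]


def max_value(node: Tree) -> float:
    """Value of a node where MAX is to move."""
    if not isinstance(node, list):      # terminal test
        return node                     # utility
    v = -math.inf
    for child in node:
        v = max(v, min_value(child))
    return v


def min_value(node: Tree) -> float:
    """Value of a node where MIN is to move."""
    if not isinstance(node, list):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child))
    return v


def minimax_decision(root: List[Tree]) -> int:
    """Index of the move at the root (MAX to move) with the highest value."""
    return max(range(len(root)), key=lambda i: min_value(root[i]))


# Two-ply example with assumed leaf values.
two_ply = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(two_ply), minimax_decision(two_ply))
```

On this tree the three MIN nodes have values 3, 2 and 2, so the root value is 3 and the chosen move is the first one (index 0).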

  25. Properties of minimax • Complete? Yes (if the tree is finite) • Optimal? Yes (against an optimal opponent) • Time complexity? O(b^m) • Space complexity? O(bm) (depth-first exploration) • For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is infeasible
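
To make the chess figures concrete, the following sketch works out the orders of magnitude for b ≈ 35 and m ≈ 100. It is plain arithmetic; the variable names are chosen here for illustration.

```python
import math

b, m = 35, 100                    # branching factor and depth quoted for chess

log10_time = m * math.log10(b)    # log10(b^m), avoids printing a 155-digit number
space_nodes = b * m               # O(bm) nodes held by depth-first exploration

print(f"time  ~ b^m = 10^{log10_time:.1f} nodes")
print(f"space ~ b*m = {space_nodes} nodes")
```

The time bound comes out at roughly 10^154 nodes, while the space bound is only a few thousand nodes, which is why time rather than memory rules out an exact solution.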

  26. Alpha-Beta Pruning • Problem with minimax search: the number of states examined is exponential in the depth of the tree • Can we cut the exponent in half? • It is possible to compute the minimax decision without looking at every node. • pruning: eliminate some parts of the tree from consideration

  27. Alpha-beta pruning • We can improve on the performance of the minimax algorithm through alpha-beta pruning [Figure: MAX / MIN / MAX tree with leaf values 2, 7, 1 and one unevaluated node marked "?"]

  28. Alpha-beta pruning • We can improve on the performance of the minimax algorithm through alpha-beta pruning • We don’t need to compute the value at the node marked "?". • No matter what it is, it can’t affect the value of the root node. [Figure: MAX / MIN / MAX tree with leaf values 2, 7, 1 and one unevaluated node marked "?"]

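  29. Alpha-Beta Example • Do DFS until the first leaf • Range of possible values • root: [-∞,+∞] • first MIN node: [-∞,+∞]

  30. Alpha-Beta Example • Do DFS until the first leaf • Range of possible values • root: [-∞,+∞] • first MIN node: [-∞,+∞]

  31. Alpha-Beta Example (continued) • root: [-∞,+∞] • first MIN node: [-∞,3]

  32. Alpha-Beta Example (continued) • root: [-∞,+∞] • first MIN node: [-∞,3]

  33. Alpha-Beta Example (continued) • root: [-∞,+∞] • first MIN node: [3,3]

  34. Alpha-Beta Example (continued) • root: [3,+∞] • first MIN node: [3,3]

  35. Alpha-Beta Example (continued) • root: [3,+∞] • first MIN node: [3,3] • second MIN node: [-∞,+∞]

  36. Alpha-Beta Example (continued) • root: [3,+∞] • first MIN node: [3,3] • second MIN node: [-∞,2]

  37. Alpha-Beta Example (continued) • root: [3,+∞] • first MIN node: [3,3] • second MIN node: [-∞,2] (this node is worse for MAX)

  38. Alpha-Beta Example (continued) • root: [3,14] • first MIN node: [3,3] • second MIN node: [-∞,2] • third MIN node: [-∞,+∞]

  39. Alpha-Beta Example (continued) • root: [3,14] • first MIN node: [3,3] • second MIN node: [-∞,2] • third MIN node: [-∞,14]

  40. Alpha-Beta Example (continued) • root: [3,5] • first MIN node: [3,3] • second MIN node: [-∞,2] • third MIN node: [-∞,5]

  41. Alpha-Beta Example (continued) • first MIN node: [3,3] • second MIN node: [-∞,2] • third MIN node: [2,2]

  42. Alpha-Beta Example (continued) • root: [3,3] • first MIN node: [3,3] • second MIN node: [-∞,2] • third MIN node: [2,2]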

  43. α-β pruning example • Minimax(root) • = max(min(3,12,8), min(2,x,y), min(14,5,2)) • = max(3, z, 2), where z = min(2,x,y) ≤ 2 • = 3

  44. α-β pruning • We made the same minimax decision without ever evaluating two of the leaf nodes! • The value of the root is independent of the values of the pruned leaves x and y. • It is possible to prune entire subtrees.
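
The independence from the pruned leaves can be checked by brute force: the sketch below evaluates the root expression from the example for many candidate values of x and y. The range of candidate values is an arbitrary assumption.

```python
from itertools import product


def root_value(x: int, y: int) -> int:
    """Minimax value of the example root for given values of the pruned leaves."""
    return max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))


# Try every pair (x, y) in an arbitrary range and collect the distinct results.
candidates = range(-100, 101)
values = {root_value(x, y) for x, y in product(candidates, repeat=2)}

print(values)  # → {3}
```

Because min(2, x, y) can never exceed 2, the middle term never beats the 3 already secured by the first MIN node, so the root value is 3 for every choice of x and y.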

  45. Why is it called α-β? • α = value of the best choice found so far at any choice point along the path for MAX • If v is worse than α, MAX will avoid it → prune that branch • Define β similarly for MIN

  46. Alpha-Beta Algorithm

      function ALPHA-BETA-SEARCH(state) returns an action
        inputs: state, current state in game
        v ← MAX-VALUE(state, -∞, +∞)
        return the action in SUCCESSORS(state) with value v

      function MAX-VALUE(state, α, β) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← -∞
        for a, s in SUCCESSORS(state) do
          v ← MAX(v, MIN-VALUE(s, α, β))
          if v ≥ β then return v
          α ← MAX(α, v)
        return v

  47. Alpha-Beta Algorithm

      function MIN-VALUE(state, α, β) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← +∞
        for a, s in SUCCESSORS(state) do
          v ← MIN(v, MAX-VALUE(s, α, β))
          if v ≤ α then return v
          β ← MIN(β, v)
        return v
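
A runnable sketch of alpha-beta search, assuming the game tree is given as nested Python lists (a bare number is a terminal utility for MAX). The tree representation, the function names, and the list of visited leaves are assumptions added for illustration; the list makes the pruning visible.

```python
import math


def alpha_beta_search(root):
    """Return (root value, index of best move, leaves actually evaluated)."""
    visited = []                       # records each leaf that is evaluated

    def max_value(node, alpha, beta):
        if not isinstance(node, list):
            visited.append(node)
            return node
        v = -math.inf
        for child in node:
            v = max(v, min_value(child, alpha, beta))
            if v >= beta:              # MIN above will never allow this
                return v
            alpha = max(alpha, v)
        return v

    def min_value(node, alpha, beta):
        if not isinstance(node, list):
            visited.append(node)
            return node
        v = math.inf
        for child in node:
            v = min(v, max_value(child, alpha, beta))
            if v <= alpha:             # MAX above already has something better
                return v
            beta = min(beta, v)
        return v

    # The root is a MAX node; track which child produced the best value.
    best_index, best_value, alpha = None, -math.inf, -math.inf
    for i, child in enumerate(root):
        v = min_value(child, alpha, math.inf)
        if v > best_value:
            best_index, best_value = i, v
        alpha = max(alpha, best_value)
    return best_value, best_index, visited


# Two-ply example; 4 and 6 are assumed stand-ins for the leaves x and y.
result = alpha_beta_search([[3, 12, 8], [2, 4, 6], [14, 5, 2]])
print(result)  # → (3, 0, [3, 12, 8, 2, 14, 5, 2])
```

Only seven of the nine leaves are evaluated: after seeing the leaf 2 under the second MIN node, the search returns immediately and never looks at the two remaining leaves.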

  48. Comments: Alpha-Beta Pruning • Pruning does not affect the final result. • Entire subtrees can be pruned. • Good move ordering improves the effectiveness of pruning. • With “perfect ordering,” time complexity is O(b^(m/2)) • With perfect ordering, alpha-beta pruning can look roughly twice as far ahead as minimax in the same amount of time
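
The effect of move ordering can be illustrated by counting evaluated leaves on a random uniform tree, once in its generated order and once with every node's children sorted best-first. This is a sketch under stated assumptions: the tree shape (depth 4, branching factor 3), the random leaf values, and all function names are invented for illustration, and the "perfect" ordering is obtained by cheating with a full minimax pass, which a real program could not afford.

```python
import math
import random


def build_tree(depth, b, rng):
    """Uniform tree as nested lists; leaves are random integer utilities."""
    if depth == 0:
        return rng.randint(0, 99)
    return [build_tree(depth - 1, b, rng) for _ in range(b)]


def minimax(node, is_max):
    """Plain minimax value, used only to compute the ideal ordering."""
    if not isinstance(node, list):
        return node
    values = [minimax(c, not is_max) for c in node]
    return max(values) if is_max else min(values)


def order_best_first(node, is_max):
    """Copy of the tree with children sorted best-first for the player to move."""
    if not isinstance(node, list):
        return node
    children = [order_best_first(c, not is_max) for c in node]
    return sorted(children, key=lambda c: minimax(c, not is_max), reverse=is_max)


def alpha_beta(node, is_max, alpha, beta, counter):
    """Alpha-beta value of node; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    v = -math.inf if is_max else math.inf
    for child in node:
        child_value = alpha_beta(child, not is_max, alpha, beta, counter)
        if is_max:
            v = max(v, child_value)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, child_value)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v


def leaves_evaluated(tree):
    counter = [0]
    value = alpha_beta(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]


rng = random.Random(0)                      # fixed seed for repeatability
tree = build_tree(4, 3, rng)                # 3^4 = 81 leaves in total
value_random, n_random = leaves_evaluated(tree)
value_sorted, n_sorted = leaves_evaluated(order_best_first(tree, True))
print(value_random, value_sorted, n_random, n_sorted)
```

Both runs return the same root value, since pruning does not change the result; only the number of evaluated leaves differs. With best-first ordering the count drops to 3^2 + 3^2 - 1 = 17 of the 81 leaves, which is on the order of b^(m/2).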
