
Adversarial Search



Presentation Transcript


  1. Adversarial Search

  2. Adversarial Search • Game playing • Perfect play • The minimax algorithm • Alpha-beta pruning • Resource limitations • Elements of chance • Imperfect information

  3. Game Playing State-of-the-Art • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved! • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic. • Othello: Human champions refuse to compete against computers, which are too good. • Go: Human champions are beginning to be challenged by machines, though the best humans still beat the best machines. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves, along with aggressive pruning. • Pacman: unknown

  4. What kind of games? • Abstraction: to describe a game we must capture every relevant aspect of the game, for example: • Chess • Tic-tac-toe • … • Accessible environments: such games are characterized by perfect information • Search: game playing then consists of a search through possible game positions • Unpredictable opponent: introduces uncertainty, so game playing must deal with contingency problems

  5. Type of games

  6. Type of games

  7. Game Playing • Many different kinds of games! • Axes: • Deterministic or stochastic? • One, two, or more players? • Perfect information (can you see the state)? • Turn taking or simultaneous action? • Want algorithms for calculating a strategy (policy) which recommends a move in each state

  8. Deterministic Games • Deterministic, single player, perfect information: • Know the rules • Know what actions do • Know when you win • E.g. Freecell, 8-Puzzle, Rubik’s cube • … it’s just search! • Slight reinterpretation: • Each node stores a value: the best outcome it can reach • This is the maximal outcome of its children (the max value) • Note that we don’t have path sums as before (utilities at end) • After search, can pick move that leads to best node

  9. Deterministic Two-Player • E.g. tic-tac-toe, chess, checkers • Zero-sum games • One player maximizes result • The other minimizes result • Minimax search • A state-space search tree • Players alternate • Each layer, or ply, consists of a round of moves* • Choose move to position with highest minimax value = best achievable utility against best play

  10. Games vs. search problems • “Unpredictable” opponent: the solution is a strategy specifying a move for every possible opponent reply • Time limits: unlikely to find the goal, must approximate • Plan of attack: • Computer considers possible lines of play (Babbage, 1846) • Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944) • Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948; Shannon, 1950) • First chess program (Turing, 1951) • Machine learning to improve evaluation accuracy (Samuel, 1952-57) • Pruning to allow deeper search (McCarthy, 1956)

  11. Searching for the next move • Complexity: many games have a huge search space • Chess: b = 35, m = 100, nodes = 35^100 (about 10^154) • If each node takes about 1 ns to explore, then each move will take on the order of 10^135 millennia to calculate. • Resource (e.g., time, memory) limit: the optimal solution is not feasible/possible, thus we must approximate • Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result. • Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.
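A quick order-of-magnitude check of the search-space estimate above can be sketched as follows. This is only a back-of-the-envelope calculation: the 1 ns per node cost comes from the slide, while the length of a millennium (365.25-day years) and all variable names are assumptions made for illustration.

```python
import math

b, m = 35, 100                       # branching factor and depth from the slide
ns_per_node = 1e-9                   # assumed cost: 1 ns to explore one node
seconds_per_millennium = 1000 * 365.25 * 24 * 3600

# Work in log10 space, since 35**100 does not fit in a float comfortably.
log10_nodes = m * math.log10(b)
log10_seconds = log10_nodes + math.log10(ns_per_node)
log10_millennia = log10_seconds - math.log10(seconds_per_millennium)

print(f"nodes     ~ 10^{log10_nodes:.0f}")
print(f"millennia ~ 10^{log10_millennia:.0f}")
```

Under these assumptions the script reports roughly 10^154 nodes and 10^135 millennia, which is why exhaustive search is out of the question and approximation is required.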

  12. Two-player games • A game formulated as a search problem: • Initial state: board position and turn • Operators: definition of legal moves • Terminal state: conditions for when the game is over • Utility function: a numeric value that describes the outcome of the game, e.g., -1, 0, 1 for loss, draw, win (AKA payoff function)
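The four components of this formulation can be sketched for tic-tac-toe. This is a minimal illustration, not code from the lecture: the state representation (a 9-cell tuple plus the player to move) and every function name below are assumptions, and the utility is taken from X's point of view.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]      # 9 cells, each "X", "O" or " " (assumed encoding)
State = Tuple[Board, str]    # (board position, player to move)

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state() -> State:
    """Initial state: empty board, X moves first."""
    return (tuple(" " * 9), "X")

def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(state: State) -> List[int]:
    """Operators: any empty cell may be marked."""
    board, _ = state
    return [i for i, cell in enumerate(board) if cell == " "]

def result(state: State, move: int) -> State:
    """Apply a move and hand the turn to the other player."""
    board, player = state
    new_board = board[:move] + (player,) + board[move + 1:]
    return (new_board, "O" if player == "X" else "X")

def is_terminal(state: State) -> bool:
    """Terminal state: someone has three in a row, or the board is full."""
    board, _ = state
    return winner(board) is not None or " " not in board

def utility(state: State) -> int:
    """Utility for X: +1 win, 0 draw, -1 loss."""
    return {"X": 1, "O": -1, None: 0}[winner(state[0])]
```

For instance, starting from `initial_state()` and applying the moves 0, 3, 1, 4, 2 in turn gives X the top row, so `is_terminal` holds and `utility` returns 1.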

  13. Example: Tic-Tac-Toe CS561-Lecture7-8-Macskassy-Spring2011

  14. The minimax algorithm • Perfect play for deterministic environments with perfect information • Basic idea: choose the move with the highest minimax value = best achievable payoff against best play • Algorithm: • 1. Generate the game tree completely • 2. Determine the utility of each terminal state • 3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level • 4. At the root node, use the minimax decision to select the move with the max (of the min) utility value • Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.

  15. Generate Game Tree

  16. Generate Game Tree x

  17. Generate Game Tree

  18. Generate Game Tree • 1 ply • 1 move

  19. A sub-tree (win / lose / draw)

  20. What is a good move? (win / lose / draw)

  21. MiniMax Example (MAX / MIN)

  22. MiniMax: Recursive Implementation
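The slide's own listing did not survive in this transcript, so what follows is a minimal sketch of a recursive minimax rather than the lecture's code. It assumes the game tree is given explicitly as nested lists with integer utilities at the leaves; the names `minimax` and `best_move` and the example tree are illustrative assumptions.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]   # a leaf utility, or a list of child subtrees

def minimax(node: Tree, maximizing: bool) -> int:
    """Return the minimax value of `node`; MAX and MIN alternate by level."""
    if isinstance(node, int):          # terminal state: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def best_move(root: List[Tree]) -> int:
    """Index of the root move with the highest min-value (MAX to move)."""
    values = [minimax(child, False) for child in root]
    return values.index(max(values))

# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))   # 3
print(best_move(tree))       # 0
```

The three MIN nodes evaluate to 3, 2 and 2, so MAX picks the first move. The recursion visits every leaf, which matches the exhaustive "generate the tree completely" description.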

  23. Minimax Properties • Optimal against a perfect player. Otherwise? • Time complexity? • O(b^m) • Space complexity? • O(bm) • For chess, b ≈ 35, m ≈ 100 • Exact solution is completely infeasible • But do we need to explore the whole tree? (Figure: max/min tree with leaf values 10, 10, 9, 100)

  24. Resource Limits • Cannot search to leaves • Depth-limited search • Instead, search a limited depth of the tree • Replace terminal utilities with an eval function for non-terminal positions • Guarantee of optimal play is gone • More plies make a BIG difference • Example: • Suppose we have 100 seconds and can explore 10K nodes / sec • So we can check 1M nodes per move • α-β reaches about depth 8: a decent chess program (Figure: max node 4 over min nodes -2 and 4, with leaves -1, -2, 4, 9 and unexplored positions marked ?)
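Depth-limited search can be sketched on a toy game. Everything here is an illustrative assumption rather than lecture material: the game (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins), the function names, and the deliberately naive `evaluate` that scores every unfinished position as 0. The last lines redo the slide's budget arithmetic and compare it with 35^4, the best-case α-β node count b^(d/2) for d = 8.

```python
from typing import List, Tuple

State = Tuple[int, bool]   # (stones left, True if MAX is to move): hypothetical toy game

def successors(state: State) -> List[State]:
    stones, max_to_move = state
    return [(stones - k, not max_to_move) for k in (1, 2, 3) if k <= stones]

def is_terminal(state: State) -> bool:
    return state[0] == 0

def utility(state: State) -> float:
    # The player who took the last stone wins, i.e. the one NOT to move now.
    return -1.0 if state[1] else 1.0

def evaluate(state: State) -> float:
    # Placeholder heuristic (assumption): treat every unfinished position as even.
    return 0.0

def value(state: State, depth: int) -> float:
    """Minimax value, cut off at `depth` plies and scored by `evaluate` there."""
    if is_terminal(state):
        return utility(state)
    if depth == 0:
        return evaluate(state)
    children = [value(s, depth - 1) for s in successors(state)]
    return max(children) if state[1] else min(children)

start = (5, True)
print(value(start, 1))   # 0.0: too shallow, the win is invisible
print(value(start, 3))   # 1.0: deep enough to see that MAX can force a win

budget = 100 * 10_000    # 100 s at 10K nodes/s
print(budget, 35 ** 4)   # 1000000 1500625
```

The change from 0.0 to 1.0 between depth 1 and depth 3 shows both points from the slide: the optimality guarantee is lost at shallow depth, and a couple of extra plies can change the answer.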

  25. Evaluation Functions • Function which scores non-terminals • Ideal function: returns the utility of the position • In practice: typically a weighted linear sum of features: Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s) • e.g. f1(s) = (num white queens – num black queens), etc.
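A weighted linear evaluation function of this kind can be sketched as below. The position encoding (a dictionary of piece counts), the helper `material_diff`, and the weights 9, 5 and 1 (conventional material values) are assumptions chosen for illustration; the slide only specifies the queen-difference feature.

```python
from typing import Callable, Dict, List, Tuple

Position = Dict[str, int]               # hypothetical: piece counts keyed by name
Feature = Callable[[Position], float]

def material_diff(piece: str) -> Feature:
    """Feature: (number of white pieces) - (number of black pieces) of one kind."""
    return lambda s: s.get(f"white_{piece}", 0) - s.get(f"black_{piece}", 0)

# (weight, feature) pairs; the weights are assumed, not taken from the lecture.
WEIGHTED_FEATURES: List[Tuple[float, Feature]] = [
    (9.0, material_diff("queens")),     # f1 from the slide
    (5.0, material_diff("rooks")),
    (1.0, material_diff("pawns")),
]

def evaluate(s: Position) -> float:
    """Eval(s) = sum of w_i * f_i(s) over all features."""
    return sum(w * f(s) for w, f in WEIGHTED_FEATURES)

position = {"white_queens": 1, "black_queens": 0,
            "white_rooks": 1, "black_rooks": 2,
            "white_pawns": 6, "black_pawns": 6}
print(evaluate(position))   # 9*1 + 5*(-1) + 1*0 = 4.0
```

Keeping features and weights in one table makes it easy to add a feature or to tune the weights, for example by learning them from game records.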

  26. Evaluation Functions

  27. Why Pacman starves • He knows his score will go up by eating the dot now • He knows his score will go up just as much by eating the dot later on • There are no point-scoring opportunities after eating the dot • Therefore, waiting seems just as good as eating

  28. α-β pruning: general principle • If α > v, then MAX will choose m, so prune the tree under n • Similar for β for MIN (Figure: alternating Player and Opponent levels; m is an alternative higher up for Player, n is a deeper Player node with value v)

  29. α-β pruning: example 1 [-∞,+∞] MAX MIN

  30. α-β pruning: example 1 [-∞,+∞] MAX MIN 3

  31. α-β pruning: example 1 [-∞,+∞] MAX MIN [-∞,3] 3

  32. α-β pruning: example 1 [-∞,+∞] MAX MIN [-∞,3] 3 12

  33. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX MIN [-∞,3] 3 12 8

  34. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX MIN [3,2] [-∞,3] 3 12 8 2

  35. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX MIN [3,2] [3,14] [-∞,3] 3 12 8 2 14

  36. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX MIN [3,2] [3,5] [-∞,3] [3,14] 3 12 8 2 14 5

  37. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX MIN [3,2] [3,5] [3,2] [-∞,3] [3,14] 3 12 8 2 14 5 2

  38. α-β pruning: example 1 [-∞,+∞][3,+∞] MAX Selected move MIN [3,2] [3,5] [3,2] [-∞,3] [3,14] 3 12 8 2 14 5 2
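The walk-through in example 1 can be reproduced with a short α-β sketch. This is an illustration under assumptions, not the lecture's code: the tree is encoded as nested lists, the function name and the `visited` log are hypothetical, and the two leaves under the middle MIN node that the slides never show (because they are pruned) are filled in with placeholder values 4 and 6.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]   # a leaf utility, or a list of child subtrees

def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, visited: List[int]) -> float:
    """Minimax value of `node`, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        visited.append(node)          # log every leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:         # MIN above would never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:             # MAX above already has something better
            break
    return value

# Leaves 4 and 6 are assumed placeholders for the pruned positions.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))   # 3
print(visited)                                               # [3, 12, 8, 2, 14, 5, 2]
```

The leaf log matches the sequence shown on the slides: after the first MIN node fixes α = 3 at the root, the second MIN node is abandoned as soon as it sees 2, while the third has to be searched to its last leaf.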

  39. α-β pruning: example 2 [-∞,+∞] MAX MIN

  40. α-β pruning: example 2 [-∞,+∞] MAX [-∞,2] MIN 2

  41. α-β pruning: example 2 [-∞,+∞] MAX [-∞,2] MIN 5 2

  42. α-β pruning: example 2 [-∞,+∞][2,+∞] MAX [-∞,2] MIN 14 5 2

  43. α-β pruning: example 2 [-∞,+∞][2,+∞] MAX [-∞,2] MIN [2,5] 5 14 5 2

  44. α-β pruning: example 2 [-∞,+∞][2,+∞] MAX [-∞,2] MIN [2,5][2,1] 1 5 14 5 2

  45. α-β pruning: example 2 [-∞,+∞][2,+∞] MAX [-∞,2] MIN [2,8] [2,5][2,1] 8 1 5 14 5 2

  46. α-β pruning: example 2 [-∞,+∞][2,+∞] MAX [-∞,2] MIN [2,8] [2,5][2,1] 12 8 1 5 14 5 2

  47. α-β pruning: example 2 [-∞,+∞][2,+∞][3,+∞] MAX [-∞,2] MIN [2,8] [2,3] [2,5][2,1] 3 12 8 1 5 14 5 2

  48. α-β pruning: example 2 [-∞,+∞][2,+∞][3,+∞] MAX Selected move [-∞,2] MIN [2,8] [2,3] [2,5][2,1] 3 12 8 1 5 14 5 2

  49. α-β pruning: example 3 MIN MAX [6,∞] MIN

  50. α-β pruning: example 3 [-∞,6] MIN [-∞,6] MAX [6,∞] MIN
