Adversarial Search

Adversarial Search Chapter 6

Outline • Optimal decisions • α-β pruning • Imperfect, real-time decisions

Games as search problems • Engaged intellectual faculties of humans. • Board games such as Chess. • Game playing is appealing target of AI. • State of the game is easy to represent . And Actions are restricted.

Games as search problems • This makes the Game playing an idealizations of world in which hostile agents act so as to diminish one’s well being. • Early researchers choose chess fro several reasons. • A chess playing computer is an existing proof of machine doing some thing thought to require intelligence. • Easy to represent the game as search through a space of possible gaming positions .

Games as search problems • The presence of opponent makes the decision problem complicated. • It introduces “Uncertainty”. • So, all game playing programs must deal with contingency problems. • Means prediction is impossible.

Games as search problems • The complexity of the games introduces completely a new kind of uncertainty that we have not seen so far. • It does not arise because it has missing information but one does not have a time to calculate the exact consequences of any move. • Instead, one has to make the best guess based on past experience and act before he is sure what action to take.

Games as search problems • Game usually has the time limits. • Game playing research has therefore spawned a number of interesting ideas on how to make the best use of time to reach good decisions. • Now ,the next discussion is how to find the theoretically best move. Then we look techniques for choosing a good move when time is limited. • Pruning allows us to to ignore portions of the search tree that makes no difference to the final choice. • Heuristic function allows us to approximate the true utility of state without complete search

Perfect Decisions in two person Games • The general case of a game with two players, whom we will call MAX and M1N, • MAX moves first, and then they take turns moving until the game is over. • At the end of the game, points are awarded to the winning player.

Formal definition of game as search problem • The initial state, which includes the board position and an indication of whose move it is. • A set of operators, which define the legal moves that a player can make. • A terminal test, which determines when the game is over. States where the game has ended are called terminal states. • A utility function (also called a payoff function), which gives a numeric value for the outcome of a game. In chess, the outcome is a win, loss, or draw, which we can represent by the values +1, —1, or 0

Perfect Decisions in two person Games • MAX would have to search for a sequence of moves that leads to a terminal state that is a winner .then go ahead and make the first move in the sequence. • Unfortunately, MIN has something to say about it. • MAX therefore must find a strategy that will lead to a winning terminal state regardless of what MIN does, where the strategy includes the correct move for MAX for each possible move by MIN. • We will begin by showing how to find the optimal (or rational) strategy.

Game tree (2-player, deterministic, turns) • Figure shows part of the search tree for the game of Tic-Tac-Toe. • From the initial state, has a choice of nine possible moves. Play alternates between MAX placing x's and MIN placing o's until we reach leaf nodes corresponding to terminal states: states where one player has three in a row or all the squares are filled. • The number on each leaf node indicates the utility value of the terminal state from the point of view of MAX; high values are assumed to be good for MAX and bad for MIN .It is MAX'S job to use the search tree to determine the best move.

Game tree (2-player, deterministic, turns)

Game tree • Even a simple game like Tic-Tac-Toe is too complex to show the whole search tree, • so we will switch to the absolutely trivial game in Figure . • The possible moves for MAX are labelled A I , AT, and AS. The possible replies loA\ for MIN are A11, A12, A13, and so on. This particular • game ends after one move each by MAX and MIN.

Game tree

Minimax Algorithm • The minimax algorithm is designed to determine the optimal strategy for MAX, and thus to decide what the best first move is. • The algorithm consists of five steps: • Generate the whole game tree, all the way down to the terminal states. • Apply the utility function to each terminal state to get its value. • Use the utility of the terminal states to determine the utility of the nodes one level higher up in the search tree.

Minimax Algorithm • Continue backing up the values from the leaf nodes toward the root, one layer at a time. • Eventually, the backed-up values reach the top of the tree; at that point, MAX chooses the move that leads to the highest value. In the topmost A node of , MAX has a choice of three moves that will lead to states with utility 3, 2, and 2, respectively. • Thus, MAX's best opening move is A1. This is called the minimax decision, because it maximizes the utility under the assumption that the opponent will play perfectly to minimize it.

Minimax • Perfect play for deterministic games • Idea: choose move to position with highest minimax value = best achievable payoff against best play • E.g., 2-ply game:

Minimax algorithm

Properties of minimax • Complete? Yes (if tree is finite) • Optimal? Yes (against an optimal opponent) • Time complexity? O(bm) (m is depth and b is legal move on each point) • Space complexity? O(bm) (depth-first exploration) • For chess, b ≈ 35, m ≈100 for "reasonable" games exact solution completely infeasible

α-β pruning example

Properties of α-β • Pruning does not affect final result • Good move ordering improves effectiveness of pruning • With "perfect ordering," time complexity = O(bm/2) doubles depth of search • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max If v is worse than α, max will avoid it  prune that branch Define β similarly for min Why is it called α-β?

The α-β algorithm

Resource limits Suppose we have 100 secs, explore 104 nodes/sec106nodes per move Standard approach: • cutoff test: e.g., depth limit (perhaps add quiescence search) • evaluation function = estimated desirability of position

Evaluation functions • For chess, typically linear weighted sum of features Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s) • e.g., w1 = 9 with f1(s) = (number of white queens) – (number of black queens), etc.

Cutting off search MinimaxCutoff is identical to MinimaxValue except • Terminal? is replaced by Cutoff? • Utility is replaced by Eval Does it work in practice? bm = 106, b=35  m=4 4-ply lookahead is a hopeless chess player! • 4-ply ≈ human novice • 8-ply ≈ typical PC, human master • 12-ply ≈ Deep Blue, Kasparov

Deterministic games in practice • Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. • Othello: human champions refuse to compete against computers, who are too good. • Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Summary • Games are fun to work on! • They illustrate several important points about AI • perfection is unattainable  must approximate • good idea to think about what to think about

Adversarial Search