
Game Playing



Presentation Transcript


  1. Game Playing • Mini-Max search • Alpha-Beta pruning • General concerns on games

  2. Why study board games? • Abstract and pure form of competition that seems to require intelligence • Easy to represent the states and actions • Very little world knowledge required! • Game playing is a special case of a search problem, with some new requirements. • One of the oldest subfields of AI (Shannon and Turing, 1950)

  3. Types of games:

                            Deterministic                   Chance
    Perfect information     Chess, checkers, go, othello    Backgammon, monopoly
    Imperfect information   Sea battle                      Bridge, poker, scrabble, nuclear war

  4. Why new techniques for games? • We don't know the opponent's move! • The size of the search space: • Chess: ~15 moves possible per state, 80 ply → 15^80 nodes in the tree • Go: ~200 moves per state, 300 ply → 200^300 nodes in the tree • Game playing algorithms: • Search the tree only up to some depth bound • Use an evaluation function at the depth bound • Propagate the evaluation upwards in the tree • The "contingency" problem: every possible opponent reply must be taken into account.

  5. MINI-MAX • Restrictions: • 2 players: MAX (the computer) and MIN (the opponent) • deterministic, perfect information • Select a depth-bound (say: 2) and an evaluation function - Construct the tree up to the depth-bound - Compute the evaluation function for the leaves - Propagate the evaluation function upwards: - taking minima in MIN nodes - taking maxima in MAX nodes (Figure: a depth-2 example tree; leaf evaluations 2 5 3 1 4 4 3, MIN values 3 2 1, root MAX value 3: "Select this move".)

  6. The MINI-MAX algorithm:

  Initialise depthbound;

  Minimax(board, depth) =
    IF depth = depthbound
      THEN return static_evaluation(board);
    ELSE IF maximizing_level(depth)
      THEN FOR EACH child of board:
             compute Minimax(child, depth + 1);
           return maximum over all children;
    ELSE IF minimizing_level(depth)
      THEN FOR EACH child of board:
             compute Minimax(child, depth + 1);
           return minimum over all children;

  Call: Minimax(current_board, 0)

  7. Alpha-Beta Cut-off • A generally applied optimization of Mini-Max. • Instead of: • first creating the entire tree (up to the depth-bound) • then doing all propagation • Interleave the generation of the tree and the propagation of values. • Point: • some of the values obtained in the tree provide information that other (not-yet-generated) parts are redundant and do not need to be generated.

  8. MAX 2 1 2 =2 MIN MAX 2 5 1 Alpha-Beta idea: • Principles: • generate the tree depth-first, left-to-right • propagate final values of nodes as initial estimates for their parent node. - The MIN-value (1) is already smaller than the MAX-value of the parent (2) - The MIN-value can only decrease further, - The MAX-value is only allowed to increase, - No point in computing further below this node

  9. Terminology: - The (temporary) values at MAX-nodes are ALPHA-values - The (temporary) values at MIN-nodes are BETA-values

  10. The Alpha-Beta principles (1): - If an ALPHA-value is greater than or equal to the BETA-value of a descendant node: stop generation of the children of that descendant.

  11. The Alpha-Beta principles (2): - If a BETA-value is smaller than or equal to the ALPHA-value of a descendant node: stop generation of the children of that descendant.
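Interleaving tree generation and value propagation with the two principles above gives the usual Alpha-Beta formulation. This is a hedged sketch, not the slides' own code; the example tree and the evaluation counter are assumptions added to make the pruning visible:

```python
import math

DEPTH_BOUND = 2
evals = []  # records which leaves actually get statically evaluated

def static_evaluation(leaf):
    evals.append(leaf)
    return leaf  # toy evaluation: a leaf stores its value

def alphabeta(board, depth, alpha, beta, maximizing):
    if depth == DEPTH_BOUND or not isinstance(board, list):
        return static_evaluation(board)
    if maximizing:
        value = -math.inf
        for child in board:
            value = max(value, alphabeta(child, depth + 1, alpha, beta, False))
            alpha = max(alpha, value)   # the ALPHA-value only increases
            if alpha >= beta:           # principle (1): cut off remaining children
                break
        return value
    value = math.inf
    for child in board:
        value = min(value, alphabeta(child, depth + 1, alpha, beta, True))
        beta = min(beta, value)         # the BETA-value only decreases
        if beta <= alpha:               # principle (2): cut off remaining children
            break
    return value

tree = [[2, 5, 3], [1, 4], [4, 3]]
best = alphabeta(tree, 0, -math.inf, math.inf, True)
print(best, len(evals))  # value 3; only 6 of the 7 leaves are evaluated
```

The middle MIN node's first leaf (1) is already below the root's ALPHA-value of 2, so its remaining leaf (4) is never generated.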

  12. Mini-Max with Alpha-Beta at work: (Figure: a worked b=3, d=3 example; the leaves are evaluated depth-first, left-to-right, with the alpha- and beta-cut-offs marked along the way.) 11 static evaluations saved !!

  13. "DEEP" cut-offs - For game trees with at least 4 MIN/MAX layers, the Alpha-Beta rules also apply to deeper levels: a bound established near the root can cut off the children of a node several layers below it, not only of a direct descendant. (Figure: a bound of 4 near the root cuts off a node two levels further down once it reaches 2.)

  14. The Gain: Best case: - If at every layer the best node is the left-most one, only part of the tree must be explored. (Figure: a MAX/MIN/MAX tree in which only the thick-drawn branches are generated.)

  15. Example of a perfectly ordered tree: (Figure: a b=3, d=3 tree with leaf triples 3 2 1 | 6 5 4 | 9 8 7 | 21 20 19 | 24 23 22 | 27 26 25 | 12 11 10 | 15 14 13 | 18 17 16; at every layer the best child is the left-most one, and the root value is 21.)

  16. How much gain? - Alpha/Beta best case: # (static evaluations) = 2·b^(d/2) − 1 (if d is even) = b^((d+1)/2) + b^((d−1)/2) − 1 (if d is odd) - The proof is by induction. - In the running example, d=3, b=3: 3^2 + 3^1 − 1 = 11 !
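The best-case count can be sanity-checked in a few lines of Python (the function name is illustrative):

```python
def best_case_evals(b, d):
    # Number of static evaluations Alpha-Beta needs on a perfectly
    # ordered tree with branching factor b and depth d.
    if d % 2 == 0:
        return 2 * b ** (d // 2) - 1
    return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1

print(best_case_evals(3, 3))   # running example: 9 + 3 - 1 = 11
print(best_case_evals(10, 4))  # vs. 10**4 = 10000 leaves without pruning: 199
```

So at depth 4 with b = 10, pruning reduces 10000 leaf evaluations to 199 in the best case.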

  17. Best case gain pictured: (Figure: # static evaluations versus depth 1 to 7 for b = 10, plotted from 10 to 100000 on a logarithmic scale; "no pruning" versus "Alpha-Beta best case".) - Note: a logarithmic scale. - Conclusion: still exponential growth !! - Worst case? For some trees Alpha-Beta does nothing; for some trees no reordering can produce cut-offs.

  18. The horizon effect. horizon = depth-bound of Mini-Max • Because of the depth-bound we prefer to delay disasters, although we don't prevent them !! (Figure: "Queen lost" is pushed beyond the horizon by a delaying move: "Pawn lost", then "Queen lost" anyway.) • solution: heuristic continuations

  19. Time bounds: How to play within reasonable time bounds? Even with a fixed depth-bound, search times can vary strongly! Solution: Iterative Deepening !!! Remember: the overhead of all previous searches is only about 1/b of the last search. A good investment to be sure to have a move ready.
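The iterative-deepening idea can be sketched as follows; `search` stands for any depth-bounded search such as Mini-Max with Alpha-Beta, and the time handling is a simplifying assumption (a real engine would also abort mid-search):

```python
import time

def iterative_deepening(board, time_budget, search):
    """Search to depth 1, 2, 3, ... until the time budget runs out,
    always keeping the result of the last *completed* depth."""
    best_move = None
    deadline = time.monotonic() + time_budget
    depth = 1
    while time.monotonic() < deadline:
        best_move = search(board, depth)  # assumed depth-bounded search
        depth += 1
    return best_move  # a move is always ready when time is up
```

Because each depth costs roughly b times the previous one, the repeated shallow searches add little to the total cost.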

  20. Games of chance. Ex.: Backgammon. Form of the game tree: MAX and MIN layers alternate with chance layers for the dice rolls. (Figure omitted.)
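For such chance layers, Mini-Max generalises to what is commonly called expectiminimax: a chance node returns the probability-weighted average of its children. A minimal sketch, with an assumed tuple encoding of the tree:

```python
def expectiminimax(node, depth):
    # node is either a leaf value, ('max', children), ('min', children),
    # or ('chance', [(probability, child), ...]).
    if depth == 0 or not isinstance(node, tuple):
        return node  # leaf: the stored static evaluation
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c, depth - 1) for c in children)
    if kind == 'min':
        return min(expectiminimax(c, depth - 1) for c in children)
    # chance node: expected value over the dice outcomes
    return sum(p * expectiminimax(c, depth - 1) for p, c in children)

# A fair coin-flip between payoffs 2 and 4 is worth 3 on average.
print(expectiminimax(('chance', [(0.5, 2), (0.5, 4)]), 1))
```

Note that, unlike plain Mini-Max, the magnitudes of the evaluation function now matter, not just its ordering, because values are averaged at chance nodes.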

  21. State of the art Drawn from an article by Matthew Ginsberg, Scientific American, Winter 1998, Special Issue on Exploring Intelligence

  22. State of the art (2)

  23. State of the art (3)

  24. Computer chess ratings studied around the '90s: (Figure: chess rating (1500 to 3500) versus search depth in ply (2 to 14); program ratings climb roughly linearly with depth toward Kasparov's level, around 3000.) Win of Deep Blue predicted: further increase of depth was likely to win !
