Transposition Tables, Zobrist Hashing, GHI problem

• The commonest and most important enhancement is a transposition table (TT).
• The same position may arise from two or more move sequences.
• (Also, a sequence of moves may lead to repetition of a position.)
• If search has already told you what might happen in a position, it is a waste of effort to repeat that search.
• A TT allows you to look up previously computed results.

Q: If all these search algorithms are only about 20 lines of code, how come my search routine is 20 pages?
A: It’s all the enhancements.

Artificial Intelligence for Games and Puzzles
http://csiweb.ucd.ie/Staff/acater/comp30260.html

Repetitions
• In many games, the rules pay special attention to repetitions, e.g.
  • in Chess: if the same position occurs three times, the game is drawn
  • in Go: it is forbidden to repeat a position
  • in Nine Men’s Morris: it is forbidden to make a mill by reversing your immediately previous move

Transpositions
• Even where repetitions are not of special interest, there may still be several alternative sequences of moves that lead to a common position.
• e.g. in Chess:
  • 1: P-K4, P-K4. 2: P-Q4
  • 1: P-Q4, P-K4. 2: P-K4
• Same configuration on the board, same player to move, same depth to be searched, same results!

What should a TT store?
• Each position visited in a search should have information stored that:
  • allows for later recognition of the same position if it recurs
    • disposition of pieces
    • who is to move
    • other game-specific information, e.g. in chess: castling possibilities, en-passant possibilities, move count since some irreversible event like a capture or pawn move
  • tells what valuation should be returned if it is still valid
    • numeric score; fail low/fail high; B* bounds; CN range
  • allows judging whether or not the valuation is still valid
    • depth of the search that provided this valuation
    • (if alpha-beta) what bounds were in force
    • (possibly) which problem was being solved when the entry was stored
  • tells what was the best move found by the previous search
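The fields listed above could be gathered into a single record, as in the minimal sketch below. The names `TTEntry`, `Bound` and `usable` are hypothetical, and the reuse test assumes an alpha-beta search with fail-low/fail-high bounds; B* bounds and CN ranges are not covered.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Bound(Enum):
    EXACT = "exact"   # value is the true score at the stored depth
    LOWER = "lower"   # search failed high: true score >= value
    UPPER = "upper"   # search failed low: true score <= value


@dataclass
class TTEntry:
    key: int                   # full position signature, used to recognise the position
    value: int                 # stored valuation
    bound: Bound               # how the valuation is to be interpreted
    depth: int                 # depth of the search that produced the valuation
    best_move: Optional[str]   # best move found, kept for move ordering
    age: int = 0               # which search ("problem") stored the entry


def usable(entry: TTEntry, key: int, depth: int, alpha: int, beta: int) -> bool:
    """Return True if the stored value can be returned without searching again."""
    if entry.key != key or entry.depth < depth:
        return False                      # different position, or searched too shallowly
    if entry.bound is Bound.EXACT:
        return True
    if entry.bound is Bound.LOWER:
        return entry.value >= beta        # still causes a cutoff in the current window
    return entry.value <= alpha           # UPPER: still fails low in the current window
```

Even when `usable` returns False, `best_move` can still be tried first when the position is searched again.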

How is a TT stored?
• A TT is usually maintained as a hash table, using a numeric key.
• In the event of a hash-table collision (where two distinct situations map to the same entry in the hash table), two alternative approaches are sensible:
  • discard the stored entry and replace it with the new one
    • if the stored entry relates to an older problem, this is the obvious approach; if it relates to the current problem, there is an alternative …
  • choose between the stored entry and the new one on some principled basis:
    • keeping the entry with the greatest depth?
    • keeping the entry with the largest search effort?
    • keeping the entry with the most reliable value?
    • keeping the entry with the tightest bounds?
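One possible combination of these policies is sketched below: an entry from an older problem is always replaced, and otherwise the deeper entry is kept. The class and field names are hypothetical, and "greatest depth" is only one of the criteria listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int     # full position signature
    depth: int   # depth of the search that produced value
    value: int
    age: int     # identifies the problem (search) that stored the entry


class TranspositionTable:
    def __init__(self, size: int):
        self.size = size
        self.slots: List[Optional[Entry]] = [None] * size

    def store(self, new: Entry) -> bool:
        """Store new unless a deeper entry from the same problem occupies the slot."""
        i = new.key % self.size
        old = self.slots[i]
        if old is None or old.age != new.age or new.depth >= old.depth:
            self.slots[i] = new
            return True
        return False

    def probe(self, key: int) -> Optional[Entry]:
        """Return the entry for key, or None if the slot holds another position."""
        entry = self.slots[key % self.size]
        return entry if entry is not None and entry.key == key else None
```

Comparing the full key in `probe` is what distinguishes a genuine hit from two positions that merely share a slot.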

(Alfred) Zobrist Hashing
• Zobrist hash codes aim to give a systematic numeric coding for situations arising in games, to allow easy determination of whether two situations are the same (especially: is a situation stored in the TT the same as the current situation?).
• Conventional wisdom is that 64-bit codes are necessary.
  • cf. the Shared Birthday problem
  • clashes will occur, but so infrequently as not to matter
• The hash table used for the TT will contain far fewer than 2^64 entries; some subset of the Zobrist key bits will be used as the TT hash code.
  • One way tournament programs have been known to cheat is to instruct their opponent programs to reduce the size of their TTs.
• Zobrist codes are updated by XOR with (64-bit) randomised numbers.
  • Some claim rand() is good enough; others play Hamming-distance tricks.
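The birthday-problem argument for 64-bit codes can be checked numerically. The sketch below uses the standard approximation 1 - exp(-n(n-1)/2^(k+1)) for the chance that at least two of n uniformly random k-bit keys coincide; it assumes the keys behave like independent random numbers. The helper `tt_index`, which takes the low bits of the key as the table index, is one hypothetical way of choosing "some subset of the key bits".

```python
import math


def collision_probability(n_positions: int, key_bits: int) -> float:
    """Approximate chance that at least two of n random key_bits-bit keys coincide."""
    pairs = n_positions * (n_positions - 1) / 2
    return -math.expm1(-pairs / 2.0 ** key_bits)


def tt_index(key: int, table_bits: int) -> int:
    """Use the low table_bits bits of the key as the hash-table index."""
    return key & ((1 << table_bits) - 1)


if __name__ == "__main__":
    for bits in (32, 64):
        p = collision_probability(10**6, bits)
        print(f"{bits}-bit keys, one million positions: P(clash) ~ {p:.3g}")
```

With a million positions, a clash among 32-bit keys is practically certain, while among 64-bit keys it is very unlikely.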

Zobrist Hashing uses XOR
• In Chess,
  • There are 12 kinds of piece, 64 squares
  • Make a 12x64 table of random numbers
  • XOR the piece&position numbers (PPNs) for all pieces initially on the board
  • When a piece moves, XOR the PPN for its starting square (it is no longer there) and also XOR the PPN for its ending square (it is there now)
  • When a capture occurs, also XOR the PPN for the captured piece (it is no longer there)
  • Use a few extra numbers for
    • white’s move vs black’s move
    • white can castle kingside, etc.
• In Go,
  • There are 2 kinds of piece, 361 intersections
  • Make a 2x361 table of random numbers
  • When a stone is placed, XOR its PPN
  • When a capture occurs, XOR the PPNs for all captured stones
  • When a move is impossible because of ko, XOR the PPNs for both black and white; XOR them back again after each legal move
  • Use a few extra numbers for
    • white’s move vs black’s move
    • balance of prisoners taken (W-B say)
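The chess recipe above can be sketched as follows. This is a simplified illustration, not a full implementation: the board is assumed to be a dict from square index (0 to 63) to piece letter, only one extra number (side to move) is used, and castling rights, en passant and promotion are left out. All names are hypothetical.

```python
import random

PIECES = "PNBRQKpnbrqk"                 # 12 kinds of piece
_rng = random.Random(2024)              # fixed seed so the keys are reproducible
# The 12x64 table of piece&position numbers (PPNs)
PPN = {(p, sq): _rng.getrandbits(64) for p in PIECES for sq in range(64)}
BLACK_TO_MOVE = _rng.getrandbits(64)    # extra number for the side to move


def full_hash(board: dict, black_to_move: bool) -> int:
    """Hash a position from scratch by XORing the PPN of every piece on the board."""
    h = 0
    for sq, piece in board.items():
        h ^= PPN[(piece, sq)]
    if black_to_move:
        h ^= BLACK_TO_MOVE
    return h


def update_hash(h: int, piece: str, src: int, dst: int, captured: str = None) -> int:
    """Incrementally update the hash for a simple move or capture."""
    h ^= PPN[(piece, src)]              # the piece is no longer on its starting square
    h ^= PPN[(piece, dst)]              # the piece is now on its ending square
    if captured is not None:
        h ^= PPN[(captured, dst)]       # the captured piece is no longer there
    return h ^ BLACK_TO_MOVE            # the side to move changes
```

Because XOR is its own inverse, calling `update_hash` a second time with the same arguments undoes the move, which is convenient when the search backs out of a line.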

Advantages of using a TT
• The biggest advantage by far is the reduction of search effort (for chess, ×5) resulting from reuse of information about positions already searched. In a 2-player game with branching factor b, if all moves were independent,
  • on the 3rd half-ply, b^2 moves would be duplications, leaving b^3 − b^2
    • substantial saving: b^2 searches of subtrees of size b^(d−3)
  • on the 4th half-ply, 2 × b^2 duplications, leaving b^4 − 2 × b^2
  • on the 5th and on the 6th half-ply, order b^3 duplications
• TTs are also useful to store an opening book.
• Even where search must be performed again, e.g. because the depth of search of the stored move is not sufficient for current needs, the retrieved best-move information can be used for move ordering for alpha-beta.
• You sometimes get “extra depth for free”
  • It can happen that you find a reusable entry resulting from a search to (say) depth 7 (from that position) when you only intended to search to depth 3
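The saving from reuse can be observed on a toy game. The sketch below is an assumption-laden illustration rather than a chess measurement: it uses a subtraction game (take 1, 2 or 3 stones from a pile; whoever takes the last stone wins), searches to the end of the game so that every stored value is exact, and uses a plain dict as the table. The function names are hypothetical.

```python
def negamax(pile, tt, counter):
    """Value for the side to move: +1 win, -1 loss. Passing tt=None disables the table."""
    counter[0] += 1                       # count every node visited
    if pile == 0:
        return -1                         # the opponent took the last stone
    if tt is not None and pile in tt:
        return tt[pile]                   # transposition: reuse the earlier result
    best = max(-negamax(pile - take, tt, counter)
               for take in (1, 2, 3) if take <= pile)
    if tt is not None:
        tt[pile] = best
    return best


def count_nodes(pile, use_tt):
    """Return (value, nodes visited) for a search from the given pile size."""
    counter = [0]
    value = negamax(pile, {} if use_tt else None, counter)
    return value, counter[0]


if __name__ == "__main__":
    print("without TT:", count_nodes(15, use_tt=False))
    print("with TT:   ", count_nodes(15, use_tt=True))
```

Taking 1 then 2 stones reaches the same pile as taking 2 then 1, so the table turns an exponentially growing tree into a search that expands each pile size once.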

Problems of using a TT
1. To be useful, a TT should be large: 10^5 entries or better, or much better. This is unlikely to help the caching behaviour of your personal computer.
2. Code complexity: extra code is required to encode positions, probe the hash table, judge the reusability of results, and decide whether or not to overwrite in the case of hash-table collisions.
  • [1, 2] are generally considered to be prices worth paying
3. Search instability may result, when
  • an entry gives “extra depth for free”, then
  • gets overwritten with another entry for a different position, then
  • its position is revisited and searched again, but to lesser depth than before
4. Beware of the Graph History Interaction (GHI) problem
  • how you got to a position may affect what you may do thereafter

Graph History Interaction problem - GHI
• Recall that many games have special rules about repetition of situations.
• In some games, repetition is illegal (sometimes, as in Nine Men’s Morris; or always, as in Go with the “Superko” rule) and so should be scored as a loss
• In some games, repetition results in a draw (sometimes, as in Chess; or always), and so should be scored as
  • a draw, if you are involved in playing a game
  • a failure, if you are trying to achieve a checkmate or similar goal
  • a success, if you are trying to avoid a checkmate or similar

Two manifestations of GHI
• The value of some position P2 may be first computed according to a lead-up sequence in which some position P1 does occur and does affect the course of the game after P2;
  • then this value might not be valid if P2 is reached via another lead-up sequence not involving P1
• The value of some position P2 may be first computed according to a lead-up sequence in which some position P1 does not occur and so does not affect the course of the game after P2;
  • then this value might not be valid if P2 is reached via another lead-up sequence that does involve P1

GHI causing wrong values to be backed up
• Consider the AND-OR tree: (minimax trees are closely related)
[Figure: AND-OR graph with root a and nodes b, c, d, e, f, g, h. The search paths below use the edges a-b, a-c, b-d, b-e, c-f, e-h, e-g, f-h and h-e.]
• Suppose d is a loss, g is a win, and a repetition counts as “first-player-loss”, like not mating when far ahead.
  • Search a-b-e-h-e: store loss at h
  • Search a-b-d: store loss at b
  • Search a-c-f-h: stored loss at h is backed up to f and c
  • Now a is deemed a loss because b and c are both losses.
  • But the sequence a-c-f-h-e-g is a win!

GHI causing wrong values to be backed up
• Consider the AND-OR tree: (minimax trees are closely related)
[Figure: the same AND-OR graph, with root a and nodes b, c, d, e, f, g, h. The search paths below use the edges a-b, a-c, b-d, b-e, c-f, e-h, e-g, f-h and h-e.]
• Suppose d is a loss, g is a loss, and a repetition counts as “current-player-loss”, like not being allowed to repeat.
  • Search a-b-e-h: store win at h, since the opponent has no legal move from h
  • Search a-c-f-h: stored win at h is backed up to f and c
  • Now a is deemed a win because c is a win.
  • But a is a loss! a-b-d, a-c-f-h-e-g, a-c-f-h-e-h all lose.
• “A general solution to the graph history interaction problem”, Kishimoto & Mueller: http://www.cs.ualberta.ca/~kishi/pdf_file/AAAI04KishimotoA.pdf
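Both GHI examples can be replayed in a few lines. The sketch below rests on assumptions that are inferred from the backed-up values rather than stated on the slides: a, e and f are taken to be OR nodes (first player to move), b, c and h AND nodes, and children are searched in the order the slides visit them. The table is deliberately naive, keyed on the node alone, so that it ignores the path by which a node was reached. All names are hypothetical.

```python
GRAPH = {
    "a": ["b", "c"], "b": ["e", "d"], "c": ["f"],
    "e": ["h", "g"], "f": ["h"], "h": ["e"],
}
OR_NODES = {"a", "e", "f"}    # assumed: first player to move; b, c, h are AND nodes


def solve(node, path, tt, leaves, repetition_value):
    """Return True if the first player wins from node, given the path leading to it."""
    if node in leaves:
        return leaves[node]
    if node in path:
        # path[-1] is the node whose mover has just repeated a position
        return repetition_value(path[-1] in OR_NODES)
    if tt is not None and node in tt:
        return tt[node]                   # naive reuse: ignores how we got here
    results = (solve(child, path + [node], tt, leaves, repetition_value)
               for child in GRAPH[node])
    value = any(results) if node in OR_NODES else all(results)
    if tt is not None:
        tt[node] = value
    return value


def first_player_loss(first_player_repeated):
    return False                          # any repetition is a loss for the first player


def current_player_loss(first_player_repeated):
    return not first_player_repeated      # whoever repeats loses


SCENARIO_1 = ({"d": False, "g": True}, first_player_loss)
SCENARIO_2 = ({"d": False, "g": False}, current_player_loss)


def value_of_a(scenario, use_tt):
    leaves, rule = scenario
    return solve("a", [], {} if use_tt else None, leaves, rule)
```

In both scenarios `value_of_a(..., use_tt=True)` disagrees with the path-aware answer from `use_tt=False`: the value stored at h was correct only for the path on which it was first computed.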

Search enhancement: Killer Heuristic
• Alpha-beta search is more efficient with well-ordered moves.
• It often happens in practice that one player has a particularly good move available somewhere in the near future, good in several different situations.
• If such a “killer move” can be recognised by the search algorithm, and considered first when it is among the generated possible moves, alpha-beta is more likely to cut off quickly.
• Method:
  • at each ply in the search tree, keep a count of the number of times each possible move has caused a beta cutoff.
  • when a list of moves for a node is generated, consider first the move that has caused most cutoffs at that ply level.
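The method above amounts to a per-ply table of cutoff counts. The following is a minimal sketch with hypothetical names; moves are assumed to be hashable values such as strings, and the search itself is not shown.

```python
from collections import defaultdict


class KillerTable:
    def __init__(self):
        # ply -> move -> number of cutoffs that move has caused at that ply
        self.counts = defaultdict(lambda: defaultdict(int))

    def record_cutoff(self, ply, move):
        """Call this whenever move causes a cutoff at the given ply."""
        self.counts[ply][move] += 1

    def order(self, ply, moves):
        """Put the move with most cutoffs at this ply first; keep the rest in order."""
        counts = self.counts.get(ply, {})
        candidates = [m for m in moves if counts.get(m, 0) > 0]
        if not candidates:
            return list(moves)
        killer = max(candidates, key=lambda m: counts[m])
        return [killer] + [m for m in moves if m != killer]
```

A search would call `order` on the generated move list before iterating over it, and `record_cutoff` at the point where it breaks out of the move loop.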

Search enhancement: History Heuristic
• With similar reasoning, a move that has repeatedly proved good may also be good at other ply levels.
• Method:
  • Maintain (for each player separately) a table of moves that have caused beta cutoffs, recording the number of cutoffs for each move, regardless of ply depth
  • When generating moves for a node, sort them in descending order of the number of recorded cutoffs.
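A minimal sketch of the table described above, assuming two players labelled "white" and "black" and moves represented as hashable values; the names are hypothetical.

```python
from collections import Counter


class HistoryTable:
    def __init__(self):
        # One table per player; each maps a move to its number of recorded cutoffs
        self.cutoffs = {"white": Counter(), "black": Counter()}

    def record_cutoff(self, player, move):
        """Record a beta cutoff caused by move, regardless of ply depth."""
        self.cutoffs[player][move] += 1

    def order(self, player, moves):
        """Sort moves in descending order of recorded cutoffs (ties keep their order)."""
        table = self.cutoffs[player]
        return sorted(moves, key=lambda m: -table[m])
```

Unlike the killer table, the count is shared across all ply levels, so a cutoff found deep in the tree influences ordering near the root as well.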

Fuzzy move matching
• With the Killer Heuristic and the History Heuristic, it is in some games quite likely that other “similar” moves will also be good. In chess for example,
  • A particular move found successful might be white rook: b3-b7
  • This might be successful for various reasons:
    • Getting away from the b3 square (to avoid a threat, or to gain mobility)
    • Getting a white rook or queen to the 7th rank (to make an attack)
    • Getting a piece to the b7 square (to escape a check)
• Moves might be compared to known killer moves on the basis of strict identity, or by sharing one or more of the piece-from-to elements of the description.
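Comparison by shared piece-from-to elements could look like the sketch below. The `Move` record and the scoring (one point per shared element) are illustrative assumptions; looser matches such as "any rook or queen reaching the 7th rank" would need a richer description of a move.

```python
from typing import List, NamedTuple


class Move(NamedTuple):
    piece: str   # e.g. "R" for a white rook
    src: str     # starting square
    dst: str     # ending square


def similarity(move: Move, killer: Move) -> int:
    """Number of shared elements (piece, from, to); 3 means strict identity."""
    return sum(a == b for a, b in zip(move, killer))


def order_by_similarity(moves: List[Move], killer: Move) -> List[Move]:
    """Try the moves most similar to the killer first (ties keep their order)."""
    return sorted(moves, key=lambda m: -similarity(m, killer))
```

For the killer `Move("R", "b3", "b7")`, a move `Move("R", "b3", "b5")` scores 2 because it shares the piece and the starting square.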

Search enhancement: Singular Extensions
• Part of the rationale of minimax search is that a static evaluation of a position is more likely to be reliable as the game gets closer to its end.
• The idea of singular extensions is to use a very focused search several ply deeper in order to get a more reliable evaluation for midgame positions.
• In an alpha-beta search, when static evaluation of a position is required,
  • Pick the possible next move with the best static evaluation
  • Generate all possible successors of that “singular extension”
  • Repeat for some predetermined number of ply
  • Report the evaluation at the end of this playout sequence to the alpha-beta routine
• The host alpha-beta search must be carried out to lesser depth if singular extensions are pursued, since substantial move generation and evaluation is necessary.
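The procedure as described on this slide can be sketched as a greedy playout. The sketch assumes that `evaluate` scores a position from the first player's point of view (so the first player picks the maximum and the opponent the minimum) and that `successors` and `evaluate` are supplied by the game; both are hypothetical parameters, and this is not a definitive implementation of singular extensions.

```python
def extended_evaluation(position, maximizing, plies, successors, evaluate):
    """Follow the statically best move for up to `plies` ply, then evaluate.

    successors(position) returns the list of positions reachable in one move;
    evaluate(position) returns a static score from the first player's viewpoint.
    """
    for _ in range(plies):
        children = successors(position)
        if not children:
            break                              # no legal moves: stop extending
        pick = max if maximizing else min
        position = pick(children, key=evaluate)
        maximizing = not maximizing            # the other side moves next
    return evaluate(position)
```

The host search would call `extended_evaluation` wherever it would otherwise call `evaluate` directly, which is why each leaf costs `plies` rounds of move generation and evaluation.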
