
double AlphaBeta (state, depth , alpha, beta ) begin if depth <= 0 then


kenyon-hunt

Presentation Transcript


  1. double AlphaBeta(state, depth, alpha, beta)
     begin
       if depth <= 0 then
         return evaluation(state) // op pov
       for each action “a” possible from state
         nextstate = performAction(a, state)
         rval = -AlphaBeta(nextstate, depth-1, -beta, -alpha);
         if (rval >= beta) return rval;
         if (rval > alpha) alpha = rval;
       endfor
       return alpha;
     end
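
The pseudocode on this slide is the negamax form of alpha-beta. A minimal runnable sketch in Python might look as follows. The callback parameters `actions`, `perform_action` and `evaluation`, the toy tree, and the extra check for states without legal actions are assumptions added for illustration; they are not part of the slide.

```python
import math

def alpha_beta(state, depth, alpha, beta, actions, perform_action, evaluation):
    """Negamax alpha-beta: value of `state` for the player to move."""
    moves = list(actions(state))
    # Assumption beyond the slide: a state without legal actions is scored directly.
    if depth <= 0 or not moves:
        return evaluation(state)
    for a in moves:
        next_state = perform_action(a, state)
        # The child's value is from the opponent's view, so negate it and
        # swap/negate the window.
        rval = -alpha_beta(next_state, depth - 1, -beta, -alpha,
                           actions, perform_action, evaluation)
        if rval >= beta:
            return rval          # cutoff: the opponent will avoid this line
        if rval > alpha:
            alpha = rval         # new best value found so far
    return alpha

# Hypothetical toy game: a state is a nested list, a leaf is a number giving
# the value for the player to move at that leaf.
visited = []

def toy_actions(state):
    return range(len(state)) if isinstance(state, list) else []

def toy_perform(a, state):
    return state[a]

def toy_eval(state):
    visited.append(state)        # record which leaves were evaluated
    return state

TREE = [[3, 5], [2, 9]]
value = alpha_beta(TREE, 2, -math.inf, math.inf,
                   toy_actions, toy_perform, toy_eval)
print(value, visited)
```

On this toy tree the leaf 9 is never evaluated: once the second subtree yields 2, it can no longer beat the 3 already secured in the first subtree.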

  2. Meta-Reasoning for Search
     • problem: with only one legal move (or one clear favourite), alpha-beta search will still generate a (possibly large) search tree
       • similar: symmetrical situations
     • idea: compute the utility of expanding a node before expanding it
     • meta-reasoning (reasoning about reasoning): reason about how to spend computing time
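
The simplest instance of the problem on this slide, a forced move, can be handled before any search starts. The sketch below covers only that special case, not a general utility-of-expansion estimate; `choose_move`, `legal_moves` and `search_value` are hypothetical names introduced here.

```python
def choose_move(state, legal_moves, search_value):
    """Pick a move, skipping the search entirely when the move is forced."""
    moves = list(legal_moves(state))
    if not moves:
        return None
    if len(moves) == 1:
        # Searching cannot change the decision, so spend no time on it.
        return moves[0]
    # Otherwise fall back to the (expensive) search for every candidate.
    return max(moves, key=lambda m: search_value(state, m))

# Usage with stand-in callbacks that record how often the search is invoked.
search_calls = []

def fake_search(state, move):
    search_calls.append(move)
    return move                  # pretend the move's number is its value

forced = choose_move("s", lambda s: [7], fake_search)
free = choose_move("s", lambda s: [1, 4, 2], fake_search)
```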

  3. Where We Are: Chess

  4. Deep Blue
     • algorithm:
       • iterative-deepening alpha-beta search, transposition table, databases incl. openings, grandmaster games (700000), endgames (all with 5 pieces, many with 6)
     • hardware:
       • 30 IBM RS/6000 processors
         • software search: at high level
       • 480 custom chess processors
         • hardware search: deep in the tree, move generation and ordering, position evaluation (8000 features)
     • average performance:
       • 126 million nodes/sec., 30 billion positions generated per move, search depth: 14 (but up to 40 plies)

  5. Samuel’s Checkers Program (1952)
     • learned an evaluation function by self-play (see: machine learning)
     • beat its creator after several days of self-play
     • hardware: IBM 704
       • 10kHz processor
       • 10000 words of memory
       • magnetic tape for long-term storage

  6. Chinook: Checkers World Champion
     • simple alpha-beta search (running on PCs)
     • database of 444 billion positions with eight or fewer pieces
     • problem: Marion Tinsley
       • world checkers champion for over 40 years
       • lost three games in all this time
     • 1990: Tinsley vs. Chinook: 20.5-18.5
       • Chinook won two games!
     • 1994: Tinsley retires (for health reasons)

  7. Backgammon
     • TD-GAMMON
       • search only to depth 2 or 3
       • evaluation function
         • machine learning techniques (see Samuel’s Checkers Program)
         • neural network
       • performance
         • ranked amongst the top three players in the world
         • program’s opinions have altered received wisdom

  8. Go
     • most popular board game in Asia
     • 19x19 board: initial branching factor 361
       • too much for search methods
     • best programs: Goemate/Go4++
       • pattern recognition techniques (rules)
       • limited search (locally)
     • performance: 10 kyu (weak amateur)

  9. A Dose of Reality: Chance
     • unpredictability:
       • in real life: normal; often external events that are not predictable
       • in games: add a random element, e.g. throwing dice, shuffling of cards
     • games with an element of chance are less of a “toy problem”

  10. Example: Backgammon
      • move:
        • roll a pair of dice
        • move pieces according to the result

  11. Search Trees with Chance Nodes
      • problem:
        • MAX knows its own legal moves
        • MAX does not know MIN’s possible responses
      • solution: introduce chance nodes
        • between all MIN and MAX nodes
        • with n children if there are n possible outcomes of the random element, each labelled with
          • the result of the random element
          • the probability of this outcome

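  12. Example: Search Tree for Backgammon
      • [Figure: search tree with alternating levels MAX (move), CHANCE (probability + outcome), MIN (move), CHANCE (probability + outcome); chance branches labelled 1/36 for 1-1, 1/18 for 1-2, ..., 1/18 for 5-6, 1/36 for 6-6]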
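
The branch labels of a backgammon chance node (1/36 for doubles such as 1-1 and 6-6, 1/18 for mixed rolls such as 1-2 and 5-6) can be enumerated directly. A small sketch, assuming two fair six-sided dice whose order does not matter; the function name `dice_outcomes` is made up for this example.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def dice_outcomes():
    """Map each distinct roll (low, high) of two fair dice to its probability."""
    outcomes = {}
    for low, high in combinations_with_replacement(range(1, 7), 2):
        # A double can occur in one way out of 36, a mixed roll in two ways.
        outcomes[(low, high)] = Fraction(1, 36) if low == high else Fraction(1, 18)
    return outcomes

rolls = dice_outcomes()
print(len(rolls), sum(rolls.values()))
```

Each chance node in the tree therefore has 21 children (6 doubles and 15 mixed rolls), and their probabilities sum to 1.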

  13. Optimal Decisions for Games with Chance Elements
      • aim: pick the move that leads to the best position
      • idea: calculate the expected value over all possible outcomes of the random element
        • expectiminimax value
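
The expectiminimax value named on this slide can be written recursively. The following is the usual textbook formulation rather than something taken from the slides; the names EMM, Utility, Result and P(r) are notational assumptions.

```latex
\[
\mathrm{EMM}(s) =
\begin{cases}
\mathrm{Utility}(s) & \text{if } s \text{ is a terminal state} \\
\max_{a}\ \mathrm{EMM}(\mathrm{Result}(s, a)) & \text{if } s \text{ is a MAX node} \\
\min_{a}\ \mathrm{EMM}(\mathrm{Result}(s, a)) & \text{if } s \text{ is a MIN node} \\
\sum_{r} P(r)\, \mathrm{EMM}(\mathrm{Result}(s, r)) & \text{if } s \text{ is a chance node}
\end{cases}
\]
```

Here a ranges over the legal moves in s, and r over the possible outcomes of the random element, each with probability P(r).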

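  14. Example: Simple Tree
      • [Figure: MAX root over two chance nodes; each chance node has outcomes with probabilities 0.9 and 0.1 leading to MIN nodes with values 2 and 3 (left) and 1 and 4 (right); leaves 2, 2, 3, 3, 1, 1, 4, 4]
      • left chance node: 0.9 × 2 + 0.1 × 3 = 2.1
      • right chance node: 0.9 × 1 + 0.1 × 4 = 1.3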
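
The two expected values on this slide can be reproduced with a short expectiminimax routine. This is a sketch under assumptions: the tuple-based node encoding is invented for the example, and the tree shape (a MAX root over two chance nodes, each leading to two MIN nodes) is one reading of the flattened figure.

```python
def expectiminimax(node):
    """Value of a game-tree node; leaves are plain numbers."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        # Expected value: children are (probability, subtree) pairs.
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Hypothetical encoding of the slide's tree.
left = ("chance", [(0.9, ("min", [2, 2])), (0.1, ("min", [3, 3]))])
right = ("chance", [(0.9, ("min", [1, 1])), (0.1, ("min", [4, 4]))])
root = ("max", [left, right])

print(round(expectiminimax(left), 6),
      round(expectiminimax(right), 6),
      round(expectiminimax(root), 6))
```

MAX prefers the left move because its expected value (about 2.1) exceeds that of the right move (about 1.3), even though the right subtree contains the largest leaf.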

  15. Complexity of Expectiminimax
      • time complexity: $O(b^m n^m)$
        • b: maximal number of possible moves
        • n: number of possible outcomes for the random element
        • m: maximal search depth
      • example: backgammon
        • average b is around 20 (but can be up to 4000 for doubles)
        • n = 21
        • a search depth of about three plies is feasible
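
To see why depth is so limited, the bound can be evaluated for the backgammon figures on this slide. This is a back-of-the-envelope sketch that treats b = 20 and n = 21 as constant at every level, which is an assumption; the helper name `expectiminimax_nodes` is made up.

```python
def expectiminimax_nodes(b, n, m):
    """Rough node count b^m * n^m for branching b, n chance outcomes, depth m."""
    return (b * n) ** m

for depth in range(1, 5):
    print(depth, expectiminimax_nodes(20, 21, depth))
```

Each extra ply multiplies the work by 420 under these assumptions: depth 3 gives roughly 74 million nodes, depth 4 already about 31 billion.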
