
COSC 4350 and 5350 Artificial Intelligence

Kasparov vs. Machine: Reflection on the Essence of Intelligence via Game Playing (Part 2). Dr. Lappoon R. Tang. Outline: enhancing Minimax search with Alpha Beta pruning; Kasparov vs. Deep Blue 2 and the lessons learned; open questions.


Presentation Transcript


  1. COSC 4350 and 5350 Artificial Intelligence Kasparov vs. Machine: Reflection on the Essence of Intelligence via Game Playing (Part 2) Dr. Lappoon R. Tang

  2. Outline • Enhancing Minimax search with Alpha Beta pruning • Kasparov vs. Deep Blue 2 – what lessons have we learned? • Open questions

  3. Readings • Section 6.3 • Section 6.4

  4. Last time … • But you may ask: what about the chess game? Can we use minimax to play chess as Deep Blue used it? • Good question • The game tree of chess has an average branching factor of 30 or more • Usually, a game can have 30 ply or more (i.e. a tree depth of 30+), which gives on the order of $30^{30} \approx 2.05 \times 10^{44}$ nodes! • Can minimax work on such a huge search space? Let’s find out more about it next time …
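
The node count on this slide can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a perfectly uniform tree (every node has exactly 30 children, every game lasts exactly 30 ply), and the function name `uniform_tree_leaves` is made up for the illustration.

```python
def uniform_tree_leaves(branching_factor: int, depth: int) -> int:
    """Leaf count of a tree in which every node has the same number of children."""
    return branching_factor ** depth


# Slide figures: branching factor of about 30, game length of about 30 ply.
chess_leaves = uniform_tree_leaves(30, 30)

# Print in scientific notation to compare with the order of magnitude on the slide.
print(f"{chess_leaves:.2e}")
```

The result is on the order of 10^44, which is why exhaustive minimax on chess is out of the question.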

  5. Review of Minimax search • [Figure: minimax game tree; the leaf nodes are end-game states]
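
As a reminder of what the figure on this slide depicts, here is a minimal sketch of plain minimax. The representation is an assumption made for the example: a game tree is a nested Python list, and a bare number is an end-game state carrying its utility for MAX. The sample tree is illustrative and is not taken from the slide.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of a nested-list game tree.

    A number is an end-game state (its utility for MAX); a list is an
    internal node whose elements are the states reachable in one move.
    """
    if not isinstance(node, list):
        return node  # end-game state: return its utility
    # Players alternate, so the children are evaluated for the other player.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Illustrative 2-ply tree: MAX moves at the root, MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # MIN backs up 3, 2 and 2; MAX then picks 3
```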

  6. Alpha beta pruning • A complete minimax search requires exploring every node down to the terminal states • Too inefficient • Idea: Use an algorithm that identifies which parts of the game tree can be ignored because computing the minimax values of the nodes in those parts will not affect the final choice made by MAX
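
The idea can be sketched as follows. This is a minimal illustration, not the code used in the lecture or by Deep Blue. It assumes a game tree given as a nested list whose leaves are utilities for MAX, and it records the leaves it actually evaluates in a `visited` list (a helper added only for this example) so that the pruning becomes visible.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax value of a nested-list game tree, computed with alpha-beta pruning.

    alpha: best value MAX can already guarantee on the path to this node.
    beta:  best value MIN can already guarantee on the path to this node.
    visited: optional list that collects every leaf actually evaluated.
    """
    if visited is None:
        visited = []
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # MIN already has a better option elsewhere: prune
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # MAX already has a better option elsewhere: prune
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves_seen = []
best = alphabeta(tree, visited=leaves_seen)
print(best, leaves_seen)
```

On this tree the second MIN node is abandoned after its first leaf (2), because MAX already has a move worth 3, so only 7 of the 9 leaves are evaluated while the value at the root is unchanged.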

  7. Alpha Beta Pruning Effectiveness • Effectiveness depends on node ordering • Worst Case ... • No advantage, because the node ordering allows no pruning. That is, complexity = $O(b^d)$. • Best Case ... • Happens when the best move is examined first at every node, so that the remaining siblings are pruned as early as possible: the $O(b^d)$ complexity of Minimax becomes $O(b^{d/2})$. • Don’t worry about the math; the important point is that … • Since $b^{d/2} = (\sqrt{b})^d$, this is the same as having a branching factor of $\sqrt{b}$ instead of $b$, thereby doubling the tree depth that can be explored • For example: Chess • Effective branching factor goes from ~35 to ~6. • Allows for a much deeper search given the same amount of time. • Expected Case ... • Empirical studies indicate an expected complexity of $O(b^{3d/4})$.
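
The arithmetic behind the chess example can be checked directly. The sketch below treats the three cases as effective branching factors of b, b^(1/2) and b^(3/4), and asks how deep a search can go within a fixed node budget. The budget of one billion nodes is an arbitrary assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def reachable_depth(node_budget: float, branching_factor: float) -> float:
    """Depth d such that branching_factor ** d equals node_budget."""
    return math.log(node_budget) / math.log(branching_factor)


b = 35             # approximate branching factor of chess (from the slide)
budget = 10 ** 9   # assumed node budget, purely illustrative

effective_best = math.sqrt(b)    # best case: O(b^(d/2)) behaves like sqrt(b)
effective_expected = b ** 0.75   # expected case: O(b^(3d/4))

plain_depth = reachable_depth(budget, b)
best_depth = reachable_depth(budget, effective_best)
expected_depth = reachable_depth(budget, effective_expected)

print(f"effective branching factor, best case: {effective_best:.2f}")
print(f"depth without pruning:  {plain_depth:.2f}")
print(f"depth, best ordering:   {best_depth:.2f}")
print(f"depth, expected case:   {expected_depth:.2f}")
```

With the same budget, the best case reaches exactly twice the depth of plain minimax, and the expected case reaches 4/3 of it.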

  8. Dealing with Limited Time • In real games, there is usually a time limit T on making a move • How do we take this into account?

  9. Dealing with Limited Time • Could we set a conservative depth-limit that guarantees we will find a move in time < T? • Example: Confine Minimax to explore up to a tree depth of 10 … • BUT there are problems …

  10. Dealing with Limited Time • Problem 1: What if we haven’t reached the terminal nodes to get their utility values? • Problem 2: Even if some of the terminal nodes have been reached, how do we select the minimax value for a level if some nodes at the level do not have minimax values?

  11. Evaluation function to the rescue (instead of using utility function) • We can solve both problems by combining alpha-beta pruning with a state evaluation function • Idea: Replace the terminal test with a cut-off test that stops the search when a certain depth limit has been reached (the next slide reuses the symbol T for this depth limit, which is distinct from the time limit T) • Instead of returning utility values from terminal nodes, one computes an estimate of the expected value of a non-terminal node (by using a heuristic function) • The purpose of a heuristic function is to estimate the quality of a resulting board configuration so that the “best move” can be made

  12. Cut-off Test • IF level(game_state) >= T, then return value_of(game_state) • Idea: For every game state: • Check if its level in the game tree >= T • If so, stop and treat the state as if it were a terminal game state • Evaluate its expected utility score from the state description and return it
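
The cut-off test can be sketched as a small change to minimax. Everything about the representation here is an assumption made for the example: each state is a dict with a heuristic `estimate` and a list of `children`, an end-game state has no children, and its `estimate` equals its true utility. Alpha-beta pruning is left out to keep the cut-off logic in focus.

```python
def leaf(utility):
    """End-game state: no successors, and the estimate is the true utility."""
    return {"estimate": utility, "children": []}


def cutoff_test(state, level, depth_limit):
    """Stop at the depth limit, or earlier if the game is already over."""
    return level >= depth_limit or not state["children"]


def depth_limited_minimax(state, depth_limit, level=0, maximizing=True):
    if cutoff_test(state, level, depth_limit):
        # Treat the state as if it were terminal and return its estimate.
        return state["estimate"]
    values = [
        depth_limited_minimax(child, depth_limit, level + 1, not maximizing)
        for child in state["children"]
    ]
    return max(values) if maximizing else min(values)


# Illustrative tree; the interior estimates (5 and 4) are made-up heuristics.
tree = {
    "estimate": 0,
    "children": [
        {"estimate": 5, "children": [leaf(3), leaf(12)]},
        {"estimate": 4, "children": [leaf(2), leaf(9)]},
    ],
}

shallow = depth_limited_minimax(tree, depth_limit=1)  # relies on the estimates
full = depth_limited_minimax(tree, depth_limit=2)     # reaches the end-game states
print(shallow, full)
```

With a depth limit of 1 the search trusts the heuristic estimates and returns 5; with a limit of 2 it reaches the end-game states and returns the true minimax value 3. The gap between the two shows how much depends on the quality of the evaluation function.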

  13. Cut-off Test (cont’d): depth-limited Minimax search • We would like to do Minimax on the full game tree, but we don’t have time, so we explore it only down to some manageable depth (the cutoff).

  14. Heuristic Evaluation Functions • Often called static evaluation heuristics. • Evaluate a board without knowing exactly where it will lead • Use it to estimate the probability of winning from that node • Example: a chess game state in which our own queen has been captured, but the opponent’s queen has not, is unlikely to lead to a winning state • Important qualities: • Must agree with the utility function f at the terminal states (i.e. h(end-game) = f(end-game)) • Must not take long to compute. • Should be accurate enough.

  15. What should the heuristic function return? Expected Value vs. Material Value • Approach 1: Expected (utility) value of a state = probability-weighted average of the different possible utility values a state can have • Example: (0.72 * +1) + (0.20 * -1) + (0.08 * 0) = 0.52 (52%) • Problem: Estimation of probabilities is difficult • Approach 2: Material value of a state = weighted linear combination of the different features of a state • Example: α*materialBalance + β*centerControl + γ* … where material balance = Value of white pieces - Value of black pieces (pawn = 1, rook = 5, queen = 9, etc). • Problems: 1) features are assumed to be independent of each other; 2) can be time consuming
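
Both approaches are easy to express in code. The sketch below is illustrative only: the pawn, rook and queen values come from the slide, the knight and bishop values (3 each) are an assumption based on the commonly quoted convention, and the feature weights and the sample position are invented for the example.

```python
# Pawn, rook and queen values are from the slide; N = B = 3 is an assumption.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def expected_value(outcomes):
    """Approach 1: probability-weighted average over (probability, utility) pairs."""
    return sum(probability * utility for probability, utility in outcomes)


def material_balance(white_pieces, black_pieces):
    """Value of white pieces minus value of black pieces (kings are not counted)."""
    white = sum(PIECE_VALUES[piece] for piece in white_pieces)
    black = sum(PIECE_VALUES[piece] for piece in black_pieces)
    return white - black


def linear_evaluation(features, weights):
    """Approach 2: weighted linear combination of the features of a state."""
    return sum(weights[name] * value for name, value in features.items())


# Approach 1, using the numbers from the slide: win 72%, lose 20%, draw 8%.
slide_example = expected_value([(0.72, +1), (0.20, -1), (0.08, 0)])

# Approach 2, on a made-up position where White is a queen up but a pawn down.
balance = material_balance("QRRPPP", "RRPPPP")
score = linear_evaluation(
    {"materialBalance": balance, "centerControl": 2},     # hypothetical features
    {"materialBalance": 1.0, "centerControl": 0.5},       # hypothetical weights
)
print(round(slide_example, 2), balance, score)
```

The first approach reproduces the 0.52 from the slide. The second approach is cheap to compute when the features are simple, but the weights have to be chosen by hand or tuned, and the sum ignores any interaction between features.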

  16. The Horizon Effect • Sometimes disaster lurks just beyond the search depth • The computer captures a queen, but a few moves later the opponent checkmates (i.e., wins) • The computer has a limited horizon; it cannot see that this significant event could happen • How does one avoid catastrophic losses due to “short-sightedness”? • Quiescence search • Secondary search • Basically, these are searches that look ahead a bit further to see if a disaster lies just beyond the horizon
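
The queen-capture scenario can be reproduced on a toy tree. The sketch below is a simplified, quiescence-style extension and not a full quiescence search (which would follow only capturing moves and use a stand-pat score): at the depth limit it keeps searching whenever a state is flagged as not quiet. The state layout, the `quiet` flag, the `max_extra` bound and all the numbers are assumptions made for the illustration.

```python
def node(estimate, children=(), quiet=True):
    """Hypothetical state: heuristic estimate, successors, and a 'quiet' flag."""
    return {"estimate": estimate, "children": list(children), "quiet": quiet}


def search(state, depth_limit, level=0, maximizing=True,
           extend_noisy=True, max_extra=4):
    """Depth-limited minimax that may look past the horizon at noisy states."""
    terminal = not state["children"]
    at_horizon = level >= depth_limit
    hard_stop = level >= depth_limit + max_extra   # never extend forever
    keep_going = extend_noisy and not state["quiet"] and not hard_stop
    if terminal or (at_horizon and not keep_going):
        return state["estimate"]
    values = [
        search(child, depth_limit, level + 1, not maximizing,
               extend_noisy, max_extra)
        for child in state["children"]
    ]
    return max(values) if maximizing else min(values)


# MAX can grab the queen (looks like +9) or play a quiet move (+1).
# Grabbing the queen allows a checkmate reply (-100) just beyond the horizon.
grab_queen = node(9, [node(-100)], quiet=False)
quiet_move = node(1, [node(1)])
root = node(0, [grab_queen, quiet_move])

short_sighted = search(root, depth_limit=1, extend_noisy=False)
with_extension = search(root, depth_limit=1)
print(short_sighted, with_extension)
```

Without the extension, the search values the position at 9 and takes the queen. With it, the checkmate is discovered, the capture is valued at -100, and the quiet move worth 1 is preferred.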

  17. Computers and Grand Master Chess “Deep Blue 2” (IBM) • Parallel processor, 32 node cluster • Each node has 8 dedicated VLSI “chess chips” • Can search 250 million configurations/second, whereas an average human being can search at most 30 moves/second? • Uses minimax, alpha-beta pruning, sophisticated heuristics • Currently it can search up to 14 plies (i.e. 7 pairs of moves) • Can avoid the horizon effect by searching as deep as 40 plies • Uses book moves

  18. Computers and Grand Master Chess Kasparov vs. Deep Blue 2, May 1997 • 6 game full-regulation chess match sponsored by the Association for Computing Machinery (ACM) • Kasparov lost the match 2.5 to 3.5: he won 1 game, Deep Blue won 2, and 3 were drawn • This was a historic achievement: the first time a computer defeated a reigning world chess champion in a match played under standard tournament conditions

  19. Kasparov vs. Deep Blue 2: Lessons learned • Minimax search and alpha beta pruning are not “intelligent” by themselves; they are just mathematical calculations • There is arguably no observable intelligence at a microscopic level • However, an extremely large number of seemingly unintelligent steps organized to achieve a well-defined purpose can produce intelligent behavior at a macroscopic level • Deep Blue 2 can see more board configurations in one second than all the moves a person can see in a lifetime • “Quantity becomes quality” (G. Kasparov)

  20. How far can we go with such an approach? • Checkers • Current world champion is Chinook (a computer system) • Othello • Computers can easily beat the world experts • Go • Branching factor b ~ 360 (very large!) • $2 million prize for any system that can beat a world expert

  21. Open questions • Currently, brute force search + a big machine can amount to a sufficient level of computational intelligence that beats human intelligence on some non-trivial board games (e.g. chess and checkers) • Such an approach still won’t work for more complicated games with a crazy branching factor (e.g. Go) • Open question: Can we further combine human intelligence and raw computational power to build a game playing machine that “plays like a God”? • It can beat any system that relies purely on computational intelligence • It can also beat any human expert
