
Solving Probabilistic Combinatorial Games

Ling Zhao & Martin Mueller, University of Alberta, September 7, 2005. Paper link: http://www.cs.ualberta.ca/~zhao/PCG.pdf


Presentation Transcript


  1. Solving Probabilistic Combinatorial Games Ling Zhao & Martin Mueller University of Alberta September 7, 2005 Paper link: http://www.cs.ualberta.ca/~zhao/PCG.pdf

  2. Motivations • Ken Chen’s previous work on maximizing the winning chance in Go. • Maximizing points vs. maximizing winning probability. • How to solve the abstract game efficiently or play the abstract game well?

  3. Motivations • Results (Black plays first): +15 (80%), -7 (20%)

  4. Combinatorial Games

  5. Probabilistic Combinatorial Games • A terminal node, expressed as a probability distribution d = {[p1, v1], [p2, v2], …, [pn, vn]}, is a PCG. • If A1, A2, …, An, B1, B2, …, Bn are PCGs, then {A1, A2, …, An | B1, B2, …, Bn} is a PCG, where A1, …, An are the left options and B1, …, Bn are the right options. • A sum of PCGs is a PCG. • A move in a sum game consists of a move to an option in exactly one subgame and leaves all other subgames unchanged.

  6. Simple PCG (SPCG) • Each PCG has exactly one left option and one right option. • Each option leads immediately to a terminal node. • Each distribution has exactly 2 values with associated probabilities.

  7. Problems to Address • How to solve PCGs efficiently? • How to play PCGs well if resources are limited or fast play is required?

  8. Game Tree Analysis • Very regular game tree: with n subgames, a node at depth k has exactly n-k children, so there are n!/(n-k)! nodes in total at depth k. • Very large number of transpositions: only C(n, k) * C(k, k/2) of the nodes at depth k are distinct.
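The two counts on this slide can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes that k/2 is rounded down for odd k (rounding up gives the same binomial coefficient).

```python
from math import comb, factorial


def nodes_at_depth(n: int, k: int) -> int:
    """Number of move sequences of length k: n * (n-1) * ... * (n-k+1)."""
    return factorial(n) // factorial(n - k)


def distinct_nodes_at_depth(n: int, k: int) -> int:
    """Distinct positions at depth k.

    Choose which k subgames have been played, then which k//2 of them
    were played by one of the two sides (assumed rounding for odd k).
    """
    return comb(n, k) * comb(k, k // 2)


if __name__ == "__main__":
    n = 14  # number of subgames used in the experiments
    for k in range(n + 1):
        print(k, nodes_at_depth(n, k), distinct_nodes_at_depth(n, k))
```

For n = 14 the bottom level has 14! = 87,178,291,200 move sequences but only C(14, 7) = 3,432 distinct terminal positions, which is why a transposition table pays off.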

  9. Terminal Node Evaluation • A terminal node is a sum of probability distributions. • Winning probability
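The formula for the winning probability is missing from the transcript. As a hedged illustration, the sketch below evaluates a terminal node exactly by enumerating every combination of outcomes. It assumes that a "win" means the summed value is strictly greater than a threshold of 0; that threshold, the function name, and the `(probability, value)` pair layout are assumptions, not taken from the slides.

```python
from itertools import product


def exact_win_probability(distributions, threshold=0.0):
    """Probability that the sum of one draw per distribution exceeds threshold.

    Each distribution is a list of (probability, value) pairs. With two
    values per distribution this enumerates 2**n outcomes, so the cost
    grows exponentially in the number of subgames.
    """
    total = 0.0
    for outcome in product(*distributions):
        prob = 1.0
        value = 0.0
        for p, v in outcome:
            prob *= p
            value += v
        if value > threshold:  # assumed win condition
            total += prob
    return total


if __name__ == "__main__":
    dists = [
        [(0.5, 10), (0.5, -10)],
        [(0.8, 15), (0.2, -7)],  # the +15 (80%), -7 (20%) example
    ]
    print(exact_win_probability(dists))
```

In this two-distribution example only the outcome -10 + (-7) loses, so the result is about 0.9.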

  10. Monte-Carlo Terminal Evaluation (MCTE) • Use Monte-Carlo methods to randomly collect k samples from the 2^n data points in the sum of n distributions. • Use the average winning percentage of the samples (P’w) to approximate the overall winning probability (Pw). • Theory from statistics: Pw - P’w is approximately normally distributed, with mean 0 and standard deviation $\sqrt{P_w (1 - P_w) / k} \le 1 / (2\sqrt{k})$. • Experimental results:
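A minimal sketch of the sampling idea, under the same assumptions as before: distributions are lists of `(probability, value)` pairs and a sample counts as a win when its sum is strictly positive. The function name `mcte` and its parameters are hypothetical.

```python
import random


def mcte(distributions, k, rng, threshold=0.0):
    """Estimate the winning probability from k random samples.

    Each sample draws one value from every distribution and counts as
    a win if the sum exceeds threshold (assumed win condition).
    """
    wins = 0
    for _ in range(k):
        total = 0.0
        for dist in distributions:
            values = [v for _, v in dist]
            weights = [p for p, _ in dist]
            total += rng.choices(values, weights=weights)[0]
        if total > threshold:
            wins += 1
    return wins / k


if __name__ == "__main__":
    dists = [
        [(0.5, 10), (0.5, -10)],
        [(0.8, 15), (0.2, -7)],
    ]
    rng = random.Random(2005)  # fixed seed so runs are repeatable
    print(mcte(dists, 10_000, rng))
```

Because the estimate is a sample proportion, its standard deviation is at most 1/(2·sqrt(k)), so k = 10,000 samples keep it within roughly 0.005 of the exact value in a typical run.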

  11. Monte-Carlo Interior Evaluation (MCIE) • The evaluation of a node is approximated by averaging the values of terminal nodes reached from it through random play. • Proposed by Abramson in 1990, using 4x4 tic-tac-toe and 6x6 Othello for experiments. • Applied to Monte-Carlo Go by Bouzy and Helmstetter and several other researchers.
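The random-play averaging can be sketched for an SPCG as follows. This is an illustration under stated assumptions, not the authors' implementation: a subgame is a hypothetical `(left_option, right_option)` pair of distributions, players alternate, each move converts one unplayed subgame into the mover's option, and a win is a strictly positive sum.

```python
import random
from itertools import product


def win_probability(distributions):
    """Exact probability that the sum of draws is > 0 (assumed win rule)."""
    total = 0.0
    for outcome in product(*distributions):
        prob, value = 1.0, 0.0
        for p, v in outcome:
            prob *= p
            value += v
        if value > 0:
            total += prob
    return total


def mcie(subgames, chosen, remaining, left_to_move, playouts, rng):
    """Average terminal value over random playouts from an interior node.

    chosen: distributions already fixed by earlier moves.
    remaining: indices of subgames not yet played.
    """
    total = 0.0
    for _ in range(playouts):
        order = list(remaining)
        rng.shuffle(order)  # uniformly random move sequence
        dists = list(chosen)
        left = left_to_move
        for i in order:
            dists.append(subgames[i][0] if left else subgames[i][1])
            left = not left  # players alternate
        total += win_probability(dists)
    return total / playouts


if __name__ == "__main__":
    subgames = [
        ([(1.0, 5), (0.0, 0)], [(1.0, -5), (0.0, 0)]),
        ([(0.5, 10), (0.5, -10)], [(1.0, -1), (0.0, 0)]),
    ]
    rng = random.Random(7)
    print(mcie(subgames, [], [0, 1], True, 200, rng))
```

Terminal nodes are evaluated exactly here to keep the sketch short; a sampled terminal evaluation could be substituted for `win_probability`.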

  12. SPCG Solver and Player • Solver: alpha-beta search, transposition tables, move ordering (MCIE & MCTE). • Player: alpha-beta search to a given depth, using Monte-Carlo interior evaluation for frontier nodes.
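To illustrate the role of the transposition table, the sketch below solves a small SPCG with plain minimax plus memoization. It is deliberately simpler than the solver described on the slide: alpha-beta pruning and Monte-Carlo move ordering are omitted, and the position encoding (two bitmasks), the `(left_option, right_option)` subgame layout, and the "sum > 0 is a win for Left" rule are all assumptions.

```python
from functools import lru_cache
from itertools import product


def solve(subgames, left_first=True):
    """Left's winning probability under optimal play by both sides."""
    n = len(subgames)
    full = (1 << n) - 1

    def terminal_value(left_mask):
        # Bit i set means Left played subgame i; otherwise Right did.
        dists = [subgames[i][0] if (left_mask >> i) & 1 else subgames[i][1]
                 for i in range(n)]
        total = 0.0
        for outcome in product(*dists):
            prob, value = 1.0, 0.0
            for p, v in outcome:
                prob *= p
                value += v
            if value > 0:  # assumed win condition for Left
                total += prob
        return total

    @lru_cache(maxsize=None)  # acts as the transposition table
    def value(left_mask, right_mask):
        played = left_mask | right_mask
        if played == full:
            return terminal_value(left_mask)
        moves_made = bin(played).count("1")
        left_to_move = (moves_made % 2 == 0) == left_first
        children = []
        for i in range(n):
            if (played >> i) & 1:
                continue  # subgame i already played
            if left_to_move:
                children.append(value(left_mask | (1 << i), right_mask))
            else:
                children.append(value(left_mask, right_mask | (1 << i)))
        return max(children) if left_to_move else min(children)

    return value(0, 0)


if __name__ == "__main__":
    subgames = [
        ([(1.0, 5), (0.0, 0)], [(1.0, -5), (0.0, 0)]),
        ([(0.5, 10), (0.5, -10)], [(1.0, -1), (0.0, 0)]),
    ]
    print(solve(subgames, left_first=True), solve(subgames, left_first=False))
```

Keying the cache on which subgames each side has played means that different move orders reaching the same position are evaluated only once.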

  13. Experimental Results • 100 randomly generated games, each with 14 subgames. • Value distribution: probability from 0 to 1, value from -1000 to 1000. • AMD 2400 MHz CPUs. • 2^20 cache entries (terminal nodes have higher priority). • About 8 seconds to solve a game.

  14. Solver Performance • Monte-Carlo move ordering: dm is the depth limit up to which Monte-Carlo move ordering is used; beyond that depth, the history heuristic is used. • Monte-Carlo interior evaluation: nt is the percentage of the current node’s descendant terminal nodes that are sampled. • Monte-Carlo terminal evaluation: nc is the number of data points sampled. • Accurate (exact) terminal evaluation accounts for 90% of the overall running time.

  15. Solver Performance: Results

  16. Monte-Carlo Player • Test against the perfect player. • Each game has two rounds: each side plays first once. • Winning probability is the average of the two rounds. • Parameters: search depth and nc.

  17. Monte-Carlo Player: Results

  18. Error in Move Ordering • Example: for moves A and B, the actual winning probabilities are 0.18 < 0.19, but the estimates are 0.32 > 0.30, so the estimate prefers A and the winning probability lost is 0.01. • Average probability error: the average of the winning probability lost over all move pairs of a node. • Worst probability error: the winning probability lost because the wrong best move was chosen.
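The two error measures can be written down as a short sketch. The interpretation is an assumption based on the slide's wording: a move pair loses probability only when the estimate ranks it in the wrong order, and all values are taken from the point of view of the player who wants to maximize. The function names are hypothetical.

```python
from itertools import combinations


def average_probability_error(actual, estimate):
    """Mean winning probability lost over all move pairs of a node."""
    pairs = list(combinations(range(len(actual)), 2))
    lost = 0.0
    for i, j in pairs:
        # The move the estimate prefers within this pair.
        picked, other = (i, j) if estimate[i] >= estimate[j] else (j, i)
        # Nothing is lost when the preferred move really is at least as good.
        lost += max(0.0, actual[other] - actual[picked])
    return lost / len(pairs)


def worst_probability_error(actual, estimate):
    """Probability lost by playing the move with the best estimate."""
    picked = max(range(len(actual)), key=lambda m: estimate[m])
    return max(actual) - actual[picked]


if __name__ == "__main__":
    actual = [0.18, 0.19]    # moves A and B from the slide
    estimate = [0.32, 0.30]
    print(average_probability_error(actual, estimate))
    print(worst_probability_error(actual, estimate))
```

With the slide's two moves, both measures come out at about 0.01, since there is only one pair and it is ranked incorrectly.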

  19. Results

  20. Conclusions • Efficient exact and heuristic solvers for SPCG. • Monte-Carlo move ordering was successfully incorporated into alpha-beta search for SPCG. • A heuristic evaluation technique based on Monte-Carlo, with performance close to the perfect player. • Extensive experiments for the two solvers.

  21. Future Work • Better algorithms to accurately evaluate terminal nodes? • Progressive pruning. • Why does Abramson’s simple Expected Outcome model perform so well in move ordering?

  22. New Directions to Apply Monte-Carlo to Computer Go [Diagram with parts labeled "Monte-Carlo Go" and "New Direction"]

  23. Future Work: Transformation [Diagram with two open items marked "?" and a box labeled "PCG solver or player"]
