
Poker for Fun and Profit (and intellectual challenge)

Robert Holte, Computing Science Dept., University of Alberta


Presentation Transcript


  1. Poker for Fun and Profit (and intellectual challenge) • Robert Holte, Computing Science Dept., University of Alberta

  2. Poker

  3. World Series of Poker

  4. Poker Research Group – core • Darse Billings (Ph.D.) • Aaron Davidson (M.Sc.), Poki • Neil Burch (P/A), PsOpti • Terence Schauenberg (M.Sc.), Adapti • Advisors: J Schaeffer, D Szafron

  5. Poker Research Group – new arrivals • Bret Hoehn (M.Sc.) • Finnegan Southey (postdoc) • Michael Bowling • Dale Schuurmans • Rich Sutton • Robert Holte

  6. Our Goal

  7. PsOpti2 vs. “theCount”

  8. Play Us Online http://games.cs.ualberta.ca/poker/

  9. Poki’s Poker Academy • http://poki-poker.com

  10. Poker Variants • Many different variants of poker • Texas Hold’em the most skill-testing • No-Limit Texas Hold’em used to determine the world champion • Our research: Limit Texas Hold’em • Current focus: 2-player (heads up)

  11. 2-player, limit, Texas Hold’em game tree, O(10^18) [Figure] • Initial: 1,624,350 deals of 2 private cards to each player, then a bet sequence (9 of 19) • Flop: 17,296 deals of 3 community cards, then a bet sequence (9 of 19) • Turn: 45 deals of 1 community card, then a bet sequence (9 of 19) • River: 44 deals of 1 community card, then a bet sequence (19)
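
The counts on this slide follow from simple combinatorics. The sketch below is a rough check, assuming that "9 of 19" means 9 of the 19 bet sequences in a round continue to the next round. The product counts only complete root-to-leaf paths, so it gives an order of magnitude and not the exact state count quoted later in the talk.

```python
from math import comb

# Chance outcomes per round in 2-player Texas Hold'em.
private_deals = comb(52, 2) * comb(50, 2)  # 2 private cards to each player
flops = comb(48, 3)                        # 3 community cards
turns = 45                                 # 1 community card from the 45 unseen
rivers = 44                                # 1 community card from the 44 unseen

# Assumption: 9 of the 19 bet sequences in a round continue to the next round.
continuing, all_sequences = 9, 19

paths = (private_deals * continuing * flops * continuing
         * turns * continuing * rivers * all_sequences)

print(private_deals)   # 1624350
print(flops)           # 17296
print(f"{paths:.2e}")  # just under 10^18
```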

  12. Research Issues • Chance events • Imperfect information • Sheer size of the game tree • Opponent modelling is crucial • How best to use domain knowledge? • Experimental method • Variants have even more challenges: • More than 2 players (up to 10) • “No limit” (bet any amount)

  13. Issues: Chance Events • Utility of outcomes • currently just reason about expected payoff • short-term vs. long-term • High variance • was the outcome due to luck or skill? • experiment design
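
The luck-versus-skill question can be made concrete with a standard-error calculation. This is a rough sketch only: it assumes independent hands, and the per-hand standard deviation and edge used in the example are hypothetical values, not figures from the talk.

```python
from math import ceil

def hands_needed(std_per_hand: float, edge_per_hand: float, z: float = 1.96) -> int:
    """Hands needed before the mean win rate sits z standard errors above zero.

    The standard error of the mean over n hands is std_per_hand / sqrt(n),
    so we need edge_per_hand >= z * std_per_hand / sqrt(n).
    """
    return ceil((z * std_per_hand / edge_per_hand) ** 2)

# Hypothetical numbers, in small bets per hand.
print(hands_needed(std_per_hand=6.0, edge_per_hand=0.05))
```

With a small edge and a large per-hand spread, the required sample runs to tens of thousands of hands, which is why experiment design matters here.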

  14. Issues: Imperfect Information • Probabilistic strategies are essential • Cannot construct your strategy in a bottom-up manner, as is done with perfect information games
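
Why probabilistic strategies are essential can be seen in a much smaller game. The sketch below uses rock-paper-scissors as a stand-in (it is not part of the talk) and computes what a strategy earns against an opponent who knows it and best-responds.

```python
# Payoff to the row player in rock-paper-scissors (rows/cols: R, P, S).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def worst_case_value(strategy):
    """Expected payoff of a mixed row strategy against a best-responding opponent."""
    column_values = [
        sum(p * PAYOFF[r][c] for r, p in enumerate(strategy))
        for c in range(3)
    ]
    return min(column_values)

print(worst_case_value([1.0, 0.0, 0.0]))    # always rock: loses every time
print(worst_case_value([1/3, 1/3, 1/3]))    # uniform mix: breaks even
```

Any deterministic choice is fully exploitable once it is known, while the uniform mix cannot be beaten in expectation.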

  15. Issues: Size of the game • 2-player, Limit, Texas Hold’em game tree has about 10^18 states • Linear Programming can solve games with 10^8 states

  16. Issues: Opponent Modelling • Nash equilibrium not good enough • Static • Defensive • Even the best humans have weaknesses that should be exploited • How to learn very quickly, with very noisy information? • Exploitation vs. exploration • How not to be exploited yourself?

  17. Issues: Using Expert Knowledge • We are fortunate to have unlimited access to a poker-playing expert (Darse) • How best to use his knowledge? • Expert system (explicitly encoded knowledge) was not effective • Used his knowledge to devise abstractions that reduced the game size with minimal impact on strategic aspects of the game • Use him to evaluate the system

  18. Experimental Method • High variance • ‘bot play not the same as human play • Very limited access to expert humans other than our own expert

  19. Coping with very large games • Full game tree T: too big to solve • Abstraction (lossy): T → abstract game tree T* • Solve T* (LP) → strategy for T* • Reverse mapping: strategy for T* → strategy for T

  20. Abstraction • Texas Hold'em 2-player game tree is too big for current LP solvers (1,179,000,604,565,715,751) • Many ways of doing the abstractions • We require coarse-grained abstractions • Avoiding a severe loss of accuracy • Abstract to a set of smaller problems of about 10^8 states, 10^6 equations and unknowns

  21. Alternate Game Structures • Truncation of betting rounds • Bypassing betting rounds • Models with 3 rounds, 2 rounds, or 1 round • Many-to-one mapping of game-tree nodes to single nodes in the abstract game tree • How you do the mapping determines the overall accuracy (few good and many bad mappings) • This is the limiting factor of the method

  22. 3-round Model (expected value leaf nodes) [Figure: the Texas Hold'em game tree, O(10^18), with rounds Initial (1,624,350), Flop (17,296), Turn (45) and River (44), each followed by a bet sequence (9 of 19; 19 on the last round), with the span of the 3-round model marked]

  23. 1-round Preflop Model and 3-round Postflop Model (single flop) [Figure: the Texas Hold'em game tree, O(10^18), with rounds Initial (1,624,350), Flop (17,296), Turn (45) and River (44), each followed by a bet sequence (9 of 19; 19 on the last round), with the spans of the two models marked]

  24. Abstractions • Board: Q–7–2 • Compare: 1. A–3, 2. A–4, 3. A–K • Suit isomorphism (24×) (exact) • Rank near-equivalence (small error) • Bucketing: hands are mapped to a small set of buckets depending on • Current hand strength • Potential for improvement in hand strength
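
The 24× factor for suit isomorphism is the number of ways to relabel four suits (4! = 24). A minimal sketch of the idea follows. It treats a holding as an unordered set of (rank, suit) pairs, which is a simplification: a real Hold'em implementation would also have to keep private and community cards distinct. The function name and card encoding are illustrative choices, not taken from the talk.

```python
from itertools import permutations

SUITS = "cdhs"

def canonical(cards):
    """Return the lexicographically smallest relabelling of a set of cards.

    Each card is a (rank, suit) pair such as ("A", "s"). All 4! = 24 suit
    permutations are tried, so suit-isomorphic holdings get the same key.
    """
    best = None
    for perm in permutations(SUITS):
        relabel = dict(zip(SUITS, perm))
        key = tuple(sorted((rank, relabel[suit]) for rank, suit in cards))
        if best is None or key < best:
            best = key
    return best

print(len(list(permutations(SUITS))))  # 24
a = canonical([("A", "s"), ("K", "s")])
b = canonical([("A", "h"), ("K", "h")])
print(a == b)  # True: both are "ace-king suited"
```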

  25. Bucketing • Reduce branching factor at chance nodes • Partition hands into six classes per player • Overlaying strategically similar sub-trees [Figure: original bucketing (1,1 1,2 1,3 … 6,6) connected by transition probabilities to next-round bucketing (1,1 1,2 1,3 … 6,6)]
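
Six classes per player give 6 × 6 = 36 joint bucket pairs at each chance node. The sketch below shows one simple way to assign buckets, assuming a single hand-strength score in [0, 1] and equal-width bins. Both are assumptions made for illustration; the talk's buckets also depend on a hand's potential for improvement.

```python
def bucket(hand_strength: float, n_buckets: int = 6) -> int:
    """Map a hand strength in [0, 1] to a bucket index 1..n_buckets (equal-width bins)."""
    if not 0.0 <= hand_strength <= 1.0:
        raise ValueError("hand_strength must lie in [0, 1]")
    return min(int(hand_strength * n_buckets) + 1, n_buckets)

# Joint chance outcomes in the abstract game: one bucket per player.
joint_buckets = [(a, b) for a in range(1, 7) for b in range(1, 7)]

print(bucket(0.10), bucket(0.55), bucket(1.0))  # 1 4 6
print(len(joint_buckets))                       # 36
```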

  26. Abstract Preflop Model, O(10^7), and Abstract Postflop Model, O(10^7) [Figure: the Texas Hold'em game tree, O(10^18), with rounds Initial (1,624,350), Flop (17,296), Turn (45) and River (44) and bet sequences 9 of 19 (19 on the last round), shown next to the abstract models, where the chance nodes are reduced to w^2, x^2, y^2, z^2 (36 each) and the bet sequences to 7 of 15 (15 on the last round)]

  27. Reverse Mapping • Bucket splitting • LP solution gives a strategy (recipe) • Each partition class split strong / weak • Split the randomized mixed strategy • {0, 0.2, 0.8} => {0, 0, 1.0} & {0, 0.4, 0.6} • Better hand selection (with some risk)
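
The split in the example can be reproduced with a short greedy procedure. This is a sketch under two assumptions not stated on the slide: the bucket is split into two equal halves, and the strong half is given the most aggressive actions first. Under those assumptions the two halves always average back to the original mixed strategy.

```python
def split_strategy(mixed):
    """Split a mixed strategy for one bucket into (strong, weak) halves.

    `mixed` lists action probabilities from least to most aggressive, e.g.
    [fold, call, raise]. The strong half greedily takes the most aggressive
    actions; the two halves average back to the original distribution.
    """
    remaining = [2 * p for p in mixed]   # total mass over two equal halves
    strong = [0.0] * len(mixed)
    capacity = 1.0                       # the strong half must sum to 1
    for i in reversed(range(len(mixed))):
        take = min(remaining[i], capacity)
        strong[i] = take
        remaining[i] -= take
        capacity -= take
    # Round away floating-point noise such as 0.6000000000000001.
    return ([round(p, 10) for p in strong], [round(p, 10) for p in remaining])

strong, weak = split_strategy([0.0, 0.2, 0.8])
print(strong, weak)  # [0.0, 0.0, 1.0] [0.0, 0.4, 0.6]
```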

  28. Putting It All Together – PsOpti1 [Figure: Preflop uses the Selby preflop model; Flop, Turn and River use four postflop models (Post), labelled by bets 2, 4, 6, 8]

  29. Putting It All Together – PsOpti2 [Figure: Preflop uses the 3-round preflop model (bets + model); Flop, Turn and River use seven postflop models (Post), labelled 2, 4, 4, 6, 6, 8, 8]

  30. Conclusions • Game Theory can be applied to large problems and practical systems • Nash Equilibrium (minimax) too defensive, does not exploit the opponent’s weaknesses • Current work involves opponent modelling • Preliminary results are very promising • We hope to beat the best poker players in the world in the near future
