240 likes | 469 Vues
VK Dice. By: Kenny Gutierrez, Vyvy Pham Mentors: Sarah Eichhorn , Robert Campbell. Rules. Variation of the game Sequences On each turn, a player rolls 6 dice Player is given option to reroll once but all 1s must be kept Larger sequences are worth more points
E N D
VK Dice By: Kenny Gutierrez, Vyvy Pham Mentors: Sarah Eichhorn, Robert Campbell
Rules • Variation of the game Sequences • On each turn, a player rolls 6 dice • Player is given option to reroll once but all 1s must be kept • Larger sequences are worth more points • Three or more 1s = Restart Score • Repeats of a certain number is counted once • Winner is the first player to reach 100
Scoring 1 – 5 points 1,2 –10 points 1,2,3 –15 points 1,2,3,4 –20 points 1,2,3,4,5 –25 points 1,2,3,4,5,6 –35 points
Objective • Optimal Strategy • When to Reroll • Which dice to keep/reroll • Computer Adaptive Learning Program simulate one million rolls for each run. Programmed to run 5 times simultaneously • Determined which actions repeated most frequently for all game states • Repeated actions of the computer are compared to the Expected Values for each game state.
Description • 6^6= 46,656 game states • 462 don’t include repetition • Different states are grouped into sections according to the same numbers, regardless of repetition
Probability for Game States • Sections are further divided into: one 1, two 1s, no 1s • The probability is the same within each section • Probability is calculated for every reroll option. • Game States without 1s (Reroll 1-6 Dice) • Game States with one 1 (Reroll 1-5 Dice) • Game States with two 1s (Reroll 1-4 Dice)
Cases For Each Reroll • Reroll 6 Dice (No 1s) • -Get 1,2,3,4,5,6, • -Get 1, 2,3,4,5 not 6 • -Get 1,2,3,4 not 5, not 1,1,1 • -Get 1,2,3 not 4, not 1,1,1 • -Get 1,2 not 3, not 1,1,1 • -Get 1 not 2, not 1,1,1 • Reroll 5 Dice (one 1) • -Get 2,3,4,5,6 • -Get 2,3,4,5 not 6 • -Get 2,3,4 not 5, not 1,1,1 • -Get 2,3 not 4, not 1,1,1 • Get 2, not 3, not 1,1,1
Ex. Of Calculating ProbabilityReroll 5 for Initial Game States Without 1s • Case A: Get 1,3,4,5,6 5! (1/6)5 Probability= 5/324= 1.54%Case B: Get 1,3,4,5 not 6 *1 3 4 5 (4) = 5! = 120*1 3 4 5 (1 or 3 or 4 or 5) = 4(5!/2!) = 240Probability= 350/6^5 = 5/108 = 4.63%Case C: Get 1,3,4 not 5, not 111 *1 3 4 (2,2) or (6,6) = 2(5!/2!) = 120 * 1 3 4 (2 and 6) = 5! = 120 *1 3 4 (2 or 6) (1 or 3 or 4) = 6(5!/2) = 360 *1 3 4 (4,4) or (3,3) = 2(5!/3!) = 40 *1 3 4 (1,3) or (1,4) or (4,3) = 3(5!/(2! 2!)) = 90Probability= 730/6^5 = 365/3888 = 9.39%Case D: Get 1, 3, not 4, not 1,1,1 *1 3 (2,2,2) or (5,5,5) or (6,6,6) = 3(5!/3!) = 60 *1 3 (2,5,6) = 5! = 120 *1 3 (2,2) or (5,5) or (6,6) (Different: 2 or 5 or 6) = 6(5!/2!) = 360 *1 3 (2,2) or (5,5) or (6,6) (1 or 3) = 6(5!/(2! 2!)) = 180 *1 3 (3,3,3) = 5!/4! = 5 *1 3 (3,3) (2 or 5 or 6) = 3(5!/3!) = 60 *1 3 (1,3) (2 or 5 or 6) = 3(5!/4) = 90 *1 3 (2,2) or (5,6) or (4,6) (1 or 3) = 6(5!/2!) = 360 *1 3 (1,3,3) = 5! (3! 2!) = 10 Probability = 1245/6^5 = 415/2592 = 16.0% Case E: Get 1, not 3, not 1,1,1 *1 (2,2,2,2) or (4,4,4,4) or (5,5,5,5) or (6,6,6,6) = 4(5!/(2! 2!)) = 20 *1 (2,2,2) or (4,4,4) or (5,5,5) or (6,6,6) (Different: 2,4,5,6) = 12(5/3!) = 240 *1 (2,4,5,6) = 5! = 120 *1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (6,2)(6,4)(4,5)(4,2)(6,5)(5,2)) = 12(5!/2!) = 720 *1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (4,4) (5,5) (6,6) (2,2) = 12(5!/(2! 2!)) = 360 *1 (1) (4,4,4) or (5,5,5) or (6,6,6) or (2,2,2) = 4(5!/(3! 2!)) = 40 *1 (1) (2,2) or (4,4) or (5,5) or (6,6) Differ(2,4,5,6)= 12(5!/ (2! 2!)) = 360 *1 (1) (2,4,5) or (2,4,6) or (4,5,6) = 3(5!/2!) = 180Probability = 2040/65 = 85/324 = 26.23%
Finding Expected Values • Sum of all possible values each multiplied by the probability of its occurrence • Example: [1,2,3,4,4,5] Legend Pink: Probability of each cases multiplied by score Yellow: Probability of getting the same score and not 1,1,1
Inside the Program • Runs five times • Each Run • 1,000,000 dice rolls • Prints computer’s actionsfor all game states • Learns based on result of each roll through reward and punish system
How Does It Work? • Set of six numbers for each initial game state. • Each number pertains to one of the six dice • Initially, each number in the list contains 50 • Program generates random number between 1-100 for each number. • In order to reroll a die, the random number must be between the range 1-(list of number) • Ex Game State: [1, 2, 3, 5, 5, 6] List: [55, -10, 32, 87, 98, 103] Random #s: [60, 39, 47, 37, 12, 18] Action: [Keep, Keep, Keep, Reroll, Reroll, Reroll]
Rewarding & Punishing • Reward: • Certain number of points based on score after re-roll IF final score > initial score • Increase probability of repeating that action by either adding or subtracting • Adds when the computer rerolled, subtracts when it kept dice • Punish: • Only when the re-rolls end with at least three 1s • Decrease probability to avoid that action by either adding or subtracting • Adds when the computer kept dice, subtracts when it rerolled. TABLE FOR PUNISH & REWARD
Rewarding & Punishing • Computer will never reroll 1, regardless of the number in the list • After subtracting and adding to each list, the numbers will eventually go into the negatives or above 100 • Negatives= Always Keep (N) • Over 100= Always Reroll (Y) • Between 1-100 • Undetermined (U)
Best Move Mechanic • Mechanism implemented to help computer learn the optimal strategy • Before keeping a die, the computer checks if there is a better option • Ex. [ 1, 2, 3, 5, 6, 6] • If it wants to keep two 6s, it will change to keep 2 and 3.
Comparing Program with Theoretical Probability • Examples of each Initial Game States: • Without any 1s • With one 1 • With two 1s • Adaptive learning program- used the actions of the dice most common out of the five runs
Initial State without 1s Example: [2, 3, 4, 4, 6, 6] Theoretical Expected Values • Optimal Move: • Reroll 3 dice; Keeping 2,3,4 Adaptive Learning Program • After 5 runs: • Most Common Move: • Reroll 3 dice; Keeping 2,3,4 Conclusion: Expected Values matched EXACTLY to the Adaptive Learning Program
Initial State With one 1 Example: [1, 2, 3, 3, 4, 6] Theoretical Expected Values • Optimal Move: • Reroll 1 Dice; Keep 1,2,3,4,6 Adaptive Learning Program • 5 Sample Runs: • Most Common Move: • Reroll 2 Dice; Keeping 1,2,3,4 Conclusion: The expected values and the results from the program were similar. The computer chose the 2nd best action
Initial State With two 1s Example: [1, 1, 2, 4, 5, 5] Theoretical Expected Values • Optimal Move: • Reroll 1 Die; Keeping 1,1,2,4,5 Adaptive Learning Program • After 5 runs: • Most Common Move: • Uncertain Conclusion: There is a high probability of rerolling a 1 so the move is undetermined and needs more runs
Conclusion • Expected Values were found for ALL game states • Adaptive Learning Program with 5 runs and created a list of actions for the 6 dice for every game state. • Most common move from the program were compared to the expected values for each game state • Program’s common moves were either the best or 2nd best action indicated by the expected values • Game states with double 1s
Acknowledgements • Sarah Eichhorn: • Helping with the probabilities of the different game states • Answering questions every step of the way • Robert Campbell: • Helping with the computer Program • Teaching us how to calculate expected values