
Evolving Heuristics for Searching Games



Presentation Transcript


  1. Evolving Heuristics for Searching Games

    Evolutionary Computation and Artificial Life. Supervisor: Moshe Sipper. Achiya Elyasaf. June, 2010
  2. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  3. Searching Games State-Graphs Representation Every puzzle/game can be represented as a state graph. In single-player games such as puzzles, board games, etc., every piece move can be counted as a different state. In multi-player games such as chess, Robocode, etc., the positions of the player and the enemy, together with the remaining parameters (health, shield, …), define a state.
  4. Searching Games State-Graphs Representation Rush Hour:
  5. Searching Games State-Graphs Representation Blocksworld:
  6. Searching Games State-Graphs Uninformed Search BFS – memory exponential in the search depth. DFS – memory linear in the length of the current search path, BUT we might “never” track down the right path, and games usually contain cycles. Iterative Deepening – a combination of BFS & DFS: each iteration performs a DFS with a depth limit, and the limit grows from one iteration to the next. Worst case – traverse the entire graph.
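The iterative-deepening scheme above can be sketched in a few lines; `neighbors` and `is_goal` are assumed callbacks and states are assumed hashable (a sketch, not the thesis code):

```python
def iterative_deepening(start, neighbors, is_goal, max_depth=50):
    """Repeated depth-limited DFS with a growing limit (IDDFS).

    `neighbors(state)` yields successor states and `is_goal(state)`
    tests for a solution -- both are assumed callbacks.  Cycles are
    cut only along the current path, as in plain DFS.
    """
    def dls(state, limit, path):
        if is_goal(state):
            return path
        if limit == 0:
            return None
        for nxt in neighbors(state):
            if nxt in path:              # skip cycles on the current path
                continue
            found = dls(nxt, limit - 1, path + [nxt])
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):
        result = dls(start, limit, [start])
        if result is not None:
            return result                # shortest path in moves, like BFS
    return None
```

Like BFS, the first solution found is shortest in moves, but the memory used is only that of the current DFS path.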
  7. Searching Games State-Graphs Uninformed Search Most game domains are PSPACE-complete! Worst case – traverse the entire graph. We need an informed search!
  8. Searching Games State-Graphs Heuristics h: states → ℝ. For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution. If h is perfect, an informed search that tries states with the best (lowest) h-score first will simply stroll to a solution. A bad heuristic means the search might never reach an answer. For hard problems, finding a good h is itself hard. We need a good heuristic function to guide informed search.
  9. Searching Games State-Graphs Informed Search (Cont.) IDA*: Iterative Deepening with A*. The expanded nodes are pushed onto the DFS stack in descending order of heuristic value, so the most promising node is popped first. Let g(s) be the depth of state s along the current path: only nodes with f(s) = g(s) + h(s) < depth-limit are visited. Yields a near-optimal solution (depending on the path limit); the heuristic needs to be admissible.
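A minimal IDA* sketch following the f = g + h bound above, assuming unit edge costs and hypothetical `neighbors`, `is_goal`, and `h` callbacks:

```python
def ida_star(start, neighbors, is_goal, h):
    """IDA*: DFS bounded by f = g + h; after each failed pass the bound
    is raised to the smallest f-value that exceeded it.  With an
    admissible `h` the returned path is optimal (unit edge costs)."""
    bound = h(start)
    path = [start]

    def search(g):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f                     # report the overflowing f-value
        if is_goal(state):
            return True
        minimum = float('inf')
        for nxt in neighbors(state):
            if nxt in path:              # prune cycles on the current path
                continue
            path.append(nxt)
            t = search(g + 1)
            if t is True:
                return True              # leave the solution on `path`
            minimum = min(minimum, t)
            path.pop()
        return minimum

    while True:
        t = search(0)
        if t is True:
            return list(path)
        if t == float('inf'):
            return None                  # search space exhausted
        bound = t                        # next iteration's depth limit
```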
  10. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  11. Rush Hour Domain-Specific Heuristics GP-Rush [Hauptman et al., 2009] Hand-crafted heuristics: Goal distance – Manhattan distance. Blocker estimation – a lower bound on the number of blocking cars (admissible). Hybrid blockers distance – combines the two above. Is Move To Secluded – did the car enter a secluded area. Is Releasing Move.
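The blocker-estimation idea can be illustrated as follows; the board encoding (a dict mapping each car to its occupied cells, red car 'X' horizontal on the exit row, exit on the right edge) is an assumption for the sketch, not GP-Rush's actual representation. It is a lower bound because the red car must make at least one move, and so must every car blocking its lane:

```python
def blockers_lower_bound(board):
    """Admissible estimate: 1 move to slide the red car out, plus at
    least 1 move per distinct car standing between it and the exit.

    `board` maps each car to its occupied (row, col) cells; the red
    car 'X' lies horizontally on the exit row, exit on the right edge
    (hypothetical encoding for illustration).
    """
    red_row = board['X'][0][0]
    red_right = max(col for _, col in board['X'])
    blockers = {car
                for car, cells in board.items() if car != 'X'
                for row, col in cells
                if row == red_row and col > red_right}
    return 1 + len(blockers)
```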
  12. GA/GP Given building blocks H1, …, Hn, how should we choose the fittest heuristic? Minimum? Maximum? A linear combination? GA/GP may be used for: building new heuristics from existing building blocks; finding weights for each heuristic (for applying a linear combination); finding conditions for applying each one. H should probably fit the stage of the search, e.g. “goal” heuristics when we assume we are close.
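The linear-combination option can be made concrete with a tiny sketch; the names `combine`, `heuristics`, and `weights` are illustrative, and a GA individual would simply be the weight vector:

```python
def combine(heuristics, weights):
    """Build one heuristic as a weighted sum of the building blocks
    H1..Hn; evolving the `weights` vector is then a plain GA task."""
    def h(state):
        return sum(w * hi(state) for w, hi in zip(weights, heuristics))
    return h
```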
  13. GA/GP (Cont.) [Slide figure: an example GP tree – an If node whose Condition (an And of comparisons such as H1 ≤ 0.1 and H1 ≥ 0.5) selects between True/False branches that combine heuristics arithmetically (+, *, /) over H1, H2, H3, H5 and constants such as 0.4 and 0.7.]
  14. GA/GP (Cont.) Back to Rush Hour Functions & terminals: [table in slide]. Genetic operators: crossover & mutation on trees, as described by Koza.
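Koza-style subtree crossover can be sketched on trees stored as nested lists (an illustrative encoding, not the actual GP system used in the thesis):

```python
import copy
import random

def subtree_crossover(parent_a, parent_b, rng=random):
    """Koza-style crossover: pick one node in each parent and swap the
    subtrees rooted there.  Trees are nested lists whose first element
    is the operator, e.g. ['+', ['*', 0.4, 'H1'], 'H2']."""
    def nodes(tree, path=()):
        yield path                       # every node, addressed by its path
        if isinstance(tree, list):
            for i, child in enumerate(tree[1:], start=1):
                yield from nodes(child, path + (i,))

    def get(tree, path):
        for i in path:
            tree = tree[i]
        return tree

    def put(tree, path, sub):
        if not path:                     # swapping at the root
            return sub
        get(tree, path[:-1])[path[-1]] = sub
        return tree

    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    pa = rng.choice(list(nodes(a)))      # random crossover point in each
    pb = rng.choice(list(nodes(b)))
    sa, sb = copy.deepcopy(get(a, pa)), copy.deepcopy(get(b, pb))
    return put(a, pa, sb), put(b, pb, sa)
```

Mutation is analogous: pick one node and replace its subtree with a freshly grown random one.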
  15. GA/GP (Cont.) Policies Fitness measure? Crossover? Mutation?
  16. Co-Evolving Difficult Solvable 8x8 Boards Our enhanced IDA* search solved over 90% of the 6x6 problems. We wanted to demonstrate our method’s scalability to larger boards.
  17. Co-Evolving Difficult Solvable 8x8 Boards [Slide figure: two example 8x8 Rush Hour boards with lettered cars.] Fitness measure? Cross-over? Mutation?
  18. Rush Hour Results Average percentage of nodes required to solve the test problems, relative to the number of nodes scanned by a blind search [chart in slide].
  19. Rush Hour Results (Cont.) Time (in seconds) required to solve problems JAM01 . . . JAM40 [chart in slide].
  20. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  21. FreeCell Intro FreeCell remained relatively obscure until it shipped with Windows 95. There are 32,000 problems (known as the Microsoft 32K), all solvable except for game #11982, which has eluded solution so far.
  22. FreeCell Intro (Cont.) [Slide figure: a FreeCell layout labeling the Foundations, the Freecells, and the Cascades.]
  23. FreeCell Heuristics Lowest card at the Foundations. Number of well-placed cards. Number of cards not at the Foundations. Number of free Freecells and free Cascades. Sum of the Cascades’ bottom cards. Highest home card minus lowest home card.
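Two of the listed building blocks are simple enough to sketch directly; the state encoding (foundations as a suit-to-top-rank map, freecells as a list with `None` for empty, cascades as lists of cards) is assumed for illustration:

```python
def cards_not_home(foundations, deck_size=52):
    """Cards still outside the Foundations.  `foundations` maps each
    suit to the rank of its top card (0 = empty, 13 = King) --
    a hypothetical encoding."""
    return deck_size - sum(foundations.values())

def free_capacity(freecells, cascades):
    """Empty freecells plus empty cascades -- a rough measure of the
    maneuvering room left on the board."""
    empty_cells = sum(1 for c in freecells if c is None)
    empty_cascades = sum(1 for col in cascades if not col)
    return empty_cells + empty_cascades
```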
  24. FreeCell Learning Methods As opposed to Rush Hour, blind search could not solve even one problem. The best solver to date solves 89% of the Microsoft 32K. The reasons: a high branching factor, and it is hard to generate a good heuristic.
  25. FreeCell Learning Methods In Rush Hour: a population of hyper-heuristics; each generation, all individuals solve 5 different randomly selected instances; test set – 20% of the problems; training set – the rest. In FreeCell, this method failed.
  26. FreeCell Learning Methods First try: sort the problems by difficulty and gradually learn the whole training set. FAILED: days of training; overfitting and forgetting.
  27. FreeCell Learning Methods Second try: co-evolution. First population – hyper-heuristics; second population – game boards with Hillis’s “Hall of Fame”. FAILED: the reason for low fitness is ambiguous.
  28. FreeCell Learning Methods Third try: co-evolution. First population – hyper-heuristics; second population – groups of 8 game boards. SUCCESS: a fast learning process; no ambiguity; we create the right competition.
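A hypothetical sketch of how the two populations' fitnesses could be coupled: the solver is rewarded for the fraction of the 8-board group it solves, and the board group earns the complement, pushing boards toward instances the current solvers find hard. The slide does not specify the exact fitness; this scheme is an assumption:

```python
def coevolution_fitness(solve, board_group):
    """Score a (solver, 8-board group) encounter: the solver earns the
    fraction of boards it solves, the board group earns the complement,
    so boards evolve toward instances current solvers find hard.
    `solve(board)` is an assumed callback returning True on success."""
    solved = sum(1 for board in board_group if solve(board))
    solver_fitness = solved / len(board_group)
    return solver_fitness, 1.0 - solver_fitness
```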
  29. FreeCell Results [chart in slide]