
Evolving Heuristics for Searching Games



Presentation Transcript


  1. Evolving Heuristics for Searching Games

    Evolutionary Computation and Artificial Life. Supervisor: Moshe Sipper. Achiya Elyasaf. June, 2010
  2. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  3. Searching Games State-Graphs Representation Every puzzle/game can be represented as a state graph. In single-player games such as puzzles, board games, etc., every piece move can be counted as a different state. In multi-player games such as chess, Robocode, etc., the positions of the player and the enemy, together with the remaining parameters (health, shield, …), define a state.
  4. Searching Games State-Graphs Representation Rush Hour:
  5. Searching Games State-Graphs Representation Blocksworld:
  6. Searching Games State-Graphs Uninformed Search BFS – memory exponential in the search depth. DFS – memory linear in the length of the current search path, BUT we might “never” track down the right path, and games usually contain cycles. Iterative Deepening – a combination of BFS & DFS: each iteration performs a DFS with a depth limit, and the limit grows from one iteration to the next. Worst case – traverse the entire graph.
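The iterative-deepening scheme above can be sketched in a few lines; `neighbors` and `is_goal` are assumed callbacks and states are assumed hashable (a sketch, not the thesis code):

```python
def iterative_deepening(start, neighbors, is_goal, max_depth=50):
    """Repeated depth-limited DFS with a growing limit (IDDFS).

    `neighbors(state)` yields successor states and `is_goal(state)`
    tests for a solution -- both are assumed callbacks.  Cycles are
    cut only along the current path, as in plain DFS.
    """
    def dls(state, limit, path):
        if is_goal(state):
            return path
        if limit == 0:
            return None
        for nxt in neighbors(state):
            if nxt in path:              # skip cycles on the current path
                continue
            found = dls(nxt, limit - 1, path + [nxt])
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):
        result = dls(start, limit, [start])
        if result is not None:
            return result                # shortest path in moves, like BFS
    return None
```

Like BFS, the first solution found is shortest in moves, but the memory used is only that of the current DFS path.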
  7. Searching Games State-Graphs Uninformed Search Most game domains are PSPACE-complete! Worst case – traverse the entire graph. We need an informed search!
  8. Searching Games State-Graphs Heuristics h: states → ℝ. For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution. If h is perfect, an informed search that tries states with the best (lowest) h-score first will simply stroll to a solution. A bad heuristic means the search might never reach an answer. For hard problems, finding a good h is itself hard. We need a good heuristic function to guide informed search.
  9. Searching Games State-Graphs Informed Search (Cont.) IDA*: Iterative Deepening with A*. The expanded nodes are pushed onto the DFS stack in descending order of heuristic value, so the most promising node is popped first. Let g(s) be the depth of state s along the current path: only nodes with f(s) = g(s) + h(s) < depth-limit are visited. Yields a near-optimal solution (depending on the path limit); the heuristic needs to be admissible.
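A minimal IDA* sketch following the f = g + h bound above, assuming unit edge costs and hypothetical `neighbors`, `is_goal`, and `h` callbacks:

```python
def ida_star(start, neighbors, is_goal, h):
    """IDA*: DFS bounded by f = g + h; after each failed pass the bound
    is raised to the smallest f-value that exceeded it.  With an
    admissible `h` the returned path is optimal (unit edge costs)."""
    bound = h(start)
    path = [start]

    def search(g):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f                     # report the overflowing f-value
        if is_goal(state):
            return True
        minimum = float('inf')
        for nxt in neighbors(state):
            if nxt in path:              # prune cycles on the current path
                continue
            path.append(nxt)
            t = search(g + 1)
            if t is True:
                return True              # leave the solution on `path`
            minimum = min(minimum, t)
            path.pop()
        return minimum

    while True:
        t = search(0)
        if t is True:
            return list(path)
        if t == float('inf'):
            return None                  # search space exhausted
        bound = t                        # next iteration's depth limit
```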
  10. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  11. Rush Hour Domain-Specific Heuristics GP-Rush [Hauptman et al., 2009] Hand-crafted heuristics: Goal distance – Manhattan distance. Blocker estimation – a lower bound on the number of blocking cars (admissible). Hybrid blockers distance – combines the two above. Is Move To Secluded – did the car enter a secluded area. Is Releasing Move.
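The blocker-estimation idea can be illustrated as follows; the board encoding (a dict mapping each car to its occupied cells, red car 'X' horizontal on the exit row, exit on the right edge) is an assumption for the sketch, not GP-Rush's actual representation. It is a lower bound because the red car must make at least one move, and so must every car blocking its lane:

```python
def blockers_lower_bound(board):
    """Admissible estimate: 1 move to slide the red car out, plus at
    least 1 move per distinct car standing between it and the exit.

    `board` maps each car to its occupied (row, col) cells; the red
    car 'X' lies horizontally on the exit row, exit on the right edge
    (hypothetical encoding for illustration).
    """
    red_row = board['X'][0][0]
    red_right = max(col for _, col in board['X'])
    blockers = {car
                for car, cells in board.items() if car != 'X'
                for row, col in cells
                if row == red_row and col > red_right}
    return 1 + len(blockers)
```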
  12. GA/GP Given building blocks H1, …, Hn, how should we choose the fittest heuristic? Minimum? Maximum? A linear combination? GA/GP may be used for: building new heuristics from existing building blocks; finding weights for each heuristic (for applying a linear combination); finding conditions for applying each one. H should probably fit the stage of the search, e.g. “goal” heuristics when we assume we are close.
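The linear-combination option can be made concrete with a tiny sketch; the names `combine`, `heuristics`, and `weights` are illustrative, and a GA individual would simply be the weight vector:

```python
def combine(heuristics, weights):
    """Build one heuristic as a weighted sum of the building blocks
    H1..Hn; evolving the `weights` vector is then a plain GA task."""
    def h(state):
        return sum(w * hi(state) for w, hi in zip(weights, heuristics))
    return h
```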
  13. GA/GP (Cont.) [Slide figure: an example GP tree – an If node whose Condition (an And of comparisons such as H1 ≤ 0.1 and H1 ≥ 0.5) selects between True/False branches that combine heuristics arithmetically (+, *, /) over H1, H2, H3, H5 and constants such as 0.4 and 0.7.]
  14. GA/GP (Cont.) Back to Rush Hour Functions & terminals: [table in slide]. Genetic operators: crossover & mutation on trees, as described by Koza.
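Koza-style subtree crossover can be sketched on trees stored as nested lists (an illustrative encoding, not the actual GP system used in the thesis):

```python
import copy
import random

def subtree_crossover(parent_a, parent_b, rng=random):
    """Koza-style crossover: pick one node in each parent and swap the
    subtrees rooted there.  Trees are nested lists whose first element
    is the operator, e.g. ['+', ['*', 0.4, 'H1'], 'H2']."""
    def nodes(tree, path=()):
        yield path                       # every node, addressed by its path
        if isinstance(tree, list):
            for i, child in enumerate(tree[1:], start=1):
                yield from nodes(child, path + (i,))

    def get(tree, path):
        for i in path:
            tree = tree[i]
        return tree

    def put(tree, path, sub):
        if not path:                     # swapping at the root
            return sub
        get(tree, path[:-1])[path[-1]] = sub
        return tree

    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    pa = rng.choice(list(nodes(a)))      # random crossover point in each
    pb = rng.choice(list(nodes(b)))
    sa, sb = copy.deepcopy(get(a, pa)), copy.deepcopy(get(b, pb))
    return put(a, pa, sb), put(b, pb, sa)
```

Mutation is analogous: pick one node and replace its subtree with a freshly grown random one.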
  15. GA/GP (Cont.) Policies Fitness measure? Crossover? Mutation?
  16. Co-Evolving Difficult Solvable 8x8 Boards Our enhanced IDA* search solved over 90% of the 6x6 problems. We wanted to demonstrate our method’s scalability to larger boards.
  17. Co-Evolving Difficult Solvable 8x8 Boards [Slide figure: two example 8x8 Rush Hour boards with lettered cars.] Fitness measure? Cross-over? Mutation?
  18. Rush Hour Results Average percentage of nodes required to solve the test problems, relative to the number of nodes scanned by a blind search [chart in slide].
  19. Rush Hour Results (Cont.) Time (in seconds) required to solve problems JAM01 . . . JAM40 [chart in slide].
  20. Overview Searching Games State-Graphs: Representation, Uninformed Search, Heuristics, Informed Search. Rush Hour: Domain-Specific Heuristics, Evolving Heuristics, Coevolving Game Boards, Results. FreeCell: Domain-Specific Heuristics, Coevolving Game Boards, Learning Methods, Results
  21. FreeCell Intro FreeCell remained relatively obscure until it shipped with Windows 95. There are 32,000 problems (known as the Microsoft 32K), all solvable except for game #11982, which has eluded solution so far.
  22. FreeCell Intro (Cont.) [Slide figure: a FreeCell layout labeling the Foundations, the Freecells, and the Cascades.]
  23. FreeCell Heuristics Lowest card at the Foundations. Number of well-placed cards. Number of cards not at the Foundations. Number of free Freecells and free Cascades. Sum of the Cascades’ bottom cards. Highest home card minus lowest home card.
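Two of the listed building blocks are simple enough to sketch directly; the state encoding (foundations as a suit-to-top-rank map, freecells as a list with `None` for empty, cascades as lists of cards) is assumed for illustration:

```python
def cards_not_home(foundations, deck_size=52):
    """Cards still outside the Foundations.  `foundations` maps each
    suit to the rank of its top card (0 = empty, 13 = King) --
    a hypothetical encoding."""
    return deck_size - sum(foundations.values())

def free_capacity(freecells, cascades):
    """Empty freecells plus empty cascades -- a rough measure of the
    maneuvering room left on the board."""
    empty_cells = sum(1 for c in freecells if c is None)
    empty_cascades = sum(1 for col in cascades if not col)
    return empty_cells + empty_cascades
```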
  24. FreeCell Learning Methods As opposed to Rush Hour, blind search could not solve even one problem. The best solver to date solves 89% of the Microsoft 32K. The reasons: a high branching factor, and it is hard to generate a good heuristic.
  25. FreeCell Learning Methods In Rush Hour: a population of hyper-heuristics; each generation, all individuals solve 5 different randomly selected instances; test set – 20% of the problems; training set – the rest. In FreeCell, this method failed.
  26. FreeCell Learning Methods First try: sort the problems by difficulty and gradually learn the whole training set. FAILED: days of training; overfitting and forgetting.
  27. FreeCell Learning Methods Second try: co-evolution. First population – hyper-heuristics; second population – game boards with Hillis’s “Hall of Fame”. FAILED: the reason for low fitness is ambiguous.
  28. FreeCell Learning Methods Third try: co-evolution. First population – hyper-heuristics; second population – groups of 8 game boards. SUCCESS: a fast learning process; no ambiguity; we create the right competition.
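A hypothetical sketch of how the two populations' fitnesses could be coupled: the solver is rewarded for the fraction of the 8-board group it solves, and the board group earns the complement, pushing boards toward instances the current solvers find hard. The slide does not specify the exact fitness; this scheme is an assumption:

```python
def coevolution_fitness(solve, board_group):
    """Score a (solver, 8-board group) encounter: the solver earns the
    fraction of boards it solves, the board group earns the complement,
    so boards evolve toward instances current solvers find hard.
    `solve(board)` is an assumed callback returning True on success."""
    solved = sum(1 for board in board_group if solve(board))
    solver_fitness = solved / len(board_group)
    return solver_fitness, 1.0 - solver_fitness
```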
  29. FreeCell Results [chart in slide]