
Trading optimality for speed…

The admissibility condition guarantees that an optimal path is found. In path planning, a near-optimal path can be satisfactory. Try to minimise search instead of minimising cost, i.e. find a near-optimal path (quickly).


Presentation Transcript


  1. Trading optimality for speed… • The admissibility condition guarantees that an optimal path is found • In path planning a near-optimal path can be satisfactory • Try to minimise search instead of minimising cost: • i.e. find a near-optimal path (quickly)

  2. CSC344: AI for Games • Lecture 6: Online and local search • Patrick Olivier • p.l.olivier@ncl.ac.uk

  3. Weighting… • $f(n) = (1 - w)\,g(n) + w\,h(n)$ • w = 0.0 (f = g: breadth-first when all step costs are equal, uniform-cost in general) • w = 0.5 (A*) • w = 1.0 (best-first, with f = h) • trading safety/optimality for speed • weight towards h when confident in the estimate of h
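
The weighting idea can be sketched as a single best-first search whose priority is f(n) = (1 - w) g(n) + w h(n). This is a minimal illustration, not code from the lecture: the function name `weighted_search`, the toy graph `GRAPH` and the heuristic table `H` are all assumptions made up for the example.

```python
import heapq

# Hypothetical toy graph: node -> list of (neighbour, step cost).
GRAPH = {
    "S": [("A", 1), ("B", 4)],
    "A": [("G", 10)],
    "B": [("G", 1)],
    "G": [],
}
# Hypothetical admissible heuristic (never overestimates the true cost to G).
H = {"S": 4, "A": 0.5, "B": 1, "G": 0}


def weighted_search(graph, h, start, goal, w):
    """Best-first search ordered by f(n) = (1 - w) * g(n) + w * h(n).

    Returns (path, path cost, number of nodes expanded).
    """
    frontier = [(w * h[start], 0.0, start, [start])]
    best_g = {}  # cheapest g with which each node has been expanded
    expanded = 0
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node in best_g and best_g[node] <= g:
            continue  # already expanded via a path that was at least as cheap
        best_g[node] = g
        expanded += 1
        if node == goal:
            return path, g, expanded
        for nxt, cost in graph[node]:
            g2 = g + cost
            f2 = (1 - w) * g2 + w * h[nxt]
            heapq.heappush(frontier, (f2, g2, nxt, path + [nxt]))
    return None, float("inf"), expanded


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        print(w, weighted_search(GRAPH, H, "S", "G", w))
```

On this toy graph, w = 0.5 returns the optimal path S, B, G (cost 5) after expanding four nodes, while w = 1.0 follows the lower h value through A and returns S, A, G (cost 11) after expanding three: less search, worse path.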

  4. Local search algorithms • In many optimisation problems, paths are irrelevant; the goal state itself is the solution • State space = set of "complete" configurations • Find a configuration satisfying constraints, e.g., n-queens: n queens on an n × n board with no two queens on the same row, column, or diagonal • Use local search algorithms, which keep a single "current" state and try to improve it
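
As a rough sketch of what a "complete" configuration and its neighbourhood might look like for n-queens, assume one queen per column, so a state is a tuple of row indices. The names `conflicts` and `neighbours` are illustrative choices, not taken from the lecture.

```python
from itertools import combinations


def conflicts(state):
    """Count attacking pairs of queens; state[c] is the row of the queen in column c."""
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(state), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)  # same row or same diagonal
    )


def neighbours(state):
    """Yield every state reachable by moving one queen within its own column."""
    n = len(state)
    for col in range(n):
        for row in range(n):
            if row != state[col]:
                yield state[:col] + (row,) + state[col + 1:]


if __name__ == "__main__":
    print(conflicts((1, 3, 0, 2)))  # a 4-queens configuration with no attacks
    print(conflicts((0, 1, 2, 3)))  # all queens on one diagonal
```

A state with `conflicts(state) == 0` satisfies the constraints; a local search only ever holds one such tuple and looks at its neighbours.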

  5. Hill-climbing search • “climbing Everest in thick fog with amnesia” • we can set up an objective function to be “best” when large (perform hill climbing) • …or we can use the previous formulation of heuristic and minimise the objective function (perform gradient descent)

  6. Local maxima/minima • Problem: depending on initial state, can get stuck in local maxima/minima • [Figure: example states with objective values 1/(1+H(n)) = 1/17 and 1/(1+H(n)) = 1/2; label: local minima]
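
A minimal sketch of hill climbing, and of how the result depends on the initial state, using a made-up one-dimensional landscape. `LANDSCAPE`, `value`, `neighbours` and `hill_climb` are hypothetical names for this illustration only.

```python
# Hypothetical objective values at positions 0..7; larger is better.
# Position 2 is a local maximum, position 6 is the global maximum.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 7]


def value(i):
    return LANDSCAPE[i]


def neighbours(i):
    """Positions one step to the left or right, if they exist."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


def hill_climb(start):
    """Move to the best neighbour until no neighbour is strictly better."""
    current = start
    while True:
        best = max(neighbours(current), key=value)
        if value(best) <= value(current):
            return current  # no uphill move left: a (possibly local) maximum
        current = best


if __name__ == "__main__":
    print(hill_climb(0))  # climbs 0 -> 1 -> 2 and stops at the local maximum
    print(hill_climb(4))  # climbs 4 -> 5 -> 6 and reaches the global maximum
```

Starting at position 0 the search stops at position 2 (value 5); starting at position 4 it reaches position 6 (value 9). The algorithm keeps no memory of where it has been, which is the "amnesia" in the Everest analogy.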

  7. Local beam search • Keep track of k states rather than just one • Start with k randomly generated states • At each iteration, all the successors of all k states are generated • If any one is a goal state, stop; else select the k best successors from the complete list and repeat.
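
The four steps above can be sketched for n-queens as follows. This is an illustration under assumptions: one queen per column, fewer attacking pairs counts as "better", and the iteration cap `max_iters` and `seed` parameter are additions so the example terminates and is repeatable.

```python
import random


def conflicts(state):
    """Attacking pairs; state[c] is the row of the queen in column c."""
    n = len(state)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if state[a] == state[b] or abs(state[a] - state[b]) == b - a
    )


def successors(state):
    """All states obtained by moving one queen within its column."""
    n = len(state)
    for col in range(n):
        for row in range(n):
            if row != state[col]:
                yield state[:col] + (row,) + state[col + 1:]


def local_beam_search(n, k, max_iters=100, seed=0):
    """Return (state, found); found is True when the state has no conflicts."""
    rng = random.Random(seed)
    # Start with k randomly generated states.
    beam = [tuple(rng.randrange(n) for _ in range(n)) for _ in range(k)]
    for _ in range(max_iters):
        for state in beam:
            if conflicts(state) == 0:
                return state, True  # goal state reached: stop
        # Pool the successors of all k states, then keep the k best overall.
        pool = {t for s in beam for t in successors(s)}
        beam = sorted(pool, key=conflicts)[:k]
    best = min(beam, key=conflicts)
    return best, conflicts(best) == 0


if __name__ == "__main__":
    print(local_beam_search(n=6, k=4, seed=1))
```

Because the k survivors are chosen from one shared pool, the search is not the same as k independent hill climbs: successors of a promising state can crowd out those of a poor one.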

  8. Simulated annealing search • Idea: escape local maxima by allowing some "bad" moves but gradually decrease their frequency and range (VLSI layout, scheduling)
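
One common way to realise this idea is to accept a worsening move with probability exp(Δ / T) and to lower the temperature T over time. The sketch below assumes a geometric cooling schedule and a made-up one-dimensional landscape; the parameter names `t0`, `cooling`, `steps` and `seed` are illustrative and none of the values come from the lecture.

```python
import math
import random

# Hypothetical objective values; position 2 is a local maximum, 6 the global one.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 7]


def value(i):
    return LANDSCAPE[i]


def neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


def simulated_annealing(start, t0=10.0, cooling=0.95, steps=500, seed=0):
    """Maximise value(); returns the best position visited."""
    rng = random.Random(seed)
    current = best = start
    t = t0
    for _ in range(steps):
        candidate = rng.choice(neighbours(current))
        delta = value(candidate) - value(current)
        # Always accept improvements; accept "bad" moves with probability
        # exp(delta / t), which shrinks as the temperature t falls.
        if delta > 0 or rng.random() < math.exp(delta / t):
            current = candidate
            if value(current) > value(best):
                best = current
        t *= cooling  # assumed geometric cooling schedule
    return best


if __name__ == "__main__":
    print(simulated_annealing(start=0))
```

Early on, with T large, almost any move is accepted and the search can walk off the local maximum at position 2; late in the run it behaves like plain hill climbing.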

  9. Simulated annealing example • Point feature labelling

  10. Genetic algorithm search • A successor state is generated by combining two parent states • Start with k randomly generated states (population) • A state is represented as a string over a finite alphabet (often a string of 0s and 1s) • Evaluation function (fitness function). Higher values for better states. • Produce the next generation of states by selection, crossover, and mutation
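
A minimal sketch of the selection, crossover and mutation loop on bit strings. Several details are assumptions for illustration rather than part of the lecture: tournament selection of size two, single-point crossover, keeping the best individual unchanged each generation (elitism), and a fitness function that simply counts 1s.

```python
import random


def evolve(population, fitness, generations=50, mutation_rate=0.05, seed=0):
    """Evolve a list of equal-length bit strings; returns the fittest survivor."""
    rng = random.Random(seed)
    pop = list(population)
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]  # elitism (assumption): keep the best as is
        while len(nxt) < len(pop):
            # Selection: the fitter of two randomly drawn individuals.
            mum = max(rng.sample(pop, 2), key=fitness)
            dad = max(rng.sample(pop, 2), key=fitness)
            # Crossover: head of one parent joined to the tail of the other.
            cut = rng.randrange(1, len(mum))
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = "".join(
                ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
                for bit in child
            )
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)


def count_ones(s):
    """Hypothetical fitness function: higher values for better states."""
    return s.count("1")


if __name__ == "__main__":
    rng = random.Random(1)
    start = ["".join(rng.choice("01") for _ in range(12)) for _ in range(8)]
    print(evolve(start, count_ones))
```

With elitism the best fitness in the population can never decrease from one generation to the next, which makes the sketch easy to check; without it, a good individual can be lost to crossover or mutation.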

  11. Genetic algorithms in games • Computationally expensive, so primarily an offline form of learning • Cloak, Dagger & DNA (Oidian Systems) • 4 DNA strands defining opponent behaviour • between battles, opponents play each other • Creatures (Millennium Interactive) • Genetic algorithms learn the weights in a neural network that defines behaviour

  12. “Real-time” search concepts • In A* the whole path is computed off-line, before the agent walks through the path • This solution is only valid for static worlds • If the world changes in the meantime, the initial path is no longer valid: • new obstacles appear • position of goal changes (e.g. moving target)

  13. “Real-time” definitions • Off-line (non real-time): the solution is computed in a given amount of time before being executed • Real-time: one move is computed at a time, and that move is executed before computing the next • Anytime: the algorithm continually improves its solution over time and can provide its “current best” at any time

  14. Agent-based (online) search • For example: • mobile robot • NPC without perfect knowledge • agent that must act now with limited information • Planning and execution are interleaved • Could apply standard search techniques: • Best-first (but we know it is poor) • Depth-first (has to physically back-track) • A* (but nodes in the fringe are not accessible)

  15. LRTA*: Learning Real-time A* • Augment hill-climbing with memory • Store “current best estimate” • Follow path based on neighbours’ estimates • Update estimates based on experience • Experience → Learning • Flatten out local maxima…
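
A minimal sketch of one common formulation of LRTA*, in which estimates are costs to the goal and are therefore minimised (so the traps that get filled in are local minima of the estimate). The corridor graph `GRAPH`, the initial estimates `H0` and the function name `lrta_star` are assumptions made up for this example, as is the rule that an update never lowers an estimate.

```python
# Hypothetical corridor A - B - C - D with unit step costs; D is the goal.
GRAPH = {
    "A": [("B", 1)],
    "B": [("A", 1), ("C", 1)],
    "C": [("B", 1), ("D", 1)],
    "D": [("C", 1)],
}
# Hypothetical initial estimates of cost-to-goal; A is badly underestimated,
# which makes it look attractive from B.
H0 = {"A": 0, "B": 2, "C": 1, "D": 0}


def lrta_star(graph, h0, start, goal, max_steps=1000):
    """Interleave planning and acting; returns (states visited, learned estimates)."""
    h = dict(h0)  # stored "current best estimate" for every state
    state = start
    path = [start]
    for _ in range(max_steps):
        if state == goal:
            break
        # Look one step ahead: step cost plus the neighbour's current estimate.
        nxt, cost = min(graph[state], key=lambda edge: edge[1] + h[edge[0]])
        # Learn from experience: raise this state's estimate if the
        # lookahead shows the goal is further away than previously believed.
        h[state] = max(h[state], cost + h[nxt])
        # Act: execute the move before planning the next one.
        state = nxt
        path.append(state)
    return path, h


if __name__ == "__main__":
    print(lrta_star(GRAPH, H0, "B", "D"))
```

Starting at B, the agent is first lured to A, raises A's estimate from 0 to 3 once it sees that the only way on is back through B, and then walks B, C, D. The visited sequence is B, A, B, C, D, and the learned estimates end up equal to the true distances to D.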

  16. LRTA*: example • [Figure: grid of numeric cell values illustrating LRTA*]

  17. Learning real-time A*
