Non-Conservative Cost Bound Increases in IDA* Doug Demyen
Outline • Introduction (Recap of IDA*) • Advantages/Disadvantages of IDA* • Alternate approach • Advantages/Disadvantages • Example: Travelling Salesman Problem • TSP Representation • Creating the Pattern Database • Methods for Increasing the Depth Bound • Experiment Setup and Parameters • Preliminary Results • Conclusion
IDA* Recap • Search algorithm used for finding an optimal solution to a problem • Slower than A*, but requires only O(bd) memory, whereas A* requires O(b^d) • where b is the branching factor of the space and d is the depth of the goal • Usually used when there is insufficient memory to run A*
IDA* Recap (Cont’d) • For a node N, define: • g(N) = the distance from the start state to N • h(N) = an estimate of the distance from N to the goal state that is at most the true distance (an admissible heuristic) • f(N) = g(N) + h(N) • Define also a depth bound Ө for IDA* • When the algorithm begins, Ө = h(Start)
IDA* Recap (Cont’d) • Does a depth-first search on all nodes N where f(N) ≤ Ө • If the goal is not found in this iteration, updates Ө to be the minimum f(N) of the nodes N that were generated but not expanded, and searches again • That is, Ө_{i+1} := min_N { f(N) } • for N ∈ { N_k | f(N_k) > Ө_i, N_k ∈ Succ(N_j), f(N_j) ≤ Ө_i }
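The recap above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the experiments: the function name `ida_star` and the callback signatures (`is_goal`, `successors` yielding `(child, edge_cost)` pairs, and `h`) are assumptions made for the example.

```python
import math

def ida_star(start, is_goal, successors, h):
    """Standard IDA* sketch: returns (cost, path) or None if no goal exists.

    successors(n) yields (child, edge_cost) pairs; h must never overestimate.
    """
    bound = h(start)                # initial bound is h(Start)
    path = [start]

    def dfs(node, g):
        f = g + h(node)
        if f > bound:
            return f, None          # pruned: report f as a candidate next bound
        if is_goal(node):
            return f, g
        smallest = math.inf
        for child, cost in successors(node):
            if child in path:       # avoid cycles along the current path
                continue
            path.append(child)
            t, found = dfs(child, g + cost)
            if found is not None:
                return t, found     # leave path intact so it can be returned
            path.pop()
            smallest = min(smallest, t)
        return smallest, None

    while True:
        t, found = dfs(start, 0)
        if found is not None:
            return found, list(path)
        if t == math.inf:
            return None             # nothing was pruned, so no goal is reachable
        bound = t                   # minimum f among generated-but-unexpanded nodes
```

Each pass of the `while` loop is one iteration; the value returned by a failed `dfs` is exactly the minimum f-value over the pruned nodes, which becomes the next bound.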
Advantages • Won’t search any deeper than the goal: • first path found to goal is optimal • avoids expanding any extra “levels” • Best method when a great number of nodes share the same f-value (especially in exponential state spaces)
Disadvantages • In the worst case, each node has a different f-value, so the number of iterations will be O(n) • where n is the number of nodes in the space • This is disastrous: each iteration re-expands the nodes of all earlier iterations, so total expansions grow as O(n²) • In this case, we want to update Ө to include more than one new f-value in each iteration
Dangers of this Approach • Dangers of incrementing Ө in this way stem from the possibility that in the last iteration, Ө > g(Goal): • The first time the goal is found, it might not be by an optimal path • Because the space grows exponentially with depth, overshooting can mean expanding more nodes deeper than the goal than were needed to reach it
Converging to Optimal • Although the first path to the goal might not be optimal, we can find the optimal path: • when we find a path to the goal, Ө := g(Goal) - 1 (for integer costs) • continue searching in the current iteration for other paths to the goal, updating Ө similarly • when all nodes N with f(N) ≤ Ө have been expanded, the last (shortest) path to the goal must be optimal
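The convergence procedure above can be sketched as follows. This is an illustrative version only: the name `ida_star_nc` and the `next_bound(bound, fringe)` callback are hypothetical, and the sketch assumes integer edge costs (so that lowering the bound to g(Goal) - 1 excludes nothing better) and h(Goal) = 0.

```python
def ida_star_nc(start, is_goal, successors, h, next_bound):
    """IDA* with a caller-supplied bound update (assumes integer edge costs).

    next_bound(bound, fringe) is a hypothetical callback that must return a
    value >= min(fringe) so that every iteration makes progress.
    """
    bound = h(start)
    while True:
        best = None                 # cheapest goal path seen in this iteration
        fringe = []                 # f-values of generated-but-pruned nodes
        stack = [(start, 0, (start,))]
        while stack:
            node, g, path = stack.pop()
            f = g + h(node)
            if f > bound:
                fringe.append(f)
                continue
            if is_goal(node):
                best = (g, path)
                bound = g - 1       # from now on only strictly cheaper paths pass
                continue
            for child, cost in successors(node):
                if child not in path:
                    stack.append((child, g + cost, path + (child,)))
        if best is not None:
            return best             # iteration exhausted: last path found is optimal
        if not fringe:
            return None             # nothing pruned and no goal: unreachable
        bound = next_bound(bound, fringe)
```

Passing `lambda bound, fringe: min(fringe)` reproduces the standard IDA* update, while `lambda bound, fringe: max(fringe)` is the most aggressive choice; in both cases the iteration that first reaches the goal keeps running until no cheaper path can exist.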
The Traveling Salesman Problem • One problem on which IDA* has classically performed poorly is the TSP • Involves a number of cities with a distance (or cost) between each pair • In the non-fully-connected problem, distances between unconnected cities can be thought of as infinite • The cost of traveling between two cities can be the same in both directions (symmetrical) or different (asymmetrical)
The TSP (Cont’d) • Want to visit each of the cities and return to the starting city while incurring as little cost as possible
TSP Representation • Distances (or costs) between cities are represented in a matrix as above • In the symmetrical TSP the matrix is symmetrical
TSP Representation (Cont’d) • The state of the agent is defined as a two-tuple of a set and an atom: ({a, c}, b) • Representing the visited cities and the current location, respectively • If we consider a to be the starting city: • The start state is ({}, a) • The goal state is ({a, b, c, d}, a)
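One way to encode this state representation is sketched below. The helper name `make_tsp` and the dict-of-dicts distance format are assumptions for the example. Following the start and goal states on the slide, the set is taken to hold the cities already departed from, so the current city joins the set only when the agent leaves it.

```python
def make_tsp(dist, start):
    """Build (start_state, goal_state, successors) for a TSP instance.

    dist is a dict of dicts: dist[origin][destination] = cost.
    A state is (visited, current), with visited a frozenset of cities left behind.
    """
    cities = frozenset(dist)
    start_state = (frozenset(), start)
    goal_state = (cities, start)

    def successors(state):
        visited, current = state
        if visited == cities:            # tour already closed: no further moves
            return
        departed = visited | {current}   # leaving `current` marks it as visited
        if departed == cities:
            targets = [start]            # every city seen: only the return leg is left
        else:
            targets = sorted(cities - departed)
        for city in targets:
            yield (departed, city), dist[current][city]

    return start_state, goal_state, successors
```

Because the visited set grows with every move, no state can be revisited along a path, so the space is acyclic and every path from the start has exactly as many moves as there are cities.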
Building PDBs with TSPs • Similarly to other problems, abstract cities within a TSP to the same constant • Ex: Φ: {a, b, c, d} → {a, x, x, d} • The abstract distance from one city to another is the minimum over all matrix entries whose row city maps to the same constant as the origin city and whose column city maps to the same constant as the destination city
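The minimum rule can be illustrated with a short sketch. The function name `abstract_distances` and the distance values in the usage example are made up for illustration; they are not the matrix from the slides.

```python
def abstract_distances(dist, phi):
    """Collapse a distance matrix under the city mapping phi.

    dist[origin][destination] = cost; phi maps each city to its abstract constant.
    Returns {(abstract_origin, abstract_destination): minimum cost}.
    """
    abstract = {}
    for origin, row in dist.items():
        for destination, cost in row.items():
            key = (phi[origin], phi[destination])
            # keep the cheapest concrete edge that maps onto this abstract edge
            abstract[key] = min(cost, abstract.get(key, float("inf")))
    return abstract


# Illustrative values only (symmetric, 4 cities)
dist = {
    "a": {"b": 4, "c": 5, "d": 8},
    "b": {"a": 4, "c": 2, "d": 6},
    "c": {"a": 5, "b": 2, "d": 3},
    "d": {"a": 8, "b": 6, "c": 3},
}
phi = {"a": "a", "b": "x", "c": "x", "d": "d"}
abstract = abstract_distances(dist, phi)
```

Taking the minimum guarantees that no abstract move costs more than any concrete move it stands for, which is what keeps the resulting heuristic from overestimating.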
Example • (Figure: abstracted distance matrix; layout lost in extraction, surviving entries: x 4 5 8 x 2 6)
Creating the PDB • Moves are unidirectional, not invertible • Easiest to enumerate the state space in the forward direction, then when the goal is reached, for each node N in the path from Start to Goal: • h(N) := min {h(N), g(Goal) - g(N)} • where initially h(N) = ∞, for all nodes N • For this example, goal = ({a, x, x, d}, a)
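The forward enumeration can be sketched generically as below. The name `build_pdb` and the callback signatures are assumptions, the sketch takes the remaining cost along a path to be g(Goal) - g(N), and it relies on the space being acyclic (true for the TSP encoding, where the visited set only grows). How abstract states are encoded is left to the caller.

```python
import math

def build_pdb(start, is_goal, successors):
    """Enumerate every start-to-goal path and keep the cheapest remaining cost per state.

    successors(n) yields (child, edge_cost) pairs over an acyclic space.
    States missing from the result are treated as h = infinity.
    """
    pdb = {}

    def walk(node, g, path):
        path.append((node, g))
        if is_goal(node):
            # g is g(Goal) on this path; update every state along it
            for state, g_state in path:
                pdb[state] = min(pdb.get(state, math.inf), g - g_state)
        else:
            for child, cost in successors(node):
                walk(child, g + cost, path)
        path.pop()

    walk(start, 0, [])
    return pdb
```

This follows the slide literally and therefore revisits shared sub-paths many times; it is meant to show the update rule, not to be an efficient enumeration.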
Increasing the Depth Bound • Several alternative methods have already been proposed for updating the depth bound: • DFS* - double the depth bound each iteration • IDA*_CR - classify pruned nodes into “buckets” and increase the depth bound to include enough buckets containing a predefined number of nodes • RIDA* - uses regression to set the depth bound so the estimated increase in nodes expanded next iteration is constant
More Methods • I am testing DFS* and IDA*_CR, along with a number of other methods: • Multiply the IDA* depth bound by some constant (for example, 1.5) • Increase the depth bound to include a percentage of the “fringe” nodes (for example, 50% = median, 100% = maximum) • Increase the depth bound to include a constant number of the fringe nodes • More?
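The simpler update rules listed above can be written as small functions over the current bound and the f-values of the pruned fringe nodes. All names and the `(bound, fringe)` signature are assumptions for this sketch; the `max(min(fringe), ...)` safeguard in the doubling rule is an addition so that a bound of zero still makes progress, and rounding up in the scaling rule assumes integer costs.

```python
import math

def double_bound(bound, fringe):
    """DFS*-style: double the bound, but never fall below the smallest fringe f-value."""
    return max(min(fringe), 2 * bound)

def scale_min(factor):
    """Multiply the bound standard IDA* would pick (the fringe minimum) by a constant."""
    return lambda bound, fringe: math.ceil(min(fringe) * factor)

def fringe_fraction(p):
    """Include a fraction p of the fringe nodes (0.5 = median, 1.0 = maximum)."""
    def update(bound, fringe):
        ordered = sorted(fringe)
        count = max(1, math.ceil(p * len(ordered)))
        return ordered[count - 1]
    return update

def kth_lowest(k):
    """Include a constant number k of fringe nodes (k = 1 is standard IDA*)."""
    def update(bound, fringe):
        ordered = sorted(fringe)
        return ordered[min(k, len(ordered)) - 1]
    return update
```

Every rule returns at least the smallest fringe f-value, so at least one previously pruned node is admitted in the next iteration.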
Experiment Setup • Currently using the 10-city TSP • Use an abstraction for the space (example: Φ({abcdefghij}) = {VVWWXXYYZZ}) • Populate a distance matrix randomly (either symmetrical or asymmetrical) • Enumerate the space to populate the pattern database
Experiment Setup (Cont’d) • Run IDA* using each of the different depth bound updating techniques • For each technique, record: • length of the first solution • time elapsed and nodes expanded in reaching it • time elapsed and nodes expanded in reaching the optimal solution • time elapsed and nodes expanded by the end of the algorithm
Variables to Manipulate • I will try symmetrical and asymmetrical TSP • Several different abstractions for the PDB • Different parameters for methods (for example, include 8 fringe nodes) • Possibly different upper bounds on inter-city distances
Results so far • Results taken from 40 runs of asymmetric 10-city TSP with a PDB using the domain abstraction: Φ({abcdefghij}) = {VVWWXXYYZZ} (paired) • DFS*, Өmin+50%, and the maximum fringe f-value produce very similar results: long first solution paths found very quickly • Interestingly, setting the depth bound to the 5th lowest fringe f-value always finds the optimal path first, faster than IDA* • Other techniques form a middle ground in speed and initial solution path length
Conclusion • In a state space like the TSP, non-conservative depth bound increments perform much better than standard IDA* • Despite the “trade-off” between speed and initial solution length, in my experiments, non-conservative methods still find the optimal solution more than 100 times faster than standard IDA* • More to come . . .
References • B. W. Wah and Y. Shang, A Comparative Study of IDA*-Style Searches, Proc. 6th Int’l Conference on Tools with Artificial Intelligence, IEEE, Nov. 1994, pp. 290-296. • R. E. Korf, Space-Efficient Search Algorithms, ACM Computing Surveys (CSUR), Sept. 1995, pp. 337-339.