Learning Relational Rules for Goal Decomposition
Prasad Tadepalli, Oregon State University
Chandra Reddy, IBM T.J. Watson Research Center
Supported by the Office of Naval Research
Symposium on Reasoning and Learning
A Critique of Current Research
• Most work is confined to learning in isolation
• Predominantly employs propositional representations
• The learner is passive and has to learn from random examples
• The role of prior knowledge in learning is minimal
Our Approach
• Learning in the context of hierarchical problem solving
• Goals, states, and actions are represented relationally
• Active learning: the learner can ask questions, pose problems to itself, and solve them
• Declarative prior knowledge guides and speeds up learning
Air Traffic Control (ATC) Task (Ackerman and Kanfer study)
Goal Decomposition Rules (D-rules)
• D-rules decompose goals into subgoals:
  goal: land(?plane)
  condition: plane-at(?plane, ?loc) & level(L3, ?loc)
  subgoals: move(?plane, L2); move(?plane, L1); land1(?plane)
• Problems are solved by recursive decomposition of goals into subgoals.
• Control knowledge guides the selection of appropriate decomposition rules.
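To make the recursive decomposition concrete, here is a minimal Python sketch of a d-rule and the solver it drives. It is illustrative only: the names DRule, solve, and primitive are ours, and the sketch assumes ground (variable-free) goals so that unification can be skipped.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DRule:
    goal: str                           # goal this rule decomposes, e.g. "land(p1)"
    condition: Callable[[Dict], bool]   # applicability test on the current state
    subgoals: List[str]                 # ordered subgoals, solved left to right

def solve(goal: str, state: Dict, drules: List[DRule],
          primitive: Callable[[str, Dict], Dict]) -> Dict:
    """Recursively decompose goal using the first applicable d-rule;
    goals with no matching rule bottom out as primitive operators."""
    for rule in drules:
        if rule.goal == goal and rule.condition(state):
            for sub in rule.subgoals:
                state = solve(sub, state, drules, primitive)
            return state
    return primitive(goal, state)       # e.g. jump, select, short-deposit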
Domain Theory for the ATC Task
• Domain axioms:
  can-land-short(?p) :- type(?p, propeller)
  can-land-short(?p) :- type(?p, DC10) & wind-speed(low) & runway-cond(dry)
• Primitive operators:
  jump(?cursor-from, ?cursor-to)
  short-deposit(?plane, ?runway)
  long-deposit(?plane, ?runway)
  select(?loc, ?plane)
Learning from Demonstration
• Input examples:
  State: at(p1, 10), type(p1, propeller), fuel(p1, 5), cursor-loc(4), free(1), free(2), …, free(9), free(11), …, free(15), runway-cond(wet), wind-speed(high), wind-dir(south)
  Goal: land-plane(p1)
  Solution: jump(4, 10), select(10, p1), jump(10, 14), short-deposit(p1, R2)
• Output: the underlying D-rules
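One plausible Python encoding of this demonstration, with literals as tuples; the field names and representation below are our own choice, not prescribed by the system:

# One demonstration: state literals, the goal, and the observed solution.
example = {
    "state": {("at", "p1", "10"), ("type", "p1", "propeller"),
              ("fuel", "p1", "5"), ("cursor-loc", "4"),
              ("runway-cond", "wet"), ("wind-speed", "high"),
              ("wind-dir", "south")}
             | {("free", str(i)) for i in range(1, 16) if i != 10},
    "goal": ("land-plane", "p1"),
    "solution": [("jump", "4", "10"), ("select", "10", "p1"),
                 ("jump", "10", "14"), ("short-deposit", "p1", "R2")],
}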
Generalizing Examples
• Examples are inductively generalized into D-rules:
  example goal → D-rule goal
  initial state → condition
  literals in other states → subgoals
• Each new example X is combined with the current hypothesis H by Least General Generalization (lgg).
• Problem: the size of the lgg grows exponentially with the number of examples.
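A minimal sketch of Plotkin-style lgg over sets of flat literals (a tuple of predicate name plus constant or variable arguments); the function names are ours. Note how lgg_clauses pairs every compatible literal from the two inputs, which is exactly the source of the size blow-up:

from itertools import product

def lgg_terms(t1, t2, subst):
    # identical terms stay; each differing pair maps to one shared fresh variable
    if t1 == t2:
        return t1
    if (t1, t2) not in subst:
        subst[(t1, t2)] = "?X%d" % len(subst)
    return subst[(t1, t2)]

def lgg_atoms(a1, a2, subst):
    # only literals with the same predicate and arity generalize
    if a1[0] != a2[0] or len(a1) != len(a2):
        return None
    return (a1[0],) + tuple(lgg_terms(x, y, subst)
                            for x, y in zip(a1[1:], a2[1:]))

def lgg_clauses(c1, c2):
    subst, out = {}, set()
    for a1, a2 in product(c1, c2):  # every compatible pair contributes a literal
        g = lgg_atoms(a1, a2, subst)
        if g is not None:
            out.add(g)
    return out

For instance, lgg_clauses({("at", "p1", "10")}, {("at", "p2", "7")}) yields {("at", "?X0", "?X1")}.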
Learning from Queries
• Use queries to prevent the exponential growth of the lgg.
• (Reddy and Tadepalli, 1997): non-recursive, single-predicate Horn programs are learnable from queries and examples.
• Prune each literal in the lgg and ask a membership query (a question) to confirm that the result is not over-general.
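A sketch of that pruning loop, assuming an oracle is_overgeneral that answers the membership-style query (a teacher, or the testing procedure sketched after the "Query Answering by Testing" slide below):

def prune(clause, is_overgeneral):
    """Greedily try to drop each literal of the lgg; a drop is kept only
    if the oracle confirms the shortened clause is still not over-general."""
    kept = set(clause)
    for lit in sorted(kept):       # fixed order keeps runs reproducible
        trial = kept - {lit}
        if not is_overgeneral(trial):
            kept = trial           # the literal was unnecessary
    return kept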
Need for Queries
[Figure, built up over several slides: examples x in the domain D are generalized by lgg; the lgg can overshoot the target concept and become over-general, which is why membership queries are needed.]
Using Prior Knowledge
• Explanation-based pruning: remove literals that play no causal role in the plan, e.g., free(1), free(2), etc.
• Abstraction by forward chaining: can-land-short(?p) :- type(?p, propeller) helps learn a more general rule (see the sketch below).
• Learning subgoal order: subgoal literals are maintained as a sequence of sets of literals; a set is refined into a sequence of smaller sets using multiple examples.
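A self-contained sketch of the forward-chaining abstraction: the state is saturated with every derivable literal (such as can-land-short) before generalization, so the lgg can pick up the abstract literal rather than its plane-specific preconditions. The literal and axiom encodings here are our own:

def substitute(atom, theta):
    return (atom[0],) + tuple(theta.get(a, a) for a in atom[1:])

def match(pattern, fact, theta):
    # extend theta so pattern (args starting with "?") matches the ground fact
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta

def match_body(body, facts, theta):
    # enumerate substitutions making every body literal a known fact
    if not body:
        yield theta
        return
    for fact in facts:
        t = match(body[0], fact, theta)
        if t is not None:
            yield from match_body(body[1:], facts, t)

def forward_chain(state, axioms):
    # axioms: (head, body) pairs; saturate the state with derived literals
    derived = set(state)
    while True:
        new = {substitute(head, theta)
               for head, body in axioms
               for theta in match_body(body, list(derived), {})}
        if new <= derived:
            return derived
        derived |= new

With axioms = [(("can-land-short", "?p"), [("type", "?p", "propeller")])], forward_chain adds ("can-land-short", "p1") to any state containing ("type", "p1", "propeller").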
Learning Multiple D-Rules
• Maintain a list of d-rules for each goal.
• Combine a new example x with the first d-rule h_i for which lgg(x, h_i) is not over-general.
• Reduce the result and replace h_i.
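Building on the lgg_clauses and prune sketches above, this loop might look as follows; the fallback of keeping the example as a new rule when every merge over-generalizes is our reading of the slide:

def incorporate(example, drules, is_overgeneral):
    """Merge a new example into the first d-rule whose lgg with it stays
    sound; otherwise let the example seed a new d-rule of its own."""
    for i, h in enumerate(drules):
        merged = lgg_clauses(h, example)
        if not is_overgeneral(merged):
            drules[i] = prune(merged, is_overgeneral)  # reduce, replace h_i
            return drules
    drules.append(set(example))   # no safe merge found
    return drules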
Results on Learning from Demonstration
Learning from Exercises
• Supplying solved training examples is too demanding for the teacher.
• Solving problems from scratch is computationally hard.
• A compromise: learning from exercises.
• Exercises are intermediate subproblems whose solutions build up to the main problems.
• Solving easier subproblems makes it possible to solve more difficult ones.
Difficulty Levels in ATC Domain
Solving Exercises
• Use previously learned d-rules as operators.
• Iterative-deepening DFS finds short solutions, and hence short rules.
• Generalization is done as before.
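A generic iterative-deepening DFS sketch; successors here would expand a state by applying previously learned d-rules as (macro-)operators, and all names are illustrative:

def iddfs(start, is_goal, successors, max_depth=10):
    """Depth-limited DFS under a growing limit: the first plan returned
    is a shortest one, which is what keeps the induced rules short."""
    def dls(state, plan, limit):
        if is_goal(state):
            return plan
        if limit == 0:
            return None
        for action, nxt in successors(state):
            found = dls(nxt, plan + [action], limit - 1)
            if found is not None:
                return found
        return None
    for limit in range(max_depth + 1):
        plan = dls(start, [], limit)
        if plan is not None:
            return plan
    return None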
Query Answering by Testing
• Generate test problems {InitialState, Goal} that match the d-rule.
• Use the decomposition the d-rule suggests, and solve the problems.
• If some problem cannot be solved, the rule is over-general.
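This procedure can serve as the approximate is_overgeneral oracle used in the pruning sketch earlier; gen_problem and solves are hypothetical hooks for sampling matching problems and running the suggested decomposition:

def seems_overgeneral(drule, gen_problem, solves, trials=20):
    """Sample problems whose initial state matches the d-rule's condition;
    one failure of the suggested decomposition marks the rule over-general."""
    for _ in range(trials):
        state, goal = gen_problem(drule)   # hypothetical problem generator
        if not solves(drule, state, goal):
            return True                    # found a counterexample
    return False  # only "probably not over-general": testing is approximate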
Results on Learning from Exercises
• 14 d-rules
Conclusions
• It is possible to learn useful problem-solving strategies in expressive representations.
• Prior knowledge can be put to good use in learning.
• Queries can be implemented approximately using heuristic techniques.
• Learning from demonstration and learning from exercises make different tradeoffs between learning and reasoning.
Learning for Training Environments (Ron Metoyer)
• Task training: sports, military
  (images: Electronic Arts; Boston Dynamics Inc.)
• Who creates the training content?
Research Challenges
• Learning must be on-line and fast, since users can give only a few examples.
• Extend to richer strategy languages with concurrency, partial observability, real-time execution, and multiple agents, e.g., ConGolog.
• Provide a predictable model of generalization.
• Allow learning from demonstrations, reinforcement, advice, and hints, e.g., improving strategies or learning to select between them.
Vehicle Routing & Product Delivery
Learning Challenges
• Very large number of states and actions
• Stochastic demands from customers and shops
• Multiple agents (trucks, truck companies, shops, distribution centers)
• Partial observability
• Hierarchical decision making
• Significant real-world impact
ICML Workshop on Relational Reinforcement Learning
• Paper deadline: April 2
• Check the ICML website