1. Robust Reasoning and Learning About Goal-Directed Activities

Pat Langley
School of Computing and Informatics, Arizona State University, Tempe, Arizona
Institute for the Study of Learning and Expertise, Palo Alto, California

Thanks to T. Konik, D. Choi, U. Kutur, and D. Nau for their contributions. This talk reports work funded by grants from DARPA, which is not responsible for its contents.

2. Abductive Plan Understanding

We can state the task of abductive plan understanding as:
• Given: A set of generalized conditional hierarchical plans;
• Given: A partial sequence of observed actions or events;
• Find: An explanation of these events in terms of other agents' goals and intentions.

We can also state a related task that involves plan learning:
• Given: A set of primitive action models (plan operators);
• Given: A set of partial action/event sequences with associated goals;
• Find: A set of generalized conditional hierarchical plans that explain these and future behaviors.
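
To make the two task statements concrete, here is a minimal Python sketch of their inputs and outputs; every name in it (Event, Explanation, understand, learn_plans) is an illustrative assumption, not part of any LIGHT implementation.

    # Illustrative signatures for the two tasks; all names are hypothetical.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Event:
        name: str          # e.g. "get-arrival-time"
        args: List[str]    # e.g. ["P2", "SFO", "hospital2"]

    @dataclass
    class Explanation:
        goal: Event                     # an inferred goal or intention
        children: List["Explanation"]   # how its subgoals were pursued

    def understand(plans: List[object], observed: List[Event]) -> Explanation:
        """Abductive plan understanding: given generalized conditional
        hierarchical plans and a partial event sequence, return an
        explanation in terms of the other agent's goals and intentions."""
        raise NotImplementedError

    def learn_plans(operators: List[object],
                    traces: List[List[Event]],
                    goals: List[Event]) -> List[object]:
        """Plan learning: given primitive action models and goal-labeled
        traces, induce generalized conditional hierarchical plans."""
        raise NotImplementedError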

3. LIGHT: Learning Plan Knowledge from Demonstration

[Architecture diagram: an expert produces demonstration traces (states and actions), which LIGHT combines with background knowledge (an action model and concept definitions) to learn plan knowledge in the form of HTNs; a reactive executor applies this plan knowledge to a problem (an initial state plus a goal) and invokes learning when it reaches an impasse.]

4. Inputs to LIGHT: Conceptual Knowledge

• Conceptual knowledge is cast as Horn clauses that specify relevant relations in the environment
• Hierarchically organized in memory
• Divided into primitive and nonprimitive predicates

Examples: the nonprimitive concept patient-form-filled (?patient) and the primitive concept assigned-mission (?patient ?mission).
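
As one way to picture this input, here is a minimal Python sketch of a Horn-clause concept; the Concept class, and the clause connecting the slide's two example predicates, are illustrative assumptions rather than LIGHT's internal format.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    Literal = Tuple[str, ...]   # predicate name plus arguments; '?x' marks a variable

    @dataclass
    class Concept:
        head: Literal                  # relation the clause defines
        body: List[Literal] = field(default_factory=list)

        @property
        def primitive(self) -> bool:
            # Primitive predicates are observed directly; nonprimitive
            # ones are defined hierarchically by Horn clauses.
            return not self.body

    # Hypothetical clause linking the slide's two example predicates:
    patient_form_filled = Concept(
        head=("patient-form-filled", "?patient"),
        body=[("assigned-mission", "?patient", "?mission")])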

5. Inputs to LIGHT: Action Models

• Operators describe low-level actions that agents can execute directly in the environment
• Preconditions: the legal conditions for executing an action
• Effects: the expected changes when the action is executed

Example: the action get-arrival-time (?patient ?from ?to) has the precondition patient(?patient) and travel-from(?patient ?from) and travel-to(?patient ?to), and the effect arrival-time(?patient).
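
In the same illustrative style, an action model might look like this; the Operator class is an assumption, with the slide's example filled in.

    from dataclasses import dataclass
    from typing import List, Tuple

    Literal = Tuple[str, ...]

    @dataclass
    class Operator:
        head: Literal                 # action name and arguments
        preconditions: List[Literal]  # legal conditions for execution
        effects: List[Literal]        # expected changes when executed

    get_arrival_time = Operator(
        head=("get-arrival-time", "?patient", "?from", "?to"),
        preconditions=[("patient", "?patient"),
                       ("travel-from", "?patient", "?from"),
                       ("travel-to", "?patient", "?to")],
        effects=[("arrival-time", "?patient")])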

6. Inputs to LIGHT: Expert Traces and Goals

• Expert demonstration traces record the actions the expert takes and the resulting belief states
• State: a set of concept instances, such as assigned-flight (P1 M1)
• Goal: a concept instance in the final state, such as all-patients-arranged
• From action instances like get-arrival-time(P2), LIGHT learns generalized skills that achieve similar goals
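
A demonstration trace can then be pictured as a sequence of action/state pairs plus a goal; again a hypothetical sketch, not LIGHT's actual input format.

    from dataclasses import dataclass
    from typing import List, Set, Tuple

    Literal = Tuple[str, ...]

    @dataclass
    class TraceStep:
        action: Literal      # action instance, e.g. ("get-arrival-time", "P2")
        state: Set[Literal]  # resulting belief state: a set of concept instances

    @dataclass
    class Trace:
        steps: List[TraceStep]
        goal: Literal        # concept instance in the final state,
                             # e.g. ("all-patients-arranged",)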

7. Outputs of LIGHT: HTN Methods

[Figure: an HTN in which each method links a goal concept to subgoals, with a precondition concept attached; methods bottom out in operators.]

• Methods decompose goals into subgoals
• If you have a goal and its precondition is satisfied, then apply its submethods or its operators
• Similar to regular HTNs, but methods are indexed by the goals they achieve
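
A goal-indexed method of this kind might be represented as follows; a minimal sketch under the same assumptions as the earlier ones.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    Literal = Tuple[str, ...]

    @dataclass
    class Method:
        goal: Literal                        # the concept this method achieves
        precondition: List[Literal]          # must hold before it applies
        subgoals: List[Literal] = field(default_factory=list)
        operator: Optional[Literal] = None   # primitive case: a single action

Indexing methods by the goals they achieve, rather than by arbitrary task names, is what later lets the same structures support abductive goal inference.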

8. Learning HTNs by Trace Analysis

[Figure: a demonstration trace laid out as alternating concepts (states) and actions.]

9. Learning HTNs by Trace Analysis

[Figure: action chaining, in which a goal is explained as an effect of an observed action, whose preconditions are then explained in turn.]

10. Learning HTNs by Trace Analysis

[Figure: concept chaining, in which a goal concept is explained through the subconcepts in its definition, again over alternating concepts and actions.]
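
The two chaining steps can be summarized in schematic Python; this is a reconstruction of the idea, not LIGHT's published algorithm, and the tables EFFECTS, PRECONDS, and DEFINITIONS are hypothetical stand-ins for the action models and concept definitions given as input.

    EFFECTS = {}       # operator name -> set of predicates it achieves
    PRECONDS = {}      # operator name -> list of precondition literals
    DEFINITIONS = {}   # nonprimitive predicate -> list of subconcept literals

    def explain(goal, actions, t):
        """Explain how `goal` came to hold after step t of a trace,
        returning a (goal, subexplanations) tree; each such tree
        generalizes into one HTN method."""
        if t < 0:
            return (goal, [])                 # goal held in the initial state
        action = actions[t]
        if goal[0] in EFFECTS.get(action[0], set()):
            # Action chaining: the goal is an effect of the observed
            # action, so explain that action's preconditions earlier.
            subs = [explain(p, actions, t - 1) for p in PRECONDS[action[0]]]
        elif goal[0] in DEFINITIONS:
            # Concept chaining: the goal is a nonprimitive concept, so
            # explain each subconcept in its Horn-clause definition.
            subs = [explain(c, actions, t) for c in DEFINITIONS[goal[0]]]
        else:
            subs = []                         # primitive relation, observed directly
        return (goal, subs)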

11. Explanation Structure

[Explanation tree relating the goal transfer-hospital(patient1, hospital2) to the subgoals close-airport(hospital2, SFO), arrange-ground-transportation(SFO, hospital2, 1pm), and location(patient1, SFO, 1pm) at time 3; the latter is supported by assigned(patient1, NW32), arrival-time(NW32, 1pm), and dest-airport(patient1, SFO), achieved by the actions assign(patient1, NW32) at time 1 and query-arrival-time at time 2, given that flight NW32 is scheduled and available.]

12. Hierarchical Task Network

[The corresponding generalized HTN: transfer-hospital(?patient ?hospital) with condition close-airport(?hospital ?loc) and subgoals arrange-ground-transportation(?loc ?hospital ?time) and location(?patient ?loc ?time); the latter decomposes into assigned(?patient ?flight), arrival-time(?flight ?time), and dest-airport(?patient ?loc), achieved by the operators assign(?patient ?flight) and query-arrival-time, given scheduled(?flight) and flight-available.]

13. Adapting HTNs to Plan Understanding

HTNs and methods for learning them (like LIGHT) are typically designed for generating and executing plans. To adapt HTNs to plan understanding, we must revise the framework to support abductive inference when:
• actions and events are only partially observed;
• some goals and plans are more likely than others;
• observations of others' behaviors are inherently noisy.
These characteristics require extensions to our representation, performance methods, and learning mechanisms.

14. Markov Task Networks

To this end, we have designed a new representational formalism for plan knowledge, Markov task networks, which includes:
• A set of goal-indexed HTN methods, each with:
  • a prior probability P(G) on the method's goal G
  • a conditional probability P(C | M) for its precondition C
  • a conditional probability P(S | M) for each subgoal S
• A set of Horn clauses, each with:
  • a prior probability P(H) on the clause's head H
  • a conditional probability P(B | H) for each condition B
This framework appears better suited to abductive inference about goal-directed behavior than Markov logic.
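
Concretely, these annotations might extend the earlier method sketch as follows; the field names are assumptions that mirror the probabilities listed above.

    from dataclasses import dataclass, field
    from typing import Dict, Tuple

    Literal = Tuple[str, ...]

    @dataclass
    class MarkovMethod:
        goal: Literal
        prior: float                    # P(G): prior on the method's goal
        p_precondition: float           # P(C | M): precondition given the method
        p_subgoals: Dict[Literal, float] = field(default_factory=dict)
                                        # P(S | M) for each subgoal S

    @dataclass
    class MarkovClause:
        head: Literal
        prior: float                    # P(H): prior on the clause's head
        p_conditions: Dict[Literal, float] = field(default_factory=dict)
                                        # P(B | H) for each condition B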

15. Markov Task Networks

[Figure: a goal-indexed HTN in which each node is annotated with P(Goal), P(Precondition | Method), or P(Subgoal | Method).]

A Markov task network is a goal-indexed HTN with probabilities for:
• goals the agent may aim to achieve
• subgoals the agent may pursue when using a given method
• preconditions that suggest the agent is using the method
• constraints among the subgoal orders
It also includes probabilistic information about relevant relational concepts.

16. Inference Over Markov Task Networks

We can estimate the posterior probability of each goal G in a Markov task network, given a sequence of observed states, by recursively computing a score for G that multiplies its prior P(G) by the conditional probabilities of the matching method's precondition and subgoals, where the recursion bottoms out with a score of 1 when a term R is a primitive relation that occurs in the observed states and 0 otherwise. To obtain actual probabilities, we normalize these scores so that they sum to one over the candidate goals. This is a variant on cascaded Bayesian classifiers (Provan et al., 1996).
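
A minimal sketch of this computation, assuming the MarkovMethod and MarkovClause structures from slide 14, a single observed state for simplicity, and conditional independence among relations, in the spirit of cascaded Bayesian classifiers.

    def score(goal, state, methods, clauses):
        """Recursively score one goal against an observed state. Base
        case: a primitive relation scores 1 if it occurs in the state
        and 0 otherwise (a sketch, not the published formulation)."""
        if goal in state:
            return 1.0
        if goal[0] in methods:              # goal-indexed HTN method
            m = methods[goal[0]]
            s = m.prior * m.p_precondition
            for sub, p in m.p_subgoals.items():
                s *= p * score(sub, state, methods, clauses)
            return s
        if goal[0] in clauses:              # conceptual Horn clause
            c = clauses[goal[0]]
            s = c.prior
            for cond, p in c.p_conditions.items():
                s *= p * score(cond, state, methods, clauses)
            return s
        return 0.0

    def posteriors(goals, state, methods, clauses):
        """Normalize so the candidate goals' probabilities sum to one."""
        raw = {g: score(g, state, methods, clauses) for g in goals}
        z = sum(raw.values()) or 1.0
        return {g: v / z for g, v in raw.items()}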

17. Learning in Markov Task Networks

Like other probabilistic frameworks, Markov task networks require two forms of learning:
• Parameter estimation occurs either:
  • by simple counting, as in naïve Bayes, in the fully supervised case where all goals/subgoals are given
  • by expectation maximization in the partly supervised case where only the top-level goal is provided
• Structure learning occurs as in LIGHT, except that:
  • explanation takes advantage of methods learned earlier
  • this process finds the most probable account of events
Both forms of learning should be computationally efficient and require few training cases.
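
For the fully supervised case, a minimal counting sketch; it assumes each training case records the goal, the method used, and the subgoals pursued, which is our framing for illustration rather than anything specified on the slide.

    from collections import Counter

    def estimate_parameters(cases):
        """Maximum-likelihood estimates of P(G) and P(S | M) by simple
        counting, as in naive Bayes (fully supervised case)."""
        goal_n, method_n, pair_n = Counter(), Counter(), Counter()
        for goal, method, subgoals in cases:
            goal_n[goal] += 1
            method_n[method] += 1
            for s in subgoals:
                pair_n[(method, s)] += 1
        total = len(cases)
        p_goal = {g: c / total for g, c in goal_n.items()}
        p_subgoal = {(m, s): c / method_n[m] for (m, s), c in pair_n.items()}
        return p_goal, p_subgoal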

18. Learning Markov Task Networks by Trace Analysis

Trace analysis proceeds as before, but guided by probabilistic inference that allows for:
• missing conceptual relations in states
• missing actions that connect states
When an existing method is used to explain a trace, its probabilities are updated accordingly.

[Figure: a demonstration trace with gaps labeled as missing concepts and missing actions.]

19. Plans for Future Research

To evaluate the framework of Markov task networks, we must:
• Implement the performance and learning algorithms
• Design tasks in realistic simulators like OneSAF and MadRTS
• Use these simulators to generate sequences of observed states
• Provide background knowledge about these domains
• Measure accuracy of goal inference given handcrafted task networks
• Measure ability of learned task networks to produce similar results
Experimental results of this sort will suggest ways to improve our formulation and its methods for inference and learning.

20. Related Work on Abduction and Learning

Our approach incorporates ideas from a number of traditions:
• Hierarchical task networks (Nau et al., 1999; Choi & Langley, 2005)
• Logical methods for abductive inference (Ng & Mooney, 1990)
• Relational Bayesian classifiers (Flach & Lachiche, 1999)
• Cascaded Bayesian classifiers (Provan, Langley, & Binford, 1996)
• Explanation-based learning from expert traces (Segre, 1987)
• Statistical relational learning (Muggleton, 1996; Domingos, 2004)
However, it adapts and combines them in ways appropriate to the task of abductive plan understanding and learning.

  21. End of Presentation

22. Hierarchical Concepts

Nonprimitive concepts:

    (in-rightmost-lane ?self ?clane)
      :percepts  (self ?self) (segment ?seg) (line ?clane segment ?seg)
      :relations (driving-well-in-segment ?self ?seg ?clane)
                 (last-lane ?clane)
                 (not (lane-to-right ?clane ?anylane))

    (driving-well-in-segment ?self ?seg ?lane)
      :percepts  (self ?self) (segment ?seg) (line ?lane segment ?seg)
      :relations (in-segment ?self ?seg)
                 (in-lane ?self ?lane)
                 (aligned-with-lane-in-segment ?self ?seg ?lane)
                 (centered-in-lane ?self ?seg ?lane)
                 (steering-wheel-straight ?self)

Primitive concept:

    (in-lane ?self ?lane)
      :percepts (self ?self segment ?seg) (line ?lane segment ?seg dist ?dist)
      :tests    (> ?dist -10) (<= ?dist 0)

23. Hierarchical Methods

Nonprimitive skills:

    (in-rightmost-lane ?self ?line)
      :percepts (self ?self) (line ?line)
      :start    (last-lane ?line)
      :subgoals (driving-well-in-segment ?self ?seg ?line)

    (driving-well-in-segment ?self ?seg ?line)
      :percepts (segment ?seg) (line ?line) (self ?self)
      :start    (steering-wheel-straight ?self)
      :subgoals (in-segment ?self ?seg)
                (centered-in-lane ?self ?seg ?line)
                (aligned-with-lane-in-segment ?self ?seg ?line)
                (steering-wheel-straight ?self)

Primitive skill:

    (in-segment ?self ?endsg)
      :percepts (self ?self speed ?speed)
                (intersection ?int cross ?cross)
                (segment ?endsg street ?cross angle ?angle)
      :start    (in-intersection-for-right-turn ?self ?int)
      :actions  (steer 1)
