
Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation


Presentation Transcript


  1. Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation J. William Murdock Intelligent Decision Aids Group Navy Center for Applied Research in Artificial Intelligence Naval Research Laboratory, Code 5515 Washington, DC 20375 bill@murdocks.org http://bill.murdocks.org Presentation at NIST – March 18, 2002

  2. Adaptation • People adapt very well. • They figure out how to do new things. • If something doesn’t work, they try something else. • They understand how and why they are doing things. • Computer programs do not adapt very well. • They can only do what they are programmed for. • They keep making the same mistakes. • They have no understanding of themselves. Can we make computer programs adapt?

  3. REM (Reflective Evolutionary Mind) • Operating environment for intelligent agents • Provides support for adaptation to new functional requirements • Uses functional models, generative planning, and reinforcement learning • J. William Murdock and Ashok K. Goel

  4. Example: Web Browsing Agent • A mock-up of web browsing software • Based on Mosaic for X Windows, version 2.4 • Imitates not only behavior but also internal process and information of Mosaic 2.4 [Diagram: document types handled, e.g., ps, html, pdf, txt]

  5. Example: Disassembly and Assembly • Software agent for disassembly in the domain of cameras • Information about cameras • Information about relevant actions • e.g., pulling, unscrewing, etc. • Information about disassembly processing • e.g., decide how to disconnect subsystems from each other and then decide how to disassemble those subsystems separately. • Agent now needs to assemble a camera

  6. TMK (Task-Method-Knowledge) [Diagram: an example TMK model of web access, with an Access task, Remote and Local methods, subtasks such as Request, Receive, and Store, and knowledge of URLs, servers, documents, etc.] • TMK models provide the agent with knowledge of its own design. • TMK encodes: • Tasks: functional specification / requirements and results • Methods: behavioral specification / composition and control • Knowledge: domain concepts and relations
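
To make these TMK notions concrete, here is a minimal Python sketch of the kind of structure a TMK model encodes. The class and attribute names are illustrative assumptions, not REM's actual data structures (REM is built on LOOM), and method bodies are simplified to a series of subtasks rather than TMKL's states and transitions.

    class Task:
        """Functional specification: what a task requires and what it produces."""
        def __init__(self, name, given=None, makes=None, methods=(), procedure=None):
            self.name = name
            self.given = given              # required condition
            self.makes = makes              # resulting condition
            self.methods = list(methods)    # non-primitive: methods that accomplish the task
            self.procedure = procedure      # primitive: directly executable code

        def is_primitive(self):
            return self.procedure is not None

        def is_implemented(self):
            return self.is_primitive() or bool(self.methods)


    class Method:
        """Behavioral specification: how subtasks are composed and controlled."""
        def __init__(self, name, provided=None, subtasks=()):
            self.name = name
            self.provided = provided        # condition under which the method may be used
            self.subtasks = list(subtasks)  # simplified: a series instead of states/transitions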

  7. REM Reasoning Process [Diagram: an implemented task plus a set of input values goes to Execution, which chooses among the task's methods and produces a trace and a set of output values; an unimplemented task plus a set of input values goes to Adaptation, which produces an ADAPTED implemented task and an ADAPTED method that can then be executed.]
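
A minimal sketch of this top-level loop, under the assumptions of the Task/Method classes above; adapt stands in for the adaptation mechanisms on the next slide, and execute is sketched under the Execution Process slide below.

    def adapt(task):
        """Placeholder for REM's adaptation mechanisms (see the Adaptation Process slide)."""
        raise NotImplementedError

    def rem_run(task, input_values):
        """If the task is unimplemented, adapt the model first; then execute and return the outputs."""
        if not task.is_implemented():
            task = adapt(task)              # yields an ADAPTED implemented task
        return execute(task, input_values)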

  8. Adaptation Process [Diagram: an unimplemented task and a set of input values are turned into an ADAPTED implemented task and an ADAPTED method by one of the adaptation mechanisms: Generative Planning, the Situator (for Q-Learning), Proactive Model Transfer (starting from an existing method for a similar implemented task), or Failure-Driven Model Transfer (starting from a trace of a failed method).]

  9. Execution Process [Diagram: given an implemented task and a set of input values, execution repeatedly selects a method, selects the next task within that method, and executes primitive tasks, recording a trace and producing a set of output values.]
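
A rough sketch of this cycle using the Task/Method classes above; it simplifies method bodies to a fixed series of subtasks and picks the first method rather than using REM's provided conditions and Q-learning (next two slides).

    def execute(task, values, trace=None):
        """Walk the model: run primitives directly, otherwise pick a method and step through it."""
        trace = [] if trace is None else trace
        if task.is_primitive():
            outputs = task.procedure(values)            # Execute Primitive Task
            trace.append((task.name, values, outputs))  # record the trace
            return outputs
        method = task.methods[0]                        # Select Method (REM chooses via Q-learning)
        for subtask in method.subtasks:                 # Select Next Task Within Method
            values = execute(subtask, values, trace)
        return values                                   # Set of Output Values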

  10. Selection: Q-Learning • Popular, simple form of reinforcement learning. • In each state, each possible decision is assigned an estimate of its potential value (“Q”). • For each decision, preference is given to higher Q values. • Each decision is reinforced, i.e., its Q value is altered based on the results of the actions. • These results include actual success or failure and the Q values of the next available decisions.
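
The standard one-step Q-learning update, as a sketch. It assumes a nested-dict Q table with entries initialized to 0; the learning-rate (alpha) and discount (gamma) values are illustrative, not taken from the slides.

    def q_update(Q, state, decision, reward, next_state, alpha=0.1, gamma=0.9):
        """Move Q(state, decision) toward the reward plus the value of the best next decision."""
        best_next = max(Q.get(next_state, {}).values(), default=0.0)
        Q[state][decision] += alpha * (reward + gamma * best_next - Q[state][decision])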

  11. Q-Learning in REM • Decisions are made for method selection and for selecting new transitions within a method. • A decision state is a point in the reasoning (i.e., task, method) plus the set of all decisions which have been made in the past. • Initial Q values are set to 0. • REM either decides on the option with the highest Q value or randomly selects an option with probabilities weighted by Q value (configurable). • A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
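
A sketch of the two selection strategies described above; the function name, the nested-dict Q table, and the small positive offset that keeps the random weights valid when all Q values are still 0 are assumptions for illustration, not REM internals.

    import random

    def select_option(Q, decision_state, options, greedy=True):
        """Pick the highest-Q option, or sample options with Q-weighted probabilities."""
        if greedy:
            return max(options, key=lambda o: Q[decision_state][o])
        weights = [max(Q[decision_state][o], 0.0) + 1e-6 for o in options]
        return random.choices(options, weights=weights, k=1)[0]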

  12. Task-Method-Knowledge Language (TMKL) • A new, powerful formalism of TMK developed for REM. • Uses LOOM, a popular off-the-shelf knowledge representation framework: concepts, relations, etc. REM models not only the tasks of the domain but also itself in TMKL.

  13. Tasks in TMKL • All tasks can have input & output parameter lists and given & makes conditions. • A non-primitive task must have one or more methods which accomplish it. • A primitive task must include one or more of the following: source code, a logical assertion, a specified output value. • Unimplemented tasks have none of these.

  14. TMKL Task
    (define-task communicate-with-www-server
      :input (input-url)
      :output (server-reply)
      :makes (:and (document-at-location (value server-reply)
                                         (value input-url))
                   (document-at-location (value server-reply)
                                         local-host))
      :by-mmethod (communicate-with-server-method))

  15. Methods in TMKL • Methods have "provided" and "additional result" conditions, which specify incidental requirements and results. • In addition, a method specifies a start transition for its processing control. • Each transition specifies requirements for using it and the new state that it goes to. • Each state has a task and a set of outgoing transitions.

  16. Simple TMKL Method
    (define-mmethod external-display
      :provided (:not (internal-display-tag (value server-tag)))
      :series (select-display-command
               compile-display-command
               execute-display-command))

  17. Complex TMKL Method
    (define-mmethod make-plan-node-children-mmethod
      :series (select-child-plan-node
               make-subplan-hierarchy
               add-plan-mappings
               set-plan-node-children))
    (tell
      (transition>links make-plan-node-children-mmethod-t3
                        equivalent-plan-nodes
                        child-equivalent-plan-nodes)
      (transition>next make-plan-node-children-mmethod-t5
                       make-plan-node-children-mmethod-s1)
      (:create make-plan-node-children-terminate transition)
      (reasoning-state>transition make-plan-node-children-mmethod-s1
                                  make-plan-node-children-terminate)
      (:about make-plan-node-children-terminate
              (transition>provided
                '(terminal-addam-value (value child-plan-node)))))

  18. Knowledge in TMKL • Foundation: LOOM • Concepts, instances, relations • Concepts and relations are instances and can have facts about them. • Knowledge representation in TMKL involves LOOM plus some TMKL-specific reflective concepts and relations.

  19. Some TMKL Knowledge Modeling
    (defconcept location)
    (defconcept computer :is-primitive location)
    (defconcept url
      :is-primitive location
      :roles (text))
    (defrelation text
      :range string
      :characteristics :single-valued)
    (defrelation document-at-location
      :domain reply
      :range location)
    (tell (external-state-relation document-at-location))

  20. Sample Meta-Knowledge in TMKL
  • Relation characteristics: single-valued/multiple-valued; symmetric, commutative
  • Relations over relations: external/internal; state/definitional
  • Generic relations: same-as, instance-of, inverse-of
  • Concepts involving concepts: thing, meta-concept, concept

  21. Web Browsing Agent • Mock-up of a web browser: steps through the web-browsing process • Interactive domain: the web agent is affected by the user and by the network • Dynamic domain: both users and networks often change • Knowledge-intensive domain: documents, networks, servers, local software, etc.

  22. Tasks and Methods of Web Agent [Diagram: the web agent's task-method hierarchy. Process URL is accomplished by the Process URL Method, whose subtasks are Communicate with WWW Server and Display File; the Communicate with WWW Server Method has subtasks Request from Server, Receive from Server, and Interpret Reply; the Display File Method has subtask Display Interpreted File, accomplished by either External Display (Select Display Command, Compile Display Command, Execute Display Command) or Internal Display (Execute Internal Display).]

  23. Example: PDF Viewer • The web agent is asked to browse the URL for a PDF file. It does not have any information about external viewers for PDF. • Because the agent already has a task for browsing URLs, it is executed first. • When the system fails, the user provides feedback indicating the correct viewer. • Failure-Driven Model Transfer

  24. Web Agent Adaptation [Diagram: the External Display method before and after adaptation. Before: Select Display Command, Compile Display Command, Execute Display Command. After: the same series, but Select Display Command now has a Base Method and an Alternate Method, which accomplish a Select Display Command Base Task and a Select Display Command Alternate Task.]

  25. Physical Device Disassembly • ADDAM: Legacy software agent for case-based, design-level disassembly planning and (simulated) execution • Interactive: Agent connects to a user specifying goals and to a complex physical environment • Dynamic: New designs and demands • Knowledge Intensive: Designs, plans, etc.

  26. Disassembly → Assembly • A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly. • ADDAM has no assembly method, so it must adapt first. • Since assembly is similar to disassembly, REM selects Proactive Model Transfer.

  27. Pieces of ADDAM which are key to Disassembly → Assembly [Diagram: the relevant portion of ADDAM's task-method hierarchy, including Disassemble, Plan Then Execute Disassembly, Adapt Disassembly Plan, Execute Plan, Topology Based Plan Adaptation, Hierarchical Plan Execution, Make Plan Hierarchy, Map Dependencies, Select Next Action, Execute Action, Select Dependency, Assert Dependency, Make Equivalent Plan Nodes Method, Make Equivalent Plan Node, and Add Equivalent Plan Node.]

  28. New Adapted Task in Disassembly → Assembly [Diagram: the adapted hierarchy for Assemble, mirroring the disassembly hierarchy with COPIED elements (Plan Then Execute Disassembly, Adapt Disassembly Plan, Execute Plan, Topology Based Plan Adaptation, Hierarchical Plan Execution, Make Plan Hierarchy, Map Dependencies, Execute Action, Select Dependency, Make Equivalent Plan Nodes Method, Make Equivalent Plan Node, Add Equivalent Plan Node), an INVERTED Assert Dependency, an INSERTED Inversion Task 1 (after Make Equivalent Plan Node), and an INSERTED Inversion Task 2 (after Select Next Action).]

  29. Task: Assert Dependency
  Before:
    define-task Assert-Dependency
      input: target-before-node, target-after-node
      asserts: (node-precedes (value target-before-node)
                              (value target-after-node))
  After:
    define-task Mapped-Assert-Dependency
      input: target-before-node, target-after-node
      asserts: (node-follows (value target-before-node)
                             (value target-after-node))

  30. Task: Make Equivalent Plan Node
    define-task make-equivalent-plan-node
      input: base-plan-node, parent-plan-node, equivalent-topology-node
      output: equivalent-plan-node
      makes: (:and (plan-node-parent (value equivalent-plan-node)
                                     (value parent-plan-node))
                   (plan-node-object (value equivalent-plan-node)
                                     (value equivalent-topology-node))
                   (:implies (plan-action (value base-plan-node))
                             (type-of-action (value equivalent-plan-node)
                                             (type-of-action (value base-plan-node)))))
      by procedure ...

  31. Task: Inserted Reversal Task
    define-task inserted-reversal-task
      input: equivalent-plan-node
      asserts: (type-of-action (value equivalent-plan-node)
                               (inverse-of (type-of-action (value equivalent-plan-node))))

  32. ADDAM Example: Layered Roof

  33. Roof Assembly

  34. Modified Roof Assembly: No Conflicting Goals

  35. Applicability of Proactive Model Transfer • Knowledge about the concepts and relations in the domain • Knowledge about how the tasks and methods affect these concepts and relations • Differences between the old task and the new one map onto knowledge of the concepts and relations in the domain.

  36. Applicability of Failure-Driven Model Transfer • May need less knowledge about the domain itself, since the adaptation is grounded in a specific incident. • e.g., feedback about PDF for one example instead of advance knowledge of all document types. • Still requires knowledge about how the tasks and methods interact with the domain.

  37. Additional Mechanisms • Model-based adaptation may leave some design decisions unresolved. • These decisions may be resolved by traditional decision-making mechanisms, e.g., reinforcement learning. • Models may be unavailable or irrelevant for some tasks or subtasks. • Generative planning can combine primitive actions.

  38. Level of Decomposition • The level of decomposition may be dictated by the nature of the agent. • Some tasks simply cannot be decomposed. • In other situations, the level of decomposition may be guided by the nature of the adaptation to be done. • Can be brittle if unpredicted demands arise. • REM enables autonomous decomposition of primitives, which addresses this problem.

  39. Computational Costs • Reasoning about models incurs some costs. • For very easy problems, this overhead may not be justified. • For other problems, the benefits enormously outweigh these costs. Models can localize planning and learning.

  40. Knowledge Requirements • Someone has to build an agent. • The builder should know what that agent does and how it does it → can make a model. • An analyst may be able to understand the builder’s notes, etc. → can make a model. • Some evidence for this in the context of software engineering / architectural extraction.

  41. Current Work: AHEAD • Theme: Analyzing hypotheses regarding asymmetric threats (e.g., criminals, terrorists). • Input: Hypotheses regarding a potential threat • Output: Argument for and/or against the hypotheses • Technique: Analogy over functional models • An extension to TMKL will encode known behaviors for asymmetric threats and the purposes that the behaviors serve. • Analogical reasoning will enable retrieval and mapping of new hypotheses to existing models. • Models will provide arguments about how observed actions do or do not support the purposes of the hypothesized behavior. • Naval Research Laboratory / DARPA Evidence Extraction and Link Discovery program • David Aha, J. William Murdock, Len Breslow

  42. Summary • REM (Reflective Evolutionary Mind) • Operating environment for agents that adapt • TMKL (Task-Method-Knowledge Language) • The language for agents in REM • Functional modeling language for encoding computational processes • Adaptation • Some kinds of adaptation can be performed using specialized model-based techniques • Others require more generic planning & learning mechanisms (localized using models)
