AI – Week 17 Machine Learning Applied to AI Planning: LOCM

Lee McCluskey, room 2/09
Email: lee@hud.ac.uk
http://scom.hud.ac.uk/scomtlm/cha2555/

Will we always need to engineer knowledge bases for planners?

E.g. (PDDL, Pipes World):

……
(:durative-action PUSH-START
  :parameters (?pipe - pipe
               ?batch-atom-in - batch-atom
               ?from-area - area
               ?to-area - area
               ?first-batch-atom - batch-atom
               ?product-batch-atom-in - product
               ?product-first-batch - product)
  :duration (= ?duration (/ 1 (speed ?pipe)))
  :condition (and (over all (normal ?pipe))
                  (at start (first ?first-batch-atom ?pipe))
                  (at start (connect ?from-area ?to-area ?pipe))
                  (at start (on ?batch-atom-in ?from-area))
                  (at start (not-unitary ?pipe))
                  (at start (is-product ?batch-atom-in ?product-batch-atom-in))
                  (at start (is-product ?first-batch-atom ?product-first-batch))
                  (at start (may-interface ?product-batch-atom-in ?product-first-batch)))
  :effect (and (at end (push-updating ?pipe))
               (at end (not (normal ?pipe)))
               (at end (first ?batch-atom-in ?pipe))
               (at start (not (first ?first-batch-atom ?pipe)))
               (at end (follow ?first-batch-atom ?batch-atom-in))
               (at start (not (on ?batch-atom-in ?from-area)))))

ENGINEERED WITH VARIABLE DOMAINS, RELATIONS, PROPERTIES, CONDITIONS ..

Machine Learning applied to AI Planning

Automated Knowledge Acquisition: learning the domain model.

One promising direction:
• Give a learning system a number of PLANS, or let it “observe” plans.
• Get the learning system to learn (“mine”) the details of the actions by “inducing” the operator schemas.

This could be termed “process mining”.

Example applications:
• Learn the effects of operating system instructions
• Learn the moves / rules in a game
• Learn the meaning of actions in a work-flow
• Learn the meaning of business processes

Learning PDDL Domain Models: Where would training plans come from?

Training plan scripts could come from several types of activity:
1. (Goal-directed) solutions from current planners using existing domain models
2. Random plan scripts generated using existing domain models
3. Harvested plan scripts from human activities such as game playing
4. Recorded or logged plan scripts from computer or natural processes

Example System: LOCM learning of object-centred models

We will use the “tyre world” as a running example.

Trace of changing a car wheel:
  fetch_jack jack1 boot1
  remove_wheel wheel0 hub0 jack0
  jack_up hub1 jack1
  put_on_wheel wheel2 hub0 jack0
  fetch_wrench wrench0 boot0
  jack_down hub1 jack1
  putaway_wrench wrench1 boot0
  ……………….. etc

LOCM output: meaning of the “remove_wheel” action in planning-ready format:

(:action remove_wheel
  :parameters (?Wheel1 - wheel ?Hub2 - hub ?Jack3 - jack)
  :precondition (and (wheel_state2 ?Wheel1 ?Hub2)
                     (hub_state1 ?Hub2 ?Jack3 ?Wheel1)
                     (jack_state1 ?Jack3 ?Hub2))
  :effect (and (wheel_state1 ?Wheel1)
               (not (wheel_state2 ?Wheel1 ?Hub2))
               (hub_state0 ?Hub2 ?Jack3)
               (not (hub_state1 ?Hub2 ?Jack3 ?Wheel1))
               (jack_state0 ?Jack3 ?Hub2)
               (not (jack_state1 ?Jack3 ?Hub2))))

Inducing Action Semantics from Traces

LOCM assumptions (a ‘sort’ roughly means a ‘class’):
• The behaviour of the objects in a sort can be represented by a finite state machine (FSM).
• The output state of an object in one action is the same as the input state of that object in the next action that affects it.
• Objects that occur together over 2 or more actions indicate associations between object sorts.
• The same name is used for the same action.
• Each occurrence of the same action has the same number of objects of the same sort as arguments.

Example trace:
  fetch_jack jack1 boot1
  remove_wheel wheel0 hub0 jack0
  jack_up hub1 jack1
  put_on_wheel wheel2 hub0 jack0
  fetch_wrench wrench0 boot0
  jack_down hub1 jack1
  putaway_wrench wrench1 boot0

LOCM - assumptions

INPUTS: traces of “plans”, e.g. one plan in the tyre world might be:
  open(c1); fetchjack(j1,c1); fetchwrench(wr1,c1); close(c1); open(c2); fetchwrench(wr2,c2); fetchjack(j2,c2); close(c2); close(c3); open(c3)

OUTPUTS: PDDL Domain Model

LOCM assumptions to do with regularity:
• Each sequence contains action names, each followed by a list of parameters, which are the objects used by that action
• Different instances of an action have the same number of parameters, in the same order, and of the same type (sort)
• Sequences are SOUND (the actions can be executed in turn)
• The objects referred to by the training plans can thus be partitioned into a set of distinct “sorts”.

Cresswell, S.N., McCluskey, T.L. and West, Margaret M. (2013) Acquiring planning domain models using LOCM. Knowledge Engineering Review. ISSN 0269-8889, http://eprints.hud.ac.uk/9052/
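The partition into sorts described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the published LOCM implementation: the function names `parse_plan` and `induce_sorts` are hypothetical, and the grouping rule assumed here is simply that all objects appearing in the same argument position of the same action name belong to the same sort.

```python
import re
from collections import defaultdict

def parse_plan(text):
    """Turn 'open(c1); fetchjack(j1,c1)' into [('open', ('c1',)), ...]."""
    steps = []
    for name, args in re.findall(r"(\w+)\s*\(([^)]*)\)", text):
        steps.append((name, tuple(a.strip() for a in args.split(","))))
    return steps

def induce_sorts(plans):
    """Group objects that share an (action name, argument position) slot."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for plan in plans:
        for name, args in plan:
            for pos, obj in enumerate(args, start=1):
                # Merge the object with the slot it fills.
                parent[find(("slot", name, pos))] = find(("obj", obj))

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "obj":
            groups[find(node)].add(node[1])
    return sorted(sorted(g) for g in groups.values())

plan = parse_plan(
    "open(c1); fetchjack(j1,c1); fetchwrench(wr1,c1); close(c1); "
    "open(c2); fetchwrench(wr2,c2); fetchjack(j2, c2); close(c2); "
    "close(c3); open(c3)"
)
print(induce_sorts([plan]))
```

On the tyre-world plan this yields three groups, corresponding to the boots (c1, c2, c3), the jacks (j1, j2) and the wrenches (wr1, wr2).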

LOCM - more assumptions

  open(c1); fetchjack(j1,c1); fetchwrench(wr1,c1); close(c1); open(c2); fetchwrench(wr2,c2); fetchjack(j2,c2); close(c2); close(c3); open(c3)

LOCM assumptions to do with behaviour:
• All objects of the same sort behave in the same way, and the states of an object can be described by an FSM
• For every specific object: the output state of one action is the SAME as the input state of the next action affecting it
• For each action instance, an object it is applied to always starts and finishes in the same state (transitions are 1-1)

INDUCED STATES OF CAR “BOOT” SORT
[Figure: state machine with two states, boot0 (closed) and boot1 (open), and transitions open.1, close.1, fetch_jack.2 and fetch_wrench.2]

LOCM - create state machines

  open(c1); fetchjack(j1,c1); fetchwrench(wr1,c1); close(c1); open(c2); fetchwrench(wr2,c2); fetchjack(j2,c2); close(c2); close(c3); open(c3)

From the first 4 actions there are 8 possible states of a boot (e.g. c1), S1 - S8:
  S1 => open.1 => S2
  S3 => close.1 => S4
  S5 => fetch_jack.2 => S6
  S7 => fetch_wrench.2 => S8

These collapse to 2 STATES when behaviour assumptions 2 and 3 are applied.

INDUCED STATES OF CAR “BOOT” SORT
[Figure: state machine with two states, boot0 (closed) and boot1 (open), and transitions open.1, close.1, fetch_jack.2 and fetch_wrench.2]
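The collapse of start and end states can be illustrated with a small union-find sketch in Python. It is a simplified reading of the two behaviour assumptions, not the authors' code: every transition label (action name plus argument position) gets one start and one end state, and the end state of one transition is merged with the start state of the next transition on the same object. The name `induce_state_machine` is hypothetical, and the whole ten-action plan is used as input.

```python
PLAN = [
    ("open", ("c1",)), ("fetchjack", ("j1", "c1")),
    ("fetchwrench", ("wr1", "c1")), ("close", ("c1",)),
    ("open", ("c2",)), ("fetchwrench", ("wr2", "c2")),
    ("fetchjack", ("j2", "c2")), ("close", ("c2",)),
    ("close", ("c3",)), ("open", ("c3",)),
]

def induce_state_machine(plan):
    """Map each transition label to a (start_state, end_state) pair."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    last = {}  # object -> label of the last transition that touched it
    for name, args in plan:
        for pos, obj in enumerate(args, start=1):
            label = f"{name}.{pos}"
            start, end = (label, "start"), (label, "end")
            find(start), find(end)  # register both ends of the transition
            if obj in last:
                # Output state of the previous action = input state of this one.
                parent[find((last[obj], "end"))] = find(start)
            last[obj] = label

    ids = {}

    def state(node):
        # Give each equivalence class a small integer id.
        return ids.setdefault(find(node), len(ids))

    labels = sorted({label for label, _ in parent})
    return {lab: (state((lab, "start")), state((lab, "end"))) for lab in labels}

machine = induce_state_machine(PLAN)
for label in ("open.1", "close.1", "fetchjack.2", "fetchwrench.2"):
    print(label, machine[label])
```

For the four boot transitions only two distinct state ids appear: open.1 and close.1 move between them in opposite directions, while fetchjack.2 and fetchwrench.2 loop on the state reached by open.1.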

LOCM – inductive generalisation

• New example:
  open(c1); putawayjack(j1,c1); close(c1); open(c2); putawayjack(j2,c2); open(c1); fetchjack(j1,c1); fetchwrench(wr1,c1); fetchjack(j2,c2); close(c1);

Consider one action A taking an object x of sort S1 into a state T, and another action B taking object x out of that state. Assume that A and B both also refer to some other sort S2.

If *every time* in training it is observed that, when A and then B are executed on the same object of sort S1, the SAME object of sort S2 is recorded, then we induce that state T has an association with objects of sort S2.

Example above: putawayjack(j2,c2) …. fetchjack(j2,c2): the same object c2 is referred to, hence we induce an association with sort “boot”.

Not so for e.g. fetchjack(j2,c1) .. putawayjack(j1,c1): here we cannot say that the boot is associated with the jack.
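The "every time" test can be sketched as a simple check over a trace. This is a simplified illustration under stated assumptions rather than the LOCM algorithm itself: the function name `association_holds` and its positional arguments are hypothetical, "A and then B" is read as B being the next action to mention the object after A, and the sixth step of the example trace is taken to be open(c1).

```python
TRACE = [
    ("open", ("c1",)), ("putawayjack", ("j1", "c1")), ("close", ("c1",)),
    ("open", ("c2",)), ("putawayjack", ("j2", "c2")),
    ("open", ("c1",)),  # assumption: the sixth step is read as open(c1)
    ("fetchjack", ("j1", "c1")), ("fetchwrench", ("wr1", "c1")),
    ("fetchjack", ("j2", "c2")), ("close", ("c1",)),
]

def association_holds(plan, a, b, obj_pos=0, other_pos=1):
    """True if every A-then-B pair on one object names the same other object.

    obj_pos is the argument index of the object of sort S1 in both actions,
    other_pos the argument index of the candidate associated sort S2.
    Returns False if no A-then-B pair is observed at all.
    """
    last = {}          # object -> most recent step that mentioned it
    observed = False
    for step in plan:
        name, args = step
        if name == b:
            x = args[obj_pos]
            prev = last.get(x)
            if prev is not None and prev[0] == a and prev[1][obj_pos] == x:
                observed = True
                if prev[1][other_pos] != args[other_pos]:
                    return False  # one counterexample defeats the hypothesis
        for obj in args:
            last[obj] = step
    return observed

print(association_holds(TRACE, "putawayjack", "fetchjack"))

# Hypothetical counterexample: the jack comes out of a different boot.
BAD = [("putawayjack", ("j1", "c1")), ("fetchjack", ("j1", "c2"))]
print(association_holds(BAD, "putawayjack", "fetchjack"))
```

The first call succeeds because both jacks are fetched from the boot they were put away in; the second fails because a single mismatched pair is enough to reject the association.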

LOCM – Parameterised FSM => PDDL

(:action open_container
  :parameters (?Boot1 - boot)
  :precondition (and (zero_state0)
                     (boot_state1 ?Boot1))
  :effect (and (boot_state0 ?Boot1)
               (not (boot_state1 ?Boot1))))

(:action fetch_jack
  :parameters (?Jack1 - jack ?Boot2 - boot)
  :precondition (and (zero_state0)
                     (jack_state4 ?Jack1 ?Boot2)
                     (boot_state0 ?Boot2))
  :effect (and (jack_state3 ?Jack1)
               (not (jack_state4 ?Jack1 ?Boot2))))

INDUCED STATES OF EVERY SORT – BOOT, JACK, WHEEL, HUB, NUTS, WRENCH …
[Figure: state machine for the boot sort with two states, boot0 (open) and boot1 (closed), and transitions open.1, close.1, fetch_jack.2 and fetch_wrench.2]
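How a parameterised transition might be turned into an action schema can be sketched as a small text renderer. This is a hedged illustration only: `render_action` is a hypothetical helper, the state atoms are supplied by hand rather than induced, and the `(zero_state0)` precondition shown on the slide is left out for brevity.

```python
def render_action(name, params, transitions):
    """Render a PDDL action from per-object state transitions.

    params:      list of (variable, sort) pairs
    transitions: list of (start_atom, end_atom) pairs, one per parameter
    """
    pre = [start for start, _ in transitions]
    eff = []
    for start, end in transitions:
        if start != end:  # an object that stays in its state needs no effect
            eff += [end, f"(not {start})"]
    plist = " ".join(f"{var} - {sort}" for var, sort in params)
    return (
        f"(:action {name}\n"
        f"  :parameters ({plist})\n"
        f"  :precondition (and {' '.join(pre)})\n"
        f"  :effect (and {' '.join(eff)}))"
    )

print(render_action(
    "fetch_jack",
    [("?Jack1", "jack"), ("?Boot2", "boot")],
    [
        ("(jack_state4 ?Jack1 ?Boot2)", "(jack_state3 ?Jack1)"),
        ("(boot_state0 ?Boot2)", "(boot_state0 ?Boot2)"),  # boot stays open
    ],
))
```

Each start state becomes a precondition, and only the objects whose state changes contribute an add effect and a delete effect, which matches the shape of the fetch_jack schema above.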

Inducing Domain Models - Game Example

Trace of a FreeCell game:
  Send to home ..
  Move to free cell ....
  Move to free cell ....
  Move to column ....
  Send to home ...

Induction: meaning of the “Send to home” action in planning-ready format:

(:action sendtohome
  :parameters (?card - card ?suit - suitsort ?vcard - denomination
               ?homecard - card ?vhomecard ?cols ?ncols - denomination)
  :precondition (and (clear ?card)
                     (bottomcol ?card)
                     (home ?homecard)
                     (suit ?card ?suit)
                     (suit ?homecard ?suit)
                     (value ?card ?vcard)
                     (value ?homecard ?vhomecard)
                     (successor ?vcard ?vhomecard)
                     (colspace ?cols)
                     (successor ?ncols ?cols))
  :effect (and (home ?card)
               (colspace ?ncols)
               (not (home ?homecard))
               (not (clear ?card))
               (not (bottomcol ?card))
               (not (colspace ?cols))))

Problems/Future Work:
• When is the induction finished? How large do the training sequences need to be?
• LOCM can’t (yet) induce static knowledge
• Need to find “naturally occurring” sources of planning traces
• What are the theoretical limits to the expressiveness of the induced language?

Conclusion
• KE is hard and inflexible, so techniques that can learn knowledge are important
• Learning mechanisms like LOCM can exploit
  - regularity in training examples
  - physical constraints
  - inductive generalisations about associations between objects
  - assumptions about the form of actions
  - assumptions about the state change behaviour of groups of objects
  in order to learn structures such as PDDL
