11/5

11/5: Bayes Nets project due; Prolog project assigned. Today: FOPC Resolution Theorem Proving; Situation Calculus, leading to planning

Your Project 4!

Efficiency can be improved by re-ordering subgoals adaptively, e.g., try to prove Pet before Small on Lilliput Island, and Small before Pet in a pet store.

Similar to “Integer Programming” or “Constraint Programming”

Generate compilable matchers for each pattern, and use them

Example of FOPC Resolution..
Everyone is loved by someone
If x loves y, x will give a valentine card to y
Will anyone give Rao a valentine card?
[Proof figure: resolution steps labeled with unifiers y/z; x/Rao, giving ~loves(z,Rao), then z/SK(rao); x’/rao]

Finding where you left your key..
1. Atkey(Home) V Atkey(Office)
Where is the key? Exists x Atkey(x)
Negate: Forall x ~Atkey(x)
CNF: 2. ~Atkey(x)
Resolve 2 and 1 with x/Home: you get 3. Atkey(Office)
Resolve 3 and 2 with x/Office: you get the empty clause
So resolution refutation “found” that there does exist a place where the key is…
Where is it? What is x bound to? x is bound to Office once and Home once, so x is either Home or Office.
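The key refutation can be replayed mechanically. What follows is a minimal sketch, not a full first-order prover: it assumes the clause ~Atkey(x) is grounded by hand over the two constants Home and Office, and then runs plain propositional resolution. The names `resolve`, `refute` and `negated_goal` are illustrative only.

```python
from itertools import combinations

def resolve(c1, c2):
    """Return every resolvent of two ground clauses.

    A clause is a frozenset of (sign, atom) literals,
    e.g. (True, "Atkey(Home)") or (False, "Atkey(Home)").
    """
    resolvents = []
    for sign, atom in c1:
        if (not sign, atom) in c2:
            rest = (c1 - {(sign, atom)}) | (c2 - {(not sign, atom)})
            resolvents.append(frozenset(rest))
    return resolvents

def refute(clauses):
    """Saturate under resolution; return True iff the empty clause appears."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolve(c1, c2):
                if not r:              # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= known:               # nothing new can be derived
            return False
        known |= new

# Clause 1: Atkey(Home) V Atkey(Office)
kb = frozenset({(True, "Atkey(Home)"), (True, "Atkey(Office)")})

def negated_goal(place):
    """Ground instance of clause 2, ~Atkey(x), with x bound to `place`."""
    return frozenset({(False, f"Atkey({place})")})

both = refute([kb, negated_goal("Home"), negated_goal("Office")])
home_only = refute([kb, negated_goal("Home")])
print(both, home_only)
```

The refutation only goes through when both instances of ~Atkey(x) are available, which is the mechanical counterpart of "x is bound to Office once and Home once".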

Existential proofs..
• The previous example shows that resolution refutation is powerful enough to model existential proofs. In contrast, generalized modus ponens is only able to model constructive proofs.
• (We also discussed a cute example of an existential proof: is it possible for an irrational number raised to the power of another irrational number to be rational? We proved it is possible, without actually giving an example.)

Existential proofs..
• Are there irrational numbers p and q such that p^q is rational?
• This and the previous examples show that resolution refutation is powerful enough to model existential proofs. In contrast, generalized modus ponens is only able to model constructive proofs.
[Figure: case split with branches labeled Rational and Irrational]
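The proof itself is not written out in these notes. The standard non-constructive argument, assumed here to be the one meant, splits on whether one particular number is rational, using only the known fact that the square root of 2 is irrational:

```latex
Let $a = \sqrt{2}^{\sqrt{2}}$.

\textbf{Case 1: $a$ is rational.}
Take $p = q = \sqrt{2}$. Both are irrational and $p^q = a$ is rational.

\textbf{Case 2: $a$ is irrational.}
Take $p = a$ and $q = \sqrt{2}$. Both are irrational and
\[
  p^q = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}
      = \sqrt{2}^{\,\sqrt{2}\cdot\sqrt{2}}
      = \sqrt{2}^{\,2}
      = 2 ,
\]
which is rational.

Either way a suitable pair $(p, q)$ exists, although the argument never
says which case actually holds.
```

As with the key example, the proof establishes existence through a disjunction of candidate witnesses rather than a single binding.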

GMP vs. Resolution Refutation
• While resolution refutation is a complete inference procedure for FOPC, it is only semi-decidable, which is a far cry from the polynomial cost of GMP inference.
• So, most common uses of FOPC involve doing GMP-style reasoning rather than full theorem proving.
• There is a controversy in the community as to whether the right way to handle the computational complexity is to
  a. Develop “tractable subclasses” of languages and require the expert to write all their knowledge in the procrustean beds of those subclasses (so we can claim “complete and tractable inference” for that class), OR
  b. Let users write their knowledge in the fully expressive FOPC, but just do incomplete (but sound) inference.
• See Doyle & Patil’s “Two Theses of Knowledge Representation”

11/7: Homework 4 due 11/14. Make-up class 11/9, same time, same room (+ Raspberry Bars). Final exam is scheduled for 12/10, 12:20-2:10pm.

Situational Calculus: Time & Change in FOPC
• SitCalc is a special class of FOPC with
  • Special terms called “situations”
    • Situations can be thought of as referring to snapshots of the universe at various times
  • Special terms called “actions”
    • Putdown(A); Stack(B,x) etc. (A, B constants)
  • Special function called Result which returns a situation
    • Result(action-term, situation-term)
    • Result(Putdown(A),S)
• World properties can be modeled as predicates (with an extra situational argument)
  • Clear(B,S0)
• Actions are modeled in terms of what needs to be true in the situation where the action takes place, and what will be true in the situation that results
• You can also have intra-situation axioms

..yes, BUT
Consider the previous problem, except you now have another block B which is already on the table and is clear. Your goal is to get A onto the table while leaving B clear.
Sounds like a no-brainer, right?
..but the theorem prover won’t budge
--It has no axiom telling it that B will remain clear in the situation Result(Putdown(A),S0)
Big deal.. We will throw in an axiom saying that Clear(x) continues to hold in the situation after Putdown(A)
But WAIT. We are now writing axioms about properties that DO NOT CHANGE
--There may be too many axioms like this
  If there are K properties and M actions, we need K*M frame axioms
--…AND we have to resolve against them
  Increasing the depth of the proof (and thus exponentially increasing the complexity..)
There are ways to reduce the number of frame axioms from K*M to just K (write, for each property P, the only conditions under which it changes truth value between situations)
--Called Successor State Axioms
But we still have to explicitly prove to ourselves that everything that has not changed has actually not changed
..unless we make additional assumptions
--E.g. STRIPS assumption… If a property has not been mentioned in an action’s effects, it is assumed that it remains the same
..So, is Planning = Theorem Proving?
Sphexishness

Sphexishness
One kind of determinism, genetic fixity, is illustrated powerfully by the example of the digger wasp, Sphex ichneumoneus: When the time comes for egg laying, the wasp Sphex builds a burrow for the purpose and seeks out a cricket which she stings in such a way as to paralyze but not kill it. She drags the cricket into the burrow, lays her eggs alongside, closes the burrow, then flies away, never to return. In due course, the eggs hatch and the wasp grubs feed off the paralyzed cricket, which has not decayed, having been kept in the wasp equivalent of deep freeze. To the human mind, such an elaborately organized and seemingly purposeful routine conveys a convincing flavor of logic and thoughtfulness--until more details are examined. For example, the wasp's routine is to bring the paralyzed cricket to the burrow, leave it on the threshold, go inside to see that all is well, emerge, and then drag the cricket in. If the cricket is moved a few inches away while the wasp is inside making her preliminary inspection, the wasp, on emerging from the burrow, will bring the cricket back to the threshold, but not inside, and will then repeat the preparatory procedure of entering the burrow to see that everything is all right. If again the cricket is removed a few inches while the wasp is inside, once again she will move the cricket up to the threshold and re-enter the burrow for a final check. The wasp never thinks of pulling the cricket straight in. On one occasion this procedure was repeated forty times, always with the same result. (Wooldridge, 1963, p. 82)

Blocks world
State variables: Ontable(x), On(x,y), Clear(x), hand-empty, holding(x)
Initial state: Complete specification of T/F values to state variables
--By convention, variables with F values are omitted
Goal state: A partial specification of the desired state variable/value combinations
--desired values can be both positive and negative
Init: Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty
Goal: ~clear(B), hand-empty
STRIPS ASSUMPTION: If an action changes a state variable, this must be explicitly mentioned in its effects
Pickup(x)
  Prec: hand-empty, clear(x), ontable(x)
  Eff: holding(x), ~ontable(x), ~hand-empty, ~clear(x)
Putdown(x)
  Prec: holding(x)
  Eff: ontable(x), hand-empty, clear(x), ~holding(x)
Unstack(x,y)
  Prec: on(x,y), hand-empty, clear(x)
  Eff: holding(x), ~clear(x), clear(y), ~hand-empty, ~on(x,y)
Stack(x,y)
  Prec: holding(x), clear(y)
  Eff: on(x,y), ~clear(y), ~holding(x), hand-empty, clear(x)
All the actions here have only positive preconditions; but this is not necessary
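The four operator schemas can be turned into ground actions by substituting block names for x and y. The sketch below is one possible encoding, not a prescribed one: the dictionary layout and the helper names (`action`, `ground_actions`) are assumptions. Unstack here deletes on(x,y) and Stack adds clear(x) so that states stay consistent; if your action table lists fewer effects, treat those two as assumptions of the sketch.

```python
from itertools import permutations

def action(name, prec, add, delete):
    """Build a ground STRIPS action; effects map variable -> True/False."""
    eff = {v: True for v in add}
    eff.update({v: False for v in delete})
    return {"name": name, "prec": {v: True for v in prec}, "eff": eff}

def pickup(x):
    return action(f"Pickup({x})",
                  ["hand-empty", f"clear({x})", f"ontable({x})"],
                  [f"holding({x})"],
                  [f"ontable({x})", "hand-empty", f"clear({x})"])

def putdown(x):
    return action(f"Putdown({x})",
                  [f"holding({x})"],
                  [f"ontable({x})", "hand-empty", f"clear({x})"],
                  [f"holding({x})"])

def unstack(x, y):
    # Assumption: ~on(x,y) is listed so the old On fact does not persist.
    return action(f"Unstack({x},{y})",
                  [f"on({x},{y})", "hand-empty", f"clear({x})"],
                  [f"holding({x})", f"clear({y})"],
                  [f"clear({x})", "hand-empty", f"on({x},{y})"])

def stack(x, y):
    # Assumption: clear(x) is listed so the stacked block is clear again.
    return action(f"Stack({x},{y})",
                  [f"holding({x})", f"clear({y})"],
                  [f"on({x},{y})", "hand-empty", f"clear({x})"],
                  [f"clear({y})", f"holding({x})"])

def ground_actions(blocks):
    """Instantiate every schema with every choice of distinct blocks."""
    acts = [schema(x) for x in blocks for schema in (pickup, putdown)]
    acts += [schema(x, y) for x, y in permutations(blocks, 2)
             for schema in (unstack, stack)]
    return acts

ACTIONS = ground_actions(["A", "B"])
NAMES = [a["name"] for a in ACTIONS]
print(NAMES)
```

With two blocks this yields eight ground actions; the count grows quadratically with the number of blocks because of the two-argument schemas.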

What do we lose with STRIPS actions?
Need to write all effects explicitly
--Can’t depend on derived effects
--Leads to loss of modularity
  Instead of saying “Clear” holds when nothing is “On” the block, we have to write Clear effects everywhere
  If now the blocks become bigger and can hold two other blocks, you will have to rewrite all the action descriptions
Then again, the state-variable (STRIPS) model is a step up from the even more low-level “State Transition model”
--Where actions are just mappings from States to States (and so must be seen as S x S matrices)
Very loose analogy:
  State-transition models <-> Assembly lang
  (factored) state-variable models <-> C
  (first-order) sit-calc models <-> Lisp

Progression:
An action A can be applied to state S iff the preconditions are satisfied in the current state
The resulting state S’ is computed as follows:
--every variable that occurs in the action’s effects gets the value that the action said it should have
--every other variable gets the value it had in the state S where the action is applied
STRIPS ASSUMPTION: If an action changes a state variable, this must be explicitly mentioned in its effects
Example:
Init: Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty
--Pickup(A) gives: holding(A), ~Clear(A), ~Ontable(A), Ontable(B), Clear(B), ~hand-empty
--Pickup(B) gives: holding(B), ~Clear(B), ~Ontable(B), Ontable(A), Clear(A), ~hand-empty
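The progression rule is short enough to write down directly. The sketch below assumes states and effects are dictionaries from variable names to True/False, with absent variables read as False (the convention that F-valued variables are omitted); `applicable`, `progress` and `pickup` are illustrative names.

```python
def applicable(state, action):
    """True iff every precondition has the value it has in `state`.

    Variables missing from `state` are read as False.
    """
    return all(state.get(v, False) == val for v, val in action["prec"].items())

def progress(state, action):
    """Return the successor state, or raise if the action is not applicable."""
    if not applicable(state, action):
        raise ValueError(f"{action['name']} is not applicable")
    successor = dict(state)            # every other variable keeps its value
    successor.update(action["eff"])    # effect variables take the action's value
    return successor

def pickup(x):
    """Ground instance of the Pickup(x) schema."""
    return {
        "name": f"Pickup({x})",
        "prec": {"hand-empty": True, f"clear({x})": True, f"ontable({x})": True},
        "eff": {f"holding({x})": True, f"ontable({x})": False,
                "hand-empty": False, f"clear({x})": False},
    }

INIT = {"ontable(A)": True, "ontable(B)": True,
        "clear(A)": True, "clear(B)": True, "hand-empty": True}

after_a = progress(INIT, pickup("A"))
print(after_a)
```

After Pickup(A) the hand is no longer empty, so Pickup(B) stops being applicable; `applicable(after_a, pickup("B"))` reflects that.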

Generic (progression) planner
• Goal test(S,G): check if every state variable in S that is mentioned in G has the value that G gives it.
• Child generator(S,A)
  • For each action a in A do
    • If every variable mentioned in Prec(a) has the same value in Prec(a) and S
    • Then return Progress(S,a) as one of the children of S
  • Progress(S,a) is a state S’ where each state variable v has value v[Eff(a)] if it is mentioned in Eff(a) and has the value v[S] otherwise
• Search starts from the initial state
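Putting the goal test and child generator behind an ordinary search gives a complete, if naive, planner. The sketch below uses breadth-first search as one possible choice of search algorithm; the function names, the dictionary encoding, and the two-block action set are assumptions made for illustration. Unstack deletes on(x,y) and Stack adds clear(x) here, which is an assumption needed to keep states consistent.

```python
from collections import deque

def act(name, prec, add, delete):
    """Ground STRIPS action with positive preconditions."""
    eff = {v: True for v in add}
    eff.update({v: False for v in delete})
    return {"name": name, "prec": {v: True for v in prec}, "eff": eff}

def holds(state, assignment):
    """True iff every variable in `assignment` has that value in `state`."""
    return all(state.get(v, False) == val for v, val in assignment.items())

def goal_test(state, goal):
    return holds(state, goal)

def children(state, actions):
    """Child generator: yield (action name, Progress(state, action))."""
    for a in actions:
        if holds(state, a["prec"]):
            child = dict(state)
            child.update(a["eff"])
            yield a["name"], child

def key(state):
    """Canonical form of a complete state: the set of true variables."""
    return frozenset(v for v, val in state.items() if val)

def plan(init, goal, actions):
    """Breadth-first progression search; returns a shortest action sequence."""
    frontier = deque([(dict(init), [])])
    seen = {key(init)}
    while frontier:
        state, steps = frontier.popleft()
        if goal_test(state, goal):
            return steps
        for name, child in children(state, actions):
            if key(child) not in seen:
                seen.add(key(child))
                frontier.append((child, steps + [name]))
    return None

ACTIONS = []
for x, y in (("A", "B"), ("B", "A")):
    ACTIONS += [
        act(f"Pickup({x})", ["hand-empty", f"clear({x})", f"ontable({x})"],
            [f"holding({x})"], [f"ontable({x})", "hand-empty", f"clear({x})"]),
        act(f"Putdown({x})", [f"holding({x})"],
            [f"ontable({x})", "hand-empty", f"clear({x})"], [f"holding({x})"]),
        act(f"Stack({x},{y})", [f"holding({x})", f"clear({y})"],
            [f"on({x},{y})", "hand-empty", f"clear({x})"],
            [f"clear({y})", f"holding({x})"]),
        act(f"Unstack({x},{y})", [f"on({x},{y})", "hand-empty", f"clear({x})"],
            [f"holding({x})", f"clear({y})"],
            [f"clear({x})", "hand-empty", f"on({x},{y})"]),
    ]

INIT = {"ontable(A)": True, "ontable(B)": True,
        "clear(A)": True, "clear(B)": True, "hand-empty": True}
GOAL = {"clear(B)": False, "hand-empty": True}

PLAN = plan(INIT, GOAL, ACTIONS)
print(PLAN)
```

Only `goal_test` and `children` know about state variables; swapping the queue for a priority queue would give A* without touching either of them.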

State Variable Models
• World is made up of states which are defined in terms of state variables
  • Can be boolean (or multi-ary or continuous)
• States are complete assignments over state variables
  • So, k boolean state variables can represent how many states?
• Actions change the values of the state variables
  • Applicability conditions of actions are also specified in terms of partial assignments over state variables

Planning vs. Search: What is the difference?
• Search assumes that there are child-generator and goal-test functions which know how to make sense of the states and generate new states
• Planning makes the additional assumption that the states can be represented in terms of state variables and their values
  • Initial and goal states are specified in terms of assignments over state variables
    • Which means goal-test doesn’t have to be a blackbox procedure
  • That the actions modify these state variable values
    • The preconditions and effects of the actions are in terms of partial assignments over state variables
• Given these assumptions, certain generic goal-test and child-generator functions can be written
  • Specifically, we discussed one child-generator called “Progression”, another called “Regression” and a third called “Partial-order”
• Notice that the additional assumptions made by planning do not change the search algorithms (A*, IDDFS etc.); they only change the child-generator and goal-test functions
  • In particular, search still happens in terms of search nodes that have parent pointers etc.
  • The “state” part of the search node will correspond to
    • “Complete state variable assignments” in the case of progression
    • “Partial state variable assignments” in the case of regression
    • “A collection of steps, orderings, causal commitments and open-conditions” in the case of partial order planning

Regression:
A state S can be regressed over an action A (or A is applied in the backward direction to S) iff:
--There is no variable v such that v is given different values by the effects of A and the state S
--There is at least one variable v’ such that v’ is given the same value by the effects of A as well as state S
The resulting state S’ is computed as follows:
--every variable that occurs in S, and does not occur in the effects of A, will be copied over to S’ with its value as in S
--every variable that occurs in the precondition list of A will be copied over to S’ with the value it has in the precondition list
Termination test: Stop when the state S’ is entailed by the initial state sI
*Same entailment dir as before..
Example:
Goal: ~clear(B), hand-empty
--Regress over Putdown(A): ~clear(B), holding(A)
--Regress over Stack(A,B): holding(A), clear(B)
--Putdown(B)??
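The regression rule can be sketched the same way as progression, with partial states as dictionaries that mention only some variables. The names `regress` and `entailed_by` and the dictionary encoding are assumptions; the final check against conflicting preconditions is an extra safeguard that the rule as stated does not mention. Stack is given a clear(x) effect here, which is also an assumption.

```python
def act(name, prec, add, delete):
    """Ground STRIPS action with positive preconditions."""
    eff = {v: True for v in add}
    eff.update({v: False for v in delete})
    return {"name": name, "prec": {v: True for v in prec}, "eff": eff}

def regress(state, action):
    """Regress partial state `state` over `action`; None if not regressable."""
    eff, prec = action["eff"], action["prec"]
    # Consistency: no variable gets different values from the effects and S.
    if any(v in eff and eff[v] != val for v, val in state.items()):
        return None
    # Relevance: at least one variable gets the same value from both.
    if not any(v in eff and eff[v] == val for v, val in state.items()):
        return None
    # Copy over what the action does not touch, then add the preconditions.
    result = {v: val for v, val in state.items() if v not in eff}
    for v, val in prec.items():
        if v in result and result[v] != val:   # extra safeguard (assumption)
            return None
        result[v] = val
    return result

def entailed_by(partial, init):
    """Termination test: does the complete state `init` satisfy `partial`?"""
    return all(init.get(v, False) == val for v, val in partial.items())

PICKUP_A = act("Pickup(A)", ["hand-empty", "clear(A)", "ontable(A)"],
               ["holding(A)"], ["ontable(A)", "hand-empty", "clear(A)"])
PUTDOWN_A = act("Putdown(A)", ["holding(A)"],
                ["ontable(A)", "hand-empty", "clear(A)"], ["holding(A)"])
PUTDOWN_B = act("Putdown(B)", ["holding(B)"],
                ["ontable(B)", "hand-empty", "clear(B)"], ["holding(B)"])
STACK_AB = act("Stack(A,B)", ["holding(A)", "clear(B)"],
               ["on(A,B)", "hand-empty", "clear(A)"], ["clear(B)", "holding(A)"])

INIT = {"ontable(A)": True, "ontable(B)": True,
        "clear(A)": True, "clear(B)": True, "hand-empty": True}
GOAL = {"clear(B)": False, "hand-empty": True}

via_stack = regress(GOAL, STACK_AB)
via_putdown_a = regress(GOAL, PUTDOWN_A)
via_putdown_b = regress(GOAL, PUTDOWN_B)
two_back = regress(via_stack, PICKUP_A)
print(via_stack, via_putdown_a, via_putdown_b, two_back)
```

Putdown(B) is rejected because its clear(B) effect contradicts ~clear(B) in the goal, and regressing twice along Stack(A,B) then Pickup(A) produces a partial state that the initial state entails, so the search can stop there.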

On the asymmetry of init/goal states
• Goal state is partial
  • It is a (seemingly) good thing
    • if only m of the k state variables are mentioned in a goal specification, then up to 2^(k-m) complete states of the world can satisfy our goals!
  • ..I say “seeming” because sometimes a more complete goal state may provide hints to the agent as to what the plan should be
    • In the blocks world example, if we also state On(A,B) as part of the goal (in addition to ~Clear(B) & hand-empty) then it would be quite easy to see what the plan should be..
• Initial state is complete
  • If the initial state is partial, then we have “partial observability” (i.e., the agent doesn’t know where it is!)
    • If only m of the k state variables are known, then the agent is in one of 2^(k-m) states!
    • In such cases, the agent needs a plan that will take it from any of these states to a goal state
      • Either this could be a single sequence of actions that works in all states (e.g. bomb in the toilet problem)
      • Or this could be a “conditional plan” that does some limited sensing and based on that decides what action to do
    • ..More on all this during the third class
• Because of the asymmetry between init and goal states, progression is in the space of complete states, while regression is in the space of “partial” states (sets of states). Specifically, for k state variables, there are 2^k complete states and 3^k “partial” states
  • (a state variable may be present positively, present negatively or not present at all in the goal specification!)
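The three counts on this slide can be checked by brute-force enumeration for a small k. The sketch below uses three made-up boolean variables p, q, r (an assumption; any names would do) and counts complete states, partial states, and the complete states that satisfy a goal mentioning m = 1 variable.

```python
from itertools import product

def complete_states(variables):
    """All total True/False assignments: 2^k of them."""
    return [dict(zip(variables, vals))
            for vals in product([True, False], repeat=len(variables))]

def partial_states(variables):
    """Each variable is positive, negative, or absent: 3^k of them."""
    states = []
    for vals in product([True, False, None], repeat=len(variables)):
        states.append({v: val for v, val in zip(variables, vals)
                       if val is not None})
    return states

def satisfies(state, partial):
    """Does the complete state agree with every variable `partial` mentions?"""
    return all(state[v] == val for v, val in partial.items())

VARS = ["p", "q", "r"]                     # k = 3
GOAL = {"p": True}                         # m = 1 variable mentioned

n_complete = len(complete_states(VARS))
n_partial = len(partial_states(VARS))
n_goal = sum(satisfies(s, GOAL) for s in complete_states(VARS))
print(n_complete, n_partial, n_goal)
```

For k = 3 and m = 1 this gives 8 complete states, 27 partial states, and 4 = 2^(3-1) complete states that satisfy the goal.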

Regression vs. Reversibility
• Notice that regression doesn’t require that the actions are reversible in the real world
  • We only think of actions in the reverse direction during simulation
  • …just as we think of them in terms of their individual effects during partial order planning
• Normal blocks world is reversible (if you don’t like the effects of Stack(A,B), you can do Unstack(A,B)). However, if the blocks world has a “bomb the table” action, then normally there won’t be a way to reverse the effects of that action.
  • But even with that action we can do regression
  • For example, we can reason that the best way to make the table go away is to add the “Bomb” action into the plan as the last action
    • ..although it might also make you go away

Progression vs. Regression: The never ending war.. Part 1
Progression has higher branching factor
Progression searches in the space of complete (and consistent) states
Regression has lower branching factor
Regression searches in the space of partial states
--There are 3^n partial states (as against 2^n complete states)
You can also do bidirectional search
--Stop when a (leaf) state in the progression tree entails a (leaf) state (formula) in the regression tree

Relevance, Reachability & Heuristics
Progression takes “applicability” of actions into account
--Specifically, it guarantees that every state in its search queue is reachable
--..but has no idea whether the states are relevant (constitute progress towards top-level goals)
SO, heuristics for progression need to help it estimate the “relevance” of the states in the search queue
Regression takes “relevance” of actions into account
--Specifically, it makes sure that every state in its search queue is relevant
--..but has no idea whether the states (more accurately, state sets) in its search queue are reachable
SO, heuristics for regression need to help it estimate the “reachability” of the states in the search queue
Reachability: Given a problem [I,G], a (partial) state S is called reachable if there is a sequence [a1,a2,…,ak] of actions which when executed from state I will lead to a state where S holds
Relevance: Given a problem [I,G], a state S is called relevant if there is a sequence [a1,a2,…,ak] of actions which when executed from S will lead to a state satisfying G
(Relevance is Reachability from goal state)
Since relevance is nothing but reachability from goal state, reachability analysis can form the basis for good heuristics

Subgoal interactions
Suppose we have a set of subgoals G_1, …, G_n
Suppose the lengths of the shortest plans for achieving the subgoals in isolation are l_1, …, l_n
We want to know the length of the shortest plan for achieving the n subgoals together, l_{1..n}
If subgoals are independent: l_{1..n} = l_1 + l_2 + … + l_n
If subgoals have +ve interactions alone: l_{1..n} < l_1 + l_2 + … + l_n
If subgoals have -ve interactions alone: l_{1..n} > l_1 + l_2 + … + l_n
If you made the “independence” assumption, and added up the individual costs of subgoals, then your resultant heuristic will be
--perfect if the goals are actually independent
--inadmissible (over-estimating) if the goals have +ve interactions
--un-informed (hugely under-estimating) if the goals have -ve interactions

Scalability of Planning
• Before, planning algorithms could synthesize about 6-10 action plans in minutes
• Significant scale-up in the last 6-7 years
• Now, we can synthesize 100 action plans in seconds.
Realistic encodings of Munich airport!
Problem is Search Control!!!
The primary revolution in planning in the recent years has been domain-independent heuristics to scale up plan synthesis
…and now for a ring-side retrospective

Planning Graph Basics
• Envelope of Progression Tree (Relaxed Progression)
  • Linear vs. Exponential Growth
• Reachable states correspond to subsets of proposition lists
  • BUT not all subsets are states
• Can be used for estimating non-reachability
  • If a state S is not a subset of the kth level prop list, then it is definitely not reachable in k steps
[Figure: progression tree rooted at state p under actions A1-A4, beside the planning graph with proposition levels p; p q r s; p q r s t and action levels A1 A2 A3; A1 A2 A3 A4]
[ECP, 1997]
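The proposition levels of a planning graph can be grown by relaxed progression, where actions only ever add propositions. The sketch below is a simplification that ignores delete effects and mutual-exclusion reasoning; the four actions are hypothetical ones chosen so that the levels match the proposition lists p; p q r s; p q r s t, and all function names are illustrative.

```python
def proposition_levels(init, actions, max_levels=10):
    """Grow proposition levels until they stop changing (or max_levels)."""
    levels = [frozenset(init)]
    for _ in range(max_levels):
        current = levels[-1]
        nxt = set(current)                 # propositions persist (relaxation)
        for prec, adds in actions.values():
            if prec <= current:            # action applicable at this level
                nxt |= adds
        if nxt == current:                 # graph has leveled off
            break
        levels.append(frozenset(nxt))
    return levels

def possibly_reachable(props, levels, k):
    """Necessary (not sufficient) test for reaching `props` within k steps."""
    k = min(k, len(levels) - 1)            # later levels equal the last one
    return set(props) <= levels[k]

# Hypothetical actions: name -> (preconditions, add effects).
ACTIONS = {
    "A1": ({"p"}, {"q"}),
    "A2": ({"p"}, {"r"}),
    "A3": ({"p"}, {"s"}),
    "A4": ({"s"}, {"t"}),
}

LEVELS = proposition_levels({"p"}, ACTIONS)
print([sorted(level) for level in LEVELS])
```

Each level is a single set, so the graph grows linearly where the progression tree grows exponentially. A state containing t fails the subset test at level 1, so it is definitely not reachable in one step; passing the test at level 2 only means reachability has not been ruled out.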
