Answer Set Programming: A new Paradigm for Knowledge Representation and Constraint Programming

Russell and Norvig 10.7. Lecture Notes for Cmput 366. (Some slides, especially those with pictures, are taken from C. Baral’s talk at AAAI’05.)

Intelligent Agent • Can acquire knowledge through various means such as learning from experience, observations, reading, etc., and • Can reason with this knowledge to make plans, explain observations, achieve goals, etc.

To learn knowledge and to reason with it • we need to know how to represent knowledge in a computer readable format. • McCarthy 1959 in Programs with commonsense: “In order for a program to be capable of learning something it must first be capable of being told it.”

Importance of KR • KR is the starting point of building intelligent entities (or AI systems), and leads to the next steps of acquiring knowledge and reasoning with knowledge.

What does KR entail? • We need languages and corresponding methodologies to represent various kinds of knowledge.

Importance of inventing suitable KR languages Development of a suitable knowledge representation language and methodology is as important to AI systems as Calculus is to Physics and Engineering.

Historical perspective • AI pioneers (especially McCarthy and Minsky) realized the importance of KR to AI. • McCarthy 1959: Programs with commonsense (perhaps the first paper on logical AI). • Minsky 1974: A framework for representing knowledge. (Pictured: John McCarthy, Marvin Minsky.)

What are the properties of a good KR language? • To start with: it should be non-monotonic • i.e., allow revision of conclusions in the presence of new knowledge. • Hayes 1973 (Computation and Deduction) mentions monotonicity (calls it “extension property”) and notes that rules of default do not satisfy it. • Minsky 1974 (A framework for representing knowledge) criticizes monotonicity of logistic systems. (Pictured: Pat Hayes, Marvin Minsky.)

Inadequacy of first order logic • It is monotonic: the more information one has, the more consequences one gets. • Human communication is typically based on the closed world assumption.

An Example of Closed World Assumption ground-wet ← watering. ground-wet ← raining. • In an open world, there could be other causes of ground-wet (we simply don’t know them, or have not stated them). • But in a closed world, what we said is all that we know. For Horn clauses, this is captured by the Clark completion: ground-wet ↔ watering ∨ raining

Problem with Clark Completion a ← a. When completed, it becomes a ↔ a. Two models: {a} and { }. The first model doesn’t seem to make sense (how can we have a?). The desired model should be { }: since there is no way to establish a, a is (believed to be) false.

Transitive Closure – graph reachability • reach(a). • reach(X) ← reach(Y), edge(Y,X). • Graph: edge(a,b). edge(c,d). edge(d,c). • a and b are reachable, but c and d are not. Can we infer these?

Reachability – Clark completion (cf. page 355 of R&N) • ∀X∀Y edge(X,Y) ↔ ((X = a ∧ Y = b) ∨ (X = c ∧ Y = d) ∨ (X = d ∧ Y = c)) • ∀X reach(X) ↔ (X = a ∨ ∃Y (reach(Y) ∧ edge(Y,X))) • Equality axioms. • {edge(a,b), edge(c,d), edge(d,c), reach(a), reach(b)} is a model. • But so is {edge(a,b), edge(c,d), edge(d,c), reach(a), reach(b), reach(c), reach(d)}. • Hence one cannot conclude ~reach(c), ~reach(d). • Need to go beyond first order logic.
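The gap between the completion and the intended meaning can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function names least_reach and satisfies_completion are hypothetical helpers, and the graph is the three-edge example above. It computes the least fixpoint of the reach rules and then tests both candidate extensions of reach against the completed definition.

```python
# Facts from the example graph: a -> b, c -> d, d -> c.
edges = {("a", "b"), ("c", "d"), ("d", "c")}
vertices = {"a", "b", "c", "d"}

def least_reach(edges, start="a"):
    """Least fixpoint of: reach(start).  reach(X) <- reach(Y), edge(Y,X)."""
    reach = {start}
    changed = True
    while changed:
        changed = False
        for y, x in edges:
            if y in reach and x not in reach:
                reach.add(x)
                changed = True
    return reach

def satisfies_completion(reach, edges, vertices, start="a"):
    """Check reach(X) <-> (X = start or exists Y: reach(Y) and edge(Y,X))."""
    return all(
        (x in reach) == (x == start or any((y, x) in edges for y in reach))
        for x in vertices
    )

minimal = least_reach(edges)
print(sorted(minimal))
print(satisfies_completion(minimal, edges, vertices))
print(satisfies_completion(vertices, edges, vertices))
```

Both extensions pass the completion test, which is why the completion alone cannot rule out reach(c) and reach(d); only the least fixpoint singles out {a, b}.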

Pre-1980 history of non-monotonic logics –from Minker’s 93 survey • THNOT in PLANNER [Hewitt in 1969] • Prolog [Colmerauer et al. 1973] • Circumscription [McCarthy 1977] • Default Reasoning [Reiter 1978] • Closed World Assumption (CWA) [Reiter 1978] • Negation as failure [Clark 1978] • Truth maintenance systems [Doyle 1979] • AIJ Volume 13, 1980, a special issue

Circumscription Only consider minimal models for the circumscribed predicates. E.g. bird(X) ∧ ~ab(X) → flies(X). To circumscribe predicate ab, we assume ~ab(X) unless ab(X) is known to be true. Thus, lacking information about a bird being abnormal, we conclude it flies.

Circumscription bird(X) ∧ ~ab(X) → flies(X). bird(tweety). Models (after propositionalizing): M1 = {bird(tweety), ab(tweety), flies(tweety)} M2 = {bird(tweety), ab(tweety)} M3 = {bird(tweety), flies(tweety)} M3 is “smaller” than the others w.r.t. predicate ab. Thus, flies(tweety) follows from the given formulas under circumscription.
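The three models and the choice of M3 can be reproduced by brute force. The sketch below is an illustration in Python, not part of the original slides. It assumes the propositionalized atoms bird, ab and flies for tweety, and it assumes that bird and flies are allowed to vary freely while ab is minimized.

```python
from itertools import product

ATOMS = ("bird", "ab", "flies")

def is_model(v):
    """Truth of: (bird and not ab -> flies) and bird."""
    rule = (not (v["bird"] and not v["ab"])) or v["flies"]
    return rule and v["bird"]

# Enumerate all truth assignments and keep the models of the theory.
models = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=3)]
models = [v for v in models if is_model(v)]

# A model is minimal w.r.t. ab if no other model makes ab strictly "smaller"
# (False < True in Python).
minimal = [v for v in models if not any(w["ab"] < v["ab"] for w in models)]

# A formula follows under circumscription if it holds in every minimal model.
flies_follows = all(v["flies"] for v in minimal)
print(len(models), minimal, flies_follows)
```

With a single ground atom for ab, minimality reduces to preferring models where ab is false; with several individuals the comparison would be set inclusion on the extension of ab.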

Default Logic We write default rules. E.g. bird(X) : ~ab(X) / flies(X) Reads: if X is a bird, and it can be consistently assumed that it is not abnormal, then it flies.

Have we invented “calculus” of KR yet? • What basic properties should it have? • have a simple and intuitive syntax and semantics; • be non-monotonic; • allow us to represent and reason with incomplete information; and • allow us to express and answer problem solving queries such as planning queries, explanation queries and diagnostic queries.

Have we invented “calculus” of KR yet? - continued. • What properties will make it useful? • should have building block results; • should have interpreters for reasoning with the language; • should have existing applications; and • should have systems that can learn knowledge in this language.

Is ASP a good candidate? • An ASP program (late 1980s) is a collection of rules of the form: A0 or … or Al ← B1, …, Bm, not C1, …, not Cn. where the Ai’s, Bj’s and Ck’s are literals. (Pictured: Michael Gelfond, Jack Minker, Vladimir Lifschitz, Ray Reiter.)

Is ASP a good candidate? • Its syntax uses the intuitive If-then form. • It is non-monotonic. • Can express defaults and their exceptions. • Can represent and reason with incomplete information. • Can express and answer problem solving queries. • Large body of building block results. • Various implementations: Smodels, DLV, Prolog. • Many applications built using it. • Learning systems: Progol. • Its initial paper among the top 5 AI source documents in terms of citeseer citation.

How ASP differs from … • Prolog: ordering matters in Prolog; cannot handle cycles with “not”; has extra-logical features; does not have disjunction and classical negation; and is not declarative. • Logic Programming: is a class of languages, and many different semantics have been proposed for “not”. • Classical Logic: • Classical logic is monotonic. • ← in AnsProlog, which helps in expressing causality, is not reverse implication. • Disjunction symbol “or” in AnsProlog is non-classical. • The negation as failure symbol “not” in AnsProlog is non-classical.

Normal program A normal program in ASP is a collection of rules of the form: A ← B1, …, Bm, not C1, …, not Cn. where A, the Bj’s and the Ck’s are function-free atoms. If the body is empty, we write A ← . or simply A.

Semantics A function-free program can be grounded (called propositionalization in the textbook). p(X) ← q(X), not s(X). % Function-free p(X) ← q(f(X)), not s(X). % Not function-free

Semantics Suppose we have constants a, b, c in our program. The rule p(X) ← q(X), not s(X). is a compact representation of three ground rules: p(a) ← q(a), not s(a). p(b) ← q(b), not s(b). p(c) ← q(c), not s(c).
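Grounding a function-free rule amounts to substituting constants for variables in every possible way. The following Python sketch is an illustration only (it is not how Lparse or any real grounder is implemented): ground is a hypothetical helper, rules are plain strings with "<-" standing for the arrow, and variables are assumed to be identifiers starting with an uppercase letter.

```python
from itertools import product
import re

VARIABLE = r"\b[A-Z]\w*"  # assumption: variables start with an uppercase letter

def ground(rule, constants):
    """Return all instances of `rule` with variables replaced by constants."""
    variables = sorted(set(re.findall(VARIABLE, rule)))
    instances = []
    for values in product(constants, repeat=len(variables)):
        binding = dict(zip(variables, values))
        instances.append(re.sub(VARIABLE, lambda m: binding[m.group(0)], rule))
    return instances

for ground_rule in ground("p(X) <- q(X), not s(X).", ["a", "b", "c"]):
    print(ground_rule)
```

A rule with k distinct variables and n constants yields n^k ground rules, which is why grounding can dominate the cost of solving.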

Semantics Informally, a stable model M of a ground program P is a set of ground atoms such that • Every rule is satisfied, i.e., for any rule in P A ← B1, …, Bm, not C1, …, not Cn. if the Bj’s are satisfied (the Bj’s are in M) and the not Ck’s are also satisfied (not Ck is satisfied if Ck is not in M), then A is in M. • Every A ∈ M can be derived from a rule by non-circular reasoning.

Examples P1 = {a ← a.} M = {a} is not a stable model but M = {} is. P2 = {a ← not b.} {a} is the only stable model. P3 = {a ← not a.} It has no stable model.


Examples P4 = {a ← not b. b ← not a.} Two stable models: {a} and {b}. P5 = {a ← not b. b ← not a. a ← not a.} {a} is the only stable model.

Does tweety fly? • fly(X) ← bird(X), not ab(X). ab(X) ← penguin(X). bird(X) ← penguin(X). bird(tweety). • We conclude fly(tweety). • But if we add • penguin(tweety). • we can no longer conclude fly(tweety), • and we conclude ~fly(tweety) by virtue of CWA.
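The non-monotonic behaviour of the tweety program can be observed with a small guess-and-check procedure. The Python sketch below is an assumption-laden illustration, not an ASP system: it works on the program grounded for the single constant tweety (atoms abbreviated to fly, bird, ab, penguin), represents each rule as a hypothetical triple (head, positive body, negated body), and tries every candidate set of atoms.

```python
from itertools import combinations

def least_model(rules):
    """Least model of negation-free rules given as (head, positive_body)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in rules:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def stable_models(program):
    """Guess each set of atoms and keep those reproduced by their own reduct."""
    atoms = sorted({x for h, pos, neg in program for x in [h, *pos, *neg]})
    found = []
    for r in range(len(atoms) + 1):
        for cand in combinations(atoms, r):
            m = set(cand)
            reduct = [(h, pos) for h, pos, neg in program if not (set(neg) & m)]
            if least_model(reduct) == m:
                found.append(m)
    return found

base = [
    ("fly", ["bird"], ["ab"]),   # fly <- bird, not ab.
    ("ab", ["penguin"], []),     # ab <- penguin.
    ("bird", ["penguin"], []),   # bird <- penguin.
    ("bird", [], []),            # bird.
]
with_penguin = base + [("penguin", [], [])]  # penguin.

print(stable_models(base))
print(stable_models(with_penguin))
```

Adding the fact penguin removes fly from the single stable model, so a conclusion drawn earlier is withdrawn when new knowledge arrives.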

Constraints for disallowing … The head of a rule may be empty: ← B1, …, Bm, not C1, …, not Cn. It says no stable model may contain all of the Bj’s and none of the Ck’s.

Generate-and-constrain: first generate To specify both possibilities, that a is in a solution or not, we can use a dummy a’: a ← not a’. a’ ← not a. Two stable models: {a} and {a’}; the latter represents that a is not in the solution.

Generate-and-constrain: first generate To specify all subsets of {a,b,c}, we can write a ← not a’. b ← not b’. c ← not c’. a’ ← not a. b’ ← not b. c’ ← not c. Eight stable models, each corresponding to a subset, e.g. {a, b’, c’} represents that a is in it, but not b, nor c.

Generate-and-constrain: then constrain Any subset of {a,b,c} such that a and b cannot be together: a ← not a’. b ← not b’. c ← not c’. a’ ← not a. b’ ← not b. c’ ← not c. ← a, b. • What if we want to say “whenever a is in a stable model, so is b”?
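The generate-and-constrain pattern can be tried out with a brute-force checker. The Python sketch below is illustrative only: rules are hypothetical triples (head, positive body, negated body), a head of None encodes a constraint, and primed names such as a' are the dummy atoms. As one possible answer to the question above (an assumption of this sketch, not stated on the slide), the condition "whenever a is in, so is b" is written as the constraint "← a, not b."

```python
from itertools import combinations

def least_model(rules):
    """Least model of negation-free rules given as (head, positive_body)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in rules:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def stable_models(program):
    """Rules are (head, pos, neg); head None encodes a constraint."""
    atoms = sorted({x for h, pos, neg in program
                    for x in [*pos, *neg] + ([h] if h else [])})
    found = []
    for r in range(len(atoms) + 1):
        for cand in combinations(atoms, r):
            m = set(cand)
            violated = any(h is None and set(pos) <= m and not (set(neg) & m)
                           for h, pos, neg in program)
            reduct = [(h, pos) for h, pos, neg in program
                      if h is not None and not (set(neg) & m)]
            if not violated and least_model(reduct) == m:
                found.append(m)
    return found

def subsets_program(names):
    """x <- not x'.  x' <- not x.  for each name x (x' is the dummy atom)."""
    rules = []
    for n in names:
        rules += [(n, [], [n + "'"]), (n + "'", [], [n])]
    return rules

def chosen(model):
    """Drop the dummy primed atoms to read off the selected subset."""
    return frozenset(x for x in model if not x.endswith("'"))

generate = subsets_program("abc")
not_both = generate + [(None, ["a", "b"], [])]    # <- a, b.
a_implies_b = generate + [(None, ["a"], ["b"])]   # <- a, not b.

for m in stable_models(a_implies_b):
    print(sorted(chosen(m)))
```

The generate part alone has eight stable models; each constraint only filters candidates and never adds new ones.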

Hamiltonian Cycle Given a set of facts defining the vertices and edges of a directed graph and a starting vertex v0, find a path that starts at v0, visits every vertex exactly once, and returns to v0.

Hamiltonian Cycle Any edge could be on such a path. We use in(U,V) to represent that edge(U,V) is on such a path. in(U,V) ← edge(U,V), not out(U,V). out(U,V) ← edge(U,V), not in(U,V). out(U,V) is a dummy representing that edge(U,V) is not on such a path.

Hamiltonian Cycle A path must be chained to form a sequence over the edges on it: reachable(V) ← in(v0,V). reachable(V) ← reachable(U), in(U,V).

Hamiltonian Cycle A vertex cannot be visited more than once. • This can be defined as “no more than one edge on such a path goes into any vertex” (similarly, out of any vertex): ← edge(U,V), in(U,V), edge(W,V), in(W,V), U ≠ W. ← edge(U,V), in(U,V), edge(U,W), in(U,W), V ≠ W.

Hamiltonian Cycle Don’t forget to say that every vertex must be reached. ← vertex(U), not reachable(U).
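The Hamiltonian cycle encoding follows the generate-and-constrain pattern, and its three ingredients can be mirrored directly in a generate-and-test loop. The Python sketch below is an illustration, not an ASP solver: hamiltonian_cycles is a hypothetical function, the four-edge graph is made up for the example, and each chosen subset of edges plays the role of the in(U,V) atoms.

```python
from itertools import combinations

def hamiltonian_cycles(vertices, edges, start):
    """Return every edge subset that passes the three checks of the encoding."""
    solutions = []
    edges = sorted(edges)
    for r in range(len(edges) + 1):
        for chosen in combinations(edges, r):      # generate: in / out
            heads = [v for _, v in chosen]
            tails = [u for u, _ in chosen]
            # constrain: at most one chosen edge into / out of any vertex
            if len(set(heads)) < len(heads) or len(set(tails)) < len(tails):
                continue
            # reachable(V) <- in(start,V).  reachable(V) <- reachable(U), in(U,V).
            reachable = {v for u, v in chosen if u == start}
            changed = True
            while changed:
                changed = False
                for u, v in chosen:
                    if u in reachable and v not in reachable:
                        reachable.add(v)
                        changed = True
            # constrain: every vertex must be reached
            if reachable == set(vertices):
                solutions.append(set(chosen))
    return solutions

# Hypothetical example graph.
vertices = {"v0", "v1", "v2"}
edges = {("v0", "v1"), ("v1", "v2"), ("v2", "v0"), ("v0", "v2")}
print(hamiltonian_cycles(vertices, edges, "v0"))
```

Because v0 itself must be reachable, the chosen edges have to lead back to the start, so the surviving subsets are cycles rather than open paths.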

3-colorability Whether 3 colors, say red, blue, and yellow, are sufficient to color a map. A map is represented by a graph, with facts about nodes and arcs as given, e.g., vertex(a). vertex(b). arc(a,b).

3-colorability Every vertex must be colored with exactly one color: color(V,r) ← vertex(V), not color(V,b), not color(V,y). color(V,b) ← vertex(V), not color(V,r), not color(V,y). color(V,y) ← vertex(V), not color(V,b), not color(V,r). No adjacent vertices may be colored with the same color: ← vertex(V), vertex(U), arc(V,U), col(C), color(V,C), color(U,C). Of course, we need to say what colors are: col(r). col(b). col(y).

3-colorability A different encoding: color(V,C) ← node(V), col(C), not otherColor(V,C). otherColor(V,C) ← node(V), col(C), not color(V,C). ← node(V), col(C1), col(C2), color(V,C1), color(V,C2), C1 ≠ C2. ← node(V), col(C), not color(V,C). ← node(V), node(U), V ≠ U, arc(V,U), col(C), color(V,C), color(U,C).
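The intended answers of a coloring program are the assignments of exactly one color per vertex in which no arc joins two vertices of the same color. The Python sketch below checks this by exhaustive generate-and-test; it is an illustration under assumptions (the function name colorings and both example graphs are made up here), not a translation of either encoding into a solver.

```python
from itertools import product

def colorings(vertices, arcs, colors=("r", "b", "y")):
    """All assignments of one color per vertex with no monochromatic arc."""
    vertices = sorted(vertices)
    result = []
    for assignment in product(colors, repeat=len(vertices)):   # generate
        color = dict(zip(vertices, assignment))
        if all(color[u] != color[v] for u, v in arcs):          # constrain
            result.append(color)
    return result

# Hypothetical example maps: a triangle, and a complete graph on four vertices.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
k4 = triangle + [("a", "d"), ("b", "d"), ("c", "d")]

print(len(colorings("abc", triangle)))
print(len(colorings("abcd", k4)))
```

A graph is 3-colorable exactly when the list is non-empty, which corresponds to the ASP program having at least one stable model.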

So, what exactly is a stable model of a normal program P? Idea: you guess a set of atoms M and verify that it is indeed exactly the set of atoms that can be derived (page 357 of textbook). Reduct of P w.r.t. M = {h ← b1, …, bm | h ← b1, …, bm, not c1, …, not cn is in P and no ci is in M}. M is a stable model of P iff the set of (atomic) consequences of the reduct of P w.r.t. M is precisely M.

Stable model P: a ← not b. b ← not a. M = {a} is a stable model, since the reduct of P w.r.t. M is {a ← .}, whose set of (atomic) consequences is precisely M itself.

Stable model Why does a ← not a. have no stable model? • The empty set {} is not a stable model. (Why?) • If M = {a} were a stable model, the reduct of the program w.r.t. {a} would be the empty set, whose set of (atomic) consequences is also empty, not the same as M.
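The reduct-based definition translates almost literally into code. The following Python sketch is a naive illustration (exponential in the number of atoms, unlike real systems such as Smodels): rules are hypothetical triples (head, positive body, negated body), and the names reduct, least_model and stable_models are chosen here for the example. It is run on the programs P1 to P5 from the earlier Examples slides.

```python
from itertools import combinations

def reduct(program, m):
    """Drop rules with some 'not c' where c is in m; strip 'not' from the rest."""
    return [(h, pos) for h, pos, neg in program if not any(c in m for c in neg)]

def least_model(definite_rules):
    """Atomic consequences of a negation-free program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def is_stable(program, m):
    return least_model(reduct(program, m)) == set(m)

def stable_models(program):
    atoms = sorted({x for h, pos, neg in program for x in [h, *pos, *neg]})
    return [set(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r) if is_stable(program, c)]

P1 = [("a", ["a"], [])]                        # a <- a.
P2 = [("a", [], ["b"])]                        # a <- not b.
P3 = [("a", [], ["a"])]                        # a <- not a.
P4 = [("a", [], ["b"]), ("b", [], ["a"])]      # a <- not b.  b <- not a.
P5 = P4 + [("a", [], ["a"])]                   # ... plus a <- not a.

for name, prog in [("P1", P1), ("P2", P2), ("P3", P3), ("P4", P4), ("P5", P5)]:
    print(name, stable_models(prog))
```

For P3 the candidate {} fails because its reduct is the fact a, whose consequences are {a}, and the candidate {a} fails because its reduct is empty.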

Extensions: Cardinality constraint A cardinality constraint is of the form L {a1, …, am, not b1, …, not bk} U. The constraint is satisfied in a model if the number of literals in the set that are satisfied by the model is between the integers L and U, inclusive. A cardinality constraint can be used anywhere in a rule. E.g. P = { 0{a, b, not d}2 ← . } {a} is a stable model, but is {a,b} a stable model?

Cardinality constraint Generate all subsets of {a,b,c,d} such that whenever a is in it so is b: 0{a, b, c, d}4 ← . b ← a. As 4 is the maximum number of literals that may be satisfied, you may omit it for simplicity: 0{a, b, c, d} ← .

Cardinality constraint Generate all subsets of {a,b,c,d} such that if a is not in it, then b is in it. 0{a, b, c, d} ← . b ← not a. Are they stable models? M1 = {a,b,c} M2 = {b,c,d,e}
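The two candidate models can be checked with a small extension of the reduct idea. The Python sketch below is a simplified illustration, and its restrictions are assumptions of the sketch rather than the full Smodels semantics: it handles a single cardinality constraint that appears as a rule head with an empty body and contains only positive atoms. Under that assumption, each constraint atom that the candidate M contains is treated as a fact in the reduct, and the bounds are checked separately.

```python
def least_model(rules):
    """Least model of negation-free rules given as (head, positive_body)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in rules:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model

def is_stable(m, choice_atoms, lower, upper, normal_rules):
    """Check m against 'lower {choice_atoms} upper <- .' plus normal rules."""
    m = set(m)
    if not lower <= len(m & set(choice_atoms)) <= upper:
        return False
    # Usual reduct for the normal rules (head, pos, neg) ...
    reduct = [(h, pos) for h, pos, neg in normal_rules if not (set(neg) & m)]
    # ... plus a fact for every constraint atom that m chooses.
    reduct += [(a, []) for a in choice_atoms if a in m]
    return least_model(reduct) == m

choice = ["a", "b", "c", "d"]        # 0{a, b, c, d} <- .
rules = [("b", [], ["a"])]           # b <- not a.

print(is_stable({"a", "b", "c"}, choice, 0, 4, rules))
print(is_stable({"b", "c", "d", "e"}, choice, 0, 4, rules))
```

Under this reading M1 is a stable model, while M2 is not, because nothing in the program can derive e.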

ASP Systems • Smodels (Helsinki Univ. of Tech.) • DLV (Vienna Univ. of Tech.) • ASSAT (HK Univ. of Sci. and Tech.) • Cmodels (U. of Texas at Austin)

The Smodels System An efficient system for computing answer sets of normal programs (later extended for disjunctive programs). Consists of two parts: • Lparse: grounds a program. • Smodels: computes the stable models of the grounded program, based on DPLL.
