Logic for knowledge representation

Logic for knowledge representation Points Propositional logic Predicate logic • The pros and cons • Syntax • Semantics Inference • Soundness and completeness The clausal form • The conversion procedure Resolution Control of reasoning

Propositional logic We need to represent properties of objects in the world we model, and relations between objects. Propositional logic is not expressive enough. It allows no generality and no generalization: just try to express the fact “dogs have tails”. This is the simplest form of logic. It has only one advantage: there is an effective procedure for proving theorems in it. That is, propositional logic is decidable.

Predicate logic First-order predicate logic gives us much of what we need; its various extensions offer a wide spectrum of knowledge representation possibilities. Predicate logic has a few obvious advantages: • it has been known and applied in practice for some 2500 years; • it has well-defined reasoning methods; • it is based on a solid theory that gives us “for free” the semantics of this knowledge representation method.

Predicate logic (2) There also are drawbacks of predicate logic: • it offers no support for hierarchical organization (e.g., no object orientation); • it is absolute (“frozen in time”) -- there are problems with representing typical situations and objects, and expressing defaults; • it does not allow common-sense reasoning or approximate reasoning; • it has only undecidable or semidecidable proof procedures (but this is true of other representation methods).

Predicate logic syntax -- a refresher A very brief review the syntax (in a Prolog-style notation). In an atom, the predicate denotes the general nature of things, while the arguments (terms) represent the details. Example: onMarket(house(address(ottawa,main,10), bedrooms(4)), price(225000)) Variables allow us to generalize, for example: onMarket(house(A, B), price(P))

... syntax -- a refresher (2) The logical connectives , , , ,  produce formulae out of formulae. Quantifiers allow us to make general statements about classes of individuals: A N (onMarket(house(A, bedrooms(4)), price(N)) N < 190000makeOfferFor(A))

First order? “First order” means that we can only reason about sets of individuals, that is, only individuals can be quantified. Predicates are fixed. In higher-order logics variables are allowed in predicate positions. Two examples: X Y X(Y) “each property has at least one instance” X (X(a)  X(b)) “a and b have the same properties”

Predicate logic semantics Quantifiers, logical connectives, atoms (predicates, arguments), terms (functors, constants, variables) are the syntactic elements of first-order logic. They must be given a semantic interpretation if we want them to represent the world. Formally, the termaddress(ottawa, main, 10)need not represent a house. The symbols address, ottawa, main could denote anything.

Predicate logic semantics (2) We interpret logical formulae by mapping their elements into a domain. A term without variables corresponds to an object in the domain. An atom without variables corresponds to a true or false statement about domain objects. For example: ottawa corresponds to the city of Ottawa, main corresponds to Main Street in Ottawa, address(ottawa, main, 10) corresponds to a valid Ottawa address, onMarket(house(address(ottawa, main, 10), bedrooms(4)), price(225000)) denotes a true or false statement about a specific house.

Predicate logic semantics (3) Let S be a set of logical formulae that represent an AI problem. Atoms in S correspond to true or false statements about the problem. A model of S is any subset of atoms -- all those that we consider true. All atoms not in the model are by definition false. A formula with logical connectives has its truth value derived from the values of its components. For example, S1 S2 is true if both S1 and S2 are in the model. (This is a hugely simplified approach to models in logic. )

Predicate logic semantics (4) A tautology is something that is true in every model. A classic example is this formula true for any H, P: onMarket(H, P)   onMarket(H, P) Two formulae are model-equivalent if, in every model, they are either both true or both false. Formula Φ entails formula Ψ if Ψ is true in every model in which Φ is true. This is written as: Φ |= Ψ For example, (onMarket(H, P)  sold(H))  sold(H)|= onMarket(H, P)

Inference Entailment allows us to represent certain aspects of rational behaviour: a logically thinking agent draws conclusions from facts, and so in some way derives new knowledge. We need a procedure that will implement this kind of logical thinking. Inference is such a procedure, especially if it has two important properties: it derives only good knowledge; it derives all knowledge that can be derived.

Inference rules • Tableaux-based methods. (Try to construct a model that makes a given formula Φ false. If we cannot construct any such model, we have proven Φ.) • The resolution rule (much more about it soon). • Natural deduction.

Inference rules (2)

Good knowledge, all knowledge The notation Φ |- Ψ means: “we can infer Ψ from Φ”. An inference rule is sound if an inferred formula is also entailed: If Φ |- Ψ then Φ |= Ψ An inference rule is complete if every entailed formula can be inferred: If Φ |= Ψ then Φ |- Ψ We can say that, for every Φ and Ψ, a sound and complete inference rule ensures: Φ |- Ψ if and only if Φ |= Ψ or, even more neatly:

Resolution We begin with an advertisement... The resolution proof procedure is attractive: one inference rule plus unification (to be explained soon). The resolution proof procedure is practical: Prolog is based on resolution (meta-interpretation helps enrich the procedure with explanations). Downside: the clausal form, required by resolution, is less natural than full first-order logic (but intuitive!), and the proof procedure is still costly unless we restrict the form of clauses.

The clausal form Resolution (2) A literal is either an atom, or a negated atom. A clause is a disjunction of literals: A1  A2  ...  An  ¬B1  ¬B2  ...  ¬Bm In Prolog, only Horn clauses are allowed. A Horn clause has at most one positive literal. For example, the following Prolog clause a(X, Y) :- b(X, Z), c(Z, Y). is really ∀X ∀Y ∀Z (b(X, Z)  c(Z, Y)  a(X, Y)) or, in the clausal form, a(X, Y)  ¬b(X, Z)  ¬c(Z, Y)

Unification Resolution (3) Boolean function UNIFY(term T1, term T2) { if ( T1 is a variable ) T1 ← T2; /* bind */ elsif ( T2 is a variable ) T2 ← T1; /* bind */ elsif ( T1 and T2 have different main functors ) return false; else { Boolean unifiable = true; while ( unifiable, unmatched arguments remain ) unifiable = UNIFY(next argument of T1, next argument of T2); } return unifiable; }

Unification -- comments Resolution (4) Variable bindings are accumulated on an initially empty list. If the terms are unifiable, the list is our result; if they are not unifiable, no bindings will be performed. This unification procedure may create circular structures, as it does in Prolog: ?- X = f(X). X = f(f(f(f(f(f(f(f(f(f(...)))))))))) Yes We could check that, in Var  Expr, Expr does not contain Var. The cost of this “occurs-check” is too high for practical systems such as Prolog.

Converting to clauses (i) Resolution (5) Every first-order logic formula Φ can be converted into clauses in a model-equivalent way: if Φ is mapped into a true statement about some world, then Φ’s clausal form is also true. We will show the conversion procedure on an example (see the textbook), probably with too many parentheses: ∀X ((a(X)  b(X)) (c(X, k)  ∃Y (∃Z c(Y, Z)  d(X, Y))) ) ∀X e(X)

Converting to clauses (ii) Resolution (6) Step 1: eliminate implication.    ∀X ((a(X)  b(X)) (c(X, k)  ∃Y (∃Z c(Y, Z) d(X, Y))) ) ∀X e(X)

Converting to clauses (iii) Resolution (7) Step 2: move negation inward.   ()() (∧ )() ∀ X ∃ X  ∃ X ∀ X  ∀X ((a(X) b(X)) (c(X, k)  ∃Y (∃Z c(Y, Z) d(X, Y))) ) ∀X e(X)

Converting to clauses (iv) Resolution (8) Step 3: rename variables bound by independent (not nested) quantifiers. ∀X ((a(X) b(X)) (c(X, k)  ∃Y (∃Z c(Y, Z) d(X, Y))) ) ∀W e(W) Step 4: move quantifiers to the front. ∀X ∃Y ∃Z ∀W ((a(X) b(X)) (c(X, k)  (c(Y, Z) d(X, Y))) ) e(W)

Converting to clauses (v) Resolution (9) Step 5: skolemize (eliminate existential quantifiers). ∀A ...∀B ∃C (A, ..., B, C)  ∀A ... ∀B (A, ..., B, s(A, ..., B)) ∀X ∀W ((a(X) b(X)) (c(X, k)  (c(f(X), g(X)) d(X, f(X)))) ) e(W)

Converting to clauses (vi) Resolution (10) Step 6: drop the prefix (all variables are assumed to be universally quantified by default). ((a(X) b(X)) (c(X, k)  (c(f(X), g(X)) d(X, f(X)))) ) e(W) Eliminate parentheses, assuming the usual precedence. a(X) b(X)  c(X, k) (c(f(X), g(X))  d(X, f(X))) e(W)

Converting to clauses (vii) Resolution (11) Step 7: form a conjunction of disjunctions. )() (a(X) b(X)  c(X, k)  e(W))  (a(X) b(X)  c(f(X), g(X))  d(X, f(X))  e(W))

Converting to clauses (viii) Resolution (12) Step 8: split the conjunction into clauses (by convention, they are treated as conjoined). a(X) b(X)  c(X, k)  e(W) a(X) b(X)  c(f(X), g(X)) d(X, f(X))  e(W) Rewrite into a neat clausal form with arrows. a(X), b(X)  c(X, k), e(W) a(X), b(X), c(f(X), g(X)) d(X, f(X)), e(W)

Resolution (13) Let’s see where this long story got us. We started here: ∀X ((a(X)  b(X)) (c(X, k)  ∃Y (∃Z c(Y, Z)  d(X, Y))) ) ∀X e(X) We ended up here: (a(X)  b(X)  c(X, k)  e(W))  (a(X)  b(X)  c(f(X), g(X)) d(X, f(X))  e(W))

The proof procedure (i) Resolution (14) Deduction is based on a single inference rule. Let all Am, Bn be literals. Given are two clauses: A1... Ai-1 ¬Ai Ai+1... AM B1... Bk-1Bk Bk+1... BN where Ai and Bk are unifiable.

The proof procedure (ii) Resolution (15) Unification is applied to both complete clauses, the two unifiable literals are "cancelled out", and a new clause is formed. A1... Ai-1 Ai+1... AM B1... Bk-1 Bk+1... BN This is the resolvent of the two given clauses. There is one clause with a special meaning: the empty clause [], containing no disjuncts. It is interpreted as always false.

The proof procedure (iii) Resolution (16) Here is how we prove a formula , given a set S of axioms and theorems (we actually show that  is also a theorem): convert the given axioms and theorems in S into the clausal form CS, convert  into the clausal form C, try to deduce a contradiction from CS C, that is, to derive the empty clause []; this is done by repeatedly picking pairs of clauses and applying the resolution inference rule.

The proof procedure (iv) Resolution (17) Now, if the original theory S can produce , we should be able to derive it somehow; given also , we get exactly the empty clause—this proves  by contradiction. The order in which resolution develops is dictated by the proof algorithm, or by the operation of selecting two clauses that give the next resolvent. Selection strategies can lead to efficient proofs, but there is in general no guarantee that a proof can even be found.

An example (i) Resolution (18) Someone is happy if he passes a history exam and wins a lottery. You pass exams if you study or if you are lucky. You win a lottery if you are lucky (studying does not help). John is lucky, but he does not study. Is he happy? • X pass(X, history) win(X, lottery) happy(X) • X Y study(X)  lucky(X) pass(X, Y) • X lucky(X)  win(X, lottery) • lucky(john) • ¬ study(john)

An example (ii) Resolution (19) The same in the clausal form (the last clause translates our question). ¬ pass(X, history) ¬ win(X, lottery)  happy(X) ¬ study(Y)  pass(Y, Z) ¬ lucky(W)  pass(W, V) ¬ lucky(U)  win(U, lottery) lucky(john) ¬ study(john) ¬ happy(john)

An example (iii) Resolution (20) ¬ pass(X, history) ¬ win(X, lottery)  happy(X) ¬ study(Y)  pass(Y, Z) ¬ lucky(W)  pass(W, V) ¬ lucky(U)  win(U, lottery) lucky(john) ¬ study(john) ¬ happy(john)

How to control resolution? The order in which we select clauses for the next resolution step decides how fast we reach the empty clause (if we ever do). There is a number of strategies... but we have to skip this fascinating topic . Please read section 13.2.4 of the textbook if you want to find out more.

Logic for knowledge representation