Overview of AIC-PRAiSE: (Lifted First-Order) Probabilistic Reasoning As Symbolic Evaluation

Propositional factor graphs
• Simply a set of potential functions on random variables
• Random variables: john and mary are friends, john smokes, mary smokes
• Potential functions: Φ1(john and mary are friends, john smokes, mary smokes), Φ2(john smokes), Φ3(mary smokes)
P(john and mary are friends, john smokes, mary smokes) = α Φ1(john and mary are friends, john smokes, mary smokes) Φ2(john smokes) Φ3(mary smokes), where α is a normalization constant

Propositional factor graphs
• Bayesian nets can be represented as factor graphs
• Potential functions correspond to conditional probabilities
[Figure: a Bayesian network over earthquake, burglary, and alarm, next to the equivalent factor graph with factors P(earthquake), P(burglary), and P(alarm|earthquake, burglary)]
P(alarm, earthquake, burglary) = α P(alarm|earthquake, burglary) P(earthquake) P(burglary)

Belief Propagation (BP)
[Figure: factor graph over earthquake, alarm, and burglary]
P(alarm) = α Σ_{earthquake, burglary} P(alarm|earthquake, burglary) P(earthquake) P(burglary) = ?

Belief Propagation (BP)
[Figure: factor graph over earthquake, alarm, and burglary]
P(alarm) = α Σ_{earthquake, burglary} P(alarm|earthquake, burglary) P(earthquake) P(burglary)
= α Σ_{earthquake} P(earthquake) (Σ_{burglary} P(burglary) P(alarm|earthquake, burglary))
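The rearrangement above can be checked numerically. The sketch below is only an illustration: the probability values are hypothetical (the slides give none), and the function names are made up for this example. It computes P(alarm) once by summing over the full joint and once with the sum over burglary pushed inside, as in the second line of the derivation.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the slides).
P_EARTHQUAKE = {True: 0.01, False: 0.99}
P_BURGLARY = {True: 0.02, False: 0.98}
# P(alarm = True | earthquake, burglary), keyed by (earthquake, burglary).
P_ALARM_TRUE = {
    (True, True): 0.95,
    (True, False): 0.30,
    (False, True): 0.90,
    (False, False): 0.01,
}


def p_alarm_given(alarm, earthquake, burglary):
    """P(alarm | earthquake, burglary) for either value of alarm."""
    p_true = P_ALARM_TRUE[(earthquake, burglary)]
    return p_true if alarm else 1.0 - p_true


def marginal_flat(alarm):
    """Sum over every (earthquake, burglary) pair of the joint."""
    return sum(
        p_alarm_given(alarm, e, b) * P_EARTHQUAKE[e] * P_BURGLARY[b]
        for e, b in product([True, False], repeat=2)
    )


def marginal_nested(alarm):
    """Same quantity with the sum over burglary pushed inside."""
    total = 0.0
    for e in (True, False):
        inner = sum(
            P_BURGLARY[b] * p_alarm_given(alarm, e, b) for b in (True, False)
        )
        total += P_EARTHQUAKE[e] * inner
    return total
```

Both functions return the same marginal; for a Bayesian network the factors are already normalized, so α is 1 here.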

Belief Propagation (BP)
Here μ(A ← B) denotes the message to A from B.
belief(V) = α Π_{F in neighbors(V)} μ(V ← F)
From factor to variable, summing over the arguments of F other than V:
μ(V ← F) = Σ_{(args of F) – {V}} F(args of F) Π_{V’ in (args of F) – {V}} μ(F ← V’)
From variable to factor, multiplying over the neighbors of V other than F:
μ(F ← V) = Π_{F’ in neighbors(V) – {F}} μ(V ← F’)

A Simple Example
• The probability of an epidemic happening is 10%
• For each person in a population of 1,000, the probability of that person getting sick is
  40% if there is an epidemic
  1% if there is not an epidemic
• Query: given that three people are sick and everybody else is not, what is the probability of an epidemic?

Using a Bayesian Net to Solve it
[Figure: Bayesian network with epidemic as the parent of sick1, ..., sick1000; evidence sick1 = true, sick2 = true, sick3 = true, ..., sick999 = false, sick1000 = false]

Making it more logic-like
We will use first-order logic predicate notation instead of indices, because we want to write logic-like rules with them:
sick1 = sick(person1)
sick2 = sick(person2)
...

Using a Bayesian Net to Solve it
[Figure: the same network in predicate notation; evidence sick(person1) = true, sick(person2) = true, sick(person3) = true, ..., sick(person999) = false, sick(person1000) = false]

Using a Bayesian Net to Solve it
[Figure: factor graph connecting epidemic to sick(person1), sick(person2), sick(person3), ..., sick(person999), sick(person1000)]
Lots of messages have the same values and are derived from essentially the same computation. We could instead compute each of them only once and then exponentiate.

Using a Bayesian Net to Solve it
[Figure: the same factor graph]
But a regular graphical model inference algorithm will compute all the repeated messages.

Using a Bayesian Net to Solve it
[Figure: the same factor graph]
We could write an algorithm for this specific model, but what if we don’t know the model in advance because we are writing an inference engine, or the model is going to be learned?

An Algebraic View
Representing concepts with mathematical expressions:
• Random variables: [ epidemic ] and [ sick(person1) ] (the value of [ epidemic ] is simply epidemic, like X and x in statistics)
• Factors: [ if epidemic then 0.1 else 0.9 ] and [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ]
• message to [ epidemic ] from [ if epidemic then 0.1 else 0.9 ]
• message to [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] from [ sick(person1) ]

An Algebraic View
We can represent the set of factors with a single expression:
{ [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] }
union { [ if sick(person1) then 1 else 0 ], [ if sick(person2) then 1 else 0 ], [ if sick(person3) then 1 else 0 ] }
union { [ if sick(person4) then 0 else 1 ], …, [ if sick(person1000) then 0 else 1 ] }

An Algebraic View
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic) = prod_{F in neighbors([epidemic])} message to [epidemic] from F
We then compute:
neighbors([epidemic]) =
{ [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] }
And plug it back:

An Algebraic View
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic) = prod_{F in
{ [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] }
} message to [epidemic] from F

An Algebraic View
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic) =
message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* message to [epidemic] from [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ]
* ...
* message to [epidemic] from [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ]
[Figure: epidemic connected to the factors for person1, ..., person1000]
Doesn’t change anything, really; we still need to compute messages from 1000 nodes!

Intensional Representation
We now introduce an intensional way of representing the set of factors in the model:
{ [ if epidemic then 0.1 else 0.9 ] }
union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }}
union {{ (on X in People) [ if sick(X) then 1 else 0 ] | X = person1 or X = person2 or X = person3 }}
union {{ (on X in People) [ if sick(X) then 0 else 1 ] | X != person1 and X != person2 and X != person3 }}
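To make the size difference concrete, here is a minimal sketch of the intensional model as data. The class name IntensionalSet, the use of string templates, and the helper ground_model are assumptions made for this illustration only; they are not the PRAiSE representation. Each intensional set is a template with a placeholder X plus a condition on X, and grounding it yields the extensional factors.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

PEOPLE = [f"person{i}" for i in range(1, 1001)]
OBSERVED_SICK = {"person1", "person2", "person3"}


@dataclass(frozen=True)
class IntensionalSet:
    """Hypothetical stand-in for {{ (on X in People) template | condition }}."""
    template: str                      # factor text with a {X} placeholder
    condition: Callable[[str], bool]   # constraint on X

    def ground(self, domain: Iterable[str]) -> List[str]:
        """Expand into one concrete factor per X satisfying the condition."""
        return [self.template.format(X=x) for x in domain if self.condition(x)]


PRIOR = ["[ if epidemic then 0.1 else 0.9 ]"]

INTENSIONAL = [
    IntensionalSet(
        "[ if epidemic then if sick({X}) then 0.4 else 0.6 "
        "else if sick({X}) then 0.01 else 0.99 ]",
        lambda x: True,
    ),
    IntensionalSet("[ if sick({X}) then 1 else 0 ]",
                   lambda x: x in OBSERVED_SICK),
    IntensionalSet("[ if sick({X}) then 0 else 1 ]",
                   lambda x: x not in OBSERVED_SICK),
]


def ground_model() -> List[str]:
    """Extensional version of the model: every factor written out."""
    factors = list(PRIOR)
    for intensional_set in INTENSIONAL:
        factors.extend(intensional_set.ground(PEOPLE))
    return factors
```

The intensional description has four entries, while ground_model() returns 1 + 1000 + 3 + 997 = 2001 factors.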

Intensional Representation
Intensional version:
belief(epidemic) = prod_{F in neighbors([epidemic])} message to [epidemic] from F
We then compute:
neighbors([epidemic]) =
{ [ if epidemic then 0.1 else 0.9 ] }
union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }}
And plug it back:

Intensional Representation
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic)
= prod_{F in { [ if epidemic then 0.1 else 0.9 ] } union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{F in {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F

Intensional Representation
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic)
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{F in {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]

Intensional Representation
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic)
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
[Figure: epidemic connected to the prior factor [ if epidemic then 0.1 else 0.9 ] and, under prod_X, to the factor on sick(X); the prod_X node stands for 1000 nodes sending 1000 messages, which are multiplied]
Does it make things better? How do I compute that expression?

Symbolic Evaluation
• Symbolic evaluation is about evaluating expressions even if we don’t know the value of everything in them:
• 1 + 2 + 3 → 6
• X + 2 + 3 + 0*Y → X + 5
• 3 in { 1, 2, 3 } → true
• [ sick(X) ] in { [sick(john)], [sick(mary)] } → X = john or X = mary
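The first two examples can be reproduced with a very small simplifier. This is only a sketch under assumed conventions, not how PRAiSE represents expressions: an expression is a number, a variable name (a string), or a tuple ("+", ...) or ("*", ...), and the function name simplify is made up for this example.

```python
def simplify(expr):
    """Fold constants in sums and products, leaving unknown symbols in place."""
    if not isinstance(expr, tuple):
        return expr  # a number or a variable name evaluates to itself
    op, *args = expr
    args = [simplify(a) for a in args]
    numbers = [a for a in args if isinstance(a, (int, float))]
    symbols = [a for a in args if not isinstance(a, (int, float))]
    if op == "*":
        if 0 in numbers:
            return 0  # 0 * anything is 0, even if the rest is unknown
        constant = 1
        for n in numbers:
            constant *= n
        if not symbols:
            return constant
        if constant != 1:
            symbols.insert(0, constant)
        return symbols[0] if len(symbols) == 1 else ("*", *symbols)
    if op == "+":
        constant = sum(numbers)
        if not symbols:
            return constant
        if constant != 0:
            symbols.append(constant)
        return symbols[0] if len(symbols) == 1 else ("+", *symbols)
    raise ValueError(f"unknown operator: {op}")
```

With this encoding, simplify(("+", 1, 2, 3)) gives 6 and simplify(("+", "X", 2, 3, ("*", 0, "Y"))) gives ("+", "X", 5), without ever needing a value for X or Y.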

Symbolic Evaluation
Externalizing if-then-else constructs:
f(if Condition then A else B) = if Condition then f(A) else f(B)
Example:
income(X) := salary(X) + 2
salary(X) := if X = bob then 7 else 1
income(bob) = salary(bob) + 2 = 7 + 2 = 9
income(Y) = salary(Y) + 2
= (if Y = bob then 7 else 1) + 2
= if Y = bob then 7 + 2 else 1 + 2
= if Y = bob then 9 else 3
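The externalization rule can be sketched in a few lines. The encoding here is an assumption for illustration: an if-then-else is the tuple ("if", condition, then_branch, else_branch), and, following the slides' convention, names starting with an uppercase letter (X, Y) are variables while lowercase names (bob) are constants. The helper names externalize and is_variable are hypothetical.

```python
def is_variable(name):
    """Assumed convention: uppercase-initial names are logical variables."""
    return name[:1].isupper()


def externalize(f, expr):
    """Apply f underneath an if-then-else: f(if C then A else B) = if C then f(A) else f(B)."""
    if isinstance(expr, tuple) and expr[0] == "if":
        _, condition, then_branch, else_branch = expr
        return ("if", condition,
                externalize(f, then_branch),
                externalize(f, else_branch))
    return f(expr)


def salary(person):
    """salary(X) := if X = bob then 7 else 1, kept symbolic when X is a variable."""
    if is_variable(person):
        return ("if", ("=", person, "bob"), 7, 1)
    return 7 if person == "bob" else 1


def income(person):
    """income(X) := salary(X) + 2, with the addition pushed into both branches."""
    return externalize(lambda value: value + 2, salary(person))
```

Here income("bob") evaluates to 9, and income("Y") evaluates to ("if", ("=", "Y", "bob"), 9, 3), matching the derivation on the slide.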

Symbolic Evaluation
Case analysis:
sum_{X in Set} if Cond(X) then A else B
= sum_{X in Set : Cond(X)} A + sum_{X in Set : not Cond(X)} B
= (if A and B are constants in X) A * |{ X in Set : Cond(X) }| + B * |{ X in Set : not Cond(X) }|
(analogous for product and exponentiation)
Example:
sum_X income(X)
= sum_X if X = bob then 9 else 3
= sum_{X = bob} 9 + sum_{X != bob} 3
= 9 + 3 * |{X in People : X != bob}|
= 9 + 3 * (|People| - 1)
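A quick numerical check of the case-analysis identity, as a sketch: the population of 1,000 names is a hypothetical domain chosen for this example, and the function names are made up. The enumeration visits every individual, while the case-analysis version only needs two counts.

```python
# Hypothetical domain: bob plus 999 other people.
PEOPLE = ["bob"] + [f"person{i}" for i in range(2, 1001)]


def income(x):
    """income(X) = if X = bob then 9 else 3."""
    return 9 if x == "bob" else 3


def sum_by_enumeration(domain):
    """Propositional approach: touch every individual."""
    return sum(income(x) for x in domain)


def sum_by_case_analysis(domain_size, matching_count, then_value, else_value):
    """Lifted approach: A * |{X : Cond(X)}| + B * |{X : not Cond(X)}|."""
    return then_value * matching_count + else_value * (domain_size - matching_count)
```

For this domain both give 9 + 3 * (1000 - 1) = 3006, but the second does so without iterating over People.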

A Schematic View
[Figure: an expression tree under prod_X, with root f and sub-expressions g(X), h(X), z(X), evaluated one sub-expression at a time. z(X) becomes "if X = bob then 1 else 2", h(X) becomes "if X = bob then 5 else 6", and g(X) becomes "if X = bob then 7 else 5"; a product over X of "if X = bob then 7 else 5" evaluates to 7 * 5^(|X| - 1).]

Symbolic Evaluation to the Rescue
prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] = ?
• We symbolically solve message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
• It has to be solved symbolically, because it contains a free variable X.

Symbolic Evaluation to the Rescue
message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= some function of X...
= another function of X...
...
= a function of neighbors([sick(X)])
For the first time, an expression actually depends on the value of X:
neighbors([sick(X)]) =
if X = person1 or X = person2 or X = person3
then ... union { [ if sick(X) then 1 else 0 ] }
else ... union { [ if sick(X) then 0 else 1 ] }

Symbolic Evaluation to the Rescue
message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= if X = person1 or X = person2 or X = person3
then < some message on epidemic >
else < some other message on epidemic >

Symbolic Evaluation to the Rescue
prod_X message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= prod_X if X = person1 or X = person2 or X = person3 then < some message on epidemic > else < some other message on epidemic >
= prod_{X = person1 or X = person2 or X = person3} < some message on epidemic > * prod_{X != person1 and X != person2 and X != person3} < some other message on epidemic >
= < some message on epidemic > ^ |{X = person1 or X = person2 or X = person3}| * < some other message on epidemic > ^ |{X != person1 and X != person2 and X != person3}|
= < some message on epidemic > ^ 3 * < some other message on epidemic > ^ 997
= < yet another message on epidemic >

An Algebraic View
belief(epidemic)
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= < prior message on epidemic > * < yet another message on epidemic >
= < value of belief of epidemic >
[Figure: epidemic connected to the prior factor [ if epidemic then 0.1 else 0.9 ] and, under prod_X, to the factor on sick(X); the prod_X node stands for 1000 nodes sending 1000 messages, which are multiplied]
We computed the exact belief without considering all the individuals.
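For the epidemic model this computation can be written out numerically. The sketch below is not the PRAiSE implementation; it hard-codes the model from the example (10% prior, 40% and 1% sickness probabilities, 3 sick people out of 1,000) and uses made-up function names. It compares a propositional loop over all 1,000 people with the lifted form, where the two distinct messages are raised to the powers 3 and 997. Logarithms are used because the raw products are extremely small.

```python
import math

N_PEOPLE = 1000
N_SICK = 3
PRIOR = {True: 0.1, False: 0.9}        # [ if epidemic then 0.1 else 0.9 ]
P_SICK = {True: 0.4, False: 0.01}      # P(sick(X) | epidemic)


def message_from_person(epidemic, sick):
    """Message to [epidemic] from the factor on sick(X), given the evidence on sick(X)."""
    p = P_SICK[epidemic]
    return p if sick else 1.0 - p


def normalize(log_true, log_false):
    """Normalized belief that epidemic is true, from unnormalized log values."""
    return 1.0 / (1.0 + math.exp(log_false - log_true))


def belief_propositional():
    """Multiply one message per individual (1000 messages)."""
    logs = {}
    for e in (True, False):
        total = math.log(PRIOR[e])
        for i in range(1, N_PEOPLE + 1):
            total += math.log(message_from_person(e, sick=(i <= N_SICK)))
        logs[e] = total
    return normalize(logs[True], logs[False])


def belief_lifted():
    """Compute each distinct message once and exponentiate by its count."""
    logs = {}
    for e in (True, False):
        logs[e] = (
            math.log(PRIOR[e])
            + N_SICK * math.log(message_from_person(e, sick=True))
            + (N_PEOPLE - N_SICK) * math.log(message_from_person(e, sick=False))
        )
    return normalize(logs[True], logs[False])
```

Both functions agree up to floating-point rounding. With only 3 sick people out of 1,000 the belief in an epidemic comes out vanishingly small, since 997 healthy people are far more likely without an epidemic.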

Lifted Inference • Lifted inference is about performing inference from intensional representations without unnecessarily considering individual random variables


Why this is progress
• Lifted inference algorithms to date have been very loosely described
• Notions such as “parfactor” are not clear
• How many neighbors does this parfactor have?
• Are the smoker nodes really separate?
[Figure: a parfactor connected to friends(X,Y), smoker(X), and smoker(Y)]

Symbolic Evaluation to the Rescue
{ U, V, a } = (set normalization)
if U = V then { U, a }
else if U = a then { V, a }
else if V = a then { U, a }
else { U, V, a }
f(X) = f(Y) = (f injective) X = Y
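The set-normalization case analysis can be checked by brute force. In this sketch the nested-tuple encoding, the tiny evaluator, and the three-element test domain are all assumptions for illustration; U and V are variables and a is a constant, as on the slide.

```python
from itertools import product


def normalized_set_expr():
    """The if-then-else produced by normalizing { U, V, a }."""
    return ("if", ("=", "U", "V"), ("set", "U", "a"),
            ("if", ("=", "U", "a"), ("set", "V", "a"),
             ("if", ("=", "V", "a"), ("set", "U", "a"),
              ("set", "U", "V", "a"))))


def evaluate(expr, env):
    """Evaluate an expression once the variables in env have values."""
    if isinstance(expr, str):
        return env.get(expr, expr)  # variables are looked up; constants denote themselves
    op = expr[0]
    if op == "if":
        _, condition, then_branch, else_branch = expr
        branch = then_branch if evaluate(condition, env) else else_branch
        return evaluate(branch, env)
    if op == "=":
        return evaluate(expr[1], env) == evaluate(expr[2], env)
    if op == "set":
        return frozenset(evaluate(element, env) for element in expr[1:])
    raise ValueError(f"unknown operator: {op}")


def check_against_enumeration(domain=("a", "b", "c")):
    """Compare the symbolic result with the plain set for every assignment."""
    for u, v in product(domain, repeat=2):
        symbolic = evaluate(normalized_set_expr(), {"U": u, "V": v})
        if symbolic != frozenset({u, v, "a"}):
            return False
    return True
```

check_against_enumeration() returns True: under every assignment to U and V, the selected branch denotes the same set as { U, V, a }.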

Symbolic Evaluation to the Rescue
neighbors([if smoker(X) then if smoker(Y) then friends(X,Y) ... ])
= { [friends(X,Y)], [smoker(X)], [smoker(Y)] }
= (set normalization)
if [smoker(X)] = [smoker(Y)]
then { [friends(X,Y)], [smoker(X)] }
else { [friends(X,Y)], [smoker(X)], [smoker(Y)] }
= (equality on injective function)
if X = Y
then { [friends(X,Y)], [smoker(X)] }
else { [friends(X,Y)], [smoker(X)], [smoker(Y)] }

Cardinalities can also split
| { X in People : X != bob and X != Neighbor } | = if Neighbor = bob then |People| - 1 else |People| - 2
This was also very awkward for algorithms so far, but with symbolic evaluation it is dealt with just like any other if-then-else.
Symbolically computing cardinalities of sets is a useful subroutine for a lot of things other than probabilistic inference!
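The split cardinality can be compared against direct counting. This sketch assumes a hypothetical population of 1,000 names in which both bob and Neighbor are members of People; the function names are made up for this example.

```python
# Hypothetical domain: bob plus 999 other people.
PEOPLE = ["bob"] + [f"person{i}" for i in range(2, 1001)]


def cardinality_by_counting(neighbor):
    """| { X in People : X != bob and X != Neighbor } | by enumeration."""
    return len([x for x in PEOPLE if x != "bob" and x != neighbor])


def cardinality_lifted(people_count, neighbor):
    """if Neighbor = bob then |People| - 1 else |People| - 2 (Neighbor assumed in People)."""
    return people_count - 1 if neighbor == "bob" else people_count - 2
```

For every choice of Neighbor in People the two agree: 999 when Neighbor is bob, 998 otherwise.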

Conclusion
• An approach to lifted inference with a clear and formal representation
• Lifted algorithm based on straightforward math manipulations; good for the current state, essential for future extensions
• Get compilation, short-circuiting, etc. for free from the symbolic evaluation base
• Meta-level gives you lots of opportunities
• Lifted computation, really, not only for probabilistic inference
