## Tutorial on Bayesian Networks

Jack Breese, Microsoft Research, breese@microsoft.com

Daphne Koller, Stanford University, koller@cs.stanford.edu

First given as an AAAI’97 tutorial.

**Probabilities**

• Probability distribution P(X | ξ)
• X is a random variable
  • Discrete
  • Continuous
• ξ is the background state of information

**Discrete Random Variables**

• Finite set of possible outcomes
• Example: binary X

**Continuous Random Variable**

• Probability distribution (density function) over continuous values

[Figure: density function over a continuous axis, with the values 5 and 7 marked]

**Bayesian networks**

• Basics
• Structured representation
• Conditional independence
• Naïve Bayes model
• Independence facts

**Bayesian Networks**

[Figure: Smoking → Cancer]

| S | P(S) |
|---|---|
| no | 0.80 |
| light | 0.15 |
| heavy | 0.05 |

| | Smoking=no | Smoking=light | Smoking=heavy |
|---|---|---|---|
| P(C=none) | 0.96 | 0.88 | 0.60 |
| P(C=benign) | 0.03 | 0.08 | 0.25 |
| P(C=malig) | 0.01 | 0.04 | 0.15 |

**Product Rule**

• P(C,S) = P(C|S) P(S)

**Marginalization**

[Figure: marginals P(Smoke) and P(Cancer)]

**Bayes Rule Revisited**

| | Cancer=none | Cancer=benign | Cancer=malignant |
|---|---|---|---|
| P(S=no) | 0.821 | 0.522 | 0.421 |
| P(S=light) | 0.141 | 0.261 | 0.316 |
| P(S=heavy) | 0.037 | 0.217 | 0.263 |

**A Bayesian Network**

[Figure: network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]

**Independence**

Age and Gender are independent.

• P(A,G) = P(G) P(A)
• P(A|G) = P(A), written A ⊥ G
• P(G|A) = P(G), written G ⊥ A
• P(A,G) = P(G|A) P(A) = P(G) P(A)
• P(A,G) = P(A|G) P(G) = P(A) P(G)

**Conditional Independence**

[Figure: Age and Gender are parents of Smoking; Smoking is the parent of Cancer]

Cancer is independent of Age and Gender given Smoking.

• P(C|A,G,S) = P(C|S)
• C ⊥ A,G | S

**More Conditional Independence: Naïve Bayes**

[Figure: Cancer is the parent of Serum Calcium and Lung Tumor]

• Serum Calcium and Lung Tumor are dependent.
• Serum Calcium is independent of Lung Tumor, given Cancer: P(L|SC,C) = P(L|C)

**Naïve Bayes in general**

Root node: H
Child nodes: E1, E2, E3, …, En

2n + 1 parameters (for binary variables): P(h), plus P(ei | h) and P(ei | ¬h) for each i = 1, …, n

**More Conditional Independence: Explaining Away**

[Figure: Exposure to Toxics → Cancer ← Smoking]

• Exposure to Toxics and Smoking are independent: E ⊥ S
• Exposure to Toxics is dependent on Smoking, given Cancer: P(E=heavy | C=malignant) > P(E=heavy | C=malignant, S=heavy)

**Put it all together**

[Figure: full network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]

**General Product (Chain) Rule for Bayesian Networks**

$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa_i), \qquad Pa_i = \text{parents}(X_i)$$

**Conditional Independence**

A variable (node) is conditionally independent of its non-descendants given its parents.

[Figure: relative to Cancer, Age and Gender are non-descendants, Exposure to Toxics and Smoking are parents, and Serum Calcium and Lung Tumor are descendants]

Cancer is independent of Age and Gender given Exposure to Toxics and Smoking.

**Another non-descendant**

[Figure: the same network with an added Diet node]

Cancer is independent of Diet given Exposure to Toxics and Smoking.

**Independence and Graph Separation**

• Given a set of observations, is one set of variables dependent on another set?
• Observing effects can induce dependencies.
• d-separation (Pearl 1988) allows us to check conditional independence graphically.

**CPCS Network**

[Figure: the CPCS network]

**Structuring**

[Figure: network over Age, Gender, Exposure to Toxic, Smoking, Genetic Damage, Cancer, Lung Tumor]

• Network structure corresponding to “causality” is usually good.
• Extending the conversation.

**Local Structure**

• Causal independence: from 2^n to n+1 parameters
• Asymmetric assessment: similar savings in practice.
• Typical savings (#params):
  • 145 to 55 for a small hardware network;
  • 133,931,430 to 8254 for CPCS!

**Course Contents**

• Concepts in Probability
• Bayesian Networks
• Inference
• Decision making
• Learning networks from data
• Reasoning over time
• Applications

**Inference**

• Patterns of reasoning
• Basic inference
• Exact inference
• Exploiting structure
• Approximate inference

**Predictive Inference**

How likely are elderly males to get malignant cancer?
P(C=malignant | Age>60, Gender=male)

[Figure: full network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]

**Combined**

How likely is an elderly male patient with high Serum Calcium to have malignant cancer?

P(C=malignant | Age>60, Gender=male, Serum Calcium=high)

**Explaining away**

• If we see a lung tumor, the probability of heavy smoking and of exposure to toxics both go up.
• If we then observe heavy smoking, the probability of exposure to toxics goes back down.

**Inference in Belief Networks**

• Find P(Q=q | E=e)
  • Q the query variable
  • E set of evidence variables

$$P(q \mid e) = \frac{P(q, e)}{P(e)}$$

X1, …, Xn are network variables except Q, E:

$$P(q, e) = \sum_{x_1, \dots, x_n} P(q, e, x_1, \dots, x_n)$$

**Basic Inference**

[Figure: S → C]

• P(c) = ?
• P(C,S) = P(C|S) P(S)

**Basic Inference**

[Figure: chain A → B → C]

$$P(b) = \sum_a P(a, b) = \sum_a P(b \mid a)\, P(a)$$

$$P(c) = \sum_b P(c \mid b)\, P(b)$$

$$P(c) = \sum_{b,a} P(a, b, c) = \sum_{b,a} P(c \mid b)\, P(b \mid a)\, P(a) = \sum_b P(c \mid b) \underbrace{\sum_a P(b \mid a)\, P(a)}_{P(b)}$$

**Inference in trees**

[Figure: Y1 → X ← Y2]

$$P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2)\, P(y_1, y_2)$$

Because of independence of Y1, Y2:

$$P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2)\, P(y_1)\, P(y_2)$$

**Polytrees**

• A network is singly connected (a polytree) if it contains no undirected loops.

[Figure: example network]

Theorem: Inference in a singly connected network can be done in linear time\*.

Main idea: in variable elimination, need only maintain distributions over single nodes.

\* in network size including table sizes.

**The problem with loops**

[Figure: Cloudy → Rain, Cloudy → Sprinkler, Rain and Sprinkler → Grass-wet (deterministic or)]

• P(c) = 0.5
• P(r | c) = 0.99, P(r | ¬c) = 0.01
• P(s | c) = 0.01, P(s | ¬c) = 0.99

The grass is dry only if no rain and no sprinklers:

$$P(\neg g) = P(\neg r, \neg s) \approx 0$$

**The problem with loops contd.**

$$\begin{aligned}
P(\neg g) &= \underbrace{P(\neg g \mid r, s)}_{0} P(r, s) + \underbrace{P(\neg g \mid r, \neg s)}_{0} P(r, \neg s) + \underbrace{P(\neg g \mid \neg r, s)}_{0} P(\neg r, s) + \underbrace{P(\neg g \mid \neg r, \neg s)}_{1} P(\neg r, \neg s) \\
&= P(\neg r, \neg s)
\end{aligned}$$

Computing this as if Rain and Sprinkler were independent gives P(¬r) P(¬s) ≈ 0.5 · 0.5 = 0.25: the problem.
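The gap between the two answers in the loop example can be checked numerically. The following is a minimal sketch, not part of the tutorial: it assumes that rain is likely when cloudy (0.99) and the sprinkler is likely when not cloudy (0.99), and the helper names `bernoulli`, `joint_rs`, `marginal_r` and `marginal_s` are hypothetical.

```python
# Assumed CPTs for the Cloudy / Rain / Sprinkler network. Which of 0.01 and
# 0.99 belongs to "cloudy" versus "not cloudy" is an assumption made here.
P_C = {True: 0.5, False: 0.5}
P_R_GIVEN_C = {True: 0.99, False: 0.01}  # P(r | c), P(r | not c)
P_S_GIVEN_C = {True: 0.01, False: 0.99}  # P(s | c), P(s | not c)


def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint_rs(r, s):
    """Exact P(R=r, S=s), obtained by summing out Cloudy."""
    return sum(
        P_C[c] * bernoulli(P_R_GIVEN_C[c], r) * bernoulli(P_S_GIVEN_C[c], s)
        for c in (True, False)
    )


def marginal_r(r):
    """P(R=r), summing the joint over Sprinkler."""
    return sum(joint_rs(r, s) for s in (True, False))


def marginal_s(s):
    """P(S=s), summing the joint over Rain."""
    return sum(joint_rs(r, s) for r in (True, False))


# Grass-wet is a deterministic OR, so the grass is dry exactly when
# there is no rain and no sprinkler.
exact_dry = joint_rs(False, False)
# Wrong shortcut: treat Rain and Sprinkler as independent.
naive_dry = marginal_r(False) * marginal_s(False)

print(f"exact P(dry) = {exact_dry:.4f}")
print(f"naive P(dry) = {naive_dry:.4f}")
```

Under these assumed numbers the exact value is about 0.0099 while the shortcut gives 0.25, because Rain and Sprinkler are strongly anti-correlated through their common parent Cloudy.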
The true value, by contrast, is P(¬g) ≈ 0.

**Variable elimination**

[Figure: chain A → B → C]

$$P(c) = \sum_b P(c \mid b) \underbrace{\sum_a P(b \mid a)\, P(a)}_{P(b)}$$

• P(A) × P(B | A) = P(B, A); summing out A gives P(B)
• P(B) × P(C | B) = P(C, B); summing out B gives P(C)

**Inference as variable elimination**

• A factor over X is a function from val(X) to numbers in [0,1]:
  • A CPT is a factor
  • A joint distribution is also a factor
• BN inference:
  • factors are multiplied to give new ones
  • variables in factors summed out
• A variable can be summed out as soon as all factors mentioning it have been multiplied.

**Variable Elimination with loops**

[Figure: full network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]

• P(A) × P(G) × P(S | A,G) = P(A,G,S); summing out G gives P(A,S)
• P(A,S) × P(E | A) = P(A,E,S); summing out A gives P(E,S)
• P(E,S) × P(C | E,S) = P(E,S,C); summing out E,S gives P(C)
• P(C) × P(L | C) = P(C,L); summing out C gives P(L)

Complexity is exponential in the size of the factors.

**Join trees**\*

A join tree is a partially precompiled factorization.

[Figure: the network alongside its join tree, with cliques {A, G, S}, {A, E, S}, {E, S, C}, {C, S-C}, {C, L}; annotation: P(A) × P(G) × P(S | A,G) × P(A,S)]

\* aka junction trees, Lauritzen-Spiegelhalter, Hugin alg., …

**Computational complexity**

• Theorem: Inference in a multi-connected Bayesian network is NP-hard.

Boolean 3CNF formula φ = (u ∨ v ∨ w) ∧ (u ∨ w ∨ y)

[Figure: nodes U, V, W, Y, each with prior probability 1/2, feed two "or" nodes, which feed an "and" node]

Probability(and node is true) = 1/2^n · (# satisfying assignments of φ)

**Stochastic simulation**

[Figure: Burglary → Alarm ← Earthquake, Alarm → Call, Earthquake → Newscast; evidence: Call = c]

• P(b) = 0.03, P(e) = 0.001
• P(a | B, E): 0.98, 0.7, 0.4, 0.01
• P(c | A): 0.05, 0.8
• P(n | E): 0.3, 0.001

Samples are complete assignments to B, E, A, C, N.

$$P(b \mid c) \approx \frac{\#\text{ of live samples with } B = b}{\text{total } \#\text{ of live samples}}$$

**Likelihood weighting**

[Figure: the same network; evidence: Call = c]

• P(c | A): 0.05, 0.8
• P(¬c | A): 0.95, 0.2

Samples are assignments to B, E, A, C, N with C fixed to the evidence; each sample carries a weight (0.8 or 0.05 in the figure).

$$P(b \mid c) = \frac{\text{weight of samples with } B = b}{\text{total weight of samples}}$$

**MCMC with Gibbs Sampling**

• Fix the values of observed variables
• Set the values of all non-observed variables randomly
• Perform a random walk through the space of complete variable assignments.
On each move:
  • Pick a variable X
  • Calculate Pr(X=true | all other variables)
  • Set X to true with that probability
• Repeat many times. The frequency with which any variable X is true is its posterior probability.
• Converges to true posterior when frequencies stop changing significantly
  • stable distribution, mixing

**Markov Blanket Sampling**

• How to calculate Pr(X=true | all other variables)?
• Recall: a variable is independent of all others given its Markov Blanket
  • parents
  • children
  • other parents of children
• So the problem becomes calculating Pr(X=true | MB(X))
  • We solve this sub-problem exactly
  • Fortunately, it is easy to solve

**Example**

[Figure: nodes A, B, C, X]

**Example**

[Figure: network over Smoking, Heart disease, Lung disease, Shortness of breath]

**Example**

• Evidence: s, b

**Example**

• Evidence: s, b
• Randomly set: h, g

**Example**

• Evidence: s, b
• Randomly set: h, g
• Sample H using P(H|s,g,b)
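The Gibbs sampling steps above can be sketched on the four-node example. This is an illustrative sketch, not the tutorial's implementation. The graph structure is an assumption inferred from P(H|s,g,b): Smoking → Heart disease, Smoking → Lung disease, and both diseases → Shortness of breath. All CPT numbers and the names `prob`, `blanket_score`, `gibbs` and `exact_posterior_h` are hypothetical. The sampler expects S and B to be given as evidence.

```python
import random

# Hypothetical CPTs (not from the tutorial), for the assumed structure
# S -> H, S -> G, H -> B <- G.
P_H_GIVEN_S = {True: 0.30, False: 0.10}  # P(h | S)
P_G_GIVEN_S = {True: 0.40, False: 0.05}  # P(g | S)
P_B_GIVEN_HG = {                         # P(b | H, G)
    (True, True): 0.95, (True, False): 0.70,
    (False, True): 0.80, (False, False): 0.05,
}


def prob(p_true, value):
    """Probability of `value` for a binary variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def blanket_score(var, state):
    """Unnormalised P(var = state[var] | Markov blanket), for var in {H, G}."""
    s, h, g, b = state["S"], state["H"], state["G"], state["B"]
    own_cpt = P_H_GIVEN_S if var == "H" else P_G_GIVEN_S
    # P(X | parents) times P(child | its parents): the only factors with X.
    return prob(own_cpt[s], state[var]) * prob(P_B_GIVEN_HG[(h, g)], b)


def gibbs(evidence, n_steps=50_000, burn_in=1_000, seed=0):
    """Estimate P(X = true | evidence) for each hidden X in {H, G}."""
    rng = random.Random(seed)
    state = dict(evidence)                     # fix observed variables
    hidden = [v for v in ("H", "G") if v not in evidence]
    for v in hidden:                           # random initial values
        state[v] = rng.random() < 0.5
    counts = {v: 0 for v in hidden}
    for step in range(n_steps + burn_in):
        var = rng.choice(hidden)               # pick a variable X
        p_true = blanket_score(var, {**state, var: True})
        p_false = blanket_score(var, {**state, var: False})
        state[var] = rng.random() < p_true / (p_true + p_false)
        if step >= burn_in:
            for v in hidden:
                counts[v] += state[v]
    return {v: counts[v] / n_steps for v in hidden}


def exact_posterior_h(s, b):
    """P(h | s, b) by enumeration over G, used only as a reference value."""
    def joint(h):
        return sum(
            prob(P_H_GIVEN_S[s], h) * prob(P_G_GIVEN_S[s], g)
            * prob(P_B_GIVEN_HG[(h, g)], b)
            for g in (True, False)
        )
    return joint(True) / (joint(True) + joint(False))


estimate = gibbs({"S": True, "B": True})
print("Gibbs estimate of P(h | s, b):", estimate["H"])
print("Exact P(h | s, b):            ", exact_posterior_h(True, True))
```

Because each update only needs the variable's own CPT and its children's CPTs, the per-step cost does not depend on the size of the rest of the network. The enumeration function is feasible here only because the example is tiny; it is included to show that the sampled frequency approaches the exact posterior.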