
Bayesian Learning



  1. Bayesian Learning • Introduction • Bayes Theorem • Maximum Likelihood Estimation • Bayes Optimal Classifier and Naïve Bayes • Bayesian Belief Networks

  2. Bayesian Belief Networks A Bayesian Belief Network is a method to describe the joint probability distribution of a set of variables. Let x1, x2, …, xn be a set of variables or features. A Bayesian Belief Network (BBN) tells us the probability of any combination of values of x1, x2, …, xn. As with Naïve Bayes, we make some independence assumptions, but they are weaker than assuming that all variables are independent of one another.

  3. An Example A set of Boolean variables and their relations. [Figure: a directed acyclic graph over Storm, BusTourGroup, Lightning, Campfire, Thunder, and ForestFire; Storm and BusTourGroup point to Campfire, Storm points to Lightning, Lightning points to Thunder, and Lightning, Storm, and Campfire point to ForestFire.]

  4. Conditional Probabilities The table gives P(Campfire | Storm, BusTourGroup) for each assignment of the parents (S = Storm, B = BusTourGroup, C = Campfire):

           S,B    S,~B   ~S,B   ~S,~B
      C    0.4    0.1    0.8    0.2
      ~C   0.6    0.9    0.2    0.8

  [Figure: Storm and BusTourGroup with arrows into Campfire.]
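One way to make the table concrete: a minimal Python sketch that stores the CPT above as a dictionary keyed by the parent assignment. The names (campfire_cpt, p_campfire) are ours, not from the slides.

```python
# The Campfire CPT from the slide, keyed by (Storm, BusTourGroup).
# Values are P(Campfire = True | parents); the False case is the complement.
campfire_cpt = {
    (True,  True):  0.4,   # S,  B
    (True,  False): 0.1,   # S,  ~B
    (False, True):  0.8,   # ~S, B
    (False, False): 0.2,   # ~S, ~B
}

def p_campfire(value, storm, bus_tour_group):
    """Return P(Campfire = value | Storm, BusTourGroup)."""
    p_true = campfire_cpt[(storm, bus_tour_group)]
    return p_true if value else 1.0 - p_true

print(p_campfire(True, True, True))   # 0.4, the S,B column
```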

  5. Conditional Independence We say x1 is conditionally independent of x2 given x3 if the probability of x1 is independent of x2 once x3 is given: P(x1|x2,x3) = P(x1|x3). The same can be said for sets of variables: x1,x2,x3 is conditionally independent of y1,y2,y3 given z1,z2,z3 if: P(x1,x2,x3|y1,y2,y3,z1,z2,z3) = P(x1,x2,x3|z1,z2,z3)
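To see the definition numerically, here is a small sketch with a made-up joint distribution over three Boolean variables that is conditionally independent by construction; it checks that P(x1|x2,x3) equals P(x1|x3).

```python
# Made-up distributions; x1 and x2 are conditionally independent
# given x3 by construction of the joint below.
p_x3 = {True: 0.3, False: 0.7}
p_x1_given_x3 = {True: 0.9, False: 0.2}   # P(x1=True | x3)
p_x2_given_x3 = {True: 0.6, False: 0.4}   # P(x2=True | x3)

def joint(x1, x2, x3):
    """P(x1, x2, x3) = P(x3) P(x1|x3) P(x2|x3)."""
    p1 = p_x1_given_x3[x3] if x1 else 1 - p_x1_given_x3[x3]
    p2 = p_x2_given_x3[x3] if x2 else 1 - p_x2_given_x3[x3]
    return p_x3[x3] * p1 * p2

# P(x1=True | x2=True, x3=True) comes out to 0.9 = P(x1=True | x3=True).
num = joint(True, True, True)
den = num + joint(False, True, True)
print(num / den)   # 0.9
```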

  6. Representation A BBN represents the joint probability distribution of a set of variables by explicitly indicating the assumptions of conditional independence through the following: • A directed acyclic graph, and • Local conditional probabilities. [Figure: the Storm/BusTourGroup/Campfire fragment, with one node marked "variable" and its table marked "conditional probabilities".]

  7. Representation Each variable is conditionally independent of its nondescendants given its immediate predecessors. We say x1 is a descendant of x2 if there is a directed path from x2 to x1. Example: the predecessors of Campfire are Storm and BusTourGroup (Campfire is a descendant of these two variables). Campfire is conditionally independent of Lightning given its predecessors.

  8. Joint Probability Distribution To compute the joint probability distribution of a set of variables given a Bayesian Belief Network, we simply use the following formula: P(x1,x2,…,xn) = Πi P(xi | Parents(xi)), where Parents(xi) are the immediate predecessors of xi in the graph.

  9. Joint Probability Distribution Example: P(Campfire, Storm, BusTourGroup, Lightning, Thunder, ForestFire) = P(Storm) P(BusTourGroup) P(Campfire|Storm,BusTourGroup) P(Lightning|Storm) P(Thunder|Lightning) P(ForestFire|Lightning,Storm,Campfire).
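A runnable sketch of this factorization follows. Only the Campfire CPT comes from the slides; every other number is an illustrative placeholder, since the slides do not give the remaining tables.

```python
# Local probabilities for the network. Only p_campfire matches the
# slides; the rest are made-up placeholders for illustration.
p_storm = 0.1                              # P(Storm = True), assumed
p_bus = 0.5                                # P(BusTourGroup = True), assumed
p_lightning = {True: 0.7, False: 0.01}     # P(Lightning=T | Storm), assumed
p_thunder = {True: 0.95, False: 0.05}      # P(Thunder=T | Lightning), assumed
p_campfire = {(True, True): 0.4, (True, False): 0.1,
              (False, True): 0.8, (False, False): 0.2}   # from slide 4
p_fire = {(True, True, True): 0.9,  (True, True, False): 0.8,
          (True, False, True): 0.7, (True, False, False): 0.6,
          (False, True, True): 0.3, (False, True, False): 0.2,
          (False, False, True): 0.1, (False, False, False): 0.01}
# p_fire keys are (Lightning, Storm, Campfire); values are
# P(ForestFire=T | L, S, C), assumed.

def bernoulli(p_true, value):
    """P(X = value) when P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(s, b, c, l, t, f):
    """P(Storm=s, Bus=b, Campfire=c, Lightning=l, Thunder=t, Fire=f)."""
    return (bernoulli(p_storm, s) * bernoulli(p_bus, b)
            * bernoulli(p_campfire[(s, b)], c)
            * bernoulli(p_lightning[s], l)
            * bernoulli(p_thunder[l], t)
            * bernoulli(p_fire[(l, s, c)], f))

print(joint(True, True, True, True, True, True))   # one entry of the full joint
```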

  10. Joint Distribution, An Example [Figure: the full network over Storm, BusTourGroup, Lightning, Campfire, Thunder, and ForestFire, annotated with its factorization:] P(Storm) P(BusTourGroup) P(Campfire|Storm,BusTourGroup) P(Lightning|Storm) P(Thunder|Lightning) P(ForestFire|Lightning,Storm,Campfire).

  11. Conditional Probabilities, An Example Using the Campfire table again:

           S,B    S,~B   ~S,B   ~S,~B
      C    0.4    0.1    0.8    0.2
      ~C   0.6    0.9    0.2    0.8

  Reading off the S,B column: P(Campfire=true | Storm=true, BusTourGroup=true) = 0.4.

  12. Inference and Learning What is the connection between a BBN and classification? Suppose one of the variables is the target variable. Can we compute the probability of the target variable given the other variables? In Naïve Bayes: [Figure: class node C with arrows to each feature x1, x2, …, xn.] P(x1,x2,…,xn,c) = P(c) P(x1|c) P(x2|c) … P(xn|c)
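As a small illustration of this factorization, the sketch below computes P(x1, …, xn, c) for two Boolean features; all probabilities here are made up.

```python
# Illustrative Naive Bayes parameters (not from the slides).
p_c = {True: 0.6, False: 0.4}          # class prior P(c)
p_x_given_c = [                        # one table per feature
    {True: 0.8, False: 0.3},           # P(x1=True | c)
    {True: 0.1, False: 0.5},           # P(x2=True | c)
]

def naive_bayes_joint(x, c):
    """P(x1, ..., xn, c) = P(c) * prod_i P(xi | c)."""
    prob = p_c[c]
    for xi, table in zip(x, p_x_given_c):
        p_true = table[c]
        prob *= p_true if xi else 1.0 - p_true
    return prob

# Classify by comparing the joint for c=True against c=False.
x = (True, False)
print(max([True, False], key=lambda c: naive_bayes_joint(x, c)))   # True
```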

  13. General Case In the general case we can use a BBN to specify independence assumptions among the variables. General case: [Figure: class node C and features x1, x2, x3, x4, with additional edges among the features.] P(x1,x2,x3,x4,c) = P(c) P(x1|c) P(x2|c) P(x3|x1,x2,c) P(x4|c)

  14. Learning Belief Networks We can learn a BBN in different ways. Two basic approaches follow: • Assume we know the network structure: we can then estimate the conditional probabilities for each variable from the data (a counting sketch follows below). • Assume we know part of the structure but some variables are missing: this is like learning hidden units in a neural network; one can use a gradient ascent method to train the BBN.
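For the first approach (known structure, complete data), estimating each local table reduces to counting. A minimal sketch with made-up records, estimating P(Campfire | Storm, BusTourGroup):

```python
from collections import Counter

# Made-up complete records of (storm, bus_tour_group, campfire).
data = [
    (True, True, True), (True, True, False), (True, True, True),
    (False, True, True), (False, True, True), (False, True, False),
]

parent_counts = Counter((s, b) for s, b, _ in data)      # N(S, B)
true_counts = Counter((s, b) for s, b, c in data if c)   # N(S, B, C=True)

def estimate_p_campfire(storm, bus):
    """Maximum likelihood estimate of P(Campfire=True | Storm, BusTourGroup)."""
    return true_counts[(storm, bus)] / parent_counts[(storm, bus)]

print(estimate_p_campfire(True, True))   # 2/3 on the toy data above
```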

  15. Learning Belief Networks • Assume nothing is known: we can learn both the structure and the conditional probabilities by searching the space of possible networks.

  16. The EM Algorithm • EM is a method to learn in the presence of unobserved variables. • It is based on the following idea. Let x1, x2, …, xn be a set of hidden variables whose actual values we do not know. • Start with some hypothesis h about the model. • Iterate: calculate the expected values of x1, x2, …, xn assuming hypothesis h is true; then calculate a new hypothesis h' assuming the value taken by each hidden variable is its expected value.
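A compact sketch of this loop on a classic toy problem: two coins of unknown bias, where the coin used in each trial is the hidden variable. The data, the starting hypothesis, and the equal prior over the two coins are all assumptions for illustration.

```python
import math

data = [9, 8, 4, 5, 7, 3, 8, 9, 2, 4]   # heads observed out of n=10 flips per trial
n = 10
theta = [0.6, 0.5]                      # initial hypothesis h = (bias of coin A, coin B)

def likelihood(heads, p):
    """Binomial probability of `heads` heads in n flips with bias p."""
    return math.comb(n, heads) * p**heads * (1 - p)**(n - heads)

for _ in range(20):
    # E-step: expected responsibility of coin A for each trial under h
    # (the two coins are assumed a priori equally likely).
    resp = []
    for heads in data:
        la, lb = likelihood(heads, theta[0]), likelihood(heads, theta[1])
        resp.append(la / (la + lb))
    # M-step: new hypothesis h' from the expected head and flip counts.
    heads_a = sum(r * k for r, k in zip(resp, data))
    heads_b = sum((1 - r) * k for r, k in zip(resp, data))
    flips_a = sum(r * n for r in resp)
    flips_b = n * len(data) - flips_a
    theta = [heads_a / flips_a, heads_b / flips_b]

print(theta)   # the two biases separate toward the clusters in the data
```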

  17. Summary • Bayesian Belief Networks provide a framework to compute the joint probability distribution of a set of variables. • A BBN explicitly indicates independence relations. • Applied to learning, a BBN can serve to predict the value of a target variable. • A BBN can be learned from data even if the network structure is not known.
