
Using Bayesian Networks to Analyze Expression Data

N. Friedman, M. Linial, I. Nachman, D. Pe’er @ Hebrew University.

tareq


  1. Using Bayesian Networks to Analyze Expression Data N. Friedman, M. Linial, I. Nachman, D. Pe’er @ Hebrew University

  2. What I will cover • Domain background • Overview of their work • Causal networks vs. Bayes networks • Application • Results

  3. Background information

  4. What is gene expression? • It is the process by which the information in a gene is used to synthesize a functional gene product (protein or RNA). • Think of it as a menu for a dinner on a particular holiday. • You need certain ingredients and dishes to pull it off right. • Too much or too little of something can lead to odd results.

  5. Advances in technology led to DNA microarrays. • A snapshot of the internals of a cell at a given moment in time. • No more having to look at one gene at a time for comparison. • Most computational analysis has focused on clustering algorithms. • Cluster genes with similar expression patterns together. • Useful for finding co-regulated genes, but not for finding the structure of the regulation process.

  6. Overview

  7. Overview • Goal: discover key relations in cellular systems given large amounts of microarray data. • The authors propose a Bayesian network framework for discovering gene interactions from microarray data. • The framework models statistical dependencies between genes. • Interactions are inferred from multiple expression measurements.

  8. Overview • Want to uncover properties of the network by examining dependence and conditional independence in the gene data. • For example, how does one gene interact with another? • In some cases this information can be used to infer causal influence.

  9. Bayes nets

  10. Bayesian Network

  11. Bayesian Network • Useful for a few reasons • Well suited to describing locally interacting entities. • Well-understood algorithms with successful use in many areas. • Can be used to infer a causal network, even though Bayesian networks are not mathematically defined as causal. • Able to handle noise fairly well.

  12. Causal Network • Very similar to a typical Bayesian network. • A Bayesian network with the strict requirement that the relationships are causal. • An edge from X to Y means that X has a causal effect on Y. • If many learned networks share the same directed path from X to Y, that may indicate a causal relationship between X and Y.

  13. Bayes vs. Causal • A Bayesian network generally deals with dependence. • A causal network deals with strict cause-and-effect relationships. • Bayesian networks can have equivalent networks. • X → Y is equivalent to Y → X. • Causal network • The above equivalence cannot hold, by the definition of a causal network.
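The equivalence of X → Y and Y → X can be checked numerically. The sketch below uses a made-up joint distribution over two binary variables (the numbers are illustrative, not from the paper) and shows that both edge directions reproduce the same joint distribution, so observational data alone cannot distinguish them.

```python
from itertools import product

# Hypothetical joint distribution P(X, Y) over two binary variables.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def marginal(joint, axis):
    """Sum the joint over the other variable (axis 0 = X, axis 1 = Y)."""
    out = {0: 0.0, 1: 0.0}
    for states, p in joint.items():
        out[states[axis]] += p
    return out

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

states = list(product((0, 1), repeat=2))

# Network X -> Y factorizes the joint as P(X) * P(Y | X).
x_to_y = {(x, y): p_x[x] * (joint[(x, y)] / p_x[x]) for x, y in states}

# Network Y -> X factorizes the joint as P(Y) * P(X | Y).
y_to_x = {(x, y): p_y[y] * (joint[(x, y)] / p_y[y]) for x, y in states}

# Both factorizations describe the same distribution.
same = all(abs(x_to_y[s] - y_to_x[s]) < 1e-12 for s in states)
print(same)
```

A causal reading would need more than this: the two graphs make different predictions only once one of the variables is set by intervention.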

  14. Learning Causal Patterns • Need to determine a causal interpretation of the network. • Observation • Passive domain measurement. • Intervention • Setting variable values using outside forces.

  15. Causal Markov assumption • Given the values of a variable's immediate causes, it is independent of its earlier causes. • Once we know the state of a gene's parents, its more distant ancestors no longer matter for that gene.

  16. Analyzing Expression Data • Consider distributions over all possible states of the system (these can include environmental states, etc.). • The state of the system is described by a set of random variables. • Each random variable denotes the expression level of one gene. • Take all of these variables and build the joint distribution.
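For reference, this is the standard way a Bayesian network writes that joint distribution. The notation is assumed here rather than taken from the slides: X_1, ..., X_n are the expression-level variables and Pa^G(X_i) is the set of parents of X_i in the graph G.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}^{G}(X_i)\bigr)
```

Each factor depends only on a gene and its parents, which is why a sparse network stays manageable even with thousands of variables.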

  17. Learning from expression data is difficult because it involves transcript levels from thousands of genes! • However, these gene networks are sparse, so Bayes nets are still well suited.

  18. Learning the model • Markov relations are features that indicate whether two genes are related in a joint biological process. • Order relations are features that capture a global property of the network. • They are used as an indication of some causality between X and Y. It's not certain, though.

  19. Confidence of features • Produce m different networks G_1, ..., G_m and, for each feature f of interest, calculate its confidence: conf(f) = (1/m) Σ_{i=1..m} f(G_i). • Here f(G) is 1 if f is a feature of G, and 0 otherwise.
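A minimal sketch of the confidence calculation, assuming each learned network is represented as a set of directed edges. The gene names and the four networks below are hypothetical; the step that actually learns a network from resampled data is omitted.

```python
def feature_confidence(networks, feature):
    """Fraction of networks in which the feature holds: (1/m) * sum of f(G_i)."""
    return sum(1 for g in networks if feature(g)) / len(networks)

# Hypothetical output of m = 4 learning runs; each network is a set of edges.
networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("B", "A"), ("B", "C")},
    {("A", "B"), ("C", "B")},
]

def has_edge_a_to_b(g):
    """Indicator feature: 1 (True) if the directed edge A -> B is in the network."""
    return ("A", "B") in g

def adjacent_b_c(g):
    """Indicator feature: B and C are linked in either direction."""
    return ("B", "C") in g or ("C", "B") in g

conf_ab = feature_confidence(networks, has_edge_a_to_b)
conf_bc = feature_confidence(networks, adjacent_b_c)
print(conf_ab, conf_bc)
```

Features that appear in most of the networks get a confidence near 1, and features that appear only occasionally get a confidence near 0.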

  20. Learning the network structure • Issues • Extremely large search space (super-exponential in the number of variables). • Need to identify potential parents for each gene using simple statistics before building the network. • This restricts the search to networks in which each variable Xi draws its parents only from its candidate set.
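The candidate-parent idea can be sketched as follows. The slides only say "simple statistics", so absolute Pearson correlation is a stand-in chosen for this example, not necessarily the measure the authors use. The function names, gene names, and expression values are all hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def candidate_parents(expr, k):
    """For each gene, keep the k other genes with the highest |correlation|."""
    candidates = {}
    for gene in expr:
        scored = sorted(
            ((abs(pearson(expr[gene], expr[other])), other)
             for other in expr if other != gene),
            reverse=True,
        )
        candidates[gene] = [other for _, other in scored[:k]]
    return candidates

# Hypothetical expression levels for four genes across five experiments.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks g1 closely
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],  # only weakly related to g1
    "g4": [5.0, 4.1, 2.9, 2.0, 1.1],  # strongly anti-correlated with g1
}

cands = candidate_parents(expr, k=2)
print(cands["g1"])
```

A structure search would then only consider edges into a gene from its candidate list, which shrinks the search space considerably when k is small.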

  21. Different local probability models • Multinomial model • Treat each variable as discrete and learn a multinomial distribution that describes the possible states of each child given the state of its parents. • Linear Gaussian model • A linear regression model for the child given its parents.
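Both local models can be sketched for the simplest case of a child with a single parent. Everything below is illustrative: the function names are hypothetical, the discrete states -1, 0, 1 are an assumed encoding of under-expressed, normal, and over-expressed, and the estimates are plain maximum-likelihood fits with no prior.

```python
from collections import Counter, defaultdict

def fit_multinomial(parent_states, child_states):
    """Estimate P(child | parent) by counting co-occurrences of discrete states."""
    counts = defaultdict(Counter)
    for p, c in zip(parent_states, child_states):
        counts[p][c] += 1
    return {
        p: {c: k / sum(cnt.values()) for c, k in cnt.items()}
        for p, cnt in counts.items()
    }

def fit_linear_gaussian(parent, child):
    """Least-squares fit of child = intercept + slope * parent + Gaussian noise."""
    n = len(parent)
    mean_p, mean_c = sum(parent) / n, sum(child) / n
    slope = (
        sum((p - mean_p) * (c - mean_c) for p, c in zip(parent, child))
        / sum((p - mean_p) ** 2 for p in parent)
    )
    intercept = mean_c - slope * mean_p
    residuals = [c - (intercept + slope * p) for p, c in zip(parent, child)]
    variance = sum(r * r for r in residuals) / n
    return intercept, slope, variance

# Discrete example with assumed states -1, 0, 1.
cpt = fit_multinomial([-1, -1, 0, 1, 1, 1], [-1, -1, 0, 1, 1, 0])

# Continuous example: the child is exactly 1 + 2 * parent, so the variance is 0.
intercept, slope, variance = fit_linear_gaussian([0.0, 1.0, 2.0, 3.0],
                                                 [1.0, 3.0, 5.0, 7.0])
print(cpt[1], intercept, slope, variance)
```

The multinomial model can capture arbitrary relationships between discrete states but needs discretized data, while the linear Gaussian model works on continuous values but only captures linear dependence.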

  22. Results • Applied to cell cycle expression patterns. • 76 gene expression measurements. • Each measurement is treated as an independent sample. • Ran the bootstrap procedure together with the sparse search algorithm to extract learned features. • Run on only 250 genes.

  23. Test robustness • Tested their confidence assessment on a randomized data set, created by randomly permuting the order of the experiments independently for each gene. • The randomized data did not perform well, because it contains no real relationships between genes to find. • This tells us that the learned features are not artifacts of the bootstrap estimation.
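The randomization step can be sketched like this, assuming the data is held as a mapping from gene name to a list of expression values ordered by experiment. The function name, seed parameter, and data are hypothetical.

```python
import random

def permute_per_gene(expr, seed=0):
    """Shuffle each gene's values across experiments, independently per gene."""
    rng = random.Random(seed)
    permuted = {}
    for gene, values in expr.items():
        shuffled = list(values)  # copy so the original data is left untouched
        rng.shuffle(shuffled)
        permuted[gene] = shuffled
    return permuted

# Hypothetical expression levels for two genes across five experiments.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],
}

randomized = permute_per_gene(expr, seed=42)
print(randomized)
```

Each gene keeps exactly the same set of values, so its individual distribution is unchanged, but any relationship between genes across experiments is broken. Features that still score high confidence on such data would point to an artifact of the procedure.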

  24. Managed to extract plausible biological knowledge without the use of prior biological knowledge. • The framework builds a much "richer" structure from the data than clustering techniques do. • Capable of discovering causal relationships between genes from expression data.
