
Combining Inductive Logic Programming, Active Learning and Robotics to Discover the Function of Genes

By C.H. Bryant, S.H. Muggleton, S.G. Oliver, D.B. Kell, P. Reiser and R.D. King. Presenter: Mark H. Rich, 2/7/2003, University of Wisconsin - Madison


Presentation Transcript


  1. Combining Inductive Logic Programming, Active Learning and Robotics to Discover the Function of Genes, by C.H. Bryant, S.H. Muggleton, S.G. Oliver, D.B. Kell, P. Reiser and R.D. King. Presenter: Mark H. Rich, 2/7/2003, University of Wisconsin - Madison, CS 838: Learning and Modeling Biological Networks

  2. Discovering Gene Function • Yeast (S. cerevisiae) has 6,000 protein-encoding genes • Only 60% can be assigned a function with confidence • The cell is a biochemical machine • Logic can help us discover these metabolic functions and networks

  3. ASE-Progol Robot Scientist [Figure: diagram with the labeled components Background Knowledge, Learning Engine, Analysis, New Knowledge, Experiment Selection, Results]

  4. Outline • Introduction • Abduction and Active Learning • Functional Genomics • Metabolism in Logic • Experiments • Results

  5. Logic in AI • Deduction: given facts and a sound and complete proof theory, show that other facts can be proven • Induction: given positive and negative examples of facts, plus background knowledge, find a hypothesis that explains the difference between the positives and the negatives

  6. Abduction and TCIE (Theory Completion by Inverse Entailment) • Given a theory and partial facts, discover which facts are missing in order to form one consistent hypothesis • Lateral thinking puzzles: you are presented with a confusing situation; there is an Oracle that knows what happened; you can only ask yes-or-no questions

  7. The Mysterious Package • One day a man received a parcel in the post. Carefully packed was a human arm. He examined it, repacked it and then sent it on to another man. The second man also carefully examined the arm before taking it to the woods and burying it. Why did they do this?

  8. The Mysterious Package • Was the arm cut off intentionally? • Is the person who lost the arm still alive? • Is he a doctor? • Did the three men know each other? • Are the other men also missing an arm? • Were they once stranded on a desert island with no food, where they made a pact to each cut off an arm to eat and survive, but were rescued before the doctor could cut off his own arm, so that the doctor later fulfilled his commitment? YES!

  9. Lateral Thinking Lessons • Certain questions are valuable and lead to large leaps of information... • How do we form hypotheses? • How can we pick good questions? Weigh the probability that a question leads to consistent hypotheses against the cost of asking it • We want to find the quickest, cheapest path to consistent hypotheses

  10. Hypothesis Generation • Use contrapositives for inverse entailment
      Background knowledge:
        hasbeak(X) :- bird(X).
        bird(X) :- vulture(X).
      Example:
        hasbeak(tweety).
      Hypotheses:
        bird(tweety).
        bird(X).
        vulture(tweety).
        vulture(X).
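
The four hypotheses on slide 10 can be reproduced with a very small backward-chaining sketch. This is an illustration only: it walks the two background rules backwards from the example and is not Progol's inverse-entailment procedure. The names `RULES` and `abduce` are hypothetical, and the sketch assumes every rule has a single body literal with one argument.

```python
# Background knowledge as single-literal rules: head predicate <- body predicates.
# hasbeak(X) :- bird(X).   bird(X) :- vulture(X).
RULES = {"hasbeak": ["bird"], "bird": ["vulture"]}


def abduce(predicate, constant, rules=RULES):
    """List facts that, added to the rules, would let predicate(constant) be derived."""
    hypotheses = []
    seen = set()
    frontier = list(rules.get(predicate, []))
    while frontier:
        body = frontier.pop(0)
        if body in seen:  # guard against cyclic rule sets
            continue
        seen.add(body)
        # Ground candidate first, then its generalisation over a variable X.
        hypotheses.append(f"{body}({constant})")
        hypotheses.append(f"{body}(X)")
        frontier.extend(rules.get(body, []))
    return hypotheses


candidates = abduce("hasbeak", "tweety")
print(candidates)
```

Each candidate, once added to the background knowledge, lets `hasbeak(tweety)` be derived, which is the property the slide's hypotheses share.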

  11. Trial Selection Theory [Figure: tree of trials e1 and e2 with branches labeled t and f, leading to hypotheses H1, H2 and H3; caption: One possible trial path]

  12. Hypothesis Probability • Each trial partitions H into {H[t],H[t’]} • Assuming optimal encoding scheme… • Prior probability of each hypothesis • Compression is rounded f measure
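
The partition on slide 12 can be made concrete as follows. The slide's formulas did not survive in this transcript, so the sketch below only shows the partition itself and one natural way to turn priors into an outcome probability (summing the priors of the hypotheses that predict "true"). The priors, the trial names `e1` and `e2`, and the helper names are made-up assumptions.

```python
def partition(hypotheses, trial):
    """Split hypotheses into those predicting true and those predicting false."""
    h_true = [h for h in hypotheses if h["predicts"][trial]]
    h_false = [h for h in hypotheses if not h["predicts"][trial]]
    return h_true, h_false


def outcome_probability(hypotheses, trial):
    """Probability that the trial comes out true, weighted by hypothesis priors."""
    total = sum(h["prior"] for h in hypotheses)
    h_true, _ = partition(hypotheses, trial)
    return sum(h["prior"] for h in h_true) / total


# Illustrative hypotheses with invented priors and predictions.
H = [
    {"name": "H1", "prior": 0.5, "predicts": {"e1": True, "e2": True}},
    {"name": "H2", "prior": 0.25, "predicts": {"e1": False, "e2": True}},
    {"name": "H3", "prior": 0.25, "predicts": {"e1": False, "e2": False}},
]

h_t, h_f = partition(H, "e1")
print([h["name"] for h in h_t], [h["name"] for h in h_f])
print(outcome_probability(H, "e1"), outcome_probability(H, "e2"))
```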

  13. Experiment Cost • Ct is the cost of a trial t

  14. Functional Genomics • Want to learn the gene-enzyme mapping • Genes encode enzymes, which catalyze reactions between metabolites, which eventually create amino acid products • Perform auxotrophic growth experiments to determine phenotype

  15. Functional Genomics: Simple [Figure: pathway diagram with genes gene1, gene2, gene3; enzymes A, B, C; metabolites X, Y, Z; product Trp] • A, B and C are enzymes • X is a ubiquitous metabolite, Y and Z are optional • If we knock out gene2, we need to add nutrient Z to produce Trp
      We want to learn:
        codes(gene2, B, [Y], [Z])
      but can only ask:
        pheno_effect(gene2,[Y]) is false
        pheno_effect(gene2,[Z]) is true
        pheno_effect(gene2,[Y,Z]) is true
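
The knockout reasoning on slide 15 can be checked with a small reachability sketch. The chain X → Y → Z → Trp through enzymes A, B and C, and the assignments gene1 → A and gene3 → C, are assumptions read off the slide's labels; only gene2/B/[Y]/[Z] is stated outright. `reachable` and `makes_trp` are hypothetical helpers.

```python
# Assumed toy pathway: enzyme -> (reactants, products).
REACTIONS = {
    "A": ({"X"}, {"Y"}),
    "B": ({"Y"}, {"Z"}),
    "C": ({"Z"}, {"Trp"}),
}
# Assumed gene-enzyme mapping (only gene2 -> B is given on the slide).
GENE_TO_ENZYME = {"gene1": "A", "gene2": "B", "gene3": "C"}


def reachable(start, reactions):
    """Close a set of metabolites under the available reactions."""
    have = set(start)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= have and not products <= have:
                have |= products
                changed = True
    return have


def makes_trp(knocked_out_gene, medium):
    """Can the knockout strain still produce Trp on X plus the given nutrients?"""
    enzyme = GENE_TO_ENZYME[knocked_out_gene]
    remaining = {e: r for e, r in REACTIONS.items() if e != enzyme}
    return "Trp" in reachable({"X"} | set(medium), remaining)


for medium in (["Y"], ["Z"], ["Y", "Z"]):
    print(medium, makes_trp("gene2", medium))
```

Under these assumptions the three answers coincide with the truth values listed on the slide: the gene2 knockout cannot reach Trp from Y alone, but can once Z is supplied.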

  16. Aromatic amino acid pathway [Figure: pathway diagram; legend: aromatic amino acids, enzymes, metabolites]

  17. Metabolism in Logic
      • Hypotheses:
          codes('YDR254W', '4.2.1.11', ['C00631'], ['C00074']).
          codes('YDR254W', '5.3.1.24', ['C04302'], ['C01302']).
          etc ...
      • Background Knowledge:
          enzyme('4.2.1.11', ['C00631'], ['C00074']).
          enzyme('5.3.1.24', ['C04302'], ['C01302']).
          etc ...
          generated_by_other_pathways(['C00002', 'C00005', 'C00006', ... , 'C03356']).
          ends(['C00078', 'C00079', 'C00082']).

  18. Metabolism in Logic • What the Oracle answers:
        phenotypic_effect(ORF, Growth_medium) :-
            generated_by_other_pathways(Ubiquitous_metabolites),
            union(Ubiquitous_metabolites, Growth_medium, Starts),
            connected(Starts, Wild_products),
            ends(Ends),
            subset(Wild_products, Ends),
            enz(Enzyme, Reactants, Products),
            encodes(ORF, Enzyme, Reactants, Products),
            connected_without_this_step(Starts, Mutant_products, Enzyme, Reactants, Products),
            not(subset(Mutant_products, Ends)).
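
To see how the rule on slide 18 behaves, here is a Python paraphrase run on a three-step toy pathway rather than the real yeast data. Two readings are assumed and should be treated as such: the `subset` tests are taken to mean "every end product is reachable", and `connected` is taken to be a reachability closure over the enzyme steps. The toy facts (`UBIQUITOUS`, `ENDS`, `ENZ`, `ENCODES`) are invented stand-ins for the corresponding predicates.

```python
UBIQUITOUS = {"X"}   # plays the role of generated_by_other_pathways/1
ENDS = {"Trp"}       # plays the role of ends/1
ENZ = {              # plays the role of enz/3: enzyme -> (reactants, products)
    "A": ({"X"}, {"Y"}),
    "B": ({"Y"}, {"Z"}),
    "C": ({"Z"}, {"Trp"}),
}
ENCODES = {"gene2": ["B"]}  # plays the role of encodes/4 (toy assignment)


def connected(starts, steps):
    """Return every metabolite reachable from `starts` using the given steps."""
    have = set(starts)
    changed = True
    while changed:
        changed = False
        for reactants, products in steps:
            if reactants <= have and not products <= have:
                have |= products
                changed = True
    return have


def phenotypic_effect(orf, growth_medium):
    """True if the wild type reaches all ends but the knockout does not."""
    starts = UBIQUITOUS | set(growth_medium)
    wild_products = connected(starts, list(ENZ.values()))
    if not ENDS <= wild_products:
        return False  # the wild type cannot grow on this medium either
    for enzyme in ENCODES.get(orf, []):
        steps = [step for name, step in ENZ.items() if name != enzyme]
        if not ENDS <= connected(starts, steps):
            return True  # removing this step blocks an end product
    return False


print(phenotypic_effect("gene2", ["Y"]), phenotypic_effect("gene2", ["Z"]))
```

Read this way, the predicate is true when the knockout fails to grow, so on this toy pathway its values are the complement of the `pheno_effect` values listed on slide 15 for the same media.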

  19. Experiments • Learn the function of 17 genes by removing the ORF • Growth media: 13 optional nutrients, at most 3 at a time, giving 378 possible experiments for each ORF • Cost of optional nutrients: determined from the www.sigmaaldrich.com catalog • Strategies for comparison: Random, Naïve Cheapest, ASE-Progol
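
The figure of 378 experiments per ORF on slide 19 follows from counting every growth medium that uses 0, 1, 2 or 3 of the 13 optional nutrients. The short check below computes the count in closed form and by enumeration; the nutrient names are placeholders, since the slide does not list them.

```python
from itertools import combinations
from math import comb

N_NUTRIENTS = 13
MAX_PER_MEDIUM = 3

# Closed form: C(13,0) + C(13,1) + C(13,2) + C(13,3) = 1 + 13 + 78 + 286.
count = sum(comb(N_NUTRIENTS, k) for k in range(MAX_PER_MEDIUM + 1))

# Enumeration with placeholder nutrient names.
nutrients = [f"nutrient_{i}" for i in range(1, N_NUTRIENTS + 1)]
media = [m for k in range(MAX_PER_MEDIUM + 1) for m in combinations(nutrients, k)]

print(count, len(media))  # 378 378
```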

  20. Experiments • Remove all codes(…) facts • Loop: generate a random sample of trials; generate hypotheses using Theory Completion by Inverse Entailment; find the trial with minimum EC(H,T) and perform it; add the results to the known examples • Repeat until the hypotheses are consistent with the trials
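
The loop on slide 20 can be sketched as below. Several simplifications are assumed: the hypothesis set and the trial set are fixed up front instead of being regenerated each round, the oracle is a lookup table instead of a wet-lab experiment, and `score` is a stand-in for EC(H,T) (trial cost plus mean trial cost times the expected remaining entropy), not the formula from the talk. All names, priors and costs are invented.

```python
import math


def entropy(priors):
    """Shannon entropy (bits) of the normalised priors."""
    total = sum(priors)
    return -sum((p / total) * math.log2(p / total) for p in priors if p > 0)


def score(hypotheses, trial, costs):
    """Stand-in for EC(H,T): cost now plus expected uncertainty left afterwards."""
    h_true = [h["prior"] for h in hypotheses if h["predicts"][trial]]
    h_false = [h["prior"] for h in hypotheses if not h["predicts"][trial]]
    p_true = sum(h_true) / (sum(h_true) + sum(h_false))
    mean_cost = sum(costs.values()) / len(costs)
    left = p_true * entropy(h_true) + (1 - p_true) * entropy(h_false)
    return costs[trial] + mean_cost * left


def run(hypotheses, costs, oracle):
    """Perform trials until one hypothesis remains or no trial can discriminate."""
    remaining, performed, spent = set(costs), [], 0.0
    while len(hypotheses) > 1:
        informative = sorted(
            t for t in remaining
            if len({h["predicts"][t] for h in hypotheses}) == 2
        )
        if not informative:
            break
        trial = min(informative, key=lambda t: score(hypotheses, t, costs))
        result = oracle[trial]            # "perform" the experiment
        spent += costs[trial]
        performed.append(trial)
        hypotheses = [h for h in hypotheses if h["predicts"][trial] == result]
        remaining.discard(trial)
    return hypotheses, performed, spent


# Toy setting: which enzyme does the gene code for? Trials are media with Y or Z added.
HYPS = [
    {"name": "codes A", "prior": 1 / 3, "predicts": {"Y": True, "Z": True}},
    {"name": "codes B", "prior": 1 / 3, "predicts": {"Y": False, "Z": True}},
    {"name": "codes C", "prior": 1 / 3, "predicts": {"Y": False, "Z": False}},
]
COSTS = {"Y": 1.0, "Z": 5.0}      # invented nutrient costs
ORACLE = {"Y": False, "Z": True}  # answers consistent with "codes B"

final, performed, spent = run(HYPS, COSTS, ORACLE)
print([h["name"] for h in final], performed, spent)
```

In this toy run the cheap trial is chosen first because it costs less and leaves the same expected uncertainty as the expensive one; a second trial is then needed to separate the two surviving hypotheses.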

  21. Results: Cost

  22. Results: Time

  23. Conclusions and Future Work • ASE-Progol finds hypotheses inexpensively and quickly • 5 of 17 genes had only negative examples. Why? Look into inhibitors and nonmonotonic logics. • Answers are limited to yes/no. Probabilities? • Can this be applied to gene regulatory networks, using microarray technology? • What other networks have similar frameworks?
