
Bioinformatics in the Department of Computer Science

Bioinformatics in the Department of Computer Science. Lenwood S. Heath Department of Computer Science Blacksburg, VA 24061. College of Engineering Northern Virginia Engineering Showcase March 5, 2004. Bioinformatics Faculty. Layne Watson. Cliff Shaffer. Naren Ramakrishnan.


Presentation Transcript


  1. Bioinformatics in the Department of Computer Science Lenwood S. Heath Department of Computer Science Blacksburg, VA 24061 College of Engineering Northern Virginia Engineering Showcase March 5, 2004

  2. Bioinformatics Faculty
     Layne Watson, Cliff Shaffer, Naren Ramakrishnan, Alexey Onufriev, Roger Ehrich, Eunice Santos, Chris North, Adrian Sandu, Lenny Heath, T. M. Murali, Joao Setubal (CS and VBI)

  3. Relevant Expertise
     • Algorithms: Heath, Santos, Setubal, Shaffer, Watson
     • Computational structural biology: Onufriev, Sandu
     • Computational systems biology: Murali
     • Data mining: Ramakrishnan
     • Genomics: Heath, Murali, Ramakrishnan
     • Human-computer interaction, visualization: North
     • Image processing: Ehrich, Watson
     • High performance computing: Sandu, Santos, Watson
     • Numerical analysis: Onufriev, Watson
     • Optimization: Watson
     • Problem solving environments: Ramakrishnan, Shaffer
     3/5/2004 Bioinformatics in Computer Science

  4. Selected Collaborations
     • Virginia Tech: Biochemistry, Biology, Fralin Biotechnology Center, Plant Physiology, Veterinary Medicine, Virginia Bioinformatics Institute (VBI), Wood Science
     • North Carolina State University: Forest Biotechnology Center
     • Duke: Biology
     • University of Illinois: Plant Biology

  5. Selected Funding
     • NSF IBN 0219322: ITR: Understanding Stress Resistance Mechanisms in Plants: Multimodal Models Integrating Experimental Data, Databases, and the Literature. L. S. Heath; R. Grene, B. I. Chevone, N. Ramakrishnan, L. T. Watson. $499,973.
     • NSF EIA-01903660: A Microarray Experiment Management System. N. Ramakrishnan, L. S. Heath, L. T. Watson, R. Grene, J. W. Weller (VBI). $600,000.
     • DARPA N00014-01-1-0852: Dryophile Genes to Engineer Stasis-Recovery of Human Cells. M. Potts, L. S. Heath, R. F. Helm, N. Ramakrishnan, T. O. Sitz, F. Bloom, P. Price (Life Technologies), J. Battista (LSU). $4,532,622.
     • NSF MCB-0083315: Biocomplexity - Incubation Activity: A Collaborative Problem Solving Environment for Computational Modeling of Eukaryotic Cell Cycle Controls. J. J. Tyson, L. T. Watson, N. Ramakrishnan, C. A. Shaffer, J. C. Sible. $99,965.
     • NIH 1 R01 GM64339-01: Problem Solving Environment for Modeling the Cell Cycle. J. J. Tyson, J. Sible, K. Chen, L. T. Watson, C. A. Shaffer, N. Ramakrishnan, P. Mendes (VBI). $211,038.
     • Air Force Research Laboratory F30602-01-2-0572: The Eukaryotic Cell Cycle as a Test Case for Modeling Cellular Regulation in a Collaborative Problem Solving Environment. J. J. Tyson, J. C. Sible, K. C. Chen, L. T. Watson, C. A. Shaffer, N. Ramakrishnan. $1,650,000.

  6. Research Resources
     System X
     • Third fastest computer on the planet
     Laboratory for Advanced Scientific Computing & Applications (LASCA)
     • Parallel algorithms & math software
     • Anantham Cluster
     • Grid computing
     Bioinformatics Research LAN
     • Linux, Mac OS X, Windows
     • Bioinformatics databases and analysis

  7. JigCell: A PSE for Eukaryotic Cell Cycle Controls
     Marc Vass, Nick Allen, Jason Zwolak, Dan Moisa, Clifford A. Shaffer, Layne T. Watson, Naren Ramakrishnan, and John J. Tyson
     Departments of Computer Science and Biology

  8. Computational Molecular Biology
     [Figure: levels of description, from sequence to cell physiology]
     • DNA: …TACCCGATGGCGAAATGC...
     • mRNA: …AUGGGCUACCGCUUUACG...
     • Protein: …Met - Gly - Tyr - Arg - Phe - Thr...
     • Enzyme (diagram labels: -P, ATP, ADP)
     • Reaction network (diagram labels: X, Y, Z, E1, E2, E3, E4)
     • Cell physiology
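The DNA, mRNA, and protein fragments on this slide are related by transcription and translation. As a minimal sketch, the Python below reproduces the slide's mRNA and peptide from its DNA string using the standard genetic code. It assumes the DNA fragment is the template strand, written so that base-by-base complementation yields the mRNA in reading order. The function names are illustrative and do not come from the presentation.

```python
# Standard genetic code, with codons ordered by base U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def transcribe(template_dna: str) -> str:
    """Complement each template-strand base to get the mRNA (assumed orientation)."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_dna.upper())


def translate(mrna: str) -> list[str]:
    """Read codons from position 0 and stop at the first stop codon."""
    peptide = []
    for start in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = CODON_TABLE[mrna[start:start + 3]]
        if residue == "*":
            break
        peptide.append(THREE_LETTER[residue])
    return peptide


mrna = transcribe("TACCCGATGGCGAAATGC")
print(mrna)                         # AUGGGCUACCGCUUUACG
print(" - ".join(translate(mrna)))  # Met - Gly - Tyr - Arg - Phe - Thr
```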

  9. Cell Cycle of Budding Yeast
     [Figure: wiring diagram of the budding yeast cell cycle control network. Node labels: Cln2, Cln3, Clb2, Clb5, Sic1, Sic1-P, Cdc20, Cdc14, Cdc15, Cdh1, Pds1, Esp1, PPX, Lte1, Tem1, Bub2, Mad2, Net1, Net1P, RENT, SBF, MBF, Mcm1, Swi5, Bck2, APC, SCF, CDKs. Event labels: growth, budding, DNA synthesis, unaligned chromosomes, sister chromatid separation.]

  10. JigCell Problem-Solving Environment
      [Figure: JigCell components] Experimental Database; Wiring Diagram; Differential Equations; Parameter Values; Simulation; Analysis; Visualization; Automatic Parameter Estimation
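The chain from wiring diagram to differential equations to parameter values to simulation can be illustrated with a toy example. The sketch below is not JigCell and not the budding yeast model. It assumes a hypothetical one-reaction wiring diagram X → Y with mass-action rate constant k, writes the corresponding equations dX/dt = -kX and dY/dt = kX, and integrates them with a fixed-step forward Euler scheme. All names and values are illustrative.

```python
import math


def simulate(x0: float, y0: float, k: float, t_end: float, dt: float):
    """Forward-Euler integration of dX/dt = -k*X, dY/dt = k*X."""
    steps = round(t_end / dt)
    x, y = x0, y0
    trajectory = [(0.0, x, y)]
    for n in range(1, steps + 1):
        flux = k * x      # mass-action rate of the reaction X -> Y
        x -= flux * dt    # X is consumed
        y += flux * dt    # Y is produced at the same rate
        trajectory.append((n * dt, x, y))
    return trajectory


# Parameter values (hypothetical): k = 0.5 per time unit, X starts at 1.0.
trajectory = simulate(x0=1.0, y0=0.0, k=0.5, t_end=4.0, dt=0.001)
t_final, x_final, y_final = trajectory[-1]
exact_x = math.exp(-0.5 * t_final)  # analytic solution for comparison
print(f"t={t_final:.1f} X={x_final:.4f} exact={exact_x:.4f} X+Y={x_final + y_final:.4f}")
```

Comparing the simulated value with the analytic solution plays the role of the analysis step. A realistic model would use an adaptive ODE solver and estimate k from experimental data.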

  11. Why do these calculations?
      • Is the model “yeast-shaped”?
      • Bioinformatics role: the model organizes experimental information.
      • New science: prediction, insight
      JigCell is part of the DARPA BioSPICE suite of software tools for computational cell biology.

  12. Expresso: A Next Generation Software System for Microarray Experiment Management and Data Analysis

  13. Expresso: A Problem Solving Environment (PSE) for Microarray Experiment Design and Analysis
      • Integration of design, experimentation, and analysis
      • Data mining; inductive logic programming (ILP)
      • Closing the loop
      • Drought stress experiments with pine trees and Arabidopsis

  14. Scenarios for Effects of Abiotic Stress on Gene Expression in Plants

  15. Data Mining with ILP
      • ILP (inductive logic programming) is a data mining technique for inferring relationships or rules.
      • ILP groups related data and favors relationships that have short descriptions.
      • ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).
      • Hybrid reasoning: information integration
        • “Is there a relationship between genes in a given functional category and genes in a particular expression cluster?”
        • ILP mines this information in a single step.

  16. Rule Inference in ILP
      • Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction
      • Example rule:
        [Rule 142] [Pos cover = 69 Neg cover = 3]
        level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive).
      • Interpretation: “If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%.”
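The 95.8% figure in the interpretation is consistent with the rule's coverage counts if confidence is taken as positive cover divided by total cover. The slide does not state the formula, so that definition is an assumption, and the function name is illustrative:

```python
def rule_confidence(pos_cover: int, neg_cover: int) -> float:
    """Fraction of covered examples that satisfy the rule (assumed definition)."""
    total = pos_cover + neg_cover
    if total == 0:
        raise ValueError("rule covers no examples")
    return pos_cover / total


# Rule 142 from the slide: 69 positive and 3 negative examples covered.
confidence = rule_confidence(pos_cover=69, neg_cover=3)
print(f"{confidence:.1%}")  # 95.8%
```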

  17. ILP in the Expresso Pipeline
      Expresso is a next generation software system for microarray experiments that provides a database interface to ILP functionality.

  18. Status of Expresso
      • Capabilities
        • Data capture and storage
        • Statistical analysis
        • Data mining by ILP
        • Microarray experiment design: GeneSieve
        • Expresso-assisted experiment composition
        • Closing the experimental loop
      • Successful microarray experiment analysis
        • Pine, Norway spruce, yeast, Deinococcus radiodurans (an extremophile microorganism), human cell lines
      • Planned microarray experiment analysis
        • Potato, Arabidopsis thaliana, tomato, rice, corn

  19. Networks in Bioinformatics
      • Mathematical model(s) for biological networks
      • Representation: What biological entities and parameters to represent and at what level of granularity?
      • Operations and computations: What manipulations and transformations are supported?
      • Presentation: How can biologists visualize and explore networks?

  20. Reconciling Networks
      [Figures: network diagrams from two sources]
      • Munnik and Meijer, FEBS Letters, 2001
      • Shinozaki and Yamaguchi-Shinozaki, Current Opinion in Plant Biology, 2000

  21. Multimodal Networks
      • Nodes and edges have flexible semantics to represent:
        • Time
        • Uncertainty
        • Cellular decision making; process regulation
        • Cell topology and compartmentalization
        • Rate constants
        • Phylogeny
      • Hierarchical
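One way to picture "flexible semantics" is a graph whose nodes and edges each carry an open-ended attribute dictionary, so the same structure can hold timing, uncertainty, compartment, or rate information. The sketch below is only an illustration of that idea, not the representation used in the project. The class names, attribute keys, and example entities are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    name: str
    attrs: dict[str, Any] = field(default_factory=dict)  # e.g. compartment


@dataclass
class Edge:
    source: str
    target: str
    kind: str  # e.g. "activates", "inhibits"
    attrs: dict[str, Any] = field(default_factory=dict)  # e.g. confidence, rate


@dataclass
class MultimodalNetwork:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, name: str, **attrs: Any) -> Node:
        node = Node(name, dict(attrs))
        self.nodes[name] = node
        return node

    def add_edge(self, source: str, target: str, kind: str, **attrs: Any) -> Edge:
        # Refuse edges whose endpoints have not been declared as nodes.
        for endpoint in (source, target):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint}")
        edge = Edge(source, target, kind, dict(attrs))
        self.edges.append(edge)
        return edge

    def edges_with(self, attribute: str) -> list[Edge]:
        """Return the edges that carry a given kind of annotation."""
        return [edge for edge in self.edges if attribute in edge.attrs]


# Hypothetical example entities and values, for illustration only.
net = MultimodalNetwork()
net.add_node("stress_signal", compartment="membrane")
net.add_node("gene_G", compartment="nucleus")
net.add_edge("stress_signal", "gene_G", "activates", confidence=0.7, delay_minutes=30)
net.add_edge("gene_G", "stress_signal", "inhibits", rate_constant=0.05)
print([edge.kind for edge in net.edges_with("confidence")])  # ['activates']
```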

  22. Using Multimodal Networks
      • Help biologists find new biological knowledge
      • Visualize and explore
      • Generate hypotheses and experiments
      • Predict regulatory phenomena
      • Predict responses to stress
      • Incorporate into Expresso as part of closing the loop

  23. Conclusions
      • Engaged faculty with the right expertise
      • Numerous life science collaborations
      • Federal research funding
      • First-class computational resources
      • A variety of cutting-edge bioinformatics research projects

  24. Bioinformatics Education
      • Courses in Computer Science
      • Courses in the Life Sciences
      • Bioinformatics Option
      • Doctoral Program in Genetics, Bioinformatics, and Computational Biology

  25. Doctoral Program in Genetics, Bioinformatics, and Computational Biology
      Multidisciplinary: biology, biochemistry, crop science, plant physiology, computer science, mathematics, statistics, veterinary medicine

  26. Anantham Cluster
      • Previous cluster specs
        • 200 AMD 1 GHz processors
        • 1 GB RAM per processor
        • 2 TB disk space
        • 2.56 Gb/s Myrinet network
      [Figure caption: Previous 200 processor cluster]
