“Collaborations Between Calit2, SIO, and the Venter Institute—a Beginning”

“Collaborations Between Calit2, SIO, and the Venter Institute—a Beginning”. Talk to the Venter Institute Board, La Jolla, CA, December 5, 2005. Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor,


Presentation Transcript


  1. “Collaborations Between Calit2, SIO, and the Venter Institute—a Beginning” Talk to the Venter Institute Board, La Jolla, CA, December 5, 2005. Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD

  2. Driving Cyberinfrastructure with Environmental Metagenomics. Samples Collected by Sorcerer II. Funded Today! $24.5 M Over 7 Years. J. Craig Venter, et al. Science 2 April 2004: Vol. 304, pp. 66-74. How did Calit2, SIO, and VI Arrive at This Unified Vision?

  3. Metagenomics “Extreme Assembly” Requires Large Amount of Pixel Real Estate Prochlorococcus Microbacterium Rhodobacter SAR-86 unknown Burkholderia unknown Source: Karin Remington J. Craig Venter Institute

  4. Metagenomics Requires a Global View of Data and the Ability to Zoom Into Detail Interactively. Overlay of Metagenomics Data onto Sequenced Reference Genomes (This Image: Prochlorococcus marinus MED4) Source: Karin Remington, J. Craig Venter Institute

  5. The OptIPuter – Creating High Resolution Portals Over Dedicated Optical Channels to Global Science Data 300 MPixel Image! Source: Mark Ellisman, David Lee, Jason Leigh Green: Purkinje Cells Red: Glial Cells Light Blue: Nuclear DNA Calit2 (UCSD, UCI) and UIC Lead Campuses—Larry Smarr PI Partners: SDSC, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST

  6. Scalable Displays Allow Both Global Content and Fine Detail Source: Mark Ellisman, David Lee, Jason Leigh 30 MPixel SunScreen Display Driven by a 20-node Sun Opteron Visualization Cluster

  7. Allows for Interactive Zooming from Cerebellum to Individual Neurons Source: Mark Ellisman, David Lee, Jason Leigh

  8. Why Optical Networks Will Become the 21st Century Driver. Chart: Performance per Dollar Spent vs. Number of Years (0 to 5). Optical Fiber (bits per second): Doubling time 9 Months. Data Storage (bits per square inch): Doubling time 12 Months. Silicon Computer Chips (Number of Transistors): Doubling time 18 Months. Source: Scientific American, January 2001

  9. Challenge: Average Throughput of NASA Data Products to End User is Only < 50 Megabits/s Tested from GSFC-ICESAT January 2005 http://ensight.eos.nasa.gov/Missions/icesat/index.shtml

  10. Solution: Individual 1 or 10Gbps Lightpaths -- “Lambdas on Demand” “Lambdas” (WDM) Source: Steve Wallach, Chiaro Networks

  11. National Lambda Rail (NLR) and TeraGrid Provide Cyberinfrastructure Backbone for U.S. Researchers. NSF’s TeraGrid Has 4 x 10Gb Lambda Backbone. NLR: 4 x 10Gb Lambdas Initially, Capable of 40 x 10Gb Wavelengths at Buildout. Links Two Dozen State and Regional Optical Networks. DOE, NSF, & NASA Using NLR. International Collaborators. Map labels: Seattle; Portland; Boise; UC-TeraGrid; UIC/NW-Starlight; Ogden/Salt Lake City; Cleveland; Chicago; New York City; Denver; Pittsburgh; San Francisco; Washington, DC; Kansas City; Raleigh; Albuquerque; Tulsa; Los Angeles; Atlanta; San Diego; Phoenix; Dallas; Baton Rouge; Las Cruces/El Paso; Jacksonville; Pensacola; Houston; San Antonio

  12. Extending Telepresence with Remote Interactive Analysis of Data Over NLR 25 Miles Venter Institute OptIPuter Visualized Data HDTV Over Lambda www.calit2.net/articles/article.php?id=660 August 8, 2005 SIO/UCSD NASA Goddard

  13. First Trans-Pacific Super High Definition Telepresence Meeting in New Calit2 Digital Cinema Auditorium Keio University President Anzai UCSD Chancellor Fox Lays Technical Basis for Global Scientific Collaboration Sony NTT SGI

  14. Calit2@UCSD Is Connected to the World at 10,000 Mbps. iGrid 2005, September 26-30, 2005, Calit2 @ University of California, San Diego, California Institute for Telecommunications and Information Technology. Maxine Brown, Tom DeFanti, Co-Chairs. THE GLOBAL LAMBDA INTEGRATED FACILITY www.igrid2005.org 50 Demonstrations, 20 Countries, 10 Gbps/Demo

  15. Calit2 is Partnering with SIO to Prototype a Digital Environment Research System • Viewing and Analyzing Earth Satellite Data Sets • Earth Topography • Atmospheric Brown Clouds • Climate Modeling • Surface, Subsurface, and Ocean Floor Observatories • Coastal Zone Data Assimilation • Ocean Environmental Metagenomics. Smarr March 2005 Talk to SIO Council Led to Calit2 Discussions with Craig Venter. John Orcutt, Director CEOA Deputy Director, SIO

  16. First Remote Interactive High Definition Video Exploration of Deep Sea Vents. Canadian-U.S. Collaboration. Source: John Delaney & Deborah Kelley, UWash

  17. A Near Future Metagenomics Fiber Optic-Enabled Data Generator. Source: John Delaney, UWash

  18. Use SCCOOS As Prototype for Coastal Zone Data Assimilation Testbed Yellow—Proposed Initial Lambda Backbone Goal: Link SCCOOS Sites with LambdaGrid to Prototype Future Ocean and Earth Sciences Observing System www.sccoos.org

  19. Use OptIPuter to Couple Data Assimilation Models to Remote Data Sources Including Biology NASA MODIS Mean Primary Productivity for April 2001 in California Current System Regional Ocean Modeling System (ROMS) http://ourocean.jpl.nasa.gov/

  20. Marine Microbial Metagenomics: From Species Genomes to Ecological Genomes • Each Sequence is a Part of an Entire Biological Community • Sequences, Genes and Gene Families, Coupled With Environmental Metadata • Tremendous Potential to Better Understand the Functioning of Natural Ecosystems • Challenge: Much More Powerful Information Infrastructure Required to Support Metagenomics. Dr. Terry Gaasterland, Scripps Genome Center

  21. Evolution is the Principle of Biological Systems: Most of Evolutionary Time Was in the Microbial World. You Are Here. Much of Genome Work Has Occurred in Animals. Source: Carl Woese, et al

  22. Comparative Genomics Can Reveal Biological Facts That Are Not Visible Within a Species. Co-Authors Pavel Pevzner and Glenn Tesler, UCSD. December 05, 2002; April 1, 2004; December 9, 2004. “After sequencing these three genomes, it is clear that substantial rearrangements in the human genome happen only once in a million years, while the rate of rearrangements in the rat and mouse is much faster.” --Glenn Tesler, UCSD Dept. of Mathematics www.calit2.net/culture/features/2004/4-1_pevzner.html

  23. Advanced Algorithmic Techniques Reveal Unexpected Results “Many of the chicken–human aligned, non-coding sequences occur far from genes, frequently in clusters that seem to be under selection for functions that are not yet understood.” Nature 432, 695 - 716 (09 December 2004)

  24. Calit2 Researcher Eskin Collaborates with Perlegen Sciences on Map of Human Genetic Variation Across Populations. “We have characterized whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry.” David A. Hinds, Laura L. Stuve, Geoffrey B. Nilsen, Eran Halperin, Eleazar Eskin, Dennis G. Ballinger, Kelly A. Frazer, David R. Cox. “Whole-Genome Patterns of Common DNA Variation in Three Human Populations” Science 18 February, 2005: 307(5712):1072-1079. “Although knowledge of a single genetic risk factor can seldom be used to predict the treatment outcome of a common disease, knowledge of a large fraction of all the major genetic risk factors contributing to a treatment response or common disease could have immediate utility, allowing existing treatment options to be matched to individual patients without requiring additional knowledge of the mechanisms by which the genetic differences lead to different outcomes.” “More detailed haplotype analysis results are available at http://research.calit2.net/hap/wgha/”

  25. The Bioinformatics Core of the Joint Center for Structural Genomics will be Housed in the Calit2@UCSD Building. Extremely Thermostable -- Useful for Many Industrial Processes (e.g. Chemical and Food). 173 Structures (122 from JCSG) • Determining the Protein Structures of the Thermotoga maritima Genome • 122 T.M. Structures Solved by JCSG (75 Unique In The PDB) • Direct Structural Coverage of 25% of the Expressed Soluble Proteins • Probably Represents the Highest Structural Coverage of Any Organism. Source: John Wooley, UCSD

  26. Providing Integrated Grid Software and Infrastructure for Multi-Scale BioModeling Grid and Cluster Computing Applications Infrastructure Gtomo2 TxBR QMView Rocks Grid of Clusters GAMESS APBS Continuity Autodock National Biomedical Computation Resource an NIH supported resource center Located in Calit2@UCSD Building Rich Clients Web Portal Grid Middleware and Web Services Workflow APBSCommand Middleware PMV ADT Vision Telescience Portal Continuity

  27. Calit2 Intends to Jump Beyond Traditional Web-Accessible Databases. BIRN, PDB, NCBI Genbank + many others. WEB PORTAL (pre-filtered, queries metadata). Data Backend (DB, Files). Request, Response. Source: Phil Papadopoulos, SDSC, Calit2

  28. Calit2’s Direct Access Core Architecture Will Create Next Generation Metagenomics Server. OptIPuter Cluster Cloud. Dedicated Compute Farm (100s of CPUs). WEB PORTAL. Database Farm. 10 GigE Fabric. Local Environment. Flat File Server Farm. Direct Access Lambda Cnxns. Web (other service). Local Cluster. TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs). Traditional User. Request, Response + Web Services. Source: Phil Papadopoulos, SDSC, Calit2

  29. What Will Our Core Data Sets Be? Metagenomic: Sargasso Sea + Sorcerer II Expedition (GOS); JGI Community Sequencing Project. Microbial Genomes: Moore Marine Microbial Project; JGI Community Sequencing Project. Other Relevant Genomes (e.g., from Genbank). Standard Non-Redundant Nucleotide and AA Databases. Environmental and Satellite Data: NOAA Oceans and NASA Goddard Satellite Data. Source: Saul Kravitz, Director of Software Engineering, J. Craig Venter Institute

  30. Looking Back Nearly 4 Billion Years In the Evolution of Microbe Genomics. Source: Falkowski and Vargas, Science 304 (5667): 58
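
Note on slides 8 to 10: the doubling times and link speeds quoted there can be turned into rough numbers. The Python sketch below is only a back-of-envelope illustration, not part of the talk. It assumes steady exponential growth at the quoted doubling times and an ideal link with no protocol overhead. The function names and the 1 TB data set size are hypothetical, chosen for illustration.

```python
# Doubling times in months, as quoted on slide 8 (Scientific American, January 2001).
DOUBLING_MONTHS = {
    "optical fiber (bits per second)": 9,
    "data storage (bits per square inch)": 12,
    "silicon chips (number of transistors)": 18,
}


def growth_factor(doubling_months: float, years: float) -> float:
    """Multiplicative gain after `years`, assuming a constant doubling time."""
    return 2.0 ** (12.0 * years / doubling_months)


def transfer_hours(size_bytes: float, rate_bits_per_s: float) -> float:
    """Ideal transfer time in hours, ignoring protocol overhead and congestion."""
    return size_bytes * 8.0 / rate_bits_per_s / 3600.0


if __name__ == "__main__":
    # Five years is the span of the x-axis on the slide 8 chart.
    for name, months in DOUBLING_MONTHS.items():
        print(f"{name}: about x{growth_factor(months, 5):.0f} over 5 years")

    # Hypothetical 1 TB data set; this size is an assumption, not a figure from the talk.
    dataset_bytes = 1e12
    rates = [
        ("50 Mbps (slide 9 upper bound)", 50e6),
        ("1 Gbps lightpath (slide 10)", 1e9),
        ("10 Gbps lightpath (slide 10)", 10e9),
    ]
    for label, rate in rates:
        print(f"{label}: {transfer_hours(dataset_bytes, rate):.2f} h")
```

Under these assumptions, fiber capacity grows roughly 100-fold in five years, against 32-fold for storage and about 10-fold for chips. The hypothetical 1 TB transfer drops from about 44 hours at 50 Mbps to under 15 minutes at 10 Gbps.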
