A New Era for Computational Science NPACI Parallel Computing Institute August 28, 2000 Sid Karin Director, NPACI/SDSC skarin@sdsc.edu
SDSC A National Laboratory for Computational Science and Engineering Leading-Edge Site for NPACI
Continuing Evolution [Diagram labels: SDSC, NPACI; Resources; Education, Outreach & Training; Enabling technologies; Technology & applications thrusts; Applications; Individuals; Partners; timeline from 1985 to 2000]
A Distributed National Laboratory for Computational Science and Engineering
NPACI is a Highly Leveraged National Partnership of Partnerships • 46 institutions • 20 states • 4 countries • 5 national labs • Many projects • Vendors and industry • Government agencies
Mission: Accelerate Scientific Discovery • Through the development and implementation of computational and computer science techniques • By creating a ubiquitous, continuous, and pervasive national infrastructure: the grid
Vision Changing How Science is Done • Collect data from digital libraries, laboratories, and observation • Analyze the data with models run on the grid • Visualize and share data over the Web • Publish results in a digital library
Goals: Fulfilling the Mission Embracing the Scientific Community • Capability Computing • Provide compute and information resources of exceptional capability • Discovery Environments • Develop and deploy novel, integrated, easy-to-use computational environments • Computational Literacy • Extend the excitement, benefits, and opportunities of computational science
Partnership Organizing Principle: “Thrusts” • Computational Literacy: EOT • Discovery Environments: TECHNOLOGIES (Metasystems, Programming Tools & Environments, Data-intensive Computing, Interaction Environments) • Discovery Environments: APPLICATIONS (Molecular Science, Neuroscience, Earth Systems Science, Engineering) • Capability Computing: RESOURCES
Projects Meld Applications and Technology • Brain databases: Data-Intensive Computing + Neuroscience • Metasystems and Parallel Tools + Engineering
Leadership Team • Sid Karin, SDSC: Director (skarin@sdsc.edu) • Peter Arzberger, SDSC: Executive Director (parzberg@sdsc.edu) • Paul Messina, Caltech: Chief Architect (on leave) • Susan Graham, UC Berkeley: Chief Computer Scientist (graham@cs.berkeley.edu) • Peter Taylor, SDSC: Chief Applications Scientist (taylor@sdsc.edu) • Wayne Pfeiffer, SDSC: Deputy Director (pfeiffer@sdsc.edu) • Greg Moses, U Wisconsin: Education, Outreach, and Training Leader (moses@engr.wisc.edu)
NPACI Executive Committee • Andrew Grimshaw, U Virginia: Metasystems • Joel Saltz, U Maryland: Programming Tools and Environments • Reagan Moore, SDSC: Data-Intensive Computing • Arthur Olson, TSRI: Interaction Environments • William Martin, U Michigan: Resource Representative • Russ Altman, Stanford U: Molecular Science • Mark Ellisman, UCSD: Neuroscience • Bernard Minster, UCSD (SIO): Earth Systems Science • Tinsley Oden, U Texas (TICAM): Engineering • James Pool, Caltech: Resource Representative • Leadership Team plus: Aron Kuppermann, Caltech: User Representative
NPACI Oversight • Institutional Oversight Board (IOB) • External Visiting Committee (EVC) • Director’s Advisory Committee (DAC) • Users’ Advisory Committee (UAC) • Executive Committee • Leadership Team • Resource Partner Representatives • Technologies Thrust Leaders • Applications Thrust Leaders
Budget Balance [Chart labels: EOT, Applications, Technologies, Resources; SDSC, Partners]
Complementary roles of five compute resource sites • Leading-edge site (SDSC) • Very high-performance resources • IBM SP teraflops system • Mid-range sites (U Texas & U Michigan) • Smaller systems compatible with LES • Support for applications with limited scalability, large-memory jobs, application development, OS testing, and education • Alternate architecture & research systems • Caltech, UC Berkeley, SDSC • Support for leading-edge applications, thrusts, and evaluation
Leading-Edge Site Supercomputer Roadmap [Chart: peak speed (flops), 10^8 to 10^14, versus year installed, 1980 to 2005; series: world's fastest supers, SDSC's vector supers; marked point: 1 TFLOPS IBM SP, 1999]
NPACI’s balanced complement of high-end resources for 2000 • Compute resources (SDSC & 4 partners) • IBM SP Teraflops system at SDSC • Complementary systems at partner sites • Data resources (SDSC & 10 partners) • >180 TB mass store at SDSC • >100 GB data sets at partner sites • Network resources (SDSC & all partners) • >100 Mbps access to compute & data resources • Communications backbone for metacomputing
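As a rough illustration of how the data and network figures on this slide relate, the sketch below estimates how long it would take to move a data set across a link. It is a back-of-envelope calculation, not an NPACI tool: the function name `transfer_seconds` is hypothetical, the slide's lower bounds (>100 GB, >180 TB, >100 Mbps) are treated as exact values, decimal units are assumed, and protocol overhead and link contention are ignored.

```python
# Back-of-envelope transfer-time estimate (illustrative sketch only).
# Assumptions: decimal units (1 GB = 10**9 bytes, 1 Mbps = 10**6 bit/s),
# a fully utilized link, and no protocol overhead.

GB = 10**9      # bytes
TB = 10**12     # bytes
MBPS = 10**6    # bits per second


def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal time in seconds to send size_bytes over the given link."""
    return size_bytes * 8 / link_bits_per_second


# A 100 GB partner-site data set over a 100 Mbps connection.
dataset_s = transfer_seconds(100 * GB, 100 * MBPS)
print(f"100 GB at 100 Mbps: {dataset_s / 3600:.1f} hours")

# The full 180 TB mass store over the same connection.
store_s = transfer_seconds(180 * TB, 100 * MBPS)
print(f"180 TB at 100 Mbps: {store_s / 86400:.1f} days")
```

Under these assumptions a single 100 GB data set takes a couple of hours to move, while the whole mass store would take months, which is consistent with the slide pairing data resources with a dedicated communications backbone.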
IBM Selected as First NPACI Teraflops Vendor • Strong commitment to high end by IBM • Technology being developed through ASCI • SDSC has largest system in US academia • Growing partnership with IBM
1st Teraflops System for US Academia Nov 1999 • 1 TFLOPS IBM SP • 144 8-processor compute nodes • 12 2-processor service nodes • 1,176 Power3 processors at 222 MHz • > 640 GB memory (4 GB/node), upgrade to > 1 TB later • 6.8 TB switch-attached disk storage • Largest SP with 8-way nodes • High-performance access to HPSS • Trailblazer switch interconnect with subsequent upgrade
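The processor count and the 1 teraflops figure on this slide can be cross-checked with simple arithmetic. The sketch below does so under one assumption that is not stated on the slide: that each Power3 processor can complete 4 floating-point operations per clock cycle (two fused multiply-add units). The function names are hypothetical, and the peak is computed over the compute nodes only, which is also an assumption.

```python
# Cross-check of the slide's processor count and peak speed (sketch only).

# Figures taken from the slide.
COMPUTE_NODES = 144
CPUS_PER_COMPUTE_NODE = 8
SERVICE_NODES = 12
CPUS_PER_SERVICE_NODE = 2
CLOCK_HZ = 222e6  # 222 MHz

# Assumption (not on the slide): 4 floating-point operations per cycle
# per Power3 processor, from two fused multiply-add units.
FLOPS_PER_CYCLE = 4


def total_processors():
    """All processors: compute nodes plus service nodes."""
    return (COMPUTE_NODES * CPUS_PER_COMPUTE_NODE
            + SERVICE_NODES * CPUS_PER_SERVICE_NODE)


def peak_flops(n_processors):
    """Theoretical peak in flops for n_processors at the slide's clock rate."""
    return n_processors * CLOCK_HZ * FLOPS_PER_CYCLE


compute_cpus = COMPUTE_NODES * CPUS_PER_COMPUTE_NODE
print("Total processors:", total_processors())
print(f"Peak over compute nodes: {peak_flops(compute_cpus) / 1e12:.3f} TFLOPS")
```

With these inputs the node counts reproduce the 1,176 processors listed on the slide, and the compute nodes alone come out just above 1 TFLOPS.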
Current Large SP Allocations • Fundamental Physics: T. Kinoshita, Cornell University; R. Sugar, UC Santa Barbara • Ab initio Biochemistry: H. Scheraga, Cornell University; A. McCammon, UC San Diego; M. Klein, Univ. of Pennsylvania; M. Gordon, Iowa State University • Biomedicine: A. Garfinkel, UCLA; B. Pettitt, University of Houston • Materials Science: F. Abraham, IBM Almaden; J. Kim, Ohio State University • Fluid Dynamics: K. Gubbins, Cornell University; J. Kim, UCLA; G. Karniadakis, Brown University • Astrophysics: P. Hauschildt, Univ. of Georgia; J. Raeder, UCLA; M. Ashour-Abdalla, UCLA
NPACI “alpha” projects • Bioinformatics Infrastructure for Large-Scale Analyses • Protein Folding in a Distributed Computing Environment • Telescience for Advanced Tomography Applications • Multi-Component Models for Energy and the Environment • Scalable Visualization Toolkits for Bays to Brains
Bioinformatics Infrastructure for Large-Scale Analyses • Next-generation tools for accessing, manipulating, and analyzing biological data • Russ Altman, Stanford University • Reagan Moore, SDSC • Analysis of Protein Data Bank, GenBank and other databases • Accelerate key discoveries for health and medicine
Protein Folding in a Distributed Computing Environment • Simulating protein movement governing reactions within cells • Andrew Grimshaw, U Virginia • Charles Brooks, The Scripps Research Institute • Bernard Pailthorpe, UCSD/SDSC • Computationally intensive • Distributed computing power from Legion
Telescience for Advanced Tomography Applications • Integrates remote instrumentation, distributed computing, federated databases, image archives, and visualization tools. • Mark Ellisman, UCSD • Fran Berman, UCSD • Carl Kesselman, USC • 3-D tomographic reconstruction of biological specimens
Multi-Component Modeling for Energy and the Environment • Simulating contaminant movement through ecosystems • Leaders: Joel Saltz, U Maryland and Johns Hopkins U; Mary Wheeler, U Texas • Will assist environmental cleanup efforts and strategies • Engineering and environmental models linked through metasystems and data manipulation tools
Scalable Visualization Toolkits • Vast data collections and large-scale simulations require scalable visualization tools • Art Olson, The Scripps Research Institute • Bernard Pailthorpe, SDSC/UCSD • Art Toga, UCLA • Carl Wunsch, MIT • 3-D reconstruction, time-dependent modeling
Examples of Additional Projects • NPACI and SDSC activities
MICE: Transparent Supercomputing • Molecular Interactive Collaborative Environment • Gallery allows researchers, students to search for, visualize, and manipulate molecular structures • Integrates key SDSC technological strengths • Biological databases • Transparent supercomputing • Web-based Virtual Reality Modeling Language
The Protein Data Bank • World’s single scientific resource for depositing and searching protein structures • Protein structure data growing exponentially • 10,500 structures in PDB today • 20,000 by the year 2001 • Vital to the advancement of biological sciences • Working towards a digital continuum from primary data to final scientific publication • Capture of primary data from high-energy synchrotrons (e.g. Stanford Linear Accelerator Center) requires 50 Mbps network bandwidth • Image caption: 1CD3: The PDB’s 10,000th structure.
New Mode of Visualization • Network-accessible “TeleManufacturing” • 3-D hardcopy for visualization • Used by many disciplines • Molecules to Hurricanes • Death Valley to Venus • Riemann Zeta Function to Ozone Hole
Digital Galaxy • Collaboration with Hayden Planetarium, American Museum of Natural History • Support from NASA • Linking SDSC’s mass storage to Hayden Planetarium requires 155 Mbps • MPIRE Galaxy Renderer • Scalable volume visualization • Linked to database of astronomical objects • Produces translucent, filament-like objects • Image caption: An artificial nebula, modeled after a planetary nebula
The Digital Sky • Billions of objects can be detected with optical, infrared, and radio telescopes • Tens of terabytes of image and catalog data • Digital Sky federating four sky surveys to allow multi-wavelength studies across the data sets • DPOSS, 2MASS, NVSS, FIRST • Tom Prince, Caltech, leading federation effort • Uses MIX, SDSC SRB, and NPACI mass storage systems • Image caption: A globular cluster from the DPOSS archive. Such clusters provide a minimum age for the universe. Image by Thomas Handley, Caltech.
Looking out for San Diego’s Regional Ecology • Unique partnership • 31 federal, state, regional, and local agencies • John Helly, et al., SDSC • Combines technologies and multi-agency data • Sensing, analysis, VRML • Physical, chemical, and biological data • Web-based tool for science and public policy
AMICO: The Art of Managing Art • Art Museum Image Consortium (AMICO) • 28 art museums working toward educational use of digital multimedia • Launch of the AMICO Library includes more than 50,000 works of art • AMICO, CDL, SDSC • XML information mediation • SDSC SRB data management • Links between images, scholarly research, educational material
Mapping the Net’s Terra Incognita Nature: Web Matters, 1/7/99. Science 10/16/98
This is Only the Beginning... [Chart: curve plotted against time, with a "YOU ARE HERE" marker]