
Tony Hey Director of UK e-Science Programme Tony.Hey@epsrc.ac.uk





Presentation Transcript


  1. e-Science, Databases and the Grid
     Tony Hey, Director of UK e-Science Programme
     Tony.Hey@epsrc.ac.uk

  2. e-Science and the Grid
     ‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
     ‘e-Science will change the dynamic of the way science is undertaken.’
     John Taylor, Director General of Research Councils, Office of Science and Technology

  3. NASA’s IPG
     • The vision for the Information Power Grid is to promote a revolution in how NASA addresses large-scale science and engineering problems by providing persistent infrastructure for
       - “highly capable” computing and data management services that, on-demand, will locate and co-schedule the multi-Center resources needed to address large-scale and/or widely distributed problems
       - the ancillary services that are needed to support the workflow management frameworks that coordinate the processes of distributed science and engineering problems

  4. IPG Baseline System
     [Diagram labels: 300 node Condor pool, MCAT/SRB, MDS, CA, DMF, Boeing, O2000 cluster, O2000, EDC, GRC, NGIX, CMU, NREN, ARC, NCSA, GSFC, LaRC, JPL, SDSC, NTON-II/SuperNet, MSFC, JSC, KSC]

  5. Multi-disciplinary Simulations
     Wing Models • Lift Capabilities • Drag Capabilities • Responsiveness
     Stabilizer Models
     Airframe Models • Deflection capabilities • Responsiveness
     Human Models: Crew Capabilities - accuracy - perception - stamina - re-action times - SOP’s
     Engine Models • Thrust performance • Reverse Thrust performance • Responsiveness • Fuel Consumption
     Landing Gear Models • Braking performance • Steering capabilities • Traction • Dampening capabilities
     Whole system simulations are produced by coupling all of the sub-system simulations

  6. The Grid as an Enabler for Virtual Organisations
     • Ian Foster and Carl Kesselman – ‘Take 2’
     • The Grid is a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources
       - includes computational systems and data storage resources and specialized facilities
     • Enabling infrastructure for transient ‘Virtual Organisations’

  7. Globus Grid Middleware
     • Single Sign-On
       - Proxy credentials, GRAM
     • Mapping to local security mechanisms
       - Kerberos, Unix, GSI
     • Delegation
       - Restricted proxies
     • Community authorization and policy
       - Group membership, trust
     • File-based
       - GridFTP gives high performance FTP integrated with GSI
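The delegation idea on this slide (a restricted proxy carries only a subset of its issuer's rights, for a limited time) can be illustrated with a toy model. This is a minimal sketch and not the Globus or GSI API: real GSI proxies are X.509 proxy certificates, whereas the `Credential` class, its fields, and the right names below are all hypothetical and invented purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Toy credential (hypothetical, not a GSI data structure)."""
    subject: str
    rights: frozenset
    expires_at: float                      # seconds on an arbitrary clock
    issuer: Optional[Credential] = None    # None for the user's own credential

    def delegate(self, rights, lifetime: float, now: float) -> Credential:
        """Issue a restricted proxy: fewer rights, never outliving the issuer."""
        rights = frozenset(rights)
        if not rights <= self.rights:
            raise PermissionError("proxy cannot hold rights its issuer lacks")
        expires = min(self.expires_at, now + lifetime)
        return Credential(f"{self.subject}/proxy", rights, expires, issuer=self)

    def allows(self, right: str, now: float) -> bool:
        """A right is valid only if every link in the chain grants it and is unexpired."""
        cred: Optional[Credential] = self
        while cred is not None:
            if now >= cred.expires_at or right not in cred.rights:
                return False
            cred = cred.issuer
        return True


# The user signs on once, then hands a read-only, 12-hour proxy to a remote job.
user = Credential("/O=Grid/CN=Alice", frozenset({"read", "write", "submit"}),
                  expires_at=365 * 86400.0)
proxy = user.delegate({"read"}, lifetime=12 * 3600.0, now=0.0)

print(proxy.allows("read", now=100.0))     # True
print(proxy.allows("write", now=100.0))    # False: right was not delegated
print(proxy.allows("read", now=50000.0))   # False: proxy has expired
```

The remote job never sees the user's long-lived credential, which is the point of delegating through a short-lived, restricted proxy.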

  8. US Grid Projects
     • NASA Information Power Grid
     • DOE Science Grid
     • NSF National Virtual Observatory
     • NSF GriPhyN
     • DOE Particle Physics Data Grid
     • NSF Distributed Terascale Facility
     • DOE ASCI Grid
     • DOE Earth Systems Grid
     • DARPA CoABS Grid
     • NEESGrid
     • DOH BIRN
     • NSF iVDGL

  9. EU Grid Projects
     • DataGrid (CERN, ..)
     • EuroGrid (Unicore)
     • DataTag (TTT…)
     • Astrophysical Virtual Observatory
     • GRIP (Globus/Unicore)
     • GRIA (Industrial applications)
     • GridLab (Cactus Toolkit)
     • CrossGrid (Infrastructure Components)
     • EGSO (Solar Physics)

  10. National Grid Projects
     • UK e-Science Grid
     • Japan – Grid Data Farm, ITBL
     • Netherlands – VLAM, PolderGrid
     • Germany – UNICORE, Grid proposal
     • France – Grid funding approved
     • Italy – INFN Grid
     • Eire – Grid proposals
     • Switzerland – Grid proposal
     • Hungary – DemoGrid, Grid proposal
     • ApGrid
     • ……

  11. UK e-Science Initiative
     • £120M Programme over 3 years
     • £75M is for Grid Applications in all areas of science and engineering
     • £10M for Supercomputer upgrade
     • £35M for development of ‘industrial strength’ Grid middleware
     • Require £20M additional ‘matching’ funds from industry

  12. UK e-Science Grid
     [Map of centres: Edinburgh, Glasgow, DL, Newcastle, Belfast, Manchester, Cambridge, Oxford, Hinxton, RAL, Cardiff, London, Southampton]

  13. IBM Grid Press Release: 2/8/01
     Interview with Irving Wladawsky-Berger:
     • ‘Grid computing is a set of research management services that sit on top of the OS to link different systems together’
     • ‘We will work with the Globus community to build this layer of software to help share resources’
     • ‘All of our systems will be enabled to work with the grid, and all of our middleware will integrate with the software’

  14. Grid Database Requirements (1)
     • Scalability
       - Store Petabytes of data at TB/hr
       - Low response time for complex queries to retrieve data for more processing
       - Large number of clients needing high access throughput
     • Grid Standards for Security, Accounting, ..
       - GSI with digital certificates
     • Data from multiple DBMS
     • Co-schedule database and compute servers
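To give a sense of what "Petabytes of data at TB/hr" implies, the sketch below works out how long an archive takes to fill at a sustained ingest rate, and what that rate means per second. The specific figures (1 PB, 1 TB/hr) are illustrative assumptions, since the slide gives only orders of magnitude, and decimal units (1 PB = 1000 TB) are assumed.

```python
TB_PER_PB = 1000          # decimal units assumed
MB_PER_TB = 1_000_000
SECONDS_PER_HOUR = 3600


def hours_to_ingest(petabytes: float, tb_per_hour: float) -> float:
    """Hours needed to store `petabytes` at a sustained rate of `tb_per_hour`."""
    if tb_per_hour <= 0:
        raise ValueError("ingest rate must be positive")
    return petabytes * TB_PER_PB / tb_per_hour


def sustained_mb_per_second(tb_per_hour: float) -> float:
    """Convert an ingest rate from TB/hr to MB/s."""
    return tb_per_hour * MB_PER_TB / SECONDS_PER_HOUR


hours = hours_to_ingest(petabytes=1, tb_per_hour=1)
print(f"{hours:.0f} hours (~{hours / 24:.1f} days)")        # 1000 hours (~41.7 days)
print(f"{sustained_mb_per_second(1):.0f} MB/s sustained")   # 278 MB/s sustained
```

Under these assumptions, even a single petabyte needs about six weeks of uninterrupted writing at roughly 278 MB/s, which is why scalability heads the list of requirements.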

  15. Grid Database Requirements (2)
     • Handling Unpredictable Usage
       - Most existing DB applications have reasonably predictable access patterns and usage of DB resources can be restricted
       - Typical commercial applications generate large numbers of small transactions from large number of users
       - Grid applications can have small number of large transactions needing more ad hoc access to DBMS resources
         - much greater variations in time and resource usage

  16. Grid Database Requirements (3)
     • Metadata-driven access
       - Expect to need 2-step access to data
         Step 1: Metadata search to locate required data on one or more DBMS
         Step 2: Data accessed, sent to compute server for further analysis
       - Application writer does not know which specific DBMS is accessed in Step 2
       - Need standard API for Grid-enabled DBMS
     • Multiple Database Integration
       - Support distributed queries and transactions
       - Scalability requirements
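The two-step, metadata-driven access pattern described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slide calls for a standard API but does not define one, so `GridDBMS`, `InMemoryDBMS`, `MetadataCatalog`, `CatalogEntry`, and `locate_and_fetch` are all hypothetical names, and in-memory dictionaries stand in for real database servers.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class GridDBMS(Protocol):
    """Hypothetical common interface that every Grid-enabled DBMS would expose."""
    def fetch(self, dataset_id: str) -> list[dict]: ...


class InMemoryDBMS:
    """Stand-in for a real DBMS; any backend offering fetch() would do."""
    def __init__(self, tables: dict[str, list[dict]]):
        self._tables = tables

    def fetch(self, dataset_id: str) -> list[dict]:
        return self._tables[dataset_id]


@dataclass(frozen=True)
class CatalogEntry:
    dataset_id: str
    dbms_name: str           # which server holds the data
    keywords: frozenset


class MetadataCatalog:
    def __init__(self, entries):
        self._entries = list(entries)

    def search(self, keyword: str) -> list[CatalogEntry]:
        return [e for e in self._entries if keyword in e.keywords]


def locate_and_fetch(catalog: MetadataCatalog,
                     servers: dict[str, GridDBMS],
                     keyword: str) -> list[dict]:
    rows: list[dict] = []
    for entry in catalog.search(keyword):          # Step 1: metadata search
        dbms = servers[entry.dbms_name]            # caller never names the DBMS
        rows.extend(dbms.fetch(entry.dataset_id))  # Step 2: data access
    return rows


servers = {
    "site_a": InMemoryDBMS({"run42": [{"run": 42, "events": 1200}]}),
    "site_b": InMemoryDBMS({"run43": [{"run": 43, "events": 15}],
                            "run44": [{"run": 44, "events": 980}]}),
}
catalog = MetadataCatalog([
    CatalogEntry("run42", "site_a", frozenset({"collision"})),
    CatalogEntry("run43", "site_b", frozenset({"calibration"})),
    CatalogEntry("run44", "site_b", frozenset({"collision"})),
])

rows = locate_and_fetch(catalog, servers, "collision")
print(rows)  # rows that would be handed to the compute server for analysis
```

The application code only supplies a keyword; which server answers is decided by the catalog, which is what makes a shared interface across DBMS products necessary.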

  17. Summary
     • Application projects use Clusters, Supercomputers, Data Repositories
     • Emphasis on support for data federation and annotation as much as computation
     • Metadata and ontologies key to higher level Grid services
     • For commercial success Grid needs to have interface to DBMS
