
Data Management GridPP and EDG

Gavin McCance, University of Glasgow, May 2, 2002. http://www.gridpp.ac.uk/datamanagement http://cern.ch/grid-data-management


Presentation Transcript


  1. Data Management GridPP and EDG • Gavin McCance, University of Glasgow • May 2, 2002 • http://www.gridpp.ac.uk/datamanagement • http://cern.ch/grid-data-management

  2. Who are we? • GridPP • Effort based at Glasgow • Collaboration with European DataGrid • WP2: Data Management work package • CERN, Finland, Italy • Replication collaboration with Globus + PPDG project Gavin McCance

  3. What do we* do? • Replica management • Replica catalogues • File access and transfer • Grid query optimisation (replica optimisation)* • Secure meta-data catalogues* • Service Index

  4. Replica Catalogues • Must maintain replicas of the same files • Each file has a globally unique Logical File Name (LFN) mapping to multiple physical instances of the file (PFNs). • Catalogue to keep track of all these mappings! [Figure: the LFN for File-1 pointing to copies of File-1 in Paris, Chicago and Glasgow]
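The LFN-to-PFN mapping described on this slide can be sketched as a small in-memory structure. This is only an illustration: the class name, method names, and example URLs are hypothetical, and the real catalogue at the time was LDAP based rather than a Python dictionary.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue: one logical file name (LFN) maps to many physical file names (PFNs)."""

    def __init__(self):
        self._replicas = defaultdict(set)  # LFN -> set of PFNs

    def register(self, lfn, pfn):
        """Record that a physical copy of the logical file exists at pfn."""
        self._replicas[lfn].add(pfn)

    def unregister(self, lfn, pfn):
        """Forget one physical copy; drop the LFN once no copies remain."""
        self._replicas[lfn].discard(pfn)
        if not self._replicas[lfn]:
            del self._replicas[lfn]

    def lookup(self, lfn):
        """Return all known physical copies of the logical file, sorted."""
        return sorted(self._replicas.get(lfn, ()))


# Hypothetical example mirroring the figure: File-1 replicated at three sites.
catalogue = ReplicaCatalogue()
for site in ("paris", "chicago", "glasgow"):
    catalogue.register("lfn:file-1", f"gsiftp://{site}.example.org/data/file-1")
print(catalogue.lookup("lfn:file-1"))
```

Keeping a set per LFN makes repeated registration of the same copy harmless, which seems a reasonable property for a catalogue that is updated automatically after each transfer.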

  5. …catalogues • Current services use LDAP • Collaboration with Globus + PPDG on new replica catalogue framework (GIGGLE) • Prototype Replica Location Service (RLS) under development • Will use meta-data service (Spitfire)… • API implemented as wrapper for current LDAP-based replica catalogue

  6. …RLS • Implemented as web service [Figure: three RLI nodes, five LRC nodes, and five Storage Elements, one per LRC]
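A rough sketch of the two-tier lookup the figure suggests, reading RLI as Replica Location Index and LRC as Local Replica Catalogue (the terms used in the Giggle framework; the slide itself does not expand them). The class and method names below are hypothetical, and the real service is a web service rather than in-process objects.

```python
class LocalReplicaCatalogue:
    """Holds LFN -> PFN mappings for the files on one Storage Element."""

    def __init__(self, site):
        self.site = site
        self._pfns = {}  # LFN -> set of PFNs at this site

    def add(self, lfn, pfn):
        self._pfns.setdefault(lfn, set()).add(pfn)

    def lfns(self):
        return set(self._pfns)

    def lookup(self, lfn):
        return sorted(self._pfns.get(lfn, ()))


class ReplicaLocationIndex:
    """Knows which LRCs hold a given LFN, but not the PFNs themselves."""

    def __init__(self):
        self._index = {}  # LFN -> {site name: LRC}

    def update_from(self, lrc):
        """Take a summary of the LFNs an LRC currently holds."""
        for lfn in lrc.lfns():
            self._index.setdefault(lfn, {})[lrc.site] = lrc

    def locate(self, lfn):
        """Ask every LRC known to hold the LFN for its physical copies."""
        pfns = []
        for lrc in self._index.get(lfn, {}).values():
            pfns.extend(lrc.lookup(lfn))
        return sorted(pfns)


# Hypothetical sites and URLs.
glasgow = LocalReplicaCatalogue("glasgow")
glasgow.add("lfn:file-1", "gsiftp://glasgow.example.org/data/file-1")
cern = LocalReplicaCatalogue("cern")
cern.add("lfn:file-1", "gsiftp://cern.example.org/data/file-1")

rli = ReplicaLocationIndex()
rli.update_from(glasgow)
rli.update_from(cern)
print(rli.locate("lfn:file-1"))
```

The point of the split is that the index only needs to know where to ask, so each site stays authoritative for its own physical file names.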

  7. Transferring files • What replicates the files? • Grid Data Mirroring Package (GDMP) • GDMP 3.0 software just released • GSI authentication and authorisation • GridFTP file transfer • Subscription based file replication • Automatic update of replica catalogue • http://cmsdoc.cern.ch/cms/grid/

  8. Replica Manager • New web service under development • GDMP functionality will be absorbed • Will use replica location service • Core API has been defined • replicateFile, copyAndRegisterFile, deleteFile, registerEntry, unregisterEntry • Iteration with WP5 on accessing data from Storage Elements
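To make the five core operations concrete, here is a toy in-memory model. Only the method names come from the slide; the argument lists, the behaviour of each call, and the dictionaries standing in for the replica catalogue and the Storage Elements are assumptions made for illustration, not the defined API.

```python
class ReplicaManager:
    """Toy model of the core API; signatures and semantics are assumed."""

    def __init__(self):
        self.catalogue = {}  # LFN -> set of PFNs (stand-in for the replica catalogue)
        self.storage = {}    # PFN -> file contents (stand-in for Storage Elements)

    def registerEntry(self, lfn, pfn):
        """Record an existing physical file under a logical name."""
        self.catalogue.setdefault(lfn, set()).add(pfn)

    def unregisterEntry(self, lfn, pfn):
        """Remove a catalogue entry without touching the file itself."""
        pfns = self.catalogue.get(lfn, set())
        pfns.discard(pfn)
        if not pfns:
            self.catalogue.pop(lfn, None)

    def copyAndRegisterFile(self, data, lfn, dest_pfn):
        """Copy new data onto storage and register it in one step."""
        self.storage[dest_pfn] = data
        self.registerEntry(lfn, dest_pfn)

    def replicateFile(self, lfn, dest_pfn):
        """Copy an already registered file to a new location and register the copy."""
        source_pfn = sorted(self.catalogue[lfn])[0]
        self.storage[dest_pfn] = self.storage[source_pfn]
        self.registerEntry(lfn, dest_pfn)

    def deleteFile(self, lfn, pfn):
        """Delete one physical copy and its catalogue entry."""
        self.storage.pop(pfn, None)
        self.unregisterEntry(lfn, pfn)


# Hypothetical usage with made-up PFNs.
rm = ReplicaManager()
rm.copyAndRegisterFile(b"event data", "lfn:run-42", "gsiftp://cern.example.org/run-42")
rm.replicateFile("lfn:run-42", "gsiftp://glasgow.example.org/run-42")
rm.deleteFile("lfn:run-42", "gsiftp://cern.example.org/run-42")
print(sorted(rm.catalogue["lfn:run-42"]))
```

The method names are kept in the slide's camelCase rather than Python's usual snake_case so they can be matched against the list above.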

  9. Optimisation • Negotiation with scheduling for data intensive jobs • minimise job time / max grid throughput • Given the distribution of data a job will use, what is the most appropriate place to run it? • Once it's running: is it better to remote-open, cache or make a new replica nearby?
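One simple way to frame the "where should the job run?" question is to estimate, for each candidate site, how long it would take to pull in the files the job needs that are not already there. The sketch below does only that; the cost model (file size divided by a fixed bandwidth), the function names, and all the numbers are assumptions for illustration and are not the optimiser described in the talk.

```python
def transfer_time(size_mb, src, dst, bandwidth_mb_s):
    """Seconds to move a file between sites; zero if it is already local."""
    if src == dst:
        return 0.0
    return size_mb / bandwidth_mb_s[(src, dst)]


def best_site(job_files, replicas, sizes_mb, bandwidth_mb_s, sites):
    """Pick the site with the lowest total estimated data-access time."""

    def cost(site):
        # For each file, use the cheapest existing replica as the source.
        return sum(
            min(
                transfer_time(sizes_mb[f], src, site, bandwidth_mb_s)
                for src in replicas[f]
            )
            for f in job_files
        )

    return min(sites, key=cost)


# Hypothetical scenario: a large file at CERN, a small one at Glasgow.
sites = ["glasgow", "cern"]
replicas = {"lfn:big": {"cern"}, "lfn:small": {"glasgow"}}
sizes_mb = {"lfn:big": 1000, "lfn:small": 100}
bandwidth_mb_s = {("cern", "glasgow"): 10.0, ("glasgow", "cern"): 10.0}

chosen = best_site(["lfn:big", "lfn:small"], replicas, sizes_mb, bandwidth_mb_s, sites)
print(chosen)  # cern: moving 100 MB costs 10 s, moving 1000 MB would cost 100 s
```

A real scheduler would also weigh CPU availability and queue lengths; this sketch covers only the data-placement term.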

  10. …Optimisation • Dynamic replication decisions based on network stats and file access patterns • Economic model being tested • “Greedy” local optimisation leads to a reasonable global optimum… • Data-centric grid simulation to test these replication algorithms

  11. Meta-data • Need for transparent, secure access to meta-data • Both for grid-specific (e.g. Replica catalogue) and application specific meta-data. • Spitfire service available • Current version 1.1.0 • http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire

  12. Current Spitfire • Secure access over HTTPS to retrieve from or publish to any RDBMS • Can use web-browser as client

  13. Security • Authentication is provided over SSL via a Globus certificate • Remote users are mapped onto a database role, so can only perform authenticated operations on the database

  14. [Figure: security mechanism flowchart. An HTTP + SSL request with a client certificate reaches the servlet container (SSLServletSocketFactory, TrustManager). Checks: Is the certificate signed by a trusted CA? Has the certificate been revoked? Does the user specify a role (otherwise find the default role)? Is the role OK? The role is then mapped to a connection ID, and the request and connection ID are passed on. Components shown: Security Servlet, Authorization Module, Translator Servlet, connection pool, RDBMS, and repositories for trusted CAs, revoked certificates, roles, and role-to-connection mappings.]
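The sequence of checks in this security flow can be sketched as a single function. This is a simplified model only: certificates are plain dictionaries, no real SSL or signature verification takes place, and every name below (`authorise`, `AccessDenied`, the dictionary keys, the example users and roles) is hypothetical rather than taken from Spitfire.

```python
class AccessDenied(Exception):
    """Raised when any check in the flow fails."""


def authorise(cert, requested_role, trusted_cas, revoked_serials,
              user_roles, default_roles, role_connections):
    """Run the checks in order and return a database connection ID."""
    # 1. Is the certificate signed by a trusted CA?
    if cert["issuer"] not in trusted_cas:
        raise AccessDenied("certificate not signed by a trusted CA")
    # 2. Has the certificate been revoked?
    if cert["serial"] in revoked_serials:
        raise AccessDenied("certificate has been revoked")
    # 3. Does the user specify a role? If not, find the default role.
    subject = cert["subject"]
    role = requested_role if requested_role is not None else default_roles.get(subject)
    # 4. Is the role OK for this user?
    if role is None or role not in user_roles.get(subject, set()):
        raise AccessDenied("role not permitted for this user")
    # 5. Map the role to a connection ID.
    return role_connections[role]


# Hypothetical repositories.
trusted_cas = {"Example Grid CA"}
revoked_serials = {"0BAD"}
user_roles = {"/O=Grid/CN=Alice": {"reader", "writer"}}
default_roles = {"/O=Grid/CN=Alice": "reader"}
role_connections = {"reader": "conn-read-only", "writer": "conn-read-write"}

alice = {"issuer": "Example Grid CA", "serial": "0001", "subject": "/O=Grid/CN=Alice"}
print(authorise(alice, None, trusted_cas, revoked_serials,
                user_roles, default_roles, role_connections))
```

Because the caller only ever receives a connection ID tied to a role, the database privileges attached to that role bound what the remote user can do.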

  15. Developments to Spitfire • Web Services API is defined • Implementation to start immediately • Access via SOAP, initially over HTTPS • Higher level services • Meta-data distribution and replication • Clean-up services

  16. Service Index • How do I find a specific grid service? • E.g. replica location server, image database, information service • XML Service description • What, where, attributes, how to contact. • Scalable architectures for querying this have been developed • Service index web service • W. Hoschek’s thesis and paper (WP2@CERN) • API developed
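As an illustration of what an XML service description covering "what, where, attributes, how to contact" might look like, the sketch below defines a made-up document and a lookup over it using Python's standard `xml.etree.ElementTree` module. The element names, attribute names, and URLs are all invented for this example and do not reflect the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical service descriptions: type (what), url (where),
# attribute (extra properties), contact (how to reach it).
SERVICE_INDEX_XML = """
<services>
  <service>
    <type>replica-location</type>
    <url>https://rls.example.org:8443/rls</url>
    <attribute name="vo">example-vo</attribute>
    <contact protocol="soap"/>
  </service>
  <service>
    <type>information</type>
    <url>https://info.example.org:8443/info</url>
    <attribute name="vo">example-vo</attribute>
    <contact protocol="soap"/>
  </service>
</services>
""".strip()


def find_services(xml_text, service_type):
    """Return the URLs of all services of the requested type."""
    root = ET.fromstring(xml_text)
    return [
        service.findtext("url")
        for service in root.iter("service")
        if service.findtext("type") == service_type
    ]


print(find_services(SERVICE_INDEX_XML, "replica-location"))
```

A linear scan like this is fine for a handful of entries; the scalable query architectures mentioned on the slide address the case where descriptions are spread across many nodes.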

  17. More Info • More information available at… http://www.gridpp.ac.uk/datamanagement http://cern.ch/grid-data-management
