
GLUE 2.0 migration plan

Stephen Burke, egi.eu. EGI OMB, July 17th 2012.


Presentation Transcript


  1. GLUE 2.0 migration plan Stephen Burke egi.eu EGI OMB July 17th 2012

  2. Overview • Why we needed a new schema • GLUE 2 timeline • Implementation and deployment • Schema design • BDII infrastructure • Service publication • Clients • Migration GLUE 2.0 migration - EGI OMB

  3. GLUE history • The European DataGrid project (predecessor of EGEE) initially had its own schema (2001) • The GLUE (Grid Laboratory for a Uniform Environment) project was a collaboration between EDG, EU DataTAG, iVDGL (predecessor of OSG) and Globus to promote interoperability • The GLUE schema 1.0 was defined in September 2002 after several months of discussion • Version 1.1 was released with some minor improvements in April 2003, and deployed by EDG and then LCG and EGEE in 2003/4 • Version 1.2 was agreed in February 2005, finalised in May 2005 and deployed (fairly gradually) by LCG/EGEE in 2006 • Version 1.3 was agreed in October 2006, finalised in December 2006 and deployed from 2007 onwards

  4. Problems with GLUE 1.x • The schema has worked, but we have many accumulated issues • Initial schema definitions were based on limited experience • Only for CE and SE • No SRM for storage in 2002, just “classic SE” • Embedded assumptions which turned out to be too restrictive • Not easily extendable • Definitions not always clear, documentation somewhat limited • Case sensitivity, optional attributes, units, special values • Ambiguities (CPUs/job slots) • Too specific (only two CPU benchmarks) • Many things effectively defined by LCG/EGEE practice • We always required changes to be backward-compatible to make upgrading easier • 1.x schema had limited scope for additions, so changes often “shoe-horned” into the available structure • 1.2 schema introduced a generic GlueService object, but it had no connection to the existing CE and SE objects

  5. Upgrading the schema Schema migration is a complex process: • Define the abstract schema • Define the LDAP rendering • Implement the schema in the BDII and roll out • Write and deploy information providers • Update client tools to understand GLUE 2 • You are here! • ((Retire GLUE 1)) • The schema interacts with everything, so the rollout must be a gradual process without breaking anything

  6. GLUE 2 timeline • October 2006: First discussion, decision to move into the OGF • January 2007 (OGF 19): First working group meeting • June 2008 (OGF 23): Draft specification opened to public comment • August 2008: Public comment period ended • January 2009: Final specification ready • March 2009 (OGF 25): GLUE 2.0 becomes an official OGF standard • http://www.ogf.org/documents/GFD.147.pdf • LDAP rendering defined in May/June 2009 • Resource BDII in production since September 2009 • Site BDII in production since August 2010 • Top-level BDII in production since October 2010 • Information providers gradually rolled out (2010/11) • EMI 2 has full GLUE 2 support (May 2012)

  7. Ground rules • Complete redesign, not backward-compatible • OGF working group • Real Grid standard • Buy-in from other projects, especially Nordugrid • Incorporates many years of experience • Supports existing uses in GLUE 1.x • Designed to be easy to extend

  8. Key concepts [Diagram of the main entities and their relations. Entities: User Domain, Admin Domain, Service, Manager, Endpoint, Share, Resource, Access Policy, Mapping Policy, Activity. Relation labels: Negotiates Share with, Provides, Contacts, Has, Maps User to, Defined on, Runs.]

  9. Computing Service [Diagram of the Computing Service entities and their relations. Entities: Computing Service, Computing Manager, Application Environment, Execution Environment, Computing Endpoint, Computing Share, Computing Activity. Relation labels: Has, Manages, Maps User to, Defined on, Can use, Runs.]

  10. Storage Service [Diagram of the Storage Service entities and their relations. Entities: Storage Service, Storage Manager, Storage Access Protocol, Storage Capacity, Storage Endpoint, Storage Share, Storage Share Capacity, Data Store. Relation labels: Has, Offers, Manages, Maps User to, Defined on.]

  11. Changes in terminology • GLUE 2 looks a bit different to GLUE 1, but most of the concepts are there under different names • Site -> AdminDomain • (VO) -> UserDomain • Element -> Service • Service -> Endpoint • AccessControlBaseRule -> AccessPolicy, MappingPolicy • CE, VOView -> ComputingManager, ComputingShare • Remove duplication/double counting • Cluster/SubCluster -> ExecutionEnvironment • (Job) -> Activity • SA/VOInfo -> StorageShare • Existing attributes should all map to something • Unless they were unused • All existing use cases should be met
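The renaming on the slide above can be written down as a simple lookup table, which may help when porting GLUE 1 queries. This is only a sketch: the mapping is transcribed from the slide, but splitting grouped entries such as "CE, VOView" into separate keys is an assumption, and the helper name `glue2_names` is hypothetical. "VO" and "Job" appear in brackets on the slide, presumably because GLUE 1 had no explicit object for them.

```python
# GLUE 1 -> GLUE 2 terminology, transcribed from the slide.
# Grouped entries on the slide (e.g. "CE, VOView") are split into
# one key per GLUE 1 term; that split is an assumption of this sketch.
GLUE1_TO_GLUE2 = {
    "Site": ["AdminDomain"],
    "VO": ["UserDomain"],
    "Element": ["Service"],
    "Service": ["Endpoint"],
    "AccessControlBaseRule": ["AccessPolicy", "MappingPolicy"],
    "CE": ["ComputingManager", "ComputingShare"],
    "VOView": ["ComputingManager", "ComputingShare"],
    "Cluster": ["ExecutionEnvironment"],
    "SubCluster": ["ExecutionEnvironment"],
    "Job": ["Activity"],
    "SA": ["StorageShare"],
    "VOInfo": ["StorageShare"],
}


def glue2_names(glue1_name):
    """Return the GLUE 2 concept(s) for a GLUE 1 term, or [] if unknown."""
    return list(GLUE1_TO_GLUE2.get(glue1_name, []))
```

Note that "Service" changes meaning: a GLUE 1 Service corresponds to a GLUE 2 Endpoint, while a GLUE 1 Element (CE, SE) corresponds to a GLUE 2 Service.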

  12. Major changes • Generic concept of a Service as a coherent grouping of Endpoints, Managers and Resources • ComputingService and StorageService are specialisations, sharing a common structure as far as possible • Generic concepts for Manager (software) and Resource (Hardware) • All objects are extensible • Multivalued string “OtherInfo” and/or Key-Value pairs • All objects have a globally unique ID • Many objects allow many-to-many relations • More flexible, but more complex • Some concepts made more generic/flexible by making them separate objects rather than attributes • Location, Contact, Policy, Benchmark, Capacity • More complete/rigorous definitions • Many more enumerated types – but not fully defined yet • Placeholder values, case sensitivity, optional vs mandatory
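As a rough illustration of three of the points above (every object has a globally unique ID, every object can be extended through OtherInfo strings or key-value pairs, and relations may be many-to-many), here is a minimal sketch. The class and field names are hypothetical and chosen for readability; they are not the attribute names of the GLUE 2 specification or its LDAP rendering.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entity:
    """Hypothetical base class: unique ID plus two extension mechanisms."""
    id: str                                                     # globally unique ID
    other_info: List[str] = field(default_factory=list)        # multivalued free-form strings
    extensions: List[Tuple[str, str]] = field(default_factory=list)  # key-value pairs


@dataclass
class Endpoint(Entity):
    share_ids: List[str] = field(default_factory=list)


@dataclass
class Share(Entity):
    endpoint_ids: List[str] = field(default_factory=list)


def link(endpoint, share):
    """Record a many-to-many relation by storing the peer's ID on both sides."""
    if share.id not in endpoint.share_ids:
        endpoint.share_ids.append(share.id)
    if endpoint.id not in share.endpoint_ids:
        share.endpoint_ids.append(endpoint.id)
```

Storing relations as lists of IDs rather than a single parent reference is what makes the model "more flexible, but more complex": one Share can be reachable through several Endpoints, and one Endpoint can serve several Shares.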

  13. Main benefits • General structure for any service • CE, SE, WMS, VOMS, MyProxy, LFC, FTS, … • Generic service discovery tool • Much more expandable • All objects can be extended • We always find new cases we didn’t anticipate • Schema upgrades can take a long time • Fixes many long-standing problems • StorageService designed for SRM! • ComputingService has a better separation of Grid endpoint, LRMS and queue/fairshare • Interoperability and standardisation • EMI adopted GLUE 2 as a unified standard GLUE 2.0 migration - EGI OMB

  14. BDII implementation • Merged LDAP schema, GLUE 1.3 + GLUE 2 • Generally follows GLUE 1 practice, but some changes • Attribute names like GLUE2ComputingShareRunningJobs • Case sensitivity • Some attributes are mandatory • The naming and usage of foreign keys are somewhat different • Single LDAP server, on port 2170 as usual • Separate root DNs • o=glue vs o=grid • Should be no crosstalk other than data volume • Resource BDII: GLUE2GroupID=resource, o=glue • Site BDII: GLUE2DomainID=<site-name>, o=glue • Top BDII: GLUE2DomainID=<site-name>, GLUE2GroupID=grid, o=glue
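The base DNs listed above can be assembled mechanically. The sketch below builds the GLUE 2 search base for each BDII level and an OpenLDAP `ldapsearch` command line for it. The port and DN patterns come from the slide; the function names, the host name `bdii.example.org` and the site name `EXAMPLE-SITE` are placeholders, and the default filter is an assumption.

```python
BDII_PORT = 2170          # single LDAP server port, from the slide
GLUE1_ROOT = "o=grid"     # GLUE 1.3 tree
GLUE2_ROOT = "o=glue"     # GLUE 2 tree


def glue2_base_dn(level, site_name=None):
    """Return the GLUE 2 base DN for a resource, site or top-level BDII."""
    if level == "resource":
        return "GLUE2GroupID=resource," + GLUE2_ROOT
    if level in ("site", "top"):
        if not site_name:
            raise ValueError("site_name is required for level " + repr(level))
        if level == "site":
            return "GLUE2DomainID=%s,%s" % (site_name, GLUE2_ROOT)
        return "GLUE2DomainID=%s,GLUE2GroupID=grid,%s" % (site_name, GLUE2_ROOT)
    raise ValueError("unknown BDII level: " + repr(level))


def ldapsearch_command(host, base_dn, ldap_filter="(objectClass=*)"):
    """Build an argument list for OpenLDAP's ldapsearch (anonymous simple bind)."""
    url = "ldap://%s:%d" % (host, BDII_PORT)
    return ["ldapsearch", "-x", "-LLL", "-H", url, "-b", base_dn, ldap_filter]


if __name__ == "__main__":
    # Hypothetical host and site name, for illustration only.
    base = glue2_base_dn("top", "EXAMPLE-SITE")
    print(" ".join(ldapsearch_command("bdii.example.org", base)))
```

Because GLUE 1 and GLUE 2 live under separate root DNs on the same server, an existing GLUE 1 query against `o=grid` should keep working unchanged while the GLUE 2 tree is queried alongside it.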

  15. Service publisher • Generic service publisher to publish the GLUE 1 GlueService object in production for several years • FTS and LFC have their own dedicated providers • Upgraded to publish the GLUE 2 Service, Endpoint and AccessPolicy • Backward compatible with the GLUE 1 publisher • Supports all relevant GLUE 2 attributes, and Services with multiple Endpoints • Progressively rolled out as new versions of services are released • Already have WMS, LB, MyProxy, bdii_site, bdii_top, msg.broker.*, VOMS and VOBOX in production • In EMI 2 for AMGA, Argus • In work for Hydra, Nagios, Frontier/squid, … • Easy to add publication for any service (APEL?) • FTS and LFC have upgraded their own providers

  16. CREAM • For GLUE 1 we introduced the glite-CLUSTER node type to allow the GlueCluster and GlueSubCluster objects to be published from a different node • Supports sites with multiple CEs submitting to the same cluster • Also publishes the GlueService object for the “RTEPublisher” – a GridFTP server used to allow VOs to publish RunTimeEnvironment tags in the SubCluster • No-CLUSTER mode continues to publish everything from the CREAM node for small sites • For GLUE 2, use the CLUSTER node to publish all objects except the CREAM and CEMon Endpoints (and associated AccessPolicy) • Objects are merged in the site BDII • The detailed plan is described in a wiki page: https://wiki.italiangrid.it/twiki/bin/view/CREAM/CreamGlue2 • EMI 2 has complete publication • Batch system integration for PBS, LSF and SGE is done

  17. Storage • In EMI 2 we should have full GLUE 2 publication for DPM, dCache and StoRM • First versions so will need testing • Need to verify interoperability • Different implementations may have made different choices • CASTOR and BeStMan missing • What about classic SE (standalone gridftp)?

  18. Profile • The schema is intentionally very flexible • Many ways to use it, not necessarily interoperable • Need a profile to specify how it should be used • Detailed semantics of each attribute, what should and should not be published • Monitoring tools should enforce the usage • Currently in work • Will need agreement with EMI, LCG etc • Hope to finalise a document by the TF • May need updates in the light of experience

  19. Deployment status • 389 sites published in GLUE 1 in the CERN top BDII (as of 16/7/12) • 221 sites publishing a GLUE 2 AdminDomain • Missing sites mainly still have a gLite 3.1 site BDII • GLUE 2 support since gLite 3.2 update 16 (4/8/10) • Case-sensitivity in site name in the GOC DB • Sites we noticed were ticketed and fixed • May be other problems at some sites? • No explicit steps needed to configure for GLUE 2, just clones the GLUE 1 configuration • Sites may not have realised that they’re publishing!
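The figures above imply that a little over half of the sites visible in GLUE 1 were also publishing GLUE 2. The sketch below shows that arithmetic, plus one possible way to find missing sites and spot the site-name case mismatches mentioned on the slide. The function names and any site names passed in are hypothetical; only the counts 389 and 221 come from the slide, and reading the case-sensitivity bullet as "the published name differs from the reference name only by case" is an assumption.

```python
def coverage_percent(glue2_count, glue1_count):
    """Share of GLUE 1 sites that also publish GLUE 2, in percent."""
    return 100.0 * glue2_count / glue1_count


def missing_in_glue2(glue1_sites, glue2_sites):
    """GLUE 1 site names with no GLUE 2 counterpart, ignoring case."""
    published = {name.lower() for name in glue2_sites}
    return sorted(name for name in glue1_sites if name.lower() not in published)


def case_mismatches(reference_names, published_names):
    """Pairs (published, reference) that differ only by letter case."""
    exact = set(reference_names)
    by_lower = {name.lower(): name for name in reference_names}
    return [
        (name, by_lower[name.lower()])
        for name in published_names
        if name not in exact and name.lower() in by_lower
    ]


if __name__ == "__main__":
    # Counts from the slide: 221 of 389 sites.
    print("%.1f%% of sites publish GLUE 2" % coverage_percent(221, 389))
    print("%d sites still missing" % (389 - 221))
```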

  20. Published Endpoints (number of Endpoints, then Endpoint type)
      33 MyProxy
      12 org.glite.ce.ApplicationPublisher
      493 org.glite.ce.CREAM
      337 org.glite.ce.Monitor
      9 org.glite.ChannelManagement
      9 org.glite.Delegation
      9 org.glite.FileTransfer
      178 org.glite.lb.Server
      14 org.glite.RTEPublisher
      217 org.glite.voms
      95 org.glite.voms-admin
      178 org.glite.wms.WMProxy
      1 org.globus.gram
      1 org.nordugrid.gridftpjob
      1 org.nordugrid.ldapglue1
      1 org.nordugrid.ldapglue2
      1 org.nordugrid.ldapng
      106 SRM
      50 VOBOX
      66 xroot
      413 bdii_site
      156 bdii_top
      82 dcap
      35 emi.storm
      1 file
      83 gsidcap
      129 gsiftp
      3 http
      5 https
      11 lcg-file-catalog
      8 lcg-local-file-catalog
      2 msg.broker.openwire
      2 msg.broker.openwire-ssl
      2 msg.broker.stomp
      2 msg.broker.stomp-ssl
      CERN services currently missing: GGUS ticket submitted, CERN seem very slow to respond

  21. Clients • All clients need to become GLUE2-aware • Must be backward-compatible • Can happen gradually • WMS:JDL – first version in EMI 2 (next update) • Storage:GFAL/lcg-utils – first version in EMI 2 • Service discovery: lcg-info(sites), glite-sd-query • First version of OGF/SAGA service discovery tool available • CERN has possible replacement for lcg-info(sites) • Monitoring, resource accounting: gstat – in progress (Taiwan) • User tools - ???
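One way a client could stay backward-compatible while becoming GLUE 2-aware is to query the GLUE 2 tree first and fall back to GLUE 1 when nothing is found. The sketch below shows only that control flow. It is not how any of the tools named above are implemented: `discover` and the caller-supplied `search` callable are hypothetical, and the filters in the usage example are assumptions. The root DNs `o=glue` and `o=grid` are the ones used by the BDII.

```python
GLUE2_ROOT = "o=glue"
GLUE1_ROOT = "o=grid"


def discover(search, glue2_filter, glue1_filter):
    """Try GLUE 2 first, then fall back to GLUE 1.

    `search(base_dn, ldap_filter)` is supplied by the caller and must
    return a list of entries (empty if nothing matched).
    Returns a (schema_version, entries) pair.
    """
    entries = search(GLUE2_ROOT, glue2_filter)
    if entries:
        return "GLUE2", entries
    return "GLUE1", search(GLUE1_ROOT, glue1_filter)


if __name__ == "__main__":
    # Stand-in for a real LDAP query: this fake site publishes GLUE 1 only.
    def fake_search(base_dn, ldap_filter):
        return ["entry-from-" + base_dn] if base_dn == GLUE1_ROOT else []

    print(discover(fake_search,
                   "(objectClass=GLUE2Endpoint)",
                   "(objectClass=GlueService)"))
```

Passing the search function in keeps the fallback logic independent of any particular LDAP library, so it can be exercised without a live BDII.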

  22. Next steps • Start ticketing sites not publishing in GLUE 2 • Early September? • Now have a push to upgrade for other reasons • Every cloud has a silver lining! • Start pushing EMI 2 deployment • ~ 1 year? • May get help from LCG, e.g. for multicore support • Get new clients into UI/WN distributions • Aim for late 2013 for a full beta-test system?

  23. Summary • Define LDAP schema and deploy in BDIIs • 1.3 and 2.0 together in parallel • Now deployed in production • But sites are slow to upgrade! • Write and deploy information providers to populate the new objects • Generic Service publisher available • Being rolled out progressively • ComputingService publication (for CREAM) developed incrementally • Full version with EMI 2 • Including support for main batch systems • StorageService for DPM, dCache and StoRM in EMI 2 • Update clients to look at the new information • Workload management, data management, service discovery, monitoring, accounting, user, … • Upgrades need to be backward-compatible • Need to start validating the published information • Eventually want to make GLUE 2 the default • Maybe start in 2013? • GLUE 1 still available as a fallback

  24. References • OGF GLUE working group home page • http://forge.ogf.org/sf/projects/glue-wg • GLUE 2.0 specification • http://www.ogf.org/documents/GFD.147.pdf • LDAP rendering specification (draft) • http://forge.ogf.org/sf/go/doc15518?nav=1
