
Grid Coordination by Using the Grid Coordination Protocol

R. Harakaly, F. Bonnassieux, P. Primet. Presented by: Laurent LEFEVRE. CNRS-UREC, Lyon, FRANCE; INRIA RESO, LIP (UMR CNRS, ENS, INRIA, UCB), Lyon, FRANCE.


Presentation Transcript


  1. Grid Coordination by Using the Grid Coordination Protocol
     R. Harakaly, F. Bonnassieux, P. Primet
     Presented by: Laurent LEFEVRE
     CNRS-UREC, Lyon, FRANCE
     INRIA RESO, LIP (UMR CNRS, ENS, INRIA, UCB), Lyon, FRANCE

  2. Outline
     • Why do we need grid scheduling?
     • Grid Coordination Protocol
     • Features
     • Architecture
     • Multiple ring support
     • Robustness
     • Security
     • One time token
     • User Interface
     • Implementation and Results
     • Network monitoring
     • Configuration coordination
     • Network Topology Discovery
     • Summary
     GAN 2004

  3. Why do we need grid scheduling?
     • Centralized services:
       • VO servers
       • CRL distribution servers
       • Configuration servers
     • Distributed services:
       • Network monitoring and discovery

  4. Grid Coordination Protocol
     • Based on the Probes Coordination Protocol (PCP)
     • Generalized functions, not limited to network monitoring
     • Ring-with-token approach
     • Multiple ring support, with inter-ring host locking, for scalability
     • Used for:
       • Network monitoring synchronization
       • Coordination of configuration updates
       • Scheduling of information distribution
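The ring-with-token approach named on this slide can be sketched as follows. This is only an illustrative, single-process model, not the GCP implementation: the `Ring` class, its `step` method and the host names are hypothetical, and a real deployment would pass the token between daemons over the network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ring:
    """A logical ring: an ordered list of member hosts sharing one token."""
    name: str
    members: List[str]
    holder: int = 0  # index of the member that currently holds the token

    def step(self, run_service: Callable[[str], None]) -> str:
        """Let the token holder run its service, then pass the token on."""
        host = self.members[self.holder]
        run_service(host)  # only the token holder is allowed to act
        self.holder = (self.holder + 1) % len(self.members)
        return host


# One full circulation of the token through a three-member ring.
ring = Ring(name="pinger", members=["host-a", "host-b", "host-c"])
executed: List[str] = []
for _ in range(len(ring.members)):
    ring.step(executed.append)
```

Because only the token holder runs the service, the members act one at a time and in a fixed order, which is what makes the ring usable as a scheduler.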

  5. Features
     • Openness: any service can be scheduled
     • Flexibility/Customizability: full and easy (re)configuration and parametrization of the service on the remote nodes
     • Robustness/Reliability: the service must be fully reliable
     • Scalability: a large number of members can be scheduled
     • Security: the distributed information and the participating member nodes must be secure
     • One time token: information distribution on demand

  6. GCP Architecture
     • Distributed architecture
       • No central information source
       • No single point of failure
       • Distributed token registration
       • Distributed functions
     • Scalability
       • Ring: logical group of services
       • Support of multiple rings
       • Possibility to build a hierarchy of rings

  7. Multi-ring support
     • Required for:
       • Scalability, through the creation of a ring hierarchy
       • Scheduling of different services (e.g. CRL update, topogrid, Iperf, etc.)
     • Multiple independent rings: danger of collisions
       • Critical for active network measurements

  8. Inter-Ring Experiment Collision
     Two measurements on the same host
     • Collision possibility:
       • In case of multiple independent rings sharing one or more hosts
       • Ring1 members {1, 2, 6, 7}
       • Ring2 members {3, 4, 5, 7}
     • Solution:
       • Inter-ring host locking
     [Figure: Ring1 and Ring2 over hosts 1 to 7, with a collision marker (!)]

  9. GCP host locking mechanism
     • Source and destination host locking
     • Conflicting experiments are delayed while the host is locked
     [Figure: hosts 1 to 7 in two rings; one experiment is marked BLOCKED, "Unable to lock destination"]
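The source and destination locking described on slides 8 and 9 can be sketched as below, using the ring memberships from slide 8 (host 7 is shared by both rings). This is a minimal sketch under a strong simplifying assumption: the lock table lives in one process, whereas GCP locks hosts across the network. `HostLockTable`, `try_lock_pair` and `unlock_pair` are hypothetical names.

```python
import threading
from typing import Dict


class HostLockTable:
    """Tracks which ring currently holds the lock on each host."""

    def __init__(self) -> None:
        self._owner: Dict[int, str] = {}
        self._mutex = threading.Lock()

    def try_lock_pair(self, ring: str, src: int, dst: int) -> bool:
        """Lock source and destination together, or lock neither."""
        with self._mutex:
            if src in self._owner or dst in self._owner:
                return False  # conflicting experiment: caller must retry later
            self._owner[src] = ring
            self._owner[dst] = ring
            return True

    def unlock_pair(self, ring: str, src: int, dst: int) -> None:
        """Release the hosts, but only if this ring is the lock owner."""
        with self._mutex:
            for host in (src, dst):
                if self._owner.get(host) == ring:
                    del self._owner[host]


locks = HostLockTable()
first = locks.try_lock_pair("Ring1", src=6, dst=7)    # Ring1 measures 6 -> 7
blocked = locks.try_lock_pair("Ring2", src=4, dst=7)  # host 7 is busy
locks.unlock_pair("Ring1", src=6, dst=7)
retried = locks.try_lock_pair("Ring2", src=4, dst=7)  # delayed experiment runs
```

Taking both locks atomically avoids the case where two experiments each hold one host and wait for the other.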

  10. GCP Robustness
      • Distributed architecture
        • No single point of failure
      • If one measurement host fails, GCP bypasses it without any impact on service periodicity
      • For a reliable service, a failure report can be created so that the task can be completed successfully later
      • Token-passing protocols face the problem of lost and/or duplicated tokens:
        • Timeout-based token recovery mechanism
        • Duplicate token elimination based on Token_ID and regenerating_host_ID
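One way to picture the two recovery mechanisms on this slide is sketched below. The slide names only the ingredients (a timeout, Token_ID and regenerating_host_ID); the concrete rules used here are assumptions made for illustration: a member regenerates the token after `timeout` without seeing it, an original token carries regenerating host ID 0, and of two concurrent tokens with the same Token_ID the one with the lower regenerating host ID survives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    token_id: int
    regenerating_host_id: int  # 0 marks the original token (assumed convention)


class Member:
    def __init__(self, host_id: int, timeout: float) -> None:
        self.host_id = host_id
        self.timeout = timeout
        self.last_seen = 0.0
        self.current: Optional[Token] = None

    def on_tick(self, now: float, token_id: int) -> Optional[Token]:
        """Timeout-based recovery: regenerate a token that seems lost."""
        if now - self.last_seen > self.timeout:
            self.last_seen = now
            self.current = Token(token_id, self.host_id)
            return self.current
        return None

    def on_token(self, token: Token, now: float) -> bool:
        """Return True to accept and forward, False to drop a duplicate."""
        cur = self.current
        concurrent = (
            cur is not None
            and cur.token_id == token.token_id
            and cur.regenerating_host_id != token.regenerating_host_id
            and now - self.last_seen <= self.timeout
        )
        if concurrent and token.regenerating_host_id > cur.regenerating_host_id:
            return False  # assumed tie-break: lower regenerating host ID wins
        self.current = token
        self.last_seen = now
        return True
```

With this tie-break every member makes the same choice between two concurrent copies, so the duplicate is dropped after at most one circulation.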

  11. GCP Security
      • Three main security issues:
        • Host Security: it must be impossible to start a non-approved service on the host, or any action that compromises host security
        • Token Security: the integrity of the token must be protected so that it cannot be modified in transit
        • User Authentication: each token is assigned an owner, and all token manipulation and service execution is based on this information
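The slide does not say which mechanism GCP uses to protect the token. As a generic illustration of the token-integrity and ownership requirements, the sketch below signs a token with an HMAC over a shared key and checks the owner before any manipulation. HMAC is a stand-in technique chosen for this example, and the token fields, the shared key and the function names are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_token(token: dict, key: bytes) -> str:
    """Compute an integrity tag over a canonical encoding of the token."""
    payload = json.dumps(token, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_token(token: dict, signature: str, key: bytes) -> bool:
    """Reject any token whose content changed after it was signed."""
    return hmac.compare_digest(sign_token(token, key), signature)


def may_modify(token: dict, user: str) -> bool:
    """Only the owner recorded in the token may manipulate it."""
    return token.get("owner") == user


key = b"shared-secret-for-illustration-only"
token = {"ring": "pinger", "token_id": 940, "owner": "alice", "command": "edg-pinger"}
tag = sign_token(token, key)

tampered = dict(token, command="some-other-command")
checks = (verify_token(token, tag, key), verify_token(tampered, tag, key))
```

Here `checks` pairs the result for the untouched token with the result for a copy whose command was altered after signing.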

  12. One Time Token
      • New feature
      • The token passes once through all member nodes
      • Used for:
        • Non-periodic/on-demand/interactive services
        • On-demand CRL update
        • Ad hoc monitoring measurements
        • On-demand/interactive active network monitoring probes
      • Plan: add the possibility to define an arbitrary number of passes
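The one-time token, and the planned extension to an arbitrary number of passes, can be sketched as a token carrying a pass counter that is retired when the counter reaches zero. This is a hypothetical single-process model; `OneTimeToken`, `deliver` and the command name are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OneTimeToken:
    command: str
    passes_left: int = 1  # 1 = classic one-time token; N = planned extension


def deliver(token: OneTimeToken, members: List[str]) -> List[Tuple[str, str]]:
    """Visit every member once per pass, then retire the token."""
    visits: List[Tuple[str, str]] = []
    while token.passes_left > 0:
        for host in members:
            visits.append((host, token.command))  # host runs the command here
        token.passes_left -= 1
    return visits


members = ["host-a", "host-b", "host-c"]
once = deliver(OneTimeToken(command="crl-update"), members)
twice = deliver(OneTimeToken(command="crl-update", passes_left=2), members)
```

Unlike a periodic token, nothing is left circulating after `deliver` returns, which suits on-demand tasks such as a CRL update.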

  13. User Interface
      • A set of utilities is provided for easy manipulation (creation, deletion, update, ...) of rings and for external GCP host (un)locking.
      • C and Java APIs for embedding GCP client functionality (ring creation, modification, etc.) are prepared.

  14. edg-gcpd-admin output
      [hary@ccwp7 bin]$ ./edg-gcpd-admin -L grid-nm.ifae.es
      GCP daemon version: 2.0.7
      Reporting node: 192.101.162.78
      Ring name: pinger, token id: 940, options = 0
      Token status: NORMAL
      Token state: WAITING
      Period 1800, Delay 60, Timeout 600
      Command: edg-pinger
      Last execution timestamp: Fri Apr 9 10:50:14 2004
      Members:
      134.158.105.254
      137.138.225.18
      141.52.160.24
      130.246.187.145
      193.136.90.138
      193.206.210.133
      131.154.99.101
      192.101.162.78
      192.16.186.229
      ...
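For scripting against this report, the ring line and the timing line shown above could be picked apart with regular expressions. The helper names below are hypothetical and the patterns are derived only from the sample output on this slide, so other daemon versions may format these lines differently.

```python
import re
from typing import Dict, Union


def parse_ring_line(line: str) -> Dict[str, Union[str, int]]:
    """Parse 'Ring name: <name>, token id: <n>, options = <n>'."""
    match = re.search(
        r"Ring name:\s*([^,\s]+),\s*token id:\s*(\d+),\s*options\s*=\s*(\d+)", line
    )
    if match is None:
        raise ValueError(f"not a ring line: {line!r}")
    return {
        "ring": match.group(1),
        "token_id": int(match.group(2)),
        "options": int(match.group(3)),
    }


def parse_timing_line(line: str) -> Dict[str, int]:
    """Parse 'Period <n>, Delay <n>, Timeout <n>' into lowercase keys."""
    pairs = re.findall(r"(Period|Delay|Timeout)\s+(\d+)", line)
    return {name.lower(): int(value) for name, value in pairs}


ring_info = parse_ring_line("Ring name: pinger, token id: 940, options = 0")
timing = parse_timing_line("Period 1800, Delay 60, Timeout 600")
```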

  15. Implementation and results
      • Most of the presented use cases are already deployed on the application testbed of the European DataGrid project.

  16. Network monitoring
      • Scheduling of a set of distributed network monitoring sensors
      • Scalability problems solved by a multilayer monitoring architecture
      • Inter-ring locking used to avoid concurrent measurements between two rings
      [Figure labels: Fr, Backbone ring, Es, It]

  17. Experiment periodicity measurement
      [Figure: period and token regeneration count plotted against periodicity [s]]

  18. Network monitoring configuration
      • Network monitoring management cannot be completely distributed. It is always centralized in one (or several) network operation centers.
      • Monitoring nodes then download their configuration files from these centers.
      • GCP makes it possible to create easily maintainable and configurable upgrade scenarios.
      • This approach is easily applicable to any service that publishes information on a central node (CA CRL updates, VO servers, etc.).

  19. Network Topology Discovery

  20. Summary
      • GCP is a generic coordination protocol for grid control and management services
      • Stability and usability were demonstrated on the use cases already implemented in the European DataGrid (EDG) project
      • Download: http://ccwp7.in2p3.fr
      • Questions: robert.harakaly@urec.cnrs.fr
