EDT-WP4 monitoring group status report
Gennaro Tortone (INFN Napoli) [gennaro.tortone@na.infn.it]
DataTAG WP4 meeting, Bologna – January 14, 2003

EDT monitoring group
Participants
• Sergio Andreozzi (INFN CNAF)
• Vincenzo Ciaschini (INFN CNAF)
• Sergio Fantinel (INFN Legnaro)
• Antonia Ghiselli (INFN CNAF)
• Gennaro Tortone (INFN Napoli)
• Cristina Vistoli (INFN CNAF)
Goal
• development of a Grid monitoring tool to monitor the overall functioning of the Grid. The software should enable grid administrators to quickly identify problems and take appropriate action

Tasks
• identify the requirements for Grid monitoring
  • done – Grid monitoring analysis draft [with some LCG inputs] (available on http://gridmon.na.infn.it/lcg-edt)
• evaluation of existing monitoring tools (sensors) to use as “first monitoring layer” on each grid element
  • done – tools evaluated:
  • Ganglia gmond
    • very easy to use
    • multicast based (to gather metrics in a farm)
    • no historical archive
    • some RPM dependencies
  …

Tasks …
  • EDG-WP4 fabric-monitoring tool (fmon)
    • client-server model
    • very easy to use
    • very easy to install (one RPM, without dependencies)
    • highly customizable (time interval for each metric, …)
    • very easy to add a new metric
    • historical archive
    • database in plain-text format
• extension of the WP4 fabric-monitoring tool (fmon) to include other monitoring metrics
  • done – (all metrics added are available on http://gridmon.na.infn.it/lcg-edt)

Tasks
• GLUE schema extension to include all monitoring metrics
  • done – “host level” added to GLUE schema
• development of information providers “to fill” the GLUE host level extension – in progress
• definition of database structure to store snapshot/historical monitoring data – in progress
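
As a rough illustration of what an information provider "filling" the host level might emit, the sketch below renders a set of metrics as one LDIF entry. It is only a minimal sketch: the function name `to_ldif`, the DN, and the attribute names (`ExampleHost`, `ExampleHostLoad1Min`) are hypothetical placeholders and are not taken from the GLUE schema or from the group's providers. It also assumes plain ASCII values, so it skips the base64 encoding that LDIF requires for special characters.

```python
def to_ldif(dn, attributes):
    """Render one directory entry as LDIF text.

    `attributes` maps an attribute name to a single value or to a
    list/tuple of values; each value becomes its own 'name: value' line.
    """
    lines = [f"dn: {dn}"]
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            lines.append(f"{name}: {v}")
    # A blank line terminates an LDIF entry.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical DN and attribute names, for illustration only.
    entry = to_ldif(
        "ExampleHostName=wn01.example.org,o=grid",
        {"objectClass": ["ExampleHost"], "ExampleHostLoad1Min": 0.2},
    )
    print(entry, end="")
```

Keeping the rendering in a pure function makes it easy to check the output before it is handed to an LDAP-based service.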

[Figure: GRID monitoring architecture for LCG/EDT testbeds (author: G. Tortone, date: 18/12/2002). Labels recoverable from the diagram:
• worker node: /proc filesystem, WP4 sensor (read), WP4 monitoring agent (run, metric output)
• computing element: WP4 fmon server, farm monitoring archive (write), information providers (run, read, ldif output), GRIS (GLUE schema)
• information index: GIIS (GLUE schema)
• monitoring server: discovery service, monitoring service, web interface (ldap query)]
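
The lowest layer of the architecture, a sensor reading the /proc filesystem on a worker node and producing metric output, could be sketched as follows. This is not the WP4 sensor code: the function names and metric keys are assumptions made for illustration, and the sketch covers only the load averages that Linux exposes in /proc/loadavg.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into load-average metrics.

    The first three whitespace-separated fields are the 1, 5 and 15
    minute load averages; the remaining fields are ignored here.
    """
    fields = text.split()
    return {
        "load_1min": float(fields[0]),
        "load_5min": float(fields[1]),
        "load_15min": float(fields[2]),
    }


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the load averages from a Linux /proc filesystem."""
    with open(path) as f:
        return parse_loadavg(f.read())
```

On a Linux host, `read_loadavg()` returns a dictionary of three floats; parsing is kept separate from file access so the sensor logic can be exercised on any platform.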

Future activities
• “personal” Grid monitoring [integration with VOMS]
• job monitoring
• automatic resource discovery using MDS infrastructure and GLUE schema
• evaluation of OGSA as monitoring service
• development of a "Nagios based" Grid monitoring tool
  • scalability
  • very low intrusiveness
  • automatic resource discovery
  • fault detection and notification
  • metrics graphs
  • web interface
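
For the fault detection item, a Nagios-style check is essentially a threshold test that reports one of four states. The sketch below uses the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function name `check_threshold`, the message format, and the thresholds are hypothetical and do not describe the planned tool.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_threshold(name, value, warn, crit):
    """Map a metric value to a (return code, status message) pair.

    A missing value is reported as UNKNOWN; otherwise the value is
    compared against the critical threshold first, then the warning one.
    """
    if value is None:
        return UNKNOWN, f"{name} UNKNOWN - no data"
    if value >= crit:
        return CRITICAL, f"{name} CRITICAL - value={value}"
    if value >= warn:
        return WARNING, f"{name} WARNING - value={value}"
    return OK, f"{name} OK - value={value}"
```

A plugin script would print the message on one line and exit with the returned code, which is how Nagios decides whether to send a notification.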