This document outlines a framework for on-demand diagnostics to address issues within infrastructure systems. The main objective is to provide timely information about the status of services based on user reports and historical data analysis. Key components include the development of end-to-end (E2E) diagnostics tools, collection of relevant data, and the creation of "weather maps" for status analysis. It emphasizes collaboration across federations, utilizing tools such as NAGIOS and other monitoring systems to ensure effective monitoring and problem diagnosis.
TF-EMC2 WI: diagnostics
Miroslav Milinović, University Computing Centre - Srce <miro@srce.hr>
TF-EMC2 meeting, Florence, March 2007
The Task
• to be able to diagnose a problem on demand ...
• to be able to give information about the status for a specified period in the past ...
... for ...
• user reports of e2e problems
• problems with / status of part(s) of the infrastructure
• tools:
  • on-demand e2e diagnostics tools
  • “weather maps”: tools for status analysis at a given time
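To make the “weather map” idea concrete, here is a minimal sketch of a status lookup for a given time in the past. It assumes each monitored element keeps a chronological history of probe results. The names `StatusHistory` and `weather_map` are hypothetical, and the status strings only borrow the Nagios state names (OK, WARNING, CRITICAL, UNKNOWN) for illustration. Nothing here describes an existing tool.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class StatusHistory:
    """Time-ordered status records for one monitored element (hypothetical model)."""
    times: list = field(default_factory=list)     # datetimes, oldest first
    statuses: list = field(default_factory=list)  # status recorded at each time

    def record(self, when: datetime, status: str) -> None:
        # Assumption: records arrive in chronological order.
        self.times.append(when)
        self.statuses.append(status)

    def status_at(self, when: datetime) -> str:
        # Most recent record at or before `when`; "UNKNOWN" if there is none.
        i = bisect_right(self.times, when)
        return self.statuses[i - 1] if i else "UNKNOWN"


def weather_map(histories: dict, when: datetime) -> dict:
    """Snapshot of every element's status at the given time."""
    return {name: h.status_at(when) for name, h in histories.items()}
```

A call such as `weather_map(histories, datetime(2007, 3, 1, 11, 0))` would return one status per element name, which is the raw material for a map view.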
Steps
• collect info on all tools (available, used, planned, ...)
• identify:
  • the elements about which data will be collected / retrieved
  • the data that will be collected / retrieved
• build/test “weather map” and/or e2e diagnostics tool(s)
• ...
Step 1:
• REFEDS pages (13 federations listed): http://www.rediris.es/wiki/tf-emc2/index.php/FederationsReadMeFirst
• AAIEye (http://www.csc.fi/english/institutions/haka/technology/aaieye)
  • shib based, eduGAIN support planned, interface to NAGIOS monitoring framework
• EDDY: www.cmu.edu/eddy (http://middleware.internet2.edu/e2ed/)
• Internet Detective (Surfnet)
• others like:
  • http://radius-rap.a3.surf.net/reports/radius-check-eduroam/stateoverview.html
  • www.aaiedu.hr/status/
  • CESNET
  • ...
• what about your organisation / country / (con)federation?
• PerfSonar? (GEANT2)
???
• confederation level: does the federation publish the info?
  • to a central repository or on demand?
• federation level:
  • centralised monitoring server with monitoring client/probe
  • eduGAIN monitoring client?
  • use MDS of eduGAIN?
  • NAGIOS framework?
• we still need to:
  • identify elements / parts / services
  • define the data to be collected / published / retrieved
  • define the collection / analysis method
  • harmonise (meta)data about the elements
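One way to picture the "harmonise (meta)data" item is a shared record format that every federation could publish, whether to a central repository or on demand. The sketch below is only an illustration under that assumption: the class names `Element` and `StatusReport` and all field names are hypothetical and are not taken from eduGAIN, MDS or NAGIOS.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Element:
    """Hypothetical harmonised description of one monitored element."""
    element_id: str
    federation: str
    kind: str      # assumed categories, e.g. "IdP", "SP", "RADIUS"
    endpoint: str


@dataclass(frozen=True)
class StatusReport:
    """Hypothetical status record a federation would publish."""
    element: Element
    checked_at: str   # ISO 8601 timestamp, assumed to be UTC
    status: str       # e.g. "OK", "WARNING", "CRITICAL", "UNKNOWN"
    detail: str = ""

    def to_json(self) -> str:
        # asdict() converts the nested Element into a dict as well.
        return json.dumps(asdict(self), sort_keys=True)
```

Agreeing on a format like this first would let the collection method (push to a repository or pull on demand) be decided independently of the data itself.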