
CMS Monitoring tools

CMS Monitoring tools. Farida Fassi. November 28th, 2008. Goal: review of some CMS monitoring tools using the ARDA Dashboard; useful features of the dashboard for remote monitoring; services status for your site; SAM tests for basic diagnostics; job activities status for your site.


Presentation Transcript


  1. CMS Monitoring tools • Farida Fassi • November 28th, 2008

  2. Goal • Review of some CMS monitoring tools using the ARDA Dashboard • Useful features of the dashboard for remote monitoring • Services status for your site • SAM tests for basic diagnostics • Job activities status for your site • PhEDEx monitoring tool for transfer activities • http://cmsweb.cern.ch/phedex/

  3. Starting point • http://arda-dashboard.cern.ch/cms/ • Entry points shown on the page: Jobs, SAM

  4. SAM visualization • 4 clickable buttons • Latest Results • Historical View • Feedback → Savannah • Help → Twiki • Every page you reach has its own URL

  5. Latest results: CE view • The one that ‘comes easy’ • Screenshot callouts: Click to reset to menus; Click to see log; From GOCDB; Click to see 48h history

  6. Last 48h • This view is not clickable! • But it shows when the tests ran

  7. Select service types menu • Great instructions from the Facility Operation team • https://twiki.cern.ch/twiki/bin/view/CMS/SAMChecklist • Your favorite site will look like this (SRMv2, CE tests)

  8. SAM availability browsing • You can browse and click down to a single test, and then you get its log • Whenever the color matrix has a blue border, it is clickable

  9. SAM visualization (1)

  10. SAM visualization (2) • Click to see the log of this test

  11. Job processing on the Grid • To follow job processing and analysis on the Grid, you can use the main CMS Dashboard page: http://dashboard.cern.ch/cms • Click on the “Interactive view”

  12. Job Dashboard • Direct link: http://lxarda09.cern.ch/dashboard/request.py/jobsummary • You have a choice: 1) See all jobs submitted in the selected time window (the default). By default the time window is the last 24 hours. 2) See all jobs that terminated in the last 24 hours or are pending or running at the current moment. For this, select the ‘all jobs regardless submission time’ option.
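The two selection modes on this slide can be sketched as filters over a list of job records. This is only an illustration of the logic, not the Dashboard's implementation: the record layout (`submitted`, `finished`, `status`), the status names, and the sample jobs are all hypothetical.

```python
from datetime import datetime, timedelta


def submitted_in_window(jobs, now, hours=24):
    """Option 1: jobs submitted within the last `hours` (the default view)."""
    start = now - timedelta(hours=hours)
    return [j for j in jobs if start <= j["submitted"] <= now]


def active_or_recently_terminated(jobs, now, hours=24):
    """Option 2: jobs that terminated within the last `hours`, or that are
    pending or running right now, regardless of when they were submitted."""
    start = now - timedelta(hours=hours)
    selected = []
    for j in jobs:
        if j["status"] in ("pending", "running"):
            selected.append(j)
        elif j["finished"] is not None and start <= j["finished"] <= now:
            selected.append(j)
    return selected


# Hypothetical sample records, for illustration only.
now = datetime(2008, 11, 28, 12, 0)
jobs = [
    {"id": "a", "submitted": now - timedelta(hours=2), "finished": None, "status": "running"},
    {"id": "b", "submitted": now - timedelta(hours=40), "finished": now - timedelta(hours=3), "status": "terminated"},
    {"id": "c", "submitted": now - timedelta(hours=60), "finished": now - timedelta(hours=30), "status": "terminated"},
    {"id": "d", "submitted": now - timedelta(hours=50), "finished": None, "status": "pending"},
]

option1_ids = [j["id"] for j in submitted_in_window(jobs, now)]
option2_ids = [j["id"] for j in active_or_recently_terminated(jobs, now)]
print(option1_ids)  # ['a']
print(option2_ids)  # ['a', 'b', 'd']
```

Job "b" shows why the second option matters: it was submitted 40 hours ago, so the default view misses it, even though it finished only 3 hours ago.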

  13. Running time (wall clock, from job wrapper) • One random day: http://tinyurl.com/2l6s4s

  14. Waiting time (from submission to start of job) • One random day: http://tinyurl.com/22vknn
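The two quantities plotted on slides 13 and 14 are simple timestamp differences. A minimal sketch, assuming each job carries submission, start, and end timestamps (the variable names and the sample times are hypothetical, not taken from the Dashboard):

```python
from datetime import datetime


def waiting_time(submitted, started):
    """Time from submission to start of the job."""
    return started - submitted


def running_time(started, finished):
    """Wall-clock time between job start and job end."""
    return finished - started


# Hypothetical timestamps for a single job.
submitted = datetime(2008, 11, 28, 9, 0)
started = datetime(2008, 11, 28, 9, 25)
finished = datetime(2008, 11, 28, 11, 5)

wait_minutes = waiting_time(submitted, started).total_seconds() / 60
run_minutes = running_time(started, finished).total_seconds() / 60
print(wait_minutes)  # 25.0
print(run_minutes)   # 100.0
```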

  15. Interactive view: what info can it provide me? • All my jobs at a given site have failed: does the site have a problem? • Suppose you are having problems at FZK. Let’s check whether you are the only one. • Sort by site. • Sites showing a lot of light green or red are the ones that might be in trouble. • FZK looks suspicious in this respect, so let’s investigate further.
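The reasoning on this slide (sort by site, look for sites with many failures, check whether other users are affected too) can be sketched offline as follows. The record layout, the status names, the user names, and "SiteB" are hypothetical; only the site name FZK comes from the slide.

```python
from collections import Counter


def failure_fraction_by_site(jobs):
    """Fraction of failed jobs per site; `jobs` holds (site, user, status) tuples."""
    total = Counter()
    failed = Counter()
    for site, _user, status in jobs:
        total[site] += 1
        if status == "failed":
            failed[site] += 1
    return {site: failed[site] / total[site] for site in total}


def sites_by_failure_fraction(jobs):
    """Sites sorted from highest to lowest failure fraction."""
    fractions = failure_fraction_by_site(jobs)
    return sorted(fractions.items(), key=lambda item: item[1], reverse=True)


def users_failing_at(jobs, site):
    """Distinct users with at least one failed job at `site`."""
    return sorted({u for s, u, status in jobs if s == site and status == "failed"})


# Hypothetical job records, for illustration only.
jobs = [
    ("FZK", "alice", "failed"),
    ("FZK", "bob", "failed"),
    ("FZK", "alice", "succeeded"),
    ("SiteB", "alice", "succeeded"),
    ("SiteB", "carol", "failed"),
    ("SiteB", "bob", "succeeded"),
    ("SiteB", "bob", "succeeded"),
]

ranking = sites_by_failure_fraction(jobs)
fzk_users = users_failing_at(jobs, "FZK")
print(ranking[0][0])  # FZK
print(fzk_users)      # ['alice', 'bob']
```

If several users fail at the same site, a site problem is more likely than a problem with one user's job configuration.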

  16. Expand using bars • Left-click on the color bars to get a menu for expanding by… • Keep doing it • Note: there are more items than in the left menu, in particular by task, by submission type (CRAB server/direct), etc.

  17. Interactive view: what info can it provide me? (1) • Each column can be used for sorting • Each blue number is clickable • Get the list of jobs, Grid/CRAB IDs, times, exit codes, WorkerNode name (or IP)

  18. Interactive view: what info can it provide me? (2) • Jobs are failing with the code 50115: cmsRun did not produce a valid/readable job report at runtime • You can get the full list of job failure codes by clicking on ExitCode • Jobs indicating a site problem are all marked there
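Once the job list has been retrieved, grouping failures by exit code makes a dominant error such as 50115 stand out. A minimal sketch: the lookup table contains only the one code quoted on this slide, treating exit code 0 as success is an assumption, and the sample list of codes is hypothetical.

```python
from collections import Counter

# Only the code quoted on the slide; the full list is behind the ExitCode link.
EXIT_CODE_MEANINGS = {
    50115: "cmsRun did not produce a valid/readable job report at runtime",
}


def summarize_exit_codes(exit_codes):
    """Count non-zero exit codes (0 is assumed to mean success), most frequent first."""
    counts = Counter(code for code in exit_codes if code != 0)
    return [
        (code, n, EXIT_CODE_MEANINGS.get(code, "unknown (see the ExitCode list)"))
        for code, n in counts.most_common()
    ]


# Hypothetical exit codes for five jobs.
codes = [0, 50115, 50115, 0, 50115]
summary = summarize_exit_codes(codes)
for code, n, meaning in summary:
    print(code, n, meaning)
```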

  19. Feedback! • Link to Savannah

  20. Useful links • Commissioning Twiki: https://twiki.cern.ch/twiki/bin/view/CMS/ComputingCommissioning • Dashboard: http://arda-dashboard.cern.ch/cms • SAM: http://lxarda16.cern.ch/dashboard/request.py/samvisualization • Squid monitoring: http://belforte.home.cern.ch/belforte/misc/Squid-Hit-Summary.html • Squid details for your site: http://frontier.cern.ch/squidstats/indexcms.html • PhEDEx: http://cmsweb.cern.ch/phedex/
