
AFFAIR a flexible fabric and application information recorder

Tome Antičić, Ruđer Bošković Institute, Zagreb, Croatia; ALICE, CERN.


Presentation Transcript


  1. Tome Antičić, Ruđer Bošković Institute, Zagreb, Croatia; ALICE, CERN. AFFAIR: a flexible fabric and application information recorder

  2. Why/What is AFFAIR? [Diagram: the ALICE data-acquisition chain run by DATE. Detectors (muon, trigger detectors, inner tracking system, TPC, TRD, particle identification) send data, with multi-event buffers and trigger data, over DDLs to RORCs, then to LDCs, then through a Gigabit switch to GDCs and on to PDS. The trigger system (L0, L1, L2, L3) and the EDM steer the flow. Values shown on the diagram: 216; 1.2 µsec, 5.5 µsec, 88.0 µsec; x 435, x 334, x 278, x 50; 100 Mb/s, 100 Mb/s, 60 Mb/s, 40 Mb/sec; 1.25 GB/s.] Legend: DDL = Detector Data Link • RORC = Read-Out Receiver Card • LDC = Local Data Concentrator • GDC = Global Data Collector • EDM = Event Destination Manager • PDS = Permanent Data Storage. AFFAIR is also able to run in stand-alone mode (no DATE).

  3. Requirements • Monitor system performance (bandwidth, CPU, disk usage, …) • Monitor DATE performance (LDC/GDC/DDL bandwidth, events recorded, …) • Updates needed every 10 sec or even less • Should be as “invisible” as possible • No growing logfiles (or better yet, none) on monitored nodes • Not CPU intensive • Not network intensive • Web access to processed, real-time data in the form of graphs, histograms, … • Scalable: should work equally well for 10 as for 1000 computers • All monitored data should be permanently stored for offline analysis. It has to work with no lost data, no crashes and no maintenance, so some choices were made which may not be optimal but get the job done.

  4. AFFAIR structure [Diagram: on each of the ~100-1000 monitored nodes, a system collector reads /proc and a DATE collector reads DATE shared memory; the data go via DIM to the AFFAIR Monitor, which writes round robin files (rrd 1, rrd 2, rrd 3); a ROOT program reads the files and creates plots, which are served through Web/Apache/PHP.] • Round robin is an excellent way to write/read files quickly and easily, with no performance loss • Works with a fixed amount of data (fixed time depth), so the file size does not change
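
The fixed-depth property of round robin storage can be illustrated with a short Python sketch. This is not AFFAIR's storage code; the class name `RoundRobinStore` and its methods are hypothetical and exist only to show why the size never grows: once the store is full, each new sample overwrites the oldest one.

```python
from collections import deque


class RoundRobinStore:
    """Hypothetical fixed-depth store: when full, a new sample replaces the oldest."""

    def __init__(self, depth):
        # deque with maxlen discards the oldest entry automatically on append
        self.samples = deque(maxlen=depth)

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))

    def values(self):
        return [value for _, value in self.samples]


store = RoundRobinStore(depth=3)
for t, rate in enumerate([10.0, 20.0, 30.0, 40.0]):
    store.add(t, rate)
print(store.values())  # only the three most recent samples remain
```

The depth fixes both the time span covered and the storage size, which is the trade-off the slide describes.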

  5. Snapshot plots

  6. Time dependent plots [Plot: rates (kB/sec) for the last 24 hours for some GDC nodes] • Full lines: average • Dashed lines: max values [Plot: rates (kB/sec) for the last 7 days for some GDC nodes]
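
Plots covering 24 hours or 7 days cannot show every raw sample, so each plotted point stands for a time bin, with an average (full line) and a maximum (dashed line). A minimal sketch of that reduction follows. The function name `consolidate` and the choice of bins by sample count are assumptions made for illustration, not a description of how AFFAIR does it.

```python
def consolidate(samples, bin_size):
    """Reduce raw samples to one (average, maximum) pair per bin of bin_size samples."""
    result = []
    for start in range(0, len(samples), bin_size):
        chunk = samples[start:start + bin_size]
        result.append((sum(chunk) / len(chunk), max(chunk)))
    return result


# Illustrative rates in kB/sec, not measured data
rates_kb_per_sec = [100.0, 300.0, 200.0, 400.0, 50.0, 150.0]
bins = consolidate(rates_kb_per_sec, bin_size=2)
for average, maximum in bins:
    print(f"average={average:.1f} kB/sec  max={maximum:.1f} kB/sec")
```

Keeping the maximum next to the average preserves short bursts that averaging alone would smooth away.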

  7. Time dependent plots II

  8. Web interface • Web interface written using PHP/JavaScript • Generated completely automatically • New variables and monitored sets are automatically reflected in the plots

  9. Web interface II

  10. AFFAIR structure [Diagram: on each of the ~100-1000 monitored nodes, a system collector reads /proc and a DATE collector reads LDC/GDC shared memory; the data go via DIM to the AFFAIR Monitor, which writes ROOT files (root 1, root 2, root 3); a ROOT program reads the ROOT files and creates plots, which are served through Web/Apache/PHP.] • With hundreds of asynchronous storage calls every few seconds, there is one ROOT file per node
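
The kind of low-overhead reading a /proc-based system collector can do is sketched below in Python. It is an illustration only: the slides do not say which /proc files AFFAIR reads, so the use of /proc/net/dev, the function names `parse_net_dev` and `rate_kb_per_sec`, and the sample counter values are all assumptions. The sketch parses two snapshots of the byte counters and turns the difference into a receive rate.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        name, _, fields = line.partition(":")
        if not fields:
            continue
        values = fields.split()
        # field 0 is received bytes, field 8 is transmitted bytes
        counters[name.strip()] = (int(values[0]), int(values[8]))
    return counters


def rate_kb_per_sec(old, new, seconds):
    """Receive rate per interface between two snapshots, in kB/sec."""
    return {name: (new[name][0] - old[name][0]) / 1024 / seconds
            for name in new if name in old}


# Made-up snapshots taken 10 seconds apart (assumed values, not real data)
HEADER = (
    "Inter-|   Receive                    |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
before = parse_net_dev(HEADER + "  eth0: 1024000 100 0 0 0 0 0 0 512000 80 0 0 0 0 0 0\n")
after = parse_net_dev(HEADER + "  eth0: 1126400 180 0 0 0 0 0 0 563200 120 0 0 0 0 0 0\n")
rates = rate_kb_per_sec(before, after, seconds=10)
print(rates)
```

On a live Linux node the snapshots would come from reading the file itself, for example `parse_net_dev(open("/proc/net/dev").read())`. Reading a small text file every few seconds keeps the collector cheap in CPU and leaves no logfile on the node.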

  11. Offline analysis • Detailed histograms (aggregate and individual) can now also be created

  12. ROOT GUI for configuration/operation

  13. ROOT GUI for monitoring

  14. Graph configuration • All graphs are created using one configuration file • It completely defines units, labels, whether graphs are aggregated and whether graphs are superimposed • Thus no code intervention is needed to create the plots • New monitored variables can be added and configured easily. A GUI is in progress, but it is not easy: as far as I am aware, rows of data cannot easily be added
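
To illustrate how one configuration file can define plots without code changes, here is a small Python sketch. The INI-style layout, the section name `gdc_rate`, the keys `label`, `units`, `aggregate` and `superimposed`, and the function `load_graph_definitions` are all hypothetical. The slides do not show AFFAIR's actual file format.

```python
import configparser

# Hypothetical configuration text; AFFAIR's real format is not shown in the slides
EXAMPLE_CONFIG = """
[gdc_rate]
label = GDC recording rate
units = kB/sec
aggregate = yes
superimposed = no
"""


def load_graph_definitions(text):
    """Turn each config section into a plain dict describing one graph."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    graphs = {}
    for name in parser.sections():
        section = parser[name]
        graphs[name] = {
            "label": section.get("label", fallback=name),
            "units": section.get("units", fallback=""),
            "aggregate": section.getboolean("aggregate", fallback=False),
            "superimposed": section.getboolean("superimposed", fallback=False),
        }
    return graphs


graphs = load_graph_definitions(EXAMPLE_CONFIG)
print(graphs["gdc_rate"])
```

With this layout, adding a newly monitored variable means adding one more section to the file, and the plotting code stays untouched.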

  15. Conclusion • AFFAIR successfully monitors hundreds of nodes • Field tested in ALICE Data Challenges • ROOT is a huge part of it • It is a work in progress: • Much more detailed offline analysis • Add a feature to view performance data/plots on mobile phones/Palm Pilots • A lot more work on the GUI • Add high/low warnings • … http://www.cern.ch/affair
