
Monitoring for GridNNN project

Monitoring for GridNNN project. Sergey Belov, LIT JINR, 15 September, NEC’2011, Varna, Bulgaria. GridNNN project (I): grid support for the national nanotechnology network of Russia.


Presentation Transcript


  1. Monitoring for GridNNN project
     Sergey Belov, LIT JINR
     15 September, NEC’2011, Varna, Bulgaria

  2. GridNNN project (I)
     • Grid support for the national nanotechnology network of Russia
     • To provide science and industry with effective access to distributed computational, informational and networking facilities
     • Expecting a breakthrough in nanotechnologies
     • Supported by a special federal program
     • Main technical points
       • based on a network of supercomputers (about 15-30)
       • has two grid operations centers (main and backup)
       • is a set of grid services with a unified interface
       • partially based on Globus Toolkit 4
     S. Belov, GridNNN monitoring

  3. GridNNN project (II)
     • Main aim
       • integration of small and medium supercomputers into a unified distributed computing environment
     • Highly heterogeneous grid environment (hardware, software)
     • Oriented to parallel tasks rather than single batch tasks
     • Workflow management
     • Jobs consist of tasks
     • Follows core OGSA principles
     • GSI-based security model
     • RESTful grid services

  4. GridNNN architecture layers
     Based on the report of A. Kryukov et al., Architecture of GridNNN, GRID’2010

  5. Core grid services
     Based on the report of A. Kryukov et al., Architecture of GridNNN, GRID’2010
     • WebUI server
     • Resource Broker/metascheduler + Workflow management (RESTful)
     • Information Service (RESTful / WS MDS)
     • Monitoring & Accounting
     • Registration service (RESTful)
     • GSI services
       • CA, MyProxy, VOMS
     • GridFTP servers

  6. Monitoring goals
     • State of sites and services
       • Availability
       • Real operational state
     • Monitoring of users' jobs and tasks
     • Keeping a history of various system parameters
     • Information representation
       • Overall state of the infrastructure
       • Running jobs and tasks
       • Separate sites and services (real-time and history)
       • Visualization of job events

  7. Monitoring of resources
     • State of computational resources by site (based on data from the information index(es))
       • Slots available for tasks
       • Jobs (total on site), jobs belonging to GridNNN
     • Structure and properties of clusters
       • Subclusters, nodes, slots, operating system, architecture
       • Application software
       • Supported VOs (with ACLs, Access Control Lists)
     • Monitoring of jobs running on sites (based on information from Pilot servers)
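
The resource properties listed on this slide can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names and the `free_slots` helper are hypothetical and are not taken from the GridNNN information schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subcluster:
    # Hypothetical fields mirroring the properties named on the slide.
    name: str
    operating_system: str
    architecture: str
    nodes: int
    total_slots: int
    free_slots: int


@dataclass
class Site:
    name: str
    supported_vos: List[str] = field(default_factory=list)
    subclusters: List[Subcluster] = field(default_factory=list)

    def free_slots(self) -> int:
        """Slots currently available for tasks, summed over all subclusters."""
        return sum(sc.free_slots for sc in self.subclusters)


# Example with made-up values.
example_site = Site(
    name="example-site",
    supported_vos=["example-vo"],
    subclusters=[
        Subcluster("sc1", "Linux", "x86_64", nodes=8, total_slots=64, free_slots=4),
        Subcluster("sc2", "Linux", "x86_64", nodes=4, total_slots=32, free_slots=6),
    ],
)
```

A per-site summary page would then only need `example_site.free_slots()` and the list of supported VOs.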

  8. Simple functional tests of services
     • Goal: checking that services are operating
     • Simple tests for services registered in the Service for Registration of Resources and Services
       • Connection to the declared port of the machine (plain or secured, depending on the specified protocol)
       • Information requests to some services
     • Separate test scenarios for MDS information indexes and the Service for Registration of Resources and Services
     • Information Web page with the history of functional test results
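
The port-connection test described on this slide could look roughly like the sketch below. It is a minimal illustration, not the GridNNN test code: `ProbeResult` and `probe_service` are hypothetical names, and the secured case assumes that a completed TLS handshake is enough to count as success (certificate verification is switched off here, which a real GSI-aware test would not do).

```python
import socket
import ssl
from dataclasses import dataclass


@dataclass
class ProbeResult:
    host: str
    port: int
    secured: bool
    ok: bool
    detail: str


def probe_service(host: str, port: int, secured: bool = False,
                  timeout: float = 5.0) -> ProbeResult:
    """Connect to the declared port; for secured protocols also do a TLS handshake."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if secured:
                # Assumption: only the handshake is checked. Verification is
                # disabled because the test host may not trust the grid CA.
                context = ssl.create_default_context()
                context.check_hostname = False
                context.verify_mode = ssl.CERT_NONE
                with context.wrap_socket(sock, server_hostname=host):
                    pass
        return ProbeResult(host, port, secured, True, "connected")
    except OSError as exc:  # covers refused connections, timeouts and TLS errors
        return ProbeResult(host, port, secured, False, str(exc))
```

Storing each `ProbeResult` with a timestamp would give the history that the results page displays.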

  9. Accounting and job monitoring
     • Goal: to get information, both real-time and historical, on resource utilization and on jobs running on the GridNNN infrastructure (by users, VOs, sites)
     • Information sources: Pilot servers, GRAMs and local resource managers
     • Collecting data on jobs and tasks in the system
       • Timestamps of all job events, actual consumed CPU time
     • Accounting information reports in different views:
       • by sites, VOs and single users
     • Aggregation of actual job execution time from all sites
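
The "different views" of accounting data amount to grouping the same task records by site, VO or user. A minimal sketch of that aggregation follows; the `TaskRecord` shape and the function name are assumptions made for illustration, not the actual GridNNN record format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TaskRecord:
    # Hypothetical record: one finished task as reported by a site.
    site: str
    vo: str
    user: str
    cpu_seconds: float


def aggregate_cpu_hours(records: Iterable[TaskRecord], key: str) -> Dict[str, float]:
    """Sum consumed CPU time in hours, grouped by 'site', 'vo' or 'user'."""
    if key not in ("site", "vo", "user"):
        raise ValueError(f"unsupported grouping key: {key!r}")
    totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        totals[getattr(rec, key)] += rec.cpu_seconds / 3600.0
    return dict(totals)


# Example with made-up records from two sites.
records = [
    TaskRecord("site-a", "vo-1", "alice", 3600.0),
    TaskRecord("site-b", "vo-1", "alice", 7200.0),
    TaskRecord("site-a", "vo-2", "bob", 1800.0),
]
by_user = aggregate_cpu_hours(records, "user")
by_site = aggregate_cpu_hours(records, "site")
```

Grouping by `"user"` here combines one user's time across both sites, which is the cross-site aggregation the last bullet refers to.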

  10. GridNNN accounting
     • Gathering statistics on CPU time consumed by users and VOs
       • In plain hours, later with allowance for the productivity of the computational system
     • Displaying the statistics of CPU resource usage
     • Different report kinds: for users, VO managers, site admins, GridNNN project admins
     • Access roles for statistics, to protect private information of users and VOs
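
Two ideas on this slide lend themselves to a short sketch: scaling plain CPU hours by a per-site productivity factor, and restricting what each role can see. Everything below is hypothetical, including the role names, the visibility rules and the factor table; the slides do not specify how GridNNN implements either.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UsageRow:
    site: str
    vo: str
    user: str
    cpu_hours: float


def normalized_hours(row: UsageRow, site_factor: Dict[str, float]) -> float:
    """Scale plain CPU hours by a per-site productivity factor (default 1.0)."""
    return row.cpu_hours * site_factor.get(row.site, 1.0)


def visible_rows(rows: Iterable[UsageRow], role: str, subject: str) -> List[UsageRow]:
    """Return the rows a role may see.

    Assumed rules: a user sees own rows, a VO manager sees the VO's rows,
    a site admin sees the site's rows, a project admin sees everything.
    """
    if role == "project_admin":
        return list(rows)
    field_by_role = {"user": "user", "vo_manager": "vo", "site_admin": "site"}
    field = field_by_role.get(role)
    if field is None:
        raise ValueError(f"unknown role: {role!r}")
    return [r for r in rows if getattr(r, field) == subject]


# Example with made-up values.
rows = [
    UsageRow("site-a", "vo-1", "alice", 10.0),
    UsageRow("site-b", "vo-1", "bob", 4.0),
    UsageRow("site-a", "vo-2", "carol", 2.0),
]
factors = {"site-a": 1.5}  # hypothetical productivity factor for site-a
```

With these inputs, `visible_rows(rows, "vo_manager", "vo-1")` returns the first two rows, and `normalized_hours(rows[0], factors)` scales the first row's hours by 1.5.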

  11. Accounting and jobs monitoring: screenshots

  12. Monitoring and accounting information flows
     [Diagram. Labels recovered from the slide: Infosys central information index; monitoring and accounting data storage; information collector; Pilot job management services; monitoring website; monitoring data provisioning (Web Services); accounting; information publisher; functional tests of the services]

  13. GridNNN centers on the map
     http://mon.ngrid.ru
     • More than 15 resource centers at the moment in different regions of Russia
     • RRC KI, «Chebyshev» (MSU), IPCP RAS, CC FEB RAS, ICMM RAS, JINR, SINP MSU, PNPI, KNC RAS, SPbSU, SPII RAS and others

  14. Infrastructure operation visualization with Google Earth

  15. Conclusion
     • The GridNNN project was successfully finished this summer
     • The resulting software and the infrastructure created are to be used for developing the Russian Grid Network project
     • Fully operational monitoring and accounting tools are in production
     • Further user interface improvements are planned within the Russian Grid Network project
