SAGE – Storage Accounting for Grid Environments in gLite

  1. SAGE – Storage Accounting for Grid Environments in gLite
     Presenter: Diego Scardaci, INFN (Italy)
     Fabio Scibilia, Consorzio COMETA
     EELA-2 First Conference, Bogota, Colombia, 25-27.02.2009

  2. Main Features
     • SAGE (Storage Accounting for Grid Environments)
       • Accounts for the amount of disk space used
       • Keeps track of all accesses to grid files
       • Is a software solution
       • Implements a plug-in architecture that can be adopted by different storage systems
       • Currently, plug-ins are available for DPM-based storage systems
     • Sight-on-SAGE
       • Uses information collected by SAGE systems dispersed across the grid
       • Is a web application for disk usage reporting
       • Privileges depend on the profile of the logged-in user
       • Can be used to monitor the health status of storage systems
       • Requires minimal intervention by the administrator

  3. Disk usage
     • Analogous to the consumption of electric power
       • Integral of the electric power over time
       • Consumption is evaluated periodically
     • Considers both the space and the time consumed by a file
       • Integral of the size of a file (MBytes) over time (hours)
       • Unit of measure is MBytes * hours
       • Disk consumption is sampled periodically
     • Linear function
       • space: the disk usage of several files is the sum of the disk usage of each file taken individually
       • time: the disk usage over contiguous time ranges equals the sum of the disk usage over each range taken individually
     • Simple to evaluate
       • Equals the sum of the areas of rectangles
       • Rectangle edges are obtained by intercepting file accesses
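The "sum of rectangles" evaluation can be sketched in a few lines of Python. This is only an illustration of the definition on this slide, not SAGE code: the function name `disk_usage_mb_hours` and the event format (a time-sorted list of `(timestamp, size_mb)` pairs, where each size holds until the next intercepted access) are assumptions made for the example.

```python
from datetime import datetime


def disk_usage_mb_hours(events, end):
    """Integrate file size (MBytes) over time (hours) as a sum of rectangles.

    events: time-sorted list of (timestamp, size_mb); each size is assumed
            to hold until the next event (hypothetical format).
    end:    end of the observation period; all events must precede it.
    """
    total = 0.0
    # Pair each event with the next one; the last rectangle ends at `end`.
    for (t0, size_mb), (t1, _) in zip(events, events[1:] + [(end, 0.0)]):
        hours = (t1 - t0).total_seconds() / 3600.0
        total += size_mb * hours  # area of one rectangle
    return total


# One file: created with 100 MB, grows to 250 MB at 06:00, deleted at 18:00.
file_a = [
    (datetime(2009, 2, 25, 0, 0), 100.0),
    (datetime(2009, 2, 25, 6, 0), 250.0),
    (datetime(2009, 2, 25, 18, 0), 0.0),
]
end_of_day = datetime(2009, 2, 26, 0, 0)
usage_a = disk_usage_mb_hours(file_a, end_of_day)  # 100*6 + 250*12 + 0*6
```

Because the measure is additive in space, the usage of a set of files is simply the sum of `disk_usage_mb_hours` over each file's event list.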

  4. Accounting
     • Accountable disk usage
       • Is the real disk usage ("disk energy") consumed by a file
       • In figure a), it is the area filled with slashes
     • Accounting time
       • Is the time during which SAGE was able to perform accounting
       • Coincides with the time during which the agent was up and running
       • In figure b), it is the sum of the two time ranges (40% + 20% = 60% of the considered time)
     • Accounted disk usage
       • Is the disk usage evaluated over the periods in which SAGE was able to perform accounting
       • In figure c), it is the intersection of the area in figure a) with the time ranges in figure b)
     • Sampling
       • Every midnight (GMT) SAGE samples the partial disk usage of all files stored on the storage system
       • These sampled values are used later to generate reports
       • The difference between two consecutive samples is the disk usage reported for that day
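The three quantities on this slide, plus the daily sampling rule, can be illustrated with a small Python sketch. All names (`overlap_hours`, `accounted_usage`, `accounting_fraction`, `daily_usage`) and the data layout (hours as plain numbers, sizes in MBytes) are assumptions for the example; the numbers mirror the 40% + 20% case in figure b).

```python
def overlap_hours(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time ranges (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def accounted_usage(rectangles, uptime):
    """Disk usage (MB*h) restricted to the ranges in which the agent was up.

    rectangles: (start_h, end_h, size_mb) pieces of a file's size history
    uptime:     non-overlapping (start_h, end_h) ranges of agent uptime
    """
    return sum(
        size_mb * overlap_hours(start, end, up_start, up_end)
        for start, end, size_mb in rectangles
        for up_start, up_end in uptime
    )


def accounting_fraction(uptime, period_start, period_end):
    """Share of the considered period during which accounting was possible."""
    covered = sum(
        overlap_hours(up_start, up_end, period_start, period_end)
        for up_start, up_end in uptime
    )
    return covered / (period_end - period_start)


def daily_usage(samples):
    """Per-day usage from cumulative values sampled at consecutive midnights."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# A 200 MB file over a 100-hour period; the agent was up for 40 h and 20 h.
rectangles = [(0, 100, 200.0)]
uptime = [(10, 50), (70, 90)]
accountable = accounted_usage(rectangles, [(0, 100)])  # whole period
accounted = accounted_usage(rectangles, uptime)        # uptime only
fraction = accounting_fraction(uptime, 0, 100)
```

In this example the accounted usage is 60% of the accountable usage only because the file size is constant; with a varying size the two ratios would generally differ.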

  5. Architecture
     • sage agent (one for each disk server)
       • At start-up, scans the DPM partitions
       • Afterwards, intercepts all file access events by parsing log files
       • Creates and queues raw messages to be sent to the server
       • Parsers are plug-in executables, so they can be implemented for different storage systems and protocols
     • sage server (one for each SE)
       • Listens for incoming connections from agents
       • Processes raw messages and creates the data to store in the SAGE Database
       • Performs accounting operations periodically (every night)
       • Communicates with the local Mass Storage System (MSS) through plug-ins; at the moment, a plug-in is available for DPM
     • sos probe (one for each SE)
       • Independent of the SE it is installed on
       • Generates reports on demand by accessing the SAGE Database
     • Sight-on-SAGE (one for each e-Infrastructure)
       • Web application for reporting
       • Manages authentication and authorization according to a list of user profiles (anonymous, authenticated user, VO-admin, SE-admin, web-admin)
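The agent's parse-and-queue behaviour can be pictured with the following Python sketch. It is a toy model under stated assumptions, not the SAGE implementation: the slides say parsers are separate plug-in executables, whereas here they are in-process functions in a registry; the log line format, the message fields, and the names `parse_example_line`, `PARSERS`, and `Agent` are all invented for illustration.

```python
import queue
from typing import Callable, Dict, Optional

# Hypothetical raw message: a plain dict. The real SAGE message format
# is not described in the slides.
Parser = Callable[[str], Optional[dict]]


def parse_example_line(line: str) -> Optional[dict]:
    """Parse a made-up log line: '<epoch> <operation> <path> <size_bytes>'."""
    parts = line.split()
    if len(parts) != 4:
        return None  # not a file access event in this toy format
    epoch, operation, path, size = parts
    try:
        return {"time": int(epoch), "op": operation, "path": path, "size": int(size)}
    except ValueError:
        return None


# One parser per storage system or protocol; only a toy one is registered.
PARSERS: Dict[str, Parser] = {"example": parse_example_line}


class Agent:
    """Turns log lines into raw messages and queues them for the server."""

    def __init__(self, parser_name: str):
        self.parser = PARSERS[parser_name]
        self.outbox = queue.Queue()

    def feed(self, lines):
        for line in lines:
            message = self.parser(line)
            if message is not None:
                self.outbox.put(message)

    def drain(self):
        """Return all queued messages, e.g. once the server is reachable."""
        messages = []
        while not self.outbox.empty():
            messages.append(self.outbox.get())
        return messages


agent = Agent("example")
agent.feed(["1235520000 write /dpm/example/file1 1048576", "unrelated log noise"])
pending = agent.drain()
```

Keeping the parser behind a narrow interface is what lets the same agent serve different storage systems: only the parser changes, while queuing and delivery stay the same.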

  6. Sight-on-SAGE: web reporting
     • Disk usage reporting
       • Disk usage detailed per storage, VO, user, file, and day (5 nested levels)
       • Depending on user credentials, reports can be navigated at different levels
     • Status reporting
       • Reports on the current status of the files SAGE is accounting for
     • File browsing
       • Users can browse the SEs on which SAGE is installed and look at the current consumption of disk space
     • Activity reports
       • Reports on the accesses to files
       • Available to file owners only
     • Pool reporting
       • Through Sight-on-SAGE, SE administrators can check the status of the SAGE systems installed across the grid infrastructures
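The five nested reporting levels amount to grouping the same per-day records by a progressively longer key. A minimal Python sketch follows; the record layout, the field name `mb_hours`, the storage and user names, and the `aggregate` function are assumptions made for the example and do not reflect the actual SAGE Database schema.

```python
from collections import defaultdict

# Reporting levels, from coarsest to finest, as listed on the slide.
LEVELS = ("storage", "vo", "user", "file", "day")


def aggregate(records, depth):
    """Sum disk usage (MB*h) grouped by the first `depth` reporting levels."""
    keys = LEVELS[:depth]
    totals = defaultdict(float)
    for record in records:
        totals[tuple(record[key] for key in keys)] += record["mb_hours"]
    return dict(totals)


# Hypothetical per-file, per-day records.
records = [
    {"storage": "se01", "vo": "cometa", "user": "alice", "file": "a.dat",
     "day": "2009-02-25", "mb_hours": 2400.0},
    {"storage": "se01", "vo": "cometa", "user": "alice", "file": "a.dat",
     "day": "2009-02-26", "mb_hours": 1200.0},
    {"storage": "se01", "vo": "cometa", "user": "bob", "file": "b.dat",
     "day": "2009-02-25", "mb_hours": 400.0},
]

per_storage = aggregate(records, 1)  # coarsest view
per_user = aggregate(records, 3)     # storage -> VO -> user
```

Navigating a report "at different levels" then corresponds to calling `aggregate` with a larger `depth`, up to the level the user's credentials allow.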

  7. Sight-on-SAGE: user roles
     • Anonymous
       • A user with no valid certificate
       • Can browse only the public area of the Sight-on-SAGE web portal
     • Authenticated grid user
       • Has a valid certificate
       • Can generate reports, at the maximum level of detail, on the disk consumption due to the files he/she owns
     • Storage administrator
       • Has a valid certificate and is registered as SE-admin on one or more storage systems on which SAGE is installed and running
       • Can generate reports on the disk consumption due to files residing on the storage systems he/she administers
       • Can also generate reports on the health status of the SAGE systems he/she is responsible for
     • VO administrator
       • Has a valid certificate and has the VO-admin role for one or more VOs
       • Can generate reports on the disk consumption due to files owned by members of the VO(s) he/she administers
     • Web administrator
       • Has a valid certificate and is listed in the Sight-on-SAGE configuration file
       • Cannot generate reports, but can access logs and debug information
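The role rules above can be read as a single access check on each file. The Python sketch below is one possible reading, not the portal's actual logic: the `User` and `FileRecord` classes and the `can_report_on` function are hypothetical, and treating the web administrator as unable to report even on owned files is an assumption drawn from "cannot generate reports".

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    has_certificate: bool = False
    se_admin_of: frozenset = frozenset()  # storage systems administered
    vo_admin_of: frozenset = frozenset()  # VOs administered
    is_web_admin: bool = False            # listed in the portal configuration


@dataclass
class FileRecord:
    owner: str
    vo: str
    storage: str


def can_report_on(user: User, record: FileRecord) -> bool:
    """Decide whether `user` may see disk usage reports for `record`."""
    if not user.has_certificate:
        return False  # anonymous: public area only
    if user.is_web_admin:
        return False  # assumption: web admins get logs and debug info only
    return (
        record.owner == user.name              # authenticated grid user
        or record.storage in user.se_admin_of  # storage administrator
        or record.vo in user.vo_admin_of       # VO administrator
    )


record = FileRecord(owner="alice", vo="cometa", storage="se01")
alice = User("alice", has_certificate=True)
bob = User("bob", has_certificate=True)
se_admin = User("carol", has_certificate=True, se_admin_of=frozenset({"se01"}))
vo_admin = User("dave", has_certificate=True, vo_admin_of=frozenset({"cometa"}))
guest = User("guest")
```

With these definitions, `alice`, `se_admin`, and `vo_admin` can report on `record`, while `bob` and `guest` cannot.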

  8. Web reporting: an example
     Annotations on the example report page:
     • Storages this user is responsible for. This menu is visible to SE-admins only; in this case the user is generating a report
     • Storages for which report generation is requested
     • Time range the user is interested in
     • Disk usage at different levels of detail
     • Flag to detail user by user
     • Details on storage
     • Details on VO cometa
     • Details user by user
     • Cumulative for all storages (at the bottom of the page)

  9. Questions?