SAMGrid Database Server

A. Lyon, Fermilab (for the SAMGrid Team)

Presentation Transcript


  1. A. Lyon, Fermilab (for the SAMGrid Team) SAMGrid Database Server

  2. Outline • Introduction • Issues Addressed with Redesign • Redesign Goals • New DB Server Design/Features • Outstanding Issues • Integration with SBIR II • Concluding Remarks

  3. Introduction: The SAMGrid System • SAMGrid: a general data-handling system designed for experiments with petabyte-sized datasets and widely distributed production/analysis facilities • Offers a wide variety of services, including those for: • data transfer, storage and management • process bookkeeping on distributed systems • Used by D0 and CDF; being tested for use by MINOS and CMS

  4. Introduction: DB Server Role/Usage • SAMGrid uses a central Oracle RDBMS • Most of the communication with the DB is handled by the CORBA-based DB Server • Services provided: • Cataloguing services (file metadata, event catalog, replica catalog) • Dataset services • Process accounting • Runtime support for the SAMGrid station services • Usage: about 250 million DB queries over the most recent 3-month period

  5. Issues Addressed with Redesign • Large code base: more than 27,000 lines of Python code, 350 CORBA IDL methods implemented (more than 60% obsolete) • Single-threaded code => performance issues • Removing/modifying old code is very difficult => maintenance problems, hard to adapt to DB schema changes (the latest change, which resulted from the CDF adoption of the SAMGrid system, was very complex)

  6. Redesign Goals • Update treatment of file metadata, align it with the latest DB schema • Improve code maintainability • Easier new development • Improve server performance

  7. New DB Server Design/Features • DB Server Generator • taken from the old infrastructure • handles automatic generation of the core (DB-derived) classes, each of which corresponds to one table in the DB • CORBA wrapper classes: a layer of code on top of the ORB-generated structs and exceptions, with the purpose of shielding developers from having to manipulate those structs/exceptions directly • promote code maintainability/re-use (e.g., the SAMGrid Python API uses the same code as the DB Server) • easier development
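
The idea of generating one core class per DB table can be sketched as follows. This is only an illustration under stated assumptions, not the SAMGrid generator itself: the table names, column names, and the `make_table_class` helper are hypothetical, since the slides do not give the schema or the generator's interface.

```python
# Hypothetical table descriptions; the real SAMGrid schema is not shown in the slides.
TABLES = {
    "data_files": ["file_id", "file_name", "file_size"],
    "data_tiers": ["tier_id", "tier_name"],
}


def make_table_class(table_name, columns):
    """Build a class with one attribute per column of the given table."""

    def __init__(self, **values):
        # Reject values that do not correspond to a column of this table.
        unknown = set(values) - set(columns)
        if unknown:
            raise ValueError(
                f"unknown columns for {table_name}: {sorted(unknown)}"
            )
        # Columns that were not supplied default to None.
        for col in columns:
            setattr(self, col, values.get(col))

    def as_dict(self):
        """Return the row as a plain dictionary, in column order."""
        return {col: getattr(self, col) for col in columns}

    # "data_files" becomes the class name "DataFiles".
    class_name = "".join(part.capitalize() for part in table_name.split("_"))
    return type(
        class_name,
        (object,),
        {
            "__init__": __init__,
            "as_dict": as_dict,
            "table_name": table_name,
            "columns": tuple(columns),
        },
    )


# One generated class per table, keyed by table name.
GENERATED = {name: make_table_class(name, cols) for name, cols in TABLES.items()}
```

Because the classes are derived from the table descriptions, a schema change would only require regenerating them rather than editing each class by hand.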

  8. New DB Server Design/Features • CORBA interfaces • redesigned and reorganized so that they closely match services which the DB server provides • File metadata • described as dictionaries • each file type has a certain set of required parameters • flexible/configurable system • Multithreading • should minimize performance problems
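
Describing file metadata as dictionaries, with a configurable set of required parameters per file type, might look like the sketch below. The file types, parameter names, and the `validate_metadata` function are assumptions made for illustration; the slides do not list the actual types or keys.

```python
# Hypothetical configuration: required parameters for each file type.
REQUIRED_PARAMS = {
    "detector": {"file_name", "file_size", "data_tier", "run_number"},
    "simulated": {"file_name", "file_size", "data_tier", "generator"},
}


class MetadataError(ValueError):
    """Raised when a metadata dictionary is incomplete or of unknown type."""


def validate_metadata(metadata):
    """Check that a metadata dictionary has every parameter its file type requires."""
    file_type = metadata.get("file_type")
    if file_type not in REQUIRED_PARAMS:
        raise MetadataError(f"unknown file type: {file_type!r}")
    missing = REQUIRED_PARAMS[file_type] - set(metadata)
    if missing:
        raise MetadataError(
            f"{file_type} metadata is missing: {sorted(missing)}"
        )
    # Extra, optional parameters are allowed and passed through unchanged.
    return metadata


example = {
    "file_type": "detector",
    "file_name": "run1234.raw",
    "file_size": 1024,
    "data_tier": "raw",
    "run_number": 1234,
}
validate_metadata(example)
```

Supporting a new file type in this sketch means adding one entry to `REQUIRED_PARAMS`, which is one way to read the "flexible/configurable" claim on the slide.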

  9. Outstanding Issues • Impact of the new CORBA infrastructure on server performance (an issue for large lists) • Not all functionality of the existing code has been transferred to the new server yet

  10. Deployment Path • Major changes in the core software component => deployment into production is not easy • Upgrade will be incremental, so that its impact on both users and the DH system itself should be minimal • Plan for deployment in three phases • Upgrade both experiments' DBs to the latest schema (completed in June ’04, required patching of the old code) • Deploy the new DB server in parallel with the old one, install new clients, start testing (ongoing now) • Start gradually upgrading the main production stations

  11. Integration with SBIR II • SBIR II strives to provide access to distributed databases with a single query • This would remove the SAMGrid dependence on the centralized DB • We are working on interfaces which will allow us to plug different query mechanisms into our code
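
Interfaces that allow different query mechanisms to be plugged in could take a shape like the following. This is a minimal sketch, assuming a single `find_files` operation; the class names, method names, and sample rows are all hypothetical and do not come from SAMGrid or SBIR II.

```python
from abc import ABC, abstractmethod


class QueryBackend(ABC):
    """Hypothetical interface that every query mechanism implements."""

    @abstractmethod
    def find_files(self, constraints):
        """Return the names of files whose fields match all constraints."""


class InMemoryBackend(QueryBackend):
    """Stand-in for one database; holds rows as plain dictionaries."""

    def __init__(self, rows):
        self._rows = list(rows)

    def find_files(self, constraints):
        return [
            row["file_name"]
            for row in self._rows
            if all(row.get(key) == value for key, value in constraints.items())
        ]


class FederatedBackend(QueryBackend):
    """Sends one query to several backends and merges the results."""

    def __init__(self, backends):
        self._backends = list(backends)

    def find_files(self, constraints):
        seen, merged = set(), []
        for backend in self._backends:
            for name in backend.find_files(constraints):
                if name not in seen:  # drop duplicates, keep first-seen order
                    seen.add(name)
                    merged.append(name)
        return merged


class Catalog:
    """Client code depends only on the QueryBackend interface."""

    def __init__(self, backend):
        self._backend = backend

    def files_for_tier(self, tier):
        return self._backend.find_files({"data_tier": tier})


site_a = InMemoryBackend([{"file_name": "a.raw", "data_tier": "raw"}])
site_b = InMemoryBackend([
    {"file_name": "a.raw", "data_tier": "raw"},
    {"file_name": "b.raw", "data_tier": "raw"},
    {"file_name": "c.reco", "data_tier": "reconstructed"},
])
central = Catalog(site_b)
distributed = Catalog(FederatedBackend([site_a, site_b]))
```

In this sketch `Catalog` is unaware of whether it is talking to one central database or to a federation, so swapping the query mechanism does not touch the calling code.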

  12. Concluding Remarks • SAMGrid DB Server, one of the most critical components of the system, was completely re-designed • New architecture promotes code maintainability, easier development, and better performance • New treatment of the file metadata: flexible and configurable • Deployment into production and necessary system upgrades will be done incrementally to minimize impact on users/DH system