
POOL Data Storage, Cache and Conversion Mechanism

Outline: Motivation, Data access, Generic model, Experience & Conclusions. Authors: D. Düllmann, M. Frank, G. Govi, I. Papadopoulos, S. Roiser.


Presentation Transcript


  1. POOL Data Storage, Cache and Conversion Mechanism
     • Motivation
     • Data access
     • Generic model
     • Experience & Conclusions
     D. Düllmann, M. Frank, G. Govi, I. Papadopoulos, S. Roiser
     CHEP 2003, March 22-28, 2003

  2. Motivation
     • Physics software should be independent of the underlying data storage technology
     • Data of different nature has to be accessed
       - Event data, detector data, statistical data, …
     • The data sizes: O(10^6) to O(10^13) bytes/experiment/year
     • The access patterns differ
     • It is unclear how these data will be stored
     • Locking into one technology may be a disadvantage
     Need for a technology-free data storage and data access mechanism

  3. Strategy
     • Hide any technology details from the clients
     • Clients deal with objects or object references
     • Hide all cache/persistency specific details
     • No compromise on transient data representation due to technology details
     • Each technology can be handled transparently
     • Transient representation sufficient for persistency
     • Ensure independence of experiment framework
     • Run-time binding of transient data to the underlying technology
     • Need for object description: “dictionary”

  4. Data Access
     • Clients access data through references (Ref<T>)
     • The Data Service manages the object cache
     • Different contexts: event data, detector data, other
     [Diagram: several clients hold Ref<T> references into three Data Service / Object Cache pairs]

  5. Cache Access Through References
     • References know about the Data Cache
     • 2 operation modes:
       - Clear at checkpoint
       - Auto-clear with reference count
     • References are implemented as smart pointers
     • Use cache manager for “load-on-demand”
     • Use the object key of the cache manager
     [Diagram: a Ref<T> holds a reference to the Cache Manager and an object reference in the Cache Manager; dereferencing yields a pointer to the object]
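
The load-on-demand behaviour described on slide 5 can be sketched roughly as follows. This is a minimal illustration, not the POOL implementation: `ObjectCache`, its `Loader` callback, this simplified `Ref` and the `Track` payload are all hypothetical names, and only the "clear at checkpoint" mode is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache manager: maps an object key to a cached object and
// fetches missing objects through a user-supplied loader callback.
template <typename T>
class ObjectCache {
public:
    using Loader = std::function<std::shared_ptr<T>(const std::string&)>;

    explicit ObjectCache(Loader loader) : loader_(std::move(loader)) {}

    // Return the cached object, loading it on first access.
    std::shared_ptr<T> get(const std::string& key) {
        auto it = objects_.find(key);
        if (it != objects_.end()) return it->second;
        std::shared_ptr<T> obj = loader_(key);
        ++loads_;
        objects_.emplace(key, obj);
        return obj;
    }

    // "Clear at checkpoint": drop everything the cache currently holds.
    void clear() { objects_.clear(); }

    int loads() const { return loads_; }
    std::size_t size() const { return objects_.size(); }

private:
    Loader loader_;
    std::unordered_map<std::string, std::shared_ptr<T>> objects_;
    int loads_ = 0;
};

// Hypothetical smart-pointer reference: it knows its cache and the object
// key, and resolves the object only when it is dereferenced.
template <typename T>
class Ref {
public:
    Ref(ObjectCache<T>& cache, std::string key)
        : cache_(&cache), key_(std::move(key)) {}

    // The returned pointer/reference stays valid until the cache is cleared.
    T& operator*() { return *resolve(); }
    T* operator->() { return resolve().get(); }

private:
    std::shared_ptr<T> resolve() {
        std::shared_ptr<T> obj = cache_->get(key_);
        assert(obj && "loader returned no object");
        return obj;
    }

    ObjectCache<T>* cache_;
    std::string key_;
};

// Hypothetical payload type used only for illustration.
struct Track {
    double pt;
};
```

Creating a `Ref<Track>` costs nothing; the loader runs on the first dereference, and again only after the cache has been cleared at a checkpoint.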

  6. Cache Access by Smart Pointer
     [Diagram: a Ref<T> holds a cache reference into the Data Service object cache. Each cache entry pairs an object pointer with a Token; the Token carries the storage type, the object type and the persistent reference. The Data Service is connected to the File Catalog and the Persistency Service.]

  7. Generic Persistent Model
     [Diagram: Transient side: objects & pointers. Persistent side: objects, object IDs, collections & DBs. Mapping between the two: C++ pointer >> object ID]

  8. Access to the Data
     [Diagram: Client -> Ref<T> -> Data Service (Data Cache) -> PersistencyService (technology dispatcher) -> Conversion Service (common handling) -> Storage Service]
     Labelled steps:
       (1) read(…)
       (2) Look-up
       (3) Load request
       (5) Register-Object-References
     Annotations: “Try to access an object's data”; “Will be unsuccessful, requested object is not present”
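
The "technology dispatcher" role of the PersistencyService on slide 8 can be illustrated with a small sketch, assuming each storage technology registers one conversion callback under its storage type name. The `Token` fields, `registerTechnology` and `load` are hypothetical names chosen for this example, and the loaded "object" is reduced to a string.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical token: just enough information to locate one object.
struct Token {
    std::string storageType;  // e.g. "ROOT"
    std::string database;
    std::string container;
    long entryId;
};

// Hypothetical persistency service: forwards a load request to the
// conversion service registered for the token's storage type.
class PersistencyService {
public:
    using Converter = std::function<std::string(const Token&)>;

    void registerTechnology(const std::string& storageType, Converter converter) {
        assert(converter && "converter must be callable");
        converters_[storageType] = std::move(converter);
    }

    // Returns false when no conversion service handles the storage type;
    // otherwise fills `out` with the converted object.
    bool load(const Token& token, std::string& out) const {
        auto it = converters_.find(token.storageType);
        if (it == converters_.end()) return false;
        out = it->second(token);
        return true;
    }

private:
    std::map<std::string, Converter> converters_;
};
```

Because the client only ever passes a token, adding a technology means registering one more converter; no client code changes.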

  9. Storing objects
     [Diagram: Client -> Ref<T> (mark for write) -> Data Service (Object Cache) -> PersistencyService (technology dispatcher) -> Conversion Service (common handling; map objects and write) -> Storage Service]
     Start Transaction:
         cache.startTransaction(...)
         Ref<T>.mark_write(placement)
         ...
         Ref<T>.mark_write(placement)
     Commit Transaction:
         cache.endTransaction(...,COMMIT)
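
The write sequence on slide 9 (start a transaction, mark objects for writing, commit) might look like the sketch below. It assumes that marked objects are buffered until the transaction ends; `WriteCache`, `Placement`, `markWrite` and `TransactionEnd` are hypothetical stand-ins for the calls named on the slide, and objects are reduced to strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class TransactionEnd { Commit, Rollback };

// Hypothetical placement hint: where a marked object should be written.
struct Placement {
    std::string database;
    std::string container;
};

// Hypothetical cache that buffers marked objects until the transaction ends.
class WriteCache {
public:
    void startTransaction() {
        assert(!active_ && "transaction already open");
        active_ = true;
    }

    // Mark one object (here simply a string payload) for writing.
    void markWrite(const std::string& payload, const Placement& where) {
        assert(active_ && "markWrite outside a transaction");
        pending_.push_back({where, payload});
    }

    // On Commit, move all pending objects to the store; on Rollback,
    // discard them. Returns the number of objects written.
    std::size_t endTransaction(TransactionEnd how) {
        assert(active_ && "no open transaction");
        std::size_t written = 0;
        if (how == TransactionEnd::Commit) {
            for (Record& rec : pending_) {
                stored_.push_back(std::move(rec));
                ++written;
            }
        }
        pending_.clear();
        active_ = false;
        return written;
    }

    std::size_t storedCount() const { return stored_.size(); }

private:
    struct Record {
        Placement where;
        std::string payload;
    };

    bool active_ = false;
    std::vector<Record> pending_;
    std::vector<Record> stored_;
};
```

Nothing reaches the store before `endTransaction`, so a rollback only has to drop the pending list.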

  10. The Storage Mechanism
      • The underlying model assumptions
      • How they map to “known” technologies
      • Migrating objects to/from the persistent medium
      • Object mapping
      • Reference handling
      • References are objects, not primitives
      • Need setup: Reference to data cache
      • ROOT: Callback for base class (Streamer)

  11. The Generic Model
      [Diagram: Data Cache <-> StorageSvc <-> disk storage. Disk storage holds databases, each database holds containers, and each container holds objects. The addressing labels are: storage type, DB name, container name, item ID.]
      Listed next to the StorageSvc:
      • Object type (class name)
      • Optional data transform

  12. Database Technologies
      • Identify commonalities and differences between technologies
      • Necessary knowledge when reading/writing
      • Model adapts to any technology with direct record access
      • Need to know record identifier in advance
      • RDBMS: More or less traditional
      • Primary key must be uniquely determined before writing
      • Probably two round-trips

  13. Object Mapping
      • Objects must maintain personality when persistent
      • Allow for queries, selections and independent element access
      • If technology supports objects…
        - Want to make use of such features
        - These technologies must be instructed how to do it
        - Need object dictionary
      • If technologies support only primitives
        - Split objects into primitives [until reasonable level]
        - Need full access to object member data [member offset, type]
        - Constructor and Destructor with defined signature
        - Need object dictionary
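
To make the "split objects into primitives" case on slide 13 concrete, here is a minimal sketch of a dictionary that records each member's name, type and offset, plus a function that uses it to read an object's members without knowing its C++ type. `ClassInfo`, `MemberInfo`, `split` and the `Hit` class are hypothetical; a real dictionary would be generated rather than hand-written, and values would not all be widened to `double`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

enum class PrimitiveType { Int, Double };

// Dictionary entry for one data member: name, type and byte offset.
struct MemberInfo {
    std::string name;
    PrimitiveType type;
    std::size_t offset;
};

// Dictionary entry for one class.
struct ClassInfo {
    std::string className;
    std::vector<MemberInfo> members;
};

// One primitive value extracted from an object (stored as double here
// purely to keep the sketch short).
struct Primitive {
    std::string name;
    PrimitiveType type;
    double value;
};

// Split an object into primitives using only its dictionary description.
std::vector<Primitive> split(const void* object, const ClassInfo& info) {
    assert(object != nullptr);
    const char* base = static_cast<const char*>(object);
    std::vector<Primitive> out;
    for (const MemberInfo& m : info.members) {
        double value = 0.0;
        if (m.type == PrimitiveType::Int) {
            int i = 0;
            std::memcpy(&i, base + m.offset, sizeof i);
            value = i;
        } else {
            std::memcpy(&value, base + m.offset, sizeof value);
        }
        out.push_back({m.name, m.type, value});
    }
    return out;
}

// Hypothetical transient class and its hand-written dictionary.
struct Hit {
    int channel;
    double energy;
};

ClassInfo hitDictionary() {
    ClassInfo info;
    info.className = "Hit";
    info.members.push_back({"channel", PrimitiveType::Int, offsetof(Hit, channel)});
    info.members.push_back({"energy", PrimitiveType::Double, offsetof(Hit, energy)});
    return info;
}
```

The storage code never mentions `Hit`; everything it needs comes from the dictionary, which is why the slide lists member offset and type as requirements.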

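  14. Dictionary: Population/Conversion
      [Diagram, dictionary generation: .h header files are processed either by GCC-XML (producing .xml) and a code generator into LCG dictionary code, or by ROOTCINT into CINT dictionary code.]
      [Diagram, run time: the LCG dictionary provides reflection to other clients; a gateway connects it to the CINT dictionary, which is used by the technology-dependent data I/O.]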

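  15. Follow Object Associations
      [Diagram: a persistent reference stores an entry ID and a link ID. The link ID indexes a local lookup table in each file (columns: Link ID, Link Info such as DB/Cont.name, ...). The resolved information forms the object Token, which leads to the object pointer. The steps are labelled (1) to (4).]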

  16. The Link Table
      • Contains all information to resurrect an object
        - Storage type
        - Database name
        - Container name
        - Object type (class name)
        - Cache hints
          E.g. other possible transient conversions
      • Size: O(Associations in class model)
      • Local to every database
      • Size is limited
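
The link table of slides 15 and 16 can be sketched as a per-database table of distinct link entries, with each persistent reference storing only a link ID and an entry ID. This is an illustrative assumption about the layout, not the actual POOL file format; `LinkInfo`, `PersistentRef`, `Token` and `LinkTable` are hypothetical names, and cache hints are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Information needed to resurrect an object, shared by many references.
struct LinkInfo {
    std::string storageType;
    std::string database;
    std::string container;
    std::string className;

    bool operator==(const LinkInfo& o) const {
        return std::tie(storageType, database, container, className) ==
               std::tie(o.storageType, o.database, o.container, o.className);
    }
};

// What is stored per reference: two numbers instead of four strings.
struct PersistentRef {
    std::size_t linkId;
    std::size_t entryId;
};

// Fully resolved location of one object.
struct Token {
    LinkInfo link;
    std::size_t entryId;
};

// Hypothetical link table, local to one database file.
class LinkTable {
public:
    // Return the ID of an identical existing link, or append a new one.
    std::size_t intern(const LinkInfo& info) {
        for (std::size_t i = 0; i < links_.size(); ++i) {
            if (links_[i] == info) return i;
        }
        links_.push_back(info);
        return links_.size() - 1;
    }

    PersistentRef makeRef(const LinkInfo& target, std::size_t entryId) {
        return {intern(target), entryId};
    }

    // Follow an association: look up the link and rebuild the token.
    Token resolve(const PersistentRef& ref) const {
        assert(ref.linkId < links_.size());
        return {links_[ref.linkId], ref.entryId};
    }

    std::size_t size() const { return links_.size(); }

private:
    std::vector<LinkInfo> links_;
};
```

Since references to the same container and class share one entry, the table grows with the number of distinct associations rather than with the number of stored objects, which matches the size estimate on the slide.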

  17. Experience & Conclusions
      • We adopted a mechanism to write physics data without knowledge of the underlying store technology
      • Our approach can adapt to any technology based on database files, collections and objects within collections
        - ROOT (implemented) and RDBMS (work ongoing)
      • We are able to choose technologies according to needs
      • We can save any object described by the dictionary
      • We can offer a uniform interface to persistency clients
      http://lcgapp.cern.ch/project/persist
