
The CDF Run II Data Catalog and Data Access Modules



Presentation Transcript


  1. The CDF Run II Data Catalog and Data Access Modules
     P. Calafiura, J. Kowalkowski, S. Lammel, M. Lancaster, F. Ratnikov, E. Sexton-Kennedy, I. Sfiligoi, T. Watts, E. Wicklund

  2. Data Handling Software Components
     [Diagram labels: Storage Management, Data Management]
     • S. Lammel - C 366

  3. Data Access Hierarchy
     • Data view
       • Dataset
       • Run Section
     • Storage view
       • (Tape) Stream
       • Fileset/Partition
       • File

  4. Reading Data
     Transparent Storage Management
     Logical Data Selection

  5. Writing Data
     Temporary disk space management
     Fileset Creation
     Log progress

  6. The File Catalog
     • Locate file(set)s belonging to a dataset from
       • a time range
       • a run range
       • applying quality cuts, …
     • Log output files and filesets info
     • Maintain tape management info
     • Log job progress (error recovery, checkpoint-restart)
     • C++ API
     • Command-line and web based tools
     • Distributed access
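
The lookup described on this slide can be illustrated with a minimal in-memory sketch that selects the filesets of a dataset overlapping a run range and passing a quality cut. All names here (`FilesetRecord`, `selectFilesets`, the `goodQuality` flag) are hypothetical and chosen for illustration; they are not the catalog's actual C++ API, and the real catalog sits on a database rather than a vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical catalog row: one fileset of a dataset, covering a run range.
struct FilesetRecord {
    std::string dataset;
    std::string fileset;
    int firstRun;
    int lastRun;
    bool goodQuality;  // stand-in for whatever quality cuts are applied
};

// Return the names of filesets in `dataset` that overlap [runLo, runHi]
// and pass the quality flag, in catalog order.
std::vector<std::string> selectFilesets(const std::vector<FilesetRecord>& catalog,
                                        const std::string& dataset,
                                        int runLo, int runHi) {
    assert(runLo <= runHi);
    std::vector<std::string> selected;
    for (const FilesetRecord& rec : catalog) {
        const bool overlaps = rec.firstRun <= runHi && rec.lastRun >= runLo;
        if (rec.dataset == dataset && overlaps && rec.goodQuality) {
            selected.push_back(rec.fileset);
        }
    }
    return selected;
}
```

A time-range query would follow the same pattern, assuming each record also carried start and end timestamps.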

  7. The File Catalog Clients
     [Diagram labels: DFC, DBManager, Data Logger, Offline Farm, L3 Farm, Reader, Writer, Filtered Data, Raw Data, Oracle, MSQL]

  8. The DBManager Package
     • J. Kowalkowski C236 Poster
     • DBMS-independent C++ API (calibration, geometry, DFC)
     • type-safe mapping between table rows and transient C++ objects
     • smart pointers
       • lazy instantiation
       • caching
       • update pointer when notified of a new key
     • pluggable factory to select DBMS at run time
     • code generator
       • provide binding (Oracle, MSQL, JDBC, text) for predefined queries
       • Java-based table description language
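
To make the smart-pointer and factory ideas on this slide concrete, here is a minimal sketch of a handle that fetches its row only on first dereference, caches the result, and refetches when pointed at a new key, with the back end looked up by name at run time. `Calibration`, `Backend`, `BackendFactory`, and `LazyHandle` are hypothetical names invented for this example; the DBManager package's real classes and signatures are not shown in the slides.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical transient object built from one table row.
struct Calibration {
    int key;
    double gain;
};

// A back end is any callable that fetches the row for a key.
using Backend = std::function<Calibration(int)>;

// Registry of back ends by name, so the DBMS can be chosen at run time.
class BackendFactory {
public:
    void registerBackend(const std::string& name, Backend b) {
        backends_[name] = std::move(b);
    }
    // Throws std::out_of_range if no back end was registered under `name`.
    Backend create(const std::string& name) const { return backends_.at(name); }
private:
    std::map<std::string, Backend> backends_;
};

// Handle that defers the fetch until first dereference and caches the result.
class LazyHandle {
public:
    LazyHandle(Backend backend, int key) : backend_(std::move(backend)), key_(key) {}
    const Calibration& operator*() { return *load(); }
    const Calibration* operator->() { return load(); }
    // Point the handle at a new key; the next dereference refetches.
    void setKey(int key) {
        if (key != key_) {
            key_ = key;
            cached_.reset();
        }
    }
    bool loaded() const { return cached_ != nullptr; }
private:
    const Calibration* load() {
        if (!cached_) cached_ = std::make_unique<Calibration>(backend_(key_));
        return cached_.get();
    }
    Backend backend_;
    int key_;
    std::unique_ptr<Calibration> cached_;
};
```

In this sketch the client code only ever sees `LazyHandle`, so swapping a text back end for a database one is a matter of registering a different callable.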

  9. Data Handling Input Module
     • Module of the BaBar/CDF AC++ framework
     • Invisible to users
     • Select relevant filesets in a logical fashion
     • Iterate over them
       • stage ahead
       • out-of-order
     • Maintain state of request for error recovery
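
The iteration and recovery behaviour listed on this slide might look roughly like the sketch below: filesets already staged to disk are handed out first (out-of-order delivery), and the request remembers which filesets were completed so a restarted job knows what is left. `FilesetRequest` and its methods are hypothetical; the sketch does not issue real stage-ahead requests and is not the AC++ module's interface.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request state for one job: what was asked for, what has been
// handed to the job, and what the job reported as finished.
class FilesetRequest {
public:
    explicit FilesetRequest(std::vector<std::string> filesets)
        : requested_(std::move(filesets)) {}

    // Hand out the next fileset into `out`; returns false when none remain.
    // A fileset already staged to disk is preferred; otherwise the first
    // one not yet handed out is returned.
    bool next(const std::set<std::string>& staged, std::string& out) {
        const std::string* choice = nullptr;
        for (const std::string& fs : requested_) {
            if (handedOut_.count(fs)) continue;
            if (staged.count(fs)) { choice = &fs; break; }
            if (!choice) choice = &fs;
        }
        if (!choice) return false;
        out = *choice;
        handedOut_.insert(out);
        return true;
    }

    // Record completion so a restarted job can skip finished filesets.
    void markDone(const std::string& fileset) { done_.insert(fileset); }

    // Filesets a restarted job would still need to process.
    std::vector<std::string> unfinished() const {
        std::vector<std::string> rest;
        for (const std::string& fs : requested_) {
            if (!done_.count(fs)) rest.push_back(fs);
        }
        return rest;
    }

private:
    std::vector<std::string> requested_;
    std::set<std::string> handedOut_;
    std::set<std::string> done_;
};
```

Keeping "handed out" separate from "done" is the design point: a fileset that was being read when the job failed is still reported by `unfinished()`.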

  10. Data Handling Output Module
     [Diagram labels: In, Out]
     • AC++ Module
     • close files at target size, but aligned to run section boundaries (keep events from a section together)
     • Log output files info into catalog
     • Commit blocks of completed files to the DIM
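
The file-closing rule on this slide can be sketched as follows: a file is closed once it has reached the target size, but only at a run section boundary, so a file may overshoot the target to keep a section together. The `Event` struct and `planFiles` function are assumptions made for this example, not the output module's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event summary: the run section it belongs to and its size.
struct Event {
    int runSection;
    std::size_t bytes;
};

// Split an ordered event stream into output files. A file is closed once it
// has reached `targetBytes`, but only when the run section changes, so all
// events of one section land in the same file. Returns the number of events
// written to each file.
std::vector<std::size_t> planFiles(const std::vector<Event>& events,
                                   std::size_t targetBytes) {
    assert(targetBytes > 0);
    std::vector<std::size_t> eventsPerFile;
    std::size_t fileBytes = 0;
    std::size_t fileEvents = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        const bool sectionBoundary =
            i > 0 && events[i].runSection != events[i - 1].runSection;
        if (sectionBoundary && fileBytes >= targetBytes) {
            eventsPerFile.push_back(fileEvents);  // close the current file
            fileBytes = 0;
            fileEvents = 0;
        }
        fileBytes += events[i].bytes;
        ++fileEvents;
    }
    if (fileEvents > 0) eventsPerFile.push_back(fileEvents);  // final file
    return eventsPerFile;
}
```

In a real module, the point where a file is closed is also where its information would be logged to the catalog.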

  11. Status and Outlook
     [Diagram labels: Data Logger, Offline Farm, L3 Farm, Filtered Data, Raw Data]
     • Defined interfaces between all components
     • All components have at least a prototype implementation
     • Successful system integration for Mock Data Challenge 1
       • T. Watts C 268 (tomorrow)
     • Improve performance and reliability
