
Towards a Framework for Organized Analysis

Andreas Morsch, Weekly Offline Meeting, 31/5/2007.





Presentation Transcript


  1. Andreas Morsch, Weekly Offline Meeting, 31/5/2007: Towards a Framework for Organized Analysis

  2. Why Organized Analysis?
     • Most efficient way for many users (analysis tasks) to read and process the full data set.
       • In particular if resources are scarce.
       • Optimises the CPU/IO ratio.
     • But also:
       • Helps to develop a common, well-tested framework for analysis.
       • Develops a common knowledge base and terminology.
       • Helps document the analysis procedure and makes results reproducible.

  3. Scope • Focus on production of AODs from ESD/AOD

  4. Design Goals
     • Flexible task and data container structure
     • User code independent of the computing schema (interactive: local/PROOF, or batch: grid)
     • Input data: ESD, AOD
       • Same design (done)
       • Common base class?
     • Output data:
       • AOD + user histograms
     • Transparent handling of memory-resident and file-resident data

  5. Implementation
     • Analysis train/taxi similar to PHENIX
     • Based on the existing AliAnalysisManager/Task framework (A. and M. Gheata)

  6. Organization of Data and Tasks
     • Input data staging?
       • Several trainlets on sub-data sets staged prior to train departure.
       • Better: one analysis “train” on the complete data set.
         • Limits the complexity of the production.
         • Should be designed to give the optimum under all conditions.

  7. Organization of Tasks
     • Proposal
       • On top level
         • Tasks reading ESDs/AODs and producing AODs
         • Organized by the analysis manager
       • Below top level
         • Sub-tasks producing intermediate transient data
         • Organized by users (PWGs)

  8. Organization of Data and Tasks
     • Organisation of analysis tasks
       • One sub-job per task
       • Better: one job executing all tasks.
     • Protection against sub-task crashes
       • “Isolate” tasks using the C++ try-throw-catch mechanism
       • Check memory / task
       • Check output data size / task
     • Protection against data corruption
       • Access rights per task
     • Dynamic cancelling of tasks
     • Input data quality checks
       • Could be the first task in the row
     • Robust book-keeping
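The crash protection and dynamic cancelling listed on slide 8 could look roughly like the sketch below. It is only an illustration: `Task`, `ProcessEvent` and `maxFailures` are hypothetical names and not part of the AliAnalysisManager/AliAnalysisTask framework. It also assumes that a misbehaving task signals failure by throwing a `std::exception`; a segmentation fault or memory exhaustion would not be caught this way.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an analysis task; not the AliAnalysisTask API.
struct Task {
    Task(std::string n, std::function<void(int)> f)
        : name(std::move(n)), exec(std::move(f)) {}
    std::string name;
    std::function<void(int)> exec;  // processes one event, may throw
    bool cancelled = false;         // set when the task is removed from the train
    int failures = 0;               // number of exceptions seen so far
};

// Run every non-cancelled task on one event. A throwing task is caught and
// counted, and it is cancelled once it has failed more than maxFailures times.
// The other tasks keep running. Returns the number of tasks that succeeded.
int ProcessEvent(std::vector<Task>& train, int event, int maxFailures) {
    assert(maxFailures >= 0);
    int ok = 0;
    for (Task& t : train) {
        if (t.cancelled) continue;
        try {
            t.exec(event);
            ++ok;
        } catch (const std::exception&) {
            if (++t.failures > maxFailures) t.cancelled = true;
        }
    }
    return ok;
}
```

With `maxFailures` set to 0, a task is dropped from the train on its first exception while the remaining tasks continue to see every event.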

  9. GRID/PROOF
     • Transparency of the computing schema
       • Some improvements in AliAnalysisManager/AliAnalysisTask
         • Possibility to notify tasks when the file changes in the chain (done)
         • More robust output data streaming (done)
         • Possibility to flag tasks as “post event loop” tasks (done)
     • Handling of file-resident data
     • PROOF uses object streaming
       • What is a streamable object/task?
         • Needs an exact definition.
       • Attention
         • Normally persistent objects are streamed
         • Here: transient objects are streamed!
       • Needs user support and documentation
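Two of the improvements on slide 9, notifying tasks when the file changes in the chain and flagging tasks as “post event loop” tasks, can be pictured with the toy event loop below. All names (`ChainTask`, `ChainFile`, `RunChain`) are hypothetical, and the counters merely stand in for real work; the actual AliAnalysisManager interface may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical task record; the counters stand in for real processing.
struct ChainTask {
    bool postEventLoop = false;  // run once after the loop, not per event
    int notified = 0;            // how often a file change was announced
    int events = 0;              // events processed inside the loop
    int finalized = 0;           // calls made after the event loop
};

// Hypothetical description of one file in the input chain.
struct ChainFile {
    std::string name;
    int nEvents;
};

// Loop over all files and events. Every task is told when a new file starts;
// only per-event tasks see the events; post-event-loop tasks run at the end.
void RunChain(const std::vector<ChainFile>& chain, std::vector<ChainTask>& tasks) {
    for (const ChainFile& f : chain) {
        assert(f.nEvents >= 0);
        for (ChainTask& t : tasks) ++t.notified;  // file changed
        for (int i = 0; i < f.nEvents; ++i)
            for (ChainTask& t : tasks)
                if (!t.postEventLoop) ++t.events;
    }
    for (ChainTask& t : tasks)
        if (t.postEventLoop) ++t.finalized;
}
```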

  10. Possible Integration of User Code
      [Diagram; labels: AliAnalysisTask (steers, delegates); AliAnalysisUserTask; user analysis code (implements interface, deals with AliAODEvent, documents selection and analysis parameters); Factory; Configuration Macro]
      • Working prototype for AliAnalysisTaskJets exists
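One possible reading of the delegation scheme on slide 10 is sketched below: a framework-side task steers and hands each event to user code behind an interface, and a factory function plays the role of the configuration macro. Every name here (`UserTask`, `PtCutTask`, `SteeringTask`, `MakeUserTask`, `AODEvent`) is invented for illustration, and the pt cut is an arbitrary example of user code; none of it is taken from AliRoot.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical output event; not AliAODEvent.
struct AODEvent {
    std::vector<double> selectedPt;
};

// Interface that user analysis code implements.
class UserTask {
public:
    virtual ~UserTask() = default;
    virtual void Exec(const std::vector<double>& trackPt, AODEvent& out) = 0;
    // Selection and analysis parameters, exposed so they can be documented.
    virtual std::map<std::string, double> Parameters() const = 0;
};

// Example user code: keep tracks at or above a pt threshold.
class PtCutTask : public UserTask {
public:
    explicit PtCutTask(double minPt) : fMinPt(minPt) {}
    void Exec(const std::vector<double>& trackPt, AODEvent& out) override {
        for (double pt : trackPt)
            if (pt >= fMinPt) out.selectedPt.push_back(pt);
    }
    std::map<std::string, double> Parameters() const override {
        return {{"minPt", fMinPt}};
    }
private:
    double fMinPt;
};

// Factory standing in for the configuration macro; returns null if unknown.
std::unique_ptr<UserTask> MakeUserTask(const std::string& kind, double minPt) {
    if (kind == "ptcut") return std::unique_ptr<UserTask>(new PtCutTask(minPt));
    return nullptr;
}

// Framework-side task: steers and delegates the per-event work to user code.
class SteeringTask {
public:
    explicit SteeringTask(std::unique_ptr<UserTask> user) : fUser(std::move(user)) {
        assert(fUser);
    }
    AODEvent Exec(const std::vector<double>& trackPt) {
        AODEvent out;
        fUser->Exec(trackPt, out);
        return out;
    }
    const UserTask& User() const { return *fUser; }
private:
    std::unique_ptr<UserTask> fUser;
};
```

The point of the split is that the user class never touches the event loop or the I/O, so the same code can in principle run unchanged under any computing schema.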

  11. Who manages the common output objects AliAODEvent and AOD Tree?
      • What has to be called when
        • SlaveBegin
          • AliAODEvent constructor
          • Open file
          • AOD tree constructor
        • ExecuteAnalysis
          • AOD tree fill
          • AliAODEvent Clear
        • Terminate
          • WriteTree
          • Close file

  12. Possible Solution
      • Header and trailer analysis tasks handling the AliAODEvent I/O
      • Top task with user tasks as daughters
      • AliAnalysisManagerAOD deriving from AliAnalysisManager and re-implementing
        • SlaveBegin
        • ExecuteAnalysis
        • Terminate
      • Virtualize AliAODEvent (AliVEvent) and add calls to an object of type AliVEvent in AliAnalysisManager. Move as much code as possible into AliAODEvent
        • CreateTree
        • FillTree
        • Clear
        • …
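The last option on slide 12, a virtual event interface that the manager calls without knowing the concrete event type, might be sketched as follows. `VEvent`, `ToyAODEvent` and `Manager` are hypothetical stand-ins that only echo the names on the slide; the “tree” is a plain vector so that the example stays free of ROOT dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical abstract event interface (the slide's AliVEvent idea).
class VEvent {
public:
    virtual ~VEvent() = default;
    virtual void CreateTree() = 0;
    virtual void FillTree() = 0;
    virtual void Clear() = 0;
};

// Toy concrete event owning its own "tree": one track count per filled event.
class ToyAODEvent : public VEvent {
public:
    void CreateTree() override { fTree.clear(); }
    void FillTree() override { fTree.push_back(fTracks.size()); }
    void Clear() override { fTracks.clear(); }
    void AddTrack(double pt) { fTracks.push_back(pt); }
    std::size_t Entries() const { return fTree.size(); }
    std::size_t NTracks() const { return fTracks.size(); }
private:
    std::vector<double> fTracks;
    std::vector<std::size_t> fTree;
};

// Manager that only sees the abstract interface.
class Manager {
public:
    explicit Manager(VEvent* out) : fOut(out) { assert(fOut != nullptr); }
    void SlaveBegin() { fOut->CreateTree(); }
    // To be called after the user tasks have filled the event.
    void ExecuteAnalysis() {
        fOut->FillTree();
        fOut->Clear();
    }
private:
    VEvent* fOut;  // not owned by the manager
};
```

Keeping the tree handling inside the event class means the manager needs no AOD-specific code, which is the motivation the slide gives for moving as much code as possible into the event.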

  13. [Class diagrams; labels: AliAnalysisManager, AliAnalysisManagerAOD, AliAODEvent, AliVEvent, AliVOutputEventHandler, AliAODEventHandler]

  14. More AOD I/O Management Tasks
      • Granting access rights per branch
      • Consistency checks
      • Chopping output files
      • …
