1 / 6

Flat AOD

Flat AOD. Gheata ALICE weekly meeting 2 Dec 2013. Motivation. Important I/O for analysis jobs, decreasing job efficiency Big file size -> storage problems Format highly dependent on AliRoot types Other experiments are using ntuple -like data in end-user analysis

vian
Télécharger la présentation

Flat AOD

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Flat AOD Gheata ALICE weekly meeting 2 Dec 2013

  2. Motivation • Important I/O for analysis jobs, decreasing job efficiency • Big file size -> storage problems • Format highly dependent on AliRoot types • Other experiments are using ntuple-like data in end-user analysis • Investigations and tests at small scale already hint for improvements • Disabling non-needed branches -> x5 speed • Tests on analysis specific data skimming -> huge factors to gain • Discussed flattening/reducing AOD format • Useful for improving I/O performance • Useful for Run2/Run3 • Useful for data preservation (smaller, simpler, easier to share) • Useful for different levels of parallelism in analysis

  3. Exercise • Use AODtree->MakeClass() to generate a skeleton, then rework • Keep all AOD info, but restructure the format Int_tfTracks.fDetPid.fTRDncls -> Int_t *fTracks_fDetPid_fTRDncls; //[ntracks_] • More complex cases to support I/O: typedefstruct { Double32_t x[10]; } vec10dbl32; Double32_tfTracks.fPID[10] -> vec10dbl32 *fTracks_fPID; //[ntracks_] Double32_t *fV0s.fPx //[fNprongs] -> TArrayF *fV0s.fPx; //[nv0s_] • Convert AliAODEvent-> FlatEvent • Try to keep the full content AND size • Write FlatEvent on file • Compare file size and read speed

  4. Results • Tested on AOD PbPb: AliAOD.root ****************************************************************************** *Tree :aodTree : AliAOD tree * *Entries : 2327 : Total = 2761263710 bytes File Size = 660491257 * * : : Tree compression factor = 4.18 * ****************************************************************************** ****************************************************************************** *Tree :AliAODFlat: Flattened AliAODEvent * *Entries : 2327 : Total = 2248164303 bytes File Size = 385263726 * * : : Tree compression factor = 5.84 * ****************************************************************************** • Data smaller (no TObject overhead, TRef->int) • 30% better compression • Reading speed • Old: CPU time= 103s , Real time=120s • New: CPU time= 54s , Real time= 64s

  5. Implications • User analysis more simple, working mostly with basic types (besides the event) • Simplified access to data, highly reducing number of (virtual) calls • ROOT-only analysis • No problem with schema evolution, supported as before • Filtering hierarchical to flat AOD’s straightforward • Much better vectorizable track loops • Deltas would have to be also flattened

  6. Conclusions • Flattening the event structure has benefits • Sizeable gains in size and speed (>50%) • Many benefits for user code performance • Conversion straightforward • Does not rule out further skimming of data content per analysis type • An approach we will have to consider seriously for Run2 and specially Run3

More Related