
Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid



  1. Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid
  Presented by Jinghua Liu, for:
  Pablo Yepes, Jinghua Liu (Rice University, Houston, TX)
  Maarten Ballintijn, Gunther Roland, Bolek Wyslouch, Jinlong Zhang (MIT, Cambridge, MA)
  Supported by NSF grants #0218603 and #0219063

  2. Outline
  From a data analysis user's point of view:
  • Why: ROOT/PROOF/Grid
  • How: step by step
  • What: test results
  • Summary
  Other PROOF talks at this conference: Fons Rademakers, Maarten Ballintijn

  3. ROOT/PROOF
  • ROOT as a data analysis tool
  • PROOF: Parallel ROOT Facility, based on and part of ROOT
    • runs on clusters of heterogeneous machines
    • parallel analysis of objects in a set of files
    • parallel execution of scripts
  • Transparency, scalability, adaptability, error handling, authentication
  • “Bring the KB to the PB, not the PB to the KB” (KB: code --> CPU; PB: data): use distributed CPUs to analyze distributed data

  4. PROOF/Grid Interface
  • Use a Grid resource broker to detect which nodes in a cluster can be used in the parallel session
  • Use the Grid file catalogue and replication manager
  • Utilize Grid monitoring services
  • Support Globus authentication
  • Abstract Grid interface

  5. Step by Step
  • Set up PC cluster(s) (for PROOF/Grid)
  • Prepare the data files
  • Write the analysis code (algorithm)
  • Compile a data set for PROOF
  • Run a PROOF job
  • Get the results

  6. PC Clusters
  • Client machine (desktop): P4 @ 1.8 GHz / 512 MB / 40 GB
  • Cluster 1:
    2 dual Xeon @ 2.4 GHz / 1 GB / 360 GB
    1 dual Athlon @ 1.73 GHz / 1 GB / 240 GB
    8 dual PIII @ 400 MHz / 512 MB / 60 GB
  • Cluster 2: 3 dual Athlon @ 1.67 GHz / 2 GB / 200 GB
  • Operating systems: RedHat 6.1, RedHat 7.3, Slackware 8.1
  • Globus version: 2.2

  7. CMS Heavy Ion Simulation
  • Jet and high-pT particle angular correlation
  • Use calorimeters only

  8. CMS Heavy Ion Simulation
  • Pythia (event generator): 10,000 jet events
  • Hijing (heavy ion event generator): 1,000 events
  • Each Hijing event (dN/dy ~ 5000) was divided into ~500 sub-events
  • 500 sub-events (from different events) were then recombined at random to form a new Hijing event, an inexpensive way to obtain more Monte Carlo events (see the sketch below)
  • CMSIM (GEANT 3 based simulation program for CMS)
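
  The sub-event mixing can be illustrated with a short sketch. This is a hypothetical illustration, not the production code from the talk; the names SubEvent and BuildMixedEvent, and the pool layout, are assumptions — only the idea of drawing ~500 slices from different original events is from the slide.

    // Hypothetical sketch of the sub-event mixing described above.
    // SubEvent, BuildMixedEvent and the pool layout are assumptions.
    #include <cstdlib>
    #include <vector>

    struct SubEvent {
        // particles of one slice of an original Hijing event
    };

    std::vector<SubEvent> BuildMixedEvent(const std::vector<SubEvent>& pool,
                                          int nSubEvents = 500)
    {
        std::vector<SubEvent> mixed;
        mixed.reserve(nSubEvents);
        for (int i = 0; i < nSubEvents; ++i) {
            // Draw each slice at random, so the pieces of the new event
            // come from different original events.
            int idx = std::rand() % static_cast<int>(pool.size());
            mixed.push_back(pool[idx]);
        }
        return mixed;
    }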

  9. Data Production: Globus Jobs
  • Globus was used to submit and manage the jobs
  • No data replication (files were intentionally stored locally)

  10. Build ROOT Tree
  • Superimpose jet events on top of Hijing events and generate a ROOT tree
  • Standalone code linked with the ROOT libraries
  • CMS calorimetry:
    ECal (electromagnetic calorimeter): barrel 61,200 cells, endcap 14,648 cells
    HCal (hadronic calorimeter): 14,616 cells (multi-layer), 4,032 towers
  • calotree holds ECal cells (energy, position) and HCal towers (energy, position); a sketch of how such a tree might be booked follows
  • 10,000 events were split into 100 files of 100 events each; file size ~160 MB, total data 16 GB
  • Data were distributed: each node got some local files
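
  A minimal sketch of how such a tree might be booked with the standard ROOT TTree API. The branch names, types, and array sizes below are illustrative assumptions; only the tree name (calotree) and its contents (ECal cells and HCal towers with energy and position) are given in the talk.

    // Hypothetical sketch of booking "calotree"; the schema is assumed.
    #include "TFile.h"
    #include "TTree.h"

    void makeCaloTree()
    {
       TFile f("heavyion001.root", "RECREATE");
       TTree calotree("calotree", "ECal cells and HCal towers");

       const Int_t kMaxCells = 75848;   // 61200 barrel + 14648 endcap
       Int_t   nEcal;
       Float_t ecalE[kMaxCells], ecalEta[kMaxCells], ecalPhi[kMaxCells];

       const Int_t kMaxTowers = 4032;
       Int_t   nHcal;
       Float_t hcalE[kMaxTowers], hcalEta[kMaxTowers], hcalPhi[kMaxTowers];

       // Variable-length arrays, indexed by the cell/tower counters.
       calotree.Branch("nEcal",   &nEcal,  "nEcal/I");
       calotree.Branch("ecalE",   ecalE,   "ecalE[nEcal]/F");
       calotree.Branch("ecalEta", ecalEta, "ecalEta[nEcal]/F");
       calotree.Branch("ecalPhi", ecalPhi, "ecalPhi[nEcal]/F");
       calotree.Branch("nHcal",   &nHcal,  "nHcal/I");
       calotree.Branch("hcalE",   hcalE,   "hcalE[nHcal]/F");
       calotree.Branch("hcalEta", hcalEta, "hcalEta[nHcal]/F");
       calotree.Branch("hcalPhi", hcalPhi, "hcalPhi[nHcal]/F");

       // ... fill the counters and arrays for each event, then:
       // calotree.Fill();
       calotree.Write();
    }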

  11. TSelector – The Algorithms
  • Create a TSelector skeleton from the TTree:
    $ root
    root[0] TFile f("heavyion001.root")
    root[1] calotree->MakeSelector("myselector")
    root[2] .q
    $ ls
    myselector.C  myselector.h
  • Add the analysis code (algorithm) to the TSelector:
    $ vi myselector.h
    $ vi myselector.C

  12. TSelector – The Algorithms
  • myselector.h
    class myselector : public TSelector {
    public:
       TTree *fChain;   // pointer to the analyzed TTree or TChain
       // ...
    private:
       TH1F *hist1d;
       TH2F *hist2d;
       // ...
    };

  13. TSelector – The Algorithms
  • myselector.C
    void myselector::Begin(TTree *tree)
    {
       hist1d = new TH1F("DeltaPhi", "DeltaPhi", 100, -180., 180.);
       hist2d = new TH2F("EtaPhi", "EtaPhi", 100, -5., 5., 100, -4., 4.);
       fOutput->Add(hist1d);
       fOutput->Add(hist2d);
    }

    Bool_t myselector::Process(Int_t entry)
    {
       // user's analysis code goes here, e.g. pair correlations:
       for (i = 0; i < nclusters; i++) {
          if (Et1 > 5) {
             for (j = i + 1; j < nclusters; j++) {
                if (Et2 > 5) {
                   DeltaPhi = ...;   // azimuthal angle between clusters i and j
                   hist1d->Fill(DeltaPhi);
                }
             }
          }
       }
       return kTRUE;
    }

  14. TDSet – Data Location
  • Specify a collection of TTrees or files:
    [] TDSet *ds = new TDSet("TTree", "calotree");
    [] ds->Add("/data1/cms/cmsim/heavyion001.root");
    [] ds->Add("/data1/cms/cmsim/heavyion002.root");
    ...
    [] ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
    [] ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
    ...
    [] ds->Print();
  • It is better to put these calls into a macro (a sketch follows)
  • The file list can also be returned by a database or file catalogue query, etc.
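
  A minimal sketch of such a macro, assuming the file layout shown above; the function name makeDataSet and the loop over file numbers are illustrative assumptions, not the talk's actual macro.

    // makeDataSet.C -- hypothetical macro bundling the TDSet setup above.
    #include "TDSet.h"
    #include "TString.h"

    TDSet *makeDataSet()
    {
       TDSet *ds = new TDSet("TTree", "calotree");
       // Local files on this node (path pattern taken from the slide).
       for (Int_t i = 1; i <= 2; i++) {
          ds->Add(Form("/data1/cms/cmsim/heavyion%03d.root", i));
       }
       // Files on other nodes, referenced by logical file name.
       ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
       ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
       ds->Print();
       return ds;
    }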

  15. Running a PROOF Job
  $ root
  [] gROOT->Proof("proofmaster.rice.edu");
  [] TDSet *ds = new TDSet("TTree", "calotree");
  [] ds->Add(". . .");
  ...
  [] ds->Process("myselector.C+", "options", nentries, first);
     (note: options must be pre-coded in myselector.C)
  [] TH1F *h1 = (TH1F *)gProof->GetOutput("DeltaPhi");
  [] h1->Draw();

  16. Angular Correlation

  17. Scale Plot
  • Analysis speed vs. number of CPUs (PIII 1 GHz equivalent)
  • CPU power and data size are balanced
  • CPU-intensive calculations

  18. Summary
  • CMS heavy ion analysis implemented and tested with PROOF
  • Scales well with the number of CPUs
  • PROOF/Grid can provide data analysis power that is otherwise unavailable, and without much extra effort
  • The PROOF/Grid interface is under rapid development; the plan is to extend the presented study to use the Grid interface

  19. The End
