Introduction to PAT

PAT Tutorial, CERN, December 2011. Felix Höhle, RWTH Aachen. Content: Part I: Brief review of CMSSW (Framework Essentials, Event Data Model (EDM)). Part II: Introduction to PAT (The PAT Data Format, The PAT Workflow).

Presentation Transcript


  1. Introduction to PAT. PAT Tutorial – CERN – December 2011. Felix Höhle, RWTH Aachen.

  2. Content. Part I: Brief review of CMSSW • Framework Essentials • Event Data Model (EDM). Part II: Introduction to PAT • The PAT Data Format • The PAT Workflow

  3. Framework Essentials

  4. Framework Essentials. The Framework offers you: • One executable, cmsRun, which is configured with Python files • These files contain the configuration and parameters for modules written in C++ • You compose your analysis from these modules • The Python config file defines: which data is used; which modules are executed, their parameters and execution order (path); how these paths are connected to output files
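The three things a config file defines can be sketched as a small configuration fragment. This is only an illustration and needs a CMSSW environment to import; the process name `DEMO`, the plugin name `MyAnalyzer`, the label `myjets` and both file names are placeholders rather than anything prescribed by the tutorial.

```python
# Minimal cmsRun configuration sketch; it can only be imported inside a CMSSW environment.
import FWCore.ParameterSet.Config as cms

process = cms.Process("DEMO")  # placeholder process name

# Which data is used
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:myfile.root")  # placeholder input file
)
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(100))

# Which modules are executed, and with which parameters
process.MyJetAnalyzer = cms.EDAnalyzer("MyAnalyzer",  # hypothetical C++ plugin
    jetTag = cms.InputTag("myjets")                   # placeholder collection label
)

# Execution order (path)
process.mypath = cms.Path(process.MyJetAnalyzer)

# How paths are connected to output files
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("output.root"),   # placeholder output file
    SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring("mypath"))
)
process.outpath = cms.EndPath(process.out)
```

Such a file would be passed to the executable as `cmsRun myfile_cfg.py`.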

  5. Framework Essentials. The Framework has predefined types of modules: • EDAnalyzer: reads collections and creates histograms • EDFilter: reads collections and returns a boolean • EDProducer: reads a collection and writes a new collection into the Event. There are scripts to create skeletons for these modules: mkedanlzr, mkedfltr, mkedprod. These create the necessary substructure for a module: • BuildFile.xml: needed for compilation • myanalyzer_cfi.py: demo Python configuration • Myanalyzer.h: header file • Myanalyzer.cc: definition file. Compilation with: $ scram b
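The division of labour between the three module types can be illustrated with a toy model in plain Python. This is not the CMSSW API: the class names, the dictionary standing in for the Event and the helper `run_path` are all made up for this sketch. It assumes only the behaviour described above, plus the fact that a filter returning false stops the rest of its path for that event.

```python
# Toy stand-ins for the three module types; NOT the CMSSW API.

class PtProducer:
    """EDProducer-like: reads a collection and writes a new one into the event."""
    def __call__(self, event):
        event["jetPts"] = [jet["pt"] for jet in event["jets"]]
        return True


class MinJetFilter:
    """EDFilter-like: reads a collection and returns a boolean."""
    def __init__(self, min_jets):
        self.min_jets = min_jets

    def __call__(self, event):
        return len(event["jets"]) >= self.min_jets


class PtAnalyzer:
    """EDAnalyzer-like: reads a collection and fills a (toy) histogram."""
    def __init__(self):
        self.histogram = []

    def __call__(self, event):
        self.histogram.extend(event["jetPts"])
        return True


def run_path(path, events):
    """Run the modules in order; a module returning False stops the path for that event."""
    for event in events:
        for module in path:
            if not module(event):
                break


events = [
    {"jets": [{"pt": 50.0}, {"pt": 20.0}]},  # passes the filter
    {"jets": [{"pt": 10.0}]},                # rejected by the filter
]
analyzer = PtAnalyzer()
run_path([PtProducer(), MinJetFilter(min_jets=2), analyzer], events)
print(analyzer.histogram)
```

Note that the modules never call each other: the analyzer only sees what the producer put into the event.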

  8. Framework Essentials. The Framework contains a lot of very useful tools: • ROOT is able to read your data files: $ root -l myfile.root • Have a look into data files with edmDumpEventContent and edmEventSize -v • Config files can be checked and investigated with python -i:
       $ python -i myfile_cfg.py
       >>> process.mypath
       cms.Path(MyJetAnalyzer)
       >>> process.MyJetAnalyzer
       cms.EDAnalyzer("MyAnalyzer", jetTag = cms.InputTag("myjets") )

  9. Framework Essentials. The Framework contains a lot of very useful tools: • ROOT is able to read your data files: $ root -l myfile.root • Have a look into data files with edmDumpEventContent and edmEventSize -v • Config files can be checked and investigated with python -i • And interactively with the edmConfigEditor

  11. Event Data Model (EDM). Basics of the Event Data Model: • The EDM is centered around the concept of an Event • An edm::Event is a C++ container for the RAW and reconstructed data of a particular collision • It is built up of several independent ROOT trees; each entry corresponds to a particular collision • One ROOT tree per object class • They are connected via smart pointers: edm::Ref, edm::Ptr, …

  13. Event Data Model (EDM). Basics of the Event Data Model: • Modules can only communicate via the Event • The Event can be extended by modules which add collections (EDProducer) • These collections are identified within the Event by four quantities: C++ class type, module label, sublabel within the module, and process name • These are shown in the output of the edmDumpEventContent command
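The four identifying quantities also show up in the branch names of an EDM ROOT file. The sketch below splits such a name into its parts, assuming the usual pattern `type_module_sublabel_process.` with underscores as separators. The helper name `parse_branch_name` and the two example branch names are illustrative assumptions, not taken from the tutorial.

```python
from typing import NamedTuple


class ProductID(NamedTuple):
    """The four quantities that identify a collection within the Event."""
    class_type: str
    module_label: str
    instance_label: str  # the "sublabel within the module"; often empty
    process_name: str


def parse_branch_name(branch: str) -> ProductID:
    """Split a branch name of the assumed form 'type_module_sublabel_process.'."""
    parts = branch.rstrip(".").split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected branch name: {branch!r}")
    return ProductID(*parts)


# Example branch names (assumed for illustration)
print(parse_branch_name("recoTracks_generalTracks__RECO."))
print(parse_branch_name("patMuons_cleanPatMuons__PAT."))
```

An empty third field simply means the producing module did not assign a sublabel to that collection.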

  14. FWLite: A Light Version of EDM. This is ROOT with known data formats • PAT is fully compatible with (and even especially supports) FWLite • No writing to the event content! • Full framework ↔ FWLite: this is not an exclusive or! • Python configuration, edm::Handle, TFileService, data access equivalent to EDM • Very useful for plotting and interactive analysis • Have a look at: WorkBookFWLite

  15. Event Data Model (EDM). Difficulties with the EDM: • Retrieval of high-level information for an analysis is complicated: pointer arithmetic! (What do I need? Where do I find it?) • Reduction to the data needed for a high-level analysis is complicated due to the high complexity of the connections between collections. (Where is the dropped data used throughout the Event?)

  16. Top 5 Analyst's Problems. PAT can help you with these problems!

  17. Part II. Part I: Brief review of CMSSW • Framework Essentials • Event Data Model (EDM). Part II: Introduction to PAT • The PAT Data Format • The PAT Workflow

  18. What is the Physics Analysis Toolkit? • PAT is a toolkit which is an integral part of the CMSSW Framework • It is an interface between the sometimes complicated EDM and the simple mind of a common user • It serves as a well-tested and supported basis for user and group analyses • It facilitates the reproducibility and comprehensibility of an analysis • If another CMS analyst describes a PAT analysis to you, you can easily know what he/she is talking about

  19. What is the Physics Analysis Toolkit? Three main aspects of PAT. Interface: • between RECO expertise and analysis contacts • simplifies access via data formats • channels the expertise of POGs and PAGs. Common Tool: • approved algorithms & sensible defaults • synergy (everybody can profit from recent developments) • quick start into analysis for beginners. Common Format: • facilitates transfer & comparisons • PAG common configurations • sustained provenance

  20. Facilitated Access to Event Information. PAT summarizes information for you: • The reco::Candidate is a base class common to all kinds of "particles" • It has a lot of information from different subdetectors and reconstruction algorithms • PAT objects summarize this information, which is distributed over different collections • When you are using PAT, getting this information is just a matter of calling a member function!

  21. PAT Data Formats. Concept of PAT Data Formats: • All pat::Objects inherit from their corresponding reco::RecoCandidates • Additional information (e.g. overlap with other objects) is accessible • A PAT Candidate is a Reco Candidate + more • All reco::Candidate information is accessible; you don't need to know the details!
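The idea "a PAT Candidate is a Reco Candidate + more" can be pictured with a small inheritance sketch in plain Python. The classes `RecoMuon` and `PatMuon`, the `isolation` field and the method `has_overlaps` are toy stand-ins invented for this illustration. They are not the C++ interfaces of reco::Muon or pat::Muon.

```python
from dataclasses import dataclass, field


@dataclass
class RecoMuon:
    """Toy stand-in for a reco-level candidate: kinematics only."""
    pt: float
    eta: float
    phi: float


@dataclass
class PatMuon(RecoMuon):
    """Toy stand-in for a PAT candidate: the reco candidate plus extra information."""
    isolation: float = 0.0
    overlaps: dict = field(default_factory=dict)

    def has_overlaps(self, label):
        """True if any overlapping object was recorded under this label."""
        return bool(self.overlaps.get(label))


reco = RecoMuon(pt=45.0, eta=0.1, phi=0.2)
pat = PatMuon(reco.pt, reco.eta, reco.phi, isolation=1.3, overlaps={"jets": ["jet0"]})

# Everything the reco object offers is still there, plus the extras.
print(pat.pt, pat.isolation, pat.has_overlaps("jets"))
```

Because the derived class only adds fields, any code written against the reco-level interface keeps working when handed the PAT-level object.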

  22. PAT Data Formats. Have a look at the online documentation: https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookPATDataFormats

  23. PAT Data Formats. The PAT Data Formats are configured by the user via the _cfi.py files. Size: 14 kB/event (for ttbar)

  24. The PAT Workflow. Steps of the PAT Workflow: • Candidate Creation (aodReco): collection of information which is not in AOD/RECO, e.g. isolation variables, overlaps, … • Candidate Production (patCandidates): translation of the collected information into pat::Objects, e.g. pat::Muon, pat::Electron, pat::Jet • Candidate Selection (selectedPatCandidates): selection of interesting objects with specific properties, e.g. pT > 30 GeV • Candidate Disambiguation (cleanPatCandidates): due to the way objects are reconstructed in CMS there are ambiguities, e.g. two objects sharing an energy deposit or track
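The selection and disambiguation steps can be mimicked with a toy pipeline in plain Python. This is a sketch under stated assumptions, not the PAT implementation: the `Candidate` class, the functions `select` and `clean`, and the ΔR threshold of 0.5 are invented for illustration. Dropping an overlapping jet outright is also a simplification, since a real configuration may only flag the overlap.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Toy candidate with the kinematics needed for selection and cleaning."""
    kind: str
    pt: float   # transverse momentum in GeV
    eta: float
    phi: float


def delta_r(a, b):
    """Angular distance in the eta-phi plane, with phi wrapped into [-pi, pi]."""
    dphi = math.remainder(a.phi - b.phi, 2 * math.pi)
    return math.hypot(a.eta - b.eta, dphi)


def select(candidates, min_pt=30.0):
    """Selection step: keep objects with pT above the threshold."""
    return [c for c in candidates if c.pt > min_pt]


def clean(jets, leptons, min_dr=0.5):
    """Disambiguation step: drop jets closer than min_dr to any lepton (assumed rule)."""
    return [j for j in jets if all(delta_r(j, l) >= min_dr for l in leptons)]


muons = [Candidate("muon", 45.0, 0.10, 0.20)]
jets = [
    Candidate("jet", 60.0, 0.15, 0.25),   # overlaps with the muon
    Candidate("jet", 80.0, -1.20, 2.50),  # well separated
    Candidate("jet", 20.0, 2.00, -1.00),  # fails the pT selection
]

selected_jets = select(jets)
clean_jets = clean(selected_jets, select(muons))
print([j.pt for j in selected_jets], [j.pt for j in clean_jets])
```

The order matters here: cleaning against selected leptons only means a soft, unselected muon cannot remove a jet.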

  25. The Code Location • DataFormats/PatCandidates: definition of all PAT Candidates (pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, …) • PhysicsTools/PatAlgos: implementation and filling of all data formats; definition of the common workflow and PAT tools • PhysicsTools/PatUtils: definition of common tools and helper functions used in PatAlgos • PhysicsTools/PatExamples: location of many examples, e.g. all non-trivial examples used during this tutorial

  26. Documentation • SWGuidePAT and WorkBookPAT: main documentation pages • WorkBookPATDataFormats: description of all PAT Candidates • WorkBookPATWorkflow: description of the PAT workflow • WorkBookPATConfiguration: description of the configuration of PAT • SWGuidePATTools: description of all PAT tools • WorkBookPATTutorial: tutorials and examples to get started • SWGuidePATRecipes: installation recipes • SWGuidePATEventSize: tools for event size estimates • And last but not least: this tutorial and/or former tutorials...

  27. Exercises. By now you should be prepared to do the following exercises on WorkBookPATTutorial. Have fun! • Exercise 1 (WorkBookPATDocNavigationExercise): The PAT documentation is one of the most looked-after parts of the WorkBook. Knowing the documentation and how to use it can speed up your learning enormously. Learn more about the PAT documentation and how to make effective use of it. • Exercise 2 (WorkBookTupleCreationExercise): Learn how the default PAT tuple is produced. • Exercise 3 (SWGuidePATConfigExercise): Learn how to configure PAT and its tools.
