AIDA Abstract Interfaces for Data Analysis


Presentation Transcript


  1. AIDA: Abstract Interfaces for Data Analysis • Andreas Pfeiffer, CERN/IT-API, andreas.pfeiffer@cern.ch

  2. Outline • What is AIDA • History/Collaboration/Documentation • Some Details • Examples • Ongoing work • Summary

  3. What is AIDA • Abstract Interfaces for Data Analysis (AIDA) • “The goals of the AIDA project are to define abstract interfaces for common physics analysis objects, such as histograms, ntuples, fitters, IO etc. The adoption of these interfaces should make it easier for developers and users to select to use different tools without having to learn new interfaces or change their code. In addition it should be possible to exchange data (objects) between AIDA compliant applications.”

  4. Motivation • Advantages • The user needs to learn only one set of interfaces • Same user code can be used with different AIDA-compliant analysis applications • Pool experience of different developer teams • LHC++, OpenScientist, JAS • Different analysis tools can exchange analysis objects • same storage format, use functionality from other tools • Two versions of AIDA interfaces • One for C++ • One for Java • As identical as possible

  5. Abstract Interfaces • Abstract Interfaces • only pure virtual methods, inheritance only from other A.I. • components use other components only through their A.I. • defines a kind of “protocol” for a component • Maximize flexibility and re-use of packages • allow each component to develop independently • re-use of existing packages to implement components reduces start-up time significantly • De-couple implementation of a package from its use
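
The idea on this slide can be sketched in a few lines of C++. This is a minimal illustration, not AIDA code: `IAccumulator`, its two implementations and `meanOf` are hypothetical names invented for the example. The interface holds only pure virtual methods (plus a virtual destructor), and the client function talks to a component only through that interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical abstract interface: pure virtual methods plus a virtual
// destructor, no data members. It acts as the "protocol" of the component.
class IAccumulator {
public:
    virtual ~IAccumulator() = default;
    virtual void add(double value) = 0;
    virtual double mean() const = 0;
    virtual std::string implementationName() const = 0;
};

// Implementation 1: stores every value and averages on demand.
class StoringAccumulator : public IAccumulator {
public:
    void add(double value) override { values_.push_back(value); }
    double mean() const override {
        if (values_.empty()) return 0.0;
        double sum = 0.0;
        for (double v : values_) sum += v;
        return sum / static_cast<double>(values_.size());
    }
    std::string implementationName() const override { return "storing"; }
private:
    std::vector<double> values_;
};

// Implementation 2: keeps only a running sum and a count.
class RunningAccumulator : public IAccumulator {
public:
    void add(double value) override { sum_ += value; ++count_; }
    double mean() const override { return count_ == 0 ? 0.0 : sum_ / count_; }
    std::string implementationName() const override { return "running"; }
private:
    double sum_ = 0.0;
    int count_ = 0;
};

// Client code depends on the interface only, never on a concrete class.
double meanOf(IAccumulator& accumulator, const std::vector<double>& data) {
    assert(!data.empty());  // this sketch does not define a mean for no data
    for (double x : data) accumulator.add(x);
    return accumulator.mean();
}

}  // namespace sketch
```

Because `meanOf` never names a concrete class, either implementation can evolve or be replaced without touching it, which is the de-coupling the slide describes.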

  6. AIDA Example • Use same code with any AIDA-compliant analysis tool. • [Diagram: User code (e.g. GEANT4) connects through AIDA to Analysis tool 1 and Analysis tool 2]

  7. Architectural issue: Components (I) • Identify components by functionality • Define “protocol” using Abstract Interfaces • Emphasize separation of different aspects for each component • Example: Histogram • statistical entity (density distribution of a physics quantity) • view of a “collection of data points” (which can be a density distribution but also a detector efficiency curve) • command to manipulate/store/plot/fit/... • “User’s view” is different from “implementor’s (developer’s) view” • separate Abstract Interfaces for both aspects

  8. Use of Components with Abstract Interfaces • User Code uses only Interface classes • IHistogram1D* hist = histoFactory->create1D("track quality", 100, 0., 10.) • Actual implementations are selected at run-time • loading of shared libraries • No change at all to user code but keep freedom to choose implementation • [Diagram: User Code uses Histo-IF and Fitter-IF; Histo-IF is implemented by Histo-Impl. 1 and Histo-Impl. 2, Fitter-IF by Fitter-Impl. X and Fitter-Impl. Y]
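
A rough, self-contained sketch of this pattern follows. The interfaces here are cut-down stand-ins defined inside the example, not the real AIDA headers, and `loadFactory`, `BinnedHistogram`, `CountingHistogram` and the names "impl1"/"impl2" are hypothetical. Selecting by name stands in for loading a shared library at run time, which is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Cut-down stand-ins for the interface classes the user code sees.
class IHistogram1D {
public:
    virtual ~IHistogram1D() = default;
    virtual void fill(double x) = 0;
    virtual int entries() const = 0;  // in-range entries only (sketch choice)
};

class IHistogramFactory {
public:
    virtual ~IHistogramFactory() = default;
    virtual std::unique_ptr<IHistogram1D> create1D(
        const std::string& title, int nBins, double lower, double upper) = 0;
};

// Implementation 1: keeps per-bin counts.
class BinnedHistogram : public IHistogram1D {
public:
    BinnedHistogram(int nBins, double lower, double upper)
        : counts_(static_cast<std::size_t>(nBins), 0), lower_(lower), upper_(upper) {
        assert(nBins > 0 && lower < upper);
    }
    void fill(double x) override {
        if (x < lower_ || x >= upper_) return;
        const double fraction = (x - lower_) / (upper_ - lower_);
        const std::size_t index = std::min(
            static_cast<std::size_t>(fraction * counts_.size()), counts_.size() - 1);
        ++counts_[index];
    }
    int entries() const override {
        int total = 0;
        for (int c : counts_) total += c;
        return total;
    }
private:
    std::vector<int> counts_;
    double lower_, upper_;
};

// Implementation 2: only counts in-range entries, no binning at all.
class CountingHistogram : public IHistogram1D {
public:
    CountingHistogram(double lower, double upper) : lower_(lower), upper_(upper) {
        assert(lower < upper);
    }
    void fill(double x) override { if (x >= lower_ && x < upper_) ++entries_; }
    int entries() const override { return entries_; }
private:
    double lower_, upper_;
    int entries_ = 0;
};

class BinnedFactory : public IHistogramFactory {
public:
    std::unique_ptr<IHistogram1D> create1D(const std::string&, int nBins,
                                           double lower, double upper) override {
        return std::make_unique<BinnedHistogram>(nBins, lower, upper);
    }
};

class CountingFactory : public IHistogramFactory {
public:
    std::unique_ptr<IHistogram1D> create1D(const std::string&, int,
                                           double lower, double upper) override {
        return std::make_unique<CountingHistogram>(lower, upper);
    }
};

// Hypothetical run-time selection; a real tool would load a shared library.
std::unique_ptr<IHistogramFactory> loadFactory(const std::string& name) {
    if (name == "impl1") return std::make_unique<BinnedFactory>();
    if (name == "impl2") return std::make_unique<CountingFactory>();
    throw std::invalid_argument("unknown implementation: " + name);
}

// User code: identical for every implementation.
int runAnalysis(IHistogramFactory& histoFactory) {
    auto hist = histoFactory.create1D("track quality", 100, 0., 10.);
    for (double x : {0.5, 2.5, 9.9, 12.0}) hist->fill(x);  // 12.0 is out of range
    return hist->entries();
}

}  // namespace sketch
```

`runAnalysis` compiles once and works with whichever factory `loadFactory` returns, so swapping the implementation requires no change to the user code.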

  9. Across the languages • JAida: C++ access to Java libs • using C++ proxies implementing the C++ Abstract Interfaces to the Java interfaces • [Diagram: C++ user code → AIDA-IF (C++) → JAida → AIDA-IF (Java) → Java lib]

  10. XML standards • Started with 1D and 2D Histograms • aim: easy transfer between applications • Will extend to other data types • other histos, fits, ntuples, … • Comments/contributions welcome!

  11. History • Initial idea formed during discussion at HepVis-99 workshop at Orsay • Informal AIDA discussions at CERN in 2000 • AIDA workshops: • January 2001 - Paris/Orsay • April 2001 - Boston (preceding HepVis 2001) • Informal meetings (e.g. during Geant4 meetings and video conferences) • June 2002 – CERN • Interfaces have been designed by discussion and (eventual) consensus • Takes some time, but the result is well thought out and robust

  12. Organization - Developers • No formal collaboration/author list • Some people who have contributed (ideas, code, etc.): • Guy Barrand, Pavel Binko, Mark Donszelmann, Wolfgang Hoschek, Tony Johnson, Emmanuel Medernach, Dino Ferrero Merlino, Lorenzo Moneta, Jakub Moscicki, Ioannis Papadopoulos, Andreas Pfeiffer, Max Sang, Victor Serbo, Max Turri • Apologies to people accidentally missed

  13. Organization – Code, Documentation • AIDA – open source project • CVS repository: cvs.freehep.org • “anonymous” download available • Web page: http://aida.freehep.org • General information, relevant links • Tutorial, users’ guide, examples • Downloads and web-browsable source code • Test cases (coming soon)

  14. Current Status • AIDA Version 2.2 released (December 2001) • First “End User” release • Three implementations of AIDA exist • Anaphe/Lizard (C++) • http://anaphe.web.cern.ch/anaphe • Open Scientist (C++) • http://www.lal.in2p3.fr/OpenScientist • JAIDA/JAS (Java) + AIDA-JNI 1.0 (C++) • http://java.freehep.org/lib/freehep/doc/aida • GEANT4 adopted AIDA for analysis • AIDA 3 currently under discussion • Release foreseen for Sep 2002

  15. AIDA Interfaces Summary • AIDA Factories • ITuple • IHistogram • ICloud • ITree

  16. AIDA Design

  17. Aida design (details) • [Class diagram: Logical View / Main, from Rose model AIDA-2.2.mdl, Sun Jun 30 17:18:39 2002; classes come from AIDA-2.2 and AIDA-2.2-dev] • Factories: IAnalysisFactory, ITreeFactory, IPlotterFactory, IHistogramFactory, ICloudFactory, ITupleFactory, IFunctionFactory, IEvaluatorFactory, IFilterFactory • Other interfaces: IStore, IPlotter, IManagedObject, ITree, IDevTree, IAnnotation, IHistogram, IHistogram1D, IHistogram2D, IHistogram3D, ICloud, ICloud1D, ICloud2D, ICloud3D, ITuple, IAxis, IFunction, IFitFunction, IEvaluator, IFilter • Abstract methods visible in the diagram include create(), close(), commit(), select(), write(), add(), store()

  18. Example Program (Java) • Create, fill, and view 1D and 2D histograms:

  19. ITuple • ITuple - interface to the Data • “get/set” methods for double, float, int, … • Information about columns: min, max, mean, rms • Navigating: start(), next(), skip(int nRows) • Project ITuple into 1D, 2D, 3D histogram • New features for AIDA 2.3: • Support for complex internal structures (subfolders) • Merging and chaining of ITuples under discussion
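
The navigation and projection described on this slide might look roughly like the sketch below. Only the method names `start()`, `next()` and `skip(int nRows)` come from the slide; the class `SimpleTuple`, the methods `addRow`, `getDouble`, `columnMean`, the free function `project1D` and the cursor semantics (start positions the cursor before the first row) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical in-memory tuple with double-valued columns and a row cursor.
class SimpleTuple {
public:
    explicit SimpleTuple(std::size_t nColumns) : nColumns_(nColumns) {
        assert(nColumns > 0);
    }
    void addRow(const std::vector<double>& row) {
        assert(row.size() == nColumns_);
        rows_.push_back(row);
    }
    // Navigation: start() rewinds to before the first row.
    void start() { cursor_ = -1; }
    bool next() { return skip(1); }
    bool skip(int nRows) {
        assert(nRows >= 0);
        cursor_ += nRows;
        return hasCurrentRow();
    }
    // "get" method for the current row.
    double getDouble(std::size_t column) const {
        assert(hasCurrentRow() && column < nColumns_);
        return rows_[static_cast<std::size_t>(cursor_)][column];
    }
    // Column information (only the mean is shown here).
    double columnMean(std::size_t column) const {
        assert(column < nColumns_ && !rows_.empty());
        double sum = 0.0;
        for (const auto& row : rows_) sum += row[column];
        return sum / static_cast<double>(rows_.size());
    }
private:
    bool hasCurrentRow() const {
        return cursor_ >= 0 && cursor_ < static_cast<long>(rows_.size());
    }
    std::size_t nColumns_;
    std::vector<std::vector<double>> rows_;
    long cursor_ = -1;
};

// Projects one column into equal-width bins over [lower, upper);
// values outside the range are ignored in this sketch.
std::vector<int> project1D(SimpleTuple& tuple, std::size_t column,
                           int nBins, double lower, double upper) {
    assert(nBins > 0 && lower < upper);
    std::vector<int> bins(static_cast<std::size_t>(nBins), 0);
    tuple.start();
    while (tuple.next()) {
        const double x = tuple.getDouble(column);
        if (x < lower || x >= upper) continue;
        auto index = static_cast<std::size_t>((x - lower) / (upper - lower) * nBins);
        if (index >= bins.size()) index = bins.size() - 1;  // guard against rounding
        ++bins[index];
    }
    return bins;
}

}  // namespace sketch
```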

  20. Details - ITuple • Interface to the Data:

  21. IHistogram (1D-3D) • Binned histogram: IHistogram1D, 2D, 3D • “fill” methods (with/without weight) • Histogram info: entries, mean, rms, axis • Bin info: centre, entries, height, error • Histogram arithmetic: add, multiply, divide • Convenience methods, like coordinate-to-index conversion
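
To make the bin quantities concrete, here is a small sketch of a 1D binned histogram. It is not an AIDA implementation: the class `Histogram1D` and its method names are chosen for the example, returning -1 for out-of-range coordinates is a simplification, and the bin error is assumed to be the square root of the sum of squared weights.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace sketch {

class Histogram1D {
public:
    Histogram1D(int nBins, double lower, double upper)
        : lower_(lower), upper_(upper), width_((upper - lower) / nBins),
          entries_(static_cast<std::size_t>(nBins), 0),
          sumW_(static_cast<std::size_t>(nBins), 0.0),
          sumW2_(static_cast<std::size_t>(nBins), 0.0) {
        assert(nBins > 0 && lower < upper);
    }

    int nBins() const { return static_cast<int>(entries_.size()); }

    // Coordinate-to-index conversion; -1 means outside [lower, upper).
    int coordToIndex(double x) const {
        if (x < lower_ || x >= upper_) return -1;
        const int index = static_cast<int>((x - lower_) / width_);
        return std::min(index, nBins() - 1);  // guard against rounding
    }

    // Fill with an optional weight (defaults to 1).
    void fill(double x, double weight = 1.0) {
        const int index = coordToIndex(x);
        if (index < 0) return;  // out-of-range values are dropped here
        const auto bin = static_cast<std::size_t>(index);
        ++entries_[bin];
        sumW_[bin] += weight;
        sumW2_[bin] += weight * weight;
    }

    // Histogram info.
    int entries() const {
        int total = 0;
        for (int n : entries_) total += n;
        return total;
    }

    // Bin info: centre, entries, height (sum of weights), error.
    double binCentre(int index) const { return lower_ + (index + 0.5) * width_; }
    int binEntries(int index) const { return entries_[checked(index)]; }
    double binHeight(int index) const { return sumW_[checked(index)]; }
    double binError(int index) const { return std::sqrt(sumW2_[checked(index)]); }

private:
    std::size_t checked(int index) const {
        assert(index >= 0 && index < nBins());
        return static_cast<std::size_t>(index);
    }
    double lower_, upper_, width_;
    std::vector<int> entries_;
    std::vector<double> sumW_, sumW2_;
};

}  // namespace sketch
```

With weights, the number of entries in a bin and its height differ: two fills with weights 1 and 3 give 2 entries but a height of 4.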

  22. Details – IHistogram (1D-3D) • IHistogram: common functionality for all histograms (like entries, label, dimension) • [Diagram: IHistogram, IHistogram1D, IHistogram2D, IHistogram3D]

  23. ICloud • Unbinned collection of points: ICloud1D, 2D, 3D • Can represent scatter plot, dynamically rebinnable histogram • Can be converted to a binned histogram • Standard “get/set” methods for entries • Collection info: lower, upper, mean, rms
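
The difference between a cloud and a histogram can be illustrated with a short sketch. The class `Cloud1D` and its methods below are assumptions made for this example and do not reproduce the AIDA signatures. Because the cloud keeps every point, it can be binned after the fact, and re-binned with a different choice of bins.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical unbinned collection of 1D points.
class Cloud1D {
public:
    void fill(double x) { points_.push_back(x); }
    std::size_t entries() const { return points_.size(); }

    // Collection info: smallest and largest value seen, and the mean.
    double lowerEdge() const {
        assert(!points_.empty());
        return *std::min_element(points_.begin(), points_.end());
    }
    double upperEdge() const {
        assert(!points_.empty());
        return *std::max_element(points_.begin(), points_.end());
    }
    double mean() const {
        assert(!points_.empty());
        double sum = 0.0;
        for (double x : points_) sum += x;
        return sum / static_cast<double>(points_.size());
    }

    // Converts to equal-width bin counts over [lower, upper). The points are
    // kept, so the same cloud can be converted again with other binning.
    std::vector<int> convert(int nBins, double lower, double upper) const {
        assert(nBins > 0 && lower < upper);
        std::vector<int> bins(static_cast<std::size_t>(nBins), 0);
        for (double x : points_) {
            if (x < lower || x >= upper) continue;
            auto index = static_cast<std::size_t>((x - lower) / (upper - lower) * nBins);
            if (index >= bins.size()) index = bins.size() - 1;  // rounding guard
            ++bins[index];
        }
        return bins;
    }

private:
    std::vector<double> points_;
};

}  // namespace sketch
```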

  24. Details - ICloud • ICloud: common functionality for all clouds (like entries, label, dimension) • [Diagram: ICloud, ICloud1D, ICloud2D, ICloud3D]

  25. IFunction and Fitting • Fitting: IFunction, IFitFunction • IFunction – simple interface for setting parameters and getting the function value • IFitFunction – fits a function to a histogram • Extends IFunction • Various fit control methods: step size, bounds, etc. • Performs the fit and returns the results • AIDA 2.2 fitting functionality is fairly limited • AIDA 2.3 (under discussion): extended functionality
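
As a toy illustration of what "step size" and "bounds" control, the sketch below scans one parameter between two bounds and keeps the value with the smallest sum of squared residuals. This brute-force scan is a stand-in chosen for brevity, not the method used by any AIDA implementation, and `IFunction` here is a cut-down interface defined for the example; `Proportional`, `Point`, `sumOfSquares` and `scanFit` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Cut-down function interface: set parameters, get the function value.
class IFunction {
public:
    virtual ~IFunction() = default;
    virtual void setParameters(const std::vector<double>& parameters) = 0;
    virtual double value(double x) const = 0;
};

// One-parameter model: f(x) = slope * x.
class Proportional : public IFunction {
public:
    void setParameters(const std::vector<double>& parameters) override {
        assert(parameters.size() == 1);
        slope_ = parameters[0];
    }
    double value(double x) const override { return slope_ * x; }
private:
    double slope_ = 0.0;
};

// A data point, e.g. a bin centre and the bin height.
struct Point {
    double x;
    double y;
};

double sumOfSquares(const IFunction& f, const std::vector<Point>& data) {
    double sum = 0.0;
    for (const Point& p : data) {
        const double residual = p.y - f.value(p.x);
        sum += residual * residual;
    }
    return sum;
}

// Scans the single parameter from lower to upper in steps of stepSize,
// leaves the best value set on f, and returns it.
double scanFit(IFunction& f, const std::vector<Point>& data,
               double lower, double upper, double stepSize) {
    assert(lower <= upper && stepSize > 0.0);
    double best = lower;
    f.setParameters({lower});
    double bestScore = sumOfSquares(f, data);
    for (int i = 1; lower + i * stepSize <= upper; ++i) {
        const double candidate = lower + i * stepSize;
        f.setParameters({candidate});
        const double score = sumOfSquares(f, data);
        if (score < bestScore) {
            bestScore = score;
            best = candidate;
        }
    }
    f.setParameters({best});
    return best;
}

}  // namespace sketch
```

The resolution of the result is limited by the step size, and a minimum outside the bounds is never found, which is why real fitters expose both as controls.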

  26. ITree • ITree • directory-like structure (Unix directory convention) • Methods like: cd, ls, mkdir, etc. • AIDA analysis objects (tuples, histograms, clouds, etc.) exist within ITree directories • “save/restore” functionality, hides storage details from the user • Compatible with database or file storage • Can support multiple file formats • Mount/Unmount functionality (like Unix) allows multiple stores to be seamlessly merged • AIDA XML format is defined for data interchange
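
The directory-like behaviour can be sketched as follows. The class `Tree` is hypothetical and only mimics the `cd`, `ls` and `mkdir` operations named on the slide; it stores object names rather than real analysis objects, and storage, mounting and the XML format are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical in-memory tree using Unix-style paths such as "/histos/".
class Tree {
public:
    Tree() { dirs_.insert("/"); }

    const std::string& pwd() const { return cwd_; }

    // Creates a sub-directory of the current directory; false if it exists.
    bool mkdir(const std::string& name) {
        assert(!name.empty() && name.find('/') == std::string::npos);
        return dirs_.insert(cwd_ + name + "/").second;
    }

    // Changes into a direct sub-directory, or to the parent with "..".
    bool cd(const std::string& name) {
        if (name == "..") {
            if (cwd_ == "/") return false;
            const std::size_t slash = cwd_.rfind('/', cwd_.size() - 2);
            cwd_ = cwd_.substr(0, slash + 1);
            return true;
        }
        const std::string target = cwd_ + name + "/";
        if (dirs_.count(target) == 0) return false;
        cwd_ = target;
        return true;
    }

    // Registers an object name (a histogram, tuple, ...) in the current directory.
    void add(const std::string& objectName) {
        assert(!objectName.empty() && objectName.find('/') == std::string::npos);
        objects_.insert(cwd_ + objectName);
    }

    // Lists direct children: sub-directories (with trailing '/') first, then objects.
    std::vector<std::string> ls() const {
        std::vector<std::string> names;
        for (const std::string& dir : dirs_) {
            if (dir.size() <= cwd_.size() || dir.compare(0, cwd_.size(), cwd_) != 0) continue;
            const std::string rest = dir.substr(cwd_.size());
            if (rest.find('/') == rest.size() - 1) names.push_back(rest);
        }
        for (const std::string& object : objects_) {
            if (object.compare(0, cwd_.size(), cwd_) != 0) continue;
            const std::string rest = object.substr(cwd_.size());
            if (rest.find('/') == std::string::npos) names.push_back(rest);
        }
        return names;
    }

private:
    std::set<std::string> dirs_;     // absolute directory paths, each ending in '/'
    std::set<std::string> objects_;  // absolute object paths
    std::string cwd_ = "/";
};

}  // namespace sketch
```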

  27. Details - ITree • Directory-like structures: ITree

  28. June 2002 Developers Workshop • 2 Day “Users Workshop” • 3 Day “Developer Workshop” • Items under discussion • Fitting (two proposals, from SLAC, CERN) • Similar but different, differences need to be resolved • Improved plotting (IPlotter) • Graph (XYData) • Tuple chaining, merging • Small updates/extensions to API • Input/Participation from people always welcome!

  29. Ongoing work • Refining Fitter and Plotter components • IFitter, IOptimizer, IFitResult • IPlotter, I*Styles for plotting • Developer-level interfaces • Code sharing • More robust operation • Put AIDA-based utilities in CVS • Utility to test AIDA implementations • User contributions • Common binary storage format

  30. Summary • Abstract Interfaces de-couple components of frameworks • Weakly coupled components and frameworks have large advantages • User code needs no change if changing implementation • Even across “language boundaries” (JAIDA) • Ease of re-use of a component • Flexibility through independence of implementation • Maintainability through independent evolution of components • Example using Geant4 and AIDA-compliant analysis tools (see tutorial)
