Some thoughts for the future (may be)


  1. Some thoughts for the future (may be)
     ROOT team meeting, 27 January 2006
     René Brun, CERN
     Some thoughts

  2. Observations
     • A considerable amount of time is spent installing software (up to one day for an expert).
     • Porting to a new platform is nontrivial.
     • Dependency problems arise when many packages must be installed.
     • Only a small subset of the software is used.
     • The installation may require a huge amount of disk space. Users are scared to download a new version.
     • This does not fit well with the GRID concept.
     • The GRID should be used to simplify this process, not to make it more complex.

  3. Atlas packages with > 10000 lines

      Lines  Package              Lines per language
     211677  dice                 fortran=211641
     187691  atrecon              fortran=138126, cpp=49354
     129793  MuonSpectrometer     fortran=121321, python=3715, csh=2613, sh=2136
     118504  Tools                cpp=67337, ansic=19012, python=13770, sh=7373, yacc=5659, fortran=3024, lex=1971
     116327  PhysicsAnalysis      cpp=107348, python=6070, sh=1649, csh=1260
     115143  geant3               fortran=115040, ansic=67
     112445  TileCalorimeter      cpp=108580, python=2209, csh=920, sh=736
     108200  atutil               fortran=108000, ansic=164
      80866  Applications         fortran=71764, cpp=6961, ansic=1865
      74721  Calorimeter          cpp=65917, python=7854, sh=490, csh=460
      67822  atlfast              fortran=67786
      64838  Tracking             cpp=60255, python=2092, csh=1380, sh=1104
      59429  Generators           fortran=28136, cpp=25538, python=4123, sh=872, csh=760
      49926  graphics             java=40719, cpp=8312, python=321, sh=255, csh=220
      40058  AtlasTest            cpp=25159, python=5131, sh=4815, perl=4145, csh=517
      39576  Control              cpp=22030, python=15904, sh=907, csh=693
      31192  DetectorDescription  ansic=29540, csh=680, sh=562, python=343
      29500  TestBeam             cpp=27433, python=1491, csh=320, sh=256
      25001  Reconstruction       sh=10297, fortran=7559, python=5393, csh=1667
      18989  atlsim               fortran=17561, cpp=1380
      18328  InnerDetector        python=11466, csh=2860, sh=2641, ansic=1343
      17291  Simulation           python=13653, sh=2126, csh=1302, fortran=169
      16139  Database             perl=8310, sh=4299, java=2209, csh=709, python=566
      14250  Event                cpp=13522, python=296, csh=240, sh=192
      12930  gcalor               fortran=12894
      11955  Trigger              python=7860, csh=1780, sh=1673, perl=634
      11195  LArCalorimeter       python=6133, ansic=2045, csh=1620, sh=1347

     Total: 3 million lines of code, 1200 packages

  4. Alice packages with > 10000 lines

      Lines  Package   Lines per language
     398742  PDF       fortran=398729, ansic=13
     146414  PYTHIA6   fortran=140748, cpp=5413, ansic=153, pascal=100
     128337  HLT       cpp=127601, ansic=605, sh=100, csh=31
     128103  ITS       cpp=128010, sh=93
     105763  MUON      cpp=105673, sh=90
      94548  DPMJET    fortran=94267, cpp=281
      72400  STEER     cpp=72400
      52443  HBTAN     cpp=51260, fortran=1183
      51489  TPC       cpp=51479, sh=10
      50932  PHOS      cpp=50639, csh=293
      46176  TRD       cpp=46176
      41998  ISAJET    fortran=40483, cpp=1494, pascal=21
      39407  RALICE    cpp=29764, ansic=9355, sh=288
      35916  EMCAL     cpp=35410, fortran=383, csh=123
      31820  ANALYSIS  cpp=31820
      27751  HERWIG    fortran=27246, cpp=477, ansic=28
      27025  FMD       cpp=27021, sh=4
      26667  TOF       cpp=26667
      24258  EVGEN     cpp=24258
      21588  HIJING    fortran=21099, cpp=489
      20562  JETAN     cpp=19687, fortran=875
      18344  RAW       cpp=18344
      15232  STRUCT    cpp=15232
      13142  PMD       cpp=13142
      12945  RICH      cpp=12945
      10966  FASTSIM   cpp=10966
      10944  MONITOR   cpp=10944
      10659  ZDC       cpp=10659

     Total: 1.5 million lines of code

  5. Fraction of code really used in one program
     [Figure: % functions used, % classes used]

  6. LHC software

  7. ROOT source, bins, dict, libs
     [Diagram; paired values are labelled SLC3/gcc3.2.3, Windows/vc++7.1]
     • *.h: 153 kl, 6.4 Mb
     • rootcint -cint: 56s, 71s; rootcint -reflex: 58s, 71s; rootcint -gccxml: 300s, 100s
     • *.cxx: 855 kl, 100 Mb; Xdict_c.cxx: 704 kl; Xdict_r.cxx: 623 kl; Xdict_g.cxx: 623 kl
     • c++ (in the order listed on the slide): 338s, 90s; 420s, 417s; 427s, 421s; 2640s, 1614s
     • *.o: 41 Mb, 114 Mb; Xdict_c.o: 44 Mb, 53 Mb; Xdict_r.o: 51 Mb, 65 Mb; Xdict_g.o: 51 Mb, 65 Mb
     • ld: 15s, 45s
     • *.so, .lib: 88 Mb, 71 Mb

  8. Sources of inefficiency when compiling
     • Always compile your dictionaries with -O0. It makes no difference at execution time whether dictionaries are compiled with -O0, -O1 or -O2.
     • Example: time to compile G__Base1.cxx
         • -O0: 20s
         • -O1: 40s
         • -O2: 60s
     • Always use local files.
     • Use forward declarations as much as possible.
     • Abuse of templates/STL is a real killer (see later).
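The first recommendation above could be applied in a build script by choosing the optimization flag per source file. The following is only a minimal sketch: the function name `optimization_flag`, the `default` parameter and the assumption that dictionary sources can be recognised by a `G__` file-name prefix (as in the `G__Base1.cxx` example) are illustrative and do not come from the talk.

```python
from pathlib import PurePath


def optimization_flag(source, default="-O2"):
    """Return the optimization flag to use for one source file.

    Hypothetical rule, for illustration only: files whose name starts
    with "G__" are treated as generated dictionaries and get -O0;
    every other file gets `default`.
    """
    name = PurePath(source).name
    if name.startswith("G__"):
        return "-O0"  # dictionaries: optimizing only costs compile time
    return default


# Example file list (the second path is made up for the illustration).
flags = {path: optimization_flag(path)
         for path in ["base/G__Base1.cxx", "hist/TH1.cxx"]}
```

A build system would then pass `flags[path]` to the compiler for each file instead of one global setting.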

  9. Serious problem with STL
     • STL containers are nice. However, they have a high cost in a really large environment.
     • Compiling code with STL is much slower.
     • Object modules are bigger.
     • The compiler or linker is able to eliminate duplicate code in ONE object file or shared lib, not across libraries.
     • If you have 100 shared libs, it is likely that you have the code for std::vector push_back or iterators 100 times!
     • Inlining is nice if used with care (or in toy benchmarks). It may have the opposite effect, generating more cache misses in a real application.
     • Templates are statically defined and difficult to use in a dynamic interactive environment.

  10. (no text on this slide)

  11. Sources of inefficiency with shared libs
     • -fPIC (Position Independent Code) introduces a degradation of about 20% (10 to 30%).
     • With many shared libs, the percentage of classes and code used is small => swapping (20%).
     • Because shared libs are generated for maximum portability, one cannot use the advanced features of the local processor when compiling.
     • The same optimization level is used everywhere.
         • But a very large fraction of the code does not need to be optimized: no gain at execution, big loss when compiling.
         • A small fraction of the code should be compiled with the highest possible optimization (10%).
     • Maybe a factor 2 loss!

  12. Can we gain something with better packaging?
     • Yes and no.
     • 1 shared lib per class implies more administration, more dictionaries, more dependencies.
     • 80 shared libs for ROOT is already a lot.
     • 500 would be nonsense.
     • The Plug-in Manager helps.
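The plug-in idea mentioned in the last bullet can be illustrated with a small registry that loads a component only when it is first requested. This is a generic sketch, not ROOT's plug-in manager API: the class `PluginManager`, its methods and the use of Python standard-library modules as stand-ins for shared libraries are all assumptions made for the example.

```python
import importlib


class PluginManager:
    """Map a name to a module that is imported only on first use."""

    def __init__(self):
        self._handlers = {}  # name -> (module name, attribute name)
        self._loaded = {}    # name -> resolved object

    def register(self, name, module_name, attribute):
        """Declare where `name` lives without loading anything."""
        self._handlers[name] = (module_name, attribute)

    def get(self, name):
        """Load the module on the first request, then reuse it."""
        if name not in self._loaded:
            module_name, attribute = self._handlers[name]
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attribute)
        return self._loaded[name]

    def is_loaded(self, name):
        return name in self._loaded


pm = PluginManager()
pm.register("sqrt", "math", "sqrt")         # stand-ins for real plug-ins
pm.register("json_dump", "json", "dumps")
root_of_nine = pm.get("sqrt")(9.0)          # only "sqrt" is resolved here
```

Only the components actually requested are resolved, which is the property that keeps a system with many libraries manageable.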

  13. Shared libs vs archive libs
     • In the Fortran era, there was often one subroutine per file.
     • The loader takes only the subroutines really referenced. However, the percentage of referenced but unused code has increased with time.
     • Shared libs were efficient at a time when code could be shared between different tasks on time-sharing systems.
     • Shared libs have partially solved the link-time problem.
     • Shared libs are not a solution for the long term.
     • Archive libs are unusable in a large system, but nice for building static modules.
     • What to do?

  14. (no text on this slide)

  15. (no text on this slide)

  16. [Diagram] memory
     • Cint: 10000 l/s
     • c++: 800 l/s
     • ld
     • myapp
     • *.cxx, *.h: 70 Mb; *.o: 110 Mb; *.so: 76 Mb

  17. Proposal for a new scenario
     Introducing BOOT: a Software Bootstrap system

  18. What is BOOT?
     • A small, easy to install, standalone executable module (< 5 Mbytes)
     • One click in the web browser
     • It must be a stable system that can cope with old and new versions of other packages, including ROOT itself.
     • It includes:
         • A subset of ROOT I/O, network and Core classes
         • A subset of Reflex
         • A subset of CINT (could also have a python flavour)
         • Possibly a GUI object browser

  19. BOOT and existing applications
     • BOOT must be able to run with the existing systems, maybe with reduced functionality.
     • In the next slides, I show a few use cases to illustrate the ideas.
     • Do not take the syntax as the final word.

  20. BOOT: Use Case 1
     • Assume BOOT is already installed on your machine (user@xxx.yyy.zzz).
     • Nothing else is on the machine except the compiler (no ROOT, etc.).
     • Import a ROOT file containing histograms, Trees and other classes (usecase1.root).
     • Browse the contents of the file.
     • Draw a histogram.

  21. h.Draw(), local mode
     [Diagram: CINT and the libraries below, with links labelled pm]
     • libX11: … drawline, drawtext …
     • libCore: … I/O, TSystem …
     • libGpad: … TPad, TFrame …
     • libGraf: … TGraph, TGaxis, TPave …
     • libHist: … TH1, TH2 …
     • libHistPainter: … THistPainter, TPainter3DAlgorithms …

  22. Use Case 1
     • Usecase1.root (2 Mbytes): contains references (URL) to classes in namespace ROOT.
     • http://root.cern.ch/coderoot.root: a compressed ROOT file containing
         • the full ROOT source tree automatically built from CVS (25 Mbytes)
         • + the ROOT classes dictionary DS generated by Reflex (5 Mbytes)
         • + the full classes documentation objects generated by the source parser (5 Mbytes)
     • Local cache: the source of the classes really used + binaries for the classes or functions that are automatically generated from the interpreter (like the ACLiC mechanism).
     • Hosts shown: user@xxx.yyy.zzz, pcroot@cern.ch
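The local cache described on this slide could behave roughly as follows: the source of a class is fetched from the code file the first time it is needed and reused afterwards. This is a sketch under stated assumptions: `ClassSourceCache`, the `fetch` callback and `fake_fetch` are hypothetical names, and no real network access or ROOT file reading is performed.

```python
class ClassSourceCache:
    """Keep the source of each class that has actually been requested."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(url, class_name) -> source text
        self._cache = {}     # (url, class_name) -> source text

    def source(self, url, class_name):
        key = (url, class_name)
        if key not in self._cache:
            # Only classes really used are ever transferred.
            self._cache[key] = self._fetch(url, class_name)
        return self._cache[key]

    def cached_classes(self):
        return sorted(name for _, name in self._cache)


calls = []  # records every simulated remote access


def fake_fetch(url, class_name):
    """Stand-in for reading one class from the remote code file."""
    calls.append((url, class_name))
    return f"// source of {class_name} from {url}"


CODE_URL = "http://root.cern.ch/coderoot.root"  # URL quoted on the slide
cache = ClassSourceCache(fake_fetch)
cache.source(CODE_URL, "TH1F")
cache.source(CODE_URL, "TH1F")  # second request is served locally
```

In this sketch a 25 Mbyte source tree would never be downloaded as a whole; only the requested entries end up in the cache.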

  23. Use Case 1: pictures
     [Figure: usecase1.root, code.root]

  24. Use Case 2
     • BOOT already installed
     • Want to write the shortest possible program using some classes in namespace ROOT and some classes from another namespace YYYY

         // This code can be interpreted line by line,
         // executed as a script, or compiled with C/C++
         // after corresponding code generation
         use ROOT, YYYY=http://cms.cern.ch/packages/yyyy
         h = new TH1F("h","example",100,0,1);
         v = new LorentzVector(….);
         gener = new myClass(v.x());
         h.Fill(gener.Something());
         h.Draw();

  25. Use Case 3
     • A variant of Use Case 2
     • A bug has been found in class LorentzVector of ROOT and fixed in a new version, ROOT6

         use ROOT, YYYY=http://cms.cern.ch/packages/yyyy
         use ROOT6=http://root.cern.ch/root6/code.root
         use ROOT6::LorentzVector
         h = new TH1F("h","example",100,0,1);
         v = new LorentzVector(….);
         gener = new myClass(v.x());
         h.Fill(gener.Something());
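One way to read the `use` statements of Use Cases 2 and 3 is as a lookup table from namespaces to code locations, with per-class overrides. The sketch below is an interpretation, not the proposed implementation: the class `UseTable`, its methods, the rule that unlisted classes fall back to a default namespace, and the URL chosen for namespace ROOT (taken from the Use Case 1 slide) are assumptions.

```python
class UseTable:
    """Resolve a class name to the namespace and URL that provide it."""

    def __init__(self, default="ROOT"):
        self._default = default
        self._namespaces = {}  # namespace -> URL of its code file
        self._overrides = {}   # class name -> namespace chosen explicitly

    def use(self, namespace, url):
        """Model of: use NAMESPACE=URL"""
        self._namespaces[namespace] = url

    def use_class(self, namespace, class_name):
        """Model of: use NAMESPACE::ClassName"""
        if namespace not in self._namespaces:
            raise KeyError(f"unknown namespace {namespace!r}")
        self._overrides[class_name] = namespace

    def resolve(self, class_name):
        # Simplification: a class without an override is looked up in the
        # default namespace; a real system would consult dictionaries.
        namespace = self._overrides.get(class_name, self._default)
        return namespace, self._namespaces[namespace]


table = UseTable()
table.use("ROOT", "http://root.cern.ch/coderoot.root")
table.use("YYYY", "http://cms.cern.ch/packages/yyyy")
table.use("ROOT6", "http://root.cern.ch/root6/code.root")
table.use_class("ROOT6", "LorentzVector")  # take only the fixed class
```

With this table, `LorentzVector` comes from ROOT6 while every other class, such as `TH1F`, is still taken from ROOT.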

  26. Use Case 4
     • High Level ROOT Selector understanding named collections in memory (ROOT, STL) or collections in ROOT files.

         use ROOT
         use ATLFAST=http://atlas.cern.ch/atlfast/atlfastcode.root
         TFile f("mcrun.root");
         for each entry in f.Tree
            for each electron in Electrons
               h.Fill(electron.m_Pt);
         h.Draw
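The nested "for each" loop of this use case can be mimicked with plain Python containers. This is only an illustration of the control flow: the `Histogram` class is a minimal stand-in for a ROOT histogram, and the event list with its `m_Pt` values is made up for the example.

```python
class Histogram:
    """Fixed-width 1D histogram; out-of-range values are ignored."""

    def __init__(self, nbins, low, high):
        self.nbins, self.low, self.high = nbins, low, high
        self.counts = [0] * nbins

    def fill(self, x):
        if self.low <= x < self.high:
            width = (self.high - self.low) / self.nbins
            index = min(int((x - self.low) / width), self.nbins - 1)
            self.counts[index] += 1


# Made-up events standing in for the entries of f.Tree; each entry
# holds a named collection "Electrons".
events = [
    {"Electrons": [{"m_Pt": 12.5}, {"m_Pt": 37.0}]},
    {"Electrons": []},
    {"Electrons": [{"m_Pt": 12.0}]},
]

h = Histogram(nbins=10, low=0.0, high=100.0)
for entry in events:                      # for each entry in f.Tree
    for electron in entry["Electrons"]:   # for each electron in Electrons
        h.fill(electron["m_Pt"])
```

The point of the proposed selector is that the user writes only these two loops, while locating the collections is left to the system.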

  27. Use Case 5: Event Displays
     • In general, Event Displays require the full experiment infrastructure (Pacific, Obelix, WonderLand, Crocodile).
     • This is complex and not good for users and OUTREACH.
     • A data file with the visualization scripts is far more powerful.
     • This implies that the GUI must be fully scriptable. This is the case for the ROOT GUI.
     [Figure labels: data, scripts]

  28. Requirements: work to do
     • libCore already has all the infrastructure for client-server communications and for accessing remote files on the GRID.
     • We must understand how to use subsets of the compilers and linkers to bypass disk I/O.
     • We must understand how to emulate a dynamic linker using pre-compiled objects in memory.
     • We have to investigate various code generation tools and the coupling with an extended version of CINT (and possibly python).
     • We must understand how to use the STL functionality without its penalty. Dynamic templates are also necessary.

  29. Procedure
     • These are just ideas. Making a firm proposal requires more investigation and prototyping.
     • It must be clear that the top priority is the consolidation of ROOT to be ready for LHC data taking. This should not be an excuse for not looking forward.
     • It is my intention to continue this work as a background activity.