
Presentation Transcript


  1. TOC • - Introduction (where the hardware is coming from) • - Time-table (arrival, setting up the cluster, when it can be used) and problems we’ll face during installation of the hardware • - Current status of the existing cluster in Belgrade • - Current status of usage of IT resources for production in NA61 • - Possible usage

  2. Introduction • As part of the process of bringing CERN and Serbia closer, Sergio Bertolucci, Director for Research and Scientific Computing at CERN, visited several institutions in our country. • One of the results of his visit is that we are now looking at a CERN donation to Serbia in the form of IT equipment. • Part of the donation will go to the Belgrade SHINE group. • The hardware we expect to see at the Institute of Physics, Belgrade is: 32 server units (512 cores) + 4 spare units, and 6 disk storage units. • This hardware was used at CERN, but we will not know its exact specification until it arrives in Belgrade. • Time-table • Problems • New space for the cluster approved by IF, under construction on the IF campus. Better wiring and cooling. • Additional expenses: racks (with the possibility to build our own) and up to 4 network switches. • Current status of the existing cluster in Belgrade • Current status of IT resources of NA61 • Possible usage • Plans for software

  3. Time table The initial CERN paperwork, customs formalities and shipping will be completed by the end of October. The initial installation (placement of the cluster) will be in a specially adapted room; this will be finished by the end of November. We are still missing racks for mounting the units. The Institute of Physics will provide a new, better-equipped space to house the cluster, and this space is currently under construction. The improved power installation, along with power and network cabling, will be finished in December. We are still missing several network switches. We plan to finish the software installation by February.

  4. Current status of Belgrade cluster • Current resources: • ================================================================== • FF (CORSAIR) • 1 web server / VNC server (SLC5, 1 core) • 1 file server (SLC5, 4 cores) • 2 working nodes (4 cores each): 1 SLC5 (kinez, 4 cores) and 1 Ubuntu 32-bit (4 cores) * NA61/SHINE software on SLC5 and SHINE on Ubuntu, CORSIKA on SLC5, WINE on SLC5, Geant4 on SLC5 * Copy of the cosmic-ray files from IF; Dragan uses the Ubuntu machine. Web server. • ================================================================== • IF (Enterprise) • 1 server / working node (SLC5, 4 cores) • 1 NAS file server (4 cores) • 1 working node (4 cores) • 1 web server (Dejan’s machine, 4 cores) * NA61/SHINE software on SLC5, CORSIKA on SLC5, WINE on SLC5, Geant4 on SLC5 * Used for transferring neutron data; Mihailo and Marijan use SHINE; Geant4 simulations of low-energy cosmic rays; CORSIKA simulations. * Dejan’s machine: web server. Cosmic robot: continuous processing of cosmic-ray counts underground and at the surface, with continuous correction for pressure. Skyshine simulation. CORSIKA simulations. UrQMD simulations. • Hardware: • 1 server / web server • 4 working nodes • 2 file servers • Software • includes SHINE on Ubuntu 32-bit, SHINE on SLC5, CernVM • Also: CORSIKA, UrQMD, Geant4, a web robot for cosmic-ray monitoring. • We store mini-SHOE files for our analysis needs, locally produced data, and local experimental data.

  5. prodna61: last year’s lxbatch usage Last year’s number of jobs for the prodna61 account shows a peak in the number of concurrently running jobs during August, but for the rest of the year not many lxbatch resources were available to us. We cannot expect them in the future either, and we will keep raising the issue of the low number of concurrent jobs (at Calib meetings) for future productions as well. There are 40k job slots on lxbatch. The main users are CMS and ATLAS T0 processing: ATLAS and CMS T0 processing have higher priority and take the lxbatch resources faster, but lower-priority queues are also significant (prodna61 uses the 1nh queue with priority 42; grid queues are similar, priority 40). I would conclude that, depending on other users’ jobs, we cannot expect a stable number of concurrent jobs, especially during peaks of activity at CERN.
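
A rough illustration (a sketch only, not part of the original slides) of why the fluctuating lxbatch share matters for planning: take the 5M-event production quoted in slide 7, split into 5000 jobs of 1000 events each, and assume each job needs about one hour of wall-clock time (the per-job time is an assumption here, suggested by the 1nh queue limit). The turnaround of the production then scales directly with however many slots we actually get.

```python
# Illustration only: turnaround of one production vs. available lxbatch slots.
# Assumed figures: 5M events split into 5000 jobs of 1000 events (Szymon's
# numbers from slide 7), roughly one hour of wall-clock time per job.

TOTAL_JOBS = 5000        # 5M events / 1000 events per job
HOURS_PER_JOB = 1.0      # assumed wall-clock time of a single job

for slots in (50, 100, 300):
    days = TOTAL_JOBS * HOURS_PER_JOB / slots / 24.0
    print(f"{slots:3d} concurrent jobs -> production drains in {days:.1f} days")
```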

  6. Szymon’s presentation

  7. Possible benefits of Belgrade cluster • Having dedicated resources for production, other than lxbatch, will help us achieve a stable and increased MC production. We will also be able to plan the production better and finish it within the time planned. • If MC production were to be done at an external cluster: • Let’s assume for a moment that we could run 100 concurrent jobs for a 5M-event production, i.e. 5k jobs of 1000 events each (Szymon’s figures). We could then expect about 12 GB of files to be produced per hour for transfer to lxplus (CERN’s CASTOR). • Since we have optical links in the institute, installed and optimised for the operational requirements of the already working CMS and ATLAS grid sites, the network can support the transfer of all created files with no backlog: all files produced in an hour will be transferred within that hour. • It is quite interesting that the transfer from outside to CERN is significantly faster than the transfer from CERN to another institute. • In our quick tests, the CERN-to-Belgrade transfer rate was 3 MB/s (10.8 GB/h), but the Belgrade-to-CERN rate was 11.2 MB/s (36 GB/h). • So the 12 GB produced by 100 jobs running for an hour could be transferred in about 20 minutes. • This means that if we run 300 jobs concurrently, we will have a sustainable transfer (all files created in an hour are transferred within that hour) of produced files to CERN’s lxplus. If we allow some delay in transfers, we could have more than 400 concurrent jobs running for MC production. • To conclude, if the MC production is done on our future cluster, this would give us an increased and stable minimum number (say 300) of concurrent jobs, and all the files would be accessible on CERN’s CASTOR shortly after the production is done.
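
A back-of-the-envelope check of the numbers above (a minimal sketch; the per-job output and the ~36 GB/h Belgrade-to-CERN rate are the figures quoted in this slide, not new measurements):

```python
# Back-of-envelope check of the transfer argument above (illustration only;
# the input figures are the ones quoted in the slide).

GB_PER_JOB_HOUR = 12.0 / 100   # 100 concurrent jobs produce ~12 GB per hour
UPLOAD_GB_PER_HOUR = 36.0      # measured Belgrade -> CERN rate (~11 MB/s)

# Largest number of concurrent jobs whose hourly output can still be
# shipped to CERN within the same hour (no backlog builds up):
sustainable_jobs = UPLOAD_GB_PER_HOUR / GB_PER_JOB_HOUR
print(f"sustainable concurrent jobs: {sustainable_jobs:.0f}")        # ~300

# Time to ship one hour of output from 100 jobs (12 GB):
transfer_minutes = 60.0 * 12.0 / UPLOAD_GB_PER_HOUR
print(f"12 GB transferred in about {transfer_minutes:.0f} minutes")  # ~20 min
```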

  8. Darko’s presentation

  9. Possible usage - It will serve for testing the NA61 virtualization project - It will process the cosmic-ray and neutron data we are collecting at the Institute of Physics in Belgrade - For MC simulations of cosmic rays (CORSIKA + Geant4) - Training of our students without lxplus accounts We would like to see the cluster used for: - MC production for NA61 - Tests of the reconstruction of experimental data using virtual machines - SHINE Legacy on virtual machines - Users of SHINE software doing analysis on data duplicated from CASTOR

  10. Thank you!
