
CERN openlab for DataGrid applications






Presentation Transcript


  1. CERN openlab for DataGrid applications
Sverre Jarp, CERN openlab CTO, IT Department, CERN
SJ – June 2003

  2. CERN openlab
• The IT Department's main R&D focus
• Framework for collaboration with industry
• Evaluation, integration and validation of cutting-edge technologies
• Initially a 3-year lifetime; later: annual renewals
• [Timeline figure: LCG and CERN openlab running in parallel, 2002–2008]

  3. Openlab sponsors
• 5 current partners:
  • Enterasys: 10 GbE core routers
  • HP: Integrity servers (103 × 2-way, 2 × 4-way); two fellows (co-sponsored with CERN)
  • IBM: Storage Tank file system (SAN FS) with metadata servers and data servers (currently 28 TB)
  • Intel: 64-bit Itanium processors and 10 Gbps NICs
  • Oracle: 10g database software with add-ons; two fellows
• One contributor:
  • Voltaire: 96-way Infiniband switch

  4. The opencluster in its new position in the Computer Centre

  5. Integration with the LCG testbed
[Network diagram: the new high-throughput prototype (Feb. 2004), built around an Enterasys N7 switch with a 10 GbE WAN connection and multiple GE/10 GbE connections to the backbone]
• 180 IA32 CPU servers (2.4 GHz P4, 1 GB memory), GE per node
• 56 IA64 servers (1.3/1.5 GHz Itanium 2, 2 GB memory), GE and 10 GbE per node
• 28 IA32 disk servers (~1 TB disk space each), GE per node
• 20 tape servers

  6. Recent achievements (selected amongst many others)
• Hardware and software:
  • Key ingredients deployed in Alice Data Challenge V
  • Internet2 land speed record between CERN and CalTech
• Porting and verification of CERN/HEP software on the 64-bit architecture (a typical porting pitfall is illustrated after this slide):
  • CASTOR, ROOT, CLHEP, GEANT4, ALIROOT, etc.
• Parallel ROOT data analysis
• Port of LCG software to Itanium
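A very common class of problem in such 64-bit porting work is code that assumes a pointer fits in an int, which holds on IA32 but not on Itanium/Linux (an LP64 platform). The fragment below is a hypothetical sketch of the pitfall and its fix, not code from any of the packages listed above:

    #include <cstdio>
    #include <stdint.h>   // intptr_t: an integer type wide enough to hold a pointer

    int main() {
        int value = 42;
        int *p = &value;

        // Broken on an LP64 platform such as Itanium/Linux: a 64-bit pointer
        // is silently truncated when stored in a 32-bit int.
        // int addr = (int)p;

        // Portable alternative: use a pointer-sized integer type.
        intptr_t addr = reinterpret_cast<intptr_t>(p);

        std::printf("sizeof(int)=%lu  sizeof(void*)=%lu  address=%lx\n",
                    (unsigned long)sizeof(int), (unsigned long)sizeof(void *),
                    (unsigned long)addr);
        return 0;
    }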

  7. ADC V - Logical model and requirements
[Flattened data-flow diagram: Detector → Digitizers → Front-end pipeline/buffer → Trigger Level 0/1 decision → Readout buffer → Trigger Level 2 decision → Detector Data Link (DDL, tested during the ADC) → Sub-event buffer in the Local Data Concentrators (LDC) → High-Level Trigger decision → Event-building network → Event buffer in the Global Data Collectors (GDC) → Storage network → Transient Data Storage (TDS) → Permanent Data Storage (PDS); indicated bandwidths: 25 GB/s, 2.50 GB/s and 1.25 GB/s]

  8. Achievements (as seen by Alice)
• Sustained bandwidth to tape:
  • Peak of 350 MB/s
  • Production-quality level reached only in the last week of testing
  • 280 MB/s sustained over one day, but with interventions [the goal was 300 MB/s]
• IA-64 machines from openlab successfully integrated in ADC V
• Goal for ADC VI: 450 MB/s

  9. 10 Gbps WAN tests
• Initial breakthrough during Telecom 2003:
  • IPv4 (single/multiple) streams: 5.44 Gbps
  • Linux, Itanium 2 (RX2600), Intel 10 Gbps NIC
  • Also IPv6 (single/multiple) streams
• In February:
  • Again IPv4, but multiple streams (DataTag + Microsoft): 6.25 Gbps
  • Windows XP, Itanium 2 (Tiger-4), S2IO 10 Gbps NIC
• In June (not yet submitted):
  • Again IPv4, and a single stream (DataTag + openlab): 6.55 Gbps
  • Linux, Itanium 2 (RX2600), S2IO NIC
• openlab thus still has a slightly better result than a 4-way NewiSys Opteron box running heavily tuned Windows XP
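The main obstacle for a single TCP stream over such a long-distance 10 Gbps path is the bandwidth-delay product: the socket buffers must hold roughly rate × round-trip time of in-flight data. A minimal sketch of that part of the tuning, with assumed numbers (the record runs also involved kernel, driver and NIC tuning not shown here):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <cstdio>

    int main() {
        // Assumption for illustration: ~6 Gbps over a ~120 ms round-trip path
        // needs roughly 6e9/8 * 0.12 ~ 90 MB of data "in flight".
        int buf_size = 96 * 1024 * 1024;   // 96 MB socket buffers

        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock < 0) { std::perror("socket"); return 1; }

        // Enlarge the send/receive buffers; the kernel-wide limits
        // (e.g. net.core.wmem_max/rmem_max) must also be raised.
        setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &buf_size, sizeof(buf_size));
        setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &buf_size, sizeof(buf_size));

        int granted = 0;
        socklen_t len = sizeof(granted);
        getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &granted, &len);
        std::printf("send buffer granted: %d bytes\n", granted);
        return 0;
    }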

  10. Cluster parallelization
• Parallel ROOT Facility (PROOF):
  • Excellent scalability with 64 processors last year
  • Tests in progress for 128 (or more) CPUs
• MPI software installed (a minimal example follows this slide):
  • Ready for tests with BEAMX (a program similar to Sixtrack)
• Alinghi software also working:
  • Collaboration with a team at EPFL
  • Uses Ansys CFX
• distcc installed and tested:
  • Compilation time reduced for both the GNU and Intel compilers
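For reference, the style of message-passing program that the MPI installation supports looks, in its simplest form, like the sketch below; it is a generic illustration, not code from BEAMX, Sixtrack or the Alinghi/CFX work:

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // total number of processes

        // Each rank computes a partial result; rank 0 collects the sum.
        double partial = rank + 1.0;
        double total = 0.0;
        MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            std::printf("sum over %d ranks = %.1f\n", size, total);

        MPI_Finalize();
        return 0;
    }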

  11. Gridification
• A good success story:
  • Starting point: the software chosen for LCG (VDT + EDG) had been developed only with IA32 (and specific Red Hat versions) in mind
  • Consequence: configure files and makefiles were not prepared for multiple architectures, and source files were not available in the distributions (often not even locatable); a contrived example of the kind of hard-wired assumption involved follows this slide
  • Stephen Eccles and Andreas Unterkircher worked for many months to complete the porting of LCG-2
• Result: all major components now work on Itanium/Linux:
  • Worker Nodes, Compute Elements, Storage Elements, User Interface, etc.
  • Tested inside the EIS Test Grid
  • Code, available via a Web site, transferred to HP sites (initially Puerto Rico and Bristol)
  • Changes given back to the developers
  • VDT now built also for Itanium systems
  • Porting experience summarized in a white paper (on the Web)
• From now on the Grid is heterogeneous!
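The fragment below illustrates the kind of single-architecture assumption that such a port has to generalize; it is purely illustrative and not taken from the VDT or EDG sources:

    #include <cstdio>

    int main() {
        // Code (or build logic) that only ever considered IA32 breaks silently on
        // a new platform; the port has to add an explicit IA-64 case everywhere.
    #if defined(__i386__)
        std::printf("IA32 build\n");
    #elif defined(__ia64__)
        std::printf("Itanium (IA-64) build\n");   // the branch the port had to add
    #else
    #error "Unsupported architecture: please add a case for this platform"
    #endif
        return 0;
    }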

  12. Storage Tank
• Random-access test (mid-March)
• Scenario:
  • 100 GB dataset, randomly accessed in ~50 kB blocks
  • 1–100 2 GHz P4-class clients, running 3–10000 "jobs"
• Hardware:
  • 4 IBM x335 metadata servers
  • 8 IBM 200i controllers, 336 SCSI disks
  • 2 IBM x345 servers added as disk controllers after the test
• Results:
  • Peak data rate: 484 MB/s (with 9855 simultaneous "jobs")
  • After the test, with special tuning, 10 servers and a smaller number of clients: 705 MB/s
• Ready to be used in Alice DC VI
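The access pattern of that scenario is straightforward to emulate with a small client that reads ~50 kB blocks at random offsets in a large file; the sketch below is a generic illustration of such a client, not the actual test harness used:

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    int main(int argc, char **argv) {
        if (argc < 2) { std::fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

        const size_t block = 50 * 1024;   // ~50 kB per read
        const int    reads = 1000;        // number of random reads per "job"

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { std::perror("open"); return 1; }

        off_t file_size = lseek(fd, 0, SEEK_END);
        if (file_size <= (off_t)block) { std::fprintf(stderr, "file too small\n"); return 1; }

        std::vector<char> buf(block);
        size_t total = 0;
        for (int i = 0; i < reads; ++i) {
            // Pick a random offset that leaves room for a full block.
            off_t offset = (off_t)(((double)std::rand() / RAND_MAX) * (file_size - block));
            ssize_t n = pread(fd, &buf[0], block, offset);
            if (n > 0) total += (size_t)n;
        }
        std::printf("read %lu bytes in %d random reads\n", (unsigned long)total, reads);
        close(fd);
        return 0;
    }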

  13. Next-generation disk servers
• Based on state-of-the-art equipment:
  • 4-way Itanium server (RX4640)
  • Two full-speed PCI-X slots: 10 GbE and/or Infiniband
  • Two 3ware 9500 RAID controllers: in excess of 400 MB/s RAID-5 read speed, but only 100 MB/s for RAID-5 writes (200 MB/s with RAID 0)
  • 24 S-ATA disks of 74 GB each (WD740 "Raptor" at 10k rpm), burst speed of 100 MB/s
• Goal: saturate the 10 GbE card for reading (at least 500 MB/s with standard MTU and 20 streams); writing as fast as possible
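A quick sanity check on that goal, with assumed figures only: 500 MB/s over 20 streams is a modest 25 MB/s per stream, but with a standard 1500-byte MTU it still means several hundred thousand packets per second through the NIC and driver:

    #include <cstdio>

    int main() {
        const double target_rate = 500e6;   // target aggregate read rate, bytes/s
        const int    streams     = 20;      // concurrent client streams
        const double mtu_payload = 1460.0;  // rough TCP payload of a standard-MTU packet

        std::printf("per-stream rate : %.1f MB/s\n", target_rate / streams / 1e6);
        std::printf("packet rate     : %.0f packets/s\n", target_rate / mtu_payload);
        return 0;
    }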

  14. Data export to LCG Tier-1/-2
[Map: data distribution at ~70 Gbit/s from CERN to Tier-1/-2 sites, including RAL, IC, Cambridge, IN2P3, IFCA, UB, FNAL, BNL, TRIUMF, FZK, CNAF, Legnaro, Rome, PIC, CIEMAT, USC, NIKHEF, MSU, Prague, Budapest, Krakow, Taipei, ICEPP and CSCS]
• Tests (initially) between CERN and Fermilab + NIKHEF
• Multiple HP Itanium servers with dual NICs
• Disk-to-disk transfers via GridFTP
• Each server: 100 MB/s in + 100 MB/s out
• Aggregation of multiple streams across the 10 GbE link (a rough capacity check follows this slide)
• Similar tuning as in the Internet2 tests
• Possibly try the 4-way 10 GbE server and the Enterasys X-series router
• "Service Data Challenge": stability is paramount, no longer just "raw" speed
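As a rough capacity check on those numbers (assumed, back-of-the-envelope figures only): a 10 GbE link carries about 1.25 GB/s, so at 100 MB/s of output per server it takes on the order of a dozen servers to fill one link, and an eventual ~70 Gbit/s distribution corresponds to several such links:

    #include <cstdio>

    int main() {
        const double link_rate    = 10e9 / 8;   // one 10 GbE link, bytes/s (~1.25 GB/s)
        const double per_server   = 100e6;      // 100 MB/s outbound per server
        const double total_target = 70e9 / 8;   // ~70 Gbit/s aggregate distribution

        std::printf("servers to fill one 10 GbE link : %.1f\n", link_rate / per_server);
        std::printf("10 GbE links for ~70 Gbit/s     : %.1f\n", total_target / link_rate);
        return 0;
    }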

  15. Conclusions
• CERN openlab:
  • Solid collaboration with our industrial partners
  • 6 students, 4 fellows
  • Encouraging results in multiple domains
  • We believe the sponsors are getting a good "ROI", but only they can really confirm it
• No risk of running short of R&D:
  • IT technology is still moving at an incredible pace
  • Vital for LCG that the "right" pieces of technology are available for deployment: performance, cost, resilience, etc.
