
Network Measurement & Characterisation and the Challenge of SuperComputing SC200x


Presentation Transcript


  1. Network Measurement & Characterisation and the Challenge of SuperComputing SC200x
  ESLEA, Bedfont Lakes, Dec 04. Richard Hughes-Jones

  2. Bandwidth Lust at SC2003
  • Working with S2io, Cisco & folks
  • The SC Network
  • At the SLAC Booth running the BW Challenge

  3. The Bandwidth Challenge at SC2003
  • The peak aggregate bandwidth from the 3 booths was 23.21 Gbit/s
  • 1-way link utilisations of >90%
  • 6.6 TBytes transferred in 48 minutes (average rate worked out below)
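For context (this arithmetic is not on the original slide), the average rate implied by the total volume can be set against the quoted peak:

    # Average rate over the challenge period vs the quoted peak
    tbytes, minutes = 6.6, 48                    # TByte taken as 10^12 bytes
    avg_gbit_s = tbytes * 8e12 / (minutes * 60) / 1e9
    print(f"~{avg_gbit_s:.1f} Gbit/s average vs 23.21 Gbit/s peak")   # ~18.3 Gbit/s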

  4. Multi-Gigabit flows at SC2003 BW Challenge
  • Three server systems with 10 Gigabit Ethernet NICs
  • Used the DataTAG altAIMD stack, 9000 byte MTU
  • Sent mem-mem iperf TCP streams from the SLAC/FNAL booth in Phoenix to the sites below (window sizing sketched after this slide):
  • Palo Alto PAIX
    • rtt 17 ms, window 30 MB
    • Shared with the Caltech booth
    • 4.37 Gbit HighSpeed TCP I=5%
    • Then 2.87 Gbit I=16%
    • Fall when 10 Gbit on the link
    • 3.3 Gbit Scalable TCP I=8%
    • Tested 2 flows, sum 1.9 Gbit I=39%
  • Chicago Starlight
    • rtt 65 ms, window 60 MB
    • Phoenix CPU 2.2 GHz
    • 3.1 Gbit HighSpeed TCP I=1.6%
  • Amsterdam SARA
    • rtt 175 ms, window 200 MB
    • Phoenix CPU 2.2 GHz
    • 4.35 Gbit HighSpeed TCP I=6.9%
    • Very stable
  • Both used Abilene to Chicago
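The window sizes quoted for each path follow from the bandwidth-delay product (rate x rtt). A minimal sketch of that sizing arithmetic, not from the original slides, using the quoted rtt values and an assumed 5 Gbit/s target per flow:

    # Sender/receiver window needed to keep a path full: window >= rate * rtt
    def window_mbyte(rate_gbit_s, rtt_ms):
        bytes_per_s = rate_gbit_s * 1e9 / 8
        return bytes_per_s * (rtt_ms / 1e3) / 1e6

    for site, rtt in [("Palo Alto PAIX", 17), ("Chicago Starlight", 65), ("Amsterdam SARA", 175)]:
        print(f"{site}: ~{window_mbyte(5, rtt):.0f} MByte needed for 5 Gbit/s")
    # ~11, ~41 and ~109 MByte; the 30, 60 and 200 MB windows above leave headroom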

  5. Super Computing 2004

  6. UKLight at SC2004
  • UK e-Science researchers from Manchester, UCL & ULCC involved in the Bandwidth Challenge
  • Collaborated with scientists & engineers from Caltech, CERN, FERMI, SLAC, Starlight, UKERNA & U. of Florida
  • Worked on:
    • 10 Gbit Ethernet link from SC2004 to the ESnet/QWest PoP in Sunnyvale
    • 10 Gbit Ethernet link from SC2004 to the CENIC/NLR/Level(3) PoP in Sunnyvale
    • 10 Gbit Ethernet link from SC2004 to Chicago and on to UKLight
  • UKLight focused on disk-to-disk transfers between UK sites and Pittsburgh
  • UK had generous support from Boston Ltd, who loaned the servers
  • The BWC collaboration had support from:
    • S2io (NICs)
    • Chelsio (TOE)
    • Sun, who loaned servers
    • Essential support from Cisco

  7. Collaboration at SC2004
  • Working with S2io, Sun, Chelsio
  • Setting up the BW Bunker
  • SCinet
  • The BW Challenge at the SLAC Booth

  8. The Bandwidth Challenge – SC2004
  • The peak aggregate bandwidth from the booths was 101.13 Gbit/s
  • That is about 3 full-length DVDs per second (check below)
  • Saturated TEN 10GE waves
  • SLAC Booth: Sunnyvale to Pittsburgh, LA to Pittsburgh and Chicago to Pittsburgh (with UKLight)
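A quick check of the DVD comparison (not from the slides), assuming a single-layer 4.7 GByte DVD:

    peak_gbit_s = 101.13
    dvd_gbyte = 4.7                      # single-layer DVD capacity (assumption)
    print(f"~{peak_gbit_s / 8 / dvd_gbyte:.1f} DVDs per second")   # ~2.7, i.e. roughly 3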

  9. SC2004 UKLIGHT – Focused on Disk-to-Disk
  [Network diagram: SLAC and Caltech booths at SC2004 (Cisco 6509, 7600 OSR) connected over the NLR lambda NLR-PITT-STAR-10GE-16 and UltraLight IP, via Chicago Starlight to UKlight 10G and on to ULCC, the UCL network / UCL HEP, Manchester and MB-NG Managed Bandwidth; also CERN 7600 over Surfnet/EuroLink 10G and Amsterdam; K2 and Ci switches; four and two 1GE channels]

  10. Transatlantic Ethernet: TCP Throughput Tests
  • Supermicro X5DPE-G2 PCs
  • Dual 2.9 GHz Xeon CPUs, FSB 533 MHz
  • 1500 byte MTU
  • 2.6.6 Linux kernel
  • Memory-to-memory TCP throughput
  • Standard TCP
  • Wire rate throughput of 940 Mbit/s (overhead arithmetic after this slide)
  • First 10 sec
  • Work in progress to study:
    • Implementation detail
    • Advanced stacks
    • Packet loss
    • Sharing
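The 940 Mbit/s figure is what a fully loaded Gigabit Ethernet link can deliver to TCP once per-packet overheads are removed. A small illustrative calculation, not from the slides, assuming 20-byte IP and 20-byte TCP headers plus the 12-byte TCP timestamp option, and 38 bytes of Ethernet framing per packet:

    # TCP goodput ceiling on Gigabit Ethernet with a 1500 byte MTU
    mtu = 1500
    ip_hdr, tcp_hdr, tcp_ts = 20, 20, 12        # TCP timestamp option assumed on
    eth_overhead = 38                           # preamble + SFD + header + CRC + inter-frame gap
    payload = mtu - ip_hdr - tcp_hdr - tcp_ts   # 1448 bytes of user data per segment
    frame = mtu + eth_overhead                  # 1538 bytes on the wire per frame
    print(f"~{1e9 * payload / frame / 1e6:.0f} Mbit/s")   # ~941 Mbit/s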

  11. Transatlantic Ethernet: Disk-to-Disk Tests
  • Supermicro X5DPE-G2 PCs
  • Dual 2.9 GHz Xeon CPUs, FSB 533 MHz
  • 1500 byte MTU
  • 2.6.6 Linux kernel
  • RAID0 (6 SATA disks)
  • bbftp (disk-to-disk) throughput
  • Standard TCP
  • Throughput of 436 Mbit/s
  • First 10 sec
  • Work in progress to study:
    • Throughput limitations (disk-read sketch after this slide)
    • Help real users
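One way to narrow down the throughput limitations is to check whether the RAID0 array itself can feed the link. A minimal sketch, not from the slides; the test-file path is a hypothetical placeholder, and the file should be larger than RAM so the page cache does not flatter the result:

    # Rough sequential-read bandwidth of the RAID0 array
    import time

    path = "/raid0/testfile"            # hypothetical test file on the array
    chunk = 4 * 1024 * 1024             # read in 4 MByte chunks

    total, start = 0, time.time()
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = time.time() - start
    print(f"~{total * 8 / elapsed / 1e6:.0f} Mbit/s sequential read")

If the array reads well above 436 Mbit/s, the limit is more likely in the TCP path, the disk-write side at the receiver, or bbftp itself.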

  12. 10 Gigabit Ethernet: UDP Throughput Tests
  • 1500 byte MTU gives ~2 Gbit/s
  • Used 16144 byte MTU, max user length 16080 (a minimal UDP test sketch follows this slide)
  • DataTAG Supermicro PCs
    • Dual 2.2 GHz Xeon CPUs, FSB 400 MHz
    • PCI-X mmrbc 512 bytes
    • Wire rate throughput of 2.9 Gbit/s
  • CERN OpenLab HP Itanium PCs
    • Dual 1.0 GHz 64-bit Itanium CPUs, FSB 400 MHz
    • PCI-X mmrbc 512 bytes
    • Wire rate of 5.7 Gbit/s
  • SLAC Dell PCs
    • Dual 3.0 GHz Xeon CPUs, FSB 533 MHz
    • PCI-X mmrbc 4096 bytes
    • Wire rate of 5.4 Gbit/s
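The measurement method is simple: blast UDP datagrams of a fixed size one way and count what arrives over a fixed interval. The original tests used a dedicated tool; the Python sketch below only illustrates the idea, with the port number an assumption:

    # Minimal one-way UDP throughput test: run receiver() on one host,
    # then sender("<receiver-host>") on the other.
    import socket, time

    SIZE, COUNT, PORT = 16080, 100000, 5001   # 16080 byte user data, as above

    def sender(dest):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        payload = b"\0" * SIZE
        for _ in range(COUNT):
            s.sendto(payload, (dest, PORT))

    def receiver(interval=10.0):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", PORT))
        s.settimeout(1.0)
        got, start = 0, time.time()
        while time.time() - start < interval:
            try:
                got += len(s.recv(65535))
            except socket.timeout:
                pass
        print(f"~{got * 8 / interval / 1e6:.0f} Mbit/s received")

Larger datagrams mean fewer packets per second for a given rate, which is why the jumbo MTU matters so much in the figures above.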

  13. 10 Gigabit Ethernet: Tuning PCI-X
  [Logic-analyser traces of the PCI-X sequence (CSR access, data transfer, interrupt & CSR update) for mmrbc of 512, 1024, 2048 and 4096 bytes; 5.7 Gbit/s reached at mmrbc 4096 bytes]
  • 16080 byte packets every 200 µs
  • Intel PRO/10GbE LR Adapter
  • PCI-X bus occupancy vs mmrbc (see the note after this slide)
  • Measured times
  • Times based on PCI-X times from the logic analyser
  • Expected throughput ~7 Gbit/s
  • Measured 5.7 Gbit/s
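mmrbc is the PCI-X maximum memory read byte count, i.e. the largest burst the NIC may request per transaction; larger bursts amortise the fixed per-transaction overhead visible in the traces. The quoted rates can be put against the raw bus rate with a little arithmetic (illustrative only, not a bus model):

    # PCI-X 64 bit at 133 MHz: raw burst rate and implied bus efficiency
    raw_gbit_s = 64 * 133.33e6 / 1e9                      # ~8.5 Gbit/s
    for label, rate in [("expected", 7.0), ("measured", 5.7)]:
        print(f"{label} {rate} Gbit/s -> {100 * rate / raw_gbit_s:.0f}% of raw PCI-X")

On Linux the mmrbc setting can usually be changed with setpci on the adapter's PCI-X command register; the exact register offset and value depend on the device.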

  14. 10 Gigabit Ethernet: SC2004 TCP Tests
  • Sun AMD Opteron compute servers (V20z)
  • Chelsio TOE tests between Linux 2.6.6 hosts
  • 10 Gbit Ethernet link from SC2004 to the CENIC/NLR/Level(3) PoP in Sunnyvale
    • Two 2.4 GHz AMD 64-bit Opteron processors with 4 GB of RAM at SC2004
    • 1500B MTU, all Linux 2.6.6
    • In one direction 9.43 Gbit/s, i.e. 9.07 Gbit/s goodput (goodput check after this slide)
    • In the reverse direction 5.65 Gbit/s, i.e. 5.44 Gbit/s goodput
    • Total of 15+ Gbit/s on the wire
  • 10 Gbit Ethernet link from SC2004 to the ESnet/QWest PoP in Sunnyvale
    • One 2.4 GHz AMD 64-bit Opteron at each end
    • 2 MByte window, 16 streams, 1500B MTU, all Linux 2.6.6
    • In one direction 7.72 Gbit/s, i.e. 7.42 Gbit/s goodput
    • 120 mins (6.6 Tbits shipped)
  • S2io NICs with Solaris 10 in a 4 x 2.2 GHz Opteron V40z to one or more S2io or Chelsio NICs with Linux 2.6.5 or 2.6.6 in 2 x 2.4 GHz V20zs
    • LAN 1: S2io NIC back to back: 7.46 Gbit/s
    • LAN 2: S2io in V40z to 2 V20zs: each NIC ~6 Gbit/s, total 12.08 Gbit/s
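The gap between the on-wire figures and goodput above is per-frame overhead. A quick check, not from the slides, assuming the wire rate counts full 1518 byte Ethernet frames and goodput counts the 1460 byte TCP payload (no TCP options) of each 1500 byte packet:

    payload, frame = 1460, 1518     # 1500 MTU minus 20 IP + 20 TCP; plus 18 bytes Ethernet
    for wire in (9.43, 5.65, 7.72):
        print(f"{wire} Gbit/s on wire -> ~{wire * payload / frame:.2f} Gbit/s goodput")
    # ~9.07, ~5.43 and ~7.43 Gbit/s, closely matching the quoted goodputs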

  15. UKLight and ESLEA
  • Collaboration forming for SC2005
    • Caltech, CERN, FERMI, SLAC, Starlight, UKLight, …
  • Current proposals include:
    • Bandwidth Challenge with even faster disk-to-disk transfers between UK sites and SC2005
    • Radio astronomy demo at 512 Mbit or 1 Gbit user data: Japan, Haystack (MIT), Jodrell Bank, JIVE
    • High-bandwidth link-up between UK and US HPC systems
    • 10 Gig NLR wave to Seattle
  • Set up a 10 Gigabit Ethernet test bench
    • Experiments (CALICE) need to investigate >25 Gbit to the processor
  • ESLEA/UKlight need resources to study:
    • New protocols and congestion / sharing
    • The interaction between protocol processing, applications and storage
    • Monitoring L1/L2 behaviour in hybrid networks

  16. ESLEA, Bedfont Lakes, Dec 04. Richard Hughes-Jones
