
MB-NG & DataTAG: High Bandwidth, High Throughput in e-Science Projects

This presentation covers the topology of the MB-NG and DataTAG networks, tests on end hosts, data transfers over the PCI bus, interrupt coalescence, and network bottlenecks. It also compares TCP protocol stacks and reports application throughput in e-Science projects.


Presentation Transcript


  1. High Bandwidth High Throughput in the MB-NG & DataTAG Projects
     Richard Hughes-Jones, Stephen Dallison, Gareth Fairey (Dept. of Physics and Astronomy, University of Manchester)
     Robin Tasker (Daresbury Laboratory, CLRC)
     Miguel Rio, Yee Ting Li (Dept. of Physics and Astronomy, University College London)
     e-Science All Hands Meeting, 1-4 Sep 03. R. Hughes-Jones, Manchester

  2. Topology of the MB-NG Network
     [Figure: network diagram. Labels: UCL Domain, Manchester Domain, RAL Domain; Boundary Router Cisco 7609 (twice); Edge Router Cisco 7609; UKERNA Development Network; end hosts man01, man02, man03, lon01, lon02, lon03. Key: Gigabit Ethernet, 2.5 Gbit POS access, MPLS, admin. domains.]

  3. DataTAG Testbed
     [Figure: testbed diagram linking Geneva (CERN) and Chicago. Labelled hosts include w01gva-w06gva, w20gva, v02gva, v03gva, w01chi-w06chi and v10chi-v13chi. Labelled equipment includes Cisco 7606 and 7609, Juniper M10, Alcatel 7770, Alcatel 1670, ONS15454, and Extreme S1i and S5i units. Labelled external networks include GEANT, SURFNET, CESNET, CNAF, CANARIE, SWITCH and VTHD/INRIA. Key: 1000baseSX, 1000baseT, 10GbaseLX, SDH/Sonet, CCC tunnel. Diagram credit: edoardo.martelli@cern.ch, last update 20030701.]

  4. End Hosts: how good are they really?

  5. End Hosts b2b & end-to-end UDP Tests
     • Test with UDPmon
     • Supermicro P4DP6
     • Max throughput 975 Mbit/s
     • 20% CPU utilisation on the receiver for packets > 1000 bytes
     • 40% CPU utilisation for smaller packets
     • PCI: 64 bit 66 MHz
     • Latency 6.1 ms & well behaved
     • Latency slope 0.0761 µs/byte
     • B2B expect: 0.0118 µs/byte
       • PCI 0.00188
       • GigE 0.008
       • PCI 0.00188
     • 6 routers
     • Jitter small: 2-3 µs FWHM
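The expected back-to-back latency slope on this slide can be checked with a little arithmetic. The sketch below assumes each byte crosses the sender's PCI bus, the Gigabit Ethernet link and the receiver's PCI bus in turn, and that the 64 bit 66 MHz bus runs at a nominal 66.67 MHz with no protocol overhead. The function and constant names are illustrative and do not come from the slides.

```python
def us_per_byte(bytes_per_second: float) -> float:
    """Time to move one byte, in microseconds, at the given byte rate."""
    return 1e6 / bytes_per_second


# Assumption: 64-bit (8-byte) PCI bus at a nominal 66.67 MHz, no bus overhead.
PCI_BYTES_PER_S = 8 * 66.67e6
# Gigabit Ethernet: 1e9 bit/s is 125e6 bytes/s.
GIGE_BYTES_PER_S = 1e9 / 8

pci_slope = us_per_byte(PCI_BYTES_PER_S)
gige_slope = us_per_byte(GIGE_BYTES_PER_S)

# Sender PCI + wire + receiver PCI, as in the slide's three components.
expected_b2b_slope = pci_slope + gige_slope + pci_slope

if __name__ == "__main__":
    print(f"PCI component : {pci_slope:.5f} us/byte")
    print(f"GigE component: {gige_slope:.5f} us/byte")
    print(f"Expected B2B  : {expected_b2b_slope:.5f} us/byte")
```

Under these assumptions the three components agree with the slide's 0.00188, 0.008 and 0.0118 µs/byte to within rounding.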

  6. Signals on the PCI bus
     • 1472 byte packets every 15 µs, Intel Pro/1000
     • PCI: 64 bit 33 MHz
       • 82% usage
     • PCI: 64 bit 66 MHz
       • 65% usage
       • Data transfers half as long
     [Figure: two PCI bus traces, each labelled "Data Transfers", "Send setup", "Send PCI", "Receive PCI" and "Receive Transfers".]
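As a rough check on the numbers above, the sketch below computes the user data rate implied by one 1472 byte packet every 15 µs, and the ideal time to move one such packet across a 64 bit PCI bus at 33 MHz and 66 MHz. It assumes one 8-byte word per bus cycle and ignores arbitration, addressing and descriptor traffic, so it does not reproduce the 82% and 65% usage figures, which were measured. All names are illustrative.

```python
PACKET_BYTES = 1472        # packet size quoted on the slide
PACKET_SPACING_US = 15.0   # one packet every 15 microseconds
BUS_WIDTH_BYTES = 8        # 64-bit PCI moves 8 bytes per cycle (idealised)


def payload_rate_mbit_s(packet_bytes: float, spacing_us: float) -> float:
    """User data rate; bits per microsecond is numerically Mbit/s."""
    return packet_bytes * 8 / spacing_us


def ideal_burst_time_us(packet_bytes: float, bus_mhz: float) -> float:
    """Ideal time to move one packet over the bus, ignoring all overheads."""
    cycles = packet_bytes / BUS_WIDTH_BYTES
    return cycles / bus_mhz  # MHz is cycles per microsecond


rate = payload_rate_mbit_s(PACKET_BYTES, PACKET_SPACING_US)
t33 = ideal_burst_time_us(PACKET_BYTES, 33.0)
t66 = ideal_burst_time_us(PACKET_BYTES, 66.0)

if __name__ == "__main__":
    print(f"User data rate      : {rate:.1f} Mbit/s")
    print(f"Burst time at 33 MHz: {t33:.2f} us")
    print(f"Burst time at 66 MHz: {t66:.2f} us")
```

Doubling the bus clock halves the ideal burst time, which is consistent with the slide's remark that data transfers take half as long at 66 MHz.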

  7. Interrupt Coalescence Investigations
     • Kernel parameters for socket buffer size: rtt*BW
     • TCP mem-mem lon2-man1
     • Tx 64, Tx-abs 64, Rx 0, Rx-abs 128: 820-980 Mbit/s ± 50 Mbit/s
     • Tx 64, Tx-abs 64, Rx 20, Rx-abs 128: 937-940 Mbit/s ± 1.5 Mbit/s
     • Tx 64, Tx-abs 64, Rx 80, Rx-abs 128: 937-939 Mbit/s ± 1 Mbit/s
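The rtt*BW rule for the socket buffer size is the bandwidth-delay product: the number of bytes that must be in flight to keep the path full. The sketch below is a minimal illustration that assumes a 1 Gbit/s bottleneck and uses the round-trip times quoted elsewhere in the talk (6.2 ms for MB-NG, 119 ms for DataTAG). The function name is hypothetical.

```python
def bdp_bytes(rtt_seconds: float, bandwidth_bit_s: float) -> float:
    """Bandwidth-delay product in bytes: rtt * BW, converted from bits."""
    return rtt_seconds * bandwidth_bit_s / 8


LINK_BIT_S = 1e9  # assumption: Gigabit Ethernet end hosts set the bottleneck

mbng_buffer = bdp_bytes(0.0062, LINK_BIT_S)    # MB-NG, rtt 6.2 ms
datatag_buffer = bdp_bytes(0.119, LINK_BIT_S)  # DataTAG, rtt 119 ms

if __name__ == "__main__":
    print(f"MB-NG   rtt*BW: {mbng_buffer / 1e3:.0f} kbytes")
    print(f"DataTAG rtt*BW: {datatag_buffer / 1e6:.1f} Mbytes")
```

With these inputs the MB-NG value comes out near the 750 kbytes quoted on the Gridftp slide. The small difference depends on the exact rtt and line rate assumed.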

  8. txqueuelen-vs-sendstalls
     • Tx queue located between IP stack & NIC driver
     • TCP treats "Queue full" as congestion!
     • Results for Lon-Man
     • Select txqueuelen = 2000

  9. Network Investigations

  10. Network Bottlenecks
      • Backbones 2.5 and 10 Gbit: usually good (in Europe)
      • Access links need care: GEANT-NRN and Campus - SuperJANET4
      • NNW - SJ4 Access, given as an example of good forward planning
        [Plot labels: 10 November 2002, 1 Gbit link; 24 February 2003; 26 Feb 2003, upgraded to 2.5 Gbit]
      • Trunking: use of multiple 1 Gbit Ethernet links

  11. 24 Hours HighSpeed TCP mem-mem
      • TCP mem-mem lon2-man1
      • Tx 64, Tx-abs 64, Rx 64, Rx-abs 128
      • 941.5 Mbit/s ± 0.5 Mbit/s

  12. TCP sharing man1-lon2
      • 1 stream every 60 s:
        • man1-lon2
        • man2-lon2
        • man3-lon2
      • Sample every 10 ms
      • 1 Stream:
        • Average 940 Mbit/s
        • No Dup ACKs
        • No SACKs
        • No Sendstalls
      • 2 Streams:
        • Average ~500 Mbit/s
        • Many Dup ACKs
        • Cwnd reduced
      • 3 Streams:
        • Average ~300 Mbit/s

  13. 2 TCP streams man1-lon2
      • 2 Streams:
        • Dips in throughput due to Dup ACK
        • ~4 losses/sec
        • A bit regular?
      • Cwnd decreases:
        • 1 point 33%
        • Ramp starts at 62%
        • Slope 70 bytes/µs
      [Plot annotation: 1 sec]

  14. TCP Protocol Stack Comparisons
      • Standard TCP, HighSpeed TCP, Scalable TCP
      • Kernel on the receiver dropped packets periodically
      • MB-NG network: rtt 6.2 ms, recovery time 1.6 s
      • DataTAG network: rtt 119 ms, recovery time 590 s (9.8 min)
      • Throughput on the DataTAG network was a factor of ~5 lower than on the MB-NG network
      [Figure: two plot panels, labelled MB-NG and DataTAG.]
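The two recovery times can be reproduced with the usual model of standard TCP congestion avoidance: a loss halves the congestion window, which then grows by one packet per round trip. The slide does not say how its figures were derived, so the sketch below is an assumption-laden check, taking a 1 Gbit/s path and 1500 byte packets. The function name is hypothetical.

```python
def standard_tcp_recovery_s(rtt_s: float, link_bit_s: float,
                            packet_bytes: int = 1500) -> float:
    """Time for standard TCP to regain full rate after a single loss.

    Model: the window that fills the pipe is W = link * rtt / packet size.
    A loss halves it to W/2; regaining the lost W/2 packets at one packet
    per round trip takes (W/2) round trips.
    """
    window_packets = link_bit_s * rtt_s / (packet_bytes * 8)
    return (window_packets / 2) * rtt_s


LINK_BIT_S = 1e9  # assumption: 1 Gbit/s bottleneck on both paths

mbng_recovery = standard_tcp_recovery_s(0.0062, LINK_BIT_S)    # rtt 6.2 ms
datatag_recovery = standard_tcp_recovery_s(0.119, LINK_BIT_S)  # rtt 119 ms

if __name__ == "__main__":
    print(f"MB-NG   recovery: {mbng_recovery:.1f} s")
    print(f"DataTAG recovery: {datatag_recovery:.0f} s "
          f"({datatag_recovery / 60:.1f} min)")
```

Because the recovery time scales with the square of the rtt, the roughly 19 times longer DataTAG round trip gives a recovery time several hundred times longer than on MB-NG.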

  15. Application Throughput

  16. MB-NG SuperJANET4 Development Network
      [Figure: network diagram. Labels: MAN, MCC, UCL; SJ4 Dev (four nodes); OSM-1OC48-POS-SS (twice); PCs, two with 3ware RAID0. Key: Gigabit Ethernet, 2.5 Gbit POS access, 2.5 Gbit POS core, MPLS, admin. domains.]

  17. Gridftp Throughput HighSpeedTCP
      • RAID0 disk tests:
        • 120 Mbytes/s read
        • 100 Mbytes/s write
      • Int Coal 64 128
      • Txqueuelen 2000
      • TCP buffer 1 Mbyte (rtt*BW = 750 kbytes)
      • Interface throughput
      • Data rate: 520 Mbit/s
      • Same for B2B tests
      • So it's not that simple!
      [Plot legend: TCP ACK traffic, Data traffic.]

  18. Gridftp Throughput + Web100
      • Throughput Mbit/s:
        • Alternates between 600/800 Mbit/s and zero
      • Cwnd smooth
      • No dup Ack / send stall / timeouts

  19. http data transfers HighSpeed TCP
      • Bulk data moved by web servers
      • Apache web server out of the box!
      • Prototype client: curl http library
      • 1 Mbyte TCP buffers
      • 2 Gbyte file
      • Throughput ~720 Mbit/s
      • Cwnd: some variation
      • No dup Ack / send stall / timeouts

  20. BaBar Case Study: Disk Performance
      • BaBar disk server
        • Tyan Tiger S2466N motherboard
        • 1 64 bit 66 MHz PCI bus
        • Athlon MP2000+ CPU
        • AMD-760 MPX chipset
        • 3Ware 7500-8 RAID5
        • 8 × 200 GB Maxtor IDE 7200 rpm disks
      • Note the VM parameter readahead max
      • Disk to memory (read): max throughput 1.2 Gbit/s (150 MBytes/s)
      • Memory to disk (write): max throughput 400 Mbit/s (50 MBytes/s) [not as fast as Raid0]

  21. BaBar Case Study: Throughput & PCI Activity
      • 3Ware forces PCI bus to 33 MHz
      • BaBar Tyan to MB-NG SuperMicro: network mem-mem 619 Mbit/s
      • Disk-to-disk throughput with bbcp: 40-45 Mbytes/s (320-360 Mbit/s)
      • PCI bus effectively full!
      [Figure: PCI activity traces, labelled "Read from RAID5 Disks" and "Write to RAID5 Disks".]

  22. Conclusions
      The MB-NG Project has achieved:
      • Continuous memory to memory data transfers with an average user data rate of 940 Mbit/s for over 24 hours using the HighSpeed TCP stack.
      • Sustained high throughput data transfers of 2 GByte files between RAID0 disk systems using Gridftp and bbcp.
      • Transfers of 2 GByte files using the http protocol from the standard Apache web server and HighSpeed TCP that achieved data rates of ~725 Mbit/s.
      • Ongoing operation and comparison of different transport protocols - optical switched networks
      • Detailed investigation of routers, NICs & end-host performance.
      • Working with e-Science groups to get high performance to the user.
      • Sustained data flows at Gigabit rates are achievable
      • Use server quality PCs, not supermarket PCs, + care with interfaces
      • Be kind to the Wizards!

  23. More Information: Some URLs
      • MB-NG project web site: http://www.mb-ng.net/
      • DataTAG project web site: http://www.datatag.org/
      • UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
      • Motherboard and NIC tests: www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt and http://datatag.web.cern.ch/datatag/pfldnet2003/
      • TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html and http://www.psc.edu/networking/perf_tune.html

  24. Backup Slides

  25. EU Review Demo
      Consisted of:
      • Data over TCP streams, with GridFTP and a Raid0 disk at each end
      • Dante Monitoring
      • Site Monitoring
      • Node Monitoring

  26. Throughput on the day!
      • ~400 Mbit/s
      [Plot legend: TCP ACKs, Data.]

  27. Some Measurements of Throughput CERN-SARA
      • Using the GÉANT backup link
      • 1 GByte file transfers
      • Plot colours: blue = data, red = TCP ACKs
      • Standard TCP
        • Average throughput 167 Mbit/s
        • Users see 5-50 Mbit/s!
      • High-Speed TCP
        • Average throughput 345 Mbit/s
      • Scalable TCP
        • Average throughput 340 Mbit/s

  28. What the Users Really Find:
      • CERN - RAL using production GÉANT
      • CMS tests, 8 streams
      • 50 Mbit/s @ 15 MB buffer
      • Firewall 100 Mbit/s
      • NNW - SJ4 Access
      • 1 Gbit link
