The SC2004 Bandwidth Challenge showcased groundbreaking networking technologies, achieving a sustained throughput of 101.13 Gbps. The collaborative efforts of SLAC, Caltech, FNAL, and other leading institutions demonstrated effective data transfer capabilities using advanced hardware including Sun Opteron systems, Chelsio NICs, and Cisco routers. This presentation highlights the experimental setup, challenges faced—including equipment limitations and configuration issues—and the impressive results achieved during the event, underscoring the evolution of high-performance computing networking.
Experiences from SLAC SC2004 Bandwidth Challenge Les Cottrell, SLAC www.slac.stanford.edu/grp/scs/net/talk03/bwc-04.ppt
SLAC/SC04 Bandwidth Challenge, plan C (network diagram summary):
• SLAC/FNAL booth (2418): 6 Sun Opteron/Chelsio-10GE, 2 Sun Opteron/S2io-10GE, 2 Sun 1 GE file servers, 6 Boston 1 GE file servers, loaned Cisco router
• SCinet (PSC): Juniper T320, 2 Sun Opteron/Chelsio-10GE; 10 Gbps NLR wave NLR-PITT-SUNN-10GE-17 (via SEA, DEN, CHI) to the NLR demarc
• Sunnyvale/Level(3)/NLR, 1380 Kifer: 1 Sun Opteron/Chelsio-10GE; 15808/15540 optical gear
• Sunnyvale/Qwest/ESnet, 1400 Kifer: ESnet/QWest OC192/SONET via 15454; SLAC Cisco router; 1 Sun 1 GE file server
SC2004: Tenth of a Terabit/s Challenge
• Joint Caltech, SLAC, FNAL, CERN, UF, SDSC, BR, KR, …
• Ten 10 Gbps waves to HEP booths on the show floor
• Bandwidth challenge: aggregate throughput of 101.13 Gbps
• FAST TCP
Components
• Sun v20z and v40z Opteron servers
• 10 Gbps NICs: Chelsio and S2io, with SR XENPAKs
• 3510 disk array; SVL/NLR connection
• (For contrast: a 1982-era 10 Mbps 3COM NIC)
Challenge aggregates from SCinet
• Aggregate of Caltech & SLAC booths, in & out
• 7 lambdas to Caltech, 3 to SLAC
Challenge aggregates from MonALISA
• Sustained ~10 Gbps for extended periods
To/From SLAC booth
• NLR: 9.43 Gbps (9.07 Gbps goodput) + 5.65 Gbps (5.44 Gbps goodput) in reverse
  – Two hosts to two hosts
• ESnet: 7.72 Gbps (7.43 Gbps goodput)
  – Only one 10 Gbps host at SVL
• Single V40Z host with 2×10GE NICs to 2×V20Z across country got 11.4 Gbps
• S2io and Chelsio (& Cisco & Juniper) all interwork
• Chelsio worked stably on uncongested paths
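The gap between the throughput and goodput figures above is largely per-frame protocol overhead. A rough back-of-the-envelope check; the TCP/IPv4-over-Ethernet overhead model here is an assumption for illustration, not taken from the slides:

```python
# Sketch: relate reported wire throughput to application goodput.
# The measured Gbps figures are from the slides; the per-frame
# overhead model (TCP/IPv4 over 1500 B Ethernet) is an assumption.

MTU = 1500                 # bytes; jumbo frames were not available (see Challenges)
IP_HDR, TCP_HDR = 20, 32   # IPv4 header + TCP header with timestamp option
ETH_OVERHEAD = 38          # preamble(8) + Ethernet header(14) + FCS(4) + gap(12)

payload = MTU - IP_HDR - TCP_HDR
efficiency = payload / (MTU + ETH_OVERHEAD)
print(f"model payload efficiency: {efficiency:.3f}")   # ≈ 0.941

# Measured goodput/throughput ratios from the challenge results:
for label, thr, good in [("NLR fwd", 9.43, 9.07),
                         ("NLR rev", 5.65, 5.44),
                         ("ESnet  ", 7.72, 7.43)]:
    print(f"{label}: goodput/throughput = {good / thr:.3f}")
```

The measured ratios come out a little above the model's, consistent with goodput being counted at a layer that excludes some of the modeled framing overhead.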
TOE
• Chelsio NIC had a TCP Offload Engine (TOE)
• CPU utilization scaled with throughput & number of parallel streams
• Reduced CPU load by a factor of ~3 compared with the non-TOE S2io NIC
Challenges
• Could not get 10 Gbps waves to SLAC itself, only to SVL
• Equipment in 3 locations
  – Keeping configs in lock-step (no NFS, no name service)
• Security concerns; used iptables
• Machines only available 2 weeks before, some not until we got to SC04
• Jumbo frames not configured correctly at SLAC booth, so mainly used 1500 B frames
• Mix of hardware/software: Opterons with various GHz & disks, Xeons; Solaris 10, Linux 2.4, 2.6
• Coordination between booths (separated by 100 yds)
• Everything state of the art (Linux 2.6.6, SR XENPAKs, NICs)
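The jumbo-frame item is worth quantifying: at a fixed wire rate, larger frames raise payload efficiency and cut the per-packet load on the hosts. A sketch, assuming illustrative TCP/IPv4-over-Ethernet overhead figures that are not taken from the presentation:

```python
# Sketch: cost of running 1500 B frames instead of 9000 B jumbo frames.
# Overhead constants (Ethernet framing + IPv4 + TCP headers) are assumed
# illustrative values, not from the slides.

ETH_OVERHEAD = 38   # preamble(8) + Ethernet header(14) + FCS(4) + gap(12)
HDRS = 52           # IPv4 header (20) + TCP header with timestamps (32)

def wire_efficiency(mtu: int) -> float:
    """Fraction of bits on the wire that carry application payload."""
    return (mtu - HDRS) / (mtu + ETH_OVERHEAD)

for mtu in (1500, 9000):
    print(f"MTU {mtu:5d}: efficiency {wire_efficiency(mtu):.4f}")

# At the same wire rate, 9000 B frames also mean roughly 6x fewer
# packets per second, reducing per-packet interrupt and CPU load.
```

So the missing jumbo-frame configuration cost a few percent of raw efficiency, but the larger penalty was the ~6x higher packet rate the hosts had to sustain at 10 Gbps.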