Explore the Net100 project funded by DOE to measure and enhance network and application performance, leveraging active network probes and sensors. Join principal investigators to tune network flows and monitor TCP variables for optimal performance.
Net100: developing network-aware operating systems • New (9/01) DOE-funded (Office of Science) project ($1M/yr, 3 yrs) • Principal investigators • Matt Mathis, PSC (mathis@psc.edu) • Brian Tierney, LBNL (bltierney@lbl.gov) • Tom Dunigan, ORNL (thd@ornl.gov) • Objective: • measure and understand end-to-end network and application performance • tune network applications (grid and bulk transfer) • Components • active network probes and passive sensors (leverage Web100) • network metrics database • tuning daemon (WAD) to tune network flows based on network metrics www.net100.org
Net100: applied Web100 • Web100 • Linux 2.4 kernel mods • 100+ TCP variables per flow • Net100 • Add Web100 to iperf/ttcp • Monitoring/tuning daemon • Java applet bandwidth/client tester • fake WWW server provides HTML and applet • applet connects to bwserver • 3 sockets (control, bwin, bwout) • server reports Web100 variables to applet (window sizes, losses, RTT) • Try it: http://firebird.ccs.ornl.gov:7123
Net100 network measurement • Active measurement • Net100 probes at LBL, ORNL, NCAR, PSC, NERSC • scheduled set of path probes (iperf with Web100 mods, traceroute, pipechar) • local and centralized database (NetLogger) • interface to other probers (NIMI, Surveyor, PingER, ?) • Passive measurement • Web100 daemon records TCP info on designated flows • Web100 data collected when flow terminates • Web100 TCP info: losses, timeouts, reordering, cwnd, ssthresh, RTT, … • use NetLogger to report to central database • other passive sensors (SNMP data, LBL’s tcpdump monitor, ?) • Query tools • for dynamic application tuning • for network engineering and statistical studies
Net100: tuning • Work-around Daemon (WAD) Version 0 • use network performance data to tune flows • tune unknowing sender/receiver • config file with “tuning info” ? • Based on Web100/Linux 2.4 • To be done • “applying” measurement info • adding more knobs to kernel • tuning on non-Linux OSes • Related work • Feng’s Dynamic Right Sizing • Linux 2.4 auto-tuning/caching • Mathis TCP buffer tuning
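The "config file with tuning info" idea can be pictured as a lookup from a flow's destination to the settings a daemon would apply. The sketch below is purely illustrative: the table layout, the example networks (documentation address ranges), the buffer values, and the name `buffer_for` are all assumptions, not the WAD's actual file format or interface.

```python
import ipaddress

# Hypothetical tuning table: destination network -> socket buffer size in bytes.
# A real daemon would presumably load this from its config file.
TUNING_TABLE = {
    "192.0.2.0/24": 4 * 1024 * 1024,     # assumed long, fast path
    "198.51.100.0/24": 1 * 1024 * 1024,  # assumed shorter path
}
DEFAULT_BUFFER = 64 * 1024  # assumed fallback for unknown destinations


def buffer_for(dest_ip: str) -> int:
    """Return the buffer size configured for a destination, or the default."""
    addr = ipaddress.ip_address(dest_ip)
    for network, size in TUNING_TABLE.items():
        if addr in ipaddress.ip_network(network):
            return size
    return DEFAULT_BUFFER


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, buffer_for(ip))
```

Keying on the destination lets a daemon tune an "unknowing" sender or receiver without changing the application itself.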
TCP losses • TCP is lossy by design • Changing: bandwidths • 9.6 kb/s … 1.5 Mb/s … 45 … 100 … 1000 … ? Mb/s • Unchanging: • speed of light (RTT) • MTU (still 1500 bytes) • TCP congestion avoidance • recovery after a loss can be very slow on today’s high delay/bandwidth links • proportional to MSS/RTT² • [Figure labels: instantaneous bandwidth, average bandwidth, early startup losses, “Linear recovery at 0.5 Mb/s!”]
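The MSS/RTT² point can be checked with a little arithmetic. The sketch below assumes textbook congestion avoidance (window halved on a loss, then grown by one MSS per RTT, no delayed ACKs). The function names and the example path (1 Gb/s, 100 ms RTT, 1460-byte MSS) are illustrative assumptions, not measurements from the project.

```python
def recovery_slope_bps_per_s(mss_bytes: int, rtt_s: float) -> float:
    """Throughput growth during congestion avoidance, in bits/s gained per second.

    One MSS is added to the window every RTT, so the rate (window / RTT)
    grows by MSS / RTT every RTT, i.e. MSS / RTT^2 per second.
    """
    return mss_bytes * 8 / (rtt_s ** 2)


def recovery_time_s(bandwidth_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Seconds to climb from half the full-pipe window back to the full window."""
    full_window_segments = bandwidth_bps * rtt_s / (8 * mss_bytes)
    rtts_needed = full_window_segments / 2  # one segment regained per RTT
    return rtts_needed * rtt_s


if __name__ == "__main__":
    slope = recovery_slope_bps_per_s(1460, 0.100)
    wait = recovery_time_s(1e9, 0.100, 1460)
    print(f"slope: {slope / 1e6:.2f} Mb/s per second")
    print(f"time to recover from one loss: {wait:.0f} s")
```

Under these assumptions a single loss on the example path costs several minutes of reduced throughput, which is why both loss avoidance and faster recovery matter.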
Net100 tuning • Avoid losses • use “optimal” buffer sizes determined from network measurements • ECN-capable routers/hosts • reduce bursts (TCP Vegas, ?) • Faster recovery • bigger MSS (jumbo frames) • speculative recovery (D-SACK) • modified congestion avoidance? • Autotune (WAD variables) • Buffer sizes • Dupthresh (reordering resilience) • Delayed ACK, Nagle • AIMD • Virtual MSS • initial window, ssthresh • non-TCP solutions (rate-based, ?) (tests with TCP-over-UDP, atou, NERSC to ORNL)
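One common reading of an "optimal" buffer size is the bandwidth-delay product (BDP) of the path. The sketch below assumes that interpretation and applies it through the standard socket options `SO_SNDBUF` and `SO_RCVBUF`. The helper names and the example figures (100 Mb/s, 70 ms RTT) are assumptions for illustration, and this is not how the WAD itself sets buffers.

```python
import socket


def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return round(bandwidth_bps * rtt_s / 8)


def open_tuned_socket(bandwidth_bps: float, rtt_s: float) -> socket.socket:
    """Create a TCP socket whose buffers are requested at the path's BDP.

    The OS may clamp the request to its configured maximum, and Linux
    reports roughly double the requested value, so read it back if it matters.
    """
    size = bdp_bytes(bandwidth_bps, rtt_s)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock


if __name__ == "__main__":
    # Example path figures (assumed): 100 Mb/s bottleneck, 70 ms round trip.
    sock = open_tuned_socket(100e6, 0.070)
    print("requested:", bdp_bytes(100e6, 0.070))
    print("granted SO_SNDBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    sock.close()
```

Buffers are set before connecting because the receive buffer size influences the window scale negotiated at connection setup.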
Net100 status • Completed • network probes at ORNL, PSC, NCAR, LBL, NERSC • preliminary schema for network data • initial Web100 sensor daemon and tuning daemon • In progress • TCP tuning extensions to Linux/Web100 kernel • analysis of TCP tuning options • deriving tuning info from network measurements • Future • interactions with other network measurement sources • multipath/parallel path selection/tuning