Presentation Transcript

    Slide 1: End-to-end Monitoring of High Performance Network Paths

    Les Cottrell, Connie Logg, Jerrod Williams, Jiri Navratil, SLAC, for the ESCC meeting, Columbus, Ohio, July 2004. www.slac.stanford.edu/grp/scs/net/talk03/escc-jul04.ppt. Partially funded by the DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM).

    Modern data-intensive science such as HENP requires the ability to copy large amounts of data between collaborating sites. This in turn requires high-performance, reliable end-to-end network paths and the ability to take advantage of them. End-users thus need both long-term and near real-time estimates of the network and application performance of such paths for planning, setting expectations, and trouble-shooting. The IEPM-BW (Internet End-to-end Performance Monitoring - BandWidth) project was instigated in 2001 to meet the above needs for the BaBar HENP community. This produced a toolkit for monitoring Round Trip Times (RTT), TCP throughput (iperf), file copy throughput (bbftp, bbcp and GridFTP), traceroute and, more recently, lightweight cross-traffic and available bandwidth measurements (ABwE). Since then it has been extended to LHC, CDF, D0, ESnet, Grid, and high-performance network Research & Education sites; about 60-70 paths are now being monitored (including about 50 remote sites), and the monitoring toolkit has been installed at ten sites and is in production at three or four, in particular FNAL (for CMS, CDF and D0) and SLAC (for BaBar and PPDG). Each monitoring site is relatively independent, and the monitoring is designed to map to the tiering of modern HENP sites, i.e. it is hierarchical rather than full mesh. The monitoring toolkit is installed at the site, and its contact chooses the remote hosts it wishes to monitor. Work is in progress to analyze and visualize the traceroute measurements and to automatically detect anomalous step-down changes in bandwidth.

    Slide 2:Need

    - Data-intensive science (e.g. HENP) needs to share data at high speeds
    - Needs high-performance, reliable e2e paths and the ability to use them
    - End users need long- and short-term estimates of network and application performance for planning, setting expectations & trouble-shooting
    - "You can't manage what you can't measure"

    Slide 3:IEPM-BW

    Toolkit: enables regular E2E measurements with user-selectable:
    - Tools: iperf (single & multi-stream), bbftp, bbcp, GridFTP, ping (RTT), traceroute
    - Periods (with randomization)
    - Remote hosts to monitor
    Hierarchical, to match the tiered approach of the BaBar & LHC computation/collaboration infrastructures. Includes:
    - Auto-cleanup of hung processes at both ends
    - Management tools to look for failures (unreachable hosts, failing tools etc.)
    - Web navigation of results
    - Visualization of data as time-series, histograms, scatter plots, tables
    - Access to data in machine-readable form
    - Documentation on host requirements etc., program logic, manuals, methods

    Slide 4:Requirements

    Requires:
    - Monitoring toolkit installed on a Linux monitoring host
      - Host provided & administered by monitoring-site personnel; no need for root privileges
      - Appropriate iperf, bbftp etc. ports to be opened
      - SLAC can do the initial install & configuration for the monitoring host
      - A ~50-line configuration file for each remote host tells where directories and applications are located, options for the various tools etc. (mainly defaults)
    - Small toolkit installed at the remote (monitored) hosts
    - SSH access to an account at the remote hosts
      - This is the biggest problem with deployment

    Slide 5:Achievable throughput & file transfer

    [Time-series plot: IEPM-BW high-impact measurements (iperf, bbftp, GridFTP) at 90±15 min intervals, with a selectable focal area; traces for forward and reverse route changes, min and avg RTT, iperf, bbftp, iperf 1-stream, abing.]

    Slide 6:Visualization: traceroutes

    - Compact table to see correlations between many routes
    - Identify significant changes in routes: differences in more than 1 hop, NOT the same first 3 octets, NOT the same AS
    - Report all traceroute pathologies: "!" annotations, ICMP checksum errors, non-responding interfaces, unreachable end host, stutters, multi-homed end host
    - Note, we observe:
      - Most route changes (>98%) do not result in significant performance changes
      - Many performance changes (~50±20%) are NOT due to route changes: applications, host congestion, level-2 changes etc.

    Slide 7:Route table Example

    - Compact, so one can see many routes at once
    - History navigation
    - Multiple route changes (due to GEANT), later restored to the original route
    - Available bandwidth
    - Raw traceroute logs for debugging
    - Textual summary of traceroutes for email to an ISP
    - Description of route numbers, with date last seen
    - User-readable (web table) routes for this host for this day
    - Route # at start of day gives an idea of route stability
    - Mouseover for hops & RTT

    Slide 8:Another example

    - TCP probe type
    - Host not pingable
    - Intermediate router does not respond
    - ICMP checksum error
    - Level change
    - Get AS information for routes

    Slide 9:Topology (INCITE)

    Choose times and hosts and submit the request. [Topology graph with nodes for DL, CLRC, IN2P3, CESnet, ESnet, JAnet, GEANT, SLAC; alternate routes marked; x-axis: hour of day.]
    - Nodes colored by ISP
    - Mouseover shows node names
    - Click on a node to see subroutes
    - Click on an end node to see its path back
    - Also can get raw traceroutes with AS

    Slide 10:Demo Topology: PlanetLab nodes


    Slide 11:Zoom in: ESnet access

    [Map zoom labels: Europe/GEANT, Greece]

    Slide 12:Routers along route

    Slide 13:Delays along route

    Slide 14:Data Access

    - Interactive, web accessible
    - Most data can be downloaded in space- or comma-separated form etc. (accessible via a link or to a program, e.g. using lynx to access the URL); however, non-standard
    - Web services (GGF NMWG definitions):
      - Working (with Warren Matthews/GATech/I2) on defining/providing access to traceroutes for AMP & IEPM-LITE
      - MonALISA is accessing data via Web services
      - Web services have a steep learning curve, tools to facilitate them are not available, and the definitions are not firm, so expect to change code accessing the data
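As a sketch of the programmatic access described above: a comma-separated download can be pulled with any HTTP client and split into rows. The URL in the comment is a hypothetical placeholder, not an actual IEPM-BW endpoint; the real links hang off the IEPM-BW web pages.

```python
import csv
import io
import urllib.request

def parse_rows(text):
    """Split comma-separated download text into a list of rows."""
    return list(csv.reader(io.StringIO(text)))

def fetch_rows(url):
    """Fetch a machine-readable data page and parse it into rows."""
    with urllib.request.urlopen(url) as resp:
        return parse_rows(resp.read().decode())

# e.g. (hypothetical path):
# rows = fetch_rows("http://www-iepm.slac.stanford.edu/bw/data.csv")
```

This is the same access pattern as the lynx example on the slide, just from inside a program instead of a browser.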

    Slide 15:IEPM-BW HENP Deployment June 2004

    - Measurements from SLAC & FNAL (BaBar, CMS, D0, CDF) + 60-70 remote hosts in 12 countries
    - Toolkits needed on monitor & remote hosts
    - Range of bandwidths: 500 Kbps to 1 Gbps

    Slide 16:Working on:

    - Provide more options for security for remote hosts
    - Web-services API access to data
    - Provide & integrate a low-network-utilization tool: ~25-30% of Abilene traffic is net measurement
    - Automate detection of anomalous step changes in performance
    - Evaluate using QoS or HSTCP-LP to reduce the impact of iperf traffic (with ESnet); evidence that it causes packet loss (ESnet/FNAL/SLAC)

    Slide 17:Simplify remote security

    - Currently use ssh to start and kill servers, check things etc.
    - Instead, run the servers all the time at the remote host:
      - Check & restart with a cron job
      - Also kill hung processes with cron jobs
      - More work for the remote admin; more difficult to check why things are not working
    - Looking at BWCTL from the I2 E2E PiPES project
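A minimal watchdog along these lines might look like the following. This is an illustrative sketch, not the IEPM-BW code: the server command (iperf), the 6-hour hung-process threshold, and the cron interval are all assumptions; the `ps -eo pid,etimes,comm` format is Linux procps.

```python
#!/usr/bin/env python3
"""Illustrative cron watchdog (assumptions, not IEPM-BW's actual code):
restart the measurement server if it has died, and kill copies that look
hung. Run from cron, e.g.:  */10 * * * * /usr/local/bin/check_servers.py"""
import subprocess

SERVER = ["iperf", "-s"]   # long-running server to keep alive (example choice)
MAX_AGE_S = 6 * 3600       # treat older processes as hung (assumed limit)

def parse_ps(ps_output, name):
    """Extract (pid, elapsed-seconds) pairs for a command name from
    `ps -eo pid,etimes,comm` output (Linux procps)."""
    procs = []
    for line in ps_output.splitlines()[1:]:   # skip the header row
        if not line.strip():
            continue
        pid, etimes, comm = line.split(None, 2)
        if comm.strip() == name:
            procs.append((int(pid), int(etimes)))
    return procs

def main():
    out = subprocess.run(["ps", "-eo", "pid,etimes,comm"],
                         capture_output=True, text=True).stdout
    healthy = []
    for pid, age in parse_ps(out, SERVER[0]):
        if age > MAX_AGE_S:
            subprocess.run(["kill", "-9", str(pid)])   # hung: kill it
        else:
            healthy.append(pid)
    if not healthy:
        subprocess.Popen(SERVER)                       # nothing running: restart

if __name__ == "__main__":
    main()
```

The trade-off the slide notes shows up here: the remote admin now owns this script, and a silent watchdog failure is harder to spot than an ssh error at the monitoring site.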

    Slide 18:Low impact bandwidth measurement integration

    Developed in the DoE-funded INCITE program. Goals:
    - Make a measurement in under a second rather than tens of seconds
    - Inject little network traffic
    - Provide reasonable agreement with more intense methods (e.g. iperf)
    Enables:
    - Measurements of low-performance links (e.g. to developing countries)
    - Helps avoid the need for scheduling
    - More frequent measurements (minutes vs. hours)
    - Lower impact, more friendly

    Slide 19:Low impact Bandwidth

    Use 20 packet pairs to roughly estimate the dynamic bandwidth capacity (DBC) and cross-traffic (Xtraffic); then Available bandwidth = DBC - Xtraffic. Capacity comes from the minimum pair separation; Xtraffic from the packet-pair dispersion. [Plot: iperf vs. ABwE cross-traffic and available bandwidth, SLAC to Caltech, Mar 19, 2004.]
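The DBC/cross-traffic idea can be sketched in a few lines. This is a simplified illustration, not the actual ABwE algorithm: the 1500-byte probe size and the use of the mean gap as the dispersion estimate are assumptions.

```python
def abwe_estimate(gaps_s, packet_bits=12000):
    """Simplified packet-pair sketch (not the real ABwE code;
    packet_bits = 12000 assumes 1500-byte probes). gaps_s holds the
    measured inter-arrival gaps in seconds for ~20 back-to-back pairs."""
    min_gap = min(gaps_s)                    # pair that crossed the bottleneck clean
    avg_gap = sum(gaps_s) / len(gaps_s)      # typical dispersion incl. queuing
    dbc = packet_bits / min_gap              # dynamic bandwidth capacity (bit/s)
    xtraffic = dbc - packet_bits / avg_gap   # rate claimed by competing traffic
    available = dbc - xtraffic               # Available = DBC - Xtraffic
    return dbc, xtraffic, available
```

The minimum gap bounds the bottleneck capacity because only a pair that met no cross-traffic keeps the back-to-back spacing; extra dispersion on the other pairs is attributed to competing bits.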

    Slide 20:Anomalous Event Detection

    - Too many graphs to scan by hand; need to automate
    - SLAC to Caltech link performance dropped by a factor of 5 for ~1 month before being noticed; fixed within 4 hours of reporting
    - Looking for long-term step-down changes in bandwidth
    - Use a modified plateau algorithm from NLANR:
      - Divide the data into a history buffer and a trigger buffer
      - If y < mh - b*sh, add y to the trigger buffer, else to the history buffer (b = 2; mh, sh are the mean and standard deviation of the history buffer)
      - When the trigger buffer fills: if mt < d*mh (mt = mean of the trigger buffer, d a sensitivity threshold), then we have an event
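The plateau test can be sketched as follows. b = 2 comes from the talk; the values of d, the buffer lengths, and the warm-up and reset policies are illustrative choices (the talk sizes the trigger buffer at 1 to 3 hours of samples).

```python
from collections import deque
from statistics import mean, stdev

def detect_events(samples, b=2.0, d=0.6, history_len=50, trigger_len=10):
    """Step-down detector in the spirit of the modified NLANR plateau
    algorithm. Returns the sample indices at which an event is declared."""
    history = deque(maxlen=history_len)
    trigger = []
    events = []
    for i, y in enumerate(samples):
        if len(history) < history_len:
            history.append(y)                # warm up: fill the history first
            continue
        mh, sh = mean(history), stdev(history)
        if y < mh - b * sh:                  # suspiciously low: buffer it
            trigger.append(y)
            if len(trigger) == trigger_len:
                if mean(trigger) < d * mh:   # sustained, deep drop: event
                    events.append(i)
                history.extend(trigger)      # fold into history, reset
                trigger = []
        else:
            history.append(y)
            trigger = []                     # a normal sample clears the buffer
    return events
```

Because the trigger buffer must fill with consecutive low samples before an event fires, short dips are ignored, which matches the slide's point that the buffer length sets how long a step down must last to be interesting.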

    Slide 21:Anomalous Event Detection

    - The length of the trigger buffer (t) determines how long a step down must last before being interesting; we use 1 to 3 hours
      - E.g. at 20 mins we saw 9 events, at 40 mins 3, at 60 mins none
    - Events caused by an application on the Caltech host (not network related)
    - Works well unless there are strong (>40%) diurnal changes; 8% of hosts have strong diurnal variations
    - Next step: incorporate diurnal checks

    Slide 22:Future plans

    - Integrate it all; improve distribution and management tools
    - Add monitoring sites, e.g. HENP tier 0 & 1 sites such as CERN, BNL, IN2P3, DESY; ESnet, StarLight, Caltech
      - Adding hosts requires commitment of some resources (10-20% FTE) at the monitoring site
    - Integrate with PiPES
    - Add extra functionality:
      - Improved event detection: include diurnals, multivariate
      - Filter alerts
      - Upon detecting an anomaly, gather relevant information (network, host etc.) including on-demand measurements (e.g. NDT), and prepare a web page & email
      - Improved web-services access
    - Study I2 & NREN/SURFnet Detectives

    Slide 23:Thanks: Development

    - Paola Grosso (SLAC) & Warren Matthews (GATech): web services
    - Maxim Grigoriev (FNAL): event detection, IEPM visualization, major monitoring site
    - Ruchi Gupta (Stanford): event visualization
    - Prof. Arshad Ali & Fahad Khalid (NIIT, Pakistan): data collection after an event

    Slide 24:Thanks: on-going

    Foreign: Andrew Daviel (TRIUMF), Simon Leinen (SWITCH), Olivier Martin (CERN), Sven Ubik (CESnet), Kars Ohrenberg (DESY), Bruno Hoeft (FZK), Dominique (IN2P3), Fabrizio Coccetti (INFN), Cristina Bulfon (INFN), Yukio Karita (KEK), Takashi Ichihara (RIKEN), Yoshinori Kitasuji (APAN), Antony Antony (NIKHEF), Arshad Ali (NIIT), Serge Belov (BINP), Robin Tasker (DL & RAL), Yee Ting Lee (UCL), Richard Hughes-Jones (Manchester).
    US: Shawn McKee (Michigan), Tom Hacker (Michigan), Eric Boyd (I2), Stanislav Shalunov (I2), George Uhl (GSFC), Brian Tierney (LBNL), John Hicks (Indiana), John Estabrook (UIUC), Maxim Grigoriev (FNAL), Joe Izen (UT Dallas), Chris Griffin (U Florida), Tom Dunigan (ORNL), Dantong Yu (BNL), Suresh Singh (Caltech), Chip Watson (JLab), Robert Lukens (JLab), Shane Canon (NERSC), Kevin Walsh (SDSC), David Lapsley (MIT/Haystack/ISI-E).

    Slide 25:More information

    - IEPM-BW home page: http://www-iepm.slac.stanford.edu/bw/
    - ABwE lightweight bandwidth estimation: http://www-iepm.slac.stanford.edu/abing/
    - Traceroute graphical tool: http://www-iepm.slac.stanford.edu/tools/TRgb/_READMEaboutTRgb
    - Anomalous event detection: www.slac.stanford.edu/grp/scs/net/papers/sigcomm2004/nts26-logg.pdf
    - IEPM Web Services: http://www-iepm.slac.stanford.edu/tools/web_services/

    Slide 26: Extra Slides

    Slide 27:Putting it together

    [Network diagram: SLAC connectivity via ESnet, CENIC, Abilene, SuperNet, SOX.]

    Slide 28:Data Access

    - Interactive, web accessible
    - Most data can be downloaded in space- or comma-separated form etc. (accessible via a link or to a program, e.g. using lynx to access the URL); however, non-standard
    - Web services (GGF NMWG definitions):
      - Working (with Warren Matthews/GATech/I2) on defining/providing access to traceroutes for AMP & IEPM-LITE
      - MonALISA is accessing data via Web services
      - Web services have a steep learning curve, tools to facilitate them are not available, and the definitions are not firm, so expect to change code accessing the data

    Slide 29:Web Services

    See http://www-iepm.slac.stanford.edu/tools/web_services/
    - Working for: RTT, loss, capacity, available bandwidth, achievable throughput
    - No schema defined for traceroute (hop-list)
    - PingER definition WSDL: http://www-iepm.slac.stanford.edu/tools/soap/wsdl/PINGER_profile.wsdl
      - path.delay.roundTrip ms (min/avg/max + RTTs), path.loss.roundTrip, IPDV (ms); also dups, out-of-order, IPDV, TCP throughput estimate
      - Required to provide packet size, units, timestamp, src, dst
      - path.bandwidth.available, path.bandwidth.utilized, path.bandwidth.capacity
    - Excerpt from the WSDL:
      <definitions name="PINGER" targetNamespace="http://www-iepm.slac.stanford.edu/tools/soap/wsdl/PINGER_profile.wsdl"> <message name="GetPathDelayRoundTripInput"> <part name="startTime" type="xsd:string"/> <part name="endTime" type="xsd:string"/> <part name="destination" type="xsd:string"/> </message>
    - Mainly for recent data; need to make real-time data accessible
    - Used by MonALISA, so need coordination to change definitions
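For illustration, a request for the round-trip-delay operation could be assembled from the input message parts quoted above. The part names (startTime, endTime, destination) follow the WSDL excerpt; the envelope layout, namespaces, operation element name, and the destination value are generic SOAP boilerplate and assumptions, not taken from the live service.

```python
WSDL = "http://www-iepm.slac.stanford.edu/tools/soap/wsdl/PINGER_profile.wsdl"

def make_envelope(start_time, end_time, destination):
    """Assemble a SOAP 1.1 request body whose parts match the
    GetPathDelayRoundTripInput message in the WSDL excerpt. The
    envelope framing here is a generic sketch; the real namespaces
    and endpoint should be read from the WSDL itself."""
    return f"""<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetPathDelayRoundTrip>
      <startTime>{start_time}</startTime>
      <endTime>{end_time}</endTime>
      <destination>{destination}</destination>
    </GetPathDelayRoundTrip>
  </soap:Body>
</soap:Envelope>"""

# To send (endpoint URL must come from the WSDL's service section):
# import urllib.request
# req = urllib.request.Request(endpoint, make_envelope(...).encode(),
#                              {"Content-Type": "text/xml"})
# reply = urllib.request.urlopen(req).read()
```

The coordination point in the last bullet matters here: because MonALISA consumes the same definitions, any change to these message parts breaks client code like this sketch.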

    Slide 30:Perl access to PingER

    Slide 31:PingER WSDL

    Slide 32:Output from script

    Slide 33:Perl AMP traceroute

    Slide 34:AMP traceroute output

    Slide 35:Intermediate term access

    Provide access to analyzed data in tables via .tsv format download from web pages.

    Slide 36:Bulk Data

    For long term detailed data, we tar and zip the data on demand. Mainly for PingER data.

    [Plots: ABwE and iperf 28-day bandwidth history, SLAC to Caltech. During this time we can see several different situations caused by different routing: a drop to 100 Mbit/s due to routing (BGP) errors; a drop to a 622 Mbit/s path, then back to the new 1000 Mbit/s CENIC path; forward and reverse routing changes marked. Scatter plots of iperf versus ABwE on different paths (range 20 to 800 Mbit/s) show the agreement of the two methods over the 28-day history; RTT, bbftp and iperf 1-stream traces also shown.]

    Slide 37: ABwE also works well on DSL and wireless networks.

    Changes in network topology (BGP) can result in dramatic changes in performance.
    [Snapshot of the traceroute summary table, and samples of traceroute trees generated from the table.]
    [Plot: ABwE measurements, one per minute for 24 hours, Thurs Oct 9 9:00am to Fri Oct 10 9:01am, showing dynamic BW capacity (DBC), cross-traffic (XT) and available BW (DBC - XT) in Mbit/s per remote host by hour: a drop in performance (from the original path SLAC-CENIC-Caltech to SLAC-ESnet-LosNettos (100 Mbps)-Caltech), then a return to the original path. The changes were detected by IEPM-iperf and ABwE; the ESnet-LosNettos segment in the path runs at 100 Mbit/s.]
    Notes:
    1. Caltech misrouted via the Los-Nettos 100 Mbps commercial net 14:00-17:00
    2. ESnet/GEANT working on routes from 2:00 to 14:00
    3. A previous occurrence went un-noticed for 2 months
    4. Next step is to auto-detect and notify