Scalable Integrated Performance Analysis of Multi-Gigabit Networks

Ezra Kissel, U. Delaware
Ahmed El-Hassany, Guilherme Fernandes, Martin Swany, Indiana U.
Dan Gunter, Taghrid Samak, LBNL
Jen Schopf, WHOI

What I hope you learn
• Why we care about bulk data transfer at multi-gigabit rates
• Why and how detailed monitoring is helpful
• How dynamic control of monitoring is related to Session Layer protocols

Bulk data transfer needs
• Some domains of interest:
  • Climate simulation (Earth System Grid)
  • Genomics (JGI)
  • High-energy physics (Large Hadron Collider)
  • Astronomy (Large Synoptic Survey Telescope)
  • Astrophysics (FLASH)
[Figure labels: Analysis sites; Huge data]

Multi-gigabit rates
• Networks connecting national labs and universities have 10Gb/s and soon 100Gb/s capability.
  • One PB ≈ one day at 100Gb/s
• These rates are rarely achieved due to bottlenecks:
  • Host: application or disks
  • Campus/local networks
  • Wide area networks
• Hard to tell why, where, or even if there is a problem
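The "one PB per day" figure can be checked with a short back-of-the-envelope calculation. The sketch below assumes a decimal petabyte (10^15 bytes) and an ideal link with no protocol overhead or bottlenecks; the slide states neither, and the function name is made up for illustration.

```python
# Back-of-the-envelope check: ideal time to move one petabyte at a line rate.
# Assumption: decimal petabyte (10**15 bytes); the slide does not say PB vs PiB.
PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 86_400


def transfer_time_days(num_bytes: int, rate_gbps: float) -> float:
    """Ideal transfer time in days, ignoring overhead and bottlenecks."""
    bits = num_bytes * 8
    seconds = bits / (rate_gbps * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for rate in (10, 100):
        days = transfer_time_days(PETABYTE_BYTES, rate)
        print(f"1 PB at {rate} Gb/s: {days:.2f} days")
```

Under these assumptions the 100Gb/s case comes out slightly under one day, so the slide's figure is a rounded approximation, and any of the bottlenecks listed above would stretch it further.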

Solution
• Monitor all the time
• Analyze all the time, but much more when something interesting is happening
• Use analysis results as feedback

System components
• eXtensible Session Protocol (XSP)
  • Associate multiple TCP connections, L2 circuits, as a "session"
  • Provide channels for bi-directional metadata
• NL-Calipers
  • Summarize in situ timings of every read/write
• BLiPP
  • Host and TCP stack info. using XSP channels
• perfSONAR
  • Standard information formats and exchange protocols

Dynamic Session Monitoring
[Diagram: a user and a network engineer look at the performance. Labeled steps: (1) Start xfer, (2) Open session, (3) NL-calipers data, (4) Signal TCP, (5) data]

Bottleneck detection
• Triangles give "instantaneous" throughput
• Instrumentation: on fixed intervals, summarize all measurements into mean, min, max, variance for both rate and #bytes
• Analysis: pick lowest mean value as bottleneck, apply t-test
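The analysis step above can be sketched roughly as follows. This is a minimal illustration, not the NL-Calipers implementation: the slide does not say which t-test is used or what the lowest mean is compared against, so the sketch assumes Welch's t-test between the lowest-mean component and the next lowest. All names and the sample rates are hypothetical.

```python
import math
from dataclasses import dataclass
from statistics import mean, variance


@dataclass
class Summary:
    """Per-interval summary of one instrumented component's rate samples."""
    n: int
    mean: float
    min: float
    max: float
    var: float


def summarize(samples):
    # statistics.variance is the sample variance and needs at least 2 points.
    return Summary(len(samples), mean(samples), min(samples),
                   max(samples), variance(samples))


def welch_t(a: Summary, b: Summary):
    """Welch's t statistic and approximate degrees of freedom.

    Assumes both summaries have n >= 2 and at least one nonzero variance.
    """
    va, vb = a.var / a.n, b.var / b.n
    t = (a.mean - b.mean) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.n - 1) + vb ** 2 / (b.n - 1))
    return t, df


def find_bottleneck(rates_by_component):
    """Pick the lowest-mean component and test it against the runner-up."""
    summaries = {name: summarize(v) for name, v in rates_by_component.items()}
    ranked = sorted(summaries, key=lambda name: summaries[name].mean)
    lowest, runner_up = ranked[0], ranked[1]
    t, df = welch_t(summaries[lowest], summaries[runner_up])
    return lowest, runner_up, t, df


# Hypothetical rate samples (MB/s) for one summary interval.
example_rates = {
    "disk_read": [620.0, 640.0, 610.0, 655.0, 630.0],
    "network_write": [940.0, 910.0, 955.0, 930.0, 925.0],
    "disk_write": [410.0, 395.0, 402.0, 388.0, 405.0],
}

if __name__ == "__main__":
    lowest, runner_up, t, df = find_bottleneck(example_rates)
    print(f"candidate bottleneck: {lowest} (vs {runner_up}), t={t:.2f}, df={df:.1f}")
```

A strongly negative t with reasonable degrees of freedom suggests the lowest-mean component really is slower than the runner-up rather than differing by noise; turning t into a p-value would need a t-distribution CDF, which is left out here to keep the sketch dependency-free.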

TCP throughput Time series of throughput* for representative TCP experiments: (a) 1 stream memory-to-disk with 100ms latency, (b) 1 stream memory-to-memory with no latency, (c) 1 stream disk-to-disk with no latency, (d) 4 streams memory-to-disk with 100ms latency and 1% loss added at 60 seconds.

UDT throughput Time series of throughput* for representative UDT experiments: (a) 4 streams memory-to-disk with 100ms latency, (b) 4 streams memory-to-disk with 100ms latency and 1% loss added at 60 seconds, (c) 4 streams disk-to-disk with 100ms latency, (d) 4 streams memory-to-memory with 100ms latency.

Wait, what?

Variance
• Half as many read()s; the others return zero and are not counted
• Less work being done

Review
• Why we care about bulk data transfer at multi-gigabit rates
• Why and how detailed monitoring is helpful
• How monitoring is related to Session Layer protocols
  • and how that might integrate with a management framework
• Questions?

Related projects
• NetLogger: netlogger.lbl.gov
• perfSONAR: perfsonar.org
• XSP: damsl.cis.udel.edu/
• GENI: geni.net
• CEDPS: cedps-scidac.org

Topology-aware Monitoring
