
Networking for the Grid

Networking for the Grid. Yee-Ting Li, eScience Summer School @ Edinburgh. What the GRID is: a worldwide distributed system, interconnected with ‘networks’, balancing processor, storage and network utilization. Networking is important to make the GRID work.



Presentation Transcript


  1. Networking for the Grid Yee-Ting Li eScience Summer School @ Edinburgh

  2. What the GRID is • Worldwide Distributed System • Interconnected with ‘networks’ • Balancing processors, storage and network utilization • Networking is important to make GRID work

  3. Networking Important! • Only way two grid nodes can communicate with each other • Need ways of determining how ‘efficiently’ they talk • Focus on: • Characterising how they talk • The language they use to talk

  4. Part 1 • Networking • Network Monitoring • Networks are also transient • Network performance also varies as you’re sharing with n million other users • Sometimes you can notice periodic patterns – sometimes you can’t • Difficult to analyse and create trends/predictions • Show steps towards…

  5. Networking 101 • Networking is straightforward • Just connect to the network and it works! • HA!

  6. Networking • Complex? Gets more complex! • Each node has its own scheduling priorities • Routers must serve trillions of data units per second!

  7. Networking • Complex stack through which data has to flow to get onto the network • Each node on the network also has its own stack • Routers have IPR on stacks – no one knows what Cisco stuff looks like!

  8. Example Metrics • Connectivity • Delay • One-way delay • Two-way delay • Throughput / goodput • Network path • Loss • Jitter
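
Several of the metrics listed above can be derived from a plain series of probe results. The following is a minimal sketch, assuming a hypothetical list of round-trip times in milliseconds in which `None` marks a lost probe. The function name and the jitter definition (mean absolute difference between consecutive received delays) are illustrative assumptions, not taken from the slides.

```python
from statistics import mean


def summarise_probes(rtts_ms):
    """Summarise probe results; rtts_ms holds RTTs in ms, None = lost probe."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    received = [r for r in rtts_ms if r is not None]
    # Loss: fraction of probes that never came back.
    loss = 1 - len(received) / len(rtts_ms)
    # Jitter (assumed definition): mean absolute change between
    # consecutive received delays.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "connectivity": bool(received),
        "two_way_delay_ms": mean(received) if received else None,
        "loss_fraction": loss,
        "jitter_ms": mean(diffs) if diffs else 0.0,
    }


if __name__ == "__main__":
    print(summarise_probes([10.0, 12.0, None, 11.0]))
```

One-way delay and throughput need different measurements (synchronised clocks and bulk transfers respectively), so they are left out of this sketch.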

  9. Metrics Example • Video Conferencing • Needs predictable bit rate • Doesn’t usually matter if bit rate changes too much • Needs constant jitter • Low one-way delay preferable • FTP • Needs reliable transport • Throughput depends on urgency of data • Jitter and delay don’t matter

  10. Network Monitoring Uses • Monitoring is measuring over long periods of time • Gives an indication of network performance over time – a baseline • Allows comparison of different tools for analysis • Allows analysis of how different protocols behave in different conditions – in real life • Allows ‘tuning’ of existing protocols to make most out of network

  11. Possible Users of a NM Web Service • Network Managers • See how much bandwidth is being used • Network Analysts • Make things faster and better! • Resource Brokers • Broker to determine where to send jobs – Network Cost • Bandwidth Brokers • Allocate bandwidth depending on current network state • Replication Managers • Distribute data only when network is not busy • QoS Brokers (aka Managed bandwidth Services) • Universal language for intercommunication..? • Next Generation FTP • First look up historical throughputs before sending to determine best path

  12. GridNM • Architecture for monitoring the network • Backend – collects data for presentation • Logs metrics in ASCII log files on a single host • Allows mesh measurements – all nodes perform measurements to all other nodes • Uses standard UNIX infrastructure – ssh • Should be easily adaptable to using Globus certifications once interactive processing is introduced in EDG.

  13. GridNM (cont…) • Uses existing (and future) tools to collect metrics • Modular - uses XML to describe available resources • Hosts • Tools • Locks hosts if under measurement – prevents other tests affecting metrics • Currently monitoring 6 sites around Europe using 5 tools
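
The combination of mesh measurements and host locking can be illustrated with a small scheduling sketch. This is not GridNM's actual code: the function name, the greedy round-based strategy and the site names are assumptions made for illustration only. It enumerates every ordered source/destination pair and groups them into rounds in which no host takes part in more than one test.

```python
from itertools import permutations


def mesh_rounds(hosts):
    """Group all ordered (src, dst) pairs into rounds with no host reused."""
    pending = list(permutations(hosts, 2))
    rounds = []
    while pending:
        locked = set()            # hosts busy in the current round
        this_round, deferred = [], []
        for src, dst in pending:
            if src in locked or dst in locked:
                deferred.append((src, dst))   # a host is locked: try later
            else:
                locked.update((src, dst))
                this_round.append((src, dst))
        rounds.append(this_round)
        pending = deferred
    return rounds


if __name__ == "__main__":
    # Hypothetical site names; the slides do not list the real ones.
    sites = ["site-a", "site-b", "site-c", "site-d", "site-e", "site-f"]
    for i, r in enumerate(mesh_rounds(sites)):
        print(i, r)
```

With 6 sites there are 30 ordered pairs to measure, which is why locking matters: without it, overlapping tests would share a host's link and distort each other's results.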

  14. GridNM ‘plot’

  15. Web Service Network Monitoring • GridNM just one Network Monitoring Program • Many different programs out there! • Unify data exchange between different monitoring infrastructures

  16. piPEs • Internet2 e2ePI Architecture for network monitoring • Defines information flow to diagnose network and host performance – white paper • Incorporates a ‘finger pointing’ mechanism to identify poor performers • Ideal starting point! • BUT… found out about it too late… • Currently investigating implementation with SLAC software + web service as possible implementation of piPEs software

  17. GGF NMWG • Defines characteristics that are just the values that we are interested in • Defines classes of metrics, e.g. bandwidth, delay etc. that these characteristics report • Defines singleton and derived characteristics • Defines samples of data and their inherent sampling patterns • Timestamps • Still in draft form…

  18. GGF NMWG cont. / Schema Design • As it’s all in XML, designing an XML schema to describe ‘objects’ to be passed around • XML Schema Document (XSD) • Focusing on actually implementing what the NMWG document says… and doesn’t say… • Note: We are also tackling this from a pure OO design too – however, due to technical differences between objects in C++, Java and SOAP/XML, there may be issues to overcome…

  19. Part 2 • Network Communication Languages • Known as transport protocols - determine how applications put traffic into the network • Sit on top of IP – common language of the internet

  20. Transport Level Protocols • TCP (HTTP, FTP, GridFTP) used for file transfer • Gives guarantee on delivery • All data is copied precisely • Performance can be poor • Respects other internet users • UDP (Real, H323) used for video conferencing • Gives no guarantees on delivery • Data may be incomplete • Performance good • Doesn’t respect other internet users

  21. UDP vs TCP • UDP: min=274, max=565, ave=493, stdev=43 • TCP: min=37, max=292, ave=195, stdev=40 • Summary: TCP is rubbish! – why?

  22. Memory and Disk transfers • Fast Ethernet: over 60Mbits/s, iperf >> file copy • OC3: disk limited • [Plot: file copy disk-to-disk against iperf TCP, Mbits/s. Credit: Les Cottrell, SLAC]

  23. What does TCP do? • TCP retransmits lost data • Even retransmits data it ‘thinks’ has been lost! • Needs and uses a ‘windowing’ system • Uses ACKnowledgements from receiver • Grows a Congestion Window ‘cwnd’ to determine the size of window • Model: • Tap is independent of Tank size • Tank filled by application • Valve opening (data rate) determined by feedback from network • Small tanks mean small data rate • Large tanks mean larger data rate • [Diagram labels: Socket buffer size; TCP Protocol; Network]

  24. TCP socket buffer sizes • Iperf observations: 490 • Standard socket buffer graph • Shows linear(ish) region followed by plateau • Optimal socket buffer size just over 2 Mbytes

  25. Retransmitted Data • Graph shows the amount of retransmitted data against the throughput • Retransmitted data is due to loss on the network • In the general case, ACKs have to time out before data is resent • We get more retransmitted data for low throughputs with large windows

  26. Measuring Performance of Transport Level Protocols • Need to identify what we want to measure – the metrics. • Dependent on the use of the transport protocol. Need to analyse application level usage • For Grid: • Movement of ‘transient’ data • File Transfer and Replication • process jobs or ‘sandboxes’ • Movement of Real-Time Data • Video Conferencing – Access Grid • Real-Time applications

  27. Web 100 & TCP • OSI states that we should not know anything about the separate layers • How do we know something is going wrong? – your throughput decreases! • Prevents congestion collapse! • Need Web100! Allows in depth tcp stack analysis per flow • Kernel patch – 2.4.16, alpha1.2 • New version – 2.4.19 alpha2.0pre1 • Using program to grab web100 results - logvars

  28. Reliability of Web100 results… • Still alpha… but reliable • Graph against iperf throughputs correlates very well • At least as reliable as the result offered by iperf!

  29. Congestion Window • Looking at the max_cwnd achieved for each measurement… • Appears to be two regions • with high correlation of throughput and max cwnd • A linear region where we get a range of throughputs for same max_cwnd • Cwnd never grows beyond 1500kbytes!

  30. Bandwidth Delay Product • Window = bandwidth * delay • We want • Bandwidth = 1,000,000,000 bit/sec • We have • Delay = 19ms • Window needs to be an average of… • = 1e+9 * 19e-3 / 8 bytes • = 2,375,000 bytes (about 2.4 Mbytes)! • We only achieve ~1.5 Mbytes max! • Need to implement some monitoring of the degree of the average and variation of cwnd for each tcp connection…
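
The window arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names. It assumes the 19 ms delay is the round-trip time and that the ~1.5 Mbytes ceiling means 1.5e6 bytes.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Window (bytes) needed to keep a path full: bandwidth * delay / 8."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes, rtt_s):
    """Best-case throughput (bit/s) when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    rtt = 19e-3                       # 19 ms, assumed to be the RTT
    needed = bdp_bytes(1e9, rtt)      # window needed for 1 Gbit/s
    ceiling = window_limited_bps(1.5e6, rtt)   # what 1.5 Mbytes allows
    print(f"window needed: {needed:,.0f} bytes")
    print(f"throughput ceiling: {ceiling / 1e6:.0f} Mbit/s")
```

Under these assumptions a window capped near 1.5 Mbytes limits a single flow to roughly 630 Mbit/s on this path, well short of the 1 Gbit/s target.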

  31. TCP Optimisation • It’s actually TCP that is limiting our transfer rates! • All applications use it! • Understandable as TCP hasn’t changed much for the last 15-20 years! • When standard link was about 56kbit/sec! • Solution: Need new TCP implementations!

  32. What is High Speed TCP? • Changes the way TCP behaves at high speed (ie large cwnd) • Standard TCP has two modes • Slow start (not very slow…) • Congestion Avoidance • Focuses on Congestion Avoidance Region – ie when TCP knows (thinks it knows…) how well the network behaves… • BUT only when we are at high speeds, else do what normal Standard TCP does… • Readily deployable 1st step towards Equation Based Congestion Control

  33. What does it do? • Standard TCP uses two parameters • Increase parameter, a • Decrease parameter, b • i.e. AIMD( a,b ) • Standard TCP uses • a=1 • b=0.5 • High Speed TCP introduces • a->a(cwnd) • b->b(cwnd) • i.e. The value of a and b depends on the current congestion window size • If we increase a more with larger cwnd we can get back up to our ‘optimal’ cwnd size for the network path • If we decrease b less we don’t lose as much bandwidth due to a small congestion window
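
The AIMD(a, b) rule can be sketched as a per-round-trip update of the congestion window. This is a simplified illustration, not a TCP implementation: it works in whole segments, ignores slow start and timeouts, and the function names and the floor of one segment are assumptions.

```python
def aimd_step(cwnd, loss, a=1.0, b=0.5):
    """One RTT of AIMD(a, b): add a segments, or cut by a factor b on loss."""
    if loss:
        return max(1.0, cwnd * (1 - b))   # multiplicative decrease
    return cwnd + a                       # additive increase


def simulate(start_cwnd, loss_rtts, n_rtts, a=1.0, b=0.5):
    """Trace cwnd over n_rtts round trips; loss_rtts holds the lossy RTTs."""
    cwnd, trace = float(start_cwnd), []
    for rtt in range(n_rtts):
        cwnd = aimd_step(cwnd, rtt in loss_rtts, a, b)
        trace.append(cwnd)
    return trace


if __name__ == "__main__":
    # Standard TCP values from the slide: a=1, b=0.5.
    print(simulate(10, {2}, 4))
    # A gentler decrease, as High Speed TCP would use at large cwnd.
    print(simulate(10, {2}, 4, a=1.0, b=0.1))
```

With a=1 and b=0.5 a single loss halves the window, and recovering the lost half takes one round trip per segment, which is what makes large windows so slow to regain.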

  34. What exactly does it do? • Based on the TCP response function • Relates loss and throughput • Uses the TCP response function to investigate certain parameters • High_Window, High_Loss; largest cwnd needed for x throughput and the required loss for that throughput • Low_Window, Low_Loss; smallest cwnd when we actually switch from Standard TCP and the required loss rate for that cwnd size • High_B; the smallest decrease in b when we are at a large cwnd • Equations to transform this information into a table for a(cwnd) and b(cwnd)
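
The step from these parameters to a(cwnd) and b(cwnd) can be sketched as follows. The slide gives no numbers, so the constants below are assumptions taken from the defaults in the later published HighSpeed TCP specification (RFC 3649): Low_Window = 38, High_Window = 83000, a low loss rate of 1e-3, a high loss rate of 1e-7, and a decrease factor of 0.1 at High_Window (called High_B on the slide). The function names are hypothetical.

```python
import math

# Assumed defaults (RFC 3649); not stated on the slides.
LOW_WINDOW, HIGH_WINDOW = 38, 83000
LOW_P, HIGH_P = 1e-3, 1e-7
HIGH_DECREASE = 0.1


def _frac(w):
    """Position of w between LOW_WINDOW and HIGH_WINDOW on a log scale."""
    return (math.log(w) - math.log(LOW_WINDOW)) / (
        math.log(HIGH_WINDOW) - math.log(LOW_WINDOW))


def b_of(w):
    """Decrease parameter: 0.5 at LOW_WINDOW, falling to HIGH_DECREASE."""
    if w <= LOW_WINDOW:
        return 0.5
    return (HIGH_DECREASE - 0.5) * _frac(w) + 0.5


def p_of(w):
    """Loss rate the response function associates with window w."""
    return math.exp(math.log(LOW_P)
                    + _frac(w) * (math.log(HIGH_P) - math.log(LOW_P)))


def a_of(w):
    """Increase parameter (segments per RTT); never below Standard TCP's 1."""
    if w <= LOW_WINDOW:
        return 1.0
    b = b_of(w)
    return max(1.0, w * w * p_of(w) * 2 * b / (2 - b))


if __name__ == "__main__":
    for w in (38, 1000, 10000, 83000):
        print(w, round(a_of(w), 1), round(b_of(w), 2))
```

Below Low_Window the sketch returns the Standard TCP values a=1 and b=0.5, matching the slide's point that the modified behaviour only applies at high speeds. Real implementations typically precompute these values into a lookup table rather than evaluating logarithms per ACK.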

  35. Transport Protocols ‘NG’
