
The importance of the network


morrison




Presentation Transcript


  1. The importance of the network [Figure panels: Physical Network View, Overlay View] • From a distributed systems standpoint, the physical network provides the backbone for overlays. • Distributed systems developers take for granted that a node can talk (reliably, if need be) to any other node in the same physically connected component via some identifier (say, IP address). • Tools, such as network coordinates, can help developers

  2. CS525: What about the network? End-to-End Arguments in System Design. J.H. Saltzer, D.P. Reed, D.D. Clark

  3. Overview • The end-to-end argument • Examples/applications of the argument • Discussion

  4. Function placement • Should functions that end users/applications perform be implemented at lower or higher levels? • If we want to transfer a file reliably, what should be the job of each computer subsystem? • How about encryption? Delivery Acknowledgement? Duplicate message suppression?

  5. End-to-end argument The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing that questioned function as a feature of the communication system itself is not possible. (Sometimes an incomplete version of the function provided by the communication system may be useful as a performance enhancement.)

  6. Careful file transfer Computer B Computer A

  7. Careful file transfer Computer B Computer A F • File transfer program on A asks file system to read F from disk

  8. Careful file transfer Computer B Computer A F • File transfer program on A asks file system to read F from disk • File transfer program on A asks communication system to send file

  9. Careful file transfer Computer B Computer A • File transfer program on A asks file system to read F from disk • File transfer program on A asks communication system to send file • Communication system transmits packets

  10. Careful file transfer Computer B Computer A F • File transfer program on A asks file system to read F from disk • File transfer program on A asks communication system to send file • Communication system transmits packets • Communication system gives F to file transfer program on B

  11. Careful file transfer Computer B Computer A F • File transfer program on A asks file system to read F from disk • File transfer program on A asks communication system to send file • Communication system transmits packets • Communication system gives F to file transfer program on B • File transfer program on B asks file system to write F to disk

  12. What can go wrong? • (A) Reading from and writing to the file system

  13. What can go wrong? • (A) Reading from and writing to the file system • (B) Breaking up file / reassembling file

  14. What can go wrong? • (A) Reading from and writing to the file system • (B) Breaking up file / reassembling file • (C) Transmitting file over communication system

  15. Possible solution #1 • Ensure each step by some form of error checking: duplicate copies, redundancy, timeout and retry, etc. • Packet error checking at each hop • Send every packet three times • Acknowledge packet reception at each hop

  16. Problems with this solution • Not complete; still requires application level checking • May not be economical Computer B Computer A A A B B

  17. Possible solution #2 • “End-to-end check and retry” • Application commits or retries based on checksum value. • If errors along the way are rare, this will most likely finish on first try.
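The "end-to-end check and retry" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `unreliable_transfer` is a hypothetical stand-in for the whole path (disk reads, buffering, packetization, network), and SHA-256 is an arbitrary choice of checksum; the slides do not name one.

```python
import hashlib
import random


def checksum(data: bytes) -> str:
    """End-to-end checksum computed by the application (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def unreliable_transfer(data: bytes, corrupt_prob: float, rng: random.Random) -> bytes:
    """Hypothetical stand-in for the full path from A's disk to B's disk.

    With probability corrupt_prob it flips all bits of one byte, modelling
    an error introduced anywhere along the way.
    """
    if data and rng.random() < corrupt_prob:
        corrupted = bytearray(data)
        corrupted[rng.randrange(len(data))] ^= 0xFF
        return bytes(corrupted)
    return data


def careful_transfer(data: bytes, corrupt_prob: float = 0.1,
                     max_tries: int = 10, seed: int = 0):
    """Retry the whole transfer until the receiver's checksum matches the sender's."""
    rng = random.Random(seed)
    expected = checksum(data)  # computed at the sending application
    for attempt in range(1, max_tries + 1):
        received = unreliable_transfer(data, corrupt_prob, rng)
        if checksum(received) == expected:  # checked at the receiving application
            return received, attempt        # commit
    raise RuntimeError(f"transfer failed after {max_tries} tries")
```

When errors are rare (small `corrupt_prob`), the loop usually commits on the first attempt, which is the point made on the slide.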

  18. Performance • Lower levels can be reliable as a performance booster • Transferring large files • Regardless of data communication, end-to-end check must be done • Tradeoff based on performance, not correctness • Is the amount of effort put into the reliability worth the performance gain?

  19. Delivery guarantee • ARPANET returns RFNM to acknowledge successful message delivery • Is this really useful to end application? [Diagram labels, Computer A to Computer B: message, RFNM; message, "got it"]

  20. Data encryption • Communication system needs keys • Cleartext at host, before application • Authenticity check must be performed Computer B Computer A

  21. Data encryption • Keys are maintained by end application • Ciphertext before application • Authenticity by default (assuming both keys are private) Computer B Computer A

  22. Identifying the ends • Low-level bit checking is bad for real-time voice transfer: high-level error checking is better. • However, low-level reliability measures may be fine if voice is being stored.

  23. Discussion: Layering model • TCP (usually) runs only at end hosts • Does TCP violate end-to-end by being below application? • Is giving the application the option of TCP or UDP the way to go? [Diagram: Computer A and Computer B each run Application, Transport, Network, Data Link, Physical; the router between them runs Network, Data Link, Physical]

  24. Discussion: TCP splitting • Performance much better in wired section • Intermediate node acts as end host • What else can we do? Computer B Computer A

  25. Discussion: Spam • The end user for email is generally considered to be a human. • By the end-to-end argument, the network should deliver all mail to the user. • Are spam control mechanisms therefore in violation of the end-to-end argument? • If so, is it an appropriate violation?

  26. Discussion: End-to-end today • Is the end-to-end argument still valid today? • Is hardware good enough that we don’t have to worry about end checks? • Applications are becoming more and more complex. • Do P2P systems, such as Chord, violate end-to-end? • Does in-network aggregation, such as in sensor networks, violate end-to-end?

  27. Stable and Accurate Network Coordinates J. Ledlie et al. (Harvard University) In International Conference on Distributed Computing Systems (ICDCS’06) Some slides taken from the author’s presentation

  28. Outline • Background • Two Practical Problems • Latencies are not static • Changing coordinates is expensive • Proposed Solutions • Latency Filter • Update Filter • Conclusion

  30. Motivation of Network Coordinates • Direct measurement is not scalable! Predict latency by coordinates • Pick server with lowest mean latency for all players. Use centroid of network coordinates! – Server A [Figure: players and game servers, nodes A to I, plotted at network coordinates (0,8), (25,8), (-39,7), (20,20), (-40,20), (-15,20), (20,-15), (-25,-17), (9,-20); the latency between A and B is labelled RTT_AB]
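The centroid heuristic from this slide can be sketched as follows. The assignment of coordinates to players and servers below is illustrative only (the figure's layout is not recoverable from the transcript), and the names `centroid` and `pick_server` are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of equal-dimension coordinates."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def pick_server(servers, players):
    """Return the name of the server closest to the centroid of the players.

    servers: dict mapping server name -> coordinate
    players: list of player coordinates
    """
    c = centroid(players)
    return min(servers, key=lambda name: math.dist(servers[name], c))


# Illustrative placement (an assumption, not read off the figure).
players = [(-40, 20), (20, 20), (20, -15), (-25, -17)]
servers = {"A": (0, 8), "B": (25, 8)}
best = pick_server(servers, players)  # "A" for this placement
```

No probing between players and servers is needed here: every distance is computed from coordinates alone, which is what makes the approach scale.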

  31. Benefits of NCs • Estimate/Predict RTT without direct probing • Scalability • Make well-understood geometric algorithms applicable to distributed systems problems • Powerful abstraction

  32. How Network Coordinates Work • A starts measurement to B. • B replies with its coord. A deduces RTT. • A computes estimate and error. • A moves toward ideal coord, relative to B. • Repeat with C, D, E. • Predict to X. Example: A at (100,80), B at (70,40), measured RTT = 60ms. Estimate = |(100,80)-(70,40)| = 50ms. Error = RTT - Estimate = 60 - 50 = 10ms. A moves to (103,84). X is at (140,20). Goal: minimize global prediction error
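The single update step in this example can be sketched as below. It is a simplification: the fixed step size `delta=0.5` is an assumption chosen because it reproduces the slide's numbers, whereas Vivaldi proper adapts the step size over time. The function name `update_coordinate` is hypothetical.

```python
import math


def update_coordinate(own, other, rtt_ms, delta=0.5):
    """One simplified coordinate update of `own` relative to `other`.

    Returns (new_coordinate, estimate, error).
    delta is a fixed step size (an assumption for this sketch).
    """
    estimate = math.dist(own, other)   # predicted latency from coordinates
    error = rtt_ms - estimate          # positive: nodes are too close together
    if estimate == 0:
        # Coincident coordinates: pick an arbitrary direction (first axis).
        unit = (1.0,) + (0.0,) * (len(own) - 1)
    else:
        # Unit vector pointing from `other` toward `own`.
        unit = tuple((a - b) / estimate for a, b in zip(own, other))
    # Move away from `other` if error > 0, toward it if error < 0.
    new = tuple(a + delta * error * u for a, u in zip(own, unit))
    return new, estimate, error


new_a, estimate, error = update_coordinate((100, 80), (70, 40), rtt_ms=60)
# estimate is 50 ms, error is 10 ms, and A moves to approximately (103, 84)
```

Repeating this step against C, D and E pulls or pushes A until the distances to its neighbors approximate the measured RTTs.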

  33. Vivaldi Network Coordinates • Simple • Adaptive • Periodic RTT measurements with neighbors • Refine coordinates (pulled or pushed by each neighbor) • Decentralized • Works well… in simulation

  34. Outline • Network Coordinates • Two Practical Problems • Latencies are not static • Changing coordinates is expensive • Proposed Solutions • Latency Filter • Update Filter • Conclusion

  35. Problem #1: Latencies are not Static • Raw latency data have errors and change • RTT_AC = 5ms, 5ms, 6ms, 40ms, 41ms, 40ms • RTT_AB = 60ms, 60ms, 59ms, 1000ms, 70ms, 60ms

  36. Problem #1(a): Errors are Unpredictable Three hours of measurements from berkeley to uvic.ca 82% of measurements within 1ms of median

  37. Problem #1(b): Latencies can Change • Three days of measurements from ntu.edu.tw to 6planetlab.edu.cn • Need to remove noise, but remain adaptive

  38. Outline • Network Coordinates • Two Practical Problems • Latencies are not static • Changing coordinates is expensive • Proposed Solutions • Latency Filter • Update Filter • Conclusion

  39. Solution #1: Latency Filter (Problem: Latencies are not Static) • Filtering with histories • Minimum of previous four samples works best. [Figure: history window on a time axis from newest to oldest; a 1000ms RTT is received at t0, followed by t1 and t2] How do they find out?

  40. Solution #1: Latency Filter • General Moving Percentile (MP) filter • h: size of the history window • p: percentile returned as the prediction • e.g. “Minimum of previous four samples”: h=4, p=25% • Run experiments on the 3-day trace, varying h and p • Evaluation metric: Relative Error = |RTT - Estimate| / RTT • “h=4, p=25%” achieves the lowest error
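A minimal sketch of the Moving Percentile filter, assuming the nearest-rank definition of a percentile (the slides do not say which definition is used). With h=4 and p=25 this definition returns the minimum of the window, matching the "minimum of previous four samples" example. The class and function names are hypothetical.

```python
import math
from collections import deque


class MovingPercentileFilter:
    """Keep the last h RTT samples for one link and return their p-th percentile."""

    def __init__(self, h=4, p=25):
        self.p = p
        self.window = deque(maxlen=h)  # oldest sample is dropped automatically

    def update(self, rtt_ms):
        """Add a raw sample and return the filtered latency."""
        self.window.append(rtt_ms)
        ordered = sorted(self.window)
        # Nearest-rank percentile (an assumption); rank is 1-based.
        rank = max(1, math.ceil(self.p * len(ordered) / 100))
        return ordered[rank - 1]


def relative_error(rtt, estimate):
    """Evaluation metric from the slide: |RTT - Estimate| / RTT."""
    return abs(rtt - estimate) / rtt


# RTT samples for the A-B link from the "Latencies are not Static" slide.
mp = MovingPercentileFilter(h=4, p=25)
filtered = [mp.update(rtt) for rtt in [60, 60, 59, 1000, 70, 60]]
# filtered == [60, 60, 59, 59, 59, 59]: the 1000 ms outlier never reaches the output
```

A low percentile suppresses spikes such as the 1000ms sample, while the short window lets the filter follow a sustained change within a few samples.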

  41. Latency Filter in the Big Picture [Figure labels: Simple Thresholds, Sliding Windows]

  42. Latency Filter in Practice • 226 PlanetLab nodes (coords in 3D space) • [Video panels: Raw Coordinates vs. Latency Filter (h=4, p=25%)] • Latency filters eliminate outliers that cause distortions of many coords all at once (e.g., minute 38 of the video)

  43. Outline • Network Coordinates • Two Practical Problems • Latencies are not static • Changing coordinates is expensive • Proposed Solutions • Latency Filter • Update Filter

  44. Problem #2: Changing Coordinates is Expensive • Frequent coord change, even with Latency Filter • App-specific cost • e.g., cascading heavyweight process migration in streaming DBs • Most apps would prefer to be notified only when significant change occurs • Is it possible to tell apps less frequently and retain high accuracy?

  45. Outline • Network Coordinates • Two Practical Problems • Latencies are not static • Changing coordinates is expensive • Proposed Solutions • Latency Filter • Update Filter

  46. Solution #2: Update Filter (Problem: Changing Coordinates is Expensive) • Distinguish system-level coordinates Cs from application-level coordinates Ca [Figure labels: Simple Thresholds, Sliding Windows]

  47. Solution #2: Window-based Update Filter • Keep history of recent coordinates • Divide history into two windows (sets): current (newest) and start (oldest) • When current and start diverge (by some metric), update application with new coordinate • Two Metrics • Local Relative Distance • Energy

  48. Update Filters: Local Relative Distance • Remember nearest known neighbor • Add coords to start and current windows • Compare centroids of two windows [Figure: node A with nearest neighbor B at distance dmin; coordinates C0, C1, C2, C3]

  49. Update Filters: Local Relative Distance • Remember nearest known neighbor • Add coords to start and current windows • Compare centroids of two windows [Figure: nearest neighbor B at distance d; Start Window Ws and Current Window Wc over coordinates C0 to C4]

  50. Update Filters: Local Relative Distance • Remember nearest known neighbor • Add coords to start and current windows • Compare centroids of two windows • If |Centroid(Ws) - Centroid(Wc)| > d x e, update the application with the new coordinate [Figure: nearest neighbor B at distance d; Start Window Ws and Current Window Wc over coordinates C0 to C5]
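The window comparison can be sketched as follows. Several details are assumptions not given in the slides: both windows have the same fixed size, the first coordinates fill the start window and later ones slide through the current window, `d` is measured from the start-window centroid to the nearest neighbor, and on an update the current window becomes the new start window. The name `RelativeUpdateFilter` and the default threshold `e=0.5` are hypothetical.

```python
import math
from collections import deque


def centroid(points):
    """Arithmetic mean of a collection of equal-dimension coordinates."""
    pts = list(points)
    return tuple(sum(p[i] for p in pts) / len(pts) for i in range(len(pts[0])))


class RelativeUpdateFilter:
    """Tell the application about a new coordinate only after significant drift."""

    def __init__(self, window_size=4, e=0.5):
        self.window_size = window_size
        self.e = e                                  # threshold (assumed value)
        self.start = []                             # oldest coordinates (Ws)
        self.current = deque(maxlen=window_size)    # newest coordinates (Wc)

    def observe(self, system_coord, nearest_neighbor):
        """Feed one system-level coordinate; return a new application-level
        coordinate if an update is triggered, else None."""
        if len(self.start) < self.window_size:
            self.start.append(system_coord)
            return None
        self.current.append(system_coord)
        if len(self.current) < self.window_size:
            return None
        cs, cc = centroid(self.start), centroid(self.current)
        d = math.dist(cs, nearest_neighbor)         # local scale
        if math.dist(cs, cc) > d * self.e:
            self.start = list(self.current)         # restart from the new position
            self.current.clear()
            return cc
        return None
```

Scaling the threshold by the distance to the nearest neighbor makes the test relative: a shift of a few milliseconds matters for a node in a dense cluster but not for an isolated one.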
