
Paper Presented by Nicholis Bufmack

Constructing Internet Coordinate System Based on Delay Measurement, by Hyuk Lim, Jennifer C. Hou, and Chong-Ho Choi. Introduction. The Problem: Estimating network distances between arbitrary Internet hosts.



Presentation Transcript


  1. Constructing Internet Coordinate System Based on Delay Measurement. Hyuk Lim, Jennifer C. Hou, and Chong-Ho Choi. Paper Presented by Nicholis Bufmack - CS 622

  2. Introduction • The Problem: Estimating network distances between arbitrary Internet hosts. • The Solution: Represent the locations of Internet hosts in a Cartesian coordinate system. • Topology information proves useful for nearby server selection, overlay network construction, routing path construction, and peer-to-peer computing.

  3. Estimating Distances Between Hosts • Construct the network topology without direct measurement between hosts. • Several network properties may be used: bandwidth, round-trip time (RTT), and packet loss. • The framework relies on a common architecture of beacon nodes. • Hosts only determine their distance to the beacon nodes.

  4. Other Approaches • IDMaps: uses the distances to beacon nodes to represent the location of a host. • GNP: transforms the original distance data space into a coordinate system and uses that coordinate system to represent the location of a host. • GNP is superior, but flawed: there is no guarantee that a host has a unique coordinate.

  5. Related Work – IDMaps (Internet Distance Map Service) • Developed by Francis, et al. – each beacon measures the distance to IP address prefixes close to itself, and a spanning tree algorithm is then used to find the shortest distance between measured hosts. • Does not analyze delay measurements or infer network topology. • Performance depends heavily on the placement and number of beacons: a small number of dispersed beacons gives poor performance.

  6. Related Work – GNP (Global Network Positioning) • Developed by Ng – represents the location of each host in a geometric space. • The distance between hosts is defined by a geometric function. • Has the advantage of being able to extract network topology information from measured network distances. • Has the disadvantage of not ensuring that each host will have a unique coordinate.

  7. ICS – Internet Coordinate System • Infers the network topology based on delay measurement. • Estimates the distance between hosts without direct measurement. • Less sensitive to the distance metric used to represent topological information. • Uses a smaller set of uncorrelated bases to represent the Cartesian space.
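Once two hosts have coordinates, the distance between them can be estimated from the coordinates alone. The slides only say the space is Cartesian, so the sketch below assumes plain Euclidean distance; the function name `estimated_distance` and the sample coordinates are hypothetical.

```python
import math


def estimated_distance(coord_a, coord_b):
    """Estimate the network distance between two hosts from their coordinates.

    Assumption: the estimate is the Euclidean distance in the coordinate space.
    """
    if len(coord_a) != len(coord_b):
        raise ValueError("coordinates must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coord_a, coord_b)))


# Hypothetical 2-dimensional coordinates for two hosts.
host_a = [3.0, 0.0]
host_b = [0.0, 4.0]
print(estimated_distance(host_a, host_b))  # 5.0
```

No probe is exchanged between the two hosts here: each one only needs its own coordinate, which is the point of the "without direct measurement" bullet.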

  8. Base Correlation • Principal component analysis (PCA) is used to reduce a large number of (possibly) correlated variables to a smaller set of uncorrelated ones. • The principal components form a collection of orthogonal directions, each representing a direction of maximum variance. • Singular value decomposition (SVD) is used to determine each principal component.
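For reference, the singular value decomposition of an $m \times m$ data matrix can be written as below. The symbols $D$, $\Sigma$, $V$, and $\sigma_j$ are notation assumed here for illustration; the slide itself does not spell the factorization out.

```latex
D = U \Sigma V^{T}, \qquad U^{T} U = I, \qquad V^{T} V = I, \qquad
\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_m), \quad
\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m \ge 0
```

The columns of $U$ are mutually orthogonal, and the ordering of the singular values ranks them from the direction that captures the most variation in the data to the direction that captures the least.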

  10. Effect of Distance Metrics on ICS • The independence of ICS from the distance metric used is a consequence of PCA. • Representing the topological information as orthogonal vectors that maximize variance removes the reliance on the details of the underlying data set.

  11. How Small Should the Number of Dimensions Be? • Usually determined from the cumulative percentage of variance that the selected principal components contribute. • A threshold is established (80%) and compared against the cumulative percentage • $T_k = 100 \times \left(\sum_{j=1}^{k} \mathrm{var}_j\right) / \left(\sum_{j=1}^{m} \mathrm{var}_j\right)$ • $k$ is the number of principal components needed for the cumulative variance to reach the established threshold. • It is the number of dimensions in the new coordinate system.
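The threshold rule on this slide can be sketched in a few lines. This is a minimal illustration, assuming the per-component variances are already known; the function name `choose_dimension` and the sample variances are hypothetical.

```python
def choose_dimension(variances, threshold=80.0):
    """Return the smallest k whose cumulative variance percentage T_k
    reaches the threshold (in percent)."""
    ordered = sorted(variances, reverse=True)  # largest component first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    running = 0.0
    for k, var in enumerate(ordered, start=1):
        running += var
        if 100.0 * running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical variances: T_1 = 50, T_2 = 75, T_3 = 90 (percent).
print(choose_dimension([50, 25, 15, 7, 3]))  # 3
```

With the 80% threshold from the slide, two components (75%) are not enough, so the coordinate system in this made-up example would have three dimensions.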

  12. Definitions • Raw Distance Space: m hosts measure the RTT to the other hosts using ping or traceroute. • The coordinate of a host $H_i$ in the m-dimensional system is • $d_i = [d_{i1}, \ldots, d_{im}]^T$ • $d_{ij}$ need not equal $d_{ji}$ because the forward and reverse paths may be different. • The overall m x m matrix of distances, where column i corresponds to host $H_i$, is • $D = [d_1, \ldots, d_m]$
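The definitions above can be made concrete with a small sketch that assembles the distance matrix column by column. The input layout (`rtt[i][j]` is the RTT that host i measured to host j) and the sample values are assumptions made for illustration.

```python
def build_distance_matrix(rtt):
    """Build D so that column i is the distance vector d_i of host i.

    Assumption: rtt[i][j] is the RTT measured by host i to host j.
    D is returned as a list of rows, so D[j][i] == rtt[i][j].
    """
    m = len(rtt)
    if any(len(row) != m for row in rtt):
        raise ValueError("expected an m x m table of measurements")
    return [[rtt[i][j] for i in range(m)] for j in range(m)]


def column(D, i):
    """Extract column i of D, i.e. the distance vector of host i."""
    return [row[i] for row in D]


# Hypothetical RTTs in milliseconds; note rtt[0][1] != rtt[1][0].
rtt = [
    [0, 10, 20],
    [12, 0, 15],
    [19, 16, 0],
]
D = build_distance_matrix(rtt)
print(column(D, 0))  # [0, 10, 20]
```

The asymmetric sample values mirror the remark that forward and reverse paths may differ, so D is in general not a symmetric matrix.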

  14. Overview • Beacon nodes periodically measure RTTs to the other beacon nodes and construct a coordinate system. • The coordinates of the beacon nodes are calculated from the raw distance space. • An ordinary host determines its location by measuring the delay to all or a subset of the beacon nodes to obtain a distance vector. • The location is determined by multiplying the distance vector by a transformation matrix.

  15. Calculating the Coordinates of the Beacon Nodes • Each beacon node independently determines its distance vector $d_i$. • The information is aggregated to form $D$. • Apply principal component analysis (PCA) to obtain the transformation matrix $U$, which holds the orthogonal bases of the new subspace. • Determine the dimension of the coordinate system using the cumulative percentage of variation. • Calculate the transformation matrix $U_n$.
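The steps above can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' implementation: SVD is applied directly to the distance matrix without mean-centering, the variance of each component is taken as its squared singular value, and the reduced matrix is taken to be the first n columns of U. The function name and the sample matrix are hypothetical.

```python
import numpy as np


def beacon_transformation(D, threshold=80.0):
    """Return (U_n, beacon_coords, n) for an m x m beacon distance matrix D."""
    D = np.asarray(D, dtype=float)
    # SVD: D = U @ diag(s) @ Vt, singular values s sorted in descending order.
    U, s, _ = np.linalg.svd(D)
    variances = s ** 2  # assumption: variance ~ squared singular value
    cumulative = 100.0 * np.cumsum(variances) / variances.sum()
    # First index whose cumulative percentage reaches the threshold, as a count.
    n = int(np.searchsorted(cumulative, threshold)) + 1
    n = min(n, len(s))  # guard against round-off when threshold is 100
    U_n = U[:, :n]  # assumption: keep the first n basis vectors
    beacon_coords = U_n.T @ D  # column i is the coordinate of beacon i
    return U_n, beacon_coords, n


# Hypothetical RTTs (ms) between four beacons forming two nearby pairs.
D = [
    [0, 10, 40, 45],
    [10, 0, 38, 42],
    [40, 38, 0, 8],
    [45, 42, 8, 0],
]
U_n, beacon_coords, n = beacon_transformation(D)
print(n, beacon_coords.shape)
```

Because the columns of `U_n` are orthonormal, projecting onto them keeps the directions that carry most of the variation in the measured distances and discards the rest.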

  16. Determining the Coordinates of a Host • The host obtains the list of beacon nodes and the transformation matrix $U_n$. • It measures the network distances to all beacon nodes using the RTT from ping or traceroute. • $I_a = [I_{a1}, \ldots, I_{am}]^T$ • It calculates the coordinate $x_a$ by multiplying the measured distance vector $I_a$ by the transformation matrix. • $x_a = U_n^T I_a$
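The last step on this slide is a single matrix-vector product, sketched here in plain Python. The transformation matrix is assumed to be stored as m rows of n entries, and the sample matrix and measurements are made up for illustration.

```python
def host_coordinate(U_n, measured):
    """Compute x_a = U_n^T * measured.

    U_n: m x n transformation matrix as a list of m rows (assumed layout).
    measured: the host's m measured distances to the beacon nodes.
    """
    m = len(U_n)
    if len(measured) != m:
        raise ValueError("need one measurement per beacon node")
    n = len(U_n[0])
    # Entry j of the result is the dot product of column j of U_n with measured.
    return [sum(U_n[i][j] * measured[i] for i in range(m)) for j in range(n)]


# Hypothetical transformation matrix for m = 3 beacons and n = 2 dimensions.
U_n = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]
measured = [30.0, 40.0, 50.0]  # hypothetical RTTs in milliseconds
print(host_coordinate(U_n, measured))  # [30.0, 40.0]
```

The host only needs the beacon list, the transformation matrix, and its own measurements, so the computation is cheap and runs locally.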

  17. Empirical Study • Compared against IDMaps and GNP. • Used two datasets: • National Laboratory for Applied Network Research (NLANR) – contains RTT, packet loss, topology, and on-demand throughput measurements from 113 monitors. • GT-ITM topology generator – synthetic generator using 3 <= m <= 30 beacon nodes. • Used proximity as the comparison metric, defined as the distance from the closest calculated host to the actual closest host.

  18. Comparison in Terms of Estimation Error • IDMaps has a large estimation error that decreases with the number of beacon nodes. • GNP shows an increase in estimation error as the number of beacon nodes increases. • ICS has a lower estimation error than IDMaps and outperforms GNP for # hosts >= 15.

  20. Effect of the Coordinate System Dimension on the Performance • The estimation error of ICS is largest when n = 2 and improves as n increases, leveling off when n >= 6. • The estimation error of GNP is smallest when n = 4 and much larger when n = 6. • GNP is slightly better when 5 <= m <= 16. • ICS does not show a significant increase in estimation error when the number of measurements is small; GNP does.

  22. Comparison Between ICS and GNP in Terms of Computational Costs • As the number of beacon nodes increases, the time GNP needs to calculate the coordinates of the beacon nodes increases exponentially. • ICS has a maximum computation time of approximately 17.1 ms, compared to GNP’s maximum of 884.06 s (~ 15 minutes).

  24. N-hierarchical Network Topology • Network topology is a tree. • Each level represents a distance from the root node. • Higher levels also represent much more complexity in the network topology, even if the number of nodes happens to be the same (as is the case for the empirical study).

  25. Effect of Topology Complexity on Performance • In two-level hierarchical topology, ICS performed better than GNP and IDMaps. • In three-level hierarchical topology, ICS gave almost the same performance. • In both instances, estimation error for ICS remained relatively constant.

  27. Enhancements • ICS can be enhanced by clustering. • A well-distributed, carefully selected placement of beacon nodes increases performance across the board. • The performance enhancement comes from the fact that the basis of the coordinate system is the measurements between beacon nodes. Node distance is measured in-cluster and between clusters. • Clustering increases the performance of partial measurements, where only a limited number of beacon nodes are used by a host to calculate $I_a$.

  29. Conclusion • ICS can effectively extract topological information from delay measurements between beacon hosts. • A coordinate system of much smaller dimension can be extracted from the raw data space using ICS, enabling end hosts to obtain a unique location with a small number of measurements. • ICS makes accurate estimates, is much less computationally expensive, and is much less dependent on the number of beacon nodes, the coordinate system dimension, or the complexity of the topology.

  30. Assessment of ICS • ICS provides a promising solution for determining network distances (the nearest-server problem), e.g. for Napster and online gaming. • Geographical distance does not equal network distance, due to routing policies and network connectivity (PlanetLab). • Could be applicable to the construction of peer-to-peer systems and to routing in mobile ad-hoc networks, where the number of hosts can change (Promise P2P).

  31. Future Work • Visual tools need to be developed. • More research needs to be done on the ideal number of dimensions necessary to represent the Internet (6-8?).
