
Internet Measurements

Internet Measurements. CS 401/601 Computer Network Systems, Mehmet Gunes. The Internet is a web of interconnected networks that grows with no central authority. Autonomous Systems optimize local communication efficiency. The building blocks are engineered and studied in depth.


Presentation Transcript


  1. Internet Measurements CS 401/601 Computer Network Systems Mehmet Gunes

  2. Internet • Web of interconnected networks • Grows with no central authority • Autonomous Systems optimize local communication efficiency • The building blocks are engineered and studied in depth • The global entity has not been characterized • Most real-world complex networks have non-trivial properties • Global properties cannot be inferred from local ones • Engineered with large technical diversity • Ranges from local campuses to transcontinental backbone providers

  3. Role of Internet Directories and Databases • Address registries • Domain Name System (DNS) • Internet Address and Routing Registries • Internet Assigned Numbers Authority (IANA) • Internet Routing Registry • Clearinghouse for AS number mapping • Regional Internet Registries (RIR)

  4. Role of Internet Directories and Databases

  5. Internet Measurements • The need for Internet measurements arises from commercial, social, and technical issues • Realistic simulation environment for developed products • Improve network management • Robustness with respect to failures/attacks • Comprehend spreading of worms/viruses • Know social trends in Internet use • Scientific discovery • Scale-free (power-law), Small-world, Rich-club, Disassortativity, …

  6. Challenges to measurement “Poor Observability” • Reasons for this: • Core simplicity • Layered architecture • Hidden pieces • Administrative barriers

  7. Internet Measurements are anything but straightforward… • Internet measurement is key to designing the next generation communication network • Fundamental design principles of the current Internet make it hard to measure various aspects of it • Preliminary research has resulted in a set of basic tools and methods to measure aspects such as topology and traffic • There is still a lot of ground to cover in this direction

  8. Where Can Measurements Be Made? IXP

  9. Measurement Types

  10. Topology Measurements

  11. Properties to Measure • Topology Properties • Autonomous System (AS) • Point of Presence (PoP) • Router • Interface

  12. Longitudinal comparison Sources: 1971 - "Casting the Net", page 64; 1980 - http://mappa.mundi.net/maps/maps_001/ http://personalpages.manchester.ac.uk/staff/m.dodge/cybergeography/atlas/historical.html

  13. Internet Topology CAIDA 2006

  14. Internet Topology Measurement CAIDA 2006

  15. Internet Topology Measurement CAIDA 2006

  16. IPv4 address space (2010) • ANT Census Data: researchers have been collecting data about the Internet address space • ~3.5 B IPs, ~250 M replies

  17. Active Measurement Tools • Methods that involve adding traffic to the network for the purposes of measurement • Ping: sends ICMP ECHO_REQUEST and captures ECHO_REPLY • Useful for measuring RTTs • Only the sender needs to be under experiment control • One-Way Active Measurement Protocol (OWAMP): a daemon running on the target listens for and records probe packets sent by the sender • Useful for measuring one-way delay • Requires both sender and receiver to be under experiment control • Requires synchronized clocks or a method to remove clock offset
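
The difference between round-trip and one-way measurement can be illustrated with a small timestamp calculation. The sketch below is not part of ping or OWAMP; the function names and the four-timestamp exchange (an NTP-style offset estimate that assumes symmetric path delay) are assumptions made for illustration only.

```python
def round_trip_time(t1, t2, t3, t4):
    """RTT seen by the sender, excluding the receiver's processing time.

    t1: probe sent (sender clock)     t2: probe received (receiver clock)
    t3: reply sent (receiver clock)   t4: reply received (sender clock)
    """
    return (t4 - t1) - (t3 - t2)


def estimate_clock_offset(t1, t2, t3, t4):
    """Receiver clock minus sender clock, assuming symmetric path delay."""
    return ((t2 - t1) + (t3 - t4)) / 2


def one_way_delay(t1, t2, offset):
    """Forward delay after removing the receiver's clock offset."""
    return (t2 - offset) - t1


# Hypothetical timestamps in milliseconds: 10 ms each way,
# receiver clock 5 ms ahead, 1 ms spent at the receiver.
t1, t2, t3, t4 = 0.0, 15.0, 16.0, 21.0
offset = estimate_clock_offset(t1, t2, t3, t4)
print(round_trip_time(t1, t2, t3, t4), offset, one_way_delay(t1, t2, offset))
```

The RTT needs only the sender's clock, whereas the one-way delay is meaningless until the offset between the two clocks has been removed.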

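  18. Probing • Direct probing • Indirect probing [Figure: in direct probing, a vantage point sends a probe addressed to IP_D with TTL=64; in indirect probing, it sends probes addressed to IP_D with TTL=1, TTL=2, ... along the path A, B, C, D]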

  19. Traceroute • Useful for determining the path from a source to a destination • Uses the TTL (Time To Live) field in the IP header in a clever but unintended way • Large scale measurement systems use traceroute to discover network topology

  20. Traceroute • Probe packets are carefully constructed to elicit the intended response from a probe destination • traceroute probes all nodes on a path towards a given destination • TTL-scoped probes obtain ICMP error messages from routers on the path • Each ICMP message carries the IP address of an intermediate router as its source • Merging end-to-end path traces yields the network map [Figure: a vantage point S sends probes with TTL=1, 2, 3, 4 toward the destination; routers A, B, C, D respond from IP_A, IP_B, IP_C, IP_D]

  21. IP Header and the TTL field • IP datagram header, 32 bits per row • ver: IP protocol version number • head. len: header length (bytes) • type of service: “type” of data • length: total datagram length (bytes) • 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly • time to live: max number of remaining hops (decremented at each router) • upper layer: upper layer protocol to deliver payload to • Internet checksum • 32 bit source IP address • 32 bit destination IP address • Options (if any): e.g., timestamp, record route taken, specify list of routers to visit • data (variable length, typically a TCP or UDP segment)

  22. TTL normal usage • TTL is initialized by the sender and decremented by one each time the packet passes through a router • If it reaches zero before reaching the destination, IP protocol requires that the packet be discarded and an error message be sent back to the sender • Error message is an ICMP “time exceeded” packet

  23. Traceroute Problem • Suppose the path between A and D is to be determined using traceroute [Figure: routers A, B, C, D, X, Y]

  24. Traceroute Process • A sends a probe with Dest = D, TTL = 1 • B: “time exceeded”

  25. Traceroute Process • A sends a probe with Dest = D, TTL = 2 • C: “time exceeded”

  26. Traceroute Process • A sends a probe with Dest = D, TTL = 3 • D: “echo reply”
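
The three probing steps can be reproduced with a toy simulation. This is a minimal sketch, not a real traceroute: no packets are sent, the path A -> B -> C -> D is hard-coded from the example, and the function names `send_probe` and `traceroute` are hypothetical.

```python
def send_probe(path, ttl):
    """Simulate one TTL-scoped probe along `path`.

    `path` lists the hops after the source, ending with the destination.
    Each router decrements the TTL; the router that brings it to zero
    discards the probe and answers with "time exceeded".
    """
    remaining = ttl
    for router in path[:-1]:
        remaining -= 1
        if remaining == 0:
            return router, "time exceeded"
    # The probe survived every intermediate router and reached the destination.
    return path[-1], "echo reply"


def traceroute(path, max_ttl=30):
    """Raise the TTL one step at a time until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = send_probe(path, ttl)
        hops.append(responder)
        if message == "echo reply":
            break
    return hops


print(traceroute(["B", "C", "D"]))
```

Each probe reveals exactly one hop, so a path of n hops costs at least n probes.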

  27. Internet Topology Measurement [Figure: Internet2 backbone with routers S, N, W, C, U, K, L, A, H and their interfaces (s.2, n.1, h.4, ...), showing a trace to NY and a trace to Seattle from host d]

  28. Internet Topology Measurement • Traces • d - H - L - S - e • d - H - A - W - N - f • e - S - L - H - d • e - S - U - K - C - N - f • f - N - C - K - H - d • f - N - C - K - U - S - e [Figure: Internet2 backbone with hosts d, e, f and routers S, N, W, C, U, K, L, A, H]
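
Merging these six traces into a map amounts to collecting every pair of adjacent hops as an undirected link. The sketch below assumes the traces are available as " - " separated strings; the helper name `merge_traces` is hypothetical.

```python
def merge_traces(traces):
    """Build an undirected graph (nodes, links) from end-to-end path traces."""
    nodes, links = set(), set()
    for trace in traces:
        hops = [hop.strip() for hop in trace.split("-")]
        nodes.update(hops)
        # Consecutive hops on a trace are taken to be directly connected.
        for a, b in zip(hops, hops[1:]):
            links.add(frozenset((a, b)))
    return nodes, links


traces = [
    "d - H - L - S - e",
    "d - H - A - W - N - f",
    "e - S - L - H - d",
    "e - S - U - K - C - N - f",
    "f - N - C - K - H - d",
    "f - N - C - K - U - S - e",
]
nodes, links = merge_traces(traces)
print(len(nodes), "nodes,", len(links), "links")
```

Using unordered pairs means that H - L seen in one direction and L - H seen in the other count as a single link.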

  29. Challenges • Infrastructural Issues • Sampling • Vantage Points and Destination List • Probing Overhead • Inter- and Intra-monitor Redundancy • Responsiveness of Routers • ICMP, UDP, TCP • Load Balancing Routers • Per destination, per flow, per packet

  30. Traceroute issues • Path Asymmetry • Destination -> Source need not retrace Source -> Destination • Unstable Paths and False Edges • Aliases • Measurement Load

  31. Unstable Paths and False Edges • Probe with Dest = D, TTL = 1 • B: “time exceeded” • Probe with Dest = D, TTL = 2 • Y: “time exceeded” • Inferred path: A -> B -> Y [Figure: routers A, B, C, D, X, Y]
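
The false edge can be reproduced in a few lines. The sketch below assumes a topology with two paths from A to D (A - B - C - D and A - X - Y - D) and assumes the route changes between the first and second probe; the link set is an illustrative guess, since the figure itself is not recoverable.

```python
# Assumed ground-truth links of the example topology (hypothetical).
true_links = {
    frozenset(("A", "B")), frozenset(("B", "C")), frozenset(("C", "D")),
    frozenset(("A", "X")), frozenset(("X", "Y")), frozenset(("Y", "D")),
}

# Route in effect when each probe is sent: it changes after the first probe.
route_at_probe = {
    1: ["A", "B", "C", "D"],
    2: ["A", "X", "Y", "D"],
}

# The router at distance `ttl` from the source answers the probe.
inferred_path = ["A"] + [route_at_probe[ttl][ttl] for ttl in (1, 2)]
inferred_links = [frozenset(pair) for pair in zip(inferred_path, inferred_path[1:])]
false_links = [sorted(link) for link in inferred_links if link not in true_links]

print(inferred_path, false_links)
```

Traceroute stitches together responses from different probes, so it reports B and Y as neighbors even though no such link exists.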

  32. Topology Sampling: Issues • Sampling to discover networks • Infer characteristics of the topology • Different studies considered • Effect of sample size • Sampling bias • Path accuracy • Sampling approach • Utilized protocol • ICMP echo request • TCP SYN • UDP port unreachable • ~ 10% of routers are unresponsive

  33. Measurement Load • Traceroute imposes considerable load on network links when used for large-scale topology discovery • Optimizations reduce this load considerably • If a single source is used, a better approach than probing from source to destination is to retrace from destination to source • If multiple sources and multiple destinations are used, sharing information among them brings down the load considerably
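
One way to realize the destination-to-source idea is to probe each path backward and stop as soon as a hop already seen in an earlier trace is reached. The sketch below is a simulation over known paths, not a measurement tool; it assumes that paths from a single monitor form a tree (so everything between a known hop and the monitor is already known), and the router names are hypothetical.

```python
def forward_probe_count(paths):
    """Classic traceroute: one probe per hop of every path."""
    return sum(len(path) for path in paths)


def backward_probe_count(paths):
    """Probe each path from the destination toward the monitor.

    Stops at the first hop that an earlier trace already discovered.
    Returns the number of probes sent and the set of hops discovered.
    """
    seen = set()
    probes = 0
    for path in paths:
        for hop in reversed(path):
            probes += 1
            if hop in seen:
                break  # the rest of the path toward the monitor is known
            seen.add(hop)
    return probes, seen


# Hypothetical paths from one monitor to three destinations.
paths = [
    ["r1", "r2", "r3", "d1"],
    ["r1", "r2", "r3", "d2"],
    ["r1", "r2", "r4", "d3"],
]
probes, seen = backward_probe_count(paths)
print(forward_probe_count(paths), probes, len(seen))
```

In this toy case the backward strategy discovers the same seven hops with 9 probes instead of 12, because the hops near the monitor are not probed again for every destination.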

  34. Intra-monitor redundancy Destination 2 Destination 1 Destination 3 Monitor 1

  35. Inter-monitor redundancy Destination 1 Monitor 2 Monitor 1 Monitor 3

  36. y y S S L 1 2 H H x x Unresponsive Routers • Unresponsive routers do not respond to traceroute probes and appear as  in traceroute output • Same router may appear as  in multiple traces. y y: S – L – H – x y: S –  – H – x S L H x: H – L – S – y x: H –  – S – y x

  37. Unresponsive Router Resolution f Internet2 backbone e S N C W U K L A H • Traces • d -  - L - S - e • d -  - A - W -  - f • e - S - L -  - d • e - S - U -  - C -  - f • f -  - C -  -  - d • f -  - C -  - U - S - e d

  38. Common Structures due to ARs  y1 y1   A C x y2  y3 y3 Parallel -substring A C x y2 D w A A D x x w A D x w  E  z  C y C y C y F v Clique Complete Bipartite Star A D x w A D x w  C D E w  y z C E y z A x   E z E z      E  z     C y  F v

  39. IP Alias Resolution • Each interface of a router has an IP address • A router may respond with different IP addresses to different queries • Alias resolution is the process of grouping the interface IP addresses of each router into a single node • Inaccuracies in alias resolution may result in a network map that • includes artificial links/nodes • misses existing links [Figure: Denver router with interfaces .5, .7, .13, .18, .33]

  40. IP Alias Resolution • Traces • d - h.4 - l.3 - s.2 - e • d - h.4 - a.3 - w.3 - n.3 - f • e - s.1 - l.1 - h.1 - d • e - s.1 - u.1 - k.1 - c.1 - n.1 - f • f - n.2 - c.2 - k.2 - h.2 - d • f - n.2 - c.2 - k.2 - u.2 - s.3 - e [Figure: Internet2 backbone with hosts d, e, f and routers S, N, C, W, U, K, L, A, H, each shown with its interfaces (s.1, s.2, s.3, ...)]
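
Once pairs of aliases are known, grouping interfaces into routers is a connected-components problem. The sketch below uses a small union-find for that step; it assumes the alias pairs are already given (how they are detected is a separate question), and the function name `group_aliases` is hypothetical.

```python
def group_aliases(alias_pairs):
    """Group interface addresses into routers from pairwise alias findings."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in alias_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    routers = {}
    for interface in parent:
        routers.setdefault(find(interface), set()).add(interface)
    return list(routers.values())


# Hypothetical alias findings for interfaces that appear in the traces.
pairs = [("h.1", "h.4"), ("h.2", "h.4"), ("s.1", "s.2"), ("s.3", "s.1")]
for router in sorted(sorted(group) for group in group_aliases(pairs)):
    print(router)
```

Aliasing is transitive, so h.1 and h.2 end up on the same router even though they were never tested against each other directly.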

  41. IP Alias Resolution Approaches • Source IP Address Based Method • Relies on a particular implementation of ICMP error generation • IP Identification Based Method (ally) • Relies on a particular implementation of the IP identifier field • Many routers ignore direct probes • DNS Based Method • Relies on similarities in the host name structures, e.g., sl-bb21-lon-14-0.sprintlink.net and sl-bb21-lon-8-0.sprintlink.net • Works when a systematic naming is used • Record Route Based Method • Depends on router support for IP record route processing [Figure: a probe with Dest = A answered from address B; probes with Dest = A and Dest = B answered with A, ID=100; B, ID=99; B, ID=103]
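
The IP identification idea can be sketched as a simple ordering test: if two addresses share one IP ID counter, replies collected alternately from them should carry closely spaced, increasing IDs. The code below is an illustration of that idea, not the ally tool itself; the function name and the `max_gap` threshold are assumptions.

```python
ID_SPACE = 1 << 16  # the IP identification field is 16 bits wide


def share_ip_id_counter(first_id, second_id, third_id, max_gap=200):
    """Return True if three IP IDs look like samples of one shared counter.

    The IDs are taken from replies to probes sent in the order
    address 1, address 2, address 1. Differences are computed modulo
    2**16 so that a counter wrap-around is handled.
    """
    step1 = (second_id - first_id) % ID_SPACE
    step2 = (third_id - second_id) % ID_SPACE
    return 0 < step1 <= max_gap and 0 < step2 <= max_gap


print(share_ip_id_counter(99, 100, 103))      # IDs from the slide, in order
print(share_ip_id_counter(65534, 2, 5))       # counter wrapped around
print(share_ip_id_counter(100, 40000, 103))   # unrelated counters
```

The test only works for routers that use a single sequential counter and answer direct probes, which is why the method depends on a particular implementation.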

  42. Subnet Inference • Subnet resolution • Identify IP addresses that are connected over the same medium • Improve the quality of the resulting topology map [Figure: routers A, B, C, D with interfaces IP1, IP2, IP3, shown as underlying topology, observed topology, and inferred topology]

  43. Subnet Inference Approach • Subnet-level Internet mapping: subnet inference [Figure: IP addresses observed within 129.110.0.0/16 from a vantage point (V.P.), annotated with hop distances: 129.110.1.1, 129.110.1.2, 129.110.2.0, 129.110.2.1, 129.110.4.1, 129.110.4.83, 129.110.4.217, 129.110.12.1, 129.110.12.2, 129.110.12.6, 129.110.17.1, 129.110.17.135, 129.110.219.1; candidate subnets include 129.110.1.0/30, 129.110.1.0/31, 129.110.2.0/30, 129.110.2.0/31, 129.110.4.0/24, 129.110.6.0/28, 129.110.12.0/29, 129.110.17.0/24, 129.110.219.0/24]

  44. Analytical IP Alias Resolution • Aliases • 129.110.5.1 - 206.223.141.74 • 206.223.141.73 - 206.223.141.69 • 206.223.141.70 - 198.32.8.33 • … [Figure: path traces between UTD and MIT in both directions, with some hops giving no response]
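
A first step toward subnet inference is to find the smallest prefix that covers a group of observed addresses. The sketch below does only that, using Python's standard `ipaddress` module; it ignores the hop-distance checks shown in the figure, and the function name `smallest_covering_subnet` is hypothetical.

```python
import ipaddress


def smallest_covering_subnet(addresses):
    """Return the longest-prefix IPv4 network that contains every address."""
    values = [int(ipaddress.IPv4Address(a)) for a in addresses]
    low, high = min(values), max(values)
    # Bits in which the lowest and highest address differ cannot be prefix bits.
    host_bits = (low ^ high).bit_length()
    prefix_len = 32 - host_bits
    base = (low >> host_bits) << host_bits
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(base)}/{prefix_len}")


print(smallest_covering_subnet(["129.110.12.1", "129.110.12.2", "129.110.12.6"]))
print(smallest_covering_subnet(["129.110.2.0", "129.110.2.1"]))
print(smallest_covering_subnet(["129.110.4.1", "129.110.4.83", "129.110.4.217"]))
```

A covering prefix is only a candidate: addresses that fall in the same prefix are not necessarily on the same medium, so further evidence such as hop distance is needed before accepting it as a subnet.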

  45. Sample AS backbones

  46. Geolocation • Given the network address of a target host, what is the host’s geographic location? • The answer to this is useful for a wide variety of social, economic and engineering purposes • The actual location of network infrastructure sheds light on how it relates to population, social organization and economic activity

  47. Geolocation methods • Name Based Geolocation • Extracting location details from ISPs' domain names • Location Databases • Delay Based Geolocation • Best Landmark • Constraint-based

  48. Landmark based geolocation • In the best landmark approach, the minRTT between each pair of identified landmarks is measured and stored • Then the same metric is calculated between the node in question and each of the landmarks • The landmark whose minRTT values best match those of the node is taken as the closest to the node
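
The matching step can be sketched as a nearest-neighbor search over minRTT vectors. The code below assumes Euclidean distance as the similarity measure, which the slide does not specify; the landmark names and RTT values are made up for illustration.

```python
import math


def best_landmark(target_rtts, landmark_rtts):
    """Pick the landmark whose minRTT vector is closest to the target's.

    target_rtts:   minRTT from the target to each landmark, in a fixed order.
    landmark_rtts: landmark name -> minRTT from that landmark to each
                   landmark, in the same order.
    """
    def distance(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    return min(landmark_rtts, key=lambda name: distance(target_rtts, landmark_rtts[name]))


# Hypothetical minRTT values in milliseconds, ordered (Reno, Dallas, Boston).
landmark_rtts = {
    "Reno":   [0.0, 35.0, 70.0],
    "Dallas": [35.0, 0.0, 40.0],
    "Boston": [70.0, 40.0, 0.0],
}
print(best_landmark([33.0, 4.0, 42.0], landmark_rtts))
```

The answer is always one of the landmarks, so the accuracy of this method is bounded by how densely landmarks are deployed.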

  49. Constraint based geolocation • The distances from the target to a sufficient number of fixed points are estimated, and the location is computed using multilateration • The same principle is used in GPS • However, Internet delay is affected by many factors (i.e., the relation between delay and distance is non-linear)
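
Because delay does not map linearly to distance, a measured RTT is better read as an upper bound on distance than as a distance. The sketch below turns each RTT into such a bound and checks whether a candidate location satisfies all of them; it assumes signals travel at roughly two thirds of the speed of light (about 200 km per millisecond), and the coordinates and function names are hypothetical.

```python
import math


def max_distance_km(rtt_ms, km_per_ms=200.0):
    """Upper bound on distance implied by an RTT (half the RTT, one way)."""
    return (rtt_ms / 2.0) * km_per_ms


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def satisfies_constraints(candidate, measurements):
    """True if the candidate lies within every landmark's distance bound.

    measurements: list of (landmark_lat, landmark_lon, rtt_ms).
    """
    lat, lon = candidate
    return all(
        great_circle_km(lat, lon, l_lat, l_lon) <= max_distance_km(rtt_ms)
        for l_lat, l_lon, rtt_ms in measurements
    )


# Hypothetical landmarks on the equator with their measured RTTs.
measurements = [(0.0, 0.0, 10.0), (0.0, 10.0, 12.0)]
print(satisfies_constraints((0.0, 5.0), measurements))
print(satisfies_constraints((0.0, -5.0), measurements))
```

Each landmark contributes a disc in which the target must lie; the feasible region is the intersection of those discs.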

  50. Passive Measurements • Methods that capture traffic generated by other users and applications • The Route Views repository collects BGP views (routing tables) from a large set of ASes • Similarly, OSPF LSAs can be captured and processed to generate router graphs within an AS
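
Collected BGP views can be turned into an AS-level graph by reading adjacent ASes off each AS path. The sketch below assumes the AS paths have already been extracted as space-separated strings of AS numbers and that AS sets are absent; the AS numbers come from the range reserved for documentation, and the function name `as_links` is hypothetical.

```python
def as_links(as_paths):
    """Extract undirected AS-level links from BGP AS paths."""
    links = set()
    for path in as_paths:
        hops = path.split()
        # Collapse AS path prepending (the same AS repeated back to back).
        collapsed = [asn for i, asn in enumerate(hops) if i == 0 or asn != hops[i - 1]]
        for a, b in zip(collapsed, collapsed[1:]):
            links.add(frozenset((a, b)))
    return links


# Hypothetical AS paths as they might appear in a routing table dump.
paths = [
    "64500 64501 64501 64502",
    "64500 64503 64502",
]
for link in sorted(sorted(l) for l in as_links(paths)):
    print(" - ".join(link))
```

No probe traffic is generated here: the graph is derived entirely from routing information that the ASes already exchange.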
