GEOHYBRID: A HIERARCHICAL APPROACH FOR ACCURATE AND SCALABLE GEOGRAPHIC LOCALIZATION
ITU-T Kaleidoscope 2010
Beyond the Internet? - Innovations for future networks and services
Dr. Bamba Gueye
Joint work with Ibrahima Niang and Bassirou Kasse
University Cheikh Anta Diop of Dakar
bamba.gueye@ucad.edu.sn
http://edmi.ucad.sn/~gueye
Motivations
• New class of location-aware applications
  • Web services: targeted advertising, locating cybercrime sources, restricted content delivery
  • Location-based security checks
• Geographic information about Internet routes
  • Analyzing the geographic behavior of network routing, Internet topology mapping
• Optimizing the decision-making process of a Grid Resource Broker
Problem statement
• IP-location mapping: Given the IP address of an Internet host, can we estimate its geographic location?
[Figure: autonomous systems AS1, AS2, AS3]
How to locate an Internet host?
• Passive measurement approach
  • Uses databases and exhaustive tabulation
  • Ex: GeoBytes, GeoURL, GeoIP
• Active measurement approach
  • Round-trip-time-based and/or traceroute-based
  • Ex: GeoPing, CBG, GeoBuD, GeoTrack, SarangWorld project
• Hybrid approach
  • Ex: GeoHybrid
Passive measurements approach
DNS-based
• Incorporating location information in DNS records
• RFC 1876
Whois-based approach
• Use location information recorded in the Whois database
• 141.152.24.9: where am I?
• Response = the location of the ISP Verizon (141.149.0.0/13), Reston, VA
Drawbacks
• Information recorded in the Whois database may be inaccurate or stale
• Large ISPs advertise only aggregate prefixes for reasons of scalability
• A single prefix can cover multiple locations [Feamster et al. IMC05, Gueye et al. PAM08]
Active measurement approach
DNS-based approach
• Observation:
  • Recognizable host: geographically meaningful names
  • Ex: bcr1-so-2-0-0.Paris.cw.net
• Use the reports of “traceroute”
• Location of a target host = location of the last recognizable router on the path
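The rule "location of a target host = location of the last recognizable router on the path" can be sketched as follows. This is only an illustration: the `CITY_TOKENS` table, the function names, and the example hop list are assumptions made here, not part of the original slides or of any existing tool.

```python
import re

# Hypothetical lookup table: hostname token -> (city, latitude, longitude).
CITY_TOKENS = {
    "paris": ("Paris", 48.86, 2.35),
    "london": ("London", 51.51, -0.13),
}

def locate_router(hostname):
    """Return (city, lat, lon) if any token of the hostname names a known city."""
    for token in re.split(r"[.\-]", hostname.lower()):
        if token in CITY_TOKENS:
            return CITY_TOKENS[token]
    return None

def locate_target(traceroute_hops):
    """Walk the path backwards (hops are ordered source -> target) and
    return the location of the last router with a recognizable name."""
    for hostname in reversed(traceroute_hops):
        location = locate_router(hostname)
        if location is not None:
            return location
    return None

# Illustrative path: only the middle hop carries a city name.
hops = ["gw.example.net", "bcr1-so-2-0-0.Paris.cw.net", "unnamed-42.example.org"]
print(locate_target(hops))
```

A simple token match like this inherits the weakness of the approach itself: it trusts whatever naming convention the operator chose.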
DNS-based approach drawbacks
• No rules for naming the routers [Rexford et al. USENIX06]
  • charlotte.ucsd.edu – San Diego, CA (not Charlotte, NC)
  • dnverng-kscyng.abilene.ucaid.edu – Denver, CO (not Kansas, KS)
• The last recognizable router can be located far from the target host
Active measurements approach
GeoPing-like [Padmanabhan et al. SIGCOMM01]
• The number of possible locations of a given host is equal to the number of landmarks (discrete space of answers)
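To make the "discrete space of answers" concrete, here is a minimal GeoPing-style sketch, assuming each landmark and the target are described by a vector of delays measured from the same probes, and that the answer is the landmark with the most similar vector. The delay values and names are invented for illustration.

```python
import math

def nearest_landmark(target_delays, landmark_delays):
    """Return the landmark whose delay vector is closest (Euclidean
    distance) to the target's delay vector."""
    return min(
        landmark_delays,
        key=lambda name: math.dist(target_delays, landmark_delays[name]),
    )

# Hypothetical RTTs in ms, measured from the same three probes.
landmark_delays = {
    "Paris": [5.0, 40.0, 90.0],
    "New York": [80.0, 45.0, 20.0],
    "Tokyo": [250.0, 230.0, 180.0],
}
target_delays = [12.0, 38.0, 85.0]

# The estimate can only ever be one of the landmark locations.
print(nearest_landmark(target_delays, landmark_delays))
```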
CBG: A continuous response space via multilateration
• Multilateration [Gueye et al. ToN06]
  • Estimates position using distance estimates from some fixed points
  • Similar to GPS
• Active measurements
  • From a set of landmarks to a target host
  • Select minimum RTT
• Transformation of RTT measurements into distance
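As a rough illustration of turning RTT measurements into a distance, the sketch below takes the minimum RTT and converts it into an upper bound on distance, assuming signals travel at about two thirds of the speed of light in vacuum. This fixed-speed conversion is a simplifying assumption; CBG itself calibrates the delay-to-distance relation per landmark.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # speed of light in vacuum
FIBER_FACTOR = 2.0 / 3.0               # assumed propagation factor in fiber

def rtt_to_distance_km(rtt_samples_ms):
    """Upper bound on the landmark-to-target distance.

    The minimum RTT is the sample least affected by queuing delay;
    half of it approximates the one-way propagation time.
    """
    min_rtt = min(rtt_samples_ms)
    return (min_rtt / 2.0) * SPEED_OF_LIGHT_KM_PER_MS * FIBER_FACTOR

# Hypothetical RTT samples in milliseconds.
print(round(rtt_to_distance_km([12.4, 10.0, 15.1]), 1))  # about 999.3 km
```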
Constraint-Based Geolocation (CBG): locating Internet hosts
[Figure: landmarks P1, P2, P3 with distances d1, d2, d3 and distance constraints d1 + d’1, d2 + d’2, d3 + d’3]
• Multilateration using distance constraints
  • Distance constraints are overestimated
  • Assigns a confidence region to each location estimate
• Intersection
  • Estimated location of target host
  • Confidence region
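The intersection of overestimated distance constraints can be approximated numerically. The sketch below is not the CBG implementation: it assumes a coarse latitude/longitude grid, keeps the grid points that fall inside every circle, and uses their average as the location estimate. Landmark coordinates and radii are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def feasible_region(constraints, lat_range, lon_range, step=0.5):
    """Grid points lying inside every circle (lat, lon, radius_km)."""
    points = []
    n_lat = int((lat_range[1] - lat_range[0]) / step) + 1
    n_lon = int((lon_range[1] - lon_range[0]) / step) + 1
    for i in range(n_lat):
        lat = lat_range[0] + i * step
        for j in range(n_lon):
            lon = lon_range[0] + j * step
            if all(haversine_km(lat, lon, c_lat, c_lon) <= radius
                   for c_lat, c_lon, radius in constraints):
                points.append((lat, lon))
    return points

def centroid(points):
    """Average of the feasible grid points, or None if the region is empty."""
    if not points:
        return None
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))

# Hypothetical landmarks (roughly Paris, Berlin, Rome) with overestimated radii.
constraints = [(48.86, 2.35, 800.0), (52.52, 13.40, 600.0), (41.90, 12.50, 800.0)]
region = feasible_region(constraints, (35.0, 60.0), (-5.0, 25.0))
estimate = centroid(region)
print(len(region), estimate)
```

The number of grid points left in `region` gives a rough sense of the confidence region's size: the tighter the constraints, the fewer points survive.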
Contributions
• GeoHybrid
  • Combination of active and passive measurements
  • Reduces the number of measurements injected into the network
• Geolocation service for grid computing middleware
  • Useful for optimizing the decision-making process of a Grid Resource Broker
GeoHybrid: A comprehensive technique for geolocalization
[Figure: server with a database (IP-to-location mappings) and scripts for handling active measurements; receives geolocation queries and returns geolocation answers]
• GeoHybrid is based on:
  • active measurements
    • CBG approach
  • passive measurements
    • Database with exhaustive tabulation
• Heuristic implemented
  • Find the nearest set of landmarks for a given target host
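One way to read the heuristic is sketched below. It assumes, and the slides do not spell this out, that the database lookup yields a coarse location for the target and that "nearest" means smallest great-circle distance from that location. The landmark list and the function name `nearest_landmarks` are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_landmarks(coarse_location, landmarks, k):
    """Names of the k landmarks closest to the coarse (database) location."""
    lat, lon = coarse_location
    ranked = sorted(
        landmarks,
        key=lambda name: haversine_km(lat, lon, *landmarks[name]),
    )
    return ranked[:k]

# Hypothetical landmark coordinates (latitude, longitude).
landmarks = {
    "Paris": (48.86, 2.35),
    "Berlin": (52.52, 13.40),
    "Rome": (41.90, 12.50),
    "New York": (40.71, -74.01),
    "Tokyo": (35.68, 139.69),
}

# Coarse location returned by the passive (database) step, assumed here.
selected = nearest_landmarks((48.14, 11.58), landmarks, k=3)
print(selected)
```

Only the selected landmarks would then probe the target, which is where the reduction in injected measurements would come from.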
Experimental setup
• 74 PlanetLab nodes as landmarks
• 127 hosts (AMP and RIPE nodes) as targets
• GeoIP’s database [MaxMind LLC]
  • 1,876,596 blocks of IP addresses
  • Each block has its own geographic location such as country, region, city or latitude/longitude
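A lookup against a table of IP address blocks can be sketched with a binary search over block start addresses. The blocks below use documentation-only address ranges and made-up locations; this is not the MaxMind file format or API, only an illustration of the idea that each block maps to one location.

```python
import bisect
import ipaddress

# Hypothetical blocks: (first address, last address, location).
BLOCKS = [
    ("192.0.2.0", "192.0.2.255", "City A"),
    ("198.51.100.0", "198.51.100.255", "City B"),
]

# Convert to integers and sort by block start for binary search.
_ranges = sorted(
    (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), location)
    for start, end, location in BLOCKS
)
_starts = [r[0] for r in _ranges]

def lookup(ip):
    """Return the location of the block containing ip, or None."""
    value = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(_starts, value) - 1  # last block starting at or before ip
    if i >= 0 and value <= _ranges[i][1]:
        return _ranges[i][2]
    return None

print(lookup("192.0.2.77"))
print(lookup("203.0.113.5"))
```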
Results
• Heuristic approach
  • Median error: 175 km
• Random approach
  • Median error: 400 km
  • 30 samples for each landmark
Summary of GeoHybrid
• GeoHybrid reduces:
  • Number of measurements
  • Response time
• The 20 nearest landmarks are sufficient to locate Internet hosts
• Exhaustive tabulation is difficult to manage and keep updated
Conclusions
• GeoHybrid reduces the number of measurements
• The proposed measurement middleware service benefits grid computing
  • Mitigates the amount of traffic exchanged across the grid
• We plan to implement this middleware in the Research and Education Network of Senegal