
IIT RTC Conference October 15 - 17, 2013

Of maps and costs : Aggregating large-scale broadband measurements for the Application Layer Traffic Optimization (ALTO) protocol. IIT RTC Conference October 15 - 17, 2013. David Goergen 1 Vijay K. Gurbani 2 Radu State 1. OUTLINE. Premise ALTO: background FCC dataset Processing


Presentation Transcript


  1. Of maps and costs: Aggregating large-scale broadband measurements for the Application Layer Traffic Optimization (ALTO) protocol
     IIT RTC Conference, October 15 - 17, 2013
     David Goergen (1), Vijay K. Gurbani (2), Radu State (1)

  2. OUTLINE
     • Premise
     • ALTO: background
     • FCC dataset
     • Processing
     • Evaluation and discoveries
     IIT RTC conference

  3. Premise
     • Essential to study trends and derive network analytics
     • Two extremes exist
       • Complete and highly detailed raw data
         • Users get lost in the details
         • High amount of data
       • Highly aggregated and summarized reports
         • Human-readable format
         • e.g. charts, presentations, reports
         • Often cannot be investigated further
     • There is a need for an intermediate way
     • The ALTO protocol seems a good choice.

  4. ALTO Introduction
     ALTO solves the general rendezvous problem: Given a choice of resources, which one is the best candidate?
     Recurring pattern in many domains:
     • Peer-to-peer (BitTorrent)
       • Which peers are close to me?
       • Which peers have high upload bandwidth?
     • Content delivery networks (CDN)
       • Rendezvous me with the nearest surrogate
     • Network routing and distance calculation
       • Shortest path computation
     • Data centers and cloud computing
       • Where is my nearest data center?
       • Which server is lightly loaded?
       • Which data center has the lowest network utilization?

  5. ALTO Introduction: History
     • Circa 2008: Comcast and BitTorrent
       • P2P traffic dominates the Internet
       • Internet Service Providers wanted a well-behaved network
       • ISPs wanted to reduce transit costs
       • BitTorrent traffic exhibits greedy behaviour, optimizing for local maxima at the expense of other time-sensitive traffic
     • May 2008: IETF Workshop on P2P Infrastructure held at MIT to arrive at mitigating solutions
     • Outcome: 2 Working Groups
       • LEDBAT: Low Extra Delay Background Transport
       • ALTO: Application Layer Traffic Optimization

  6. ALTO Introduction
     ALTO is:
     • An Application Layer Traffic Optimization protocol
     • An IETF Working Group
     • An IETF (soon-to-be) standard RFC
     • A RESTful API that provides topology maps and cost maps to clients
     • A RESTful API that provides building blocks to construct:
       • Ranking service
       • Endpoint cost service
       • Endpoint property service
       • Map filtering service
     What is an endpoint? An IP address, a MAC address, an aggregation of IP addresses, ...

  7. ALTO Introduction: ALTO Architecture
     [Figure: ALTO architecture diagram. Labels: ISP; provisioning policies; routing protocols; dynamic network information; ALTO client; ALTO server; ALTO service discovery; external interfaces; third parties, content providers, ... Legend: standardized protocol; not subject to standardization.]

  8. ALTO Introduction
     2 main abstractions:
     • Network map
     • Cost map
     The network is specified in terms of Partition/Provider IDs (PIDs): a PID is an aggregation of endpoints identified by a provider-defined network location identifier.
     Costs are normalized and have two attributes:
     • Type: What does the cost represent? Air-miles, hop count, ...
     • Mode: How to interpret the cost.
       • Numerical (mathematical operations)
       • Ordinal (position-based preferences)
     These abstractions help! IT, meet NOC. NOC, meet IT!
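A minimal sketch of the network-map abstraction, assuming a PID-to-prefix layout loosely modelled on the published ALTO specification (RFC 7285). The PID names, the prefixes (taken from the reserved documentation ranges) and the helper `pid_for_endpoint` are all hypothetical and not from the slides; the sketch only illustrates how an endpoint address can be resolved to the PID that aggregates it.

```python
import ipaddress

# Hypothetical network map: PID name -> address prefixes, grouped by family.
# The prefixes are documentation ranges, not real provider data.
NETWORK_MAP = {
    "PID1": {"ipv4": ["192.0.2.0/24", "198.51.100.0/25"]},
    "PID2": {"ipv4": ["198.51.100.128/25"]},
    "PID3": {"ipv4": ["203.0.113.0/24"]},
}


def pid_for_endpoint(address, network_map=NETWORK_MAP):
    """Return the PID whose longest matching prefix contains address, or None."""
    ip = ipaddress.ip_address(address)
    best_pid, best_len = None, -1
    for pid, families in network_map.items():
        for prefixes in families.values():
            for prefix in prefixes:
                net = ipaddress.ip_network(prefix)
                # Only compare addresses and prefixes of the same IP version.
                if ip.version == net.version and ip in net and net.prefixlen > best_len:
                    best_pid, best_len = pid, net.prefixlen
    return best_pid
```

A client that only knows PIDs never sees the provider's internal topology, which is the point of the abstraction: `pid_for_endpoint("198.51.100.200")` yields a PID name and nothing more.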

  9. ALTO Introduction: Maps (Network and cost)
     Network map
     [Figure: Datacenter 1, Datacenter 2, Datacenter 3]
     Problem: Complexity and network structure exposed.
     Graphics sources: http://pubs.vmware.com/vi301/intro/images/Introduction_chapter.3.2.1.jpg

  10. Network map Hides complexity behind “partition IDs” ALTO Introduction: Maps (Network and cost) Datacenter 2 PID 2 Datacenter 1 PID 1 Datacenter 3 PID 3 Graphics sources: http://pubs.vmware.com/vi301/intro/images/Introduction_chapter.3.2.1.jpg IIT RTC conference

  11. ALTO Introduction: Maps (Network and cost)
      Cost map: network cost of linking the partitions
      [Figure: Datacenter 1 = PID 1, Datacenter 2 = PID 2, Datacenter 3 = PID 3; cost labels shown in the diagram: 20, 1, 10, 30, 22, 5]
      Graphics sources: http://pubs.vmware.com/vi301/intro/images/Introduction_chapter.3.2.1.jpg
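As a rough illustration of how a client might use a cost map, the sketch below ranks candidate PIDs by numerical cost and derives an ordinal view from the same numbers. The cost values are illustrative only and are not read off the slide's figure (the transcript does not say which label belongs to which link); `COST_MAP`, `rank_candidates` and `to_ordinal` are hypothetical names.

```python
# Hypothetical cost map: COST_MAP[src][dst] is the cost from src to dst.
# Values are illustrative, not the ones from the original diagram.
COST_MAP = {
    "PID1": {"PID1": 1, "PID2": 20, "PID3": 10},
    "PID2": {"PID1": 20, "PID2": 1, "PID3": 30},
    "PID3": {"PID1": 10, "PID2": 30, "PID3": 1},
}


def rank_candidates(src_pid, candidate_pids, cost_map=COST_MAP):
    """Sort candidate PIDs by numerical cost from src_pid, cheapest first."""
    row = cost_map[src_pid]
    return sorted(candidate_pids, key=lambda pid: row[pid])


def to_ordinal(src_pid, cost_map=COST_MAP):
    """Turn one row of numerical costs into ordinal ranks (1 = most preferred)."""
    row = cost_map[src_pid]
    ordered = sorted(row, key=row.get)
    return {pid: rank for rank, pid in enumerate(ordered, start=1)}
```

In numerical mode the values can be added or compared arithmetically; in ordinal mode only the order matters, so `to_ordinal` discards the magnitudes and keeps the preference ranking.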

  12. ALTO Introduction: Example ALTO maps
      [Figure: example network map and example cost map]

  13. FCC Dataset specification
      • One country
      • Time period: 01.01.2012 to 31.12.2012
      • 7,782 anonymised volunteers spread across the country
      • Each volunteer's unit queries a defined set of common web sites every hour
        • e.g. Google, YouTube, CNN, …
      • 75-78 million records per month
      • 6-7 GB of data per month

  14. FCC Dataset specification
      • Consists of several files organized per month
      • Linked together through the unit_id field
      • For our first evaluation we use the curr_dns file
        • Extract distinct unit_id values that are consistent over a certain period
      • Use these to create a topology map for the ALTO protocol

  15. FCC Dataset specification

  16. Processing
      • Find a stable set of unit_id
        • DNS resolver appears in every file
        • Location is fixed
      • Location is resolved using a geo-IP database
      • unit_id is assumed to be close to the DNS resolver location

  17. Hadoop cluster specs
      • Hadoop 2.0.0-cdh4.3.0
      • 4 nodes
      • Hexa-core 2.4 GHz Xeon
      • 120 GB RAM
      • HDFS 27.54 TB
      • 2 x 1 Gb Ethernet, bonded

  18. Hadoop job process

  19. Outcome
      • Output contains
        • unit_id
        • DNS resolver IP
        • Occurrence
        • Geo. location
      • Post-process
        • Filter out all non-stable unit_id
        • Occurrence < 12 months
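The post-processing step can be sketched in plain Python as follows. This is not the Hadoop job from the talk; it assumes that "occurrence" counts the distinct months in which a unit_id was seen with the same DNS resolver, and the record layout `(month, unit_id, resolver_ip)` and the function name `stable_units` are hypothetical.

```python
from collections import defaultdict


def stable_units(records, required_months=12):
    """Keep units seen with the same resolver in enough distinct months.

    records: iterable of (month, unit_id, resolver_ip) tuples (assumed layout).
    Returns {unit_id: resolver_ip}. If a unit were stable with two resolvers,
    the one processed last would win; the sketch does not handle that case.
    """
    months_seen = defaultdict(set)
    for month, unit_id, resolver_ip in records:
        months_seen[(unit_id, resolver_ip)].add(month)
    return {
        unit_id: resolver_ip
        for (unit_id, resolver_ip), months in months_seen.items()
        if len(months) >= required_months
    }
```

With twelve monthly files, a unit that switches resolver even once falls below the threshold for both resolvers and is filtered out, which matches the "occurrence < 12 months" rule above.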

  20. Interesting Observation
      • Some unit_id are located outside the US
        • Assume the user has manually configured the DNS resolver
        • OpenDNS and Google DNS resolvers were ignored
      • Large convergence to a single point (Potwin, KS)
        • Potwin is the geographical center of the US
        • ISPs generally locate their primary or secondary DNS name servers
        • Continue to investigate how to minimize the impact
      • Some unit_id change ISP and/or location
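One way to express the filtering described on this slide is a small classifier, sketched here under assumptions. The four IPv4 addresses are the well-known anycast addresses of Google Public DNS and OpenDNS; the function `classify_unit`, its labels, and the idea of carrying a geo-IP country code per resolver are hypothetical, and the talk does not say how the authors implemented the check.

```python
# Well-known public resolver addresses (IPv4 only in this sketch).
PUBLIC_RESOLVERS = frozenset({
    "8.8.8.8", "8.8.4.4",                # Google Public DNS
    "208.67.222.222", "208.67.220.220",  # OpenDNS
})


def classify_unit(resolver_ip, country_code, home_country="US"):
    """Label a unit by its resolver: public, outside the home country, or usable.

    country_code is assumed to come from a geo-IP lookup of resolver_ip.
    """
    if resolver_ip in PUBLIC_RESOLVERS:
        return "public-resolver"
    if country_code != home_country:
        return "outside-home-country"
    return "candidate"
```

Units labelled `"public-resolver"` say nothing about where the user is, so they would be dropped before any location is inferred from the resolver.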

  21. Stable unit_id

  22. Next steps
      • Attempt to create a network map
        • Rough PID groupings accomplished by unit IDs belonging to the same ISP.
        • More formal PID groupings for further study (e.g., group by bandwidth speed irrespective of ISP, lowest jitter, …).
      • Attempt to create a cost map
        • Different cost maps for different applications (e.g., use UDP latency or jitter as a cost metric for VoIP applications).
      • Cross-reference with other datasets (e.g., US Census dataset).
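The rough grouping by ISP and a latency-based cost could look like the sketch below. It is only an illustration of the idea: the input layouts (`unit_isp` as `{unit_id: isp_name}`, samples as `(unit_id, target, latency_ms)`), the `pid-` naming scheme and the choice of the median as the summary statistic are assumptions, not the authors' design.

```python
from collections import defaultdict
from statistics import median


def group_by_isp(unit_isp):
    """Build rough PIDs: one PID per ISP, holding that ISP's unit IDs."""
    pids = defaultdict(list)
    for unit_id, isp in unit_isp.items():
        pids[f"pid-{isp}"].append(unit_id)
    return {pid: sorted(units) for pid, units in pids.items()}


def latency_costs(samples, unit_isp):
    """Summarise latency per (PID, target) as the median of its samples.

    samples: iterable of (unit_id, target, latency_ms); units without a
    known ISP are skipped.
    """
    buckets = defaultdict(list)
    for unit_id, target, latency_ms in samples:
        isp = unit_isp.get(unit_id)
        if isp is None:
            continue
        buckets[(f"pid-{isp}", target)].append(latency_ms)
    costs = defaultdict(dict)
    for (pid, target), values in buckets.items():
        costs[pid][target] = median(values)
    return dict(costs)
```

Swapping the metric (jitter instead of latency) or the grouping key (bandwidth tier instead of ISP) would give the alternative maps mentioned in the bullets, without changing the overall shape of the computation.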

  23. Next steps
      • Using stable unit IDs as landmarks in a virtual coordinate system.

  24. Thank you for your attention. QUESTIONS?
