Application Layer Overlays IS250 Spring 2010 John Chuang
Application Layer Overlay
• The Internet infrastructure, based on TCP/IP, provides:
  • Global reachability
  • Reliable end-to-end transport
• Highly successful in supporting one-to-one (unicast) communication
• But there are some limitations:
  • Difficult to deploy new network services (e.g., IP multicast, IP anycast, QoS, IPv6)
  • Lack of support for one-to-many (multicast) or even many-to-many (“peer-to-peer”) communication
  • End hosts have no control over what goes on in the network (e.g., no source routing or user-directed routing)
Application Layer Overlay
• One strategy: build an overlay network at the application layer
  • End hosts gain control over topology formation and routing to meet specific application needs
  • New applications and services can be deployed without changes to the TCP/IP infrastructure
Overlay Networks
• Logical topology
• Self-organized
• Dynamic
• Application specific
[Figure: an application layer overlay drawn on top of the network layer]
Early Examples
• Domain Name System (DNS)
• 6bone: IPv6 over IPv4
• Mbone: multicast over unicast IP
• X-Bone
Image sources: http://graphics.stanford.edu/papers/mbone/morepix/world-6bone.jpeg, http://www.mbone.cl.cam.ac.uk/mbone/mbone-small.gif
Some Overlay Networks
• Web Caching and Content Distribution Networks (CDNs)
• Application Layer Multicast (ALM)
• User Directed Routing
  • Anonymous Routing
  • Resilient overlay network
• Peer-to-Peer (P2P)
  • Unstructured P2P: Gnutella, Freenet, KaZaA, …
  • Structured P2P: Distributed Hash Tables (DHTs)
Web Caching
• Improves download latency and content availability by storing local copies of popular web objects
• Web caches are L7 (application layer) boxes
[Figure: client, proxy cache, network caches, reverse proxy cache, web server]
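The caching idea on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `WebCache` class and the `fetch_from_origin` callback are hypothetical names, and the sketch ignores expiry, validation, and eviction.

```python
class WebCache:
    """Toy web cache: keeps a local copy of each object it has fetched."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is a hypothetical callback: url -> bytes
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        # Serve the local copy when we have one (cache hit).
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Otherwise go to the origin server and remember the result.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

Only the first request for a given URL reaches the origin server; later requests are answered from the local copy, which is where the latency and availability gains come from.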
Content Delivery Networks
• Clients are intelligently redirected to the nearest CDN server to download publisher content
• IP anycast (if it existed) could accomplish this easily…
• In the absence of IP anycast, companies like Akamai construct CDNs as application layer overlay networks
[Figure: client, CDN servers, web server]
Method 1: DNS Redirect
• Step 1: client queries DNS for the IP address of www.publisher.com; based on the client’s IP address, the reconfigured publisher DNS returns the IP address of the replica closest to the client
• Step 2: client contacts the replica for the object
[Figure: client, local DNS, publisher DNS, publisher, nearest replica]
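A minimal sketch of the DNS redirect decision, assuming the publisher DNS holds a static table from client networks to replica addresses. The table, the `resolve` function, and all addresses (taken from the reserved documentation ranges) are hypothetical; a real CDN would use measured proximity rather than a fixed map.

```python
import ipaddress

# Hypothetical mapping: client network -> address of the closest replica.
REPLICAS = {
    ipaddress.ip_network("192.0.2.0/24"): "198.51.100.10",
    ipaddress.ip_network("203.0.113.0/24"): "198.51.100.20",
}
ORIGIN = "198.51.100.1"  # hypothetical fallback: the publisher's own server


def resolve(hostname, client_ip):
    """Return the address a 'reconfigured publisher DNS' might hand out."""
    if hostname != "www.publisher.com":
        raise KeyError(hostname)
    addr = ipaddress.ip_address(client_ip)
    # Step 1: pick the replica whose network contains the client address.
    for network, replica in REPLICAS.items():
        if addr in network:
            return replica
    # No replica is known to be close: fall back to the origin.
    return ORIGIN
```

In step 2 the client simply connects to whatever address `resolve` returned, so the redirection is invisible to the application.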
Method 2: URL Redirect
• Step 1: client queries DNS for the IP address of www.publisher.com
• Step 2: client contacts the publisher; the publisher returns HTML with embedded objects’ URLs pointing to the best CDN server
• Step 3: client queries DNS for the IP address of the CDN server
• Step 4: client contacts the CDN server; the CDN server returns the embedded objects
[Figure: client, local DNS, publisher, CDN DNS, CDN server]
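Step 2 of the URL redirect can be illustrated with a small rewrite function. This is a sketch only: `rewrite_embedded_urls` and the host `cdn.example.net` are hypothetical, and the regular expression handles just the simple `src="http://host/..."` form.

```python
import re


def rewrite_embedded_urls(html, publisher_host, cdn_host):
    """Point embedded objects (src="...") at the CDN; leave page links alone."""
    pattern = re.compile(r'(src=")http://' + re.escape(publisher_host) + "/")
    # Keep the src=" prefix and swap only the host part of the URL.
    return pattern.sub(lambda m: m.group(1) + "http://" + cdn_host + "/", html)


page = (
    '<a href="http://www.publisher.com/index.html">home</a>'
    '<img src="http://www.publisher.com/logo.gif">'
)
rewritten = rewrite_embedded_urls(page, "www.publisher.com", "cdn.example.net")
```

After the rewrite, the browser resolves the CDN host name (step 3) and fetches the image from the CDN server (step 4), while the HTML page itself is still served by the publisher.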
IP Multicast
• Network routers must implement IP Multicast to construct the delivery tree and forward packets to multicast group receivers
[Figure: server, routers, clients]
Application Layer Multicast
• End hosts self-organize to construct the multicast delivery tree; messages are sent using IP unicast
• Sacrifice some efficiency (latency stretch) for deployability
• Various systems: ESM, Overcast, Promise, Scattercast, SplitStream, Yoid, …
[Figure: server, routers, clients]
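The latency stretch mentioned above is commonly taken to be the ratio of the delay along the overlay tree to the delay of a direct unicast path. The sketch below assumes that definition; the host names, the tree, and the latency values are made up for illustration.

```python
def overlay_latency(parent, latency, source, receiver):
    """Sum link latencies along the overlay tree path from source to receiver."""
    total = 0.0
    node = receiver
    while node != source:
        up = parent[node]  # the end host that forwards to this node
        total += latency[frozenset((up, node))]
        node = up
    return total


def stretch(parent, latency, source, receiver):
    """Overlay delay divided by direct unicast delay."""
    direct = latency[frozenset((source, receiver))]
    return overlay_latency(parent, latency, source, receiver) / direct


# Hypothetical overlay: S sends to A, and A relays to B.
parent = {"A": "S", "B": "A"}
latency = {
    frozenset(("S", "A")): 10.0,
    frozenset(("A", "B")): 10.0,
    frozenset(("S", "B")): 15.0,
}
```

Here B receives data after 20 time units through A instead of 15 directly, a stretch of about 1.33, which is the price paid for not needing router support.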
IP Source Route
• IP source route allows end hosts to exercise some degree of route control
• However, many ISPs have turned off the IP source routing option for security reasons
[Figure: server, routers, client; the IP source route is shown alongside the default route]
User Directed Routing
• Some applications would benefit from having some degree of control over route selection
  • Resiliency: e.g., resilient overlay network (RON), Detour
  • Anonymity: onion routing, MIX-nets, …
[Figure: server, routers, client]
Onion Routing
• Application layer overlay for anonymous routing
• Existence of communication between Alice and Bob is not revealed to any third party
• Alice constructs an onion in which the message is successively encrypted with the keys of the intermediate routing nodes
• Each intermediate node ‘peels’ one layer of the onion and forwards it to the next node
• Example system: Tor
http://tor.eff.org/overview.html.en
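The layering can be sketched as follows. This is a toy, not Tor's protocol: the XOR keystream "cipher" is NOT secure and stands in for real encryption, and the node names, keys, and the `next_hop|payload` framing are all assumptions made for the example.

```python
import hashlib


def keystream_xor(key, data):
    """Toy symmetric 'cipher' (NOT secure): XOR with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def build_onion(message, destination, route):
    """Wrap message in one layer per node, innermost layer for the last node."""
    onion = message
    next_hop = destination
    for name, key in reversed(route):
        # Each layer tells its node only where to forward the inner onion.
        onion = keystream_xor(key, next_hop.encode() + b"|" + onion)
        next_hop = name
    return next_hop, onion  # first hop and the fully wrapped onion


def peel(onion, key):
    """Remove one layer: returns (next hop, inner onion)."""
    plain = keystream_xor(key, onion)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


route = [("n1", b"key-1"), ("n2", b"key-2"), ("n3", b"key-3")]
first_hop, onion = build_onion(b"hello Bob", "bob", route)
```

Each node learns only its successor on the path: n1 sees that the next hop is n2, but the rest of the onion is still opaque to it.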
P2P
• Self-organized overlay network to support distributed storage, search and retrieval of content
• The killer-app: free music and movies
• Individual peers contribute resources
  • Content
  • Network management (e.g., forwarding query messages)
• Desirable properties:
  • Scalability
  • Performance (latency, recall)
  • Robustness
  • Anonymity, censorship-resistance
• Design challenges:
  • Dynamic membership
  • Various forms of attacks
  • Free-riding behavior
P2P File-Sharing Networks
• 1st generation: centralized index
  • e.g., Napster
• 2nd generation: decentralized indices
  • e.g., Gnutella v0.4, Freenet
• 3rd generation: hierarchical
  • e.g., FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
• 4th generation:
  • Structured topologies using DHTs, e.g., eMule, Overnet, BitTorrent
  • Parallel downloads, e.g., BitTorrent, Avalanche
  • Darknets, e.g., WASTE for small-scale “F2F” networks
Napster
• Maintains a centralized index that maps files to machines
• How to find a file:
  • Query the index system, which returns a list of peers that store the requested file
  • Transfer the file directly from peer(s)
• Advantage:
  • Simplicity: easy to implement sophisticated search engines on top of the index system
• Disadvantage:
  • Single point of failure
[Figure: machines m1..m6 store files A..F respectively; the central index lists m1: A, m2: B, m3: C, m4: D, m5: E, m6: F. A query “E?” to the index returns m5, and E is then transferred directly from m5.]
Slide adapted from Ion Stoica, Nicolas Christin
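A centralized index of this kind is essentially a map from file names to the peers that hold them. The sketch below uses the machines and files from the slide's figure; the `CentralIndex` class and its method names are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Toy Napster-style index: file name -> set of peers storing it."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer, files):
        # A peer announces the files it is willing to share.
        for name in files:
            self._peers_by_file[name].add(peer)

    def query(self, name):
        # Return the peers holding the file; the transfer itself is peer-to-peer.
        return sorted(self._peers_by_file.get(name, ()))


index = CentralIndex()
for peer, name in [("m1", "A"), ("m2", "B"), ("m3", "C"),
                   ("m4", "D"), ("m5", "E"), ("m6", "F")]:
    index.register(peer, [name])
```

The index answers `index.query("E")` with `["m5"]`; if the index server goes down, no file can be located even though every peer is still online, which is the single point of failure noted above.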
Gnutella (v0.4)
• Flood the request
• How to find a file:
  • Send request to all neighbors
  • Neighbors recursively propagate the request
  • Eventually a machine that has the file receives the request, and it sends back the answer
• Advantages:
  • Totally decentralized, highly robust
• Disadvantages:
  • The entire network can be swamped with a request
  • Can be alleviated using TTLs, but can then fail to locate files (and still high resource usage)
[Figure: machines m1..m6 holding files A..F; a query “E?” is flooded from neighbor to neighbor until it reaches m5, which holds E. Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5; …]
Slide adapted from Ion Stoica, Nicolas Christin
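Flooding with a TTL can be sketched as a hop-limited breadth-first search. The topology below contains only the links the slide states (m1 to m2 and m3, m3 to m4 and m5); starting the query at m1, the function name, and the interpretation of TTL as a hop budget are assumptions of this sketch.

```python
from collections import deque


def flood_query(neighbors, files, start, item, ttl):
    """Return the nodes holding item that are reachable within ttl hops."""
    seen = {start}
    holders = []
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if item in files.get(node, ()):
            holders.append(node)  # this node answers instead of forwarding
            continue
        if remaining == 0:
            continue  # TTL exhausted: the request dies here
        for nbr in neighbors.get(node, ()):
            if nbr not in seen:  # each node handles a request only once
                seen.add(nbr)
                queue.append((nbr, remaining - 1))
    return holders


neighbors = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
files = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"}, "m4": {"D"}, "m5": {"E"}}
```

With a TTL of 2 the query from m1 reaches m5 and finds E; with a TTL of 1 it stops at m2 and m3 and reports nothing, even though the file exists in the network.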
Hierarchical Networks
• Use two-level hierarchy
  • Some nodes are elected as “super nodes” or “ultra-peers”
  • Each ultra-peer serves as a centralized index for a portion of the network
  • If an ultra-peer does not know where to find an item, the query is forwarded to other ultra-peers
• Advantage:
  • Reduces the amount of network traffic compared to “naïve” flooding
• Disadvantage:
  • Ultra-peers are vulnerable to attacks
  • Potential convergence problems when ultra-peers leave abruptly
• Used in FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
[Figure: machines m1..m4 and files A..F; a query “F?” is passed between ultra-peers until F is located. Assume red nodes are ultra-peers.]
Slide adapted from Ion Stoica, Nicolas Christin
Structured Topologies
• Gnutella and KaZaA topologies are unstructured
  • Neighbor selection largely random
  • No guarantee that a file can be located, even if it exists in the network
• Distributed hash tables (DHTs) offer to solve this problem by constructing highly structured topologies
Distributed Hash Table (DHT)
• Applications: distributed search (e.g., p2p, CDNs, cooperative caching), application layer overlays for multicast, anycast, etc.
• Similar to a traditional hash table data structure, except data is stored in distributed peer nodes
  • Each node is analogous to a bucket in a hash table
• Put(), Get() interface like a regular hash table:
  • put(id, item);
  • item = get(id);
• Designed to scale to large numbers of nodes and to handle continual node arrivals, departures, or failures
• Various DHT designs:
  • CAN, Chord, Kademlia, Pastry, Tapestry, Viceroy, etc.
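The put/get interface and the "node as bucket" analogy can be sketched as below. The `ToyDHT` class is hypothetical and deliberately simplistic: it picks the responsible node by hashing the id modulo the number of nodes, which, unlike real DHT designs such as Chord, does not cope with node arrivals and departures.

```python
import hashlib


class ToyDHT:
    """Toy DHT: every node is a bucket, chosen by hashing the item id."""

    def __init__(self, node_names):
        self._names = sorted(node_names)
        self.nodes = {name: {} for name in self._names}  # per-node storage

    def _responsible_node(self, item_id):
        digest = hashlib.sha1(item_id.encode("utf-8")).digest()
        return self._names[int.from_bytes(digest, "big") % len(self._names)]

    def put(self, item_id, item):
        self.nodes[self._responsible_node(item_id)][item_id] = item

    def get(self, item_id):
        return self.nodes[self._responsible_node(item_id)].get(item_id)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("song.mp3", b"...bytes...")
item = dht.get("song.mp3")
```

Any participant can compute which node is responsible for an id, so a lookup never needs flooding; the harder part, which the DHT designs listed above address, is keeping that mapping stable as nodes come and go.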
DHT Example: Chord
• Associate each node and item with a unique identifier in a one-dimensional space (0..2^m - 1)
• Each node x maintains a finger table
  • Fingers are neighbors
  • The i-th entry in the finger table is the first node that succeeds or equals x + 2^i (mod 2^m)
• An item identified by id is stored on the successor node of id
• Properties
  • Routing table size is O(log(N)), where N is the total number of nodes
  • Guarantees that a file (if it exists) is found in O(log(N)) steps
Slide adapted from Ion Stoica, Nicolas Christin
Chord Example
• Assume m = 3, i.e., an identifier space 0..7
• Node n1:(1) joins
Finger tables (entries are id+2^i → succ):
  node   i=0     i=1     i=2
  1      2 → 1   3 → 1   5 → 1
[Figure: identifier ring with positions 0..7]
Slide adapted from Ion Stoica, Nicolas Christin
Chord Example
• Assume m = 3, i.e., an identifier space 0..7
• Node n1:(1) joins
• Node n2:(2) joins
Finger tables (entries are id+2^i → succ):
  node   i=0     i=1     i=2
  1      2 → 2   3 → 1   5 → 1
  2      3 → 1   4 → 1   6 → 1
[Figure: identifier ring with positions 0..7]
Slide adapted from Ion Stoica, Nicolas Christin
Chord Example
• Assume m = 3, i.e., an identifier space 0..7
• Node n1:(1) joins
• Node n2:(2) joins
• Nodes n3:(0), n4:(6) join
Finger tables (entries are id+2^i → succ):
  node   i=0     i=1     i=2
  0      1 → 1   2 → 2   4 → 6
  1      2 → 2   3 → 6   5 → 6
  2      3 → 6   4 → 6   6 → 6
  6      7 → 0   0 → 0   2 → 2
[Figure: identifier ring with positions 0..7]
Slide adapted from Ion Stoica, Nicolas Christin
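The finger tables in this example can be recomputed from the rule "the i-th finger of node x is the first node that succeeds or equals x + 2^i", taken modulo 2^m. The sketch below assumes global knowledge of the node set, which a real Chord node does not have; the function names are hypothetical.

```python
def successor(nodes, ident):
    """First node whose id is >= ident, wrapping around the ring."""
    ring = sorted(nodes)
    for n in ring:
        if n >= ident:
            return n
    return ring[0]  # wrapped past the largest id


def finger_table(nodes, node, m):
    """Rows of (i, (node + 2^i) mod 2^m, successor of that target)."""
    size = 2 ** m
    table = []
    for i in range(m):
        target = (node + 2 ** i) % size
        table.append((i, target, successor(nodes, target)))
    return table


nodes = [0, 1, 2, 6]  # ids of n3, n1, n2, n4 after all four have joined
tables = {n: finger_table(nodes, n, 3) for n in nodes}
```

For node 6, for instance, the targets are 7, 0 and 2, whose successors are 0, 0 and 2. Running the same function with `nodes = [1]` or `nodes = [1, 2]` gives the tables of the earlier join steps.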
Insertion
• Items inserted: f1:(7), f2:(1)
Finger tables and stored items (entries are id+2^i → succ):
  node   i=0     i=1     i=2     items
  0      1 → 1   2 → 2   4 → 6   7
  1      2 → 2   3 → 6   5 → 6   1
  2      3 → 6   4 → 6   6 → 6
  6      7 → 0   0 → 0   2 → 2
[Figure: identifier ring with positions 0..7]
Slide adapted from Ion Stoica, Nicolas Christin
Query
• Upon receiving a query for item id, a node
  • Checks if the item is cached locally
  • If not, forwards the query to the largest node in its successor table that does not exceed id
Finger tables and stored items (entries are id+2^i → succ):
  node   i=0     i=1     i=2     items
  0      1 → 1   2 → 2   4 → 6   7
  1      2 → 2   3 → 6   5 → 6   1
  2      3 → 6   4 → 6   6 → 6
  6      7 → 0   0 → 0   2 → 2
[Figure: identifier ring with positions 0..7; a query(7) is routed around the ring]
Slide adapted from Ion Stoica, Nicolas Christin
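A sketch of how query(7) might be routed with these finger tables. It uses the usual Chord formulation of the forwarding rule (hand the query to the closest finger that precedes id on the ring, and stop when id falls between a node and its immediate successor), which is a ring-aware reading of the rule on the slide. Starting the query at node 1 is an assumption, and the `FINGERS` dictionary simply hard-codes the successor columns of the tables above.

```python
M = 3
SIZE = 2 ** M
# Successor column of each node's finger table, copied from the example.
FINGERS = {0: [1, 2, 6], 1: [2, 6, 6], 2: [6, 6, 6], 6: [0, 0, 2]}


def between_open_closed(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a == b:
        return True  # a single node owns the whole ring
    return 0 < (x - a) % SIZE <= (b - a) % SIZE


def closest_preceding_finger(node, key):
    """Largest finger of node that lies strictly between node and key."""
    for finger in reversed(FINGERS[node]):
        if 0 < (finger - node) % SIZE < (key - node) % SIZE:
            return finger
    return node


def lookup(start, key):
    """Return (node responsible for key, list of nodes visited)."""
    node, path = start, [start]
    while True:
        if key == node:
            return node, path  # a node is the successor of its own id
        succ = FINGERS[node][0]
        if between_open_closed(key, node, succ):
            path.append(succ)
            return succ, path
        nxt = closest_preceding_finger(node, key)
        if nxt == node:  # no closer finger known: fall back to successor
            nxt = succ
        node = nxt
        path.append(node)


owner, path = lookup(1, 7)
```

Under these assumptions the query travels from node 1 to node 6 and then to node 0, which is the successor of id 7 and stores item 7.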
Summary
• Difficult to deploy new network services at the network layer
• Response: build an overlay network at the application layer
  • End hosts gain control over topology formation and routing to meet specific application needs
  • New applications and services can be deployed without changes to the TCP/IP infrastructure
• Many flavors of application layer overlay networks:
  • Web Caching and Content Distribution Networks (CDNs)
  • Application Layer Multicast (ALM)
  • Anonymous Routing (Tor)
  • Resilient overlay network (RON)
  • P2P file-sharing networks
  • Distributed Hash Tables (DHTs)
  • …