BANANAS: An Evolutionary Framework for Explicit and Multipath Routing in the Internet
Hema T. Kaur, Shiv Kalyanaraman, Andreas Weiss, Shifalika Kanwar, Ayesha Gandhi
Rensselaer Polytechnic Institute
hema@networks.ecse.rpi.edu, shivkuma@ecse.rpi.edu
http://www.ecse.rpi.edu/Homepages/shivkuma
Acknowledgements
• Biplab Sikdar (faculty colleague)
• Mehul Doshi (MS)
• Niharika Mateti (MS)
• Also thanks to:
  • Satish Raghunath (PhD)
  • Jayasri Akella (PhD)
  • Hemang Nagar (MS)
• Work funded in part by Intel Corp and DARPA-ITO, NMS Program. Contract number: F30602-00-2-0537
The Question
[Figure: TE spectrum, with Shortest Path, BANANAS-TE, and MPLS Signaled TE]
• Can we emulate a subset of MPLS properties without signaling in existing connectionless routing protocols?
• Key: Can we do source routing?
  • without signaling
  • without variable (and large) per-packet overhead
  • being backward compatible with OSPF & BGP
  • allowing incremental network upgrades
Why can't we do it today?
[Figure: example topology with nodes A to E and link weights. In one panel links AB and BD are overloaded; in the other, links AC and CD are overloaded. Caption: "Can not do this with OSPF"]
• Connectionless TE today uses a parametric approach:
  • E.g.: changing link weights in OSPF or IS-IS, or parameters of BGP-4 (LOCAL_PREF, MED, etc.)
  • Performance is limited by the single shortest/policy path
• Alternative: connection-oriented/signaled approach (e.g.: MPLS)
  • Complex to extend MPLS-TE across multiple areas
  • Not a solution for inter-AS issues
  • MPLS also needs the support of all the nodes along the path
MPLS Signaling and Forwarding Model
[Figure: network with San Francisco (ingress), Seattle, Miami, and New York (egress); IP packets carry labels 1321, 120, and 0 along the LSP]
• MPLS label is swapped at each hop along the LSP
• Labels = LOCAL IDENTIFIERS …
• Signaling maps global identifiers (addresses, path spec) to local identifiers
Global Path Identifiers
[Figure: network with San Francisco (ingress), Seattle, Miami, and New York (egress) and link weights; IP packets carry PathIds 27, 36, and 0]
• Instead of using local path identifiers (labels in MPLS), consider the use of global path identifiers
Global Path Identifier: Key Ideas
[Figure: path from node i to node j through nodes 1, 2, …, k, …, m-1 over links with weights w_1, w_2, …, w_m; the packet carries PathId(i,j) at node i and PathId(1,j) at node 1]
Key ideas:
1. Swap/process global PathIDs hop-by-hop instead of local labels!
2. Avoid inefficient encoding (IP) or signaling (MPLS)
3. Only upgraded nodes need to locally compute a subset of valid PathIDs.
Global Path Identifier (continued)
[Figure: path from node i to node j through nodes 1, 2, …, k, …, m-1, with the path suffix starting at node k marked]
• Path = {i, w_1, 1, w_2, 2, …, w_k, k, w_{k+1}, …, w_m, j}
  • Sequence of globally known node IDs & link weights
  • Global PathID is a hash of this sequence => locally computable without the need for signaling!
• Potential hash functions:
  • [j, {h(1) + h(2) + … + h(k) + … + h(m-1)} mod 2^b]: node ID sum
  • MD5 one-way hash, XOR (e.g.: LIRA), 32-bit CRC, etc.
  • Canonical method: MD5 hashing of the subsequence of node IDs followed by a CRC-32 to get a 32-bit hash value (MD5+CRC)
    • Low collision (i.e. non-uniqueness) probability
• Different PathID encodings have different architectural implications
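A minimal sketch of two of the hash options listed above, for illustration only. The function names, the identity choice for h, and the way node IDs are serialized before hashing (comma-separated decimal text) are assumptions; the slides do not specify a byte encoding.

```python
import hashlib
import zlib


def node_sum_path_id(dest, node_ids, b=32, h=lambda n: n):
    """Node-ID-sum encoding: [dest, (h(1) + ... + h(m-1)) mod 2^b].

    h defaults to the identity function, which is an assumption.
    """
    return dest, sum(h(n) for n in node_ids) % (1 << b)


def md5_crc_path_id(node_ids):
    """MD5 over the node-ID subsequence, reduced to 32 bits with CRC-32."""
    # Assumption: node IDs are serialized as comma-separated decimal text.
    data = ",".join(str(n) for n in node_ids).encode("ascii")
    digest = hashlib.md5(data).digest()
    return zlib.crc32(digest) & 0xFFFFFFFF


# Any node that knows the node-ID sequence can compute the same value
# locally, and the next hop can compute the PathID of the suffix.
path = [3, 7, 9]
incoming_id = md5_crc_path_id(path)
outgoing_id = md5_crc_path_id(path[1:])
```

Both functions depend only on globally known IDs, which is what lets every upgraded node derive the same PathID without signaling.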
Abstract Forwarding Paradigm
[Figure: path from node i to node j through nodes 1, 2, …, k, …, m-1, with the path suffix starting at node k marked]
• Forwarding table (e.g., at node k):
  • [Destination Prefix, PathID] -> [Next-Hop, SuffixPathID]
  • [j, H{k, k+1, …, m-1}] -> [k+1, H{k+1, …, m-1}]
• Incoming packet header: destination address (j) & PathID = H{k, k+1, …, m-1}
• Outgoing packet header: [j, PathID = H{k+1, …, m-1}]
• Longest prefix match + exact label match + label swap!
• PathID mismatch => map to shortest (default) path, and set PathID = 0
• No signaling because of globally meaningful PathIDs!
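The lookup steps above (longest prefix match, exact PathID match, PathID swap, fallback to the default path) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes every prefix with explicit entries also has a default (shortest-path) entry.

```python
import ipaddress

DEFAULT_PATH_ID = 0  # PathID 0 marks the shortest (default) path


class ForwardingTable:
    def __init__(self):
        # (prefix, path_id) -> (next_hop, suffix_path_id)
        self.explicit = {}
        # prefix -> default (shortest-path) next hop
        self.default = {}

    def add_default(self, prefix, next_hop):
        self.default[ipaddress.ip_network(prefix)] = next_hop

    def add_explicit(self, prefix, path_id, next_hop, suffix_path_id):
        key = (ipaddress.ip_network(prefix), path_id)
        self.explicit[key] = (next_hop, suffix_path_id)

    def forward(self, dest, path_id):
        """Return (next_hop, outgoing_path_id), or None if no route."""
        addr = ipaddress.ip_address(dest)
        matches = [p for p in self.default if addr in p]
        if not matches:
            return None
        # Longest prefix match on the destination address.
        prefix = max(matches, key=lambda p: p.prefixlen)
        # Exact PathID match, then swap in the suffix PathID.
        hit = self.explicit.get((prefix, path_id))
        if hit is not None:
            return hit
        # Mismatch: use the default path and reset the PathID.
        return self.default[prefix], DEFAULT_PATH_ID


table = ForwardingTable()
table.add_default("10.1.0.0/16", "C")
table.add_explicit("10.1.0.0/16", 27, "D", 36)
```

A linear scan stands in for a real longest-prefix-match structure here; a router would use a trie or similar.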
BANANAS TE: Explicit, Multi-Path Forwarding
[Figure: network with San Francisco (ingress), Seattle, Miami, and New York (egress) and link weights; packets with different PathIds take different paths]
• Explicit source-directed routing: not limited by the shortest-path nature of IGP
• Different PathIds => different next-hops (multi-paths)
• No signaling required to set up the paths
• Traffic mapping is decoupled from route discovery
BANANAS TE: Partial Deployment
[Figure: partially upgraded network with San Francisco (ingress), Seattle, Miami, and New York (egress), link weights, and packet PathIds]
• Only “red” routers are upgraded
• Non-upgraded routers forward everything on the shortest path (default path), forming a “virtual hop”
Simplistic Route Computation Strategy: All-Paths Under Partial Upgrades
• Assume 1 bit in LSAs to advertise that an upgraded router is “multi-path capable” (MPC)
• Two-phase algorithm (assume m upgraded nodes):
  1. (N-m) Dijkstra runs for the non-upgraded nodes, or one all-pairs shortest path computation (Floyd-Warshall)
  2. DFS to discover valid paths to destinations
    • Explore all neighbors of upgraded nodes
    • Explore only the shortest-path next-hop of non-upgraded nodes
    • Visited bit set to avoid loops
• Computes all possible valid paths under partial-upgrade (PU) constraints in a fully distributed manner (global consistency)
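A minimal sketch of the two-phase idea for a single destination, assuming an undirected graph with symmetric link weights given as a dict of dicts. The function names are hypothetical, and ties between equal-cost shortest paths are broken by discovery order, which is an assumption rather than OSPF's actual tie-breaking.

```python
import heapq


def next_hops_to(graph, dest):
    """Phase 1: Dijkstra rooted at dest.

    Returns {node: shortest-path next hop toward dest}.
    """
    dist = {dest: 0}
    nxt = {}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                nxt[v] = u  # v reaches dest through u
                heapq.heappush(heap, (nd, v))
    return nxt


def all_paths_pu(graph, upgraded, src, dest):
    """Phase 2: DFS for loop-free paths forwardable under partial upgrades."""
    nxt = next_hops_to(graph, dest)
    paths = []

    def dfs(node, path, visited):
        if node == dest:
            paths.append(list(path))
            return
        if node in upgraded:
            candidates = sorted(graph[node])  # all neighbors
        else:
            # Non-upgraded nodes only forward on the shortest path.
            candidates = [nxt[node]] if node in nxt else []
        for v in candidates:
            if v not in visited:  # visited set prevents loops
                visited.add(v)
                path.append(v)
                dfs(v, path, visited)
                path.pop()
                visited.remove(v)

    dfs(src, [src], {src})
    return paths


graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
print(all_paths_pu(graph, {"A"}, "A", "D"))
```

With only A upgraded, A may use either neighbor while B and C stay on their shortest paths; with no upgraded nodes the search collapses to the single default path.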
Simulation/Implementation/Testing Platforms
• MIT’s Click Modular Router on Linux: forwarding plane
• Utah’s Emulab testbed: experiments with the Linux/Zebra/Click implementation
• SSFnet: simulation for OSPF/BGP dynamics
Zebra/Click Implementation on Linux (Tested on Utah Emulab)
• Part of table at node1 (PathID = link weights, for simplicity)
[Figure: test topology with link weights and the corresponding forwarding-table excerpt at node1]
SSFnet Simulation Results
• Flat OSPF area, 19 nodes; only 3 active-MPC nodes
[Figure: results plot with labels A to E]
Refinement 1: Heterogeneous Route Computation
[Figure: example topology with nodes A to F and link weights]
• Goal: Upgraded nodes (e.g.: A, D, E) can use any route computation algorithm, so long as it computes the shortest (default) path!
• E.g.: k shortest paths from a given source s to each vertex in the graph, in total time O(E + V log V + kV): lower complexity than AP-PU (all-paths under partial upgrades)
• Issue: Forwarding for k-shortest paths may not exist
• Need to validate the forwarding availability for paths!
Two-Phase Path Validation Algorithm
• Concept: Forwarding for a path exists only if the forwarding for each of its suffixes exists.
• Phase 1 (cont’d):
  • Compute {k-shortest} paths for all other upgraded nodes, and 1-shortest paths for non-upgraded nodes.
  • Sort computed paths by hopcount
• Phase 2: Validate paths starting from hopcount = 1.
  • All 1-hop paths valid.
  • p-hop paths valid if the (p-1)-hop path suffix is valid
  • Throw out invalid paths as they are found
• Polynomial complexity to discover all valid paths in the network, and validation can be done in the background
• Validation algorithm correct by mathematical induction
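Phase 2 above can be sketched in a few lines. The sketch assumes Phase 1 has already produced one pooled set of candidate paths for all nodes (k-shortest for upgraded nodes, 1-shortest for non-upgraded nodes), each given as a tuple of node IDs; the function name and data layout are hypothetical.

```python
def validate_paths(candidates):
    """Return the candidate paths for which forwarding exists.

    candidates: iterable of paths, each a tuple of node IDs, source first.
    """
    valid = set()
    # Process by increasing hop count so every suffix is decided
    # before any path that depends on it.
    for path in sorted(set(candidates), key=len):
        hops = len(path) - 1
        if hops == 1:
            valid.add(path)  # all 1-hop paths are valid
        elif hops > 1 and path[1:] in valid:
            valid.add(path)  # the (p-1)-hop suffix is valid
        # otherwise the path is thrown out
    return valid


candidates = [
    ("C", "D"),
    ("B", "C", "D"),
    ("A", "B", "C", "D"),
    ("A", "E", "D"),  # suffix ("E", "D") was never computed by E
]
print(sorted(validate_paths(candidates)))
```

Checking only the immediate suffix is enough because that suffix was itself validated against its own suffix in an earlier round, which is the induction argument from the last bullet.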
Linux/Zebra/Emulab Results
• Flat OSPF area, 3 active-MPC nodes; up to k-shortest, validated paths
[Figure: results plot with labels B, C, and D]
Refinement 2: Index-based PathID Encoding
• Issue: increase in computation/storage complexity at upgraded nodes
• Question: Can we move complexity to the network “edges” and simplify “core” nodes?
• Ans: YES!
• The key is to consider an alternative, global PathID encoding:
  • PathID = concatenation of well-known local link ID hashes
  • Globally known link IDs can be locally hashed using a well-known function (e.g.: link ID index)
Why is the Index-based Encoding Interesting?
• Ans: Architectural flexibility and simplification
• Core (interior) nodes:
  • Forwarding function dramatically simplified
  • Minimal state (only the index table)
  • No control-plane computation complexity at interior nodes
• Edge nodes:
  • Path validation dramatically simplified
  • Edge nodes can store an arbitrary subset of validated paths
  • Heterogeneous route computation algorithms can be used
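One way to picture an index-based PathID is sketched below. The slides do not give a bit layout, so the fixed 4-bit field width, the first-hop-in-low-bits ordering, the reserved index 0, and the function names are all assumptions made for illustration.

```python
INDEX_BITS = 4  # assumed fixed width of one link index
INDEX_MASK = (1 << INDEX_BITS) - 1


def encode_path_id(link_indices):
    """Edge node: concatenate per-hop link indices (first hop in low bits)."""
    path_id = 0
    for hop, idx in enumerate(link_indices):
        # Index 0 is reserved here to mean "no explicit path".
        if not 0 < idx <= INDEX_MASK:
            raise ValueError("link index out of range")
        path_id |= idx << (hop * INDEX_BITS)
    return path_id


def core_forward(path_id, index_table, default_next_hop):
    """Core node: consume the first index using only the local index table.

    Returns (next_hop, remaining_path_id).
    """
    idx = path_id & INDEX_MASK
    if idx == 0 or idx not in index_table:
        return default_next_hop, 0  # fall back to the default path
    return index_table[idx], path_id >> INDEX_BITS


pid = encode_path_id([2, 1, 3])
hop1 = core_forward(pid, {1: "B", 2: "C"}, "Z")
```

In this layout the core node does no hashing and keeps no per-path state: it reads one field, looks it up in its index table, and shifts the PathID.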
Multiple Areas
[Figure: OSPF Area 0, Area 1, and Area 2 connected by ABR1 to ABR5, with nodes A, B, C, D, G, H, I, J and link weights. Red nodes: upgraded; green nodes: regular]
• PathID re-initialized after crossing area boundaries
• E.g.: from node A (area 1) to node I (area 2)
  • Available paths: A-B-C-ABR1-area2, A-B-C-ABR2-area2, etc.
  • When the packet reaches area 2, ABR3 may choose one of many paths to reach I, e.g.: ABR3-H-I, ABR3-J-I, ABR3-H-G-I, etc.
• Source-routing notion similar to, but weaker than, PNNI
Inter-domain TE
• Outbound TE:
  • Multi-exit (or Explicit-exit) routing
  • Useful to manage peering vs transit costs
  • Goal: fine-grained traffic engineering policy
  • BANANAS Hash = (Exit ASBR, destination address)
  • Forwarding paradigm: connectionless tunneling through the AS
• Inbound TE:
  • NOT ADDRESSED DIRECTLY
• Multi-AS-Path or Explicit AS-Path routing:
  • Framework similar to IGP: e-PathID concept
BGP Explicit-Exit Routing: Route Selection
• Explicit-Exit routing is easier than Explicit-Path routing
  • Only the “source” and “exit” nodes need upgrades!
  • Explicit-exit routing is easily extended to “multi-exit” routing
• Upgrade selected EBGP and IBGP routers
  • All BGP routers synchronize on the default policy route to every destination prefix (as usual)
  • Only upgraded IBGP routers and EBGP routers synchronize on a set of exits for chosen prefixes
  • Upgraded IBGP routers can independently choose any exit without further synchronization with other BGP nodes
BGP Explicit-Exit Routing: Forwarding
• IBGP locally installs explicit & default exits for the chosen prefix:
  • Dest-Prefix | Exit-ASBR | Next-Hop
  • Dest-Prefix | Default-Next-Hop
  • Next-Hop refers to the IGP next hop to reach the Exit-ASBR
  • Default-Next-Hop: regular IBGP function
• When a packet matches the explicit route (policy definable):
  • Push its destination address into an Address Stack field
  • Replace the destination address with the Exit-ASBR address
  • Emulates 1-level label-stacking (i.e. tunneling)
• Exit-ASBR simply swaps back the destination address before the regular IP lookup => popping the stack
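The push and pop steps above amount to a one-level address stack. A minimal sketch follows; the Packet type, its field names, and the function names are hypothetical and do not reflect an actual header layout.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dest: str
    address_stack: list = field(default_factory=list)


def ingress_tunnel(pkt, exit_asbr):
    """Upgraded source router: push the real destination, aim at the exit."""
    pkt.address_stack.append(pkt.dest)
    pkt.dest = exit_asbr
    return pkt


def exit_untunnel(pkt, my_address):
    """Exit ASBR: swap the destination back before the regular IP lookup."""
    if pkt.dest == my_address and pkt.address_stack:
        pkt.dest = pkt.address_stack.pop()
    return pkt


pkt = Packet(dest="192.0.2.9")
ingress_tunnel(pkt, exit_asbr="10.0.0.4")  # routers in between see 10.0.0.4
exit_untunnel(pkt, my_address="10.0.0.4")  # destination restored at the exit
```

Routers between the source and the exit forward on the Exit-ASBR address with an ordinary IP lookup, which is why only the “source” and “exit” nodes need upgrades.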
Explicit-Exit Routing Example
[Figure: AS1, AS2, AS3, and AS4 (with destination d), and routers ABR1, ABR2, ASBR1, ASBR2, ASBR3, and ASBR4]
• Default (AS Path, Exit) to d = (1-3-4, ASBR3)
• Now, ABR1 can have explicit exits ASBR4 (implied AS Path = 1-2-4), ASBR2 (implied AS Path = 1-3-4) as well!!
Inter-AS Explicit AS-Path Choice
[Figure: AS0, AS1, AS2, AS3, and AS4 (with destination d), and routers ASBR1, ASBR2, and ASBR3]
• Allow AS0 to explicitly choose an AS-PATH, e.g. 0-1-2-4 or 0-1-3-4
• Explicit AS-Path choice encoded as an e-PathID = Hash{1,2,4}
• e-PathID is updated only when the packet leaves the AS at exit border routers.
• At ASBR1, this explicit AS-path choice is mapped to an exit ASBR.
• Within an upgraded AS, the packet is tunneled using the routing header as explained earlier.
• Only selected EBGP nodes need be upgraded & synchronized
Re-advertisements of Multi-AS-Paths
[Figure: AS0 to AS5 with ASBR1, ASBR2, iBG-1, iBG-3, and destination d in AS4. 3 AS-paths to “d”: (0 4), (0 3 4), (0 5 4). Question posed: 1 AS-path or 3 AS-paths to “d”?]
• Issue: in path-vector algorithms, without re-advertisements (of a subset of paths), remote ASes cannot see the availability of multiple paths
• But re-advertisements add control traffic overhead
• An AS may choose to re-advertise only, and not support multi-path forwarding (i.e. interpreting e-PathID or Address Stack fields)
[Figure: forwarding table in AS2 (node #5) and the corresponding changes in packet headers]
Future: Exploiting Multiplicity In The Internet
[Figure: AS1 connected to the Internet through ISP-1 … ISP-n, with access options including phone modem, USB/802.11a/b, 802.11a, Firewire/802.11a/b, WiFi (802.11b), and Ethernet]
Exploiting Multiplicity…
• Unlike telephony, data networking can get statistical multiplexing gains from simultaneously using:
  • Multiple transmission modes (802.11a/b, 3G, etc.)
  • Multiple exits (USB, Firewire, Ethernet, modem)
  • Multiple paths (routes)
• Lightweight distributed QoS on each path
  • E.g.: OverQoS (UCB) or Closed-loop QoS (Dave Harrison’s work)
• Scavenge performance from this path diversity to meet requirements of high-quality multimedia apps!
• BANANAS concepts are generic
  • Can be applied for intra-domain, inter-domain, overlay routing, or ad-hoc peer-to-peer routing
E.g.: Multipath MPEG using Multi-band 802.11a/b Community Wireless Networks
[Figure: “slow” path and “fast” path, with labels P and I]
Summary
[Figure: TE spectrum, with Shortest Path, BANANAS-TE, and MPLS Signaled TE]
• TE: “Towards better routing performance”
• Key: Decoupling route availability and setup issues from traffic mapping issues, without signaling
• BANANAS-TE can leverage the rich interconnectivity and multi-homed nature of the Internet, with a manageable increase in complexity
• Applicable to OSPF, BGP, geographical routing, large-scale overlay networks; tested on Emulab, SSFnet
• Currently deploying BANANAS on Planetlab, a community wireless network in Troy, NY, and in p2p streaming/videoconferencing