
Adaptive Routing


Presentation Transcript


  1. Adaptive Routing Reinforcement Learning Approaches

  2. Contents • Routing Protocols • Reinforcement Learning • Q-Routing • PQ-Routing • Ant Routing • Summary

  3. Routing Classification • Centralized – A main controller updates all nodes’ routing tables. Suitable for small networks. • Distributed – Route computation is shared among nodes by exchanging routing information. Fault tolerant. Widely used.

  4. Routing Classification… • Static – Routing is based only on source and destination; the current network state is ignored. • Adaptive – Adapts the routing policy to time and traffic. More attractive, but can cause oscillations in paths.

  5. Routing Classification Based on Optimization • Minimal routing – Optimal, Shortest Path (Link State, Distance Vector) • Non-Minimal routing

  6. Shortfalls of Static Routing • Dynamic networks are subject to the following changes: • Topology changes, as nodes are added and removed • Traffic patterns change cyclically • Overall network load changes • So, routing algorithms that assume the network is static don’t work in this setting

  7. Tackling Dynamic Networks • Periodic updates? • Routing traffic? • When to update? • Is Adaptive Routing the answer?

  8. Reinforcement Learning • Agent playing against an opponent – Chess and Tic-Tac-Toe • Learning a value function

  9. Learning a Value Function • Temporal difference update: V(e) = V(e) + K [ V(g) − V(e) ] • For K = 0.4, with V(e) = 0.5 and V(g) = 1: V(e) = 0.5 + 0.4 × (1 − 0.5) = 0.5 + 0.2 = 0.7 • Exploration vs. Exploitation • e and e*
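A minimal sketch of this temporal-difference update, using the slide’s numbers (K = 0.4, V(e) = 0.5, V(g) = 1); the table layout and function name are illustrative, not from the slides:

    # Temporal-difference update: V(e) <- V(e) + K * (V(g) - V(e))
    def td_update(V, state, next_state, K=0.4):
        """Move the value of `state` toward the value of `next_state`."""
        V[state] += K * (V[next_state] - V[state])
        return V[state]

    V = {"e": 0.5, "g": 1.0}          # values from the slide's example
    print(td_update(V, "e", "g"))     # 0.5 + 0.4 * (1.0 - 0.5) = 0.7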

  10. Reinforcement Learning – Networks • The same update applied along a network path: V(s) = V(s) + K [ V(s′) − V(s) ] • (Figure: source S connected through routers R to destination D)

  11. Q-Routing • Qx(d, y) is the time that node x estimates it will take to deliver a packet to node d through its neighbor y • When y receives the packet, it sends back a message to node x containing its (i.e. y’s) best estimate of the time remaining to get the packet to d, i.e. t = min(Qy(d, z)) over all z ∈ neighbors(y) • x then updates Qx(d, y) by: [Qx(d, y)]NEW = [Qx(d, y)]OLD + K · (s + q + t − [Qx(d, y)]OLD), where • s = RTT from x to y • q = time spent in the queue at x • t = new estimate by y
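A minimal sketch of this update rule, assuming each node keeps a nested table Q[d][y] of delivery-time estimates; the data layout and function name are illustrative:

    # Q-Routing update at node x after forwarding a packet for destination d
    # via neighbour y:  Q_x(d, y) += K * (s + q + t - Q_x(d, y))
    def q_routing_update(Q, d, y, s, q, t, K=0.5):
        """s: trip time x -> y, q: queueing time at x,
        t: y's best remaining estimate, min over z of Q_y(d, z)."""
        Q[d][y] += K * (s + q + t - Q[d][y])
        return Q[d][y]

    # Numbers from the worked example on the next slide
    # (the queueing time is assumed folded into s here):
    Q = {"d": {"y": 20.0}}
    print(q_routing_update(Q, "d", "y", s=11, q=0, t=17, K=0.25))   # 22.0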

  12. Q-Routing… • (Figure: node x forwards a message for d to neighbor y; w is another neighbor of x) • y reports back its best estimate, min(Qy(d, zi)) = 17, given Qy(d, z1) = 25, Qy(d, z2) = 17, Qy(d, ze) = 70 • With s + q = 11, K = 0.25 and [Qx(d, y)]OLD = 20: [Qx(d, y)] += 0.25 × [(11 + 17) − 20] = 2, giving a new estimate of 22

  13. Results

  14. Shortfalls • The shortest-path algorithm does better than Q-Routing under low load. • Failure to converge back to shortest paths when network load decreases again. • Failure to explore new shortcuts.

  15. Shortfalls… • (Figure: node x routes messages for d via neighbor w, whose estimate is 20; y’s estimates are Qy(d, z1) = 25, Qy(d, z2) = 17, Qy(d, ze) = 70) • Even if the cost of the route via y decreases later, it never gets used until the route via w gets congested

  16. Predictive Q-Routing • ΔQ = s + q + t − [Qx(d, y)]OLD • [Qx(d, y)]NEW = [Qx(d, y)]OLD + K · ΔQ • Bx(d, y) = MIN[ Bx(d, y), Qx(d, y) ] • If (ΔQ < 0) // path is improving • ΔR = ΔQ / (currentTime − lastUpdatedTime) • Rx(d, y) = Rx(d, y) + B · ΔR // decrease in R • Else • Rx(d, y) = G · Rx(d, y) // increase of R (decays toward 0) • End If • lastUpdatedTime = currentTime
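A sketch of the bookkeeping above, assuming per-(destination, neighbour) dictionaries for Q, the best-seen estimate B, the recovery rate R, and the last update time; `beta` and `gamma` stand in for the slide’s constants B and G, and all names and default values are illustrative:

    import time

    def pq_update(Q, B, R, last_update, d, y, s, q, t,
                  K=0.5, beta=0.7, gamma=0.9):
        """Predictive Q-Routing update at node x for destination d via neighbour y."""
        now = time.time()
        dQ = s + q + t - Q[(d, y)]
        Q[(d, y)] += K * dQ
        B[(d, y)] = min(B[(d, y)], Q[(d, y)])            # best estimate seen so far
        if dQ < 0:                                       # path is improving
            dR = dQ / max(now - last_update[(d, y)], 1e-9)
            R[(d, y)] += beta * dR                       # R decreases (dR < 0)
        else:
            R[(d, y)] *= gamma                           # R decays back toward 0
        last_update[(d, y)] = now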

  17. PQ-Routing Policy… Finding neighbour y • For each neighbour y of x: • Δt = currentTime − lastUpdatedTime • Qx-pred(d, y) = Qx(d, y) + Δt · Rx(d, y) • Choose the y with MIN[ Qx-pred(d, y) ]
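A sketch of this selection step, reusing the Q, R and last-update tables from the previous sketch (names again illustrative); because R is non-positive, the predicted cost of a path that has not been used for a while drifts downward, so stale paths eventually get retried:

    def pq_select_neighbour(Q, R, last_update, d, neighbours, now):
        """Pick the neighbour with the smallest predicted delivery time."""
        def predicted(y):
            dt = now - last_update[(d, y)]
            return Q[(d, y)] + dt * R[(d, y)]            # Q_pred = Q + dt * R
        return min(neighbours, key=predicted)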

  18. PQ-Routing Results • Performs better than Q-Routing under low, high and varying network loads. • Adapts faster if probing of inactive paths for shortcuts is introduced. • Under high loads, behaves like Q-Routing. • Uses more memory than Q-Routing.

  19. Comparison – Low Load

  20. Ant Routing: Stigmergy – Inspirations from Nature… Ants: • Sort brood and food items • Explore particular areas for food, and preferentially exploit the richest available food source • Cooperate in carrying large items • Leave pheromones on their way back • Always find the shortest paths to their nests or food sources • Are blind, cannot foresee the future, and have very limited memory

  21. Ants • Each router x in the network maintains for each destination node d a list of the form: • <d, <y1, p1>, <y2, p2>, …, <ye, pe>>, • where y1, y2, …, ye are the neighbors of x, and • p1 + p2 + …+ pe = 1 • This is a parallel (multi-path) routing scheme • This also multiplies the number of degrees of freedom the system has by a factor of |E|
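A minimal sketch of such a table at one router, assuming plain dictionaries keyed by destination (the layout is illustrative):

    # Routing table of router x: for each destination d, a probability
    # distribution over x's neighbours that always sums to 1.
    table = {
        "d": {"y1": 0.5, "y2": 0.3, "y3": 0.2},
    }

    assert abs(sum(table["d"].values()) - 1.0) < 1e-9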

  22. Ants… • Every destination host hd periodically generates an “ant” to a random source host hs • An “ant” is a 3-tuple of the form: • < hd, hs, cost> • cost is a counter of the cost of the path the ant has covered so far

  23. Ant Routing Example • (Figure: a small network with nodes 0–4; an ant < 4, 0, cost > propagates from destination host 4 toward source host 0, and the routing table for node 1 is shown)

  24. Ants: Update • When a router x receives an ant < hd, hs, cost > from neighbor yi, it: • Updates cost by the cost of traversing the link from x to yi (i.e. the cost of the link in reverse) • Updates its entry for host hd (<hd, <y1, p1>, <y2, p2>, …, <ye, pe>>): with p = k / cost for some constant k, set pi = pi + p and leave pj = pj for j ≠ i, then divide every probability by 1 + p to normalize the sum back to 1
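A sketch of this update, assuming the probability-table layout from the earlier sketch and a reinforcement constant k (the function name and arguments are illustrative):

    def ant_update(table, reverse_link_cost, ant, from_neighbour, k=1.0):
        """Handle an ant <h_d, h_s, cost> arriving at router x from neighbour y_i."""
        h_d, h_s, cost = ant
        cost += reverse_link_cost              # cost of traversing the link x -> y_i
        probs = table[h_d]
        p = k / cost                           # reinforcement for the reporting neighbour
        probs[from_neighbour] += p
        for y in probs:                        # renormalise: sum was 1 + p, divide it out
            probs[y] /= 1.0 + p
        return (h_d, h_s, cost)                # updated ant, ready to forward onwards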

  25. Ants: Propagation • Two sub-species of ant: • Regular Ants: P( ant sent to yi ) = pi • Uniform Ants: P( ant sent to yi ) = 1 / e • Regular ants use learned tables to route ants • Uniform ants explore randomly
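A sketch of the two forwarding rules, assuming the same probability table (names are illustrative):

    import random

    def forward_regular(table, h_d):
        """Regular ant: next hop drawn according to the learned probabilities."""
        neighbours = list(table[h_d])
        weights = [table[h_d][y] for y in neighbours]
        return random.choices(neighbours, weights=weights, k=1)[0]

    def forward_uniform(table, h_d):
        """Uniform ant: any neighbour with equal probability 1/e."""
        return random.choice(list(table[h_d]))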

  26. Ants: Comparison

  27. Q-Routing vs. Ants • Q-Routing only changes its currently selected route when the cost of that route increases, not when the cost of an alternate route decreases • Q-Routing involves overhead linear in the volume of traffic in the network; ants are effectively free in moderate traffic • Q-Routing cannot route messages by parallel paths; uniform ants can

  28. Ants with Evaporation • Evaporation mirrors real life, where the pheromone laid by real ants evaporates. • Link-usage statistics are used for evaporation: E(x) is the proportion of ants received from node x out of the total ants received by the current node.
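A small sketch of this usage statistic, assuming a per-node counter of received ants; how the factor is then applied to the probability table is not spelled out on the slide, so only E(x) itself is computed here:

    def evaporation_factor(ant_counts, x):
        """E(x): share of ants received via node x out of all ants received here."""
        total = sum(ant_counts.values())
        return ant_counts[x] / total if total else 0.0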

  29. Summary • Routing algorithms that assume a static network don’t work well in real-world networks, which are dynamic • Adaptive routing algorithms avoid these problems, at the cost of a linear increase in the size of the routing tables • Q-Routing is a straightforward application of Q-Learning to the routing problem • Routing with ants is more flexible than Q-Routing

  30. References • Boyan, J., & Littman, M. (1994). Packet routing in dynamically changing networks: A reinforcement learning approach. In Advances in Neural Information Processing Systems 6 (NIPS 6), pp. 671-678. San Francisco, CA: Morgan Kaufmann. • Di Caro, G., & Dorigo, M. (1998). Two ant colony algorithms for best-effort routing in datagram networks. In Proceedings of the Tenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS'98), pp. 541-546. IASTED/ACTA Press. • Choi, S., & Yeung, D.-Y. (1996). Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. In Advances in Neural Information Processing Systems 8 (NIPS 8), pp. 945-951. MIT Press. • Dorigo, M., Maniezzo, V., & Colorni, A. (1996). The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26 (1), 29-41.
