
Towards Autonomous Data Ferry Route Design in Delay Tolerant Networks

This research develops autonomous data ferry route design for delay tolerant networks using queueing theory and a Markov decision process (MDP) approach. The goal is to minimize average delay and improve the efficiency of data transfer in sparsely distributed sensor networks.



Presentation Transcript


  1. Towards Autonomous Data Ferry Route Design in Delay Tolerant Networks. Daniel Henkel, Timothy X Brown, University of Colorado at Boulder. WoWMoM/AOC ‘08, June 23, 2008.

  2. Familiar: Dial-A-Ride. Dial-A-Ride: curb-to-curb, shared-ride transportation service. The Bus: • receives calls • picks up and drops off passengers • transports people quickly! [Figure: a bus route serving requests 1–5 among a hospital, a school, and the depot.] The optimal route is not trivial!

  3. In context: Dial-A-UAV. Complication: infinite data at the sensors; potentially two-way traffic. Delay tolerant traffic! [Figure: a UAV ferrying data between Sensors 1–6 and a Monitoring Station.] • Sparsely distributed sensors, limited radios • TSP solution not optimal • Our approach: queueing and MDP theory.

  4. TSP’s Problem. Traveling Salesman Solution: one cycle visits every node. Problem: far-away nodes with little data to send. Better: visit them less often. New: a cycle defined by visit frequencies p_i. [Figure: UAV and hub serving nodes A and B with distances d_A, d_B, traffic rates f_A, f_B, and visit probabilities p_A, p_B.]

  5. Queueing Approach. Goal: minimize average delay. Idea: express the delay in terms of p_i, then minimize over the set {p_i}. • p_i as a probability distribution over nodes • Expected service time of any packet • Inter-service time: exponential distribution with mean T/p_i, where T is the mean trip duration • Weighted delay: D = sum_i (f_i/F) · (T/p_i), where F = sum_j f_j is the total traffic rate (minimized in the sketch below). [Figure: UAV and hub serving nodes A–D with traffic rates f_A..f_D, distances d_A..d_D, and visit probabilities p_A..p_D.]
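The minimization itself is not preserved in the transcript; the following is a minimal sketch, assuming a round trip to node i takes t_i = 2·d_i/v, the mean trip duration is T = sum_j p_j·t_j, and a Lagrange-multiplier argument on the weighted delay then gives p_i proportional to sqrt(f_i/d_i). The function names and example numbers are illustrative.

```python
import math

def optimal_visit_probabilities(f, d):
    """Visit probabilities minimizing D = sum_i (f_i/F) * T/p_i
    with T = sum_j p_j * (2*d_j/v); a Lagrange-multiplier argument
    yields p_i proportional to sqrt(f_i / d_i)."""
    w = [math.sqrt(fi / di) for fi, di in zip(f, d)]
    s = sum(w)
    return [wi / s for wi in w]

def average_delay(p, f, d, v=1.0):
    """Average delay under visit distribution p in the same model."""
    T = sum(pi * 2 * di / v for pi, di in zip(p, d))  # mean trip duration
    F = sum(f)                                        # total traffic rate
    return sum(fi / F * T / pi for fi, pi in zip(f, p))

# Example: node B is far away and sends little data.
f = [5.0, 1.0]   # traffic rates f_A, f_B
d = [1.0, 4.0]   # distances d_A, d_B from the hub
p = optimal_visit_probabilities(f, d)
print(p)                         # p_A >> p_B: B is visited less often
print(average_delay(p, f, d))    # ~6.0
print(average_delay([0.5, 0.5], f, d))  # uniform choice: ~10.0
```

In the example, the far, low-traffic node gets a much smaller visit probability, and the resulting average delay beats the uniform visit-everyone-equally choice, which is the slide's point against a single TSP cycle.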

  6. Solution and Algorithm. Improvement over TSP! Probability of choosing node i for the next visit: p_i from the queueing solution. Implementation: deterministic algorithm (sketched below). 1. Set c_i = 0 for all i. 2. While max{c_i} < 1: c_i = c_i + p_i. 3. k = argmax{c_i}. 4. Visit node k; c_k = c_k − 1. 5. Go to 2. • Pretty simplistic view of the world! • Random selection ignores many parameters.
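A short runnable sketch of the deterministic credit algorithm from this slide; the example probabilities are illustrative.

```python
def deterministic_schedule(p, num_visits):
    """Credit-based deterministic version of the randomized visit rule.
    Each pass adds p_i to node i's credit until some credit reaches 1,
    then visits the node with the largest credit and debits it by 1."""
    c = [0.0] * len(p)                  # step 1: credits start at zero
    schedule = []
    while len(schedule) < num_visits:
        while max(c) < 1.0:             # step 2: accumulate credit
            c = [ci + pi for ci, pi in zip(c, p)]
        k = max(range(len(c)), key=c.__getitem__)  # step 3: argmax
        schedule.append(k)              # step 4: visit node k ...
        c[k] -= 1.0                     # ... and debit its credit
    return schedule                     # step 5: repeat

# Example with illustrative probabilities: node 0 is visited ~3x as often.
print(deterministic_schedule([0.6, 0.2, 0.2], 10))
```

Over a long horizon each node k is visited a fraction p_k of the time, but deterministically, without the variance of random selection.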

  7. There’s More to It! New perspective. States: • number of people waiting at each location • varying number of calls (time of day) • current bus location. Actions: • drive to a location. Goal: • short passenger wait time. [Figure: requests 1–5 and the depot.] Generally an unknown environment.

  8. Promising Technique: Reinforcement Learning (an AI technique). • Learning what to do without prior training • Given: a high-level goal; NOT: how to reach it • Improving actions on the go. Features: • interaction with the environment • the concept of rewards and punishments • trial-and-error search. Example: riding a bike.

  9. The Framework. Agent: • performs actions. Environment: • gives rewards • puts the agent in situations called states. Goal: • learn what to do in a given state (a policy). The beauty: the agent learns a model of the environment and retains it.
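A minimal sketch of the agent/environment loop described on this slide; the Environment and Agent classes and all numbers are toy placeholders, not from the talk.

```python
import random

class Environment:
    """Toy environment: states and rewards are illustrative placeholders."""
    def reset(self):
        return 0                                   # initial state
    def step(self, state, action):
        next_state = (state + action) % 4          # toy transition
        reward = 1.0 if next_state == 0 else 0.0   # toy reward
        return next_state, reward

class Agent:
    def act(self, state):
        return random.choice([0, 1])               # placeholder policy

env, agent = Environment(), Agent()
state = env.reset()
for t in range(10):                     # the interaction loop:
    action = agent.act(state)           # agent performs an action ...
    state, reward = env.step(state, action)  # ... environment returns a
                                        # reward and a new state
```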

  10. Markov Decision Process. Series of states and actions: s_t, a_t → r_{t+1}, s_{t+1}, a_{t+1} → r_{t+2}, s_{t+2}, a_{t+2} → r_{t+3}, s_{t+3}, ... Markov property: the reward and next state depend only on the current state and action, and not on the history of states or actions: P(s_{t+1}, r_{t+1} | s_t, a_t, s_{t−1}, a_{t−1}, ...) = P(s_{t+1}, r_{t+1} | s_t, a_t).

  11. MDP Terms. • Policy: mapping from the set of states to the set of actions. • Return: sum of (discounted) rewards from this time onwards, R_t = r_{t+1} + γ·r_{t+2} + γ²·r_{t+3} + ... • Value function (of a state): expected return when starting in s and following policy π; for an MDP, V^π(s) = E_π[R_t | s_t = s]. • Solution methods: dynamic programming, Monte Carlo simulation, temporal difference learning (a policy-evaluation sketch follows below).
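A small sketch of evaluating V^π by dynamic programming on a toy two-state MDP; the transition probabilities, rewards, and discount factor are made up for illustration.

```python
# Iterative policy evaluation:
#   V(s) <- sum_{s'} P[s][s'] * (R[s][s'] + gamma * V(s'))
# on a toy two-state chain under a fixed policy; all numbers illustrative.
P = {0: {0: 0.5, 1: 0.5},    # P[s][s'] = transition probability
     1: {0: 0.9, 1: 0.1}}
R = {0: {0: 0.0, 1: 1.0},    # R[s][s'] = reward on transition s -> s'
     1: {0: 0.0, 1: 5.0}}
gamma = 0.9                  # discount factor from the return definition

V = {0: 0.0, 1: 0.0}
for _ in range(200):         # sweep until the values stop changing
    V = {s: sum(p * (R[s][s2] + gamma * V[s2]) for s2, p in P[s].items())
         for s in P}
print(V)                     # expected discounted return from each state
```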

  12. UAV Path Planning. [Figure: ferry F and hub H serving nodes A–D with traffic arrival rates λ_A..λ_D.] • State: tuple of accumulated node traffic. • Actions: a round trip through a subset of the nodes, e.g., A, B, C, D, AB, AC, …, DCBA (enumerated in the sketch below).
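A quick sketch of enumerating this action space; the four-node set matches the slide, everything else is illustrative.

```python
from itertools import permutations

nodes = ["A", "B", "C", "D"]
# Actions: one round trip through any non-empty ordered subset of nodes,
# e.g. A, B, ..., AB, AC, ..., DCBA (as listed on the slide).
actions = [p for r in range(1, len(nodes) + 1)
           for p in permutations(nodes, r)]
print(len(actions))   # 4 + 12 + 24 + 24 = 64 possible round trips
print(actions[:6])    # ('A',), ('B',), ('C',), ('D',), ('A','B'), ('A','C')
```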

  13. Reward Criterion Reward:

  14. Learning. Extract the policy from the value function. Temporal Difference learning: • recursive state-value approximation, V(s_t) ← V(s_t) + α·[r_{t+1} + γ·V(s_{t+1}) − V(s_t)] • convergence to the “true value” as every state is visited infinitely often and the step size α is decayed appropriately. A sketch follows below.
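A compact TD(0) sketch on the same toy chain used in the policy-evaluation example above; the step size and iteration count are illustrative.

```python
import random

# TD(0) on the toy two-state chain (all numbers illustrative).
P = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 5.0}}
gamma, alpha = 0.9, 0.05

V = {0: 0.0, 1: 0.0}
s = 0
for t in range(200_000):
    # sample the next state from P[s] and observe the transition reward
    s2 = random.choices(list(P[s]), weights=P[s].values())[0]
    r = R[s][s2]
    # TD(0): move V(s) toward the one-step target r + gamma * V(s2)
    V[s] += alpha * (r + gamma * V[s2] - V[s])
    s = s2
print(V)   # approaches the DP values from the policy-evaluation sketch
```

Unlike the dynamic-programming sweep, this update needs no model of P and R, only sampled transitions, which is what makes it usable in the "generally unknown environment" of slide 7.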

  15. Paths. [Figure: resulting ferry paths; image not included in the transcript.]

  16. Simulation Results. [Figure: results plot not included in the transcript.] • RR = Round Robin (naive) • STO = stochastic modeling • TSP = Traveling Salesman solution • RL = reinforcement learning.

  17. Conclusion/Extensions. • Shown two algorithms to route UAVs • RL is a viable approach. Extensions: • structured state space • action space (options theory) • hierarchical structure / peer-to-peer flows • interrupt the current action and start over • adapt and optimize the learning method.

  18. Soccer - Euro Cup 2008 Wednesday, 11:45am (PST) Germany – Turkey [ 4 : 1 ]

  19. Research & Engineering Center for Unmanned Vehicles (RECUV). Questions? Research and Engineering Center for Unmanned Vehicles, University of Colorado at Boulder, http://recuv.colorado.edu. RECUV is a university, government, and industry partnership dedicated to advancing knowledge and capabilities in using unmanned vehicles for scientific experiments, collecting geospatial data, mitigating natural and man-made disasters, and defending against terrorist and hostile military activities.
