This research focuses on developing an autonomous data ferry route design in delay tolerant networks using a queueing and MDP theory approach. The goal is to minimize average delay and improve the efficiency of data transfer in sparsely distributed sensor networks.
Towards Autonomous Data Ferry Route Design in Delay Tolerant Networks
Daniel Henkel, Timothy X Brown
University of Colorado at Boulder
WoWMoM/AOC '08, June 23, 2008
Familiar: Dial-A-Ride
Dial-A-Ride: curb-to-curb, shared-ride transportation service
• Receives calls
• Picks up and drops off passengers
• Transports people quickly!
[Diagram: requests 1-5 scattered around a bus route connecting a hospital, a school, and the depot]
Finding the optimal route is not trivial!
In context: Dial-A-UAV
Complication: infinite data at sensors; potentially two-way traffic. Delay-tolerant traffic!
[Diagram: six sparsely distributed sensors (Sensor-1 through Sensor-6) and a monitoring station]
• Sparsely distributed sensors, limited radios
• TSP solution not optimal
• Our approach: queueing and MDP theory
TSP's Problem
Traveling Salesman Solution: one cycle visits every node.
• Problem: far-away nodes with little data to send
• Better: visit them less often
New: a cycle defined by visit frequencies p_i.
[Diagram: UAV hub serving nodes A and B with distances d_A, d_B, traffic rates f_A, f_B, and visit probabilities p_A, p_B]
Queueing Approach
Goal: minimize average delay.
Idea: express delay in terms of the p_i, then minimize over the set {p_i}.
• p_i as a probability distribution over nodes
• Expected service time of any packet
• Inter-service time: exponential distribution with mean T/p_i
• Weighted delay over all nodes
[Diagram: UAV hub serving nodes A-D with distances d_A-d_D, traffic rates f_A-f_D, and visit probabilities p_A-p_D]
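The slide's equations were images and did not survive extraction. A sketch consistent with the stated model (inter-service time exponential with mean T/p_i, traffic rate f_i, and per-visit round-trip time d_i for node i) is the following; it is a plausible reconstruction, not necessarily the authors' exact formulation:

```latex
% Assumed reconstruction of the weighted-delay objective.
\begin{align}
  \bar{D} &= \sum_i \frac{f_i}{F}\,\frac{T}{p_i},
  \qquad T = \sum_j p_j d_j,\qquad F = \sum_j f_j
\end{align}
% Setting \partial\bar{D}/\partial p_i = 0 subject to \sum_i p_i = 1
% gives visit probabilities
\begin{align}
  p_i \;\propto\; \sqrt{\frac{f_i}{d_i}}
\end{align}
```

Under this sketch, busy nodes are visited more often and distant nodes less often, matching the slide's intuition about far-away nodes with little data.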
Solution and Algorithm
Probability of choosing node i for the next visit: p_i. An improvement over TSP!
Implementation: deterministic algorithm
1. Set c_i = 0 for all i
2. c_i = c_i + p_i for all i, while max{c_i} < 1
3. k = argmax_i {c_i}
4. Visit node k; c_k = c_k - 1
5. Go to 2.
• Still a fairly simplistic view of the world!
• Random selection ignores many parameters.
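The deterministic algorithm above can be sketched in a few lines of Python (function and variable names are illustrative, not from the original):

```python
# Sketch of the slide's deterministic visit scheduler.
# Each node accumulates "credit" p_i per step; the node whose credit
# first reaches 1 is visited next and pays back one unit of credit.

def visit_schedule(p, n_visits):
    """Generate n_visits node indices from visit probabilities p."""
    credits = [0.0] * len(p)
    schedule = []
    for _ in range(n_visits):
        # Step 2: accumulate credit until some node reaches 1
        while max(credits) < 1.0:
            credits = [c + pi for c, pi in zip(credits, p)]
        # Steps 3-4: visit the node with the most credit, charge one unit
        k = max(range(len(credits)), key=lambda i: credits[i])
        schedule.append(k)
        credits[k] -= 1.0
    return schedule

# A node with p = 0.5 is visited twice as often as one with p = 0.25.
print(visit_schedule([0.5, 0.25, 0.25], 8))
```

Over a long horizon, each node's visit count is proportional to its p_i, which is exactly the frequency-based cycle the slides motivate.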
There's More to It!
New perspective:
States: number of people waiting at each location; varying number of calls (time of day); current bus location
Actions: drive to a location
Goal: short passenger wait time
[Diagram: requests 1-5 and the depot, as in the Dial-A-Ride example]
Generally unknown environment
Promising Technique: Reinforcement Learning (AI technique)
• Learning what to do without prior training
• Given: a high-level goal; NOT how to reach it
• Improving actions on the go
Features:
• Interaction with the environment
• Concept of rewards and punishments
• Trial-and-error search
Example: learning to ride a bike
The Framework
Agent: performs actions
Environment: gives rewards; puts the agent in situations called states
Goal: learn what to do in a given state (a policy)
The beauty: the agent learns a model of its environment and retains it.
Markov Decision Process
Series of states and actions: s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1}, r_{t+2}, s_{t+2}, a_{t+2}, r_{t+3}, s_{t+3}, ...
Markov property: the reward and next state depend only on the current state and action, not on the history of states or actions.
MDP Terms
• Policy: mapping from the set of states to the set of actions
• Return: sum of rewards from this time onwards
• Value function (of a state): expected return when starting in s and following policy π
Solution methods:
• Dynamic programming, Monte Carlo simulation
• Temporal-difference learning
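The definitions the slide lists were shown as equation images that were lost in extraction. The standard forms they refer to (assuming the usual discounted-return setting) are:

```latex
% Reconstruction of the standard MDP definitions the slide alludes to.
\begin{align}
  R_t &= r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \dots
       = \sum_{k=0}^{\infty} \gamma^k\, r_{t+k+1} \\
  V^{\pi}(s) &= \mathbb{E}_{\pi}\!\left[\, R_t \mid s_t = s \,\right]
\end{align}
```

Here γ ∈ [0, 1) is the discount factor; the solution methods listed above all estimate V^π or its action-value analogue.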
UAV Path Planning
• State: tuple of accumulated traffic at each node
• Actions: round trip through a subset of nodes, e.g., A, B, C, D, AB, AC, ..., DCBA
[Diagram: ferry F at hub H serving nodes A-D with arrival rates λ_A-λ_D]
Reward Criterion Reward:
Learning
Extract the policy from the value function.
Temporal-Difference Learning
• Recursive state-value approximation
• Convergence to the "true value" under suitable step-size conditions
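The recursive state-value approximation can be sketched as the classic tabular TD(0) backup (this is a generic illustration, not the authors' exact update):

```python
# Sketch of tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference backup of the state-value table V."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

# Toy two-state chain: state 0 always transitions to terminal state 1
# with reward 1; repeated backups drive V[0] toward 1.
V = {0: 0.0, 1: 0.0}
for _ in range(1000):
    td0_update(V, s=0, r=1.0, s_next=1)
print(round(V[0], 3))  # approaches 1.0
```

Each backup nudges the current state's value toward the one-step bootstrapped target, which is the "recursive approximation" the slide names.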
Simulation Results • RR = Round Robin (naive) • STO = Stochastic Modeling • TSP = Traveling Salesman solution • RL = Reinforcement Learning
Conclusion / Extensions
• Presented two algorithms to route UAVs
• RL is a viable approach
Extensions:
• Structured state space
• Action space (options theory)
• Hierarchical structure / peer-to-peer flows
• Interrupt the current action and start over
• Adapt and optimize the learning method
Soccer - Euro Cup 2008 Wednesday, 11:45am (PST) Germany – Turkey [ 4 : 1 ]
Questions
Research and Engineering Center for Unmanned Vehicles (RECUV)
University of Colorado at Boulder
http://recuv.colorado.edu
RECUV is a university, government, and industry partnership dedicated to advancing knowledge and capabilities in using unmanned vehicles for scientific experiments, collecting geospatial data, mitigating natural and man-made disasters, and defending against terrorist and hostile military activities.