
Decentralised Self-Managing Systems K-Components & Collaborative Reinforcement Learning


Presentation Transcript


  1. Decentralised Self-Managing Systems: K-Components & Collaborative Reinforcement Learning
     Jim Dowling, Vinny Cahill
     Distributed Systems Group, Trinity College Dublin

  2. Self-Managed Decentralised Systems
     • Characteristics of Dynamic Environments
     • Lack of Global State
     • Network Dynamism
     • Uncertainty (of adaptation actions)
     • Distributed Agreement not Possible
     • What does Self-Managed mean to us?
     • System-wide self-* properties are established and maintained solely by the decentralised coordination and adaptation of components that execute using only a partial system view and without reference to the system-wide self-* properties
     • Self-* Properties are "Emergent"
     • Experimentation/Simulation for Evaluation
     Jim Dowling, WOSS 2004

  3. K-Component Model
     • Self-Adaptive Component
     • provides i/f
     • uses i/fs
     • state
     • Action
     • Adaptation Contract
     • Autonomous
     • Rule-Based, ECA
     • Unsupervised Learning
     • Architecture Meta Model
     • Auto-Generated
     • Partial System View

  4. Model-Based Reinforcement Learning
     • Markov Decision Process = {States}, {Actions}, R(States, Actions) -> R
     [Figure labels: 1. Action Reward; 2. State Transition Model; 3. Next State Reward]
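The slide names the ingredients of model-based reinforcement learning (states, actions, an action reward, and a state transition model) without giving an algorithm. The following is a minimal Python sketch of one common way to combine them: transition probabilities are estimated from observed counts and state values are computed by repeated Bellman backups. The class name `ModelBasedLearner`, the discount factor `gamma`, and the example rewards are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict


class ModelBasedLearner:
    """Hypothetical sketch: learn a transition model from experience,
    then derive state values from that model."""

    def __init__(self, states, actions, gamma=0.9):
        self.states = list(states)
        self.actions = list(actions)
        self.gamma = gamma  # discount factor (assumed, not from the slides)
        # (state, action) -> next_state -> number of times observed
        self.counts = defaultdict(lambda: defaultdict(int))
        self.rewards = {}  # (state, action) -> last observed action reward
        self.V = {s: 0.0 for s in self.states}

    def observe(self, state, action, reward, next_state):
        """Record one experienced transition."""
        self.counts[(state, action)][next_state] += 1
        self.rewards[(state, action)] = reward

    def transition_probs(self, state, action):
        """Estimate P(next_state | state, action) from the counts."""
        seen = self.counts.get((state, action))
        if not seen:
            return {}
        total = sum(seen.values())
        return {s2: n / total for s2, n in seen.items()}

    def q_value(self, state, action):
        """Action reward plus discounted expected value of the next state."""
        expected_next = sum(
            p * self.V[s2]
            for s2, p in self.transition_probs(state, action).items()
        )
        return self.rewards.get((state, action), 0.0) + self.gamma * expected_next

    def sweep(self, iterations=50):
        """Repeated Bellman backups over all states with known actions."""
        for _ in range(iterations):
            for s in self.states:
                qs = [self.q_value(s, a) for a in self.actions
                      if (s, a) in self.rewards]
                if qs:
                    self.V[s] = max(qs)


# Example with made-up rewards (costs are negative rewards).
learner = ModelBasedLearner(
    states=["buffered", "stored", "forwarded"],
    actions=["store", "forward"],
)
learner.observe("buffered", "store", -10.0, "stored")
learner.observe("buffered", "forward", -6.0, "forwarded")
learner.sweep()
```

In this toy run `stored` and `forwarded` have no outgoing actions, so their values stay at zero and the value of `buffered` is simply the better of the two action rewards.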

  5. Decentralised System Optimisation
     • Coordinating the solution to a set of Discrete Optimisation Problems (DOPs)
     • Components have a Partial System View
     • Coordination Actions
     • Actions = {delegation} ∪ {DOP actions} ∪ {discovery}
     • Connection Costs

  6. Collaborative Reinforcement Learning
     • Advertisement
     • Update Partial Views of Neighbours
     • Asynchronous, Synchronous
     • Decay
     • Negative Feedback on State Values in the Absence of Advertisements
     [Figure labels: Cached Neighbour's V-value; State Transition Model; Action Reward; Connection Cost]
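The interplay of advertisement and decay described on this slide can be sketched in a few lines of Python. The slides do not give the exact update rule, so the sketch assumes the simplest reading: an advertisement overwrites the cached V-value for a neighbour, and each decay step multiplies every cached value by a rate greater than 1, which makes negative (cost) values progressively worse until a fresh advertisement arrives. The class `NeighbourCache` and its method names are hypothetical.

```python
class NeighbourCache:
    """Hypothetical cache of neighbours' advertised V-values."""

    def __init__(self, decay_rate):
        # With negative values (costs), a rate above 1 makes stale
        # entries look worse over time (assumed interpretation).
        self.decay_rate = decay_rate
        self.values = {}

    def on_advertisement(self, neighbour, v_value):
        """A neighbour advertised its V-value: refresh the cached copy."""
        self.values[neighbour] = v_value

    def decay(self):
        """Apply negative feedback to every cached value."""
        self.values = {n: v * self.decay_rate for n, v in self.values.items()}


cache = NeighbourCache(decay_rate=1.05)
cache.on_advertisement("C", -4.0)

# No advertisement from C for two decay periods: the cached value degrades.
cache.decay()
cache.decay()
stale_value = cache.values["C"]

# A new advertisement restores an up-to-date view of C.
cache.on_advertisement("C", -4.0)
fresh_value = cache.values["C"]
```

The effect is that a neighbour which stops advertising (for example because it has left the network) gradually becomes less attractive without any explicit failure detection.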

  7. Decentralised File Storage Application in K-Components

     component FileStorage {
         provides File;
         uses File n0, n1, n2;
         state load, buffered, stored, forwarded;
         action store, forward0, forward1, forward2;
     };

     incoming contract LoadBalance(FileStorage::File) {
         ListStates crl_states = {buffered, forwarded, stored};
         ListActions crl_actions = {store, forward0, forward1, forward2};
         decay fs_decay(forwarded, 1.05);
         crl_policy lb(crl_states, crl_actions, fs_decay);
         if (poll_event(buffered) > 0)
             start_mdp(lb);
     }

  8. Software Engineering using CRL - Tuning & Experimentation
     • Advertisement = RPC | Events
     • Rate of decay, ρ
     • Connection Cost Model
     • rF, rS
     • Action Reward Model
     • R(buffered, forward_i)
     • R(buffered, store)
     • R(buffered, discover)
     • Temperature, T
     Maximise Resource Utilisation
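The temperature parameter T listed above is conventionally used in reinforcement learning for Boltzmann (softmax) action selection, where a high temperature favours exploration and a low temperature favours the currently best action. The slides do not state how T is used, so the Python sketch below assumes that standard scheme. The function name and the example values are illustrative only.

```python
import math


def boltzmann_probabilities(q_values, temperature):
    """Return action -> selection probability under softmax with temperature.

    q_values maps each action name to its estimated value.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(q_values.values())
    weights = {a: math.exp((q - best) / temperature)
               for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Made-up values for the four actions of the FileStorage component.
q = {"store": -10.0, "forward0": -6.0, "forward1": -9.0, "forward2": -12.0}

hot = boltzmann_probabilities(q, temperature=10.0)   # close to uniform
cold = boltzmann_probabilities(q, temperature=0.5)   # close to greedy
```

Tuning T therefore trades off how quickly the system commits to the cheapest neighbour against how readily it notices that another neighbour has become cheaper.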

  9. Load Balancing in Collaborative Reinforcement Learning
     [Figure: nodes A, B, C and D. Legend: Advertised Load Cost, Local Storage Cost, Total (Connection) Cost, Start State, Terminal State. Values shown: -7, -4, -6 (-2), -9 (-2), -6, -10, -12 (-2), -10 (0).]
     1. File arrives at A.
     2. A's local storage cost = -10. C's cost is lower: (-4 + -2) = -6.
     3. File is forwarded to C.
     4. A's advertised cost = -6.
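The arithmetic in the four steps above can be reproduced with a short Python sketch. A's local cost (-10) and C's advertised and connection costs (-4 and -2) come from the slide text. The values assigned to B and D are one reading of the figure residue and should be treated as assumptions. The sketch also picks the best action greedily, which is a simplification of whatever selection policy the system actually uses.

```python
def total_costs(local_storage_cost, neighbours):
    """Map each available action to its total cost.

    neighbours maps a neighbour name to a pair
    (advertised cost, connection cost).
    """
    costs = {"store": local_storage_cost}
    for name, (advertised, connection) in neighbours.items():
        costs["forward to " + name] = advertised + connection
    return costs


local_storage_cost = -10  # from the slide text
neighbours = {
    "B": (-7, -2),   # assumed reading of the figure
    "C": (-4, -2),   # from the slide text
    "D": (-10, -2),  # assumed reading of the figure
}

costs = total_costs(local_storage_cost, neighbours)

# Costs are negative rewards, so the best action has the largest value.
best_action = max(costs, key=costs.get)

# A then advertises the value of its best action to its own neighbours.
advertised_cost = costs[best_action]
```

With these numbers the file is forwarded to C and A advertises -6, matching steps 3 and 4.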

  10. Experimental Setup
      Experiments
      • Homogeneous Components
      • File Server @ c20
      • Three Load Generators
      • File Servers @ c20, c6
      Peak Load Generator (Simulates Multimedia Traffic)

  11. 1. Homogeneous Components

  12. 2. Single Server at c20

  13. 3. Three Load Generators

  14. 4. Two Servers, c6 and c20

  15. Analysis and Conclusions
      • Load Balancing system adapts and optimises to a changing environment
      • Positive Feedback
      • "Adaptation Agility" as important as Convergence in Decentralised Optimisation
      • Other Application Areas for CRL
      • Network Routing Protocol for MANETs
      • SAMPLE [WONS '05]
      • Traffic Management
      • Peer-to-Peer Systems
