Decentralised Self-Managing Systems K-Components & Collaborative Reinforcement Learning

Jim Dowling, Vinny Cahill
Distributed Systems Group, Trinity College Dublin

Self-Managed Decentralised Systems
• Characteristics of Dynamic Environments
  • Lack of Global State
  • Network Dynamism
  • Uncertainty (of adaptation actions)
  • Distributed Agreement not Possible
• What does Self-Managed mean to us?
  • System-wide self-* properties are established and maintained solely by the decentralised coordination and adaptation of components that execute using only a partial system view and without reference to the system-wide self-* properties
• Self-* Properties are “Emergent”
• Experimentation/Simulation for Evaluation
Jim Dowling, WOSS 2004

K-Component Model
• Self-Adaptive Component
  • Provides interface
  • Uses interfaces
  • State
  • Action
• Adaptation Contract
  • Autonomous
  • Rule-Based, ECA (Event-Condition-Action)
  • Unsupervised Learning
• Architecture Meta Model
  • Auto-Generated
  • Partial System View

Model-Based Reinforcement Learning
• Markov Decision Process = {States}, {Actions}, R: States × Actions → ℝ
[Figure labels: 1. Action Reward; 2. State Transition Model; 3. Next State Reward]
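The three ingredients named on this slide (action reward, state transition model, next-state value) can be combined in a few lines. What follows is a minimal sketch, not the authors' implementation: the toy states, the reward numbers, the transition probabilities and the helper names `q_value` and `value_iteration` are all assumptions chosen for illustration.

```python
# Hypothetical toy MDP; names and numbers are illustrative assumptions only.
STATES = ["buffered", "stored", "forwarded"]
ACTIONS = {"buffered": ["store", "forward"], "stored": [], "forwarded": []}

# 1. Action reward R(s, a) (assumed values, expressed as negative costs).
REWARD = {("buffered", "store"): -10.0, ("buffered", "forward"): -2.0}

# 2. State transition model P(s' | s, a) (assumed values).
TRANSITION = {
    ("buffered", "store"): {"stored": 1.0},
    ("buffered", "forward"): {"forwarded": 0.9, "buffered": 0.1},
}


def q_value(state, action, values, gamma=1.0):
    """Action reward plus the expected value of the next state."""
    # 3. Next-state value, weighted by the transition model.
    expected_next = sum(
        prob * values[next_state]
        for next_state, prob in TRANSITION[(state, action)].items()
    )
    return REWARD[(state, action)] + gamma * expected_next


def value_iteration(sweeps=50, gamma=1.0):
    """Repeatedly back up state values; terminal states keep value 0."""
    values = {state: 0.0 for state in STATES}
    for _ in range(sweeps):
        for state in STATES:
            if ACTIONS[state]:  # skip terminal states, which have no actions
                values[state] = max(
                    q_value(state, action, values, gamma)
                    for action in ACTIONS[state]
                )
    return values
```

Calling `value_iteration()` returns a value per state; in this toy model forwarding is preferred from `buffered` because its assumed cost is lower than storing locally.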

Decentralised System Optimisation
• Coordinating the solution to a set of Discrete Optimisation Problems (DOPs)
• Components have a Partial System View
• Coordination Actions
  • Actions = {delegation} ∪ {DOP actions} ∪ {discovery}
• Connection Costs

Collaborative Reinforcement Learning
• Advertisement
  • Update Partial Views of Neighbours
  • Asynchronous, Synchronous
• Decay
  • Negative Feedback on State Values in the Absence of Advertisements
[Figure labels: Cached Neighbour’s V-value; State Transition Model; Action Reward; Connection Cost]
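The interplay of advertisement and decay can be sketched as a small cache of neighbour values. This is an illustrative sketch under stated assumptions: the class `NeighbourCache` and its method names are hypothetical, values are treated as negative costs, and decay is assumed to be multiplicative (the factor 1.05 is borrowed from the `fs_decay` declaration in the file storage example; how the K-Component runtime actually applies it is not stated in the slides).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NeighbourCache:
    """Hypothetical cache of V-values advertised by neighbouring components."""

    decay_rate: float = 1.05  # assumed multiplicative decay factor
    values: Dict[str, float] = field(default_factory=dict)

    def on_advertisement(self, neighbour: str, v_value: float) -> None:
        # An advertisement refreshes the partial view of that neighbour.
        self.values[neighbour] = v_value

    def decay(self) -> None:
        # With no fresh advertisement, cached (negative) values drift lower,
        # which acts as negative feedback on stale information.
        for neighbour in self.values:
            self.values[neighbour] *= self.decay_rate

    def estimated_value(self, neighbour: str, connection_cost: float) -> float:
        # Cached neighbour V-value plus the cost of reaching that neighbour.
        return self.values[neighbour] + connection_cost
```

For example, a neighbour advertised at -4 that stays silent for one decay step is then valued at about -4.2, so it looks slightly less attractive until it advertises again.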

Decentralised File Storage Application in K-Components

    component FileStorage {
        provides File;
        uses File n0, n1, n2;
        state load, buffered, stored, forwarded;
        action store, forward0, forward1, forward2;
    };

    incoming contract LoadBalance(FileStorage::File) {
        ListStates crl_states = {buffered, forwarded, stored};
        ListActions crl_actions = {store, forward0, forward1, forward2};
        decay fs_decay(forwarded, 1.05);
        crl_policy lb(crl_states, crl_actions, fs_decay);
        if (poll_event(buffered) > 0)
            start_mdp(lb);
    }

Software Engineering using CRL - Tuning & Experimentation
• Advertisement = RPC | Events
• Rate of decay, ρ
• Connection Cost Model
  • rF, rS
• Action Reward Model
  • R(buffered, forward_i)
  • R(buffered, store)
  • R(buffered, discover)
• Temperature, T
Goal: Maximise Resource Utilisation
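The slides list the temperature T as a tuning parameter without saying how it is used. A common role for such a parameter in reinforcement learning is Boltzmann (softmax) action selection, and the sketch below assumes that reading; the function names and the example Q-values are hypothetical.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Map action values to selection probabilities using temperature T."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the best value before exponentiating for numerical stability.
    best = max(q_values.values())
    weights = {
        action: math.exp((q - best) / temperature)
        for action, q in q_values.items()
    }
    total = sum(weights.values())
    return {action: weight / total for action, weight in weights.items()}


def choose_action(q_values, temperature, rng=random):
    """Sample one action according to the Boltzmann probabilities."""
    probabilities = boltzmann_probabilities(q_values, temperature)
    actions = list(probabilities)
    weights = [probabilities[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Under this assumption a high T makes the choice nearly uniform (more exploration), while a low T makes the component pick the best-valued action almost every time.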

Load Balancing in Collaborative Reinforcement Learning
[Figure: nodes A, B, C and D annotated with advertised load costs. A is the local start state; storage is the terminal state. Labels give total cost with connection cost in parentheses: -6 (-2), -9 (-2), -12 (-2), -10 (0).]
1. File arrives at A.
2. A’s local storage cost = -10. C’s cost is lower: (-4 + -2) = -6.
3. File is forwarded to C.
4. A’s advertised cost = -6.
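The arithmetic in the four steps above can be checked with a short sketch. The function name `best_option` and the data layout are assumptions made for illustration; only the numbers for A and C come from the slide.

```python
def best_option(local_cost, neighbours):
    """Pick the action with the highest value (least negative cost).

    neighbours maps a neighbour name to (advertised_cost, connection_cost).
    """
    options = {"store": local_cost}
    for name, (advertised, connection) in neighbours.items():
        # Total cost of delegating = neighbour's advertised cost + connection cost.
        options[f"forward to {name}"] = advertised + connection
    action = max(options, key=options.get)
    # The value of the chosen action is what the component would advertise.
    return action, options[action]


# Values from the slide: A stores locally at -10; C advertises -4 over a -2 link.
action, advertised_by_a = best_option(-10, {"C": (-4, -2)})
print(action, advertised_by_a)
```

With the slide's numbers the sketch selects forwarding to C, and the resulting value of -6 is what A advertises to its own neighbours.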

Experimental Setup
Experiments
• Homogeneous Components
• File Server @ c20
• Three Load Generators
• File Servers @ c20, c6
[Figure label: Peak Load Generator (Simulates Multimedia Traffic)]

1. Homogeneous Components

2. Single Server at c20

3. Three Load Generators

4. Two Servers, c6 and c20

Analysis and Conclusions
• Load Balancing system adapts and optimises to a changing environment
  • Positive Feedback
• “Adaptation Agility” as important as Convergence in Decentralised Optimisation
• Other Application Areas for CRL
  • Network Routing Protocol for MANETs
    • SAMPLE [WONS ’05]
  • Traffic Management
  • Peer-to-Peer Systems
