TC Report for the 2013 June AdCom Meeting (June 20, 2013)

TC Report for the 2013 June AdCom Meeting (June 20, 2013). Adaptive Dynamic Programming and Reinforcement Learning Technical Committee (ADPRL TC). Chair: Huaguang Zhang, China. Vice-Chairs: Jagannathan Sarangapani, USA; Ana Maria Madureira, Portugal.

Outline • Introduction of ADPRL TC • Technical Activities of ADPRL TC • Review of 2013 ADPRL TC Meeting • ADPRL TC Plans in 2013 • TF Activity Reports

ADPRL TC Members

ADPRL TC New Members There are five new members in 2013: • Warren Dixon, University of Florida, USA • Hao Xu, Missouri University of Science and Technology, USA • Xiong Luo, University of Science and Technology Beijing, China • Travis Dierks, Missouri University of Science and Technology, USA • Evangelos Theodorou, University of Southern California, USA

ADPRL TC Main Conference ADPRL: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

ADPRL TC Task Forces TF 1: Applications of ADP and RL. Chair: Draguna Vrabie. Vice-Chair(s): Zhong-Ping Jiang. Members: Warren Powell, Sean Meyn, John Valasek, Derong Liu, Jay H. Lee, Frank Lewis, Jagannathan Sarangapani, G K Venayagamoorthy, Warren Dixon. TF 2: Reinforcement Learning and Function Approximation. Chair: Robert Babuska. Vice-Chair(s): Lucian Busoniu. Members: Robert Babuska, Damien Ernst, Lucian Busoniu, Philippe Preux. New Vice-Chair of TF 2: Lucian Busoniu, University of Lorraine, France.

ADPRL TC Task Forces TF 3: Robot Reinforcement Learning. Chair: Evangelos Theodorou. Vice-Chair(s): Stefan Schaal. Members: Leslie P. Kaelbling, Robert Babuska, Jens Kober, Jun Morimoto, Martin Riedmiller, Nick Roy, Jennie Si, Russ Tedrake, Emo Todorov, Nikos Vlassis. TF 4: Evolutionary Algorithms for ADPRL. Chair: Hisashi Handa. Vice-Chair(s): Kazuhiro Ohkura. Members: Yoshiaki Katada, Matteo Gagliolo, Kazuaki Yamada, Kazuhiro Ohkura, Hisashi Handa. New Chair of TF 3: Evangelos Theodorou, University of Southern California, USA.

ADPRL TC Task Forces TF 5: ADPRL in Real-time Feedback Control Systems. Chair: Xin Xu. Vice-Chair(s): Haibo He. Members: Wen Yu, Yanhong Luo, Dongbin Zhao, Lucian Busoniu, Haibo He, Xin Xu. TF 6: ADP in Game Theory and Multi-Agent Optimization. Chair: Kyriakos G. Vamvoudakis. Vice-Chair(s): Travis Dierks. Members: Luis Rodolfo Garcia Carrillo, Marcio Fantini Miranda, Qinglai Wei. New Task Forces: TF 6 is a new task force in 2013.

Activities at SSCI 2013 Symposia: • “Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)” (Chairs: Marco Wiering, Huaguang Zhang, Jagannathan Sarangapani) • “Computational Intelligence Applications in Smart Grid (CIASG)” (Chairs: Ganesh Kumar Venayagamoorthy, Haibo He) Keynotes: 1. “General-purpose RLADP: Solving the scaling problem” for ADPRL (Speaker: Paul Werbos) 2. “Intelligent adaptive optimal control: algorithms and stability” (Speaker: Huaguang Zhang)

Activities at SSCI 2013 Special Sessions: 1. “Evolutionary Algorithms for ADPRL” at ADPRL 2013 (Organizers: Hisashi Handa and Kazuhiro Ohkura) 2. “Online Planning” at ADPRL 2013 (Organizers: Lucian Busoniu and Rémi Munos) 3. “ADP and RL in real-time feedback systems” at ADPRL 2013 (Organizers: Xin Xu and Haibo He) 4. “Finite-Approximate-Error Based Adaptive Dynamic Programming: Algorithms and Applications” at ADPRL 2013 (Organizers: Yanhong Luo, Qinglai Wei, and Zengguang Hou)

Planned Activities in 2014 Symposium: • “2014 Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2014)” Special Sessions: • “Solving Games, with ADP” at WCCI 2014 (Organizers: Kyriakos G. Vamvoudakis and Travis Dierks) • “ADP algorithm for the control of multidimensional systems” at WCCI 2014 (Organizers: Huaguang Zhang and Yanhong Luo) • “Adaptive Dynamic Programming and Its Applications in Time-Delayed Systems” at ADPRL 2014 (Organizers: Qinglai Wei, Ding Wang, and Dongbin Zhao) Tutorial: • “Extreme Learning Machine in Neural Computing and Applications” at WCCI 2014 (Organizer: Guang-Bin Huang)

Activities at CIS-Related Journals (1) Editorial Service • Derong Liu: Editor-in-Chief, IEEE Transactions on Neural Networks and Learning Systems. • G K Venayagamoorthy: Associate Editor, IEEE Transactions on Smart Grid. • Marco Wiering: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems. • Huaguang Zhang: Associate Editor, IEEE Transactions on Fuzzy Systems. • Draguna Vrabie: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems.

Activities at CIS-Related Journals (2) • Huaguang Zhang: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems. • Haibo He: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems. • W. B. Powell: Associate Editor, Operations Research. • Haibo He: Associate Editor, IEEE Transactions on Smart Grid. • Xin Xu: Editor-in-Chief, Journal of Intelligent Learning Systems and Applications.

Other Major Activities for CIS (1) Special Issues for CIS-Related Journals: 1. Special issue on Optimization Models and Algorithms for the Smart Grid, IEEE Transactions on Smart Grid, 2013. 2. Special issue on “Data-based control, optimization, modeling and applications”, Neural Computing and Applications, 2013 (Organizers: Dongbin Zhao, Yi Shen, Zhanshan Wang, and Xiaolin Hu). 3. Special issue on Learning Issues in Feedback Control of Uncertain Dynamical Systems, International Journal of Adaptive Control and Signal Processing, 2013 (Organizers: Xin Xu, Frank Lewis). 4. Special issue on Computational Intelligence in Smart Grid, IEEE Transactions on Smart Grid, 2013.

Other Major Activities for CIS (2) Major Activities for Other CIS-Sponsored Conferences/Symposia: 1. Derong Liu, 6th Int. Conf. on Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013, General Chair. 2. Huaguang Zhang, 4th Int. Conf. on Intelligent Control and Information Processing (ICICIP 2013), Beijing, China, June 9-11, 2013, General Chair. 3. Derong Liu, IEEE World Congress on Computational Intelligence, July 6-11, 2014, Beijing, General Chair. 4. Haibo He, 2014 IEEE Symposium Series on Computational Intelligence, Dec 9-12, 2014, Orlando, General Chair.

Other Major Activities (1) Book Publications: 1. Huaguang Zhang, Derong Liu, Yanhong Luo, Ding Wang, Adaptive Dynamic Programming for Control: Algorithms and Stability. Springer-Verlag, 2013. 2. F. L. Lewis and D. Liu (eds.), Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. New York: Wiley, 2012. 3. Ana Madureira, Cecilia Reis, Viriato Marques (eds.), Computational Intelligence and Decision Making: Trends and Applications. Springer-Verlag, 2012. 4. W. B. Powell, I. O. Ryzhov, Optimal Learning. New York: John Wiley and Sons, 2012.

Other Major Activities (2) 5. D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles, IET Press, 2012. 6. M. A. Wiering and M. van Otterlo (eds.), Reinforcement Learning: State-of-the-Art, Springer, 2012. 7. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge, “Neural Networks in Feedback Control Systems,” in Mechanical Engineers’ Handbook, Instrumentation, Systems, Controls, and MEMS, ed. Myer Kutz, John Wiley, NY, 2012.

Other Major Activities (3) 8. K. G. Vamvoudakis and F. L. Lewis, “Online Adaptive Learning Solution of Multi-Agent Differential Graphical Games,” in Frontiers in Advanced Control Systems, ed. Ginalber Luiz Serra, Chapter 2, INTECH, 2012. 9. Yanhong Luo, Huaguang Zhang, Adaptive Optimal Control for Complex Nonlinear Systems, Science Press, Beijing, June 2013 (in Chinese). 10. Zhanshan Wang, Stability Analysis of Recurrent Neural Network and Its Applications, Science Press, Beijing, 2013 (in Chinese).

Other Major Activities (4) Workshops: 1. NSF workshop on May 31/June 1, 2012: “A conversation between Artificial Intelligence and operations research on stochastic optimization”, which addressed modeling and algorithmic issues in approximate dynamic programming (Warren Powell). 2. Workshop at IEEE Conference on Decision and Control, Dec 2012: “Optimization Based Control”, which included presentations related to ADP and applications (Draguna Vrabie). 3. Workshop at 24th Chinese Control and Decision Conference, May 2012: “Industry Process Control and Optimization”, which included presentations related to adaptive dynamic programming theory and applications (Huaguang Zhang).

Other Major Activities (5) 4. Organized an entire track on “computational stochastic optimization”, including talks specifically on approximate dynamic programming, both for the annual INFORMS meeting and for the workshop sponsored by the INFORMS Computing Society (Warren Powell). 5. “The Knowledge Gradient for Optimal Learning,” Workshop on Exploration vs. Exploitation, ICML 2012, Edinburgh, Scotland (Warren Powell).

Other Major Activities (6) Major Activities for Other Conferences • Frank Lewis: Keynote Lecture on “Optimal Design for Cooperative Control Synchronization and Games on Communication Graphs” at Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013. • Jagannathan Sarangapani: Invited Lecture “Optimal Adaptive Control of Uncertain Nonlinear Dynamic Systems” at the 25th Chinese Control and Decision Conference, Guiyang, China, May 25-27, 2013. • Dongbin Zhao: Organizer of the Special Session “Data-based control and optimization for nonlinear systems” at the 32nd Chinese Control Conference (CCC 2013), Xi’an, China, July 26-28, 2013.

Other Major Activities (7) • Huaguang Zhang: PC member of the 20th International Conference on Neural Information Processing (ICONIP 2013), Daegu, Korea, November 3-7, 2013. • Haibo He: Invited talk at the 19th International Conference on Neural Information Processing (ICONIP 2012), Doha, Qatar, November 14, 2012. • Warren Powell: Advanced Tutorial “Unifying the Jungle of Stochastic Optimization,” Conference on Principles and Practice of Constraint Programming, Quebec City, Oct 12, 2012. • Guang-Bin Huang: Symposium Chair, International Symposium on Extreme Learning Machine (ELM 2012), Singapore, Dec 11-13, 2012.

Other Major Activities (8) Society and Conference Service • Huaguang Zhang: Chair of IEEE CIS Shenyang chapter. • Ana Maria Madureira: Elected vice-chair of IEEE Portuguese section. • Ana Maria Madureira: Elected vice-chair of IEEE CIS Portuguese chapter. • Huaguang Zhang: AdCom Member of Chinese Association for Artificial Intelligence. • Warren Powell: Member of American Association for Artificial Intelligence. • Warren Powell: Member of Math Programming Society and American Mathematical Society.

Discussions in 2013 TC Meeting We discussed the following issues at the TC Meeting: 1. How to increase the number of submissions to the ADPRL Symposium 2014/WCCI 2014 and motivate authors to submit their papers before the original deadline. 2. How to motivate highly qualified specialists to help review the conference papers. 3. How to avoid possible plagiarism. 4. How to encourage the TC members to propose new Task Forces and keep the webpage of each Task Force up to date. 5. How to cross the boundaries between different communities, such as the ADP, reinforcement learning, and stochastic optimal control communities. 6. How to extend the applications of ADPRL algorithms to more complex industrial processes.

ADPRL TC Chair’s Plan in 2013 1. Activate the task force “Robot Reinforcement Learning” in 2013. 2. Increase the number of members from Oceania and Africa in 2013. 3. Encourage more keynotes/workshops/tutorials on ADPRL at related conferences, such as CDC/ACC/ECC/ICSP/IJCNN, to publicize this field. 4. Consider a special issue on ADPRL in CIS-sponsored journals, such as IEEE Computational Intelligence Magazine, IEEE TNNLS, etc. 5. Extend the applications of ADPRL algorithms to more complex industrial processes. 6. Organize some summer schools in Asia/Europe. 7. Encourage membership upgrades (e.g., Senior Member, Fellow) and award nominations.

Task Force Report Applications of ADP and RL. Chair: Draguna Vrabie. Vice-Chair(s): Zhong-Ping Jiang. Members: Warren Powell, Sean Meyn, John Valasek, Derong Liu, Jay H. Lee, Frank Lewis, Jagannathan Sarangapani, G K Venayagamoorthy, Warren Dixon. Activities in 2013: 1. Keynote Lecture on “Optimal Design for Cooperative Control Synchronization and Games on Communication Graphs” at Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013. 2. Invited Lecture “Optimal Adaptive Control of Uncertain Nonlinear Dynamic Systems” at the 25th Chinese Control and Decision Conference, Guiyang, China, May 25-27, 2013. 3. Workshop at IEEE CDC 2012: “Optimization Based Control”, which included presentations related to ADP and applications, December 10-13, 2012. Planned Activities in 2014: 1. Special issue “Reinforcement Learning and Adaptive Dynamic Programming” in Acta Automatica Sinica, 2014.

Task Force Report Reinforcement Learning and Function Approximation. Chair: Robert Babuska. Vice-Chair(s): Lucian Busoniu. Members: Robert Babuska, Damien Ernst, Lucian Busoniu, Philippe Preux. Activities in 2012/2013: 1. Book chapter: L. Busoniu, R. Munos, R. Babuska, “Optimistic planning in Markov decision processes,” in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. Lewis, D. Liu (eds.), Wiley, 2012. 2. Special Session at SSCI 2013: “Online Planning” (Organizers: L. Busoniu, R. Munos). Planned Activities in 2014: 1. Lucian Busoniu will co-organize ADPRL 2014 as a symposium co-chair.

Task Force Report Robot Reinforcement Learning. Chair: Evangelos Theodorou. Vice-Chair(s): Stefan Schaal. Members: Leslie P. Kaelbling, Robert Babuska, Jens Kober, Jun Morimoto, Martin Riedmiller, Nick Roy, Jennie Si, Russ Tedrake, Emo Todorov, Nikos Vlassis. Activities in 2012/2013: 1. Invited Lecture at WCCI 2012: “Uncovering the Neural Code of Learning Control”. 2. Panel at WCCI 2012: “Computational Intelligence in Education and University Curricula”. Planned Activities in 2014: Not yet determined.

Task Force Report Evolutionary Algorithms for ADPRL. Chair: Hisashi Handa. Vice-Chair(s): Kazuhiro Ohkura. Members: Yoshiaki Katada, Matteo Gagliolo, Kazuaki Yamada, Kazuhiro Ohkura, Hisashi Handa. Activities in 2012/2013: 1. Special Session at WCCI 2012: “Real World Applications of Reinforcement Learning”. 2. Special Session on “Evolutionary Algorithms for ADPRL” at ADPRL 2013. Planned Activities in 2014: Not yet determined.

Task Force Report ADPRL in Real-time Feedback Control Systems. Chair: Xin Xu. Vice-Chair(s): Haibo He. Members: Wen Yu, Yanhong Luo, Dongbin Zhao, Lucian Busoniu, Haibo He, Xin Xu. Activities in 2013: 1. Special issue “Optimization Models and Algorithms for the Smart Grid” in IEEE Transactions on Smart Grid, 2013. 2. Special issue on Learning Issues in Feedback Control of Uncertain Dynamical Systems, being published in International Journal of Adaptive Control and Signal Processing in 2013. 3. Special Session on “ADP and RL in real-time feedback systems” at ADPRL 2013. Planned Activities in 2014: 1. Tutorial on “Reinforcement learning for real-time feedback control systems” at WCCI 2014.

Task Force Report ADP in Game Theory and Multi-Agent Optimization. Chair: Kyriakos G. Vamvoudakis. Vice-Chair(s): Travis Dierks. Members: Luis Rodolfo Garcia Carrillo, Marcio Fantini Miranda, Qinglai Wei. Activities in 2013: 1. Special session on “Games, ADP and Network Security” for IEEE CDC 2013. 2. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge, “Neural Networks in Feedback Control Systems,” to appear in Mechanical Engineers’ Handbook, Instrumentation, Systems, Controls, and MEMS, ed. Myer Kutz, John Wiley, NY, 2013. Planned Activities in 2014: 1. Special session on “Solving Games, with ADP” for WCCI 2014.
