LF Deep Learning Foundation Technical Advisory Council Meeting

LF Deep Learning Foundation Technical Advisory Council Meeting, January 3, 2019

Recording of Calls • This is a reminder that we have decided to record TAC meetings and store them on the TAC Wiki.

Dial-in Information
• Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/430697670
• Or iPhone one-tap:
  • US: +16465588656,,430697670# or +16699006833,,430697670#
• Or Telephone:
  • Dial (for higher quality, dial a number based on your current location):
  • US: +1 646 558 8656 or +1 669 900 6833 or +1 855 880 1246 (Toll Free) or +1 877 369 0926 (Toll Free)
  • Meeting ID: 430 697 670
  • International numbers available: https://zoom.us/u/achYtcw7uN

Antitrust Policy Notice
• Linux Foundation meetings involve participation by industry competitors, and it is the intention of the Linux Foundation to conduct all of its activities in accordance with applicable antitrust and competition laws. It is therefore extremely important that attendees adhere to meeting agendas, and be aware of, and not participate in, any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws.
• Examples of types of actions that are prohibited at Linux Foundation meetings and in connection with Linux Foundation activities are described in the Linux Foundation Antitrust Policy available at http://www.linuxfoundation.org/antitrust-policy. If you have questions about these matters, please contact your company counsel, or if you are a member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.

Agenda
• Roll Call
• Approval of Minutes
• Project Updates
  • Acumos AI
  • Angel
• Mission and Goals Statement
• LF DL Interactive Landscape
• Next Meeting Information
• Open Discussion / Questions

TAC Member Directory *TAC Chairperson

Approval of Minutes
• Draft minutes from the Sept 13, Oct 11, Oct 25, Nov 8, and Dec 6 meetings of the TAC were previously distributed to the TAC members
• Proposed Resolution:
  • RESOLVED: That the minutes of the Sept 13, Oct 11, Oct 25, Nov 8, and Dec 6 meetings of the Technical Advisory Council of the LF Deep Learning Foundation are hereby approved

LF DL Overview
• LF Deep Learning Foundation is an umbrella project of The Linux Foundation with the mission of supporting AI, ML and DL open source projects
• LF DL currently has four projects and is accepting contributions of additional projects:
  • EDL: a framework designed to help DL cloud service providers build cluster cloud services using different DL frameworks. https://github.com/PaddlePaddle/edl
  • Horovod: a distributed training framework for TensorFlow, Keras, and PyTorch. https://github.com/uber/horovod
  • Acumos AI: a marketplace for ML models. https://github.com/acumos
  • Angel: a high-performance distributed ML platform based on Parameter Server, running on YARN and Apache Spark. https://github.com/Angel-ML/angel

TAC Overview
• The TAC is a committee within LF DL that is responsible for assisting in the coordination and communication of projects within LF DL. It is not a technical oversight body; all technical oversight is handled by the individual projects themselves according to their own governance
• Since the March 2018 launch of LF DL, activity of the TAC has included:
  • approving a project lifecycle for projects of LF DL (incubation, graduation, archive stages);
  • reviewing and then approving contribution proposals for the Angel, EDL and Horovod projects;
  • open sourcing a landscape for open source AI projects;
  • exploring additional AI projects including Kubeflow, Pachyderm, ONNX, PMML, and so on; and
  • taking proposals for projects on the LF DL GitHub (https://github.com/LFDLFoundation/proposing-projects)

TAC Wiki
• Visit the TAC Wiki for more information (login to lists.deeplearningfoundation.org required):
• https://lists.deeplearningfoundation.org/g/tac-general/wiki/home

Project Updates: Acumos AI Project, Angel Project

Acumos AI Project Update • Our Presenter: • John (Jack) Murray, jfm@research.att.com • https://wiki.acumos.org/display/AC

Acumos AI Project Update • First Community Release “Athena” Completed • https://wiki.acumos.org/ • https://docs.acumos.org/en/athena/ • https://marketplace.acumos.org/#/home

Acumos AI Project Overview
• Acumos released its first public release, Athena, on November 14th, 2018
• Maintenance release on Dec 12th, 2018 to address the initial defect list
• What has the project focused on and accomplished since launch or last update? Boreas, JIRA and Defects

Key Stats
• 8 Projects
• 59 Repositories
• 500K+ lines of code
• 3K+ Commits (>170 LPC AVG)
• 38 Committers
• 69 Contributors
• 78 Reviewers
• 42 Epics, 176 US, 417 Tasks
• 1350 Jira Items
• 100+ Internal API Functions
• 30+ External APIs
• 40%+ Code Coverage

Features
• Platform Hardening
• SSO, Personalized UI
• Client Libraries (TensorFlow, scikit-learn, H2O, R, Java)
• Onboarding, Micro Services Generation
• Enhanced Marketplace for AI/ML Models
• Design Studio DAG Models Solutions
• Model Connector, Runner, Probe
• Federation
• Cloud/Kubernetes Deployment
• Dataset Management (Seed Code)
• ML Workbench Tools (Seed Code)
• One Click Platform Deployment
• OA&M (Admin, Logging, Security)

Athena Release Contributors
• AT&T: TSC, Architecture, Security, Common Services, DS, Licensing, CMLP, CDS/Federation
• Tech Mahindra: Release Manager, AcumosR, Acumos.org; Portal, Deployment, DS, Testing
• Amdocs: Product/Community Committee PTL
• Nokia: Training, Data Pipeline planning; Cloud Native/Kubernetes
• Orange: Onboarding, External Libraries
• Red Hat: Kubernetes platform deployment

Boreas Release Contributors
• Amdocs: Product/Community Committee PTL
• AT&T: Licensing, Workbench
• Ericsson: Licensing, Training and Data Pipeline; 5G Use Cases
• Huawei: 5G Use Cases
• Nokia: Training, Data Pipeline planning; 5G Use Cases
• NXP: Whitebox deployment
• Orange: Onboarding, External Libraries
• Red Hat: Kubernetes model deployment and GPU support; Kubeflow
• Tech Mahindra: Release Manager; Portal, Deployment, DS, Testing

2019 Key Goals
• Acumos is targeting two releases for 2019
• Release Boreas – May 2019 (Top 5)
  • 1 Training - Model Training Pipeline, onboarding improvement (Kubeflow)
  • 2 Licensing
  • 3 Data Pipeline – Training
  • 4 Model Evaluation and Validation – Model Evaluation and Test Harness
  • 5 Model Security – Validation and Citation
• Release “C” – November 2019
  • 1 Data Management and Tooling
  • 2 Data Catalog
  • 3 Model and Data Exchange
  • 4 Federated Training
  • 5 Enhanced Catalogs
• What is the project looking for from the TAC and the community?
  • More participation from developers
  • Big issues: model security, licensing, data policy

Boreas Priorities
• 1 Training - Model Training Pipeline, onboarding improvement
• 2 Data Pipeline - Training
• 3 Model Evaluation and Validation – Model Evaluation and Test Harness
• 4 Licensing
• 5 Model Security - Security sub committee
• 6 Model Deployment - K8s, Partner (OpenShift, whitebox), GPU
• 7 Portal Architecture - UX/UI hardening, Public Federation checks, CMS removal
• 8 OA&M - Performance/Optimization, Standard Logging (platform/models)
• 9 Portal - Model Life Cycle Management - versioning notification, deletion/revoking, dashboard, composite public
• 10 Model Builder - Jupyter, RCloud, RapidMiner, AutoML, Zeppelin, Design Studio
• 11 Data Management - Data stores, Data broker, Data Movement, Data catalog
• 12 Design Studio/Model Interoperability - ONNX, PFA, etc.
• 13 On-Boarding Model Type support - Caffe, MXNet, Paddle, PyTorch, etc.
• 14 LF DL Projects Integration - Tencent Angel, Baidu EDL, IBM FIDL
• 15 Integration with 5G Use Cases
• 16 OA&M - K8s tools for monitoring, metrics, backups, archival/purging, platform security
• 17 Deploy - Serving Pipeline Training

Angel Project Update
• Our Presenter:
  • Fitz Wang (王才华), fitzwang@Tencent.com
• Acknowledgements:
  • Paynie Xiao, Lele Yu, Jeremy Jiang, Wenbin Wei
• https://github.com/Angel-ML/angel

Angel Update - Agenda • Project summary • What the project focused on and accomplished • Key goals of Angel in 2019 • Help from TAC • Project statistics

Project Summary - Angel
A Flexible and Powerful Parameter Server for large-scale machine learning
• PS: Angel provides a powerful parameter server
• ML lib: a variety of machine learning algorithms, especially for recommender systems (RS)
• SDK: Angel provides three levels of APIs for secondary development
  • Level I: PS + Math lib, for lower-level development that seeks higher performance
  • Level II: Level I + ML core, for platform development → XXX on Angel
  • Level III: Level II + out-of-the-box algorithms → LR/FM/SVM/DAW/NFM/DFM/GBDT/LDA…
[Stack diagram components: AutoML, Serving, Spark on Angel, Angel PS, Angel Native, XXX on Angel, ML Core, Tree Core, Computing Graph, Angel Math Library; annotated with Level I, Level II and Level III]
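To make the parameter server idea concrete, here is a minimal single-process sketch of the pull/push pattern that such a system is built around. It is an illustration only: `ToyParameterServer`, `worker_step`, the feature keys and the plain SGD update are all hypothetical names and choices made for this example, and none of it reflects Angel's actual API or implementation.

```python
from collections import defaultdict


class ToyParameterServer:
    """In-memory stand-in for a parameter server (hypothetical, not Angel's API)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        # Sparse storage: keys that were never written read as 0.0.
        self.weights = defaultdict(float)

    def pull(self, keys):
        """Return the current values for the requested parameter keys."""
        return {k: self.weights[k] for k in keys}

    def push(self, gradients):
        """Apply a worker's sparse gradient with a plain SGD step."""
        for k, g in gradients.items():
            self.weights[k] -= self.learning_rate * g


def worker_step(server, features, label):
    """One worker iteration: pull weights, compute a gradient, push it back.

    The model is linear regression with squared error on one sparse example.
    """
    w = server.pull(features.keys())
    prediction = sum(w[k] * v for k, v in features.items())
    error = prediction - label
    server.push({k: error * v for k, v in features.items()})
    return error


server = ToyParameterServer(learning_rate=0.1)
example = {"f1": 1.0, "f7": 2.0}  # only two non-zero features
errors = [worker_step(server, example, label=3.0) for _ in range(50)]
```

Only the keys a worker actually touches are pulled and pushed, which is the property that makes this pattern attractive for sparse, high-dimensional models. A real deployment would add multiple workers, partitioning of keys across server nodes, and a synchronization policy, all of which are omitted here.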

Ongoing focus areas and accomplishments
• Computing graph:
  • Lightweight computing graph with automatic gradient calculation
  • JSON configuration for algorithm
  • New Optimizers: Adam, FTRL, Momentum, AdaGrad
• Math library: optimize for sparse data calculation
  • Primitive types/Long key index/storage aware
[Stack diagram components: AutoML, Serving, Spark on Angel, Angel PS, Angel Native, XXX on Angel, ML Core, Tree Core, Computing Graph, Angel Math Library]
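As a reminder of what two of the optimizers named on this slide do, the sketch below writes out the textbook Momentum and Adam update rules for a single scalar parameter. The function names, default hyperparameters and the test function f(w) = (w - 2)^2 are assumptions chosen for illustration; this is not code from Angel.

```python
import math


def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    """Classical momentum: accumulate a velocity, then step against it."""
    velocity = beta * velocity + grad
    return w - lr * velocity, velocity


def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected running averages of the gradient and its square."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # t is the 1-based step count
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def grad_f(w):
    """Gradient of the illustrative objective f(w) = (w - 2)^2."""
    return 2.0 * (w - 2.0)


# Minimize f with momentum, starting from w = 0.
w_mom, vel = 0.0, 0.0
for _ in range(200):
    w_mom, vel = momentum_step(w_mom, grad_f(w_mom), vel)

# Minimize f with Adam, starting from w = 0.
w_adam, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    w_adam, m, v = adam_step(w_adam, grad_f(w_adam), m, v, t)
```

Both loops drive the parameter toward the minimizer at w = 2. In a computing graph with automatic gradient calculation, `grad_f` would be produced by the framework rather than written by hand.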

2019 Plans
• AutoML: automatic hyperparameter tuning based on BO
• Serving: a generic serving system for Angel (similar to TFS)
• Kubernetes support: enable Angel to run in the cloud
• Graph Learning Algorithms: LINE, GraphSAGE, etc.
• Flink on Angel: (ongoing discussion) support streaming on Angel
Note: Graph Learning is based on Level I APIs, so it is not in the stack
[Stack diagram components: AutoML, Serving, Spark on Angel, Angel PS, Angel Native, XXX on Angel, ML Core, Tree Core, Computing Graph, Angel Math Library, Kubernetes support]

Requesting support from TAC
• Seeking cooperation in AutoML and Graph Learning
• Achieving a generic model format and model compression

Project Statistics
• Contributors: 30 (XXX % + XXX %)
• Committers: 7 (100 % + 0 %)
• Stars: 3851
• Commits: 1670
• Forks: 985
• Branches: 18
• PRs: 257
• Issues: 248
• Releases: 10

• Level III API (for algorithm users): AutoML, Serving, Spark on Angel, Angel PS, Angel Native, XXX on Angel, ML Core, Tree Core, Computing Graph, Angel Math Library
• Level II API (for platform development): Angel PS, ML Core, Tree Core, Computing Graph, Angel Math Library
• Level I API (for lower-level development): Angel PS, Angel Math Library

LF DL Mission Statement & Strategic Goals

LF DL Mission
Build and support an open community of data scientists, researchers, and developers focused on AI, ML, and DL, and drive open source innovation in these domains by enabling collaboration and the creation of new opportunities for the community and their supporting organizations

A Neutral Environment to Accelerate Open Source AI/ML/DL Innovation
• Establish a neutral environment that fosters collaboration on AI/ML/DL open source projects
• Enable cross-pollination between AI/ML/DL open source projects in the drive for faster innovation cycles and a higher multiplier network effect
• Support building a sustainable AI/ML/DL open source ecosystem and provide needed support to projects under the LF DL (legal, marketing, events, business development, developer relations, etc.)

Harmonize open source AI/ML/DL projects
• Minimize and avoid fragmentation and redundancies among open source AI/ML/DL projects
• Support efforts that focus on ensuring interoperability between the various components of the open source AI/ML/DL stack
• Align vendors and providers of open source solutions on the larger AI/ML/DL ecosystem

Become the home of open source efforts for AI ethics and fairness
• Advocate for AI ethics and fairness principles
• Support open source AI/ML/DL projects in adopting and abiding by these principles
• Support the development of tooling and libraries that address ethical and fairness concerns related to AI practices
• Collaborate with academics and other participants engaged in research in this domain
• Support efforts in the area of ML explainability leading to the development of open source methods and tools that will help scientists understand data sets and models’ predictions, and uncover and correct for biases
• ML privacy (TBD)

Data Lifecycle Management - Best Practices, Tools and Marketplace
Open source data sets allow researchers and data scientists to improve models, increase transparency, and further validate trained models:
• Encourage enterprises to publish their data under an open source license (CDLA)
• Host open source data sets and make them accessible to the wider community of users
• Support the creation of open source tools and mechanisms required to support hosting data
Provide a marketplace for AI/ML/DL models:
• Allow the creation, onboarding, enhancement, publication and deployment of models
• Provide an economical approach to how the marketplace would operate and incentivize sharing and interchanging of models
• Enable a federated marketplace that allows control over which models are visible to whom

Increase awareness of key open source AI/ML/DL projects and their dependencies
• Provide a funding model to support projects critical in building AI/ML/DL open source stacks
• Grow developer and user community of these key open source projects and their critical dependencies
• Lower the barriers to participate in the development of AI/ML/DL open source projects
• Provide guidelines, development best practices, design principles, training courses and other educational content to enable developers without an AI/ML/DL background to participate and be effective contributors in open source projects in these specialty domains
• Create and maintain an industry landscape of AI/ML/DL open source projects and make it available to the public

LF DL Interactive Landscape

LF Deep Learning Foundation – Interactive Landscape • This landscape is intended as a map to explore open source artificial intelligence, machine learning, and deep learning projects; it also features the member companies of the LF Deep Learning Foundation. • https://l.lfdl.io • Please open a pull request to correct any issues.

Try it at: https://l.lfdl.io

Next Meetings
• Our next meeting is scheduled for January 17, 2019.
• Presentation from Uber on Pyro
  • Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend. Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling.
  • https://pyro.ai
  • https://github.com/uber/pyro

Open Discussion / Questions

Addendum

Legal Notices
The Linux Foundation, The Linux Foundation logos, and other marks that may be used herein are owned by The Linux Foundation or its affiliated entities, and are subject to The Linux Foundation’s Trademark Usage Policy at https://www.linuxfoundation.org/trademark-usage, as may be modified from time to time.
Linux is a registered trademark of Linus Torvalds. Please see the Linux Mark Institute’s trademark usage page at https://lmi.linuxfoundation.org for details regarding use of this trademark.
Some marks that may be used herein are owned by projects operating as separately incorporated entities managed by The Linux Foundation, and have their own trademarks, policies and usage guidelines.
TWITTER, TWEET, RETWEET and the Twitter logo are trademarks of Twitter, Inc. or its affiliates. Facebook and the “f” logo are trademarks of Facebook or its affiliates. LinkedIn, the LinkedIn logo, the IN logo and InMail are registered trademarks or trademarks of LinkedIn Corporation and its affiliates in the United States and/or other countries. YouTube and the YouTube icon are trademarks of YouTube or its affiliates.
All other trademarks are the property of their respective owners. Use of such marks herein does not represent affiliation with or authorization, sponsorship or approval by such owners unless otherwise expressly specified.
The Linux Foundation is subject to other policies, including without limitation its Privacy Policy at https://www.linuxfoundation.org/privacy and its Antitrust Policy at https://www.linuxfoundation.org/antitrust-policy, each as may be modified from time to time. More information about The Linux Foundation’s policies is available at https://www.linuxfoundation.org.
Please email legal@linuxfoundation.org with any questions about The Linux Foundation’s policies or the notices set forth on this slide.
1/3/19 The Linux Foundation Internal Use Only
