Accelerating Distributed Machine Learning by Smart Parameter Server

Accelerating Distributed Machine Learning by Smart Parameter Server Jinkun Geng, Dan Li and Shuai Wang

Background • Distributed machine learning becomes the common practice, because of: • 1. The explosive growth of data size

Background • Distributed machine learning becomes the common practice, because of: • 2. The increasing complexity of training model ImageNet Competition: <10 (Hinton, 2012), 22 (Google, 2014), 152 (Microsoft, 2015), 1207 (SenseTime, 2016)

Background • Parameter Server (PS)-based architecture is widely supported by mainstream DML systems.

Background • However, the power of PS architecture has not been fully exploited. • 1. Communication redundancy • 2. Straggler problem

Background • A deeper insight… • 1. Worker-centric design is less efficient • 2. PS can be more intelligent (i.e. SmartPS)

Background • To make PS more intelligent… • Dependency-Aware • Straggler-Assistant

A Simple Model of Parameters

Workflow of PS-based DML
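The transcript keeps only the title of this slide. As a rough illustration of the generic pull / compute / push loop that PS-based systems follow, here is a minimal single-process sketch. It assumes synchronous aggregation by gradient averaging on a toy one-parameter least-squares model. The names `ParameterServer`, `worker_gradient` and `train` are hypothetical and do not come from the talk.

```python
from typing import List, Sequence, Tuple


class ParameterServer:
    """Hypothetical PS: holds the global parameters and aggregates gradients."""

    def __init__(self, dim: int, lr: float) -> None:
        self.params = [0.0] * dim
        self.lr = lr

    def pull(self) -> List[float]:
        # A worker receives a copy of the current global parameters.
        return list(self.params)

    def push(self, grads: Sequence[Sequence[float]]) -> None:
        # Average the gradients from all workers, then take one SGD step.
        n = len(grads)
        for i in range(len(self.params)):
            avg = sum(g[i] for g in grads) / n
            self.params[i] -= self.lr * avg


def worker_gradient(params: Sequence[float],
                    shard: Sequence[Tuple[float, float]]) -> List[float]:
    # Gradient of the mean squared error of y = w * x on this worker's shard.
    w = params[0]
    return [sum(2 * (w * x - y) * x for x, y in shard) / len(shard)]


def train(shards, rounds: int = 200, lr: float = 0.05) -> List[float]:
    ps = ParameterServer(dim=1, lr=lr)
    for _ in range(rounds):
        # Every worker pulls, computes on its own data, and pushes a gradient.
        grads = [worker_gradient(ps.pull(), shard) for shard in shards]
        ps.push(grads)
    return ps.params
```

Calling `train([[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (0.5, 1.5)]])` runs two simulated workers whose data follow y = 3x, so the single parameter should converge towards 3.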

Design Strategies • To make PS more intelligent… • 1. Selective update() • 2. Proactive push() • 3. Prioritized transmission() • 4. Unnecessary push blockage()

Strategy 1: Selective Update

Strategy 2: Proactive Push

Strategy 3: Straggler-Assistant

Strategy 4: Blocking Unnecessary Pushes

Evaluation • Experiment Setting: • 17 nodes with different performance configurations: 1 PS + 16 Workers • 2 Benchmarks: • Matrix Factorization and PageRank • 5 Baselines: • BSP, ASP, SSP (slack=1), SSP (slack=2), SSP (slack=3)
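The five baselines differ only in how far a fast worker may run ahead of the slowest one. The sketch below shows the commonly used formulation of that rule: BSP allows no lead, SSP allows a lead of at most `slack` iterations, and ASP imposes no bound. The function name `may_proceed` and the clock convention (a clock counts completed iterations) are assumptions made for illustration; the exact convention in the evaluated systems may differ.

```python
from typing import Optional, Sequence


def may_proceed(worker_clock: int,
                all_clocks: Sequence[int],
                slack: Optional[int]) -> bool:
    """Decide whether a worker may start its next iteration.

    worker_clock: iterations this worker has completed.
    all_clocks:   completed-iteration counts of all workers.
    slack:        0 for BSP, a positive bound for SSP, None for ASP.
    """
    if slack is None:
        # ASP: workers never wait for each other.
        return True
    # BSP/SSP: the lead over the slowest worker must not exceed the slack.
    return worker_clock - min(all_clocks) <= slack
```

With clocks `[5, 3, 4]`, the worker at clock 5 is two iterations ahead of the slowest worker, so it must wait under BSP and SSP (slack=1) but may continue under SSP (slack=2), SSP (slack=3) and ASP.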

Evaluation MF Benchmark: With a common threshold, SmartPS reduces the training time by 68.1%~90.3% compared with the baselines.

Evaluation PR Benchmark: With a common threshold, SmartPS reduces the training time by 65.7%~84.9% compared with the baselines.
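To read the reported reductions as speedup factors, one can use speedup = 1 / (1 - reduction). This assumes that "reduces the training time by r" means (T_baseline - T_SmartPS) / T_baseline = r, which the slides do not state explicitly. The helper name `speedup_from_reduction` is made up for this example.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a training-time reduction in percent into a speedup factor."""
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Reported ranges: MF 68.1%~90.3%, PR 65.7%~84.9%.
for name, low, high in [("MF", 68.1, 90.3), ("PR", 65.7, 84.9)]:
    lo_x = speedup_from_reduction(low)
    hi_x = speedup_from_reduction(high)
    print(f"{name}: {lo_x:.1f}x to {hi_x:.1f}x")
# prints:
# MF: 3.1x to 10.3x
# PR: 2.9x to 6.6x
```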

Further Discussion • Comparison to some recent works: 1. Both leverage the knowledge of parameter dependency 2. Both leverage prioritized transmission for DML acceleration

Further Discussion • Comparison to some recent works:

Ongoing Work • A deeper insight into PS-based arch… • Function of PS: • 1. Parameter Distribution • 2. Parameter Aggregation • Function of Worker: • 1. Parameter Refinement -> Data Access Control -> Data Operation -> Data Operation

Ongoing Work Parameter Distribution Parameter Aggregation Parameter Refinement

Ongoing Work Data Access Control Data Operation Data Operation

Ongoing Work Data Access Control Token Token Token Data Operation

Next Generation of SmartPS • Parameter Server -> Token Server • 1. Decouple data (access) control and data operation • 2. A light-weight and smart Token Server instead of Parameter Server.

Thanks! NASP Research Group https://nasp.cs.tsinghua.edu.cn/ https://www.gengjinkun.com/
