
Towards Automated Acoustic Model Training



  1. Towards Automated Acoustic Model Training
     Eric Thayer, Soliloquy Learning

  2. Goals
     • Define a concise declarative expression of the acoustic modeling task
     • Provide expressions for common task idioms:
       • Iteration until convergence
       • Enumeration over a set of parameter values
       • Task partitioning for distributed processing
     • Provide a task scheduling abstraction that can be realized with local processes, batch job submission, AM@HOME(?)

  3. Process Overview
     • Acoustic model building as a DAG of dependent tasks (see the sketch below)
     • Hierarchical/overridable parameter bindings
     • Similar to a complex software system build, except:
       • Iteration until convergence is unusual
       • Data-dependent work distribution among N processors
     • Using build systems like make, ant, or scons would require 'shoehorning'
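     To make the "DAG of dependent tasks" idea concrete, the dependency structure can be pictured as nothing more than a mapping from step IDs to the step IDs they depend on. Only the depends='002 004' relationship below is taken from the Build Steps slide; the meanings given to steps 002 and 004 are purely illustrative.

        # Hypothetical in-memory view of part of a modeling task as a DAG:
        # each stepID maps to the list of stepIDs it must wait for.
        build_dag = {
            "002": [],              # e.g. prepare audio list (illustrative)
            "004": [],              # e.g. normalize transcripts (illustrative)
            "010": ["002", "004"],  # MakeDict runs only after 002 and 004
        }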

  4. Build Steps

     <step stepID='010' stepName='MakeDict' depends='002 004'>
       <locale>UK</locale>
     </step>

     • Required parameters are fetched from the parameters defined in the step or the surrounding context (see the lookup sketch below)
     • Steps have a name that is used to select the Python module that implements the step.
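     A minimal sketch of how that parameter lookup and module selection might work. The function names and the steps.* package layout are assumptions made for illustration, not the framework's actual API.

        import importlib

        def resolve_param(name, step_params, context_stack):
            # Look in the step's own bindings first, then walk outward
            # through the enclosing contexts (innermost context checked first).
            if name in step_params:
                return step_params[name]
            for context in reversed(context_stack):
                if name in context:
                    return context[name]
            raise KeyError("parameter %r is not bound in the step or any context" % name)

        def load_step_module(step_name):
            # stepName selects the implementing Python module,
            # e.g. stepName='MakeDict' -> module steps.MakeDict (layout assumed).
            return importlib.import_module("steps." + step_name)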

  5. Special "Steps"
     • iterate
       • Simple iterator that executes the contained steps
       • Step execution order determined by dependency lists
     • enumerate
       • Simple enumerator that executes the contained steps for a sequence of specified values
     • partition
       • Splits the work for a substep evenly among N workers
         • Equal audio duration
         • Equal cepstral frames
         • Equal state counts (senone decision tree building)
     • Implemented by the Executor
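     One common way to get an "equal audio duration" split is greedy longest-first bin packing. The sketch below only illustrates that idea, assuming the input is a list of (utterance id, duration) pairs; it is not the framework's partition implementation.

        import heapq

        def partition_by_duration(utterances, n_workers):
            # Greedy split of (utt_id, duration_sec) pairs into n_workers bins
            # so each worker gets roughly the same total audio duration.
            bins = [(0.0, i, []) for i in range(n_workers)]   # (total, worker, utts)
            heapq.heapify(bins)
            for utt_id, dur in sorted(utterances, key=lambda u: -u[1]):  # longest first
                total, idx, utts = heapq.heappop(bins)        # currently emptiest worker
                utts.append(utt_id)
                heapq.heappush(bins, (total + dur, idx, utts))
            return [utts for _, _, utts in sorted(bins, key=lambda b: b[1])]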

  6. Executor
     • The Executor schedules the modeling steps on available processors
     • Does a topological sort of the top-level steps to determine the order of step execution (see the sketch below)
     • iterator/enumerator substeps
       • Contained steps are executed according to their dependencies
       • Dependencies are internal to the iterator/enumerator element (i.e. no references to steps outside the iterator/enumerator)
     • The Executor allows substeps to execute as long as all the steps they depend on have successfully terminated and there are available processors.
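     A sketch of the ordering part only, assuming the dependency dict shown earlier; the real Executor also has to interleave this with processor availability and iterator/enumerator scoping.

        from collections import deque

        def topological_order(steps):
            # steps: dict mapping stepID -> list of stepIDs it depends on.
            # Returns an execution order; fails if the dependencies form a cycle.
            remaining = {s: set(deps) for s, deps in steps.items()}
            ready = deque(s for s, deps in remaining.items() if not deps)
            order = []
            while ready:
                step = ready.popleft()
                order.append(step)
                for other, deps in remaining.items():
                    if step in deps:
                        deps.remove(step)
                        if not deps:
                            ready.append(other)
            if len(order) != len(steps):
                raise ValueError("dependency cycle among steps")
            return order

        # topological_order({"002": [], "004": [], "010": ["002", "004"]})
        # -> ["002", "004", "010"]  (any order with 010 last is valid)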

  7. Python Step Implementation
     • hasPre() : boolean
       • Tells the Executor that the pre() method should be called before executing the step.
     • pre(log) : boolean
       • The Executor will call this method if hasPre() == True
     • hasCmd() : boolean
       • Tells the Executor that the step is implemented using a command to the OS
     • cmdString() : string
       • The command the Executor should submit to the OS/batch queue
     • hasPost() : boolean
       • Tells the Executor that it must call the post() method after the command completes successfully.
     • post(log) : boolean
       • The Executor will call this method if hasPost() == True
     • Executor step execution flow
       • Persist parameter definitions to a shared repository (NFS filesystem, DB, etc.)
       • Schedule the execution of the Python step implementation on available processors
       • Load parameter definitions and execute the step
       • Report the step-ended event and status to the Executor
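     A skeleton of a step class following this interface; the class name, parameter names, and command line are illustrative only, not taken from the framework.

        class MakeDictStep:
            # Illustrative step implementation: runs an external command.

            def __init__(self, params):
                self.params = params      # bindings resolved from step/context

            def hasPre(self):
                return True               # ask the Executor to call pre() first

            def pre(self, log):
                log.write("checking inputs for locale %s\n" % self.params["locale"])
                return True               # returning False would fail the step

            def hasCmd(self):
                return True               # step is realized as an OS command

            def cmdString(self):
                # submitted to the OS or the batch queue (command name assumed)
                return "make_dict --locale %s --out %s" % (
                    self.params["locale"], self.params["outputDir"])

            def hasPost(self):
                return True

            def post(self, log):
                log.write("verifying dictionary output\n")
                return True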

  8. Parameter Contexts

     <context>
       <taskName>HUK</taskName>
       <outputDir>/home/train/AcousticModeling</outputDir>
       <convergenceThreshold>0.0001</convergenceThreshold>
       <maxIterations>10</maxIterations>
       <logDirTemplate>$outputDir/$taskName/log/${stepID}.${stepName}</logDirTemplate>
       <context>
         <modelContext>CI</modelContext>
         <step stepID='005' stepName='InitModels' />
         <iterate stepID='010' stepName='RefineCiModels' iterVar='iter' depends='005'>
           <partition stepID='020' by='EqualCepstralFrames' partVar='part'>
             <step stepName='BaumWelch' />
           </partition>
           <step stepID='030' depends='020' stepName='Norm' />
         </iterate>
       </context>
     </context>
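     The logDirTemplate above suggests $-style substitution against the bindings visible in the enclosing contexts. A small sketch of that expansion using Python's string.Template follows; the override-by-inner-scope semantics is an assumption, not something the slides state.

        from string import Template

        def expand(template, *scopes):
            # Substitute $name / ${name} using the innermost binding found;
            # scopes are passed outermost first, so later ones override.
            merged = {}
            for scope in scopes:
                merged.update(scope)
            return Template(template).safe_substitute(merged)

        outer = {"outputDir": "/home/train/AcousticModeling", "taskName": "HUK"}
        step  = {"stepID": "010", "stepName": "RefineCiModels"}
        print(expand("$outputDir/$taskName/log/${stepID}.${stepName}", outer, step))
        # /home/train/AcousticModeling/HUK/log/010.RefineCiModels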

  9. Executor Commands
     • Assumes you are in the top-level directory of a modeling task
     • Executor.py [--config etc/<someconfig>] [--run <runSpec>]
     • A <runSpec> can be:
       • "all"
         • Execute all the steps in the modeling task config
       • "050"
         • Execute just step 050 in the task config
       • "all:050"
         • Execute all the steps in the modeling task config starting at step 050
       • "050:3/070:3000/090"
         • Execute step 050 (an iterator) with the iterator variable set to 3
         • Execute the enumeration contained in the iterator with the enumeration variable set to 3000
         • Execute step 090 contained in the enumeration
       • "all:050:3/070:3000/090"
         • Same as the one above, but continues to execute the steps after 090 until the modeling task steps are all complete.
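     The <runSpec> grammar is only given by example above, so the following parser is a guess at those semantics (an optional leading "all", then /-separated step specs, each optionally carrying a loop value after a colon); it is not the actual Executor code.

        def parse_run_spec(spec):
            # Returns (run_all, [(stepID, loop_value or None), ...]).
            parts = spec.split("/")
            run_all = parts[0] == "all" or parts[0].startswith("all:")
            if run_all:
                rest = parts[0][len("all:"):] if parts[0].startswith("all:") else ""
                parts = ([rest] if rest else []) + parts[1:]
            steps = []
            for part in parts:
                if ":" in part:
                    step_id, value = part.split(":", 1)
                    steps.append((step_id, value))
                else:
                    steps.append((part, None))
            return run_all, steps

        # parse_run_spec("all:050:3/070:3000/090")
        # -> (True, [("050", "3"), ("070", "3000"), ("090", None)])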

  10. Future Plans
     • Implement steps for FCHMM training
     • Implement job scheduling on a distributed batch job facility (PBS, Condor, Sun Grid Engine)
     • Deliver the framework to the CMU Sphinx open source project
