1 / 14

Acoustic models in Kaldi

Acoustic models in Kaldi . Acoustic models in Kaldi . Support for standard ML-trained models Linear transforms like LDA, HLDA, MLLT/STC Speaker adaptation with fMLLR, MLLR Support for tied-mixture systems initially discussed Support for SGMMs

didina
Télécharger la présentation

Acoustic models in Kaldi

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Acoustic models in Kaldi

  2. Acoustic models in Kaldi • Support for standard ML-trained models • Linear transforms like LDA, HLDA, MLLT/STC • Speaker adaptation with fMLLR, MLLR • Support for tied-mixture systems initially discussed • Support for SGMMs • Speaker adaptation with fMLLR (single transform) in addition to speaker subspaces • Modular code, can be easily extended

  3. Overview of AM Classes Acoustic Model (GMM) std::vector< GMM* > GMM “knows about”

  4. GMM • Gaussians represented using natural parameters. • For efficient likelihood evaluation

  5. GMM • Gaussians represented using natural parameters. • For efficient likelihood evaluation

  6. GMM • Gaussians represented using natural parameters. • For efficient likelihood evaluation

  7. GMM • Likelihood calculation done in 2 matrix-vector multiplications. • Optimized BLAS routines can be used.

  8. Overview of AM Classes “knows about” Features Decodable AM GMM Decoder Acoustic Model (GMM) GMM

  9. The Decodable Interface classDecodableInterface{ public: // Returns the log likelihood (negated in the decoder). virtualBaseFloatLogLikelihood(int32 frame, int32 index) = 0; // Frames are one-based. virtualboolIsLastFrame(int32 frame) = 0; /// Indices are one-based (compatibility with OpenFst). virtualint32NumIndices() = 0; virtual ~DecodableInterface() {} };

  10. Overview of AM Classes “knows about” Features “modifies” Decodable AM GMM Decoder Acoustic Model (GMM) std::vector<GMM Accumulators> GMM Accumulators GMM

  11. AM Classes with fMLLR Regression tree Features fMLLR Decodable AM GMM with fMLLR Decoder fMLLR Estimator Acoustic Model (GMM) GMM

  12. AM Classes with MLLR Regression tree Features MLLR Decodable AM GMM with fMLLR Decoder fMLLR Estimator Acoustic Model (GMM) GMM

  13. Overview of SGMM Classes “knows about” Features “modifies” Decodable SGMM Decoder Acoustic Model (GMM) SGMM Updater SGMM Accumulators Full-GMM Diag-GMM

  14. Things to do next • fMLLR basis for SGMMs • The “Symmetric” extension to SGMMs • Discriminative training code planned for this summer • Need lattice generation • Thoughts on multiple feature transforms

More Related