
A Survey of ICASSP 2013 Language Model

Presentation Transcript


  1. A Survey of ICASSP 2013 Language Model Department of Computer Science & Information Engineering, National Taiwan Normal University Presenter: 郝柏翰

  2. Converting Neural Network Language Models into Back-off Language Models for Efficient Decoding in Automatic Speech Recognition Ebru Arısoy et al., IBM T.J. Watson Research Center, NY

  3. Introduction • In this work, we propose an approximate method for converting a feedforward NNLM into a back-off n-gram language model that can be used directly in existing LVCSR decoders. • We convert NNLMs of increasing order to pruned back-off language models, using lower-order models to constrain the n-grams allowed in higher-order models.

  4. Method
• A back-off n-gram language model takes the form
$$P(w_i \mid w_{i-n+1}^{i-1}) = \begin{cases} \alpha(w_i \mid w_{i-n+1}^{i-1}) & \text{if the n-gram } w_{i-n+1}^{i} \text{ is explicitly stored} \\ \beta(w_{i-n+1}^{i-1})\, P(w_i \mid w_{i-n+2}^{i-1}) & \text{otherwise.} \end{cases}$$
• In this paper, we propose an approximate method for converting a feedforward NNLM into a back-off language model that can be directly used in existing state-of-the-art decoders. The NNLM is interpolated with a background n-gram model,
$$P(w_i \mid w_{i-n+1}^{i-1}) = \lambda\, P_{NNLM}(w_i \mid w_{i-n+1}^{i-1}) + (1-\lambda)\, P_{BG}(w_i \mid w_{i-n+1}^{i-1}),$$
where $P_{NNLM}$ and $P_{BG}$ represent the NNLM and background language model probabilities.

  5. Method • To represent NNLM probabilities exactly over the output vocabulary requires on the order of $|V|^n$ parameters in general, where $V$ is the complete vocabulary. • While we can represent the overall NNLM as a back-off model exactly, it is prohibitively large as noted above. The technique of pruning can be used to reduce the set of n-grams for which we explicitly store probabilities.
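As a concrete illustration of the conversion idea on slides 3-5, the following Python sketch (not the authors' implementation; nnlm_prob(), allowed_histories, and the pruning threshold are assumed for illustration) enumerates candidate n-grams permitted by the already-converted lower-order model, scores them with the NNLM, and explicitly stores only the probabilities that survive pruning; all other n-grams back off to the lower-order model.

    import math

    def convert_nnlm_to_backoff(nnlm_prob, allowed_histories, vocab, threshold=1e-4):
        # Approximate a feedforward NNLM as a pruned back-off n-gram model.
        # nnlm_prob(word, history) -> P_NNLM(word | history)   (assumed interface)
        # allowed_histories: histories permitted by the lower-order converted model,
        # used to constrain which higher-order n-grams are considered.
        stored = {}  # {(history, word): log10 probability} for explicitly kept n-grams
        for hist in allowed_histories:
            for w in vocab:
                p = nnlm_prob(w, hist)
                # Prune: keep only n-grams whose NNLM probability is worth storing;
                # everything else will back off to the lower-order model.
                if p >= threshold:
                    stored[(hist, w)] = math.log10(p)
        return stored

Back-off weights for each history would then be renormalized so that the stored probabilities and the backed-off mass sum to one, as in any pruned ARPA-style model.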

  6. Method

  7. Experiments • [Figure: comparison of the model before and after conversion; the "after" distribution is smoother.]

  8. Use of Latent Words Language Models in ASR: a Sampling-Based Implementation Ryo Masumura et al., NTT Media Intelligence Laboratories, Japan

  9. Introduction • This paper applies the latent words language model (LWLM) to automatic speech recognition (ASR). LWLMs are trained taking related words into account, i.e., groupings of words that are similar in meaning and syntactic role. • In addition, this paper describes an approximation method of the LWLM for ASR, in which words are randomly sampled from the LWLM and a standard word n-gram language model is then trained on the sampled text.

  10. Method • Hierarchical Pitman-Yor Language Model • If we directly use the LWLM in one-pass decoding, we have to calculate the probability distribution over the current word given its context, which requires marginalizing over the latent words (see the sketch below). • Latent Words Language Model • LWLMs are generative models with a latent variable for every observed word in a text.
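As a rough illustration of this cost (the notation here is ours, not the paper's), with a latent word $h_i$ behind every observed word $w_i$, the word probability needed by the decoder involves marginalizing over the unobserved latent words, e.g.
$$P(w_i \mid w_1^{i-1}) = \sum_{h_{i-n+1}^{i}} P(w_i \mid h_i)\, P(h_i \mid h_{i-n+1}^{i-1})\, P(h_{i-n+1}^{i-1} \mid w_1^{i-1}),$$
which is intractable to evaluate exactly at every step of one-pass decoding; this motivates the sampling-based approximation.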

  11. Method • The latent variable, called the latent word $h_i$, is generated from its context, and the observed word $w_i$ is generated from the latent word $h_i$.
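The sampling-based approximation from slide 9 can then be sketched as follows (a minimal illustration; sample_latent() and sample_word() are assumed interfaces standing in for the trained LWLM transition and emission distributions):

    def sample_corpus(sample_latent, sample_word, n_sentences, max_len=30):
        # Generate pseudo-text from an LWLM-style generative model.
        # sample_latent(latent_history) -> next latent word h_i   (assumed)
        # sample_word(h_i)              -> observed word w_i      (assumed)
        corpus = []
        for _ in range(n_sentences):
            latents, words = ["<s>"], []
            for _ in range(max_len):
                h = sample_latent(tuple(latents[-2:]))  # e.g. a trigram latent history
                if h == "</s>":
                    break
                latents.append(h)
                words.append(sample_word(h))  # emit the observed word from the latent word
            corpus.append(" ".join(words))
        return corpus

    # A standard word n-gram LM (e.g. with Kneser-Ney smoothing) is then trained
    # on the generated corpus with any existing toolkit and used in the decoder.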

  12. Experiments • This result shows that we can construct an LWLM comparable to the HPYLM if we generate sufficient text data. Moreover, the highest performance was achieved with LWLM+HPYLM. This shows that the LWLM possesses properties different from those of the HPYLM, and that further improvement is achieved when the two are combined.

  13. Incorporating Semantic Information to Selection of WEB Texts for Language Model of Spoken Dialogue System Koichiro Yoshino et al., Kyoto University, Japan

  14. Introduction • A novel text selection approach for training a language model (LM) with Web texts is proposed for automatic speech recognition (ASR) in spoken dialogue systems. • Compared to the conventional approach based on the perplexity criterion, the proposed approach introduces a semantic-level relevance measure with respect to the back-end knowledge base used in the dialogue system. • We focus on the predicate-argument (P-A) structures characteristic of the domain in order to filter semantically relevant sentences for the domain.

  15. Method
• Selection Based on Perplexity: for a sentence $W$ consisting of $N$ words, its perplexity under a seed LM trained with the document set $D$ is defined by
$$PP(W) = P(W)^{-\frac{1}{N}}.$$
• Selection Based on Semantic Relevance Measure: the relevance is estimated from smoothed occurrence statistics in $D$, where $C(\cdot)$ stands for an occurrence count, $P(D)$ is a normalization factor determined by the size of $D$, and $\gamma$ is a smoothing factor estimated with a Dirichlet prior.
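A minimal sketch of the perplexity-based baseline (seed_lm_logprob() and the threshold are assumed for illustration, not the paper's code): every candidate Web sentence is scored by the seed LM and only low-perplexity sentences are kept.

    def select_by_perplexity(sentences, seed_lm_logprob, threshold):
        # seed_lm_logprob(sentence) -> total log10 probability under a seed LM
        # trained on the in-domain document set D (assumed interface).
        selected = []
        for sent in sentences:
            n_words = len(sent.split())
            if n_words == 0:
                continue
            ppl = 10.0 ** (-seed_lm_logprob(sent) / n_words)  # per-word perplexity
            if ppl < threshold:
                selected.append(sent)
        return selected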

  16. Method • For a P-A pair consisting of a predicate $p$ and an argument $a$, we define the pair score $R(p, a)$ as the geometric mean of $R(p)$ and $R(a)$, i.e., $R(p, a) = \sqrt{R(p)\, R(a)}$. • For each sentence $W$, we compute the mean of $R(p, a)$ over the P-A pairs included in the sentence, defined as $R(W)$.
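A minimal sketch of this combination step (relevance_score() and the P-A extraction are assumed interfaces, not the paper's implementation):

    import math

    def pa_pair_score(relevance_score, predicate, argument):
        # Geometric mean of the relevance of the predicate and of the argument.
        return math.sqrt(relevance_score(predicate) * relevance_score(argument))

    def sentence_score(relevance_score, pa_pairs):
        # Mean of the pair scores over all P-A pairs found in the sentence;
        # sentences with higher scores are considered more relevant to the domain.
        if not pa_pairs:
            return 0.0
        scores = [pa_pair_score(relevance_score, p, a) for p, a in pa_pairs]
        return sum(scores) / len(scores)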

  17. Experiments
