
Relevance Language Modeling For Speech Recognition


Presentation Transcript


  1. Relevance Language Modeling For Speech Recognition Kuan-Yu Chen and Berlin Chen National Taiwan Normal University, Taipei, Taiwan ICASSP 2011 2014/1/17 Reporter: 陳思澄

  2. Outline • Introduction • Basic Relevance Model (RM) • Topic-based Relevance Model • Modeling Pairwise Word Association • Experiments • Conclusion

  3. Introduction • In the relevance modeling approach to IR, each query Q is assumed to be associated with an unknown relevance class R_Q, and documents that are relevant to the information need expressed in the query are samples drawn from R_Q. • When RM is applied to language modeling in speech recognition, we can conceptually regard the search history H as a query and each of its immediately succeeding words w as a document, and estimate a relevance model for modeling the relationship between H and w. (Diagram: query, relevance class, and relevant documents.)

  4. Basic Relevance Model • The task of language modeling in speech recognition can be interpreted as calculating the conditional probability P(w|H). • H is a search history, usually expressed as a word sequence w_1, w_2, …, w_{L-1}, and w is one of its possible immediately succeeding words. • Because the relevance class of each search history is not known in advance, a local feedback-like procedure can be used to obtain a set of relevant documents D_H to estimate the joint probability P(H, w).
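To make the feedback step concrete, here is a minimal Python sketch of the local feedback-like retrieval, assuming the contemporaneous corpus is a list of tokenized documents and that a background unigram distribution bg_unigram is available; the query-likelihood scoring with Dirichlet smoothing is an assumption of the sketch, since the slide does not specify the retrieval model. The sketches that follow share these conventions.

import math
from collections import Counter

def retrieve_relevant_docs(history, docs, bg_unigram, top_n=10, mu=2000.0):
    """Score each candidate document against the search history (treated
    as a query) and keep the top-N as the approximate relevance class D_H.
    Query-likelihood scoring with Dirichlet smoothing is one common choice;
    the retrieval model is an assumption, not specified by the slide."""
    scores = []
    for doc in docs:                        # each doc is a list of word tokens
        counts, dlen = Counter(doc), len(doc)
        score = 0.0
        for q in history:
            p_bg = bg_unigram.get(q, 1e-9)  # background unigram probability
            score += math.log((counts[q] + mu * p_bg) / (dlen + mu))
        scores.append(score)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_n]]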

  5. Basic Relevance Model • P(H, w) = Σ_{D ∈ D_H} P(D) P(H, w|D), where P(D) is the probability that we would randomly select D and P(H, w|D) is the joint probability of simultaneously observing H and w in D. • The joint probability of observing H together with w in D is: P(H, w|D) = P(w|D) Π_{i=1}^{L-1} P(w_i|D). • Bag-of-words assumption: the words are assumed conditionally independent given D, and their order is of no importance.
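A minimal sketch of the bag-of-words joint probability just defined, assuming a uniform document prior P(D) and epsilon-floored relative-frequency estimates for P(·|D), both simplifications not fixed by the slide:

from collections import Counter

def joint_prob(history, w, rel_docs, eps=1e-6):
    """Bag-of-words relevance model:
        P(H, w) = sum over D in D_H of P(D) * P(w|D) * prod_i P(w_i|D).
    Uniform P(D) and epsilon-floored relative frequencies are
    simplifying assumptions of this sketch."""
    p_d = 1.0 / len(rel_docs)                # uniform P(D)
    total = 0.0
    for doc in rel_docs:
        counts, dlen = Counter(doc), len(doc)
        p = max(counts[w] / dlen, eps)       # P(w|D)
        for wi in history:
            p *= max(counts[wi] / dlen, eps) # P(w_i|D)
        total += p_d * p
    return total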

  6. Basic Relevance Model • The conditional probability is obtained by normalizing the joint probability: P_RM(w|H) = P(H, w) / Σ_{w'} P(H, w'). • The background n-gram language model trained on a large general corpus can provide the generic constraint information of lexical regularities.
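The normalization and its combination with the background model might look as follows, reusing joint_prob from the sketch above; the linear interpolation with an n-gram callable p_trigram and the weight lam are assumptions here, one standard way to bring in the background constraints rather than the paper's confirmed recipe:

def rm_conditional(history, w, vocab, rel_docs):
    """P_RM(w|H) = P(H, w) / sum over w' of P(H, w').  Normalising over
    the full vocabulary is costly; restricting w' to words that occur in
    the feedback documents is a common shortcut (an assumption here)."""
    denom = sum(joint_prob(history, v, rel_docs) for v in vocab)
    return joint_prob(history, w, rel_docs) / denom

def adapted_prob(history, w, vocab, rel_docs, p_trigram, lam=0.5):
    """Linear interpolation with the background n-gram model; the weight
    lam is a tuning assumption, not a value given in the paper."""
    return lam * rm_conditional(history, w, vocab, rel_docs) \
        + (1.0 - lam) * p_trigram(history, w)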

  7. Topic-based Relevance Model • TRM goes a step further by incorporating latent topic information into RM modeling. • The relevant documents of each search history are assumed to share the same set of latent topic variables describing the “word-document” co-occurrence characteristics.

  8. Topic-based Relevance Model • TRM can be represented by: P(H, w|D) = Σ_{k=1}^{K} P(T_k|D) P(w|T_k) Π_{i=1}^{L-1} P(w_i|T_k). • (The words of a document are all assumed to come from the same latent topic.)
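A sketch of the TRM computation under that shared-topic assumption; topic_mix(doc) and p_w_given_topic(word, k) stand in for a pre-trained topic model (e.g., PLSA) and are hypothetical interfaces:

def trm_joint(history, w, rel_docs, topic_mix, p_w_given_topic):
    """Topic-based relevance model: each feedback document contributes
    through a shared latent topic, so
        P(H, w|D) = sum_k P(T_k|D) * P(w|T_k) * prod_i P(w_i|T_k).
    topic_mix(doc) -> list of P(T_k|D); p_w_given_topic(word, k) -> P(word|T_k).
    Both are hypothetical interfaces to an assumed pre-trained topic model."""
    p_d = 1.0 / len(rel_docs)                # uniform P(D), as before
    total = 0.0
    for doc in rel_docs:
        per_doc = 0.0
        for k, p_tk in enumerate(topic_mix(doc)):
            p = p_w_given_topic(w, k)
            for wi in history:
                p *= p_w_given_topic(wi, k)
            per_doc += p_tk * p
        total += p_d * per_doc
    return total                             # joint P(H, w); normalise as for RM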

  9. Modeling Pairwise Word Association • Instead of using RM to model the association between an entire search history H and a newly decoded word w, we can also use RM to render the pairwise word association between a word w_i in the history and the newly decoded word w.

  10. Modeling Pairwise Word Association • A “composite” conditional probability for the search history H to predict w can be obtained by linearly combining the pairwise probabilities of all words in the history: P_PRM(w|H) = Σ_{i=1}^{L-1} α_i P(w|w_i), • where the values of the nonnegative weighting coefficients α_i are empirically set to be exponentially decayed.
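A sketch of the composite probability; p_pairwise(w_i, w) stands for the RM-style estimate of P(w|w_i), and the decay base is an assumed tuning value, since the slide only states that the weights decay exponentially:

def prm_conditional(history, w, p_pairwise, decay=0.8):
    """Composite PRM probability:
        P_PRM(w|H) = sum_i alpha_i * P(w|w_i),
    with nonnegative weights alpha_i decaying exponentially the farther
    w_i is from the predicted position, renormalised to sum to one.
    p_pairwise(w_i, w) -> P(w|w_i); the decay base 0.8 is an assumption."""
    L = len(history)
    raw = [decay ** (L - 1 - i) for i in range(L)]   # newest word weighted most
    z = sum(raw)
    return sum((a / z) * p_pairwise(wi, w) for a, wi in zip(raw, history))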

  11. By the same token, a set of latent topics can be used to describe the word-word co-occurrence relationships in a relevant document D, and the pairwise word association between a history word w_i and the decoded word w is thus modeled by: P(w_i, w|D) = Σ_{k=1}^{K} P(T_k|D) P(w|T_k) P(w_i|T_k).
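A matching sketch of the topic-based pairwise association, with the same hypothetical topic-model interfaces as before; its normalized output plugs into prm_conditional above to give TPRM:

def tprm_pairwise(wi, w, rel_docs, topic_mix, p_w_given_topic):
    """Topic-based pairwise association:
        P(w_i, w|D) = sum_k P(T_k|D) * P(w|T_k) * P(w_i|T_k),
    accumulated over the feedback documents with a uniform prior P(D).
    The topic-model interfaces are the same assumptions as in trm_joint."""
    p_d = 1.0 / len(rel_docs)
    total = 0.0
    for doc in rel_docs:
        total += p_d * sum(
            p_tk * p_w_given_topic(w, k) * p_w_given_topic(wi, k)
            for k, p_tk in enumerate(topic_mix(doc)))
    return total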

  12. Experimental Setup • Speech corpus: 196 hours (MATBN). • Vocabulary size: 72 thousand words. • A trigram language model was estimated from a background text corpus consisting of 170 million Chinese characters. • The baseline rescoring procedure with the background trigram language model results in a character error rate (CER) of 20.08% on the test set. Experiments: • 1. We assess the effectiveness of RM and PRM with respect to different numbers of retrieved documents being used to approximate the relevance class. • 2. We measure the goodness of RM and PRM when a set of latent topics is additionally employed to describe the word-word co-occurrence relationships in a relevant document; the resulting models are TRM and TPRM. • 3. We compare the proposed methods with several well-practiced language model adaptation methods.
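For context, a hedged sketch of what the N-best rescoring step might look like; the slide mentions the baseline rescoring but not its exact form, so the hypothesis format, the log_lm interface, and the score weights are all assumptions:

def rescore_nbest(nbest, log_lm, am_weight=1.0, lm_weight=10.0):
    """Re-rank N-best hypotheses with the adapted language model.
    Each hypothesis is assumed to be a (word list, acoustic log-score)
    pair, and log_lm(history, word) is assumed to return a log-probability
    from the interpolated model; the weights are tuning assumptions."""
    def total_score(hyp):
        words, am_logscore = hyp
        lm_logscore = sum(log_lm(words[:i], words[i])
                          for i in range(1, len(words)))
        return am_weight * am_logscore + lm_weight * lm_logscore
    return max(nbest, key=total_score)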

  13. Experimental Results • This reveals that only a small subset of relevant documents retrieved from the contemporaneous corpus is sufficient for dynamic language model adaptation. • PRM shows its superiority over RM for almost all adaptation settings. (Table: results of RM and PRM, in CER (%).)

  14. Experimental Results • Simply assuming that the model parameters are uniformly distributed tends to perform slightly worse than assuming a Dirichlet prior, with both at their best settings. • (Table: results of TRM and TPRM, in CER (%).)

  15. Experimental Results • These results are at the same performance level as that obtained by TPRM. • On the other hand, TBLM achieves its best CER of 19.32%, for which the corresponding number of trigger pairs was determined using the development set. • Our proposed methods seem to be good surrogates for the existing language model adaptation methods in terms of CER reduction.

  16. Conclusion • We study a novel use of relevance information for dynamic language model adaptation in speech recognition. • Our methods not only inherit the merits of several existing techniques but also provide a flexible yet systematic way to render the lexical and topical relationships between a search history and an upcoming word. • Empirical results on large-vocabulary continuous speech recognition seem to demonstrate the utility of the presented models. • These methods can also be used to expand query models for spoken document retrieval (SDR) tasks.
