
Linguist Module in Sphinx-4 By Sonthi Dusitpirom


Presentation Transcript


  1. Linguist Module in Sphinx-4, by Sonthi Dusitpirom

  2. Objective • How to change the dictionary in Sphinx-4

  3. Sphinx-4 • Sphinx-4 is an open-source speech recognition framework, written in the Java programming language to support research on speech recognition systems. It has three main components • The FrontEnd • The Decoder • The Linguist

  4. Sphinx-4

  5. Sphinx-4 • In this project we focus on the Linguist component, which has three subcomponents • The Acoustic Model • The acoustic model describes how the individual sound units, known as phonemes, are pronounced. • The Dictionary • The dictionary gives the pronunciation of every word that the system can recognize. • The Language Model • The language model describes what the grammar looks like.

  6. Acoustic Model • The acoustic model in Sphinx-4 consists of a set of left-to-right Hidden Markov Models for basic sound units. The units represent phones in a triphone context. • The acoustic model in Sphinx-4 is packaged in a JAR file. The advantage is that the JAR file can be added to the classpath and referenced in the configuration file, which makes the model available to a Sphinx-4 application.

  7. Acoustic Model • Sphinx-4 has two important models that serve different purposes • TIDIGITS_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for numbers. If you need to recognize numbers, use this model. • WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar is designed for text. If you want to recognize text, use this model.

  8. Dictionary • The dictionary provides pronunciations for the words found in the language model. Each pronunciation splits a word into a sequence of phonemes that are found in the acoustic model.
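
To make the word-to-phonemes mapping concrete, here is a minimal sketch of how one dictionary line could be read. It assumes the common CMU-style layout (a word, whitespace, then space-separated phonemes, with alternate pronunciations marked as `WORD(2)`). The class `DictionaryEntry` is a hypothetical helper for illustration and is not part of Sphinx-4.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative parser for one line of a CMU-style pronunciation dictionary (not a Sphinx-4 class). */
final class DictionaryEntry {
    final String word;          // the spelling, e.g. "HELLO"
    final List<String> phones;  // the phoneme sequence, e.g. [HH, AH, L, OW]

    DictionaryEntry(String word, List<String> phones) {
        this.word = word;
        this.phones = phones;
    }

    /** Splits "WORD PH1 PH2 ..." on whitespace; the first token is the word. */
    static DictionaryEntry parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            throw new IllegalArgumentException("Expected a word and at least one phoneme: " + line);
        }
        // Assumption: alternate pronunciations are marked "WORD(2)"; drop the marker.
        String word = tokens[0].replaceAll("\\(\\d+\\)$", "");
        return new DictionaryEntry(word, Arrays.asList(tokens).subList(1, tokens.length));
    }

    public static void main(String[] args) {
        DictionaryEntry e = parse("HELLO  HH AH L OW");
        System.out.println(e.word + " -> " + e.phones); // prints "HELLO -> [HH, AH, L, OW]"
    }
}
```

Every phoneme in an entry has to be one that the acoustic model knows, which is why the dictionary and the acoustic model are usually distributed together.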

  9. Language Model • There are two types of model that describe language • Grammar-based language model • Grammars describe very simple types of languages for command and control; they are usually written by hand or generated automatically with plain code. • Statistical language model • A statistical language model estimates the probability distribution of word sequences in natural language. The most widely used statistical language model is the N-gram model.
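
As a rough illustration of what an N-gram model estimates, the sketch below computes bigram (N = 2) probabilities by plain counting. It is a simplified maximum-likelihood estimate with no smoothing or back-off, which real language-model toolkits add. `BigramModel` is a hypothetical class written for this example and does not reflect how Sphinx-4 loads its own models.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy bigram model: P(next | previous) estimated by counting (illustration only). */
final class BigramModel {
    private final Map<String, Integer> historyCounts = new HashMap<>();
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    /** Counts each adjacent word pair, and each word that is followed by another word. */
    void addSentence(String sentence) {
        String[] words = sentence.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            historyCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    /** Returns count(previous next) / count(previous), or 0 if previous was never seen as a history. */
    double probability(String previous, String next) {
        int history = historyCounts.getOrDefault(previous, 0);
        if (history == 0) {
            return 0.0;
        }
        return bigramCounts.getOrDefault(previous + " " + next, 0) / (double) history;
    }

    public static void main(String[] args) {
        BigramModel model = new BigramModel();
        model.addSentence("open the door");
        model.addSentence("close the door");
        model.addSentence("open the window");
        // "the" is followed by "door" in 2 of its 3 occurrences.
        System.out.println(model.probability("the", "door"));
    }
}
```

A grammar, by contrast, would simply list the allowed commands (for example "open the door" and "close the door") and reject everything else.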

  10. Create a new dictionary • Sphinx-4 already ships with a dictionary. These are the steps to change it • Extract WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar in the lib directory. • Go to the dict folder and open the cmudict.0.6d file. • Insert the new words and their phonemes into cmudict.0.6d and save the file. • Zip the extracted folder into a zip file. • Remove WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar from the libraries in the build path and add the zip file in its place.
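
The editing step above (inserting a word and its phonemes into the dictionary file) can also be scripted instead of done by hand. The following is a minimal sketch under a few assumptions: the dictionary is a plain UTF-8 text file with one `WORD  PH1 PH2 ...` entry per line, entry order does not matter, and the word is written exactly as passed in. `DictionaryEditor` and its methods are hypothetical names for this example; extracting and re-zipping the JAR is left to the usual tools.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Appends pronunciation entries to a CMU-style dictionary file (illustration only). */
final class DictionaryEditor {

    /** Formats one entry as "WORD  PH1 PH2 ...". */
    static String formatEntry(String word, List<String> phones) {
        return word + "  " + String.join(" ", phones);
    }

    /**
     * Appends the entry unless a line for the same word already exists.
     * Returns true if the file was changed.
     */
    static boolean addEntry(Path dictionaryFile, String word, List<String> phones) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(dictionaryFile, StandardCharsets.UTF_8));
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens[0].equals(word)) {
                return false; // the word is already in the dictionary
            }
        }
        lines.add(formatEntry(word, phones));
        Files.write(dictionaryFile, lines, StandardCharsets.UTF_8);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the extracted dict/cmudict.0.6d.
        Path dict = Files.createTempFile("cmudict", ".txt");
        Files.write(dict, Arrays.asList("HELLO  HH AH L OW"), StandardCharsets.UTF_8);
        addEntry(dict, "SPHINX", Arrays.asList("S", "F", "IH", "NG", "K", "S"));
        Files.readAllLines(dict, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Match the spelling case and phoneme symbols of the existing entries in the file, since the phonemes must be ones the acoustic model defines.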

  11. XML Configuration File • The configuration of a particular Sphinx-4 system is determined by a configuration file. This configuration file defines the following • The names and types of all of the components of the system. • The connectivity of these components, that is, which components talk to each other. • The detailed configuration of each of these components.

  12. XML Configuration File • The configuration file serves two purposes • Determining which components are to be used in the system. • Determining the detailed configuration of each of these components.
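
To show how names, types, and connectivity appear in such a file, the sketch below reads a small configuration with the standard Java XML parser and lists each component's type and the other components it refers to. It assumes that a property whose value equals another component's name is a reference to that component. `ConfigInspector` is a hypothetical helper for illustration; Sphinx-4 has its own configuration loader, which this does not replace.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Lists components and their references in a Sphinx-4 style XML configuration (illustration only). */
final class ConfigInspector {

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Returns component name -> component type, in document order. */
    static Map<String, String> componentTypes(String xml) throws Exception {
        NodeList components = parse(xml).getElementsByTagName("component");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < components.getLength(); i++) {
            Element component = (Element) components.item(i);
            result.put(component.getAttribute("name"), component.getAttribute("type"));
        }
        return result;
    }

    /** Returns the property values of one component that name another component. */
    static List<String> references(String xml, String componentName) throws Exception {
        Map<String, String> types = componentTypes(xml);
        NodeList components = parse(xml).getElementsByTagName("component");
        List<String> refs = new ArrayList<>();
        for (int i = 0; i < components.getLength(); i++) {
            Element component = (Element) components.item(i);
            if (!componentName.equals(component.getAttribute("name"))) {
                continue;
            }
            NodeList properties = component.getElementsByTagName("property");
            for (int j = 0; j < properties.getLength(); j++) {
                String value = ((Element) properties.item(j)).getAttribute("value");
                if (types.containsKey(value)) {
                    refs.add(value); // this property wires in another component
                }
            }
        }
        return refs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<config>"
                + "<component name='dictionary' type='edu.cmu.sphinx.linguist.dictionary.FastDictionary'>"
                + "<property name='allowMissingWords' value='false'/>"
                + "<property name='unitManager' value='unitManager'/>"
                + "</component>"
                + "<component name='unitManager' type='edu.cmu.sphinx.linguist.acoustic.UnitManager'/>"
                + "</config>";
        System.out.println(componentTypes(xml));
        System.out.println(references(xml, "dictionary"));
    }
}
```

In the sample, `allowMissingWords` is detailed configuration of the dictionary, while the `unitManager` property connects the dictionary to another component.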

  13. Use a Model in Sphinx-4 • There are three steps to use a new model in Sphinx-4 • Defining a language model. • Defining a dictionary. • Defining an acoustic model.

  14. Define a Language Model

          <component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
              <property name="grammarLocation" value="the path to the grammar folder"/>
              <property name="dictionary" value="dictionary"/>
              <property name="grammarName" value="the name of grammar"/>
              <property name="logMath" value="logMath"/>
          </component>

  15. Define a Language Model

          <component name="trigramModel" type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
              <property name="unigramWeight" value="0.7"/>
              <property name="maxDepth" value="3"/>
              <property name="logMath" value="logMath"/>
              <property name="dictionary" value="dictionary"/>
              <property name="location" value="the name of the language model file"/>
          </component>

  16. Define a Dictionary

          <component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
              <property name="dictionaryPath" value="the name of the dictionary file"/>
              <property name="fillerPath" value="the name of the filler file"/>
              <property name="addSilEndingPronunciation" value="false"/>
              <property name="allowMissingWords" value="false"/>
              <property name="unitManager" value="unitManager"/>
          </component>

  17. Define an Acoustic Model

          <component name="sphinx3Loader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
              <property name="logMath" value="logMath"/>
              <property name="unitManager" value="unitManager"/>
              <property name="location" value="the path to the model folder"/>
          </component>

          <component name="acousticModel" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
              <property name="loader" value="sphinx3Loader"/>
              <property name="unitManager" value="unitManager"/>
          </component>

  18. Any questions?
