
Presentation Transcript


  1. University of Sheffield. M4 speech recognition. Vincent Wan, Martin Karafiát.

  2. The Recogniser
  [Block diagram] Front end feeding two decoding passes, each with MLLR adaptation (HTK) and its own recognition output:
  • First pass: word internal triphone models, trigram language model (SRILM), time synchronous decoding (HTK), n-best lattice generation
  • Second pass: cross word triphone models, lattice rescoring, best first decoding (Ducoder)
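The block diagram condenses the whole pipeline, so a small illustration of the second pass may help: re-ranking an n-best list with a stronger acoustic model. Every name, score, and weight in the sketch below is invented for illustration; this is not HTK or Ducoder code.

```python
# Minimal sketch of n-best rescoring, the idea behind the second pass.
# All names, scores, and weights are invented; not HTK or Ducoder code.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    words: tuple       # recognised word sequence
    am_score: float    # first-pass acoustic log-likelihood
    lm_score: float    # trigram LM log-probability

def rescore_nbest(nbest, crossword_am, lm_weight=12.0, ins_penalty=-0.5):
    """Re-rank hypotheses using a second-pass acoustic scorer.
    crossword_am maps a word sequence to a new acoustic score."""
    def total(hyp):
        return (crossword_am(hyp.words)            # replace first-pass AM score
                + lm_weight * hyp.lm_score         # scaled LM score
                + ins_penalty * len(hyp.words))    # word insertion penalty
    return max(nbest, key=total)

# Toy 2-best list and a stand-in scorer; the first hypothesis wins on
# its better language model score.
nbest = [Hypothesis(("we", "will", "meet"), -120.0, -3.1),
         Hypothesis(("we", "will", "eat"), -118.0, -4.0)]
print(rescore_nbest(nbest, crossword_am=lambda w: -110.0 - len(w)).words)
```

In the real system the second-pass scores come from the cross word triphone models, and the language model weight and insertion penalty are among the hyper-parameters that the next slide says must be tuned manually.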

  3. System limitations
  • N-best list rescoring is not optimal: the correct hypothesis may already have been pruned from the first pass's n-best list
  • Adaptation must be performed on two sets of acoustic models
  • Many more hyper-parameters to tune manually
  • SRILM is not efficient on very large language models (greater than 10^9 words)
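On the last limitation, a sense of why model size tracks training data: a backoff trigram model of the kind SRILM writes (ARPA format) keeps one entry per n-gram seen in training, so a corpus beyond 10^9 words means holding roughly hundreds of millions of entries in memory. The lookup below is a toy sketch with invented probabilities, not SRILM code.

```python
# Toy backoff trigram lookup in the style of an ARPA-format model (the
# format SRILM writes). All probabilities here are invented.
logp3 = {("we", "will", "meet"): -0.4}                 # log10 P(w3 | w1 w2)
logp2 = {("will", "meet"): -0.9}                       # log10 P(w3 | w2)
bow2  = {("we", "will"): -0.3}                         # bigram backoff weights
logp1 = {"meet": -2.1, "we": -1.8, "will": -1.5}       # log10 P(w3)
bow1  = {"will": -0.2, "we": -0.4}                     # unigram backoff weights

def trigram_logprob(w1, w2, w3):
    """Katz-style backoff: use the trigram if it was seen in training,
    else back off to the bigram, else to the unigram."""
    if (w1, w2, w3) in logp3:
        return logp3[(w1, w2, w3)]
    if (w2, w3) in logp2:
        return bow2.get((w1, w2), 0.0) + logp2[(w2, w3)]
    return bow1.get(w2, 0.0) + logp1.get(w3, -99.0)    # -99 ≈ unseen word

print(trigram_logprob("we", "will", "meet"))    # seen trigram: -0.4
print(trigram_logprob("you", "will", "meet"))   # backs off to bigram: -0.9
```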

  4. Advances since last meeting
  • Models trained on two databases
    • SWITCHBOARD recogniser: acoustic & language models trained on 200 hours of speech
    • ICSI meetings recogniser: acoustic models trained on 40 hours of speech; language model is a combination of SWB and ICSI
  • Improvements mainly affect the Switchboard models
  • 16kHz sampling rate used throughout
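The slide says the meeting language model "is a combination of SWB and ICSI" without saying how. Linear interpolation is the usual way to combine two n-gram models, so the sketch below assumes it; the mixing weight is hypothetical and would normally be tuned on held-out meeting data.

```python
import math

# Hypothetical sketch of combining two n-gram models by linear
# interpolation: P = lam * P_swb + (1 - lam) * P_icsi, computed in log
# space. The weight lam is a placeholder; it would normally be chosen
# to minimise perplexity on held-out meeting transcripts.
def interpolated_logprob(logp_swb, logp_icsi, lam=0.5):
    return math.log(lam * math.exp(logp_swb)
                    + (1.0 - lam) * math.exp(logp_icsi))

# Example: the same word is 10x likelier under the ICSI meetings model.
print(interpolated_logprob(math.log(0.001), math.log(0.01)))
```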

  5. Advances since last meeting
  • Adaptation of word internal context dependent models
  • Unified the phone sets and pronunciation dictionaries
  • Improved the pronunciation dictionary for Switchboard: now using the ICSI dictionary with missing pronunciations imported from the ISIP dictionary
  • Better handling of multiple pronunciations during acoustic model training
  • General bug fixes
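As a concrete picture of the dictionary improvement, the sketch below merges two pronunciation dictionaries the way the slide describes: ICSI entries are kept, and ISIP entries fill in only the missing words. The word-then-phones line format and the toy entries are assumptions, and as the slide notes the phone sets must be unified before a merge like this makes sense.

```python
import io

def load_dict(f):
    """Parse a pronunciation dictionary, one pronunciation per line:
    the word followed by its phones (an assumed, HTK-like layout)."""
    prons = {}
    for line in f:
        word, *phones = line.split()
        prons.setdefault(word, []).append(phones)
    return prons

def merge(icsi, isip):
    """Keep every ICSI entry; import ISIP entries only for missing words."""
    merged = dict(icsi)
    for word, prons in isip.items():
        if word not in merged:
            merged[word] = prons
    return merged

# Toy data standing in for the real dictionaries.
icsi = load_dict(io.StringIO("meeting m iy t ih ng\n"))
isip = load_dict(io.StringIO("meeting m iy dx ih ng\nagenda ax jh eh n d ax\n"))
print(sorted(merge(icsi, isip)))   # ['agenda', 'meeting']; ICSI wins on overlap
```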

  6. Results overview
  [Table of % word error rates, not preserved in this transcript. * Results from lapel mics. † Results from beam former.]

  7. Results: adaptation vs. direct training on ICSI
  [Table of % word error rates, not preserved in this transcript. * Results from Ducoder using all pruning.]

  8. Acoustic model adaptation issue
  • Acoustic models presently do not adapt well
  • Better MLLR code required (next slide)
  • More training data required
  • Need to make better use of the combined ICSI/SWB training data for M4
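For context on the MLLR code mentioned here: MLLR adapts a speaker-independent model by estimating an affine transform of the Gaussian means from a small amount of adaptation data. The sketch below is a textbook simplification (one global mean transform, identity covariances, made-up statistics), not HTK's estimator, which additionally weights each Gaussian's statistics by its covariance and supports regression-class trees.

```python
import numpy as np

# Simplified global MLLR mean transform under an identity-covariance
# assumption. Each Gaussian m contributes its occupation count gamma[m],
# its old mean mu[m], and the mean xbar[m] of the adaptation frames
# aligned to it. We fit mu_new = W @ [1; mu] by weighted least squares.
def estimate_mllr_mean_transform(gamma, mu, xbar):
    n, d = mu.shape
    xi = np.hstack([np.ones((n, 1)), mu])    # extended means [1, mu]
    G = (xi * gamma[:, None]).T @ xi         # (d+1, d+1) normal matrix
    K = (xi * gamma[:, None]).T @ xbar       # (d+1, d) cross term
    return np.linalg.solve(G, K).T           # (d, d+1) transform W

def adapt_means(W, mu):
    """Apply the estimated transform to every Gaussian mean."""
    xi = np.hstack([np.ones((mu.shape[0], 1)), mu])
    return xi @ W.T

# Toy check with random statistics: the estimator recovers a known W.
rng = np.random.default_rng(0)
mu = rng.normal(size=(100, 3))
true_W = np.hstack([rng.normal(size=(3, 1)), np.eye(3) * 1.1])
xbar = np.hstack([np.ones((100, 1)), mu]) @ true_W.T
gamma = rng.uniform(1, 10, size=100)
W = estimate_mllr_mean_transform(gamma, mu, xbar)
print(np.allclose(W, true_W))                # True
```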

  9. Other news
  • The next version of HTK's adaptation code will be made available to M4 before the official public release
  • Sheffield to acquire the HTK LVCSR decoder
    • Licensing issues to be resolved
    • May be able to make binaries available to M4 partners
