
Automatic Speech Recognition System


Presentation Transcript


  1. Automatic Speech Recognition System: An Experimental Study of the Effect of Parameter Variation on WER Performance. Sanjay Patil, Jun-Won Suh, Human and Systems Engineering

  2. Details of the experiment • Details of the system: • HMM Speech Recognition System • TIDigits Database • (41,300 utterances, 12,547 sentences), 11 words: the digits zero through nine, plus "oh" • Cross-word, loop grammar • Objective: • To study ASR performance as a function of frame duration, window duration, insertion penalty, and state-tying: • WER = f(frame, window, IP, state-tying) • Frame = 5 ms to 50 ms • Window = 5 ms to 50 ms • IP = -10 to -200 • State-tying = {split, merge, occupancy} thresholds => total number of tied states • (a sketch of the resulting parameter grid follows this slide)
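The slide defines the sweep only by its ranges, so here is a minimal sketch of the resulting parameter grid. The 5 ms and -10 step sizes and the window >= frame constraint are assumptions for illustration, not taken from the deck.

```python
# Enumerate the (frame, window, IP) grid from slide 2.
# Step sizes are assumed; the deck gives only the endpoints.
import itertools

frames_ms  = range(5, 55, 5)        # frame duration: 5 ms to 50 ms
windows_ms = range(5, 55, 5)        # analysis window: 5 ms to 50 ms
ips        = range(-10, -210, -10)  # insertion penalty: -10 to -200

# Keep only configurations where the analysis window covers at least
# one full frame advance (window >= frame), an assumed constraint.
grid = [(f, w, ip)
        for f, w, ip in itertools.product(frames_ms, windows_ms, ips)
        if w >= f]
print(len(grid), "(frame, window, IP) configurations to evaluate")
```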

  3. Test Results: Effect of Frame-Window Variation on WER

  4. Test Results: Effect of Frame-Window Variation on Time

  5. Training Schedule

  6. Command line to run the experiment • tidigit_decode -model_type xwrd_triphone -train_mode baum_welch -decode_mode loop_grammar • Options: • -model_type [what type of model to build]: xwrd_triphone builds context-dependent cross-word triphone models • -train_mode [the training algorithm to use]: baum_welch is the standard Baum-Welch forward-backward algorithm • -decode_mode [the type of decoding to perform]: loop_grammar decodes using a grammar in which any digit can follow any other digit with equal probability • (a sketch of invoking this command programmatically follows this slide)
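For repeated runs it can help to wrap the command above in a small driver. This is a hedged sketch using only the three flags documented on this slide; how per-run parameters such as frame and window sizes are passed to tidigit_decode is not shown in the deck, so the sketch assumes they are configured elsewhere (e.g., in a configuration file read by the tool).

```python
# Minimal wrapper around the documented tidigit_decode invocation.
import subprocess

def decode_run():
    """Run one cross-word triphone train/decode cycle; returns exit status."""
    cmd = [
        "tidigit_decode",
        "-model_type", "xwrd_triphone",  # context-dependent cross-word triphones
        "-train_mode", "baum_welch",     # standard forward-backward training
        "-decode_mode", "loop_grammar",  # any digit may follow any digit
    ]
    return subprocess.run(cmd, check=False).returncode

if __name__ == "__main__":
    print("tidigit_decode exited with status", decode_run())
```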

  7. Language Model • Combining acoustic and language models • Language model contribution = P(W)^LM × IP^N(W), where N(W) is the number of words in hypothesis W • LM: the language model scale [we did not observe a change in WER when varying it] • IP: the insertion penalty, i.e., the cost of inserting a new word • IP is determined empirically to optimize recognition performance • (a scoring sketch follows this slide)
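In practice this combination is applied in the log domain. Below is a minimal sketch consistent with the formula above; since the deck's IP values (-10 to -200) look like log-domain quantities, the penalty is added once per word rather than exponentiated. The lm_scale default and all example numbers are illustrative assumptions, not the toolkit's settings.

```python
# Combine acoustic and language model scores in the log domain:
# log P(O|W) + LM * log P(W) + N(W) * IP
import math

def combined_log_score(log_p_acoustic, log_p_lm, num_words,
                       lm_scale=15.0, ip=-10.0):
    """Total hypothesis score; lm_scale and ip are assumed example values."""
    return log_p_acoustic + lm_scale * log_p_lm + num_words * ip

# Example: a 3-digit hypothesis under the loop grammar, where each of the
# 11 digits is equally likely, so log P(W) = 3 * log(1/11).
score = combined_log_score(log_p_acoustic=-250.0,
                           log_p_lm=3 * math.log(1 / 11),
                           num_words=3)
print(f"combined log score: {score:.2f}")
```

A more negative IP suppresses insertion errors at the risk of causing deletions, which is why, as the slide notes, it is tuned empirically.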

  8. Test Results: Effect of Insertion Penalty on WER • The same trend holds for the other (frame, window) combinations; the remaining two pairs tested are (10, 25) and (15, 25)

  9. State-Tying Results • These results are reproduced from Naveen's thesis

  10. References • J. Picone, "Lecture." [Online]. Available: http://www.isip.msstate.edu/publications/courses • X. Huang, A. Acero, and H. Hon, Spoken Language Processing, Prentice Hall, 2001. • F. Jelinek, Statistical Methods for Speech Recognition, The MIT Press, 1999.

  11. Questions

  12. State-Tying (Reference 3)
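Slide 2 parameterizes state-tying by split, merge, and occupancy thresholds. As a rough illustration of how such thresholds trade model size against trainability, here is a toy, self-contained sketch: real systems split on phonetic questions in a decision tree (see Reference 3), whereas this sketch clusters scalar state means just to stay runnable, and the merge step is omitted for brevity. All names and numbers are assumptions.

```python
# Toy threshold-driven clustering in the spirit of split/occupancy tying.
from dataclasses import dataclass

@dataclass
class State:
    mean: float       # toy 1-D Gaussian mean for this HMM state
    occupancy: float  # frames assigned to the state during training

def tie_states(states, split_thresh=1.0, occupancy_thresh=100.0):
    """Greedily split a cluster at its widest mean gap while the gap exceeds
    split_thresh and both halves keep at least occupancy_thresh frames."""
    clusters, work = [], [sorted(states, key=lambda s: s.mean)]
    while work:
        c = work.pop()
        gaps = [(c[i + 1].mean - c[i].mean, i) for i in range(len(c) - 1)]
        if not gaps:                      # single-state cluster: keep as-is
            clusters.append(c)
            continue
        gap, i = max(gaps)                # widest gap between adjacent means
        left, right = c[:i + 1], c[i + 1:]
        if (gap > split_thresh
                and sum(s.occupancy for s in left) >= occupancy_thresh
                and sum(s.occupancy for s in right) >= occupancy_thresh):
            work += [left, right]         # split survives both thresholds
        else:
            clusters.append(c)            # otherwise the cluster is tied
    return clusters

states = [State(0.1, 80), State(0.2, 120), State(2.5, 90), State(2.7, 150)]
print(len(tie_states(states)), "tied states")  # lower thresholds => more splits
```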
