Optimizing One-Stage Speech Recognition Model with Intra- and Inter-Syllable Analysis
This presentation describes a one-stage search for speech recognition that handles both intra-syllable and inter-syllable state transitions. For every state, the search finds the best path, records a back trace, and accumulates the likelihood. The procedure allocates buffers for likelihoods, back traces, and state durations, initializes the observation probabilities at t=0, updates every model frame by frame, and finally back-traces the optimal path to obtain the recognized text and segment positions.
Presentation Transcript
Recognition
1. The number of characters is unknown.
2. Which characters make up the answer is unknown.
Consider all cases (411+1+1):
bre, sil, syll, ..., syll
State cases
[Diagram: state transitions. Intra-syllable: within bre, sil, and syll. Inter-syllable: between bre, sil, and syll.]
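All frame-to-model alignments
[Diagram: grid with axes "model" and "frame", marked "start" and "end".]
The one-stage method: for every state (411*8+1+3), do three things:
(1) find the best path
(2) record the best path (backtrace)
(3) accumulate the likelihood
1. The first state picks the best of the (411+1+1) paths (top_one search).
2. Every other state picks between itself and the previous state.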
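The three per-state steps above can be sketched for a single left-to-right model over one frame. This is a minimal illustration, not the presenter's code: the function name `update_model`, the use of log likelihoods, the meaning given to the back-trace entries (the frame at which the preceding model ended), and the toy numbers are all assumptions.

```python
import math

NEG_INF = -math.inf  # assumption: a disabled state is marked with -infinity


def update_model(prev_ll, prev_bk, top_ll, top_bk, obs_ll):
    """Advance one left-to-right model by one frame (hypothetical helper).

    prev_ll[s]: accumulated log likelihood of state s at the previous frame
    prev_bk[s]: back-trace entry carried by state s
    top_ll, top_bk: best model-exit score at the previous frame and the
                    back-trace entry to record when a model is entered from it
    obs_ll[s]: observation log likelihood of state s at the current frame
    """
    n = len(prev_ll)
    new_ll = [NEG_INF] * n
    new_bk = [None] * n
    for s in range(n):
        # (1) Find the best path: the first state compares itself with the
        #     top_one result; every other state compares itself with the
        #     previous state.
        stay = prev_ll[s]
        if s == 0:
            enter, enter_bk = top_ll, top_bk
        else:
            enter, enter_bk = prev_ll[s - 1], prev_bk[s - 1]
        if stay >= enter:
            best, bk = stay, prev_bk[s]
        else:
            best, bk = enter, enter_bk
        # (2) Record the best path (back trace).
        new_bk[s] = bk
        # (3) Accumulate the likelihood.
        new_ll[s] = best + obs_ll[s]
    return new_ll, new_bk


# Toy usage: a three-state model whose last state is still disabled.
new_ll, new_bk = update_model(
    prev_ll=[-1.0, -5.0, NEG_INF],
    prev_bk=[0, 0, None],
    top_ll=-0.5,
    top_bk=3,
    obs_ll=[-1.0, -1.0, -1.0],
)
print(new_ll, new_bk)
```

In this toy call, state 0 is re-entered from the top_one result because -0.5 beats its own -1.0, while states 1 and 2 each inherit from the state before them.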
One_stage program flow

Begin
• Allocate
  • L, LL (likelihood)
  • BK, BBK (for back trace)
  • D, DD (state duration)
• Initialization (t=0)
  • observation prob (t=0)
  • disable all models (D=0)
  • enable the first state of all syllables, silence, and breath
  • top_likelihood
  • top_model
  • FM[0] = MODEL-2 (for back trace)
  • FF[0] = -1 (for time)
• Silence model
  • state 0: self, top_model
• 411 models: Initial
  • state 0 (first): self, top_model
  • other states: self, previous state
• 411 models: Final
  • state 3: self, last state of Initial
  • other states: self, previous state
• Breath model
  • state 0: self, top_model
  • other states: self, previous state
• Top_one_search
  • for the 411 syllables, silence, and breath
  • update top_likelihood
  • update top_model
• t > Nframe?
• Switch buffer
  • L, LL
  • BK, BBK
  • D, DD
• Disable all models
  • D = 0
• Find optimal path
  • back trace to find the recognized text and segment positions
End
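The flow above can be sketched end to end on a toy problem. The following is a minimal sketch under stated assumptions, not the presenter's program: it borrows the buffer names L/LL, BK/BBK, FM, and FF from the slide, but stores FM and FF per frame, marks disabled states with -infinity instead of a duration buffer D, treats every model as a plain left-to-right chain (no Initial/Final split), and uses three tiny made-up models in place of the 411 syllables, silence, and breath. The helper `toy_observations` and all scores are hypothetical.

```python
import math

NEG_INF = -math.inf  # assumption: -infinity stands in for "model disabled"


def top_one_search(L, BK, n_states):
    """Best last-state likelihood over all models, with its model and back trace."""
    best = (NEG_INF, -1, -1)
    for m, n in enumerate(n_states):
        if L[m][n - 1] > best[0]:
            best = (L[m][n - 1], m, BK[m][n - 1])
    return best


def one_stage_decode(obs, n_states):
    """obs[t][m][s] is the log likelihood of frame t in state s of model m.

    Returns a list of (model, start_frame, end_frame) segments.
    """
    n_frames = len(obs)
    # Allocate, then initialize at t=0: only the first state of each model is enabled.
    L = [[NEG_INF] * n for n in n_states]
    BK = [[-1] * n for n in n_states]  # frame at which the preceding model ended
    for m in range(len(n_states)):
        L[m][0] = obs[0][m][0]
    top_ll, FM, FF = [], [], []  # per-frame top likelihood, top model, its entry time
    for t in range(n_frames):
        if t > 0:
            LL = [[NEG_INF] * n for n in n_states]
            BBK = [[-1] * n for n in n_states]
            for m, n in enumerate(n_states):
                for s in range(n):
                    stay = L[m][s]
                    if s == 0:  # first state: self or top_model
                        enter, enter_bk = top_ll[t - 1], t - 1
                    else:       # other states: self or previous state
                        enter, enter_bk = L[m][s - 1], BK[m][s - 1]
                    if stay >= enter:
                        best, bk = stay, BK[m][s]
                    else:
                        best, bk = enter, enter_bk
                    LL[m][s] = best + obs[t][m][s]
                    BBK[m][s] = bk
            L, BK = LL, BBK  # switch buffer
        ll, m, f = top_one_search(L, BK, n_states)
        top_ll.append(ll)
        FM.append(m)
        FF.append(f)
    # Find the optimal path: walk back through FM/FF from the last frame.
    segments = []
    t = n_frames - 1
    while t >= 0 and FM[t] >= 0:
        segments.append((FM[t], FF[t] + 1, t))
        t = FF[t]
    return segments[::-1]


def toy_observations(path, n_states, hit=-1.0, miss=-10.0):
    """Hypothetical scores: `hit` for the intended (model, state) per frame, else `miss`."""
    return [[[hit if (m, s) == target else miss for s in range(n)]
             for m, n in enumerate(n_states)] for target in path]


n_states = [2, 2, 1]  # toy models: 0 and 1 are two-state "syllables", 2 is "sil"
intended = [(2, 0), (0, 0), (0, 1), (1, 0), (1, 1)]
segments = one_stage_decode(toy_observations(intended, n_states), n_states)
print(segments)
```

Because every model is entered from the same top_one result, a single pass over the frames is enough: the back trace only needs, for each frame, which model ended best there and when that model was entered.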