This paper presents an approach to Letter-to-Phoneme (L2P) alignment, which Text-to-Speech (TTS) systems need in order to pronounce words correctly; applications include automated telecom services and pronunciation assistance. Using graphical models and machine learning, we address two properties of spelling that make alignment hard, inconsistency and lack of transparency, and we learn the alignments without supervision. Our framework produces aligned dictionaries, evaluates them, and accommodates language evolution, including new words and proper nouns.
Letter to Phoneme Alignment Using Graphical Models
N. Bolandzadeh, R. Rabbany
Dept. of Computing Science, University of Alberta
Text to Speech Problem
• Conversion of text to speech: TTS
• Automated telecom services
• E-mail by phone
• Banking systems
• People with disabilities
Pronunciation
• Pronunciation of words: dictionary words and non-dictionary words
• Phonetic analysis
• Dictionary lookup? Language is alive: new words are added, and proper nouns are not covered
• Machine learning gives higher accuracy
• L2P alignment is needed
L2P Problem
• Automatic Speech Recognition & Spelling Correction
• Letter to Phoneme Alignment
• Letter: c a k e
• Phoneme: k ei k
It's not trivial! Why?
• No consistency (the same sound is spelled differently, and the same letter sounds different)
  • City /s/
  • Cake /k/
  • Kid /k/
• No transparency (letter count versus phoneme count)
  • K i d (3) /k i d/ (3)
  • S i x (3) /s i k s/ (4)
  • Q u e u e (5) /k j u:/ (3)
  • A x e (3) /a k s/ (3)
Framework
Dictionary → L2P aligner → Aligned Dictionary

Word         Phonemes   Aligned letters       Aligned phonemes
Brick        brIk       b|r|i|ck|             b|r|I|k|
Brightening  br2tHIN    b|r|ig|ht|en|i|ng|    b|r|2|t|H|I|N|
British      brItIS     b|r|i|t|i|sh|         b|r|I|t|I|S|
Bronx        brQNks     b|r|o|n|x|            b|r|Q|N|ks|
Bugle        bjugP      b|u|g|le|             b|ju|g|P|
Buoy         b4         bu|oy|                b|4|
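To make the aligned-dictionary notation concrete, here is a minimal Python sketch that reads one aligned entry and pairs each letter chunk with its phoneme chunk. The function names `split_chunks` and `pair_alignment` are hypothetical and not from the slides; the only thing taken from the source is the convention that every chunk is followed by `|`.

```python
def split_chunks(aligned):
    """Split a '|'-terminated string such as 'b|r|i|ck|' into its chunks."""
    return [chunk for chunk in aligned.split("|") if chunk]


def pair_alignment(letters, phonemes):
    """Pair the k-th letter chunk with the k-th phoneme chunk."""
    letter_chunks = split_chunks(letters)
    phoneme_chunks = split_chunks(phonemes)
    if len(letter_chunks) != len(phoneme_chunks):
        raise ValueError("both sides must contain the same number of chunks")
    return list(zip(letter_chunks, phoneme_chunks))


print(pair_alignment("b|r|i|ck|", "b|r|I|k|"))
# [('b', 'b'), ('r', 'r'), ('i', 'I'), ('ck', 'k')]
```

Each pair is one sub-alignment: `('ck', 'k')` maps two letters to one phoneme, and for Bronx the last pair is `('x', 'ks')`, one letter to two phonemes.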
Evaluation
• No aligned dictionary is available
• Unsupervised learning
• Previously, the aligner was tied to a generator
• Evaluation on the percentage of correctly predicted phonemes and words
• Aligned Dictionary → Tee’s L2P Generator → Accuracy
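The two evaluation figures named above, percentage of correctly predicted phonemes and of correctly predicted words, could be computed along the following lines. This is a sketch under assumptions: the slides do not say how predicted and reference phonemes are matched, so the code compares them position by position, and the function names are hypothetical.

```python
def phoneme_accuracy(predicted, reference):
    """Fraction of reference phonemes matched at the same position.

    Assumption: position-by-position comparison. Missing predictions count
    as errors; predictions beyond the reference length are ignored.
    """
    correct = 0
    total = 0
    for pred, ref in zip(predicted, reference):
        total += len(ref)
        correct += sum(p == r for p, r in zip(pred, ref))
    return correct / total if total else 0.0


def word_accuracy(predicted, reference):
    """Fraction of words whose whole phoneme sequence is predicted exactly."""
    if not reference:
        return 0.0
    matches = sum(pred == ref for pred, ref in zip(predicted, reference))
    return matches / len(reference)


# Reference pronunciations for Brick and Buoy, and a made-up generator output.
reference = [["b", "r", "I", "k"], ["b", "4"]]
predicted = [["b", "r", "I", "k"], ["b", "Q"]]
print(phoneme_accuracy(predicted, reference))  # 5 of 6 phonemes match
print(word_accuracy(predicted, reference))     # 1 of 2 words match
```

Word accuracy is the stricter measure: a single wrong phoneme makes the whole word count as an error.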
Model of our problem
Letters:  B | r | i | t | i | sh |
Phonemes: B | r | I | t | I | S |
Static Model, Structure
• Independent sub-alignments
• Letters: l1 l2 l3 l4 … ln-1 ln
• Alignments: a1 a2 … ak
• Phonemes: p1 p2 p3 p4 … pm-1 pm
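One way to read "independent sub-alignments" is as a factorization of the joint probability over the k sub-alignments. The formula below is an interpretation, not something stated on the slide; the symbols \lambda(a_j) and \pi(a_j), standing for the letter chunk and the phoneme chunk covered by sub-alignment a_j, are notation introduced here for illustration.

```latex
P(l_1 \dots l_n,\; p_1 \dots p_m,\; a_1 \dots a_k)
  \;=\; \prod_{j=1}^{k} P\bigl(\lambda(a_j),\, \pi(a_j)\bigr)
```

Under this reading, the alignment b|r|i|t|i|sh| with b|r|I|t|I|S| has k = 6 factors, the last being the probability of pairing "sh" with "S".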
Static Model, Learning
• EM
• Initialize parameters
• Expectation step: Parameters → Alignments
• Maximization step: Alignments → Parameters
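The EM loop above can be sketched in Python as follows. This is a generic many-to-many alignment EM, not the authors' graphical model. Its assumptions are: each chunk covers one or two letters and one or two phonemes (`MAX_CHUNK`), each phoneme is a single character as in the dictionary slide, the parameters are joint probabilities of (letter chunk, phoneme chunk) pairs, and the expectation step uses a forward-backward pass. All names are hypothetical.

```python
from collections import defaultdict
from itertools import product

MAX_CHUNK = 2  # assumption: a chunk spans at most two symbols on each side


def moves(i, j, n, m):
    """Yield chunk sizes (di, dj) that fit starting at letter i, phoneme j."""
    for di, dj in product(range(1, MAX_CHUNK + 1), repeat=2):
        if i + di <= n and j + dj <= m:
            yield di, dj


def expected_counts(letters, phonemes, prob):
    """E-step for one word: expected count of every chunk pair."""
    n, m = len(letters), len(phonemes)
    # forward[i][j]: total probability of aligning letters[:i] with phonemes[:j]
    forward = [[0.0] * (m + 1) for _ in range(n + 1)]
    forward[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            for di, dj in moves(i, j, n, m):
                pair = (letters[i:i + di], phonemes[j:j + dj])
                forward[i + di][j + dj] += forward[i][j] * prob(*pair)
    # backward[i][j]: total probability of aligning letters[i:] with phonemes[j:]
    backward = [[0.0] * (m + 1) for _ in range(n + 1)]
    backward[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            for di, dj in moves(i, j, n, m):
                pair = (letters[i:i + di], phonemes[j:j + dj])
                backward[i][j] += prob(*pair) * backward[i + di][j + dj]
    counts = defaultdict(float)
    total = forward[n][m]
    if total == 0.0:  # no alignment is possible under the chunk limits
        return counts
    for i in range(n + 1):
        for j in range(m + 1):
            for di, dj in moves(i, j, n, m):
                pair = (letters[i:i + di], phonemes[j:j + dj])
                weight = forward[i][j] * prob(*pair) * backward[i + di][j + dj]
                if weight > 0.0:
                    counts[pair] += weight / total
    return counts


def train(dictionary, iterations=10):
    """Alternate E-steps and M-steps; return the pair probability table."""
    table = {}  # empty table: every pair starts with the same weight

    def prob(letter_chunk, phoneme_chunk):
        return table.get((letter_chunk, phoneme_chunk), 0.0) if table else 1.0

    for _ in range(iterations):
        counts = defaultdict(float)
        for letters, phonemes in dictionary:  # E-step: parameters -> alignments
            for pair, count in expected_counts(letters, phonemes, prob).items():
                counts[pair] += count
        norm = sum(counts.values())
        if norm == 0.0:
            break
        # M-step: alignments -> parameters
        table = {pair: count / norm for pair, count in counts.items()}
    return table


dictionary = [("brick", "brIk"), ("british", "brItIS"),
              ("bronx", "brQNks"), ("bugle", "bjugP"), ("buoy", "b4")]
table = train(dictionary)
for pair, p in sorted(table.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(p, 3))
```

A known weakness of this joint-probability formulation is that alignments with fewer, larger chunks multiply fewer factors and so tend to be favoured; a practical system would need to counteract that, and would add a decoding step that extracts the single best alignment per word.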
Dynamic Model
• Sequence of data
• Unrolled model for T=3 slices
• Letters: l1 l2 l3 l4 l5 l6
• Alignments: a1 a2 … ak
• Phonemes: p1 p2 p3 p4 p5 p6