
Graphical Models of Articulation Using GMTK






Presentation Transcript


  1. Graphical Models of Articulation Using GMTK Karen Livescu Massachusetts Institute of Technology CLSP Workshop 2001 August 16, 2001

  2. Overview • Motivation • Why use articulatory models for speech recognition? • Why use graphical models to represent articulation? • Proposed model • Issues in articulatory modeling for ASR • Choice of features • Model size and constraints • Initialization • Structure learning • Conclusion

  3. Why articulatory modeling? [Diagram: the articulators (lips, velum, glottis, tongue)] • Definition: • Articulatory features specify the state of the articulators (directly or implicitly) at a given point in time • Features can be binary/multivalued, discrete/continuous, partial/complete • Motivation: • Speech may be better described by the asynchronous motion of articulators than by phones with rigid start and end times • Articulatory features concisely represent coarticulatory effects such as nasalization and inserted stop closures (“warmth” → “warmpth”) • Articulatory features can help recover information; e.g., a vowel is more likely to be nasalized if the following nasal is deleted • Pronunciation modeling: phones are more likely to be modified by articulatory changes than replaced outright with other phones

  4. Example: “warmth” → “warmpth”
  • Phone-based view:
  Brain: Give me a []!
  Lips, tongue, velum, glottis (in unison, for each phone): Right on it, sir!
  • Articulatory view:
  Brain: Give me a []!
  Lips: Huh?
  Velum, glottis: Right on it, sir!
  Tongue: Umm…yeah, OK.

  5. Why graphical models to represent articulation? • Graphical models allow us to • quickly and easily specify structures with a large number of variables • represent knowledge about the variables in a concise and transparent way • easily reason and communicate about the conditional independence relations in the model • perform structure learning explicitly on the variables of interest • Articulatory models use a large number of variables in each time frame, and therefore benefit from all of the above properties

  6. Articulatory graphical model for speech recognition [Diagram: two consecutive frames i and i+1; each frame contains a phone variable, articulatory variables a1, a2, …, aN, and an observation variable obs, with each variable also depending on its counterpart in the previous frame] • Initial version implemented at CLSP Workshop 2001

  7. GMTK implementation of articulatory structure

  variable : phone {
    type: discrete hidden cardinality NUM_PHONES;
    switchingparents: nil;
    conditionalparents: word(0), wordPosition(0) using MTCPT("wordWordPos2Phone");
  }
  variable : voicing {
    type: discrete hidden cardinality 2;
    switchingparents: nil;
    conditionalparents: phone(0), voicing(-1) using MDCPT("voicingMDCPT");
  }
  variable : velum {
    type: discrete hidden cardinality 2;
    switchingparents: nil;
    conditionalparents: phone(0), velum(-1) using MDCPT("velumMDCPT");
  }
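The per-frame dependency in the structure above, where each articulatory feature is conditioned on the current phone and on its own value in the previous frame, can be sketched outside GMTK as well. The Python toy below uses made-up phones and CPT entries (not trained parameters) to sample a binary voicing chain from P(voicing_t | phone_t, voicing_{t-1}):

```python
import random

random.seed(0)

# Toy CPT mirroring conditionalparents: phone(0), voicing(-1).
# voicing_cpt[(phone, prev_voicing)] = P(voicing_t = 1). Values are illustrative.
voicing_cpt = {
    ("aa", 0): 0.9, ("aa", 1): 0.95,   # vowels strongly favor voicing
    ("s", 0): 0.05, ("s", 1): 0.2,     # /s/ favors voicelessness, with some inertia
}

def sample_chain(phones, cpt, init=0):
    """Sample a binary feature chain: feature_t ~ P(. | phone_t, feature_{t-1})."""
    feats, prev = [], init
    for p in phones:
        prob_one = cpt[(p, prev)]
        prev = 1 if random.random() < prob_one else 0
        feats.append(prev)
    return feats

print(sample_chain(["s", "aa", "aa", "s"], voicing_cpt))
```

In GMTK the same dependence is expressed declaratively via MDCPTs and inference handles all chains jointly; this sketch only shows the conditional structure of a single feature.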

  8. Issues in articulatory modeling • Choice of features • Feature set developed with Eva Holtz and Katrin Kirchhoff at WS01 • 8 features, state space size = 8,960 • Model size and constraints • Use inter-frame (and inter-articulator) constraints to limit the state space • Experiment with varying the size of the state space • Initialization • Initialize from existing phone models • If multiple phones match the same feature setting, aggregate their models • If no phone matches a feature setting, interpolate the models of phones with similar features (similarly to the HAMM of Richardson, Bilmes, and Diorio) • Discriminative structure learning over articulatory variables
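The initialization scheme above can be illustrated with a small sketch. The phone set, the one-dimensional Gaussian means, the binary feature vectors, and the similarity weighting below are all illustrative assumptions (this is not the WS01 feature set): exact feature matches are averaged, and a feature setting with no matching phone falls back to a similarity-weighted interpolation of all phone models:

```python
import math

# Hypothetical phone models: a 1-D Gaussian mean per phone (placeholder values),
# and a binary feature vector (voicing, nasality) per phone.
phone_means = {"m": -1.0, "n": -0.8, "b": 0.5, "d": 0.7}
phone_feats = {"m": (1, 1), "n": (1, 1), "b": (1, 0), "d": (1, 0)}

def init_mean(target_feats):
    """Initialize the model for a feature setting from existing phone models:
    average the models of all phones whose features match exactly; if none
    match, interpolate all phone models weighted by feature agreement."""
    matches = [p for p, f in phone_feats.items() if f == target_feats]
    if matches:  # aggregate exact matches
        return sum(phone_means[p] for p in matches) / len(matches)
    # fall back: weight each phone by exp(number of agreeing features)
    weights = {p: math.exp(sum(a == b for a, b in zip(f, target_feats)))
               for p, f in phone_feats.items()}
    z = sum(weights.values())
    return sum(w * phone_means[p] for p, w in weights.items()) / z
```

For example, the setting (1, 1) is matched by both nasals here, so its mean is the average of the "m" and "n" models, while an unmatched setting such as (0, 1) gets a blend leaning toward the phones sharing more features.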

  9. Conclusion • Speech may be better modeled by articulatory features than by phones • Graphical models in general, and GMTK in particular, allow for easy experimentation with articulatory models • Plan for future work: • Experiments with pre-specified articulatory structure • Structure learning on articulatory variables
