
GEPPETO 1 : A modeling approach to study the production of speech gestures

GEPPETO 1 : A modeling approach to study the production of speech gestures. Pascal Perrier (ICP – Grenoble) with Stéphanie Buchaillard (PhD) Matthieu Chabanas (ICP) Ma Liang (PhD), Yohan Payan (TIMC – Grenoble).


Presentation Transcript


  1. GEPPETO¹: A modeling approach to study the production of speech gestures • Pascal Perrier (ICP – Grenoble) with Stéphanie Buchaillard (PhD), Matthieu Chabanas (ICP), Ma Liang (PhD), Yohan Payan (TIMC – Grenoble) • ¹ GEstures shaped by the Physics and by a PErceptually oriented Targets Optimization

  2. Outline • Introduction • Current hypotheses implemented in GEPPETO • Some results obtained with a 2D biomechanical tongue model • New issues raised by the use of a 3D biomechanical tongue model

  3. Basic issues in Speech Production Research • Phonology/Phonetics Interface • Link between discrete representations and continuous physical signals • Nature of physical correlates of speech units

  4. Basic issues in Speech Production Research • Control and Production of Speech Gestures • Control variables • Central representations of physical characteristics of the speech production apparatus • Interaction Perception-Action

  5. Basic issues in Speech Production Research • From Gestures to Speech Sounds • Nature of acoustic sources • Relations between motor commands and acoustics • Interaction between airflow and articulatory gestures

  6. What is GEPPETO? • An evolving modeling framework to quantitatively test hypotheses about the control and the production of speech gestures • It includes: • Hypotheses about the physical correlates of phonological units • Models of motor control • Physical models of the speech production apparatus

  7. Current Hypotheses • Phonology/Phonetics Interface • The smallest phonological unit is the phoneme • Phonemes are associated with target regions in the auditory domain • Larger phonological units are associated with speech sequences for which specific constraints exist for target optimization or for motor command sequencing

  8. Current Hypotheses • Control of speech gestures • Control variables: λ commands (EP Hypothesis, Feldman, 1966) • No online use of feedback going through the cortex • Short-delay orosensory and proprioceptive feedback is taken into account • Existence in the brain of internal representations of the speech apparatus (internal models)
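
As a rough illustration of the control variables named on this slide (the threshold commands of the EP hypothesis, usually written λ), the sketch below follows a common formulation of Feldman's model in which a muscle is recruited only when its length exceeds its threshold. The function names and the parameter values (`mu`, `rho`, `c`) are illustrative assumptions, not values taken from the presentation.

```python
import math

def muscle_activation(length, lam, velocity=0.0, mu=0.01):
    """Activation A = max(0, l - lambda + mu * dl/dt).

    `mu` (velocity feedback gain) is a hypothetical value.
    """
    return max(0.0, length - lam + mu * velocity)

def muscle_force(activation, rho=1.0, c=0.112):
    """Active force grows exponentially with activation: F = rho * (exp(c * A) - 1).

    `rho` and `c` are placeholder constants chosen for illustration.
    """
    return rho * (math.exp(c * activation) - 1.0)

# A muscle shorter than its threshold stays silent; a longer one is recruited.
silent = muscle_force(muscle_activation(length=50.0, lam=55.0))
active = muscle_force(muscle_activation(length=60.0, lam=55.0))
```

Lowering the threshold of a muscle therefore raises its force at a given length, which is how shifting the commands moves the equilibrium position of the articulator in this kind of model.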

  9. Current Hypotheses • Control of speech gestures • Internal representations do not account for the whole physical complexity of the speech production apparatus • Kinematic characteristics are not directly controlled. They are the results of the interaction between motor control setups and physical phenomena of speech production • Which characteristics of speech signals are specifically controlled?

  10. Application to the generation of speech gestures with a 2D biomechanical tongue model • Implementation of the model of control • Inversion from desired perceptual objectives to motor commands • Generation of gestures

  11. 2D Biomechanical Model • Finite element structure • Linear elasticity (small deformations) • Gravity is not accounted for

  12. 2D Biomechanical Model • Muscles shown: posterior genioglossus, anterior genioglossus, hyoglossus

  13. 2D Biomechanical Model • Muscles shown: styloglossus, verticalis, inferior longitudinalis

  14. Learning a static internal model: From λ commands to formants • Step 1: Uniform sampling of the λ command space • Generation of the corresponding tongue shapes (9000 simulations)

  15. Learning a static internal model: From λ commands to formants • Step 2: Computation of the area function

  16. Learning a static internal model: From λ commands to formants • Step 3: Formant computation for 2 lip apertures (red dots: spread lips; blue dots: rounded lips)

  17. Learning a static internal model: From λ commands to formants • Step 4: Learning and generalizing with radial basis functions • [Figure labels: 1st layer, 2nd layer]
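
The four steps above can be sketched as follows, under strong simplifying assumptions: the biomechanical simulation and the formant computation are replaced by a hypothetical `toy_forward_model`, the command space is reduced to two dimensions, and the network is a plain Gaussian radial basis layer followed by a linear output layer fitted by least squares. None of the sizes or constants come from the presentation.

```python
import numpy as np

def toy_forward_model(commands):
    """Hypothetical stand-in for the tongue simulation plus formant computation."""
    x, y = commands[:, 0], commands[:, 1]
    return np.column_stack([500 + 200 * np.sin(x), 1500 + 400 * np.cos(y) * x])

def rbf_design(inputs, centers, width):
    """Gaussian radial basis activations (first layer), plus a bias column."""
    sq_dist = ((inputs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.column_stack([np.exp(-sq_dist / (2.0 * width ** 2)), np.ones(len(inputs))])

rng = np.random.default_rng(0)

# Step 1: uniform sampling of a (toy, 2-D) command space.
train_cmds = rng.uniform(-1.0, 1.0, size=(400, 2))
# Steps 2-3: "simulate" the formants for each sampled command.
train_formants = toy_forward_model(train_cmds)

# Step 4: fit the linear output weights (second layer) by least squares.
centers = train_cmds[:60]   # assumption: a subset of the samples serves as centers
width = 0.5                 # assumption: one shared kernel width
weights, *_ = np.linalg.lstsq(rbf_design(train_cmds, centers, width), train_formants, rcond=None)

def predict(commands):
    """Static internal model: commands -> predicted formants."""
    return rbf_design(commands, centers, width) @ weights

# Generalization check on commands that were not used for training.
test_cmds = rng.uniform(-1.0, 1.0, size=(100, 2))
rms_error = np.sqrt(np.mean((predict(test_cmds) - toy_forward_model(test_cmds)) ** 2))
```

Once fitted, `predict` is cheap to evaluate, which is what makes such a static model usable inside an iterative inversion.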

  18. Inversion: From target regions to λ commands • Target regions • Dispersion ellipses in the (F1, F2, F3) space • Currently defined by Fc1, Fc2, Fc3 and σF1, σF2, σF3 • [Figure: Target regions for some non-rounded French phonemes]

  19. Inversion: From target regions to λ commands • Target regions • Dispersion ellipses in the (F1, F2, F3) space • Currently defined by Fc1, Fc2, Fc3 and σF1, σF2, σF3 • [Figure: Target regions for some non-rounded French phonemes]
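
A target region defined by three centre frequencies and three dispersions can be tested for membership with a normalized distance, as in the minimal sketch below. The function names, the unit `radius`, and the numerical values are assumptions made for illustration; the slides do not give the actual centres or dispersions.

```python
def normalized_distance(formants, centers, sigmas):
    """Distance to the target centre, each formant axis scaled by its dispersion."""
    return sum(((f - c) / s) ** 2 for f, c, s in zip(formants, centers, sigmas)) ** 0.5

def in_target_region(formants, centers, sigmas, radius=1.0):
    """True when the point lies inside the axis-aligned dispersion ellipsoid."""
    return normalized_distance(formants, centers, sigmas) <= radius

# Hypothetical target (Hz); placeholder values, not taken from the slides.
centers = (300.0, 2000.0, 3000.0)
sigmas = (30.0, 100.0, 150.0)

inside = in_target_region((310.0, 2050.0, 2950.0), centers, sigmas)
outside = in_target_region((400.0, 2000.0, 3000.0), centers, sigmas)
```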

  20. Inversion: From target regions to λ commands • Optimization: cost minimization (gradient descent technique) • Cost for a sequence made of N phonemes: a speaker-oriented term + a listener-oriented term [equation not preserved in the transcript]
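
Since the cost equation itself did not survive in the transcript, the following is only a sketch of what a two-term cost minimized by gradient descent could look like. It assumes a speaker-oriented term that penalizes the distance between successive commands and a listener-oriented term that penalizes the normalized distance to each target centre; the one-dimensional `forward` model, the weighting, and all numbers are hypothetical.

```python
def forward(lam):
    """Hypothetical static internal model: one command -> one formant (Hz)."""
    return 1000.0 + 10.0 * lam

def sequence_cost(lams, targets, sigmas, listener_weight):
    # Speaker-oriented term (assumed): squared distance between successive commands.
    speaker = sum((b - a) ** 2 for a, b in zip(lams, lams[1:]))
    # Listener-oriented term (assumed): normalized distance to each target centre.
    listener = sum(((forward(l) - t) / s) ** 2 for l, t, s in zip(lams, targets, sigmas))
    return speaker + listener_weight * listener

def gradient_descent(lams, targets, sigmas, listener_weight, step=0.1, iterations=500, eps=1e-6):
    """Plain gradient descent using forward finite-difference gradients."""
    lams = list(lams)
    for _ in range(iterations):
        base = sequence_cost(lams, targets, sigmas, listener_weight)
        grad = []
        for i in range(len(lams)):
            bumped = lams[:i] + [lams[i] + eps] + lams[i + 1:]
            grad.append((sequence_cost(bumped, targets, sigmas, listener_weight) - base) / eps)
        lams = [l - step * g for l, g in zip(lams, grad)]
    return lams

targets = [1100.0, 1300.0, 1200.0]   # placeholder target centres (Hz)
sigmas = [50.0, 50.0, 50.0]          # placeholder dispersions (Hz)
start = [0.0, 0.0, 0.0]

optimized = gradient_descent(start, targets, sigmas, listener_weight=50.0)
cost_before = sequence_cost(start, targets, sigmas, 50.0)
cost_after = sequence_cost(optimized, targets, sigmas, 50.0)
```

In this toy setting the optimized commands are pulled toward one another compared with the commands that would hit each target centre exactly, which is the kind of trade-off between articulatory economy and perceptual accuracy that a two-term cost expresses.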

  21. Inversion: From target regions to λ commands • Example 1: Sequence [œ-e-k-i]

  22. Inversion: From target regions to λ commands • Example 2: Sequence [œ-e-k-a]

  23. Production of tongue movements from inferred λ commands • Serial command patterns • No difference between vowels and consonants • Sequence: [oe] [e] [k] [a]

  24. Execution of tongue movements from inferred λ commands • Öhman’s model: vowel-to-vowel basis • Consonants are seen as perturbations of the V-V transition • Sequence: [oe] [e] [k] [a]
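
The difference between the two command patterns can be illustrated with a scalar command, as in the sketch below. It assumes linear transitions between targets and, for the Öhman-style pattern, a consonant command that is blended onto an underlying vowel-to-vowel transition; the presentation does not specify the actual time functions, so `ramp`, `serial_command`, and `ohman_command` are hypothetical helpers.

```python
def ramp(t, start, end):
    """Fraction of a linear transition completed at time t, clamped to [0, 1]."""
    if t <= start:
        return 0.0
    if t >= end:
        return 1.0
    return (t - start) / (end - start)

def serial_command(t, targets, times):
    """Serial pattern: the command visits every target in order (vowels and consonants alike)."""
    value = targets[0]
    for prev, nxt, t0, t1 in zip(targets, targets[1:], times, times[1:]):
        value += (nxt - prev) * ramp(t, t0, t1)
    return value

def ohman_command(t, v1, v2, consonant, v_times, c_times):
    """Oehman-style pattern: V-to-V transition with a superimposed consonant perturbation."""
    vowel = v1 + (v2 - v1) * ramp(t, *v_times)
    c_on, c_peak, c_off = c_times
    weight = ramp(t, c_on, c_peak) * (1.0 - ramp(t, c_peak, c_off))
    return vowel + weight * (consonant - vowel)

# Toy V1-C-V2 sequence with commands 0 -> 10 -> 5 (arbitrary units, times in seconds).
serial_mid = serial_command(0.1, [0.0, 10.0, 5.0], [0.0, 0.1, 0.2])
ohman_mid = ohman_command(0.1, 0.0, 5.0, 10.0, (0.0, 0.2), (0.05, 0.1, 0.15))
ohman_end = ohman_command(0.2, 0.0, 5.0, 10.0, (0.0, 0.2), (0.05, 0.1, 0.15))
```

Both patterns reach the consonant command at its peak time, but in the second one the vowel-to-vowel transition is already under way while the consonant is produced.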

  25. Execution of tongue movements from inferred λ commands • Observed flesh point

  26. Production of tongue movements from inferred λ commands • Serial command patterns • Sequence: [a] [i]

  27. Production of tongue movements from inferred λ commands • Öhman’s command patterns • Sequence: [a] [i]

  28. Interaction control / physics: Influence on the shapes of the articulatory paths • Example: the articulatory loops, [aka] and [ika] • R. Houde (1969)

  29. Fluid-Wall Interaction • [Diagram labels: imposed pressure difference; flow model; forces; mechanics of the tissues (finite element model); deformation]

  30. Interaction control / physics: Influence on the shapes of the articulatory paths • Example: the articulatory loops, [aka] • [Plot: displacement in the X-Y plane (X in mm, Y in mm); curves for PS = 3000 Pa (+++), PS = 800 Pa (......), and no aerodynamics (-------)]

  31. Interaction control / physics: Influence on the shapes of the articulatory paths • Example: the articulatory loops, [aka] • [Panels: no aerodynamics; with aerodynamics]

  32. Interaction control / physics: Influence on the shapes of the articulatory paths • Example: the articulatory loops, [ika] • [Plot: displacement in the X-Y plane (X in mm, Y in mm); curves for PS = 1600 Pa (...) and no aerodynamics (----)]

  33. Interaction control / physics: Influence on the shapes of the articulatory paths • Example: the articulatory loops, [ika] • [Panels: no aerodynamics; with aerodynamics]

  34. A 3D biomechanical tongue model: For a better account of physics • Visible Human Project® data (Wilhelms-Tricarico, 2003) • Finite element mesh made of hexahedra • Adaptation of the mesh to a specific speaker (PB) • [Figure credits: Gerard et al., ICP Grenoble; Wilhelms-Tricarico R., 1995]

  35. Inner muscle structure of the tongue • Genioglossus (medium) • Genioglossus (anterior) • Styloglossus • Geniohyoid • Genioglossus (posterior) • Hyoglossus • Verticalis • Transversus • Inferior longitudinalis • Mylohyoid • Superior longitudinalis

  36. Vocal tract structure • Tongue body • Hyoid bone • Mandible • Palate • Other muscles

  37. Elastic properties of tongue muscles • Hyperelastic material (2nd-order Yeoh model) with large deformation hypothesis • [Plot: force versus displacement for a tongue indentation test, linear and nonlinear curves]
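
For reference, a second-order Yeoh material has the strain energy W = C10 (I1 - 3) + C20 (I1 - 3)^2. The sketch below evaluates the resulting nominal stress for an incompressible sample in uniaxial tension and compares it with the small-strain linear law. The uniaxial loading case and the constants `C10` and `C20` are assumptions chosen for illustration, not parameters reported in the presentation.

```python
def yeoh_uniaxial_stress(stretch, c10, c20):
    """Nominal stress of an incompressible 2nd-order Yeoh solid in uniaxial tension."""
    i1 = stretch ** 2 + 2.0 / stretch          # first invariant for uniaxial loading
    dw_di1 = c10 + 2.0 * c20 * (i1 - 3.0)      # derivative of W with respect to I1
    return 2.0 * (stretch - stretch ** -2) * dw_di1

def linear_stress(stretch, c10):
    """Small-strain limit: Young's modulus E = 6 * C10 for an incompressible solid."""
    return 6.0 * c10 * (stretch - 1.0)

C10, C20 = 1000.0, 500.0   # Pa; hypothetical placeholder constants

small = yeoh_uniaxial_stress(1.001, C10, C20)
large = yeoh_uniaxial_stress(1.5, C10, C20)
large_without_c20 = yeoh_uniaxial_stress(1.5, C10, 0.0)
```

The two laws agree for very small stretches, while the second-order term stiffens the material at large deformation, which is the regime the large deformation hypothesis is meant to cover.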

  38. Effect of gravity [1s]

  39. Dealing with gravity with the EP hypothesis [300ms]

  40. Dealing with gravity with the EP hypothesis • Activation of GGp and MH • Increase of reflex activity [300ms]

  41. Dealing with gravity with the EP hypothesis • GGp activation

  42. Dealing with gravity with the EP hypothesis Example of a good choice of control parameters [300ms]

  43. Conclusions • A model of control based on perceptual objectives, specified in terms of formant target regions associated with λ motor commands, and on an optimization process using a static model of the motor-perception relations can generate realistic speech movements if it is applied to a realistic physical model of speech production.

  44. Conclusions • It supports our hypothesis that there is no need to assume the existence of a central optimization process that would apply to the articulatory trajectories as a whole (e.g. minimum jerk, minimum torque…)

  45. Conclusions • It gives an interesting account of coarticulation phenomena by separating the effects of planning from those of physics. • It makes it possible to test hypotheses about the phonological units (see serial model versus Öhman’s model).

  46. Conclusions • However: • A systematic comparison with data is required (currently in progress for French, German, Chinese, Japanese) • No account of time control, or of hypo/hyperspeech • No account of gravity

  47. Conclusions • Need to work on more complex internal representations that would integrate some aspects of articulatory dynamics.

  48. Thank you

  49. Influence of elasticity modeling • Activation of the hyoglossus (2 N) • [Panels: hyperelastic; large-deformation linear; small-deformation linear]
