Version WS 2007-8 Speech Science X Production models
How can we model the production process? • We need to explain how we control the complex articulatory activity in terms of the different sounds that we can hear. • Whether or not the model is oriented towards perception, the change from one sound to another has to be explained. = How do we move the articulators to where they have to be? = How do we know when they have reached the correct position so that we can move on to the next sound?
How do we move the articulators to where they have to be? • How is the movement defined? - as a specific gesture? - as a gesture of a specific duration? - as a target position of the (main) articulator? - as (a position related to) an auditory percept? - as a spatial configuration? • The target concepts are closely related to one another … All the ideas take the sound segment (phoneme) as the unit which is being controlled.
How do we know whether the articulators have reached their target? • There are a number of “feedback” channels: - visual feedback? (no use in speech) - auditory feedback? (too late) - tactile feedback? (yes, but not enough) - kinaesthetic feedback? (yes) - “proprioceptive” feedback? (yes) • Tactile and proprioceptive feedback provide information about the momentary position. So we know whether an articulatory action “feels right” or not.
Feedback use? • Closed loop reports on a preceding action and this triggers the next command:
Types of feedback 2 • Open loop gives information on the result of a sequence of commands.
Do we use closed-loop feedback at all? • Open-loop feedback allows rapid sequences of commands to be carried out (we don't think about each syllable, but we still have to monitor events). • Closed-loop feedback is useful (and necessary) for units that take long enough for us to observe them … and for units that require some conscious decisions or planning, e.g. at phrase level (phonologically, the intonation phrase).
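The speed argument can be made concrete with a toy timing model. This is only a sketch under stated assumptions: the function names, the fixed command time and the fixed feedback delay are all hypothetical and do not come from the lecture or from measurements.

```python
def closed_loop_time(n_commands, command_ms, feedback_ms):
    """Each command waits for the feedback on the preceding action."""
    return n_commands * (command_ms + feedback_ms)


def open_loop_time(n_commands, command_ms, feedback_ms):
    """Commands run back to back; feedback reports once on the whole sequence."""
    return n_commands * command_ms + feedback_ms


# Hypothetical values: 5 commands, 50 ms per command, 100 ms feedback delay.
print(closed_loop_time(5, 50, 100))
print(open_loop_time(5, 50, 100))
```

In this toy model the closed-loop sequence pays the feedback delay once per command, while the open-loop sequence pays it only once, which is the sense in which open-loop control permits rapid command sequences.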
Evidence for open loop at syllable level • Kozhevnikov & Chistovich (K&C) from Leningrad provided the first evidence in the 1960s and 70s, showing different degrees of variance in units of different sizes. • In sentence repetitions of "Tonya topila banyu", they found that syllable-duration variance was greater than phrase-duration variance … and they found that neighbouring syllables correlated negatively with one another (i.e., if one was shorter, the next was longer).
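The two statistics can be illustrated with a few lines of Python. The durations below are invented for illustration and are not K&C's measurements; the `pearson` helper and all variable names are likewise assumptions of this sketch.

```python
from statistics import mean, variance

# Made-up durations in ms (NOT K&C's data): five repetitions of one
# three-syllable phrase, one row per repetition.
repetitions = [
    (150, 210, 180),
    (170, 190, 185),
    (160, 205, 170),
    (180, 185, 180),
    (155, 200, 185),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Transpose: one tuple of durations per syllable position.
syllables = list(zip(*repetitions))
syllable_variances = [variance(s) for s in syllables]
phrase_variance = variance([sum(r) for r in repetitions])
neighbour_correlations = [pearson(a, b) for a, b in zip(syllables, syllables[1:])]

print("syllable variances:", syllable_variances)
print("phrase variance:", phrase_variance)
print("neighbour correlations:", neighbour_correlations)
```

With data of this shape, a syllable that comes out short tends to be followed by a long one, so the deviations cancel within the phrase and the phrase duration varies less than its parts.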
But what is the basic production unit? • K&C looked at syllables in the phrase, but they didn't take for granted that the syllable is THE basic unit of articulatory planning. • They deduced it from the patterns of effects with changes of speech rate. They found: - Rate changes occur between phrases, - Rate changes do not affect the relationship between syllables and words, - But rate changes do affect the relationship between consonants and vowels.
Syllable units (but C+V together) • K&C’s results and our lip-rounding observations indicate that C and V commands seem to be issued at the same time.
Control of gestures towards targets • We have still not explained how the very fast and muscularly complex gestures are controlled. • Different muscle tensions, and even different muscles, are important for one sound in different contexts. • An automatically feedback-controlled servo-system would be the engineering solution.
Muscle spindles as a servo system - summary • The muscle spindle is set to the required target tension. • Fast afferent (feedback) nerve fibres report on the discrepancy between momentary muscle tension and target tension. • Fast commands direct to the muscle correct the discrepancy between momentary muscle tension and target tension. = target reached, independent of the position of the articulator prior to movement.
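The servo idea summarised above can be sketched as a simple proportional feedback loop. This is an engineering illustration only, not a physiological model: the function name, the gain and the number of steps are assumptions of the sketch.

```python
def servo_settle(start_tension, target_tension, gain=0.5, steps=50):
    """Repeatedly correct a fraction of the reported tension discrepancy."""
    tension = start_tension
    for _ in range(steps):
        discrepancy = target_tension - tension  # reported by the afferent fibres
        tension += gain * discrepancy           # corrective command to the muscle
    return tension


# The same target is reached from different starting tensions.
print(servo_settle(0.0, 1.0))
print(servo_settle(3.0, 1.0))
```

Because the loop only ever acts on the discrepancy, the end state depends on the target and not on where the muscle started, which mirrors the last point of the slide.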
Mass spring model of speech production 1 • The picture painted so far is simple because we have only considered one muscle. • As we have discussed before, any one gesture requires the coordination of many muscles … • … and any one sound requires the coordination of a number of gestures (tongue, lips, etc.) • The target position for any given sound is the product of all the target tensions of all participating agonist and antagonist muscles.
Mass spring model of speech production 2 • If all the target tensions are set, the servo-system moves the articulators “automatically” towards the target position. • This is compared to a mass moving under the influence of a (damped) spring. The movement stops when the point of equilibrium is reached. • Advantage of model: It explains “undershoot” (a frequently observed reduction of a sound). How? The next target is defined before the previous one is reached.
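Undershoot in a damped mass-spring system can be reproduced with a short simulation. The sketch below assumes a single critically damped mass, arbitrary units, and a target that switches from 0 to 1 and back to 0; the function name, stiffness, mass and time step are illustrative choices, not values from the lecture.

```python
import math


def peak_position(hold_time, stiffness=400.0, mass=1.0, dt=0.001, total_time=1.0):
    """Largest position reached when target 1.0 is held for hold_time seconds."""
    damping = 2.0 * math.sqrt(stiffness * mass)  # critical damping (assumption)
    position, velocity, peak = 0.0, 0.0, 0.0
    for step in range(int(total_time / dt)):
        t = step * dt
        target = 1.0 if t < hold_time else 0.0   # next target replaces the old one
        force = -stiffness * (position - target) - damping * velocity
        velocity += (force / mass) * dt          # semi-implicit Euler integration
        position += velocity * dt
        peak = max(peak, position)
    return peak


print(peak_position(hold_time=0.5))   # target held long enough to be reached
print(peak_position(hold_time=0.05))  # next target set early: undershoot
```

When the target is replaced early, the mass turns back before it reaches the equilibrium point, so the peak stays well below 1.0 even though the target itself was never changed.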
Coordinative structures • Disadvantage of the “mass spring model” as originally conceived: It cannot explain the observed compensation for articulatory disturbances! • It is therefore assumed (no real proof) that the “mass” is made up of sub-structures which work together and compensate for each other. A well-documented coordinative structure of this kind is the jaw + lips; but also the jaw + tongue.
Coarticulation vs. Co-production • “Coarticulation” assumes that features from one phoneme spread into a neighbouring one. - There is no explanation for the different durations of sound segments (as a function of stress & tempo). - There is no explanation for how certain features spread. In short, no real link to production models. • “Co-production” assumes a fixed (but unknown) duration for each speech sound. Phonemes are not abstract, but concrete articulatory things. • Different durations are the result of differing degrees of overlap.
Advantage of a co-production view • Stronger (stressed) syllables have less overlap among their segments. Unstressed syllables have more overlap. - More overlap implies that the command for the following segment comes before the target of the preceding segment is reached. - This results in a “reduced” realisation (shorter and often spectrally less well defined). • But the different phonetic realisation is not the result of a different articulatory plan for the sound.
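The overlap idea can be put into numbers with a small sketch. Everything in it is assumed for illustration: the intrinsic durations are invented, and the rule that neighbouring segments overlap by a fraction of the shorter one is one possible formalisation, not something the lecture specifies.

```python
def surface_duration(intrinsic_ms, overlap_fraction):
    """Total duration when each neighbouring pair overlaps by a fraction
    of the shorter segment (hypothetical overlap rule)."""
    overlap = sum(
        overlap_fraction * min(a, b)
        for a, b in zip(intrinsic_ms, intrinsic_ms[1:])
    )
    return sum(intrinsic_ms) - overlap


# The same articulatory plan (same intrinsic durations) in two stress conditions.
segments = [80, 120, 70]                  # invented values in ms
stressed = surface_duration(segments, 0.1)    # little overlap
unstressed = surface_duration(segments, 0.5)  # much overlap
print(stressed, unstressed)
```

The plan for each segment is identical in both calls; only the degree of overlap differs, and that alone produces the shorter unstressed realisation.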
Summary • We have shown that a target-orientated model can be explained physiologically. • We have seen that feedback is vital and can be used in an automatic servo-system. • We have seen that complex articulatory patterns a) depend on mutually compensating muscle synergies, b) can fit into “packets” of lower-order commands (“macros”) which allow more complex units to be produced automatically.