Vowels (again) February 23, 2010
The News • For Thursday: • Give me a (one paragraph or so) description of what you’re thinking of doing for a term project • Also note: two new readings have been posted • Peterson & Barney (1952) • Liljencrants & Lindblom (1972)
Fun Stuff • Who is producing each of these vowels? (And which vowel are they producing?)
Source/Filter Lab Review • Silke made predictions on the basis of her formant values:
Practical Stuff • So you want to plot your formant space…
Source/Filter Lab Review • Stephanie made an interesting (general) prediction:
Peterson & Barney (1952) • Gordon Peterson and Harold Barney conducted a landmark study of variability in the production and perception of English vowels way back in 1952. • Methods: • Recorded speakers of “General American English” reading a list of 10 hVd words (heed, hid, head, etc.) twice. • 76 speakers (33 men, 28 women, 15 children) • Measured the F0, F1, F2 and F3 from the midpoint of all 1520 vowels (76 speakers × 10 words × 2 readings). • Presented all 1520 vowels to 70 listeners in a vowel identification experiment (in eight sessions).
Peterson & Barney (1952) • Acoustically, they found much variability in vowel production • Also: much overlap in terms of absolute formant frequencies • General confirmation of F1-F2 vowel space schema • “herd” distinguished by low F3.
Peterson & Barney (1952) • They organized their response data in the form of a confusion matrix. • Each row corresponds to the “intended vowel” • = the stimulus category • Each column corresponds to the classification made by the listeners • = the response category
Peterson & Barney (1952) • Some confusion matrix basics: • Entries on the main diagonal represent correct responses. • Entries off the main diagonal represent the “confusions” • Popular confusions here include: • “hod” perceived as “hawed” (1013 / 10273) • “hid” perceived as “head” (694 / 10279)
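A confusion rate is just an off-diagonal count divided by the number of responses to that stimulus. Here is a minimal sketch using only the two pairs of counts quoted on this slide, assuming the second number in each pair is the total number of responses to that stimulus. The function name `confusion_rate` is made up for illustration.

```python
def confusion_rate(confusions, presentations):
    """Proportion of responses to one stimulus that went to one wrong category."""
    return confusions / presentations

# Counts quoted on the slide (Peterson & Barney 1952).
hod_as_hawed = confusion_rate(1013, 10273)
hid_as_head = confusion_rate(694, 10279)

print(f"hod -> hawed: {hod_as_hawed:.1%}")  # hod -> hawed: 9.9%
print(f"hid -> head:  {hid_as_head:.1%}")   # hid -> head:  6.8%
```

So roughly one in ten “hod” tokens was heard as “hawed”, which is a lot for a study in which most vowels were identified correctly.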
Peterson & Barney (1952) • Summing up the columns provides a rough sense of the listeners’ response bias • = tendency to favor one response category over another, independent of the stimulus presented • Popular options: “had” (10906), “hawed” (10737) • Not-so-popular: “hid” (9813), “hud” (9956)
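To see how column sums expose response bias, here is a small sketch with a toy 3×3 confusion matrix. The counts are invented for illustration and are not Peterson & Barney's data; the helper names `column_sums` and `percent_correct` are hypothetical too.

```python
# Toy confusion matrix with made-up counts (NOT Peterson & Barney's data).
# Rows = stimulus (intended vowel), columns = listener response.
labels = ["heed", "hid", "head"]
matrix = [
    [98, 2, 0],    # stimulus "heed"
    [3, 85, 12],   # stimulus "hid"
    [0, 10, 90],   # stimulus "head"
]

def column_sums(m):
    """Total number of times each response category was chosen."""
    return [sum(row[j] for row in m) for j in range(len(m[0]))]

def percent_correct(m):
    """Share of all responses that fall on the main diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return 100 * correct / total

response_totals = dict(zip(labels, column_sums(matrix)))
print(response_totals)          # {'heed': 101, 'hid': 97, 'head': 102}
print(percent_correct(matrix))  # 91.0
```

Each toy stimulus was presented 100 times, so an unbiased listener group would give each response about 100 times. Here “head” is slightly over-chosen and “hid” slightly under-chosen.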
Peterson & Barney (1952) • Note: listeners identified only 94.4% of vowels correctly • “heed”, “who’d” and “herd” were highly distinct; • “hod” and “head” were not • The available response options in the neighborhood matter…
Source/Filter Lab Review • Sue plotted some confusion matrices:
Source/Filter Lab Review • Rhonda (and Jon) broke things down by features:
Class Confusion Matrix • This is the response data summed across all conditions… • From all five listeners.
Back to Perturbation Theory • Basic idea #1: vocal tract resonances (formants) are the result of standing waves in the vocal tract • These standing waves have areas where velocity alternates between high and low (anti-nodes), and areas where velocity does not change (nodes)
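As a reminder of where the standing waves put the formants, here is a minimal sketch for the simplest case. It assumes a uniform tube that is closed at the glottis and open at the lips, 17.5 cm long, with a speed of sound of 35,000 cm/s. Those values and the function name `tube_resonances` are illustrative assumptions, not measurements from the lab.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for the speed of sound in the vocal tract

def tube_resonances(length_cm, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    """
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4 * length_cm)
        for n in range(1, count + 1)
    ]

print(tube_resonances(17.5))  # [500.0, 1500.0, 2500.0]
```

Those are the familiar schwa-like formant values; perturbation theory describes how constrictions push them up or down from there.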
Perturbation Principles • Basic Idea #2: constriction at a velocity anti-node decreases a resonant frequency
Perturbation Principles • Basic Idea #3: constriction at a velocity node increases a resonant frequency
Labial • Constrictions in the labial region are at anti-nodes for both F1 and F2. • Labial constrictions decrease both F1 and F2
Palatal • Constrictions in the palatal region are at an F2 node and near an F1 anti-node • F1 decreases; F2 increases
Velar • Constrictions in the velar region are at an F2 anti-node and near an F1 anti-node • F1 decreases; F2 decreases
Pharynx • Constrictions in the pharyngeal region are at an F2 anti-node and near an F1 node • F1 increases; F2 decreases
Larynx • Constrictions in the laryngeal region are at an F2 node and an F1 node • F1 increases; F2 increases
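The node/anti-node rule can be sketched numerically for an idealized uniform tube. In that idealization (an assumption here, not something measured in class), a small constriction shifts resonance n in proportion to cos((2n − 1)πx), where x runs from 0 at the glottis to 1 at the lips: positive at a velocity node, negative at a velocity anti-node. The region positions below are rough landmark values chosen for illustration; the velar region falls between these landmarks and is left out of the sketch.

```python
import math

# Idealized landmark positions (assumptions): 0.0 = glottis, 1.0 = lips.
REGIONS = {"laryngeal": 0.0, "pharyngeal": 1 / 3, "palatal": 2 / 3, "labial": 1.0}

def perturbation(formant_number, position):
    """Relative shift of a resonance for a small constriction at `position`.

    +1 at a velocity node (frequency rises), -1 at a velocity anti-node
    (frequency falls), for a uniform tube closed at the glottis.
    """
    return math.cos((2 * formant_number - 1) * math.pi * position)

def direction(formant_number, position):
    """Report the sign of the shift in words."""
    return "increases" if perturbation(formant_number, position) > 0 else "decreases"

for name, pos in REGIONS.items():
    print(f"{name:>10}: F1 {direction(1, pos)}; F2 {direction(2, pos)}")
```

For these four landmark positions the signs agree with the region-by-region summary on the slides.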
Different Sources • For a particular articulatory configuration, the vocal tract will resonate at a certain set of frequencies… • no matter what the sound source is. • (Remember the talk box) • Let’s see what happens when we change our sound source to a duck call…
Duck Call Vowels • Now let’s filter the duck call with differently shaped plastic tubes… • Care to make any predictions? • [Figure label: “duck call is placed here”] • Source: http://www.exploratorium.edu/exhibits/vocal_vowels/vocal_vowels.html
Another View [i]
How About These? • [Figure label: “duck call is placed on this side”]
[i] vs. [e] [i] [e]
[u] vs. [o] [u] [o]
Philosophical Fragments • Consider the Cardinal Vowels. • Two “anchor” vowels: • [i] - Cardinal Vowel 1 - highest, frontest vowel possible • [ɑ] - Cardinal Vowel 5 - lowest, backest vowel possible • Remaining vowels are spaced at equal intervals of frontness and height between the anchor vowels. • Note: [u] - Cardinal Vowel 8 - may serve as a third anchor as the highest, backest, roundest vowel possible • Q: Why are the first two anchors unrounded… • While the third anchor is rounded?
Perturbation to the Rescue! • Rounding back vowels takes advantage of an acoustic synergy… • which lowers both F1 and F2. • [Figure: vocal tract regions labeled Larynx, Pharynx, Velar, Palatal, Labial] • Q: Is there anything wrong with rounding other (non-back) vowels?
A “Bad” Vowel Space • One answer is found in the typical structure of vowel systems. • For instance, a five vowel system is rarely, if ever, distributed thusly: • [i] • [e] • [æ]
Five Vowel Spaces • Many languages with only five vowels spread them out evenly in the vowel space in a triangle • Here’s a popular vowel space option: • i u • e o • a
A Complicated Vowel Space • The language is Swedish.
Adaptive Dispersion Theory • Developed by Björn Lindblom and Johan Liljencrants • (Swedish speakers) • Adaptive Dispersion theory says: • Vowels should be as acoustically distinct from each other as possible • (This helps listeners identify them correctly) • So…languages tend to maximize the distance between vowels in acoustic space • Note: lack of [ɑ] ~ [ɔ] distinction in Canadian English.
Liljencrants & Lindblom (1972) • Attempted to quantify “contrast” in the vowel space, • in order to emphasize the importance of perception in the formation of phonological structure. • They start with an articulatory model of the limits of the vowel space: • note: space is plotted in three formants… • and in mels (auditory equivalent of frequency)
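For a feel of what plotting in mels does, here is a minimal sketch of a Hz-to-mel conversion. It uses one widely used formula, 2595 · log10(1 + f/700); that is an assumption on my part and may not be the exact conversion Liljencrants & Lindblom used. The function name `hz_to_mel` is illustrative.

```python
import math

def hz_to_mel(frequency_hz):
    """Convert Hz to mels with a common textbook formula (an assumed choice)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)

# Equal steps in Hz shrink in mels as frequency goes up.
for hz in (500.0, 1000.0, 2000.0, 4000.0):
    print(f"{hz:6.0f} Hz -> {hz_to_mel(hz):7.1f} mel")
```

The scale is anchored so that 1000 Hz comes out at about 1000 mel, and it compresses higher frequencies, which shrinks F2 and F3 differences relative to F1 differences.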
Liljencrants & Lindblom (1972) • Quantification of contrast in the space: • Given m pairs of n vowels, • where $m = n(n-1)/2$ • and $r_i$ = the Euclidean distance between the $i$th pair of vowels, in formant space (so $r_i^2$ is the squared distance), • the perceptual goal of the system is to minimize $E = \sum_{i=1}^{m} 1/r_i^2$ • I.e., the more formant space between vowels, • the easier they will be to distinguish from one another. • Note: floating magnets analogy • Also: crowded elevator analogy
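Here is a minimal sketch of the pair count and of one way to score a vowel system, assuming the goal is to minimize the sum of 1/r² over all vowel pairs (my reading of the Liljencrants & Lindblom criterion, in line with the magnets analogy). The vowel coordinates are invented, in arbitrary units, and the function names are hypothetical.

```python
from itertools import combinations

def pair_count(n):
    """Number of distinct vowel pairs, m = n(n - 1) / 2."""
    return n * (n - 1) // 2

def dispersion_energy(vowels):
    """Sum of 1 / r^2 over all vowel pairs, with r the Euclidean distance."""
    total = 0.0
    for a, b in combinations(vowels.values(), 2):
        r_squared = sum((x - y) ** 2 for x, y in zip(a, b))
        total += 1.0 / r_squared
    return total

# Made-up (F1, F2) positions in arbitrary units, for illustration only.
spread = {"i": (0.0, 2.0), "u": (0.0, 0.0), "a": (2.0, 1.0)}
crowded = {"i": (0.0, 2.0), "e": (0.5, 1.8), "ae": (1.0, 1.6)}

print(pair_count(5))               # 10
print(dispersion_energy(spread))   # lower energy: better dispersed
print(dispersion_energy(crowded))  # higher energy: vowels too close together
```

Because close pairs contribute large 1/r² terms, the crowded front-vowel system scores much worse than the triangular one, which is the crowded elevator intuition in numbers.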
Liljencrants & Lindblom (1972) • In perceptually optimal systems… • vowels tend to spread out around the edges of the available space. • The model also tends to predict more high vowel contrasts than are normally found in languages.