
Speech perception



Presentation Transcript


  1. Langston Psycholinguistics Lecture 3 Speech perception

  2. Plan • Top-down • Comprehension • Bottom-up

  3. Plan • Our goal is to start with the input and see how far we can take it. • Constraint satisfaction problem. • We will introduce top-down influences when the situation demands it.

  4. What is speech? • Levels of analysis: • Acoustic: The physical speech signal. • Articulatory: How it's made. • Phones: Individual sounds (approximately 4000 available, about 869 used in some language, about 100 account for most; Kluender, 1994). • Phonemes: Mental representations of sounds, or sounds that affect meaning. Often made up of several phones treated as alike (e.g., the /k/ in keep and the /k/ in cool).

  5. What is speech? • Levels of analysis: • Phonemes: Not all differences in sounds are phonemic (e.g., the aspirated /p/ in pin vs. the unaspirated /p/ in spin). Allophones: Set of phones treated as identical by a language. • Changing a phoneme will change the meaning (bit vs. pit). • You can map a language's phonemes by looking for minimal pairs.
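The minimal-pair procedure on this slide can be sketched in a few lines of Python. This is only an illustration: the toy lexicon, its ASCII phoneme symbols ("I" for the vowel in bit, "ae" for the vowel in bat), and the function names are assumptions made up for the example, not material from the lecture.

```python
from itertools import combinations

# Hypothetical toy lexicon: word -> tuple of phoneme symbols.
# The transcriptions use made-up ASCII symbols for illustration only.
LEXICON = {
    "bit": ("b", "I", "t"),
    "pit": ("p", "I", "t"),
    "bat": ("b", "ae", "t"),
    "pin": ("p", "I", "n"),
    "bin": ("b", "I", "n"),
}


def minimal_pairs(lexicon):
    """Return (word1, word2, phoneme1, phoneme2) for every pair of words
    that have the same length and differ in exactly one segment."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0][0], diffs[0][1]))
    return pairs


def contrasts(lexicon):
    """Unordered phoneme contrasts supported by at least one minimal pair."""
    return {frozenset((a, b)) for _, _, a, b in minimal_pairs(lexicon)}
```

Calling `contrasts(LEXICON)` collects the sound differences that change meaning in the toy lexicon, which is the sense in which minimal pairs "map" a phoneme inventory. A real analysis would need a full transcribed lexicon.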

  6. What is speech? • Levels of analysis: • Phonemes: Languages seem to choose phonemes to maximize distinctiveness: • (Kluender, 1994)

  7. What is speech? • Levels of analysis: • Morphemes: Units that actually have meaning (we'll come to these later).

  8. Articulatory Phonetics • Based on how sounds are produced. • A consonant is: • Air + • Voicing (on or off) + • Manner (some form of disruption) + • Place (where the disruption happens)

  9. Articulatory Phonetics • The places:

  10. Articulatory Phonetics • Here's a link to a map with a clickable glossary: http://www.sil.org/mexico/ling/glosario/E005ci-PlacesArt.htm

  11. Articulatory Phonetics • The manners: • Plosive (stop): Completely stop the air flow. • Fricative: Interrupt the air flow and create friction. • Affricate: Stop released to a fricative. • Nasal: Stop with sound coming out the nasal passages. • Flap: Brief stoppage. • Trill: Hold it in place and let it vibrate.

  12. Articulatory Phonetics • The manners: • Approximant: Like a fricative, but with little obstruction. • Liquids: Central (air flows over the middle of the tongue) or lateral (air flows around the sides of the tongue). • Glides: Similar to a vowel but with the tongue creating a small amount of turbulence (also called semivowels).

  13. Articulatory Phonetics • IPA table:

  14. Articulatory Phonetics • English:

  15. Articulatory Phonetics • English: • Bilabial: • Stop: voiced bin, unvoiced pin • Nasal: man • Approximants: wind • Labiodental: • Fricative: voiced vat, unvoiced fat • Dental: • Fricative: voiced then, unvoiced thin

  16. Articulatory Phonetics • English: • Alveolar: • Stop: voiced dip, unvoiced tip • Nasal: nap • Flap: city • Fricative: voiced zap, unvoiced sap • Approximants: central rip, lateral lip • Post-alveolar: (palatal?) • Fricative: voiced azure, unvoiced sure • Affricate: voiced jug, unvoiced chug

  17. Articulatory Phonetics • English: • Palatal: • Approximant: your • Velar: • Stop: voiced got, unvoiced cot • Nasal: sing • Glottal: • Stop: satin • Fricative: hen
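The "voicing + manner + place" description of a consonant lends itself to a small feature table. The sketch below encodes only the English consonants for which the slides state voicing explicitly, keyed by the slides' example words. The class name, field names, and helper functions are assumptions introduced for illustration.

```python
from typing import NamedTuple, Optional


class Consonant(NamedTuple):
    example: str   # example word from the slides
    place: str     # where the disruption happens
    manner: str    # kind of disruption
    voiced: bool   # voicing on or off


# Only entries whose voicing is given on the slides are listed here.
CONSONANTS = [
    Consonant("bin", "bilabial", "stop", True),
    Consonant("pin", "bilabial", "stop", False),
    Consonant("vat", "labiodental", "fricative", True),
    Consonant("fat", "labiodental", "fricative", False),
    Consonant("then", "dental", "fricative", True),
    Consonant("thin", "dental", "fricative", False),
    Consonant("dip", "alveolar", "stop", True),
    Consonant("tip", "alveolar", "stop", False),
    Consonant("zap", "alveolar", "fricative", True),
    Consonant("sap", "alveolar", "fricative", False),
    Consonant("azure", "post-alveolar", "fricative", True),
    Consonant("sure", "post-alveolar", "fricative", False),
    Consonant("jug", "post-alveolar", "affricate", True),
    Consonant("chug", "post-alveolar", "affricate", False),
    Consonant("got", "velar", "stop", True),
    Consonant("cot", "velar", "stop", False),
]


def by_feature(place=None, manner=None, voiced=None):
    """Example words whose consonant matches every feature that is given."""
    return [
        c.example
        for c in CONSONANTS
        if (place is None or c.place == place)
        and (manner is None or c.manner == manner)
        and (voiced is None or c.voiced == voiced)
    ]


def voicing_partner(word) -> Optional[str]:
    """Example word with the same place and manner but opposite voicing."""
    target = next((c for c in CONSONANTS if c.example == word), None)
    if target is None:
        return None
    for c in CONSONANTS:
        if (c.place, c.manner) == (target.place, target.manner) and c.voiced != target.voiced:
            return c.example
    return None
```

For instance, `by_feature(manner="stop", voiced=False)` lists the unvoiced stops, and `voicing_partner("bin")` finds the word whose consonant differs from it in voicing alone.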

  18. Articulatory Phonetics • A vowel is: • Part of the tongue: front, center, back + • Height: high, mid, low

  19. Articulatory Phonetics • English:

  20. Articulatory Phonetics • English: • Front: beet, bit, baby, bet, bat • Central: hut, sofa, bird, heater • Back: boot, book, bode, bought, hot

  21. Articulatory Phonetics • English: • Also diphthongs: cute, bite, bough, boy • Also suprasegmentals (added on to the vowels): • Stress: blackbird vs. black bird • Length • Tone contour

  22. Acoustic Phonetics • You can use a spectrograph to produce a spectrogram. This is a graphic representation of speech.

  23. Acoustic Phonetics • If you download Praat you can produce your own spectrograms relatively easily. Get Praat here: http://www.fon.hum.uva.nl/praat/
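As a rough illustration of what a spectrogram contains, the sketch below computes one in plain Python: the signal is cut into short overlapping frames, each frame is windowed, and the energy at each frequency is measured with a discrete Fourier transform. This is a minimal teaching sketch under stated assumptions, not a description of how Praat computes its spectrograms. The function names, the frame length, the hop size, and the synthetic 1000 Hz test tone are all made up for the example.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (tapers the edges of each frame)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitude of the DFT for bins 0..n/2 (the non-negative frequencies)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(total))
    return mags


def spectrogram(samples, frame_len=64, hop=32):
    """List of frames (time runs left to right); each frame is a list of
    magnitudes per frequency bin. Larger values would be drawn darker."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append(dft_magnitudes([s * w for s, w in zip(chunk, window)]))
    return frames


def peak_frequency(mags, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in one frame."""
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / frame_len


# Synthetic input (an assumption for the demo): a 1000 Hz sine at 8000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]
spec = spectrogram(tone)
```

With this pure tone every frame has its strongest bin at the tone's frequency. Real speech would instead show several bands of concentrated energy that move over time.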

  24. Acoustic Phonetics • The acoustic approach is to analyze the physical speech signal without making reference to how it was produced.

  25. As an aside, we can think about vision for a minute… • [D/H= tan(θ)]

  26. Acoustic Phonetics • Formant: “a concentration of acoustic energy around a particular frequency in the speech wave” (Praat Tutorial, see next page for link).

  27. Acoustic Phonetics • You can learn more about formants in the Praat tutorial here: http://person2.sol.lu.se/SidneyWood/praate/whatform.html

  28. Acoustic Phonetics • Formant transition: A sharp rise or fall in a formant. Usually signals a consonant.

  29. Acoustic Phonetics • Steady state: Part of a formant with little or no change. Generally corresponds to a vowel.

  30. Acoustic Phonetics • The darker the band, the more energy there is at that frequency. • You can see sounds change over time by reading from left to right.

  31. Acoustic Phonetics • Problems for perception: • Parallel transmission: You do not produce phonemes like beads on a necklace. Instead, you are transmitting overlapping parts of phonemes in parallel (Easter eggs).

  32. Acoustic Phonetics • Problems for perception: • Parallel transmission:

  33. Acoustic Phonetics • Problems for perception: • Context conditioned variation: Each phoneme is affected by surrounding phonemes (lack of invariance).

  34. Acoustic Phonetics • Problems for perception: • Context conditioned variation:

  35. How Does Perception Work? • From Kerzel & Bekkering (2000; doi:10.1037/0096-1523.26.2.634): • Direct realism: “listeners to speech recover information about the articulatory activities of the vocal tract from various sources of information” (p. 635). • But: Not motor based. The articulators structure the “informational medium.”

  36. How Does Perception Work? • From Kerzel & Bekkering (2000): • Direct realism: “when the ear of the listener is stimulated by the acoustic medium, the structure is imparted and the listener perceives the speaker's gestures” (p. 635). • Can also come from structuring of optic medium. • Direct perception.

  37. Example 1 http://sunburn.stanford.edu/~nick/compdocs/, click on Practical HI Examples.pdf

  38. Examples 4 & 5 http://www.baddesigns.com/file.html http://www.baddesigns.com/sidewalk.html

  39. How Does Perception Work? • Direct realism: • Carello, Anderson, & Kunkler-Peck (1998; doi:10.1111/1467-9280.00040): Information in the auditory signal can be used to recover information about lengths of dowels (I'll be dropping some dowels).

  40. How Does Perception Work? Carello, Anderson, & Kunkler-Peck (1998, p. 212)

  41. How Does Perception Work? Carello, Anderson, & Kunkler-Peck (1998, p. 212)

  42. How Does Perception Work? • Direct realism: • Kunkler-Peck & Turvey (2000; doi:10.1037/0096-1523.26.1.279): Auditory information can also be used to recover information about an object's shape.

  43. How Does Perception Work? • Direct realism: • To sum up: The signal contains sufficient structure to recover a distal property (shape). Speech could work the same way (the distal property is phonetic gesture).

  44. How Does Perception Work? • From Kerzel & Bekkering (2000): • Fuzzy logical model of perception (FLMP): • “features are evaluated in terms of prototypes of syllables” (p. 635). • “degree of correspondence to the prototype is determined” (p. 635). • “the relative goodness of match of each prototype is evaluated, and the prototype with the best match is selected” (p. 635).

  45. How Does Perception Work? • From Kerzel & Bekkering (2000): • Fuzzy logical model of perception (FLMP): • “speech perception is explained by a best-match procedure” (p. 635).
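The best-match procedure quoted above can be sketched as follows. The sketch assumes the usual formulation of the FLMP, in which each feature's degree of match to a prototype is a value between 0 and 1, the values are multiplied together, and each prototype's product is divided by the sum over all prototypes (relative goodness). The function names, the feature names, and the numbers are invented for illustration and do not come from Kerzel & Bekkering (2000).

```python
def integrate(support):
    """Combine one prototype's per-feature match values by multiplication."""
    total = 1.0
    for value in support.values():
        if not 0.0 <= value <= 1.0:
            raise ValueError("match values must lie in [0, 1]")
        total *= value
    return total


def flmp(feature_support):
    """feature_support maps prototype -> {feature: degree of match in [0, 1]}.

    Returns (relative_goodness, best_prototype)."""
    goodness = {proto: integrate(s) for proto, s in feature_support.items()}
    norm = sum(goodness.values())
    if norm == 0:
        raise ValueError("no prototype receives any support")
    relative = {proto: g / norm for proto, g in goodness.items()}
    best = max(relative, key=relative.get)
    return relative, best


# Hypothetical match values for an ambiguous syllable (made-up numbers).
example = {
    "ba": {"auditory": 0.8, "visual": 0.3},
    "da": {"auditory": 0.2, "visual": 0.7},
}
relative, best = flmp(example)
```

Here the auditory feature favors one syllable and the visual feature the other. The multiplication step lets the less ambiguous source of information carry more weight in the final choice.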

  46. How Does Perception Work? • From Galantucci, Fowler, & Turvey (2006): Motor theory of speech perception. 3 parts: • “speech processing is special” (p. 361) • “perceiving speech is perceiving vocal tract gestures” (p. 361) • “speech perception involves access to the speech motor system” (p. 361)

  47. How Does Perception Work? • “speech processing is special” • Is perception of distal properties unique to speech? No. (See the shape stuff above.) • Is recruitment of the motor system unique to speech? No. • Is there special neural hardware? Not enough evidence to tell, but probably no.

  48. How Does Perception Work? • “perceiving speech is perceiving vocal tract gestures” • “the objects of speech perception are the speakers' vocal tract gestures and not the acoustic patterns that the gestures generate in the air” (p. 365)

  49. How Does Perception Work? • “perceiving speech is perceiving vocal tract gestures” • When articulation and sound go their separate ways, which way does perception go? With articulation (di-du).
