Identification and Discrimination of Onset Timing in Component Tones: Voicing Perception Implications

Identification and discrimination of the relative onset time of two component tones: Implications for voicing perception in stops David B. Pisoni (1976-77)

Past Studies • Lisker and Abramson (1964, 1970) • Liberman et al. • Mattingly, Liberman, Syrdal, and Halwes • Eimas (1971) • Kuhl and Miller (1975) • Lasky et al. • Streeter • Miller et al.

Lisker and Abramson • They investigated • Voicing and aspiration differences shown across different languages (last class) • Differences in timing and glottal activity • They discovered 3 modes of voicing (1964): • Pre-voiced stops = voicing onset precedes the release burst (negative onset, -VOT) • Short-lag voiced stops = voicing onset is simultaneous with or briefly lags behind the release burst (0 VOT) • Long-lag voiceless stops = voicing onset lags behind the release burst (positive onset, +VOT)

Liberman • In perceptual experiments done with synthetic stimuli* they found (1961)… • Subjects identify and discriminate differences in VOT in a categorical-like manner that reflects the phonological categories* of their language • Consistent labeling with sharp crossover points • Discontinuities in discrimination that are correlated with the abrupt changes in the labeling functions • Better at discriminating 2 synthetic stimuli from 2 different phonological categories vs. 2 from the same category

According to Several • Empirical Findings • Non-speech signals are perceived in a continuous mode • No other categorical perception studies had been done with synthetic stimuli • Non-monotonic discrimination functions are the result of labeling processes associated with phonetic categorization • Interpretation • Evidence for the operation of a special mode of perception…Speech Mode

Liberman et al. (1961) & Mattingly, Liberman, Syrdal and Halwes (1971) • Are discontinuities in speech discrimination functions due to the acoustic or psychophysical* attributes of the signals themselves rather than some speech-related labeling process? • Found no peaks in the non-speech discrimination functions at phoneme boundaries so… • Conclusion = Speech Mode • The discrimination of speech stimuli was attributable to phonetic categorization resulting from the stimuli being perceived as speech.

Eimas (1971) • 2 and 3 month old infants • Found that they can discriminate synthetic speech sounds varying in VOT much like English speaking adults • Implication = infants have access to mechanisms of phonetic categorization • Innate mechanisms • Responding to phonetic coding VS. psychophysical differences • Environment plays a secondary role

Kuhl and Miller (1975) • Study done with chinchillas • Trained to respond differently to the consonants /d/ and /t/ (human voice) • Used synthetic stimuli varying in VOT with a sharp crossover point • The discrimination functions were similar to English speaking human data…but chinchillas don’t have spoken language • Suggests a psychophysical basis VS. phonetic basis for the labeling behavior • Results = the boundary for voiced and voiceless labial stops occurs at about +25 msec…threshold

Lasky et al. (1975) • Cross-language studies • 4 to 6 1/2 month old infants born to Spanish-speaking parents • Found evidence for 3 categories in discrimination • One boundary occurred in the region between +20 msec and +60 msec (corresponds to the English voiced/voiceless boundary) • The other occurred between -20 msec and -60 msec • Spanish has only one phoneme boundary b/w voiced and voiceless stops, and it does not coincide with either boundary they found • Conclusion: • Environment plays minor role • Responding to psychophysical attributes

Streeter (1976) • Kikuyu infants (Kenya) • Show evidence of 3 categories of voicing for labial stops • Kikuyu has no voicing contrasts for labial stops (but they exist at other places of articulation) • Conclusion: • The infants had not been exposed to these contrasts before • Responding to psychophysical attributes • Similar to the Lasky et al. research

Miller et al. (1976) • Non-speech control signals • Using a VOT analog in the form of a noise burst and a buzz • Adults • Results: discrimination functions that were similar to those found with stop consonants differing in VOT • Discrimination was excellent for stimuli selected from b/w categories and poor for stimuli within a category • Perceptual threshold • Psychophysical account

Pisoni • Independent from Miller et al. but at the same time • Used stimuli that varied in temporal order of the onsets of 2 component tones at 2 frequencies (Figure 1): • 500 Hz • 1500 Hz • Relative onset time from -50 through 0 to +50 msec (in 10 msec increments)

Pisoni • Goals: • To learn something about how the timing relations in stop consonants are perceived • To provide a more general account of the diverse findings obtained with adults, infants and chinchillas on VOT stimuli • To provide an account of the results obtained with non-speech stimuli

Pisoni: Experiment I • 8 paid volunteers from ad in student paper • All were right handed and native English speakers • Stimuli (Figure 1): • 11 digital two-tone sequences • Lower tone = 500 Hz • Higher tone = 1500 Hz • Variable is VOT • -50 • 0 • +50
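
A minimal sketch of how a two-tone continuum like this could be synthesized. The slides give only the two frequencies and the onset values, so the sampling rate, tone duration, equal amplitudes and both function names below are assumptions made for illustration. Negative onsets mean the lower tone leads and positive onsets mean it lags, matching the sign convention in the Experiment II results.

```python
import math

def onset_continuum(start_ms=-50, stop_ms=50, step_ms=10):
    """Relative onset values in msec: -50, -40, ..., +50 (11 steps)."""
    return list(range(start_ms, stop_ms + 1, step_ms))

def two_tone(onset_ms, duration_ms=230, rate_hz=10000,
             low_hz=500.0, high_hz=1500.0):
    """Sum of a 500 Hz and a 1500 Hz tone with a relative onset difference.

    duration_ms and rate_hz are assumed values, not taken from the slides.
    Negative onset_ms: lower tone starts first. Positive: lower tone lags.
    """
    low_delay = max(onset_ms, 0) / 1000.0    # seconds the low tone is delayed
    high_delay = max(-onset_ms, 0) / 1000.0  # seconds the high tone is delayed
    n_samples = int(rate_hz * duration_ms / 1000)
    samples = []
    for i in range(n_samples):
        t = i / rate_hz
        low = math.sin(2 * math.pi * low_hz * (t - low_delay)) if t >= low_delay else 0.0
        high = math.sin(2 * math.pi * high_hz * (t - high_delay)) if t >= high_delay else 0.0
        samples.append(0.5 * (low + high))
    return samples

# One waveform per step of the continuum
stimuli = {onset: two_tone(onset) for onset in onset_continuum()}
```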

Pisoni: Experiment I • Stimuli were presented at 80 dB SPL • 2 one-hour sessions done over 2 days • Day one: • Identification training sequences • Presented with the endpoint stimuli (-50 & +50) • Told to learn (w/their own strategy) which one of the 2 buttons was associated w/ea sound • Immediate feedback for correct responses

Pisoni: Experiment I • Day two: • Tested for identification • 11 stimuli presented in random order • No feedback • Tested for ABX discrimination* • 9 two-step pairs along the continuum • Feedback provided for correct responses • Told to determine whether the 3rd sound (X) was most like the first (A) or second (B) sound • Chance performance = 50% correct
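
The 9 two-step comparisons follow directly from the 11-step continuum, as the sketch below shows. The function names are hypothetical, and the four triad orderings are the standard ABX layout rather than something stated on the slide.

```python
def two_step_pairs(onsets, step=2):
    """Pair each stimulus with the one `step` positions further along."""
    return [(onsets[i], onsets[i + step]) for i in range(len(onsets) - step)]

def abx_orders(a, b):
    """All four ABX triads for one pair: X matches either A or B,
    and the pair can be presented in either order."""
    return [(a, b, a), (a, b, b), (b, a, b), (b, a, a)]

onsets = list(range(-50, 51, 10))   # -50 ... +50 msec in 10 msec steps
pairs = two_step_pairs(onsets)
print(len(pairs))                   # 9
print(pairs[0], pairs[-1])          # (-50, -30) (30, 50)
```

Because X matches A on half of the triads and B on the other half, a listener who cannot hear the difference is expected to be correct on about half of the trials.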

Pisoni: Experiment I • Figure 2 (p.1355) • Filled-in circles = labeling functions (responses in terms of the 2 endpoint categories) • Sharp and consistent for some subjects • Crossover points for the category boundary for 6 of the 8 subjects are not at 0 but are displaced towards the lagging (+50) stimuli • Why?
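
One common way to locate a crossover point is to interpolate linearly to the 50% point of the labeling function. The sketch below assumes that method; the slides do not say how the boundaries were computed. The `crossover` helper is hypothetical and the labeling proportions are made-up numbers, not data from Figure 2.

```python
def crossover(onsets, proportions, criterion=0.5):
    """Onset (msec) where the labeling function crosses `criterion`.

    Uses linear interpolation between the first pair of adjacent points
    that straddle the criterion. Returns None if there is no crossing.
    """
    points = list(zip(onsets, proportions))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 != p1 and (p0 - criterion) * (p1 - criterion) <= 0:
            return x0 + (criterion - p0) / (p1 - p0) * (x1 - x0)
    return None

onsets = list(range(-50, 51, 10))
# Made-up proportions of "lagging" responses, for illustration only
example = [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.3, 0.7, 1.0, 1.0, 1.0]
print(round(crossover(onsets, example), 1))   # 15.0
```

A boundary above 0 msec, as in this invented example, is what displacement towards the lagging stimuli would look like.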

Pisoni: Experiment I • Possibly due to a limitation on the processing of temporal information or… • Due to masking of the high frequency (1500 Hz) by the low (500 Hz) • So, they tested the masking explanation by running a pilot study • Pilot study (p.1355-56) • Results: they found no shift in boundary location so…the limitation on the processing of temporal information is the more likely cause of the asymmetry

Pisoni: Experiment I • ABX-discrimination results • Open circles Fig. 2 • Categorical-like discrimination • Peaks and troughs • S2 ideal

Pisoni: Experiment I • Results from ID and ABX: • Categorical perception with non-speech signals • This form of perception is not unique to speech signals • Removes one positive line of evidence for the Speech Mode theory • Questions: • Are the findings due to a labeling process brought about by the training process? • Or are they due to a simpler psychophysical explanation?

Pisoni: Experiment II • Goal (in order to answer the previously asked questions): • To obtain ABX-discrimination functions before any training experience (label training) • If peaks in discrimination exist there will be reason to suspect a psychophysical basis for the observed discrimination functions from E1

Pisoni: Experiment II • 12 volunteers • Same 11 stimuli used in E1 • 2 one-hour sessions held on separate days (no label training) • 360 ABX trials done ea. day with feedback • 9 two-step stimuli comparisons were responded to 80x by ea subject
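
The trial counts on this slide are mutually consistent, as a quick check shows. The per-day figure at the end assumes the 9 comparisons were presented equally often on each day, which the slide does not state.

```latex
\[
9 \text{ comparisons} \times 80 \text{ responses} = 720 \text{ trials}
  = 2 \text{ days} \times 360 \text{ trials per day},
\qquad
\frac{360}{9} = 40 \text{ responses per comparison per day}
\]
```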

Pisoni: Experiment II • Results: • Figure 3 (p. 1357) • 2 patterns shown (except S1 = chance) • Single peak @ approximately +20 msec • Double peak @ approximately +20 and -20 • Natural categories are present at places along the stimulus continuum marked by narrow regions of high sensitivity (thresholds) • 3 categories corresponding with the temporal events • Lower tone leading by 20 msec or more (-) • More or less simultaneously within the -20 to +20 msec region • Lower tone lags by 20 msec or more (+)
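
The three-way partition described on this slide can be written as a simple threshold rule. This is only a sketch: the function name and category labels are hypothetical, the 20 msec value is the approximate threshold reported on the slide, and putting exactly ±20 msec in the lead/lag categories follows the "20 msec or more" wording rather than any measured boundary.

```python
def onset_category(onset_ms, threshold_ms=20):
    """Assign a relative onset time to one of three temporal-order categories.

    Negative onsets: lower tone leads. Positive onsets: lower tone lags.
    threshold_ms is the approximate perceptual threshold (about 20 msec).
    """
    if onset_ms <= -threshold_ms:
        return "lower tone leads"
    if onset_ms >= threshold_ms:
        return "lower tone lags"
    return "simultaneous"

for onset in range(-50, 51, 10):
    print(onset, onset_category(onset))
```

On the 11-step continuum this rule puts -50 to -20 msec in the leading category, -10 to +10 msec in the simultaneous category, and +20 to +50 msec in the lagging category.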

Pisoni: Experiment II • These results contrast: • Liberman et al. (1961) • Mattingly et al. (1971) • The above both found: • Marked differences in discrimination between speech and non-speech signals • Why?

Pisoni: Experiment II • Possible reasons: • The lack of familiarity with the stimuli used (Liberman = synthetic spectrograms of /do/ and /to/; Mattingly = 2nd formant transitions were isolated from the rest of the stimulus pattern) • The absence of any feedback during the discrimination task • With complex multidimensional signals it may be difficult for subjects to attend to the relevant attributes that distinguish these stimuli

Pisoni: Experiment II • Patterns of categorical perception are seen when using speech and non-speech stimuli

Pisoni: Experiment III • Goal: • To demonstrate that subjects can classify these same stimuli into three distinct categories whose boundaries occur at precisely these regions on the continuum

Pisoni: Experiment III • Same training procedure • Except…3 responses instead of 2 • 8 additional subjects were recruited • Same set of 11 tonal stimuli • Took place on 2 separate days

Pisoni: Experiment III • Day 1 • Shaping and identification training with the 3 stimuli (-50, 0, +50 msec) • Subjects were free to adopt their own coding strategies • Immediate feedback was provided • Day 2 • Labeling tests were conducted

Pisoni: Experiment III • Figure 4 (p. 1358) • All subjects partitioned the stimulus continuum into three well-defined categories • Boundaries found at approximately -20 and +20 msec • Perceptual threshold • Ability to discriminate temporal differences

Pisoni: Experiment IV • Goal: • Simultaneous vs. non-simultaneous…. • Having the subjects determine whether there are one or two distinct events at stimulus onset

Pisoni: Experiment IV • 8 additional volunteers • None had participated previously • Same 11 tonal stimuli • A single 1-hour session • 11 stimuli presented randomly • Told to listen to ea sound carefully and then determine whether they could hear one or two events at stimulus onset • No feedback was given

Pisoni: Experiment IV • Figure 5 (p. 1359) • All subjects showed similar U-shaped functions with sharp crossover points between categories • Results: • The presence of 3 natural categories that may be distinguished by the relative discriminability of the temporal order of the component events

Pisoni Findings • A perceptual effect for processing temporal order information which may also underlie the perception of voicing distinctions in stop consonants in initial position • There is a perceptual threshold (consistent with studies done by Hirsh, Hirsh and Sherrick, and Stevens and Klatt) of about 20 msec

Pisoni Findings • We know…that VOT (in terms of onset of voicing) must be judged in relation to the temporal attributes of other events (release from closure) • So, these events are ordered in TIME, therefore highly distinctive and discriminable changes will be produced at various regions along the temporal continuum • Phonological systems apparently have exploited the principle of discriminating discrete attributes (natural categories) during the evolution of language • In other words, we’ve positioned our phonemes on either side of the natural auditory boundary provided by the threshold
