Spectrogram & its reading
What is a spectrogram?
• In use since the 1940s
• Another representation of frequency-domain analysis
• The most popular way of representing spectral information
• Three-dimensional representation
  • X-axis: time
  • Y-axis: frequency
  • Darkness (or color): energy
Reviving Sonus
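The three dimensions listed above can be sketched with a short-time Fourier transform. This is a minimal illustration, not the method used to produce the figures in these slides: the function name `spectrogram`, the Hamming window, the 16 kHz sample rate, and the 1 kHz test tone are all assumptions made for the example.

```python
import numpy as np

def spectrogram(signal, sample_rate, window_ms=15.0, shift_ms=1.0):
    """Return (times, freqs, energy_db) for a 1-D signal.

    energy_db[i, j] is the energy (in dB) at frequency freqs[i] and
    time times[j], i.e. the "darkness" plotted at that point.
    """
    win_len = int(round(window_ms * 1e-3 * sample_rate))
    hop = max(1, int(round(shift_ms * 1e-3 * sample_rate)))
    window = np.hamming(win_len)  # assumed window shape

    # Slice the signal into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])

    # Power spectrum of each frame, converted to decibels.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    energy_db = 10.0 * np.log10(power + 1e-12)

    times = (np.arange(n_frames) * hop + win_len / 2) / sample_rate  # X-axis
    freqs = np.fft.rfftfreq(win_len, d=1.0 / sample_rate)            # Y-axis
    return times, freqs, energy_db.T

# Hypothetical test signal: a 0.2 s pure tone at 1 kHz.
sample_rate = 16000
t = np.arange(int(0.2 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

times, freqs, energy_db = spectrogram(tone, sample_rate)
peak_hz = freqs[np.argmax(energy_db.mean(axis=1))]
print(f"{len(times)} time frames, {len(freqs)} frequency bins, peak near {peak_hz:.0f} Hz")
```

Plotting `energy_db` as an image with `times` on the X-axis and `freqs` on the Y-axis gives the familiar display; for the pure tone, a single dark horizontal band appears near 1 kHz.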
Waveform & Spectrogram aligned
Spectrogram example (color resolution of word “compute”)
Spectrogram example (grayscale of word “compute”)
Wideband vs. narrowband spectrograms of the question "Is Pat sad, or mad?" The 5th, 10th, and 15th harmonics are marked by white squares in two of the vowels.
Types of spectrogram
• Wideband spectrogram
  • Better time resolution
  • e.g., 15 msec window, 1 msec shift, 125 Hz bandwidth
• Narrowband spectrogram
  • Better frequency resolution
  • e.g., 50 msec window, 1 msec shift, 40 Hz bandwidth
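The trade-off between window length and bandwidth can be checked numerically. The slides do not say which window shape or bandwidth criterion lies behind the 125 Hz and 40 Hz figures, so the sketch below assumes a Hamming window and measures the main-lobe width at 6 dB below the peak. The function name `mainlobe_bandwidth_hz` and the 16 kHz sample rate are also assumptions.

```python
import numpy as np

def mainlobe_bandwidth_hz(window_ms, sample_rate=16000, drop_db=6.0):
    """Approximate width (Hz) of a Hamming window's main lobe,
    measured drop_db below the peak (an assumed criterion)."""
    n = int(round(window_ms * 1e-3 * sample_rate))
    window = np.hamming(n)

    # Zero-pad heavily so the window's spectrum is finely sampled.
    n_fft = 1 << 16
    spectrum = np.abs(np.fft.rfft(window, n_fft))
    level_db = 20.0 * np.log10(spectrum / spectrum[0] + 1e-12)

    # First bin whose level is below -drop_db gives the half-width.
    first_below = int(np.argmax(level_db < -drop_db))
    return 2.0 * first_below * sample_rate / n_fft

wide = mainlobe_bandwidth_hz(15.0)    # wideband example window
narrow = mainlobe_bandwidth_hz(50.0)  # narrowband example window
print(f"15 ms window: about {wide:.0f} Hz; 50 ms window: about {narrow:.0f} Hz")
```

Under these assumptions the two results land in the same range as the figures quoted above, and they show the general rule: a shorter window gives a wider analysis bandwidth (better time resolution), while a longer window gives a narrower one (better frequency resolution).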
Advantages & Disadvantages
• Advantages
  • Time alignment
• Disadvantages
  • Less reliable than the waveform
Vowel Spectrogram
• Formant frequencies are critical cues for vowel distinction
• F1: height
  • High vowels: low F1
• F2: backness
  • Back vowels: low F2
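The two rules above (low F1 for high vowels, low F2 for back vowels) can be written as a toy classifier. Everything numeric here is an assumption for illustration: the thresholds of 450 Hz and 1200 Hz are hypothetical, and the formant values are rough figures of the kind reported for adult male speakers of American English, not measurements from these slides.

```python
def describe_vowel(f1_hz, f2_hz, f1_threshold=450.0, f2_threshold=1200.0):
    """Map (F1, F2) to coarse height and backness labels.

    The thresholds are hypothetical; real boundaries vary with the
    speaker and the vowel system.
    """
    height = "high" if f1_hz < f1_threshold else "low"
    backness = "back" if f2_hz < f2_threshold else "front"
    return height, backness

# Rough, illustrative (F1, F2) pairs in Hz for an adult male speaker.
examples = {
    "heed": (270.0, 2290.0),
    "had": (660.0, 1720.0),
    "hod": (730.0, 1090.0),
    "who'd": (300.0, 870.0),
}

labels = {word: describe_vowel(f1, f2) for word, (f1, f2) in examples.items()}
for word, (height, backness) in labels.items():
    print(f"{word}: {height}, {backness}")
```

A two-way split is much coarser than real vowel space, where mid and central vowels fall between these categories; the sketch only shows the direction of each cue.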
Examples of formant frequencies of English monophthongs
"heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)
Consonant Spectrogram
• General
  • Acoustic structure is more complicated than that of vowels
  • Adjacent sounds (especially vowels) carry important information: the locus
  • High-frequency characteristics matter, especially for fricatives and affricates
What is a locus?
• Information carried by formant transitions from vowels into obstruents or from obstruents into vowels
• The target frequency that each formant transition heads toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released
• Characteristic of the consonant's place and manner, and roughly the same in different vowel contexts
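The idea of a locus as the point that transitions come from can be illustrated with a deliberately simple model: a straight-line F2 track that starts partway between the locus and the vowel's F2 and ends at the vowel. The function `f2_transition`, the linear shape, and the `start_fraction` parameter are assumptions for this sketch; the 1800 Hz value is the alveolar F2 locus quoted later in these slides, and the vowel targets are hypothetical.

```python
import numpy as np

def f2_transition(locus_hz, vowel_f2_hz, start_fraction=0.5, n_points=6):
    """Straight-line F2 track from consonant release into a vowel.

    The track begins start_fraction of the way from the locus to the
    vowel target, so extending it backwards points at the locus.
    """
    onset_hz = locus_hz + start_fraction * (vowel_f2_hz - locus_hz)
    return np.linspace(onset_hz, vowel_f2_hz, n_points)

alveolar_locus = 1800.0  # Hz, F2 locus for alveolars as given in these slides

# Hypothetical vowel F2 targets: one front vowel, one back vowel.
into_front_vowel = f2_transition(alveolar_locus, 2300.0)
into_back_vowel = f2_transition(alveolar_locus, 1000.0)

print("front vowel:", into_front_vowel)  # rises away from the locus
print("back vowel: ", into_back_vowel)   # falls away from the locus
```

The same locus yields a rising transition before a front vowel and a falling one before a back vowel, which is why the locus, rather than the transition's direction, is the stable property across vowel contexts.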
Stops
• General
  • Fairly distinct locus for each place
  • Burst
  • Silence during the closure (only at syllable onset position)
  • Virtually no difference during the closure
Stops (cont.)
• Voicing distinction
  • Voiced: vertical striations (voicing), less abrupt burst, frequently weakened to resemble fricatives or approximants
  • Voiceless: generally abrupt burst in the higher-frequency area
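The vertical striations of voiced sounds reflect periodic glottal pulses, and periodicity can be measured directly from the waveform. The sketch below uses a normalized autocorrelation peak as a rough voicing score. The function `voicing_strength`, the 75-400 Hz pitch search range, and the synthetic test signals are assumptions for illustration, not a procedure described in these slides.

```python
import numpy as np

def voicing_strength(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Peak of the normalized autocorrelation within the pitch range.

    Values near 1 suggest a periodic (voiced) frame; values near 0
    suggest an aperiodic (voiceless or silent) frame.
    """
    frame = frame - np.mean(frame)
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return 0.0
    # Keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    return float(np.max(ac[lag_min:lag_max + 1]) / energy)

sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate  # one 40 ms frame

# Hypothetical voiced frame: 120 Hz fundamental plus a few harmonics.
voiced = sum(np.sin(2 * np.pi * 120.0 * k * t) / k for k in range(1, 6))
# Hypothetical voiceless frame: white noise with a fixed seed.
noise = np.random.default_rng(0).standard_normal(len(t))

voiced_score = voicing_strength(voiced, sample_rate)
noise_score = voicing_strength(noise, sample_rate)
print(f"voiced: {voiced_score:.2f}, noise: {noise_score:.2f}")
```

Real voiced stops are often only weakly voiced during closure, so a fixed cutoff on this score would need tuning on actual recordings.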
Stops (cont.)
• Place distinction
  • Bilabial
    • Relatively low F2 and F3 locus, rising into and falling out of the vowel
    • Weak and spread vertical lines
  • Alveolar
    • F2 locus at about 1800 Hz
    • Strong vertical lines
  • Velar
    • Velar pinch: the vowel's F2 and F3 merge
    • Often a double burst
    • Long formant transitions
Stops (cont.)
• Manner distinction
  • Silence duration, VOT (voice onset time), F0 of the following vowel
Examples: “a bab, a dad, a gag”
Place-dependent loci
Fricatives
• General
  • Random noise pattern, especially in high-frequency regions
• Place distinction
  • Labiodental [f, v]: rising locus into the following vowel
  • Dental [θ, ð]: major energy above 6000 Hz
  • Alveolar [s, z]: major energy above 4000 Hz
  • Alveopalatal [š, ž]: major energy above 6000 Hz
  • Glottal [h]: traces of the formant frequencies of neighbouring vowels
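Statements such as "major energy above 4000 Hz" can be turned into a measurable quantity: the share of a segment's spectral energy that lies above a cutoff. The sketch below is one simple way to compute it. The function `high_band_energy_ratio`, the Hann window, the 16 kHz sample rate, and the pure-tone test signals are assumptions for illustration.

```python
import numpy as np

def high_band_energy_ratio(signal, sample_rate, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz (0.0 to 1.0)."""
    windowed = signal * np.hanning(len(signal))  # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = float(power.sum())
    if total == 0.0:
        return 0.0
    return float(power[freqs >= cutoff_hz].sum() / total)

sample_rate = 16000
t = np.arange(int(0.1 * sample_rate)) / sample_rate

# Hypothetical stand-ins: a low tone (vowel-like energy region) and a
# high tone (sibilant-like energy region).
low_tone = np.sin(2 * np.pi * 1000.0 * t)
high_tone = np.sin(2 * np.pi * 6000.0 * t)

low_ratio = high_band_energy_ratio(low_tone, sample_rate, 4000.0)
high_ratio = high_band_energy_ratio(high_tone, sample_rate, 4000.0)
print(f"1 kHz tone: {low_ratio:.3f}, 6 kHz tone: {high_ratio:.3f}")
```

Applied to a recorded fricative, a ratio near 1 for a 4000 Hz cutoff would match the alveolar pattern described above. The recording's sample rate must be high enough that the Nyquist frequency lies well above the cutoff.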
Fricatives (cont.)
• Weak vs. strong
  • Strong [s, z, š, ž]: darker bands
  • Weak [f, v, θ, ð]: spread and fainter
  • Voiced [v, ð]: often so weak that they are confused with nasals or approximants
  • Cue to tell [θ] from [f]: the higher formants of [θ] fall into adjacent vowels
Example: “fie, thigh, sigh, shy”
Example: “ever, weather, fizzer, pleasure”
Nasals
• General
  • Formants similar to those of vowels but fainter
  • Very low F1 (about 250 Hz), with F2 at about 2500 Hz and F3 at about 3250 Hz
• Place distinction
  • Bilabial [m]: downward F2 and F3 locus
  • Alveolar [n]: smaller F2 transition
  • Velar [ŋ]: velar pinch
Examples: “a Pam, a tan, a kang”
Liquids & Approximants
• General
  • Formants similar to those of vowels but fainter (especially in high-frequency regions)
  • Approximately F1 250 Hz, F2 1200 Hz, F3 2400 Hz
  • Slow formant movements
Liquids & Approximants (cont.)
• Phone-specific properties
  • Labial glide [w]:
    • Very low F1 and F2 (600-1000 Hz), which come very close to each other
    • Relatively low F3
    • Rapid falloff of spectral amplitude (formant movements)
  • Palatal glide [y]:
    • Extremely low F1
    • Extremely high F2 and F3
Liquids & Approximants (cont.)
• Phone-specific properties (cont.)
  • Flap [ɾ]: soft burst, short duration
  • Retroflex [r]:
    • F3 dips down close to F2
    • General lowering of F3 and F4
  • Lateral [l]:
    • Low F1 and F2 (approx. F1 250 Hz, F2 1200 Hz)
    • Usually substantial energy in the high-frequency region
Example: “led, red, wed, yell”
Final remarks
• The spectrogram is not the only cue for acoustic distinction of speech sounds.
• When there is a mismatch between the waveform and the spectrogram, the waveform is generally more reliable.
References & Links
• http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
• http://hctv.humnet.ucla.edu/departments/linguistics/VowelsandConsonants/course
• http://www.cs.indiana.edu/~port/teach/306/speech.acoustics.html
• http://www.phon.ucl.ac.uk/courses/spsci/b203/week2-5.pdf