Spectral Analysis

Spectral Analysis February 12, 2008

Sorting Things Out • Course project reports to hand in… • And new report guidelines to hand out • The ToBI transcription homework… • didn’t go all that well. • (to be discussed later) • On Thursday: back in the computer lab. • Analysis of Korean stops.

Voice Quality Summary • So far, we’ve talked about the following settings (AT = adductive tension, LT = longitudinal tension, MC = medial compression):

         AT        LT      MC        Flow
Modal    moderate  varies  moderate  med.
Tense    high      varies  high      high
Creaky   high      low     high      low
Breathy  low       varies  low       high

Contrasts • Gujarati contrasts breathy voiced vowels with modal voiced vowels: • Hausa contrasts modal [j] with creaky [j]: • Hausa is spoken in West Africa (primarily in Nigeria) • Creaky consonants are also said to be laryngealized.

All Three • Jalapa Mazatec has a three-way contrast between modal, breathy and creaky voiced vowels: • Jalapa Mazatec is spoken in southern Mexico, around Oaxaca and Veracruz.

Voiced Aspirated • Some languages distinguish between (breathy) voiced aspirated and voiceless aspirated stops and affricates. • Check out Hindi: • Any VOT questions?

Optional VOT Analysis • Check out some Thai and Hindi contrasts.

One Random Thing • Breathy voiced segments can “depress” the tone on a following segment. • Examples from Tsonga: • Tsonga is spoken in South Africa and Mozambique. • Voiced stops also “depress” tones more than voiceless stops. • depressor consonants • Nobody really knows why.

Open Quotient • From EGG measures, we can calculate the “open quotient” for any particular voicing cycle: • open quotient = (time glottis is open) / (period of voicing cycle) • EGG measures show that there are reliable differences in open quotient values between voice qualities. • Breathy voice has a high open quotient • Creaky/Tense voice has a low open quotient • Modal voice is in between
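The ratio above can be sketched in a few lines of Python. This is only an illustration: the function name `open_quotient` and the example durations are assumptions, not values taken from the EGG recordings.

```python
def open_quotient(open_time_s, period_s):
    """Fraction of one voicing cycle during which the glottis is open."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    if not 0 <= open_time_s <= period_s:
        raise ValueError("open time must lie within one period")
    return open_time_s / period_s


# Hypothetical voicing cycle at F0 = 100 Hz, i.e. a 10 ms period.
period = 1 / 100

# Glottis open for 5 ms of the cycle (made-up duration).
print(open_quotient(0.005, period))

# Glottis open for 3 ms of the cycle (made-up duration).
print(open_quotient(0.003, period))
```

The first call gives a value in the modal range described on these slides; the second gives a lower value, in the direction reported for tense voice.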

Open Quotient Traces • [Figure: EGG trace with one period and the open phase marked] • The open quotient in modal voicing is generally around 0.5

Tense Voice • [Figure: EGG trace with one period and the open phase marked] • Tense voice (from throat singing demo) has a lower open quotient. • Result of medial compression. • Actual value: about 0.3

OQ Traces, continued • OQ for creaky voice is also supposed to be low… • but it’s actually quite sporadic. • Breathy voice OQ is quite high • (0.65 or greater)

4. Whispery Voice • When we whisper: • The cartilaginous glottis remains open, but the ligamental glottis is closed. • Air flows through the opening with a “hiss” • The laryngeal settings: • Little or no adductive tension • Moderate to high medial compression • Moderate airflow • Longitudinal tension is irrelevant…

Nodules • One of the more common voice disorders is the development of nodules on either or both of the vocal folds. • nodule = callus-like bump • What effect might this have on voice quality?

Last but not least • What’s going on here? • At some point, my voice changes from modal to falsetto.

5. Falsetto • The laryngeal specifications for falsetto: • High longitudinal tension • High adductive tension • High medial compression • Contraction of thyroarytenoids • Lower airflow than in modal voicing • The results: • Very high F0. • Very thin area of contact between vocal folds. • Air often escapes through the vocal folds.

Falsetto EGG • The falsetto voice waveform is considerably more sinusoidal than modal voice.

Voice Quality Summary

          AT        LT      MC        Flow
Modal     moderate  varies  moderate  med.
Tense     high      varies  high      high
Creaky    high      low     high      low
Breathy   low       varies  low       high
Whisper   low       N/A     high      med.
Falsetto  high      high    high      low

Back to Basics • Remember that the most basic kind of sound wave is a sinewave. • [Figure: sinewave, with pressure plotted against time] • Sinewaves can be defined by three basic properties: • Frequency, (peak) amplitude, phase

Complex Waves • It is possible to combine two or more sinewaves into a complex wave. • At any given time, each wave will have some amplitude value. • A1(t1) := Amplitude value of sinewave 1 at time 1 • A2(t1) := Amplitude value of sinewave 2 at time 1 • The amplitude value of the complex wave is the sum of these values. • Ac(t1) = A1(t1) + A2(t1) • Note: a harmonic is simply a component sinewave of a complex wave.
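The point-by-point sum Ac(t1) = A1(t1) + A2(t1) can be tried out directly. The sketch below is illustrative only: the function names and the two component sinewaves (100 Hz and 300 Hz) are made-up choices, not the waves shown in class.

```python
import math


def sinewave(freq_hz, peak_amp, phase_rad, t):
    """Amplitude of one sinewave at time t (seconds)."""
    return peak_amp * math.sin(2 * math.pi * freq_hz * t + phase_rad)


def complex_wave(components, t):
    """Amplitude of the complex wave at time t: the sum of its components.

    components is a list of (frequency, peak amplitude, phase) tuples.
    """
    return sum(sinewave(f, a, p, t) for f, a, p in components)


# Two hypothetical harmonics: a strong 100 Hz wave and a weaker 300 Hz wave.
components = [(100.0, 1.0, 0.0), (300.0, 0.3, 0.0)]

t1 = 0.0025  # one quarter of the 100 Hz period
a1 = sinewave(100.0, 1.0, 0.0, t1)
a2 = sinewave(300.0, 0.3, 0.0, t1)
ac = complex_wave(components, t1)

print(a1, a2, ac)  # ac is a1 + a2, up to floating-point rounding
```

At this particular time the two components have opposite signs, so the complex wave is lower than the peak of the first sinewave alone.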

Complex Wave Example • Take waveform 1: high amplitude, low frequency • Add waveform 2: low amplitude, high frequency • The sum is this complex waveform: • [Figure: waveform 1 + waveform 2 = complex waveform]

Another Perspective • Sinewaves can also be represented by their power spectra. • Frequency on the x-axis • Intensity on the y-axis (related to peak amplitude) • [Figure: a waveform and its power spectrum]
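One way to see a power spectrum outside of Praat is to compute a discrete Fourier transform by hand. The sketch below assumes a 100 Hz sinewave sampled at 1000 Hz; those numbers, and the function name `power_spectrum`, are chosen for illustration. It uses a slow, naive DFT to stay within the Python standard library; real analysis software uses an FFT and usually reports intensity in dB.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Return (frequencies in Hz, power) for DFT bins 0 .. N/2."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at bin frequency k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2 / n ** 2)
    return freqs, power


sample_rate = 1000   # samples per second (assumed)
n = 200              # 0.2 s of signal, so bins are 5 Hz apart
samples = [math.sin(2 * math.pi * 100 * i / sample_rate) for i in range(n)]

freqs, power = power_spectrum(samples, sample_rate)
peak_bin = max(range(len(power)), key=power.__getitem__)
print(freqs[peak_bin])  # → 100.0
```

A single sinewave puts essentially all of its power into one frequency bin, which is why its spectrum is a single line.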

Putting the two together • [Figure: Waveform and Power Spectrum columns showing sinewave 1 + sinewave 2 = complex wave, with the harmonics labeled in the spectrum]

More Combinations • [Figure: further sums of sinewaves] • What happens if we keep adding more and more high-frequency components to the sum?

A Spectral Comparison • [Figure: Waveform and Power Spectrum panels]

What’s the Point? • Remember our EGG waveforms for the different kinds of voice qualities: • The glottal waveform for tense voice resembles a square wave. • → lots of high-frequency components (harmonics)
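The link between squareness and high-frequency harmonics can be checked numerically. The sketch below uses the standard Fourier series of an ideal square wave (odd harmonics only, with amplitudes falling off as 1/k); the 100 Hz fundamental and the function name are assumptions for illustration, and a real tense-voice EGG trace is only roughly square.

```python
import math


def square_approx(t, f0, n_harmonics):
    """Sum of the first n odd harmonics of f0, with amplitudes 1/k.

    As n grows, the sum approaches a square wave that alternates
    between +1 and -1.
    """
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1  # harmonic numbers 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * f0 * t) / k
    return 4 / math.pi * total


f0 = 100.0            # hypothetical fundamental frequency
quarter = 1 / f0 / 4  # middle of the positive half-cycle

for n in (1, 3, 10, 50):
    print(n, round(square_approx(quarter, f0, n), 3))
```

With one component the wave is a plain sinewave that overshoots the plateau; as more high-frequency harmonics are added, the value in the middle of the half-cycle settles toward the flat level of 1.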

What’s the point, part 2 • A modal voicing EGG looks like: • It is less square and therefore has fewer high-frequency components. • Although it is far from sinusoidal...

What’s the point, part 3 • Breathy and falsetto voice are more sinusoidal... • And therefore the high-frequency harmonics have less power, relative to the fundamental.

Let’s Check ’em out • Head over to Praat and check out the power spectra of: • a sinewave • a square wave • a sawtooth wave • tense voice • modal voice • creaky voice • breathy voice • falsetto voice

Spectral Tilt • Spectral tilt = drop-off in intensity of higher harmonics, compared to the intensity of the fundamental.
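Spectral tilt can be put into numbers by expressing each harmonic's level in dB relative to the fundamental. The sketch below is illustrative only: the sawtooth amplitudes follow the standard 1/k Fourier series, while the "steeper" source with 1/k² amplitudes is a made-up comparison, not a measured voice quality.

```python
import math


def harmonic_levels_db(amplitudes):
    """Level of each harmonic in dB relative to the fundamental.

    amplitudes[0] is the fundamental, amplitudes[1] the 2nd harmonic, etc.
    """
    ref = amplitudes[0]
    return [20 * math.log10(a / ref) for a in amplitudes]


# Ideal sawtooth wave: harmonic k has amplitude 1/k.
sawtooth = [1 / k for k in range(1, 9)]

# Hypothetical source with a steeper drop-off: amplitude 1/k**2.
steeper = [1 / k ** 2 for k in range(1, 9)]

for name, amps in (("sawtooth", sawtooth), ("steeper", steeper)):
    levels = harmonic_levels_db(amps)
    print(name, [round(level, 1) for level in levels])
```

The steeper source loses twice as many dB per harmonic as the sawtooth, which is the kind of difference expected between a more sinusoidal waveform and one rich in high-frequency harmonics.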

The Source • The complex wave emitted from the glottis during voicing = the source of all voiced speech sounds. • In speech (particularly in vowels), humans can shape this spectrum to make distinctive sounds. • Some harmonics may be emphasized... • Others may be diminished (damped) • Different spectral shapes may be formed by particular articulatory configurations. • ...but the process of spectral shaping requires the raw stuff of the source to work with.

Spectral Shaping Examples • Certain spectral shapes seem to have particular vowel qualities.

Spectrograms • A spectrogram represents: • Time on the x-axis • Frequency on the y-axis • Intensity on the z-axis
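The three axes can be made concrete with a small short-time Fourier analysis: cut the signal into overlapping frames, compute a spectrum for each frame, and store the intensities in a time-by-frequency grid. This is a minimal sketch in plain Python, not how Praat computes its spectrograms; the frame length, hop size, Hann window and test tone are all assumptions for illustration.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_len, hop):
    """Return (times_s, freqs_hz, intensity_db).

    intensity_db[frame][bin] is the intensity at times_s[frame]
    and freqs_hz[bin].
    """
    # Hann window to taper the edges of each frame.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    freqs = [k * sample_rate / frame_len for k in range(frame_len // 2 + 1)]
    times, intensity = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        column = []
        for k in range(len(freqs)):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / frame_len)
                        for i, x in enumerate(frame))
            # Small constant avoids log10(0) for empty bins.
            column.append(10 * math.log10(abs(coeff) ** 2 + 1e-12))
        times.append((start + frame_len / 2) / sample_rate)
        intensity.append(column)
    return times, freqs, intensity


sample_rate = 2000  # assumed sampling rate
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
times, freqs, intensity = spectrogram(tone, sample_rate, frame_len=100, hop=50)

# For a steady 200 Hz tone, the strongest bin should be the same in every frame.
peaks = [freqs[max(range(len(col)), key=col.__getitem__)] for col in intensity]
print(times)
print(peaks)
```

On a plotted spectrogram, the intensity values would be drawn as darkness, so this steady tone would appear as a single horizontal band.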

Real Vowels

Ch-ch-ch-ch-changes • Check out some spectrograms of sinewaves which change frequency over time:
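A sinewave with changing frequency can be synthesized by letting the phase advance a little faster (or slower) on every sample. The sketch below builds a linear glide and then finds the strongest frequency in successive 50 ms frames, which is roughly what the dark band on a spectrogram traces. The 200 to 500 Hz glide, the sampling rate and the function names are illustrative assumptions.

```python
import cmath
import math


def glide(f_start, f_end, duration_s, sample_rate):
    """Sinewave whose frequency moves linearly from f_start to f_end."""
    n = int(duration_s * sample_rate)
    samples, phase = [], 0.0
    for i in range(n):
        samples.append(math.sin(phase))
        freq = f_start + (f_end - f_start) * i / n
        phase += 2 * math.pi * freq / sample_rate  # advance by current frequency
    return samples


def peak_frequency(frame, sample_rate):
    """Frequency (Hz) of the strongest DFT bin in one frame."""
    n = len(frame)
    best_k, best_mag = 0, -1.0
    for k in range(n // 2 + 1):
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                      for i, x in enumerate(frame)))
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sample_rate / n


sample_rate = 4000
samples = glide(200, 500, 0.5, sample_rate)

frame_len = 200  # 50 ms frames, so frequency bins are 20 Hz apart
track = [peak_frequency(samples[s:s + frame_len], sample_rate)
         for s in range(0, len(samples) - frame_len + 1, frame_len)]
print(track)
```

The printed track starts near 200 Hz and ends near 500 Hz; its 20 Hz steps reflect the frame length, which trades frequency resolution against time resolution.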

The Whole Thing • What happens when we put all three together? • This is an example of sinewave speech.

The Real Thing • Spectral change over time is the defining characteristic of speech sounds. • → It is crucial to understand spectrographic representations for the acoustic analysis of speech.
