
Pavel Skrelin (Saint-Petersburg State University)


Presentation Transcript


  1. Pavel Skrelin (Saint-Petersburg State University) Some Principles and Methods of Measuring Fo and Tempo

  2. My main principle: • Acoustic data extracted from speech material for analysis should be tied to phonetic (linguistic) features, so the values obtained should reflect concrete features and have a clear phonetic (linguistic) interpretation. • The methods of calculation and classification should take into account the properties of speech perception as well as those of speech production.

  3. Fo measurements (phrase № 53 in reading and spontaneous speech, speaker F>40)

  4. Terms: • Smoothing: Fo data are processed with a 100 ms rectangular window with a pitch-synchronous shift. • Correction: pitch marks are removed on voiced consonants and approximants, on voiced onsets, on hesitations, and on voiced transitions between vowels and consonants.
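
One possible reading of the two terms above, as a minimal Python sketch. It assumes that pitch marks arrive as time-sorted `(time_ms, fo_hz, label)` tuples, that the 100 ms window is centred on each pitch mark, and that "pitch-synchronous shift" means the window advances one pitch mark at a time. The function names and segment labels are hypothetical, and the slides do not say in which order the two steps are applied.

```python
from bisect import bisect_left, bisect_right


def drop_excluded_marks(marks, excluded_labels):
    """Correction: remove pitch marks whose segment label is excluded."""
    return [m for m in marks if m[2] not in excluded_labels]


def smooth_fo(marks, window_ms=100.0):
    """Rectangular-window (moving average) smoothing at every pitch mark."""
    times = [t for t, _, _ in marks]
    values = [f for _, f, _ in marks]
    half = window_ms / 2.0
    smoothed = []
    for t in times:
        # Indices of all marks inside [t - half, t + half] (times are sorted).
        lo = bisect_left(times, t - half)
        hi = bisect_right(times, t + half)
        window = values[lo:hi]  # never empty: it always contains the mark at t
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical pitch marks: (time in ms, Fo in Hz, segment label).
marks = [
    (0, 100.0, "vowel"),
    (10, 110.0, "vowel"),
    (20, 120.0, "transition"),
    (30, 130.0, "vowel"),
    (200, 200.0, "vowel"),
]
kept = drop_excluded_marks(marks, {"transition"})
print(smooth_fo(kept))
```

An isolated mark, such as the last one in the example, keeps its own value because no other mark falls inside its window.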

  5. Why the correction is needed on voiced transitions between vowels and consonants: this voiced transition does not affect the perceived vowel duration. Examples: • the whole group [kak'i tak'i] • [kaki] isolated: original vowel length vs. vowel without transition • [i]: original length (66 ms) vs. vowel without transition (37 ms). But it affects the duration of the next consonant and the Fo values:

  6. Smoothed Fo data with pitch marks on the voiced [i-t] transition • Smoothed Fo data without pitch marks on the voiced [i-t] transition

  7. Raw data: reading • Raw data: spontaneous speech

  8. Smoothed data: reading • Smoothed data: spontaneous speech

  9. Smoothed Fo without laryngealization: reading • Smoothed Fo without laryngealization: spontaneous speech

  10. Smoothed Fo without laryngealization and some consonants: reading • Smoothed Fo without laryngealization and some consonants: spontaneous speech

  11. Fo measurements

  12. Tempo measurements • Methods may differ depending on the task: • Comparison on the basis of the whole material • Tempo monitoring, for example to reveal tempo modifications specific to some IU types or IU positions in the utterance, or to compare local tempo between read and spontaneous realizations of the same phrase

  13. Tempo measurements: comparison on the basis of the whole material • Syllables: average duration of syllables realized in spontaneous speech vs. average duration of syllables realized in reading. Example for F>40: 152/143 = 1.06 • Sounds: average duration of sounds realized in spontaneous speech vs. average duration of sounds realized in reading. Example for F>40: 67/63 = 1.06 • Possible correction: taking into account the ideal number of syllables or sounds

  14. The simplest way: direct comparison of sound durations in the two phrases. But some sounds are longer in reading and others in spontaneous speech, which makes the tempo comparison difficult and inconsistent.

  15. Tempo measurements: tempo monitoring. Example (speaker F<20, phrase № 12: sound durations in spontaneous speech and reading)

  16. Methods for tempo monitoring 1. Current syllable duration / average syllable duration: current syllable duration = IU duration / number of syllables; average syllable duration = net sound material duration / number of syllables. Not good, because the result depends on the syllable structures in the current IU, so it needs a normalization that takes into account both the average syllable structure (C/V coefficient) and the current one.

  17. Methods for tempo monitoring 2. Average sound duration in the current IU / average sound duration: average sound duration in the current IU = IU duration / number of sounds; average sound duration = net sound material duration / number of sounds. Possible correction: taking into account the ideal number of sounds in the whole material and in the current IU. Example for F<20 • Reading: 1st IU 59/64 = 0.92, 2nd IU 61/64 = 0.95 • Spontaneous speech: 1st IU 70/71 = 0.99, 2nd IU 54/71 = 0.76. Not good, because the result takes into account neither the individual average duration of each sound in the IU nor the deviation of each sound's current duration from its average duration in the whole material.
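
The ratio in method 2 can be sketched in a few lines of Python. The function name and the sound counts in the example are hypothetical; the counts are chosen only so that the two averages match the 59 and 64 reported for the first IU in reading (F<20), and durations are assumed to be in milliseconds.

```python
def relative_tempo_method2(iu_duration, iu_n_sounds,
                           total_duration, total_n_sounds):
    """Average sound duration in the IU over the average sound duration
    in the whole material (both arguments in the same time unit)."""
    iu_average = iu_duration / iu_n_sounds
    overall_average = total_duration / total_n_sounds
    return iu_average / overall_average


# Hypothetical counts: 10 sounds lasting 590 in the IU (average 59),
# 100 sounds lasting 6400 in the whole material (average 64).
ratio = relative_tempo_method2(590, 10, 6400, 100)
print(round(ratio, 2))
```

Under this definition a value below 1 means that sounds in the IU are on average shorter than in the whole material.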

  18. Methods for tempo monitoring 3. Average sound duration in the current IU / averaged sound duration in the IU: average sound duration in the current IU = IU duration / number of sounds; averaged sound duration = sum of the average durations (on the basis of the whole material) of the sounds in the IU / number of sounds in the IU (some pictures)

  19. Or the same in better view

  20. Or the same in better view

  21. Methods for tempo monitoring 3. Average sound duration in the current IU / averaged sound duration in the IU: average sound duration in the current IU = IU duration / number of sounds; averaged sound duration = sum of the average durations (on the basis of the whole material) of the sounds in the IU / number of sounds in the IU. Example for F<20 • Reading: 1st IU 59/72 = 0.82, 2nd IU 61/56 = 1.09 • Spontaneous speech: 1st IU 70/75 = 0.93, 2nd IU 54/66 = 0.82. Possible correction: taking into account the ideal number of sounds in the whole material and in the current IU, and the average durations of pre-stressed and post-stressed vowels
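
Method 3 differs from method 2 only in the denominator, which is built from the corpus-wide average duration of each sound that actually occurs in the IU. A minimal Python sketch, assuming the material is available as `(sound, duration)` pairs; the function names and the toy data are hypothetical and are not taken from the presentation.

```python
from collections import defaultdict


def average_duration_per_sound(material):
    """Mean duration of each sound over the whole material."""
    totals = defaultdict(lambda: [0.0, 0])  # sound -> [sum, count]
    for sound, duration in material:
        totals[sound][0] += duration
        totals[sound][1] += 1
    return {sound: s / n for sound, (s, n) in totals.items()}


def relative_tempo_method3(iu, averages):
    """Actual average sound duration in the IU divided by the average
    expected from the corpus-wide means of the same sounds."""
    actual = sum(duration for _, duration in iu) / len(iu)
    expected = sum(averages[sound] for sound, _ in iu) / len(iu)
    return actual / expected


# Toy material: corpus-wide means come out as a = 90, k = 50.
material = [("a", 80), ("a", 100), ("k", 60), ("k", 40)]
averages = average_duration_per_sound(material)
iu = [("k", 45), ("a", 81)]
print(relative_tempo_method3(iu, averages))
```

Because the expected value follows the sound inventory of the IU, an IU made up of intrinsically long sounds is no longer counted as slow merely because of its composition.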

  22. Methods for tempo monitoring 4. Rob van Son proposal (Z-values): • "As Finnish and Dutch (and Russian?) use quantities on (some) phonemes, this is not a good way to define tempo. We had a PhD student (Xue Wang) who developed a very nice way to define "local" tempo as the Z value of the phoneme (i.e., LocalTempo = (PhonemeDuration - MeanPhonemeDuration)/StandDeviation for each phoneme). The local speaking rate is then the mean of these values over an utterance." Example for F<20 • Reading: 1st IU -0.39, 2nd IU 0.26 • Spontaneous speech: 1st IU -0.12, 2nd IU -0.41. No comprehensible relation between the values and linguistic features
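
The quoted formula can be sketched in Python as follows. The quote does not say whether the standard deviation is the population or the sample one; this sketch assumes the population form (`statistics.pstdev`). The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def phoneme_statistics(material):
    """Mean and (population) standard deviation of duration per phoneme."""
    by_phoneme = defaultdict(list)
    for phoneme, duration in material:
        by_phoneme[phoneme].append(duration)
    return {p: (mean(d), pstdev(d)) for p, d in by_phoneme.items()}


def local_speaking_rate(utterance, stats):
    """Mean Z value of the phoneme durations over an utterance."""
    z_values = []
    for phoneme, duration in utterance:
        m, sd = stats[phoneme]
        if sd == 0:
            continue  # Z is undefined for a phoneme with no duration variation
        z_values.append((duration - m) / sd)
    return mean(z_values)


# Toy material: a has mean 90 and SD 10, k has mean 50 and SD 10.
material = [("a", 80), ("a", 100), ("k", 60), ("k", 40)]
stats = phoneme_statistics(material)
utterance = [("k", 45), ("a", 100)]
print(local_speaking_rate(utterance, stats))
```

With this formula a positive value means that the phonemes are on average longer than their means, and a negative value that they are shorter.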

  23. Method comparison
