
Prosody modification in speech signals

Prosody modification in speech signals. Project by Edi Fridman & Alex Zalts, supervised by Yizhar Lavner.


Presentation Transcript


  1. Prosody modification in speech signals. Project by Edi Fridman & Alex Zalts, supervised by Yizhar Lavner.

  2. Prosody: the "non-textual" aspects of the speech signal. "Segmental" aspects: timing, duration, rhythm, stress, and metrical structure. The duration of each individual "segment" is under the control of the speaker to varying degrees, and varies with stress and rate. The relative strength of an individual syllable, word, or phrase may be realized in a number of ways, including lengthening (or shortening and cliticization) and changes in pitch, amplitude, and spectral character.

  3. Project goals • Prosody modification with TDPSOLA algorithm • Prosody modification with HNM model • Conversion of male voice to female voice & vice versa

  4. Four steps in prosody modification • Time-scale modification • Pitch-scale modification • Energy envelope modification • Modification of the distribution of utterances

  5. TDPSOLA Approach (*) Based on the Overlap-and-Add (OLA) idea (*) Synchronization with the original pitch by: 1) Setting up pitch marks in the analysis signal 2) Setting up new pitch marks in the synthesis signal according to the time-scale and pitch-scale factors (0.6 for pitch, 1.3 for time) (*) Building the synthesis signal using OLA
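The overlap-and-add step on this slide can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the Hann window spanning two local pitch periods, and the division by the summed window weights are all assumptions, and the pitch marks plus the synthesis-to-analysis `mapping` are taken as given integer inputs.

```python
import math

def hann(length):
    """Symmetric Hann window with `length` samples (zero at both ends)."""
    if length < 2:
        return [1.0] * length
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]

def psola_overlap_add(signal, analysis_marks, synthesis_marks, mapping, out_len):
    """Copy windowed two-period segments from analysis marks to synthesis marks.

    analysis_marks: increasing integer sample positions (at least two).
    mapping[u]: index of the analysis mark used for synthesis mark u.
    """
    out = [0.0] * out_len
    weight = [0.0] * out_len
    for t_s, idx in zip(synthesis_marks, mapping):
        t_a = analysis_marks[idx]
        # Local pitch period: distance to the next mark (previous one at the end).
        if idx + 1 < len(analysis_marks):
            period = analysis_marks[idx + 1] - t_a
        else:
            period = t_a - analysis_marks[idx - 1]
        window = hann(2 * period + 1)
        centre = int(round(t_s))
        for k in range(-period, period + 1):
            src, dst = t_a + k, centre + k
            if 0 <= src < len(signal) and 0 <= dst < out_len:
                w = window[k + period]
                out[dst] += w * signal[src]
                weight[dst] += w
    # Assumed normalisation: divide by the accumulated window weight.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, weight)]
```

Passing the analysis marks unchanged as synthesis marks reproduces the covered part of the input; repeating or skipping entries in `mapping` is what lengthens or shortens the signal.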

  6. Setting up new pitch marks. Let us define the time instants $t_a(s)$ in the analysis signal as the original pitch marks, and the pitch contour as $P(t)$. The stream of synthesis pitch marks $t_s(u)$ is determined from $t_a(s)$ according to the desired time-scale modification $t \to D(t)$ and pitch-scale modification $F_p(P)$ by: $$t_s(u+1) - t_s(u) = \frac{1}{t_s'(u+1) - t_s'(u)} \int_{t_s'(u)}^{t_s'(u+1)} P'(t)\,dt$$ with $t_s(u+1) = D(t_s'(u+1))$ and $P'(t) = F_p(P(t))$.
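For constant factors, the placement of synthesis pitch marks can be approximated with a simple loop. The sketch below is a simplification under stated assumptions: it takes D(t) = time_factor * t, treats `pitch_factor` as a multiplier on F0 (so the period is divided by it), and uses the local period at the virtual analysis time instead of the mean over the interval. All names are hypothetical.

```python
from bisect import bisect_right

def local_period(analysis_marks, t):
    """Pitch period around analysis time t, from the two neighbouring marks."""
    i = bisect_right(analysis_marks, t) - 1
    i = max(0, min(i, len(analysis_marks) - 2))
    return analysis_marks[i + 1] - analysis_marks[i]

def synthesis_marks(analysis_marks, time_factor, pitch_factor):
    """Return (marks, mapping) for constant time-scale and pitch-scale factors.

    marks: synthesis pitch-mark positions on the stretched time axis.
    mapping[u]: index of the analysis mark nearest to the virtual time of mark u.
    """
    marks, mapping = [], []
    t_s = time_factor * analysis_marks[0]
    end = time_factor * analysis_marks[-1]
    while t_s <= end:
        # Virtual analysis time: inverse of the assumed mapping D(t) = time_factor * t.
        t_virtual = t_s / time_factor
        nearest = min(range(len(analysis_marks)),
                      key=lambda i: abs(analysis_marks[i] - t_virtual))
        marks.append(t_s)
        mapping.append(nearest)
        # Next mark: one modified pitch period later (first-order approximation).
        t_s += local_period(analysis_marks, t_virtual) / pitch_factor
    return marks, mapping
```

With marks every 100 samples, `synthesis_marks(marks, 1.3, 0.6)` spaces the new marks about 167 samples apart over a time axis 1.3 times longer, which matches the example factors quoted for TDPSOLA under the assumed convention.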

  7. Problem of TDPSOLA: it is impossible to change the pitch contour, because the algorithm is based on the original pitch marks. [Figure: original pitch marks vs. new pitch marks] Problem: too many pitch marks are not counted in, resulting in bad sound quality.

  8. HNM Approach • The speech signal is modeled as harmonics of the pitch plus noise • Harmonics and noise are treated in different ways • Synthesis and analysis are performed in a pitch-synchronous way

  9. Let $X(n)$ be the speech segment. According to the HNM model, $X(n)$ can be written as: [equation] To minimize the error: [equation] where the complex constants $h_k$ and $z_k$ are defined as: [equation] $h_k$: complex amplitude of harmonic $k$; $f_k$: frequency of harmonic $k$; $T$: sampling period; $W(n)$: noise.
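Given the symbols listed on this slide, one consistent reading is the usual sum-of-complex-exponentials model with a least-squares error. The form below is an assumption based on that symbol list, not a formula recovered from the slide; for real-valued speech the terms would appear in complex-conjugate pairs.

```latex
\begin{align}
  X(n) &= \sum_{k} h_k\, z_k^{\,n} + W(n), \qquad z_k = e^{\,j 2\pi f_k T} \\
  \varepsilon &= \sum_{n} \Bigl|\, X(n) - \sum_{k} h_k\, z_k^{\,n} \Bigr|^{2}
  \quad \text{(minimized over the } h_k\text{)}
\end{align}
```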

  10. Harmonic K is set to K*F0, where F0 is the pitch found by the PDA. The amplitudes and phases of the pitch harmonics are computed with the Prony algorithm by minimizing the least-squares error between the harmonics and the original signal, yielding: [equation] In each voiced speech fragment the maximum voiced frequency Fm is calculated, and the noise part is obtained by filtering the signal with a high-pass filter with cutoff frequency Fm. In unvoiced fragments the signal's spectrum is modeled by a pth-order all-pole filter H(z). The noise is synthesized by filtering unit-variance Gaussian noise through H(z). When pitch scaling is done, the amplitudes and phases of the modified pitch harmonics must be re-computed. For this purpose a frequency-continuous spectral and phase envelope is necessary.
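Two of the steps on this slide can be sketched in a few lines. With the harmonic frequencies fixed at K*F0, estimating amplitudes and phases reduces to a linear least-squares fit, shown here through the normal equations; this is a plain least-squares illustration and not necessarily the exact Prony procedure used in the project. The noise function filters unit-variance Gaussian noise through an all-pole filter, assuming H(z) = 1 / (1 + sum of a_i z^-i) with unit gain. All names are hypothetical.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_harmonics(frame, f0, fs, n_harmonics):
    """Least-squares (amplitude, phase) of harmonics k*f0, k = 1..n_harmonics."""
    basis = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / fs
        basis.append([math.cos(w * n) for n in range(len(frame))])
        basis.append([math.sin(w * n) for n in range(len(frame))])
    # Normal equations: (B B^T) c = B x
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(p * x for p, x in zip(bi, frame)) for bi in basis]
    coef = solve_linear(gram, rhs)
    result = []
    for k in range(n_harmonics):
        a, b = coef[2 * k], coef[2 * k + 1]
        # a*cos(wn) + b*sin(wn) = A*cos(wn + phi)
        result.append((math.hypot(a, b), math.atan2(-b, a)))
    return result

def synthesize_noise(ar_coeffs, n_samples, seed=0):
    """Filter unit-variance Gaussian noise through 1 / (1 + sum a_i z^-i)."""
    rng = random.Random(seed)
    out = []
    for n in range(n_samples):
        y = rng.gauss(0.0, 1.0)
        for i, a in enumerate(ar_coeffs, start=1):
            if n - i >= 0:
                y -= a * out[n - i]
        out.append(y)
    return out
```

The fit works best when the frame covers at least two pitch periods, so that the cosine and sine columns of different harmonics are close to orthogonal.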

  11. Comparison of TDPSOLA & HNM

  12. The only target in pitch scaling was to change F0 while preserving the formants. There was an attempt to change the spectral envelope in order to change a male voice to a female voice and vice versa. A new algorithm was proposed.
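Changing F0 while keeping the formants in place amounts to sampling the same spectral envelope at the new harmonic frequencies. The sketch below illustrates that idea with a piecewise-linear envelope through the measured harmonic amplitudes; the linear interpolation and all names are assumptions, since the slides do not say how the envelope was estimated.

```python
from bisect import bisect_right

def interp(x, xs, ys):
    """Piecewise-linear interpolation, held constant outside [xs[0], xs[-1]]."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def scale_harmonics(freqs, amps, pitch_factor, max_freq):
    """Amplitudes of the harmonics of pitch_factor * F0 under the same envelope.

    freqs: original harmonic frequencies (freqs[0] is F0), increasing.
    amps: amplitude measured at each original harmonic.
    """
    f0_new = freqs[0] * pitch_factor
    new_freqs, new_amps = [], []
    k = 1
    while k * f0_new <= max_freq:
        f = k * f0_new
        new_freqs.append(f)
        new_amps.append(interp(f, freqs, amps))  # envelope stays where it was
        k += 1
    return new_freqs, new_amps
```

For example, harmonics at 100, 200, 300 and 400 Hz scaled by 1.5 give new harmonics at 150 and 300 Hz, each taking the envelope value at its own frequency instead of carrying over the amplitude of the harmonic it replaced.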
