
Study and Implementation of Two Voice Warping Algorithms: Pitch Shifting & Time Stretching

EEL 6586 Project Presentation. Deng, Chengyu; Wang, Dexiang.





Presentation Transcript


  1. Study and Implementation of Two Voice Warping Algorithms: Pitch Shifting & Time Stretching • EEL 6586 Project Presentation • Deng, Chengyu • Wang, Dexiang

  2. Outline • Pitch analysis • Voice warping applications • Pitch shifting • Time stretching • Pitch shifting algorithm • How to change pitch • General approaches: phase vocoder vs. PSOLA • Improving frequency resolution • Formant considerations • Time stretching implementation • Software demonstration (the work we implemented)

  3. Take a Look at Pitch • Definition: the perceived fundamental frequency of a sound • Pitch period: the duration of one cycle of the fundamental • Produced by glottal excitation • An important cue for distinguishing male from female voices and adults from children • Accompanied by many harmonics

  4. Voice Warping Applications • Pitch shifting: keep the time duration but scale the pitch up or down • Change a man's voice to a woman's, or vice versa • Create chipmunk-like or Mickey Mouse-like sounds • Many applications in the movie industry • Time stretching: keep the pitch unchanged but shorten or stretch the time duration • Help with word identification • Create extremely short or extremely long utterances that a person could hardly speak naturally

  5. How to Change Pitch? • Naïve idea • Down-sample or up-sample the speech signal • Problems • The time duration changes as well • The formants move as well • Goal: generate the same number of samples while scaling only the pitch
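
The first problem with the naïve idea can be checked numerically. The following is a minimal sketch, not the presenters' code: `resample_linear` and `zero_crossing_rate_hz` are hypothetical helper names, and the 200 Hz test tone and 8000 Hz sample rate are arbitrary assumptions. Reading the signal at twice the step doubles the frequency but also halves the number of samples.

```python
import math

def resample_linear(signal, factor):
    """Naive resampling: read the input at a step of `factor` samples.

    factor > 1 raises the pitch and shortens the signal;
    factor < 1 lowers the pitch and lengthens it.
    """
    out_len = int(len(signal) / factor)
    out = []
    for n in range(out_len):
        pos = n * factor
        i = int(pos)
        frac = pos - i
        nxt = signal[min(i + 1, len(signal) - 1)]
        # Linear interpolation between the two neighbouring input samples.
        out.append((1.0 - frac) * signal[i] + frac * nxt)
    return out

def zero_crossing_rate_hz(signal, sample_rate):
    """Rough frequency estimate for a clean sinusoid: upward zero crossings per second."""
    ups = sum(1 for a, b in zip(signal, signal[1:]) if a < 0.0 <= b)
    return ups * sample_rate / len(signal)

SAMPLE_RATE = 8000  # assumed sample rate, in Hz
# One second of a 200 Hz test tone (a stand-in for a voiced sound).
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
shifted = resample_linear(tone, 2.0)

print(len(tone), len(shifted))                      # sample counts before and after
print(zero_crossing_rate_hz(tone, SAMPLE_RATE),     # approximate frequency before
      zero_crossing_rate_hz(shifted, SAMPLE_RATE))  # approximate frequency after
```

Played back at the same sample rate, the resampled tone is one octave higher and lasts half as long, which is why the later slides rescale the spectrum frame by frame instead.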

  6. Two General Approaches • Phase vocoder • Manipulates the signal in the frequency domain • Phase is an important feature for determining the pitch and the positions of its harmonics • More accurate, higher fidelity, but longer computation • Time-domain scaling ((P)SOLA, etc.) • Manipulates the signal in the time domain • Precise pitch detection is a critical prerequisite • Shorter computation, but lower quality • (P)SOLA: (Pitch) Synchronous OverLap/Add
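
The slide names pitch detection as the prerequisite for (P)SOLA but does not say which detector is meant. As an illustration only, here is a sketch of one common choice, autocorrelation peak picking. The function name `estimate_pitch_hz`, the 60 to 400 Hz search range, and the synthetic test frame are all assumptions.

```python
import math

def estimate_pitch_hz(frame, sample_rate, min_hz=60.0, max_hz=400.0):
    """Estimate the pitch as the lag with the largest autocorrelation.

    Only lags corresponding to min_hz..max_hz are searched (assumed voice range).
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation at this lag.
        score = sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

SAMPLE_RATE = 8000  # assumed sample rate, in Hz
# Synthetic voiced frame: 100 Hz fundamental plus two weaker harmonics.
frame = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
         + 0.25 * math.sin(2 * math.pi * 300 * n / SAMPLE_RATE)
         for n in range(800)]
pitch = estimate_pitch_hz(frame, SAMPLE_RATE)
print(pitch)
```

Real speech is noisier than this test frame, so practical detectors usually add voicing decisions and smoothing across frames.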

  7. Basic Algorithms We Used for Pitch Shifting • (Block diagram: Original speech -> Window -> STFT -> Frequency scaling -> iSTFT -> Window concatenation -> Pitch-shifted speech) • Frequency-domain processing (more accurate) • Use the short-time Fourier transform (STFT) • With overlapped windows • Scale the frequency axis to change the positions of the pitch and its harmonics • Upscale: discard the high-frequency components that would exceed the available band, to avoid aliasing (listeners cannot perceive the difference) • Downscale: fill the vacated high-frequency components with zeros
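
The frequency-scaling step for a single frame might look like the sketch below. It is an assumption-laden illustration rather than the presenters' implementation: `dft`, `scale_bins`, and `peak_bin` are hypothetical names, a naive DFT replaces an FFT to keep the example self-contained, and only the positive-frequency half of the spectrum is handled.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform (O(N^2)); fine for a short demo frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def scale_bins(spectrum, factor):
    """Move the content of bin k to the bin nearest k * factor.

    Returns only bins 0..N/2 (the positive-frequency half).
    """
    half = len(spectrum) // 2
    out = [0j] * (half + 1)
    for k in range(half + 1):
        target = int(k * factor + 0.5)
        if target <= half:
            out[target] += spectrum[k]
        # Upscaling: content that would land above bin N/2 is discarded.
    # Downscaling: the upper bins receive nothing and therefore stay zero.
    return out

def peak_bin(half_spectrum):
    """Index of the bin with the largest magnitude."""
    return max(range(len(half_spectrum)), key=lambda k: abs(half_spectrum[k]))

N = 64  # assumed frame length
frame = [math.cos(2 * math.pi * 6 * t / N) for t in range(N)]  # energy in bin 6
spectrum = dft(frame)
up = scale_bins(spectrum, 2.0)
down = scale_bins(spectrum, 0.5)
print(peak_bin(spectrum[: N // 2 + 1]), peak_bin(up), peak_bin(down))
```

A complete pitch shifter would also rebuild the conjugate-symmetric negative frequencies before the inverse transform and adjust the phases between frames; this sketch only shows how the spectral content is relocated.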

  8. Improve Frequency Resolution • The discrete Fourier transform has limited frequency resolution • It cannot precisely represent peak components that fall between bins • Example • A: frequency exactly on the 50th bin • B: frequency between the 50th and 51st bins • Solution • Use the phase difference between two successive windows to compute the exact frequency in each bin (the final report will have more details)
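
The slide's case B can be worked through numerically. The sketch below follows the standard phase-vocoder frequency estimate; since the slide defers the details to the final report, the exact formulation used by the presenters is an assumption. The helper names, the Hann window, the frame size of 256, and the hop of 64 samples are all illustrative choices.

```python
import cmath
import math

def bin_value(signal, start, size, k):
    """DFT bin k of a Hann-windowed frame that begins at sample `start`."""
    total = 0j
    for t in range(size):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * t / size)  # Hann window
        total += w * signal[start + t] * cmath.exp(-2j * math.pi * k * t / size)
    return total

def wrap_phase(p):
    """Wrap an angle into the interval [-pi, pi)."""
    return (p + math.pi) % (2 * math.pi) - math.pi

def refined_frequency(signal, sample_rate, size, hop, k):
    """Estimate the true frequency near bin k from two frames `hop` samples apart."""
    first = bin_value(signal, 0, size, k)
    second = bin_value(signal, hop, size, k)
    # Phase advance expected if the component sat exactly on the centre of bin k.
    expected = 2 * math.pi * k * hop / size
    deviation = wrap_phase(cmath.phase(second) - cmath.phase(first) - expected)
    # Convert the phase deviation into a fractional bin offset, then into Hz.
    true_bin = k + deviation * size / (2 * math.pi * hop)
    return true_bin * sample_rate / size

SAMPLE_RATE = 8000             # assumed sample rate, in Hz
SIZE, HOP = 256, 64            # assumed frame size and hop (75% overlap)
TRUE_HZ = 50.5 * SAMPLE_RATE / SIZE   # halfway between bins 50 and 51
signal = [math.sin(2 * math.pi * TRUE_HZ * n / SAMPLE_RATE) for n in range(SIZE + HOP)]
estimate = refined_frequency(signal, SAMPLE_RATE, SIZE, HOP, 50)
print(TRUE_HZ, estimate)
```

The bin centres are `SAMPLE_RATE / SIZE` apart (31.25 Hz here), so the plain DFT could only report bin 50 or bin 51; the phase difference recovers the position in between.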

  9. Formant Considerations • Deal with formant movement issues • Scaling the frequency axis also moves the formants, so vocal tract information is lost • Upshifting pitch -> sounds like a smaller vocal tract • Downshifting pitch -> sounds like a bigger vocal tract • Solution • Calculate the formant envelope (LPC) • Normalize the magnitudes by the envelope before frequency scaling • Scale the frequency axis • Restore the formant envelope
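
The envelope calculation in the first solution step can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is an illustrative version under stated assumptions, not the Levinson code the presenters borrowed: the function names are hypothetical, and the test signal is the impulse response of a known two-pole resonator so the result can be checked.

```python
import cmath
import math

def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    return [sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err) for the error filter A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                       # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                   # remaining prediction error energy
    return a, err

def lpc_envelope(a, err, num_bins):
    """Envelope sqrt(err) / |A(e^jw)| sampled at num_bins points from 0 to pi."""
    env = []
    for b in range(num_bins):
        w = math.pi * b / (num_bins - 1)
        denom = sum(a[j] * cmath.exp(-1j * w * j) for j in range(len(a)))
        env.append(math.sqrt(err) / abs(denom))
    return env

# Test signal: impulse response of a resonator with pole radius 0.9 at angle pi/4.
A1, A2 = -2 * 0.9 * math.cos(math.pi / 4), 0.81
x = [1.0, -A1]
for n in range(2, 400):
    x.append(-A1 * x[n - 1] - A2 * x[n - 2])

a, err = levinson_durbin(autocorrelation(x, 2), 2)
envelope = lpc_envelope(a, err, 65)
print(a, envelope.index(max(envelope)))
```

Following the slide's remaining steps, each bin magnitude would be divided by the envelope before the frequency axis is scaled and multiplied by it again afterwards, so the formant peaks stay where they were. The LPC order is one of the parameters the presenters expose in their advanced setup; order 2 here only suits the toy signal.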

  10. Time Stretching Implementation • (Block diagram: Original speech -> Window -> STFT -> Frequency-domain interpolation / compression -> iSTFT -> Window concatenation -> Time-stretched speech) • Still takes advantage of frequency-domain manipulation • Stretch the time duration • Interpolate additional samples between the original frequency bins (upsampling in the frequency domain) • Linear interpolation instead of sinc interpolation (for computational convenience) • Shrink the time duration • Compress the original frequency bin samples (downsampling in the frequency domain)
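
The linear interpolation between frequency bins could be sketched as follows. The slides do not show how the presenters mapped the interpolated bins back to frames, so this covers only the interpolation step itself; `resample_bins` is a hypothetical name and the short lists stand in for real spectra.

```python
def resample_bins(bins, new_count):
    """Linearly interpolate bin values onto new_count evenly spaced points.

    new_count > len(bins) inserts values between the original bins (stretching);
    new_count < len(bins) thins them out (shrinking).
    Works for real or complex bin values.
    """
    if new_count < 2 or len(bins) < 2:
        raise ValueError("need at least two input bins and two output bins")
    scale = (len(bins) - 1) / (new_count - 1)
    out = []
    for m in range(new_count):
        pos = m * scale
        i = min(int(pos), len(bins) - 2)   # left neighbour, clamped at the end
        frac = pos - i
        out.append((1 - frac) * bins[i] + frac * bins[i + 1])
    return out

stretched = resample_bins([0.0, 2.0, 4.0], 5)           # more bins than before
shrunk = resample_bins([0.0, 1.0, 2.0, 3.0, 4.0], 3)    # fewer bins than before
mixed = resample_bins([1 + 1j, 3 - 1j], 3)              # complex values also work
print(stretched, shrunk, mixed)
```

Because a DFT of N bins corresponds to N time-domain samples, a frame with more bins yields a longer frame after the inverse transform, which appears to be the effect the slide relies on. Linear interpolation is cheaper than sinc interpolation but less exact, as the slide acknowledges.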

  11. Putting It All Together (Building Our Software Implementation) • Windows platform / Visual C++ • Self-developed framework & algorithms • Formant position maintenance (LPC formant envelope calculation) • Time stretching • Borrowed ideas and some source code from DSP websites • http://www.dspdimension.com/ for the elementary frequency shifting algorithm • http://www.koders.com/ for the Levinson LPC algorithm

  12. Introduction to Our GUI Functions • Set the target parameters needed by the pitch shifting and time stretching process • Click “Process Voice File…” to select the original voice file and the altered (output) voice file • Wait for processing to complete • Click the “Play Voice File…” button to hear the altered voice

  13. Introduction to Our GUI Functions • Advanced setup • Change the parameters used in our algorithms • LPC order • Window size • Overlap percentage between windows

  14. Demo (question session follows)
