
Xkl: A Tool For Speech Analysis



Presentation Transcript


  1. Xkl: A Tool For Speech Analysis Eric Truslow Adviser: Helen Hanson

  2. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Pitch Detection • Labeling • Portability • Future Work

  4. Speech Production • Vocal Tract Frequency Response

  5. Speech Production • Periodic Source → Vocal Tract Frequency Response

  6. Speech Production • Periodic Source → Vocal Tract Frequency Response → Output • Nasal cavities contribute too

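  7. Speech Model: Basic • Voiced branch: Pitch Period → Impulse Train Generator → Glottal Pulse Model → × Gain • Unvoiced branch: Random Noise Generator → × Gain • Voiced/Unvoiced Decision selects which branch excites the Vocal Tract Model • Vocal Tract Parameters control the Vocal Tract Model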
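
The block diagram on slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not Xkl's code: the function names are hypothetical, the glottal pulse model is omitted, and the vocal tract is approximated by a cascade of two-pole resonators whose frequencies and bandwidths are assumed values.

```python
import math
import random

def make_source(n_samples, sample_rate, voiced, pitch_hz=100.0, gain=1.0, seed=0):
    """Excitation: an impulse train if voiced, uniform white noise otherwise."""
    if voiced:
        period = int(round(sample_rate / pitch_hz))  # pitch period in samples
        return [gain if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def resonator(signal, sample_rate, freq_hz, bandwidth_hz):
    """Two-pole digital resonator standing in for one vocal tract resonance."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(n_samples, sample_rate, voiced, formants, pitch_hz=100.0, gain=1.0):
    """Pass the selected source through a cascade of resonators."""
    signal = make_source(n_samples, sample_rate, voiced, pitch_hz, gain)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, sample_rate, freq_hz, bandwidth_hz)
    return signal

# Example: 100 ms of a vowel-like sound with three assumed formants.
vowel = synthesize(800, 8000, True, [(700, 80), (1200, 90), (2600, 120)])
```

Switching `voiced` to `False` swaps the impulse train for the noise source while leaving the vocal tract filter unchanged, which mirrors the voiced/unvoiced switch in the diagram.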

  8. Speech Model: Klatt

  9. Parameters • Source characterization • Voiced or unvoiced • Frequency of periodic source • Energy distribution of a noise source • Vocal tract model • Resonant frequency (Formants), antiresonant frequencies and bandwidths

  10. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Pitch Detection • Labeling • Portability • Future Work

  11. Background - Xkl • Developed in-house at MIT by Dennis Klatt in the 1980s; originally a command-line tool on VAX systems. • Later ported to UNIX, with an X11/Motif GUI added. • Currently runs on Linux. • Praat has become a very versatile alternative to Xkl, but Xkl has functionality that Praat does not.

  12. Xkl – Features • Allows users to easily examine speech signals in fine detail. • Automatically computes the DFT and spectrogram. • Can perform a variety of computations not available in other packages: • Averages spectra over time or over waveforms • Smooths the spectrum
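
As a rough illustration of the computations listed on slide 12, the sketch below computes a windowed DFT magnitude spectrum, smooths it, and averages several spectra. The slides do not say how Xkl smooths or averages, so the Hamming window, the moving-average smoother, and all function names here are assumptions. The DFT is written out directly for clarity, not speed.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitude_db(frame):
    """Magnitude spectrum in dB for bins 0..N/2 of a Hamming-windowed frame."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(20.0 * math.log10(abs(acc) + 1e-12))  # offset avoids log(0)
    return spectrum

def smooth(spectrum, width=5):
    """Moving average across neighbouring bins (assumed smoothing method)."""
    half = width // 2
    out = []
    for k in range(len(spectrum)):
        lo, hi = max(0, k - half), min(len(spectrum), k + half + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out

def average_spectra(spectra):
    """Bin-by-bin mean of several equal-length spectra, e.g. successive frames."""
    return [sum(column) / len(column) for column in zip(*spectra)]

# Example: a sinusoid that falls exactly on bin 8 of a 64-point DFT.
frame = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(64)]
spectrum = dft_magnitude_db(frame)
smoothed = smooth(spectrum)
```

A spectrogram is then a sequence of such spectra computed on overlapping frames and displayed against time.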

  13. Spectrogram and DFT in Xkl [Figure: Xkl windows showing the spectrogram, and the DFT with smoothed spectrum]

  14. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Pitch Detection • Labeling • Portability • Future Work

  15. Design Requirements Users surveyed wanted: • Pitch period estimator • An improved labeling system • Portability • Compatibility with multiple operating systems • Support for more audio file formats

  16. Pitch Detection • Pitch reflects how rapidly the vocal tract is excited by periodic pulses. • It carries lexical and prosodic information. • During computation we must decide whether the speech is voiced or unvoiced. • Errors often occur during transitions between sounds. • Errors depend on the type of pitch detector being used.
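
To make the voiced/unvoiced decision on slide 16 concrete, here is a minimal single-frame pitch estimator based on normalized autocorrelation. It is a simplified sketch, not Praat's or Xkl's detector: the search range, the voicing threshold, and the function name are assumptions chosen for illustration.

```python
import math

def estimate_pitch(frame, sample_rate, min_hz=75.0, max_hz=500.0, voicing_threshold=0.45):
    """Return the pitch of one frame in Hz, or None if the frame looks unvoiced."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove any DC offset
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None                        # silence: nothing to correlate
    min_lag = int(sample_rate / max_hz)    # shortest period considered
    max_lag = min(int(sample_rate / min_hz), n - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the lag-0 value.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_r < voicing_threshold:
        return None                        # weak periodicity: call it unvoiced
    return sample_rate / best_lag

# Example: a 200 Hz sinusoid sampled at 8 kHz, 50 ms long.
tone = [math.sin(2.0 * math.pi * 200.0 * i / 8000.0) for i in range(400)]
pitch = estimate_pitch(tone, 8000)
```

Because each frame is judged in isolation, an estimator like this is prone to the transition errors mentioned above, for example jumping an octave when a neighbouring autocorrelation peak is slightly stronger.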

  17. Pitch Detection: Design • There are many different pitch detectors • Praat's was chosen because it • Outperforms other detectors (SNR, HNR) • Is readily available

  18. Pitch Detection: Algorithm • Time-domain autocorrelation method. • Frame processing determines the strongest pitch candidates, including an unvoiced candidate. • The Viterbi algorithm finds the path through the candidates with minimum global cost. • Praat pitch detector steps: Remove Hanning Window Sidelobe → Compute Global Peak Value → Process Frame To Obtain Local Optimal Choices → Find Path with Globally Minimum Cost [Figure: Tone 4]
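
The path-finding step can be sketched as a small Viterbi search. In this illustration each frame supplies a list of `(frequency, strength)` candidates, with frequency `0.0` standing for the unvoiced candidate. The transition cost is loosely modeled on the form described in reference [2]; the cost values, the data layout, and the function names are assumptions, not Praat's actual code.

```python
import math

def transition_cost(f1, f2, octave_jump_cost=0.35, voiced_unvoiced_cost=0.14):
    """Cost of moving from candidate f1 to candidate f2 (0.0 means unvoiced)."""
    if f1 == 0.0 and f2 == 0.0:
        return 0.0
    if f1 == 0.0 or f2 == 0.0:
        return voiced_unvoiced_cost
    return octave_jump_cost * abs(math.log2(f1 / f2))

def best_path(frames):
    """Pick one candidate per frame so that total cost is globally minimal."""
    cost = [-strength for _, strength in frames[0]]  # strong candidates are cheap
    back = []                                        # back-pointers per frame
    for t in range(1, len(frames)):
        new_cost, pointers = [], []
        for f2, s2 in frames[t]:
            options = [cost[j] + transition_cost(f1, f2)
                       for j, (f1, _) in enumerate(frames[t - 1])]
            j_best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[j_best] - s2)
            pointers.append(j_best)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final candidate.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [frames[t][k][0] for t, k in enumerate(path)]

# Example: the middle frame's locally strongest candidate is an octave too high.
frames = [
    [(100.0, 0.90), (200.0, 0.50), (0.0, 0.10)],
    [(200.0, 0.85), (100.0, 0.80), (0.0, 0.10)],
    [(100.0, 0.90), (200.0, 0.50), (0.0, 0.10)],
]
contour = best_path(frames)
```

Picking the strongest candidate frame by frame would give 100, 200, 100 Hz here; the octave-jump penalty makes the smooth 100 Hz contour cheaper overall.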

  19. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Pitch Detection • Labeling • Portability • Future Work

  20. Labeling • Support for reading and saving TextGrid files, for interaction with Praat [1]. • Tiers for grouping labels • Labels should be displayed in the same window as the waveform • This differs from Xkl's separate-window layout
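
For readers unfamiliar with TextGrid files, the sketch below writes labels grouped into interval tiers using Praat's long text format. It is an illustrative Python stand-in, not Xkl's implementation: the function name and the tuple layout are hypothetical, and point tiers are left out.

```python
def textgrid_text(xmin, xmax, tiers):
    """Build TextGrid text from tiers given as (name, [(start, end, label), ...])."""
    def quote(s):
        # TextGrid strings escape a double quote by doubling it.
        return '"' + s.replace('"', '""') + '"'

    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        f'xmin = {xmin}',
        f'xmax = {xmax}',
        'tiers? <exists>',
        f'size = {len(tiers)}',
        'item []:',
    ]
    for t, (name, intervals) in enumerate(tiers, start=1):
        lines += [
            f'    item [{t}]:',
            '        class = "IntervalTier"',
            f'        name = {quote(name)}',
            f'        xmin = {xmin}',
            f'        xmax = {xmax}',
            f'        intervals: size = {len(intervals)}',
        ]
        for i, (start, end, label) in enumerate(intervals, start=1):
            lines += [
                f'        intervals [{i}]:',
                f'            xmin = {start}',
                f'            xmax = {end}',
                f'            text = {quote(label)}',
            ]
    return '\n'.join(lines) + '\n'

# Example: one tier named "words" with two labeled intervals.
grid = textgrid_text(0.0, 1.0, [("words", [(0.0, 0.4, "hello"), (0.4, 1.0, "world")])])
```

Writing `grid` to a file with a `.TextGrid` extension should give something Praat can open alongside the corresponding sound, provided the intervals in each tier cover the full time range without gaps.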

  21. Labeling

  22. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Pitch Detection • Labeling • Portability • Future Work

  23. Portability • PortAudio, a cross-platform audio library: • Supports most operating systems • Simplifies software maintenance • Xkl runs on OS X, since OS X natively runs X11 • Added support for opening Microsoft .wav files
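
As a small illustration of what opening a Microsoft .wav file involves, the sketch below reads 16-bit PCM samples with Python's standard `wave` module. It is only a stand-in for the idea: Xkl itself is not written in Python, and the function name and the choice to keep only the first channel are assumptions.

```python
import struct
import wave

def read_wav_mono16(path):
    """Return (sample_rate, samples) with samples scaled to [-1.0, 1.0).

    Only 16-bit PCM is handled; for multi-channel files the first channel is kept.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # WAV PCM data is little-endian; "h" is a signed 16-bit integer.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    samples = [v / 32768.0 for v in values[::n_channels]]
    return sample_rate, samples
```

The returned sample list can be passed to analysis routines directly, while playback would be handled by an audio library such as PortAudio.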

  24. Outline • Introduction to speech analysis • Production mechanism • Models of speech production • Background about Xkl • Design • Requirements • Alternatives • Final Design • Future Work

  25. Future Work • Deploy to users for feedback • Finalize • Labeling • Pitch Contour • Fix bugs and add small features

  26. Software Used • Eclipse – Integrated development environment. • Doxygen – A documentation generation system. • SVN – A version control system. • Open Motif – A window manager and widget library for the X Window System. • GDB – The GNU debugger. • GNU build system on OS X. • PortAudio – A multiplatform audio library.

  27. Thank you for your attention. Special thanks to: • Professor Helen Hanson • Dr. Stefanie Shattuck-Hufnagel (MIT) • Dennis H. Klatt • Survey Participants • ECE Department

  28. Questions?

  29. References • [1] Paul Boersma & David Weenink (2009). Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. Retrieved May 1, 2009, from http://www.praat.org/ • [2] Paul Boersma (1993). Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. http://www.fon.hum.uva.nl/paul/papers/Proceedings_1993.pdf
