[Advanced] Speech & Audio Signal Processing

Presentation Transcript

  1. [Advanced] Speech & Audio Signal Processing • ES 157/257: Speech and Audio Processing • Prof. Patrick Wolfe, Harvard DEAS • 02 February 2006

  2. State of the Art in Speech/Audio • Speech and audio processing may be divided into “low-level” and “high-level” inference • Speech enhancement, compression, and coding are all widely used technologies • This low-level work is the most mature • High-level tasks will drive future advances: speech/music database information retrieval; automatic speaker and speech recognition • But low-level issues also remain…

  3. Fundamental Questions • How to obtain highly structured representations of speech and audio signals? Time-frequency “atoms” as building blocks • How can statistical inference enable advances in speech signal processing? A means to obtain an “atomic decomposition” • Statistical modeling of time-frequency coefficients provides a principled solution
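
The idea of time-frequency “atoms” can be sketched as follows. This is only an illustration, not the method from the lecture: it assumes Gaussian-windowed complex sinusoids (Gabor-style atoms) and computes plain inner-product coefficients, with no statistical modeling of those coefficients. The names `gabor_atom` and `coefficient` and all parameter values are hypothetical.

```python
import cmath
import math

def gabor_atom(length, center, freq_bin, n_bins, width):
    """One time-frequency atom: a Gaussian envelope centred at sample
    `center`, multiplied by a complex sinusoid at frequency freq_bin / n_bins
    cycles per sample."""
    atom = []
    for n in range(length):
        envelope = math.exp(-0.5 * ((n - center) / width) ** 2)
        carrier = cmath.exp(2j * math.pi * freq_bin * n / n_bins)
        atom.append(envelope * carrier)
    return atom

def coefficient(signal, atom):
    """Inner product of the signal with the atom."""
    return sum(s * a.conjugate() for s, a in zip(signal, atom))

# Test signal (assumed): a cosine sitting exactly on frequency bin 8 of 64.
length, n_bins = 256, 64
signal = [math.cos(2 * math.pi * 8 * n / n_bins) for n in range(length)]

# Magnitude of the coefficient for atoms at one time position, all bins.
mags = [abs(coefficient(signal, gabor_atom(length, 128, k, n_bins, 16.0)))
        for k in range(n_bins // 2)]
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
```

Here the largest coefficient falls at the bin matching the cosine's frequency, which is the sense in which a few atoms can describe a structured signal.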

  4. Representative Applications • Missing data in the context of VoIP: Original, Missing, Restored • Source / Speaker Separation: Source 1, Source 2, Mixture 1, Mixture 2, Recovery 1, Recovery 2

  5. Digital Speech/Audio Processing

  6. Speech Production

  7. Time-Scale Modification

  8. Time-Scale Modification • Speech and Quasi-Periodic Audio • Sinewave-based Modification • Voicing-dependent Rate Factor • Male & Female Speaker: Original, Fast, Faster, Slower • Trumpet: Original, Fast, Slow
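
A minimal sketch of why a sinewave model lends itself to time-scale modification, under strong simplifying assumptions: each partial has a fixed frequency and a known amplitude envelope, and the rate factor is constant (the slide's voicing-dependent rate factor is not modeled). The function `synthesize` and its parameters are hypothetical.

```python
import math

def synthesize(partials, duration, sample_rate, rate=1.0):
    """Sum-of-sinusoids synthesis with the time axis scaled by `rate`.

    partials: list of (envelope, freq_hz), where envelope(t) gives the
    amplitude at original time t in seconds.
    rate > 1 shortens the output, rate < 1 lengthens it. Frequencies are
    left untouched, so pitch is preserved.
    """
    n_out = int(round(duration / rate * sample_rate))
    out = []
    for n in range(n_out):
        t_out = n / sample_rate
        t_orig = t_out * rate  # read the envelope at the warped time
        out.append(sum(env(t_orig) * math.sin(2 * math.pi * f * t_out)
                       for env, f in partials))
    return out

# Assumed example: one decaying 200 Hz partial, 0.1 s long at 8 kHz.
partials = [(lambda t: math.exp(-3.0 * t), 200.0)]
original = synthesize(partials, 0.1, 8000)
slower = synthesize(partials, 0.1, 8000, rate=0.5)
```

The slowed version has twice as many samples, while the sinusoid itself still oscillates at 200 Hz; only the envelope evolves more slowly.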

  9. More Time-Scale Modification • Complex Non-Speech Signals • Phase-Vocoder-based Modification • Event-Dependent Phase Coherence • Falling Can, Bongo Drums, Loon: Original, Slow

  10. Pitch and Vocal Tract Change • Sinewave-based Modification • Male & Female Speaker: Original, Low pitch/Long vocal tract, High pitch/Short vocal tract • Male Speaker: Original and Monotone

  11. Speech Coding • Sinewave-based • Code-Excited Linear Prediction • Female Speaker: Original, CELP 8000 bps, Sine 4800 bps, Sine 2400 bps • Male Speaker: Original, CELP 8000 bps, Sine 4800 bps, Sine 2400 bps
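
The linear-prediction step underlying CELP can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is an assumption-laden illustration: it covers only the predictor estimation, not the codebook search or quantization of an actual CELP coder, and the helper names `autocorr` and `levinson_durbin` are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order]
    such that x[n] is approximated by sum_k a[k] * x[n - k].
    Returns (coefficients, final prediction error)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

# Assumed test signal: a decaying exponential, i.e. a first-order
# recursion x[n] = 0.9 * x[n - 1].
x = [0.9 ** n for n in range(200)]
coeffs, err = levinson_durbin(autocorr(x, 2), 2)
```

For this signal the first coefficient comes out close to 0.9 and the second close to zero, as expected for a first-order recursion.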

  12. Noise Reduction • Adaptive Wiener Filter • Adaptation Based on Spectral Change • Cell Phone Noise, Cocktail Party, Automobile Noise: Original, Enhanced
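
A minimal sketch of the per-frequency Wiener gain, assuming that power spectra of the noisy signal and of the noise are already available for one frame. The adaptation based on spectral change mentioned on the slide is not implemented here; `wiener_gain` and its `floor` parameter are hypothetical names.

```python
def wiener_gain(noisy_power, noise_power, floor=0.0):
    """Per-bin Wiener gain: estimated speech power / noisy power.

    The speech power is estimated by subtracting the noise power from the
    noisy power and clipping at zero. `floor` sets a lower limit on the gain.
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        p_speech = max(p_noisy - p_noise, 0.0)
        gain = p_speech / p_noisy if p_noisy > 0 else 0.0
        gains.append(max(gain, floor))
    return gains

# Assumed example: three bins with high, moderate, and no speech energy.
gains = wiener_gain([10.0, 2.0, 1.0], [1.0, 1.0, 1.0])
```

Bins dominated by speech are passed almost unchanged, while bins at the noise level are attenuated toward the floor.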

  13. Compression • Reduction of Peak-to-RMS amplitude ratio • Based on Sinewave Analysis/Synthesis • Low-noise case: Original, 1.5 dB Reduction, 3.0 dB Reduction • High-noise case: Original, 1.5 dB Reduction, 3.0 dB Reduction
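
The peak-to-RMS amplitude ratio that this compression reduces can be measured as below. The sketch only computes the ratio in dB; it does not reproduce the sinewave analysis/synthesis used to lower it, and the function name `peak_to_rms_db` and the test signals are assumptions.

```python
import math

def peak_to_rms_db(samples):
    """Peak-to-RMS amplitude ratio (crest factor) in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

# A sine over whole periods has a ratio of sqrt(2), about 3.01 dB.
n = 1000
sine = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
crest_sine = peak_to_rms_db(sine)

# A square wave of the same period has peak equal to RMS, i.e. 0 dB.
square = [1.0 if s >= 0 else -1.0 for s in sine]
crest_square = peak_to_rms_db(square)
```

A lower ratio means the waveform can be transmitted at a higher average level for the same peak amplitude.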