
VAD (Voice Activity Detector)

Supervised by Dr. Kepuska. By Preetham Nosum. The VAD is part of the Front End Module (Spectrum, MFCC). It detects speech using various features, and the overall flag is set when the individual flags are all ON. Tools: Visual C++ (parameters stored into respective extensions) and Matlab.


Presentation Transcript


  1. VAD (Voice Activity Detector) • Supervised by Dr. Kepuska • By Preetham Nosum

  2. VAD • Part of Front End Module – Spectrum, MFCC • Detects speech using various features • Overall flag set when individual flags are all ON

  3. Tools • Visual C++ - Parameters stored into respective extensions • Matlab - Graphs plotted from the data

  4. VAD Features • Energy Feature • MFCC Feature • Spectrum Feature • MFCC Enhanced Feature

  5. Energy Feature • Input used from the previous module – Frame energy • Mean Frame energy calculated • Compared to current frame energy • Energy flag set if the difference is high • Mean not calculated during the VAD ON stage
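The energy rule on this slide can be sketched as follows. This is a minimal illustration, not the presenter's Visual C++ code: the class name `EnergyDetector`, the `threshold` value, and the use of a cumulative running mean are all assumptions, since the slide only says that the flag is set when the difference is "high".

```python
class EnergyDetector:
    """Flags frames whose energy rises well above a running background mean."""

    def __init__(self, threshold=6.0):
        self.threshold = threshold  # hypothetical margin, in the same units as the energy
        self.mean = 0.0             # running mean of the frame energy
        self.count = 0              # number of frames folded into the mean

    def update(self, frame_energy, vad_on):
        # The first frame only seeds the mean; nothing can be flagged yet.
        if self.count == 0:
            self.mean = float(frame_energy)
            self.count = 1
            return False

        # Energy flag: the current frame is compared to the mean frame energy.
        flag = (frame_energy - self.mean) > self.threshold

        # The mean is not recalculated during the VAD ON stage.
        if not vad_on:
            self.count += 1
            self.mean += (frame_energy - self.mean) / self.count
        return flag
```

Freezing the mean while the overall VAD is ON keeps speech frames from raising the background estimate, which would otherwise make the detector less sensitive over time.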

  6. MFCC Feature • MFCC feature calculated from the MFCC vector • Each frame compared to the overall mean MFCC • Deviation from the mean sets the MFCC flag • Mean not calculated during the VAD ON stage
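A comparable sketch for the MFCC rule is shown below. The slide does not say how the deviation from the mean is measured, so the Euclidean distance, the `threshold` value, and the name `MfccDetector` are assumptions made only for illustration.

```python
import math


class MfccDetector:
    """Flags frames whose MFCC vector deviates from the running mean vector."""

    def __init__(self, threshold=2.0):
        self.threshold = threshold  # hypothetical distance threshold
        self.mean = None            # running mean MFCC vector
        self.count = 0

    def update(self, mfcc, vad_on):
        # The first frame only seeds the mean vector.
        if self.mean is None:
            self.mean = [float(c) for c in mfcc]
            self.count = 1
            return False

        # Deviation of this frame from the overall mean (assumed: Euclidean distance).
        deviation = math.sqrt(sum((c - m) ** 2 for c, m in zip(mfcc, self.mean)))
        flag = deviation > self.threshold

        # The mean is not recalculated during the VAD ON stage.
        if not vad_on:
            self.count += 1
            self.mean = [m + (c - m) / self.count for c, m in zip(mfcc, self.mean)]
        return flag
```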

  7. Spectrum Feature • Uses variance for detection • Measures how much the signal changed after each frame • Mean compared to variance • Flag is set after a certain threshold is crossed
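One possible reading of the spectrum rule is sketched below. The slide leaves the details open, so several points here are assumptions: the change is taken as the bin-by-bin difference between consecutive spectra, the variance of that difference is compared to its own running mean, the flag fires when the variance exceeds `ratio` times that mean, and the mean is frozen while the VAD is ON (borrowed from the energy and MFCC slides). The name `SpectrumDetector` is hypothetical.

```python
from statistics import pvariance


class SpectrumDetector:
    """Flags frames whose frame-to-frame spectral change has unusually high variance."""

    def __init__(self, ratio=3.0):
        self.ratio = ratio      # hypothetical threshold factor
        self.prev = None        # spectrum of the previous frame
        self.mean_var = None    # running mean of the change variance
        self.count = 0

    def update(self, spectrum, vad_on):
        # The first frame has no predecessor to compare against.
        if self.prev is None:
            self.prev = list(spectrum)
            return False

        # How much the signal changed since the previous frame, per spectral bin.
        delta = [s - p for s, p in zip(spectrum, self.prev)]
        self.prev = list(spectrum)
        var = pvariance(delta)

        # The first variance only seeds the running mean.
        if self.mean_var is None:
            self.mean_var = var
            self.count = 1
            return False

        # Mean compared to variance: flag once the threshold is crossed.
        flag = var > self.ratio * self.mean_var

        # Assumption: like the other features, the mean is frozen while the VAD is ON.
        if not vad_on:
            self.count += 1
            self.mean_var += (var - self.mean_var) / self.count
        return flag
```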

  8. MFCC Enhanced Feature • Uses Hybrid of Spectrum and MFCC • Two ways to detect: Variance mean and Variance of variance • Most sensitive of all features • Flag set ON/OFF based on two different conditions

  9. Logic • Overall flag set when all the feature flags are turned on • Overall flag turned off when any one of the feature flags is turned off • Waits a certain number of frames to make sure it is speech
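The combination logic can be sketched as a small state machine. The slide does not give the number of frames to wait, so `confirm_frames=3` and the name `VadLogic` are assumptions for illustration only.

```python
class VadLogic:
    """Combines per-feature flags into one overall VAD flag."""

    def __init__(self, confirm_frames=3):
        self.confirm_frames = confirm_frames  # hypothetical wait, in frames
        self.run = 0                          # consecutive frames with all flags ON
        self.vad_on = False

    def update(self, flags):
        if all(flags):
            # Wait for several consecutive all-ON frames before declaring speech.
            self.run += 1
            if self.run >= self.confirm_frames:
                self.vad_on = True
        else:
            # Any single feature flag going OFF turns the overall flag OFF.
            self.run = 0
            self.vad_on = False
        return self.vad_on
```

The returned value would be fed back as the `vad_on` argument of each feature detector on the next frame, so that their means stay frozen while speech is active.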

  10. Future • Needs more refinement • Test the MFCC and MFCC Enhanced features with different sets of MFCC vector values • Test with more data

  11. References • Discrete-Time Speech Signal Processing, Thomas F. Quatieri

  12. Questions?
