Speech Recognition in Noise

Esfandiar Zavarehei
Department of Electronic and Computer Engineering, Brunel University
25 May 2004

Contents
• The use of formant features in speech recognition
  - Variable-Order LP Formant Tracker with Kalman Filtering
  - Results
• Kalman De-noising
  - Tracking and Filtering the Frequency Trajectories (RASTA)
  - How Kalman Filter is applied to de-noising problem
  - Advantages of Kalman

Variable-Order LP Formant Tracker
• Higher order of LP modelling for higher resolution
• Continuity criteria for better classification
• Kalman filtering for smoother tracks

Formant Feature (FF) Vectors
• In addition to the frequencies of the poles, their bandwidths and magnitudes are also used
• The HMMs are trained on mono-phones
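The mapping from an LP pole to a formant frequency and bandwidth can be sketched as below. This is a minimal illustration using the standard pole-angle and pole-radius relations, not the tracker described in the slides; the function name `pole_to_formant` and the sampling rate in the usage lines are assumptions. The magnitude feature is left out because the slides do not define how it is computed.

```python
import cmath
import math


def pole_to_formant(pole, fs):
    """Convert one complex LP pole to (frequency_hz, bandwidth_hz).

    pole : complex pole of the all-pole (LP) model, inside the unit circle
    fs   : sampling rate in Hz (assumed value supplied by the caller)
    """
    # The pole angle gives the resonance frequency.
    frequency_hz = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    # The pole radius gives the 3 dB bandwidth: closer to the unit circle
    # means a narrower resonance.
    bandwidth_hz = -math.log(abs(pole)) * fs / math.pi
    return frequency_hz, bandwidth_hz


# Hypothetical pole placed at 1 kHz with radius 0.95, sampled at 8 kHz.
example_pole = 0.95 * cmath.exp(1j * 2.0 * math.pi * 1000.0 / 8000.0)
freq, bw = pole_to_formant(example_pole, 8000.0)
```

Poles with a very wide bandwidth usually do not correspond to formants, which is one reason a tracker needs extra criteria such as the continuity constraint listed on the previous slide.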

FF vs. MFCC with and without energy component
Mono-phone recognition in train noise
• Better performance of FF in severe noisy conditions

Robustness of dynamic FF to noise
Mono-phone recognition in train noise
• Dynamic features are much more robust to noise

The use of formants for consonant recognition
Mono-phone recognition in train noise
• Higher recognition rates than vowels at higher SNR
• More sensitive to noise because of the lower energy level

De-noising the speech by filtering frequency trajectories

RelAtive SpecTrA (RASTA) Processing
• Filtering the frequency trajectories of the cubic root of the power spectrum using a fixed IIR filter
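Filtering one frequency-bin trajectory with a fixed IIR filter can be sketched as follows. The coefficients are the commonly cited RASTA band-pass values (numerator 0.1 * [2, 1, 0, -1, -2], single pole at 0.98) in causal form; the slides do not state which coefficients were used, so treat them, and the function names, as assumptions.

```python
def cube_root_compress(power_trajectory):
    """Cubic-root compression of a power-spectrum trajectory."""
    return [p ** (1.0 / 3.0) for p in power_trajectory]


def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one frequency-bin trajectory across frames.

    Implements y[n] = sum_k b[k] * x[n-k] + pole * y[n-1] with a fixed
    numerator; the coefficient values are an assumption, not taken
    from the slides.
    """
    numerator = [0.2, 0.1, 0.0, -0.1, -0.2]  # sums to zero: blocks DC
    filtered = []
    previous_output = 0.0
    for n in range(len(trajectory)):
        acc = 0.0
        for k, b in enumerate(numerator):
            if n - k >= 0:
                acc += b * trajectory[n - k]
        previous_output = acc + pole * previous_output
        filtered.append(previous_output)
    return filtered


# A constant trajectory (stationary component) is driven towards zero.
constant_bin = cube_root_compress([8.0] * 500)
filtered_bin = rasta_filter(constant_bin)
```

Because the filter is fixed, it treats every bin and every noise condition the same way, which is the limitation the later slides contrast with Kalman filtering.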

The use of FIR filters in RASTA
• Filtering the frequency trajectories of the power spectrum using a bank of non-causal FIR filters
• Not adaptive
• The filters' impulse responses are derived experimentally

Kalman Filtering
• The Kalman filter adaptively updates itself using the noise covariance

How Kalman Filter is applied to de-noising problem
[Block diagram. Labels: Segment, VAD, Noise Modelling and updating, Prior Noise Model and Trajectory Statistics, Noise Covariance, Mean, Spectral Subtraction, Observation, Frequency Bin Trajectory, Neighbour Trajectory, Predictor, Predicted, Error covariance, Kalman Gain, Kalman Filtering, Estimator, Output]

Advantages of Kalman
• A more informed noise reduction
• Combining the prediction and the observation of the frequency trajectory
• Adaptively updating the noise model while filtering the trajectory (in comparison with RASTA)
• Could (and probably should) be combined with spectral subtraction for improved performance
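Spectral subtraction, named in the last point, can be sketched as a per-bin power subtraction with a spectral floor. This is a generic textbook form; the over-subtraction factor and floor value are illustrative assumptions, and the slides do not say how the subtraction would be coupled to the Kalman filter.

```python
def spectral_subtract(noisy_power, noise_power,
                      over_subtraction=1.0, floor=0.01):
    """Subtract a noise power estimate from each frequency bin.

    The result is kept above floor * noisy_power so that no bin
    becomes negative. Both tuning values are assumed, not sourced.
    """
    return [
        max(p - over_subtraction * n, floor * p)
        for p, n in zip(noisy_power, noise_power)
    ]


# Two hypothetical bins: one dominated by speech, one by noise.
cleaned = spectral_subtract([4.0, 1.0], [1.0, 2.0])
```

In a combined scheme, the output of such a step could serve as the observation passed to the trajectory filter.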
