Advances in WP1

Nancy Meeting – 6-7 July 2006 Advances in WP1 www.loquendo.com

WP1: Environment & Sensor Robustness
T1.2 Noise Independence
Noise Reduction:
• Spectral Subtraction (YEAR 1) and Spectral Attenuation (YEAR 2)
  “Automatic Speech Recognition With a Modified Ephraim-Malah Rule”, Roberto Gemello, Franco Mana and Renato De Mori, IEEE Signal Processing Letters, Vol. 13, No. 1, January 2006
• Evaluation of HEQ for feature normalization (HEQ study + Revision 2)

Denoising Techniques for Y2 evaluations (1)
Spectral Attenuation (or spectral weighting) is a form of audio signal enhancement in which noise suppression can be viewed as the application of a suppression rule, or non-negative real-valued gain Gk, to each bin k of the observed signal magnitude spectrum, in order to form an estimate of the original signal magnitude spectrum.
Ephraim–Malah MMSE log estimator rule:
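As a rough illustration of a suppression rule, the sketch below evaluates the standard Ephraim-Malah log-MMSE gain, G = ξ/(1+ξ) · exp(½ · E1(v)) with v = ξ·γ/(1+ξ), where ξ is the a priori SNR, γ the a posteriori SNR and E1 the exponential integral. It then applies one gain per bin. The function names and the numeric inputs are illustrative assumptions, not values from the slides.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, eps=1e-12, max_iter=200):
    """E1(x) = integral from x to infinity of exp(-t)/t dt, for x > 0."""
    if x <= 0.0:
        raise ValueError("E1 is only evaluated here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0  # holds (-x)^k / k!
        for k in range(1, max_iter + 1):
            term *= -x / k
            delta = -term / k
            total += delta
            if abs(delta) < eps * abs(total):
                break
        return total
    # Continued fraction evaluated with the modified Lentz method (x > 1)
    b = x + 1.0
    c = 1.0 / 1e-300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def log_mmse_gain(xi, gamma):
    """Log-MMSE gain for a priori SNR xi and a posteriori SNR gamma (linear scale)."""
    v = xi / (1.0 + xi) * gamma
    return xi / (1.0 + xi) * math.exp(0.5 * exp_integral_e1(v))


def attenuate(magnitudes, gains):
    """Apply one non-negative gain per bin to the observed magnitude spectrum."""
    return [g * y for g, y in zip(gains, magnitudes)]


observed = [1.0, 0.8, 2.5]   # illustrative |Y_k| values (assumed)
xi = [0.1, 1.0, 10.0]        # illustrative a priori SNRs (assumed)
gamma = [1.2, 2.0, 11.0]     # illustrative a posteriori SNRs (assumed)
gains = [log_mmse_gain(x, g) for x, g in zip(xi, gamma)]
estimate = attenuate(observed, gains)
```

Bins with low SNR receive a small gain and bins with high SNR a gain close to 1, which is the behaviour a suppression rule is meant to provide.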

Denoising Techniques for Y2 evaluations (2)
Modified Ephraim–Malah MMSE log estimator rule:
We propose to make the estimation of the a priori and the a posteriori SNR dependent on the noise overestimation factor α(m) and the spectral floor β(m) as follows:
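The formulas of the modified rule are not reproduced in this transcript. The sketch below is therefore only an assumption about how a noise overestimation factor and a spectral floor could enter the SNR estimates. It scales the noise power by `alpha`, floors both SNRs at `beta`, and uses the standard decision-directed recursion for the a priori SNR. The function name, the smoothing constant `eta` and all numeric values are hypothetical.

```python
def modified_snr_estimates(y_power, noise_power, prev_clean_power,
                           alpha, beta, eta=0.98):
    """Return (xi, gamma) for one frequency bin of frame m.

    ASSUMPTION: alpha scales the noise estimate and beta floors both SNRs.
    This is an illustrative guess, not the rule shown on the slide.
    """
    scaled_noise = alpha * noise_power           # overestimated noise power
    gamma = max(y_power / scaled_noise, beta)    # a posteriori SNR, floored
    # Decision-directed a priori SNR: blend of the previous clean estimate
    # and the current instantaneous SNR.
    xi = (eta * prev_clean_power / scaled_noise
          + (1.0 - eta) * max(gamma - 1.0, 0.0))
    return max(xi, beta), gamma


# Hypothetical values for a single bin
xi, gamma = modified_snr_estimates(y_power=4.0, noise_power=1.0,
                                   prev_clean_power=2.0, alpha=2.0, beta=0.01)
```

In this sketch a larger `alpha` lowers both SNR estimates and so leads to stronger attenuation, while `beta` keeps the estimates from collapsing to zero in noise-only bins.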

Denoising Techniques for Y2 evaluations (3)
The noise spectrum amplitude is obtained by a first-order recursion in conjunction with an energy-based Voice Activity Detector (VAD) as follows:
Where: one parameter controls the update speed of the recursion (0.9), a second controls the allowed dynamics of noise (4.0), and the noise standard deviation at frame m is estimated as:
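Since the recursion itself is not preserved here, the following is a minimal sketch of one possible reading. The noise amplitude of a bin is updated by a first-order recursion only when the VAD reports non-speech and the frame does not exceed the current estimate by more than a multiple of the noise standard deviation. The values 0.9 and 4.0 come from the slide; the gating rule, the variance recursion, the VAD threshold and all names are assumptions.

```python
import math


def energy_vad(frame_energy, noise_energy, threshold_db=3.0):
    """Hypothetical energy-based VAD: speech if the frame is threshold_db above the noise."""
    return 10.0 * math.log10(frame_energy / noise_energy) > threshold_db


def update_noise(noise_amp, noise_var, frame_amp, is_speech,
                 update=0.9, dynamics=4.0):
    """One recursion step for a single bin; returns (noise_amp, noise_var).

    noise_var should be initialised from leading non-speech frames; with a
    zero variance this assumed gating rule would block every upward update.
    """
    sigma = math.sqrt(noise_var)
    deviation = frame_amp - noise_amp
    # Freeze the estimate during speech, or when the frame jumps too far
    # above the current noise level (ASSUMED gating rule).
    if is_speech or deviation > dynamics * sigma:
        return noise_amp, noise_var
    new_amp = update * noise_amp + (1.0 - update) * frame_amp
    new_var = update * noise_var + (1.0 - update) * deviation ** 2
    return new_amp, new_var


# Hypothetical single-bin example: a non-speech frame slightly above the noise
speech = energy_vad(frame_energy=1.2, noise_energy=1.0)
amp, var = update_noise(noise_amp=1.0, noise_var=0.25, frame_amp=1.5,
                        is_speech=speech)
```

An update factor of 0.9 means each accepted frame contributes 10% to the new estimate, so the noise estimate adapts over roughly ten non-speech frames.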

Baseline evaluations of Loquendo ASR on Aurora2 speech databases

Year 1+2 Performance evaluations
The testing conditions used in the experiments are the following:
1) No Denoising (ND): RASTA-PLP features (RPLP) are used without any preliminary noise reduction.
2) Wiener modified (WM): RPLP with Wiener filtering dependent on global SNR.
3) Ephraim-Malah modified (EMM): RPLP with noise reduction based on the modified Ephraim-Malah spectral attenuation rule.

HEQ + Denoising techniques

HEQ Evaluation: Revision 1 (1) (Loquendo & UGR)
[Diagram labels: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients); HEQ (121)]
Problems:
(1) Context dependency (whole-utterance CDF estimation works best)
(2) High variability in background noise segments
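For readers unfamiliar with histogram equalization (HEQ), the sketch below shows the basic idea for a single coefficient track: each value is replaced by the reference-distribution quantile that matches its empirical CDF position. The Gaussian reference, the rank-based CDF estimate and the function name are assumptions; the slides do not state which reference distribution or estimator was used.

```python
from statistics import NormalDist


def heq(values, reference=NormalDist(0.0, 1.0)):
    """Map values so that their empirical CDF matches the reference distribution."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n            # empirical CDF position, strictly in (0, 1)
        out[i] = reference.inv_cdf(p)   # matching quantile of the reference
    return out


# Hypothetical track of one cepstral coefficient over a short window
track = [0.2, -1.0, 3.5, 0.9]
equalized = heq(track)

# A gain and offset change leaves the result unchanged, because only ranks matter
shifted = heq([2.0 * x + 7.0 for x in track])
```

Because the mapping depends only on the ranks inside the estimation window, a short or unrepresentative window gives an unreliable CDF estimate, which is consistent with the context-dependency problem listed above.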

HEQ Integration: Revision 1 (2) (Loquendo & UGR)
[Diagram blocks: Loquendo FE, UGR HEQ, Loquendo ASR. Labels: Denoise (Power Spectrum level), Feature Normalization (Frame level, 39 coefficients), Phoneme-based Models]

HEQ Evaluation: Revision 2 (3) (Loquendo & UGR)
[Diagram labels: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients); HEQ (1573) on each of the three groups]
Benefits:
(1) Relations in magnitude and dynamics among coefficients are preserved
(2) More stable CDF estimation, similar to extending the HEQ temporal window
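Note that 1573 = 13 × 121, which would be consistent with estimating one CDF from all 13 coefficients of a group pooled together; this reading is an assumption, not something the slide states. Under that assumption, a pooled variant of HEQ could be sketched as follows, with a Gaussian reference and hypothetical names and data.

```python
from statistics import NormalDist


def pooled_heq(frames, reference=NormalDist(0.0, 1.0)):
    """Equalize a group of coefficients with a single CDF shared by all of them.

    frames is a list of frames, each a list of coefficients of one group.
    """
    pooled = sorted(v for frame in frames for v in frame)
    n = len(pooled)
    # With tied values the last matching rank wins; adequate for a sketch.
    mapping = {v: reference.inv_cdf((rank + 0.5) / n)
               for rank, v in enumerate(pooled)}
    return [[mapping[v] for v in frame] for frame in frames]


# Hypothetical group with two coefficients over three frames
frames = [[1.0, 5.0], [2.0, 6.0], [0.5, 7.0]]
equalized = pooled_heq(frames)
```

Because every coefficient passes through the same monotonic mapping, the ordering between coefficients survives, and the CDF is estimated from more samples than a per-coefficient estimate over the same frames.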

HEQ Evaluation: Revision 2 (4) (Loquendo & UGR)

HEQ for denoising (5) (Loquendo & UGR)
Comparing RPLP / HEQrev1 / HEQrev2 using the same clean and noisy signal

HEQ for signal level equalization (6) (Loquendo & UGR)
Comparing RPLP / HEQrev1 / HEQrev2 using the same clean signal at normal gain level and at low gain level

WP1: Workplan
• Selection of suitable benchmark databases (m6)
• Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
• Discriminative VAD (training + AURORA3 testing) (m16)
• Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
• Preliminary results on spectral subtraction and HEQ techniques (m24)
• Integration of denoising and normalization techniques (m33)
• Noise estimation and reduction for non-stationary noises (m33)
