Digital Audio Signal Processing Lecture-4: Noise Reduction

Marc Moonen/Alexander Bertrand, Dept. E.E./ESAT-STADIUS, KU Leuven, marc.moonen@esat.kuleuven.be, homes.esat.kuleuven.be/~moonen/

Overview
• Spectral subtraction for single-microphone noise reduction
  • Single-microphone noise reduction problem
  • Spectral subtraction basics (= spectral filtering)
  • Features: gain functions, implementation, musical noise, …
  • Iterative Wiener filter based on speech signal model
• Multi-channel Wiener filter for multi-microphone noise reduction
  • Multi-microphone noise reduction problem
  • Multi-channel Wiener filter (= spectral + spatial filtering)
• Kalman filter based noise reduction
  • Kalman filters
  • Kalman filters for noise reduction

Single-Microphone Noise Reduction Problem
• Microphone signal is $y[k] = s[k] + n[k]$, with $s[k]$ the desired signal contribution and $n[k]$ the noise contribution
• Goal: Estimate $s[k]$ based on $y[k]$
• Applications: speech enhancement in conferencing, hands-free telephony, hearing aids, …; digital audio restoration
• Will consider speech applications: $s[k]$ = speech signal
[Figure: the desired signal s[k] and the noise signal(s) are picked up by the microphone as y[k]; a noise reduction block (?) produces the desired signal estimate.]

Spectral Subtraction Methods: Basics
• Signal is chopped into `frames’ (e.g. 10..20 msec); for each frame a frequency domain representation is $Y_i(\omega) = S_i(\omega) + N_i(\omega)$ (i-th frame)
• However, the speech signal is an on/off signal, hence some frames have speech + noise, i.e. $Y_i(\omega) = S_i(\omega) + N_i(\omega)$, and some frames have noise only, i.e. $Y_i(\omega) = N_i(\omega)$
• A speech detection algorithm is needed to distinguish between these 2 types of frames (based on energy/dynamic range/statistical properties, …)

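Spectral Subtraction Methods: Basics
• Definition: $\mu(\omega)$ = average amplitude of noise spectrum, i.e. $\mu(\omega) = E\{|N_i(\omega)|\}$
• Assumption: noise characteristics change slowly, hence estimate $\mu(\omega)$ by (long-time) averaging over (M) noise-only frames: $\hat{\mu}(\omega) = \frac{1}{M} \sum_{i \in \text{noise-only frames}} |Y_i(\omega)|$
• Estimate clean speech spectrum $S_i(\omega)$ (for each frame), using corrupted speech spectrum $Y_i(\omega)$ (for each frame, i.e. short-time estimate) + estimated $\hat{\mu}(\omega)$: $\hat{S}_i(\omega) = G_i(\omega) Y_i(\omega)$, based on `gain function’ $G_i(\omega)$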

Spectral Subtraction: Gain Functions
• Magnitude Subtraction
• Spectral Subtraction
• Wiener Estimation
• Maximum Likelihood
• Non-linear Estimation
• Ephraim-Malah = most frequently used in practice (see next slide)
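The gain formulas themselves are not reproduced on this slide. As a reminder, the textbook forms of the first four rules are sketched below; the notation is an assumption: $\hat{\mu}(\omega)$ is the estimated average noise amplitude and $Y_i(\omega)$ the noisy spectrum of frame i. The non-linear and Ephraim-Malah rules are left out because their exact form cannot be inferred from the slide.

```latex
\begin{align*}
\text{Magnitude subtraction:} \quad & G_i(\omega) = 1 - \frac{\hat{\mu}(\omega)}{|Y_i(\omega)|} \\
\text{Spectral (power) subtraction:} \quad & G_i(\omega) = \sqrt{1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}} \\
\text{Wiener estimation:} \quad & G_i(\omega) = 1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2} \\
\text{Maximum likelihood:} \quad & G_i(\omega) = \frac{1}{2}\left(1 + \sqrt{1 - \frac{\hat{\mu}^2(\omega)}{|Y_i(\omega)|^2}}\right)
\end{align*}
```

In all four cases the estimate is $\hat{S}_i(\omega) = G_i(\omega) Y_i(\omega)$, so the noisy phase is kept and only the magnitude is modified.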

Spectral Subtraction: Gain Functions
• Example 1: Ephraim-Malah Suppression Rule (EMSR); the gain formula involves modified Bessel functions (skip formulas)
• This corresponds to an MMSE (*) estimation of the speech spectral amplitude $|S_i(\omega)|$ based on observation $Y_i(\omega)$ (estimate equal to $E\{\,|S_i(\omega)| \mid Y_i(\omega)\,\}$), assuming Gaussian a priori distributions for $S_i(\omega)$ and $N_i(\omega)$ [Ephraim & Malah 1984].
• Similar formula for MMSE log-spectral amplitude estimation [Ephraim & Malah 1985].
(*) minimum mean squared error

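Spectral Subtraction: Gain Functions
• Example 2: Magnitude Subtraction
• Signal model: $Y_i(\omega) = S_i(\omega) + N_i(\omega) = |Y_i(\omega)| e^{j\theta_{y,i}(\omega)}$
• Estimation of clean speech spectrum: $\hat{S}_i(\omega) = \left[ |Y_i(\omega)| - \hat{\mu}(\omega) \right] e^{j\theta_{y,i}(\omega)} = G_i(\omega) Y_i(\omega)$ with $G_i(\omega) = 1 - \hat{\mu}(\omega)/|Y_i(\omega)|$, where $\hat{\mu}(\omega)$ is the estimated average noise amplitude
• PS: half-wave rectification, i.e. $G_i(\omega) = \max\left(0,\, 1 - \hat{\mu}(\omega)/|Y_i(\omega)|\right)$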

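Spectral Subtraction: Gain Functions
• Example 3: Wiener Estimation
• Linear MMSE estimation: find linear filter $G_i(\omega)$ to minimize the MSE $E\{ |S_i(\omega) - G_i(\omega) Y_i(\omega)|^2 \}$
• Solution: $G_i(\omega) = P_{sy,i}(\omega) / P_{yy,i}(\omega)$, with $P_{sy,i}(\omega) = E\{S_i(\omega) Y_i^*(\omega)\}$ the cross-correlation in the i-th frame and $P_{yy,i}(\omega) = E\{|Y_i(\omega)|^2\}$ the auto-correlation in the i-th frame. Assume speech s[k] and noise n[k] are uncorrelated, then $P_{sy,i} = P_{ss,i} = P_{yy,i} - P_{nn,i}$, hence $G_i(\omega) = 1 - P_{nn,i}(\omega)/P_{yy,i}(\omega)$
• PS: half-wave rectification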

Spectral Subtraction: Implementation
• Short-time Fourier Transform (= uniform DFT-modulated analysis filter bank): $Y[n,i]$ = estimate for $Y(\omega_n)$ at time i (i-th frame)
  N = number of frequency bins (channels), n = 0..N-1
  M = downsampling factor
  K = frame length
  h[k] = length-K analysis window (= prototype filter)
• Frames with 50%...66% overlap (i.e. 2-, 3-fold oversampling, N=2M..3M)
• Subband processing: the gain function is applied to Y[n,i] in each frequency bin
• Synthesis bank: matched to analysis bank (see DSP-CIS)
• Signal flow: y[k] → short-time analysis → Y[n,i] → gain functions → short-time synthesis
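The analysis / gain / synthesis chain above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the course implementation: the function names, the square-root Hann window, the 50% overlap and the boolean `noise_only` frame mask (which would come from a speech detector) are all choices made for the example. The gain used is magnitude subtraction with half-wave rectification.

```python
import numpy as np

def stft(y, frame_len=256, hop=128):
    """Short-time analysis with a periodic sqrt-Hann window (assumed choice)."""
    window = np.sqrt(np.hanning(frame_len + 1)[:-1])
    n_frames = 1 + (len(y) - frame_len) // hop
    frames = np.stack([y[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1), window

def istft(Y, window, hop, length):
    """Short-time synthesis matched to stft() (weighted overlap-add)."""
    frame_len = len(window)
    out = np.zeros(length)
    frames = np.fft.irfft(Y, n=frame_len, axis=1) * window
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out

def magnitude_subtraction(y, noise_only, frame_len=256, hop=128):
    """noise_only: boolean mask over frames, True where a (hypothetical)
    speech detector has flagged the frame as noise-only."""
    Y, window = stft(y, frame_len, hop)
    # Long-time average of the noise amplitude over the noise-only frames
    mu = np.abs(Y[noise_only]).mean(axis=0)
    # Magnitude-subtraction gain with half-wave rectification
    G = np.maximum(1.0 - mu / np.maximum(np.abs(Y), 1e-12), 0.0)
    return istft(G * Y, window, hop, len(y)), G
```

With a square-root Hann window used for both analysis and synthesis and 50% overlap, the chain reconstructs the input exactly (away from the signal edges) when all gains are 1, so any change in the output comes from the gain function alone.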

Spectral Subtraction: Musical Noise
• Audio demo: car noise (magnitude subtraction)
• Artifact: musical noise. What? Short-time estimates of $|Y_i(\omega)|$ fluctuate randomly in noise-only frames, resulting in random gains $G_i(\omega)$
• Statistical analysis shows that broadband noise is transformed into a signal composed of short-lived tones with randomly distributed frequencies (= musical noise)

Spectral Subtraction: Musical Noise
• Solutions?
  • Magnitude averaging: replace the instantaneous $Y_i(\omega)$ in the calculation of $G_i(\omega)$ by a local average over frames
  • EMSR (p.7)
  • Augment $G_i(\omega)$ with soft-decision VAD: replace $G_i(\omega)$ by $P(H_1 \mid Y_i(\omega)) \cdot G_i(\omega)$, where $P(H_1 \mid Y_i(\omega))$ is the probability that speech is present, given the observation
  • …

Spectral Subtraction: Iterative Wiener Filter
Example of signal model-based spectral subtraction…
• Basic: Wiener filtering based spectral subtraction (p.9), with (improved) spectra estimation based on parametric models
• Procedure:
  1. Estimate parameters of a speech model from noisy signal y[k]
  2. Using estimated speech parameters, perform noise reduction (e.g. Wiener estimation, p.9)
  3. Re-estimate parameters of speech model from the speech signal estimate
  4. Iterate 2 & 3

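Spectral Subtraction: Iterative Wiener Filter
[Figure: speech production model. A pulse train with a given pitch period (voiced) or a white noise generator (unvoiced) delivers the excitation u[k], which is scaled by a gain and passed through an all-pole filter to produce the speech signal.]
• Time domain: $s[k] = \sum_{m=1}^{M} \alpha_m s[k-m] + g\,u[k]$
• Frequency domain: $S(\omega) = \dfrac{g}{1 - \sum_{m=1}^{M} \alpha_m e^{-j\omega m}}\, U(\omega)$
• $\alpha_m$ = linear prediction parameters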

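Spectral Subtraction: Iterative Wiener Filter
For each frame (vector) y[m] (i = iteration nr.):
  1. Estimate the speech model parameters (linear prediction parameters and gain)
  2. Construct Wiener filter (p.9), with the speech spectrum obtained from the estimated speech model and the noise spectrum estimated during noise-only periods
  3. Filter speech frame y[m]
Repeat until some error criterion is satisfied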
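The per-frame loop can be sketched as follows. This is an illustrative sketch under several assumptions that are not on the slide: the speech model is an all-pole (linear prediction) model fitted with the autocorrelation method, the noise power spectrum `Pnn` is given in per-sample units (white noise of variance σ² gives `Pnn` = σ² in every bin), the filter is applied by plain FFT multiplication (circular convolution), and a fixed iteration count stands in for the error criterion. The function names are hypothetical.

```python
import numpy as np

def lpc(frame, order):
    """Linear prediction parameters and squared gain (autocorrelation method)."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order] / n
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += (1e-9 * r[0] + 1e-12) * np.eye(order)   # small regularisation
    alpha = np.linalg.solve(R, r[1:order + 1])
    g2 = max(r[0] - alpha @ r[1:order + 1], 1e-12)
    return alpha, g2

def iterative_wiener(frame, Pnn, order=10, n_iter=3):
    """Pnn: noise power spectrum on the rfft grid of the frame
    (length len(frame)//2 + 1), estimated during noise-only periods."""
    nfft = len(frame)
    Y = np.fft.rfft(frame, nfft)
    s_hat = frame.copy()                 # first pass: model from noisy frame
    for _ in range(n_iter):
        # Step 1: (re-)estimate the speech model parameters
        alpha, g2 = lpc(s_hat, order)
        # Step 2: speech spectrum from the all-pole model, then Wiener gain
        A = np.fft.rfft(np.concatenate(([1.0], -alpha)), nfft)
        Pss = g2 / np.abs(A) ** 2
        G = Pss / (Pss + Pnn)
        # Step 3: filter the noisy frame
        s_hat = np.fft.irfft(G * Y, nfft)
    return s_hat, G
```

The first model is fitted to the noisy frame, so the initial speech spectrum still contains noise; each pass refits the model to the current speech estimate, which is the re-estimation step of the procedure.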

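Multi-Microphone Noise Reduction Problem
[Figure: a speech source and noise source(s) are picked up by several microphones; each microphone signal consists of a speech part and a noise part, and a noise reduction block (?) produces (some) speech estimate.]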

Multi-Microphone Noise Reduction Problem
• Microphone signals: $y_m[k] = s_m[k] + n_m[k]$, m = 1..M (speech part + noise part)
• Will estimate speech part in microphone 1, i.e. $s_1[k]$ (*) (**)
(*) Estimating s[k] is more difficult, would include dereverberation...
(**) This is similar to the single-microphone model (p.3), where additional microphones (m=2..M) help to get a better estimate

Multi-Microphone Noise Reduction Problem
• Data model: $Y_m(\omega) = H_m(\omega) S(\omega) + N_m(\omega)$, m = 1..M
  See Lecture-2 on multi-path propagation, with q left out for conciseness. $H_m(\omega)$ is the complete transfer function from the speech source position to the m-th microphone

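Multi-Channel Wiener Filter (MWF)
• Data model: $\mathbf{y}(\omega) = \mathbf{s}(\omega) + \mathbf{n}(\omega)$, where the vectors stack the M microphone signals $Y_m(\omega)$, their speech parts $S_m(\omega)$ and their noise parts $N_m(\omega)$
• Will use linear filters to obtain speech estimate (as in Lecture-2): $\hat{S}_1(\omega) = \mathbf{w}^H(\omega)\, \mathbf{y}(\omega)$
• Wiener filter (= linear MMSE approach): $\min_{\mathbf{w}} E\{ |S_1(\omega) - \mathbf{w}^H(\omega)\, \mathbf{y}(\omega)|^2 \}$
  Note that (unlike in DSP-CIS) the `desired response’ signal $S_1(\omega)$ is unknown here (!), hence solution will be `unusual’…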

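Multi-Channel Wiener Filter (MWF)
• Wiener filter solution is (see DSP-CIS) $\mathbf{w} = R_{yy}^{-1}\, E\{\mathbf{y} S_1^*\} = R_{yy}^{-1} (R_{yy} - R_{nn})\, \mathbf{e}_1$ (speech and noise assumed uncorrelated), with
  $R_{yy} = E\{\mathbf{y}\mathbf{y}^H\}$: compute during speech+noise periods
  $R_{nn} = E\{\mathbf{n}\mathbf{n}^H\}$: compute during noise-only periods
  $\mathbf{e}_1 = [1\ 0\ \dots\ 0]^T$
• All quantities can be computed!
• Special case of this is the single-channel Wiener filter formula on p.9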
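A minimal per-frequency sketch of this computation is given below, assuming the standard MWF form w = R_yy^{-1} (R_yy - R_nn) e_1 with sample correlation matrices; the slide's own formula is not reproduced here, so treat this form as an assumption. The function name `mwf_weights` and the input layout (microphones along rows, frames along columns, one frequency bin at a time) are choices made for the example, and the split into speech+noise and noise-only frames is assumed to come from a voice activity detector.

```python
import numpy as np

def mwf_weights(Y_sn, Y_n, ref=0):
    """MWF weights for one frequency bin.

    Y_sn: (M, T1) STFT coefficients of the M microphones in speech+noise frames
    Y_n:  (M, T2) STFT coefficients of the M microphones in noise-only frames
    ref:  index of the reference microphone (0 = microphone 1)
    """
    Ryy = Y_sn @ Y_sn.conj().T / Y_sn.shape[1]   # speech+noise correlation
    Rnn = Y_n @ Y_n.conj().T / Y_n.shape[1]      # noise-only correlation
    e = np.zeros(Ryy.shape[0])
    e[ref] = 1.0
    # E{y S_ref^*} = (Ryy - Rnn) e when speech and noise are uncorrelated
    return np.linalg.solve(Ryy, (Ryy - Rnn) @ e)

# Usage: estimate of the speech part in the reference microphone,
# one value per frame:  s1_hat = mwf_weights(Y_sn, Y_n).conj() @ Y_sn
```

Note that no steering vector or noise field model enters the computation; only the two sample correlation matrices are needed.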

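Multi-Channel Wiener Filter (MWF)
• MWF combines spatial filtering (as in Lecture-2) with single-channel spectral filtering (as in single-channel noise reduction): if the single-source data model holds, i.e. the speech parts are $S_m(\omega) = H_m(\omega) S(\omega)$, then…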

Multi-Channel Wiener Filter (MWF)
• …then it can be shown that the MWF contains a first factor that represents a spatial filtering (*)
  Compare to superdirective & delay-and-sum beamforming (Lecture-2):
  • Delay-and-sum beamformer maximizes array gain in white noise field
  • Superdirective beamformer maximizes array gain in diffuse noise field
  • MWF maximizes array gain in unknown (!) noise field. MWF is operated without invoking any prior knowledge (steering vector/noise field)! (The secret is in the voice activity detection… (explain))
(*) Note that spatial filtering can improve SNR, spectral filtering never improves SNR (at one frequency)

Multi-Channel Wiener Filter (MWF)
• …then it can be shown that (continued) the MWF contains a second factor that represents an additional `spectral post-filter’, i.e. a single-channel Wiener estimate (p.9), applied to the output signal of the spatial filter (prove it!)
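One way to carry out the proof is sketched below, under assumptions that are not spelled out on the slide: a single speech source with $\mathbf{y} = \mathbf{h} S + \mathbf{n}$, $\mathbf{h} = [H_1 \dots H_M]^T$, $\sigma_s^2 = E\{|S|^2\}$, $R_{nn} = E\{\mathbf{n}\mathbf{n}^H\}$, speech and noise uncorrelated, and the estimate $\hat{S}_1 = \mathbf{w}^H \mathbf{y}$ of $S_1 = H_1 S$. The frequency argument is dropped for brevity.

```latex
\begin{align*}
R_{yy} &= \sigma_s^2\, \mathbf{h}\mathbf{h}^H + R_{nn}, \qquad
E\{\mathbf{y} S_1^*\} = \sigma_s^2 H_1^*\, \mathbf{h} \\
\mathbf{w} &= R_{yy}^{-1}\, \sigma_s^2 H_1^*\, \mathbf{h} \\
R_{yy}^{-1}\mathbf{h} &= \frac{R_{nn}^{-1}\mathbf{h}}{1 + \sigma_s^2 \rho},
\qquad \rho = \mathbf{h}^H R_{nn}^{-1} \mathbf{h}
\quad \text{(matrix inversion lemma)} \\
\mathbf{w} &= \underbrace{\frac{R_{nn}^{-1}\mathbf{h}}{\mathbf{h}^H R_{nn}^{-1}\mathbf{h}}}_{\text{spatial filter}}
\cdot \underbrace{\frac{\sigma_s^2}{\sigma_s^2 + 1/\rho}}_{\text{spectral post-filter}}
\cdot H_1^*
\end{align*}
```

The first factor passes $S$ undistorted and leaves noise power $1/\rho$ at its output, so the second factor is exactly the single-channel Wiener gain (speech power over speech-plus-noise power) for that output signal; the factor $H_1^*$ rescales the result to the speech part in microphone 1.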

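Multi-Channel Wiener Filter: Implementation
• Implementation with short-time Fourier transform: see p.10
• Implementation with time-domain linear filtering: the speech estimate is a linear combination of current and past microphone samples, weighted by the filter coefficients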

Multi-Channel Wiener Filter: Implementation
• Implementation with time-domain linear filtering: Solution is… (with one correlation matrix to compute during speech+noise periods and one to compute during noise-only periods)

Kalman Filter 1/12
State space model of a time-varying discrete-time system:
  $x[k+1] = A[k]\,x[k] + B[k]\,u[k] + v[k]$ (v[k] = process noise)
  $y[k] = C[k]\,x[k] + D[k]\,u[k] + w[k]$ (w[k] = measurement noise)
with v[k] and w[k]: mutually uncorrelated, zero mean, white noises, with covariance matrices V[k] and W[k]
Then: given A[k], B[k], C[k], D[k], V[k], W[k] and input/output-observations u[k], y[k], k=0,1,2,..., the Kalman filter produces MMSE estimates of internal states x[k], k=0,1,...
PS: will use shorthand notation here, i.e. $x_k$, $y_k$, .. instead of x[k], y[k], ..

Kalman Filter 2/12
Definition: $\hat{x}_{k|l}$ = MMSE-estimate of $x_k$ using all available data up until time l
• `FILTERING’ = estimate $\hat{x}_{k|k}$
• `PREDICTION’ = estimate $\hat{x}_{k|k-n}$, n > 0
• `SMOOTHING’ = estimate $\hat{x}_{k|k+n}$, n > 0

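Kalman Filter 3/12
‘Conventional Kalman Filter’ operation @ time k (k=0,1,2,..)
Initialization: $\hat{x}_{0|-1}$ and $P_{0|-1}$
Given $\hat{x}_{k|k-1}$ and corresponding error covariance matrix $P_{k|k-1}$, compute $\hat{x}_{k+1|k}$ and $P_{k+1|k}$ using $u_k$, $y_k$:
Step 1: Measurement Update (produces ‘filtered’ estimate) (compare to standard RLS!)
  $K_k = P_{k|k-1} C_k^T \left( C_k P_{k|k-1} C_k^T + W_k \right)^{-1}$
  $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left( y_k - C_k \hat{x}_{k|k-1} - D_k u_k \right)$
  $P_{k|k} = P_{k|k-1} - K_k C_k P_{k|k-1}$
Step 2: Time Update (produces ‘1-step prediction’)
  $\hat{x}_{k+1|k} = A_k \hat{x}_{k|k} + B_k u_k$
  $P_{k+1|k} = A_k P_{k|k} A_k^T + V_k$
($P$ = error covariance matrix)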
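The two steps map directly onto a few lines of NumPy. The sketch below uses the standard gain form of the conventional Kalman filter for real-valued signals; the function name `kalman_step` and the argument order are choices made for this example, and V, W are assumed to be the covariance matrices of the process and measurement noise.

```python
import numpy as np

def kalman_step(x_pred, P_pred, u, y, A, B, C, D, V, W):
    """One conventional Kalman filter recursion at time k.

    x_pred, P_pred: predicted state and error covariance (given data up to k-1)
    Returns the filtered pair (time k) and the predicted pair (time k+1).
    """
    # Step 1: measurement update (produces the filtered estimate)
    S = C @ P_pred @ C.T + W                 # innovation covariance
    K = np.linalg.solve(S, C @ P_pred).T     # gain K = P C^T S^{-1}
    x_filt = x_pred + K @ (y - C @ x_pred - D @ u)
    P_filt = P_pred - K @ C @ P_pred
    # Step 2: time update (produces the 1-step prediction)
    x_next = A @ x_filt + B @ u
    P_next = A @ P_filt @ A.T + V
    return x_filt, P_filt, x_next, P_next
```

For a constant scalar state (A = 1, V = 0) observed in unit-variance noise, repeated calls reduce to a recursive average of the measurements, which is the simplest instance of the RLS connection mentioned in Step 1.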

Kalman Filter 4/12
PS: ‘Standard RLS’ is a special case of ‘Conventional KF’: the internal state vector is the FIR filter coefficients vector, which is assumed to be time-invariant

Kalman Filter 5/12
State estimation @ time k corresponds to an overdetermined set of linear equations, where the vector of unknowns contains all previous state vectors.

Kalman Filter 6/12
State estimation @ time k corresponds to an overdetermined set of linear equations, where the vector of unknowns contains all previous state vectors. Technical detail…

Kalman Filter 7/12
State estimation @ time k corresponds to an overdetermined set of linear equations, where the vector of unknowns contains all previous state vectors.

Kalman Filter 8/12
‘Square-Root’ Kalman Filter
• Propagated from time k-1 to time k
• ..hence requires only lower-right/lower part

Kalman Filter 9/12
‘Square-Root’ Kalman Filter
• Propagated from time k-1 to time k

Kalman Filter 10/12
‘Square-Root’ Kalman Filter
• Propagated from time k-1 to time k
• Propagated from time k to time k+1

Kalman Filter 11/12 QRD- =QRD-

Kalman Filter 12/12 can be derived from square-root KF equations is . : can be worked into measurement update eq. : can be worked into state update eq. [details omitted]

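Kalman filter for Speech Enhancement
• Assume AR model of speech and noise: $s[k] = \sum_{m} \alpha_m s[k-m] + g_s u[k]$ and $n[k] = \sum_{m} \beta_m n[k-m] + g_n w[k]$, with u[k], w[k] = zero mean, unit variance, white noise
• Equivalent state-space model is… (y = microphone signal, $y[k] = s[k] + n[k]$)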

Kalman filter for Speech Enhancement with:
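The state-space matrices are not reproduced above. One common construction, given here as an assumption rather than as the slide's own definition, stacks the recent speech and noise samples in the state vector and uses a companion matrix for each AR model; the microphone signal is then the sum of the newest speech and noise samples. The sketch below builds that model and runs a conventional Kalman filter on it, with the AR parameters assumed known. All function and parameter names (`alpha`, `g_s`, `beta`, `g_n`) are hypothetical.

```python
import numpy as np

def ar_companion(coeffs):
    """Companion matrix for state [x[k-p+1], ..., x[k]] of an AR(p) model
    x[k] = sum_m coeffs[m-1] * x[k-m] + excitation."""
    coeffs = np.asarray(coeffs, dtype=float)
    p = len(coeffs)
    A = np.zeros((p, p))
    A[:-1, 1:] = np.eye(p - 1)     # shift older samples up
    A[-1, :] = coeffs[::-1]        # newest sample from the AR recursion
    return A

def speech_noise_model(alpha, g_s, beta, g_n):
    """State = [speech samples, noise samples]; y[k] = s[k] + n[k]."""
    Ns, Nn = len(alpha), len(beta)
    A = np.block([[ar_companion(alpha), np.zeros((Ns, Nn))],
                  [np.zeros((Nn, Ns)), ar_companion(beta)]])
    V = np.zeros((Ns + Nn, Ns + Nn))   # process noise covariance
    V[Ns - 1, Ns - 1] = g_s ** 2       # unit-variance u[k] scaled by g_s
    V[-1, -1] = g_n ** 2               # unit-variance w[k] scaled by g_n
    C = np.zeros((1, Ns + Nn))
    C[0, Ns - 1] = 1.0                 # picks s[k]
    C[0, -1] = 1.0                     # picks n[k]
    return A, V, C

def kalman_enhance(y, alpha, g_s, beta, g_n):
    """Filtered speech estimate, assuming the AR parameters are known."""
    A, V, C = speech_noise_model(alpha, g_s, beta, g_n)
    Ns = len(alpha)
    x = np.zeros(A.shape[0])
    P = 100.0 * np.eye(A.shape[0])     # vague initial covariance (assumed)
    s_hat = np.empty(len(y))
    for k, yk in enumerate(y):
        # Measurement update; the tiny constant keeps the division well-posed
        S = (C @ P @ C.T).item() + 1e-9
        K = (P @ C.T).ravel() / S
        x = x + K * (yk - (C @ x).item())
        P = P - np.outer(K, C @ P)
        s_hat[k] = x[Ns - 1]
        # Time update
        x = A @ x
        P = A @ P @ A.T + V
    return s_hat
```

Because the noise is part of the state, the measurement equation itself is noise-free in this construction. In practice the AR parameters are unknown and have to be estimated from the noisy signal.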

Kalman filter for Speech Enhancement
Iterative algorithm
[Figure: y[k] → split signal in frames → estimate parameters → Kalman filter → reconstruct signal, with iterations between parameter estimation and Kalman filter.]
• Disadvantages of iterative approach:
  • complexity
  • delay

Kalman filter for Speech Enhancement
Sequential algorithm (time index only, no iteration index, i.e. no iterations)
[Figure: a state estimator (Kalman filter) and a parameters estimator (Kalman filter) operate side by side, coupled through delay elements D.]

CONCLUSIONS
• Single-channel noise reduction
  • Basic system is spectral subtraction
  • Only spectral filtering, hence can only exploit differences in spectra between noise and speech signal:
    • noise reduction at expense of speech distortion
    • achievable noise reduction may be limited
• Multi-channel noise reduction
  • Basic system is MWF
  • Provides spectral + spatial filtering (links with beamforming!)
• Iterative Wiener filter & Kalman filtering
  • Signal model based approach
