
Minimum Mean Squared Error Time Series Classification Using an Echo State Network Prediction Model


Presentation Transcript


  1. Minimum Mean Squared Error Time Series Classification Using an Echo State Network Prediction Model Mark Skowronski and John Harris Computational Neuro-Engineering Lab University of Florida

  2. Automatic Speech Recognition Using an Echo State Network Mark Skowronski and John Harris Computational Neuro-Engineering Lab University of Florida

  3. Transformation of a graduate student [photos: 2000 vs. 2006]

  4. Motivation: Man vs. Machine • Wall Street Journal/Broadcast News readings, 5000 words • Untrained human listeners vs. Cambridge HTK LVCSR system

  5. Overview • Why is ASR so poor? • Hidden Markov Model (HMM) • Echo state network (ESN) • ESN applied to speech • Conclusions

  6. ASR State of the Art [figure: filter bank over frequency yielding coefficients m1–m6] • Feature extraction: MFCC vs. HFCC* • Acoustic pattern rec: HMM • Language models *Skowronski & Harris, JASA (3):1774–1780, 2004.

  7. Hidden Markov Model The premier stochastic model of non-stationary time series used for decision making. Assumptions: 1) Speech is a piecewise-stationary process. 2) Feature observations are independent. 3) State durations are exponentially distributed. 4) State transition probabilities depend only on the previous and next states (first-order Markov).
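
A minimal sketch of what these assumptions buy computationally: the standard forward algorithm, which scores an utterance against a word model. This is a textbook construction, not code from the talk; the function and argument names are illustrative.

    import numpy as np

    def forward_log_likelihood(obs, log_pi, log_A, emission_logpdf):
        # obs: (T, D) feature frames; log_pi: (S,) initial state log-probs;
        # log_A: (S, S) log transition matrix; emission_logpdf(s, x): log
        # emission density of frame x in state s. Returns log p(obs | model).
        S = len(log_pi)
        alpha = log_pi + np.array([emission_logpdf(s, obs[0]) for s in range(S)])
        for x in obs[1:]:
            # Assumption 4: transitions depend only on the previous state.
            # Assumption 2: emissions are independent given the state, so the
            # emission term simply adds, frame by frame.
            alpha = np.array([np.logaddexp.reduce(alpha + log_A[:, s])
                              + emission_logpdf(s, x) for s in range(S)])
        return np.logaddexp.reduce(alpha)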

  8. ASR Example • Isolated English digits “zero”–“nine” from TI46: 8 male, 8 female speakers, 26 utterances each, fs = 12.5 kHz. • 10 word models, various #states and #Gaussians/state. • Features: 13 HFCC, 100 fps, Hamming window, pre-emphasis (α = 0.95), CMS, Δ+ΔΔ (±4 frames). • Pre-processing: zero-mean and whitening transform. • M1/F1: testing; M2/F2: validation; M3–M8/F3–F8: training. • Test set corrupted by additive noise from real-world sources (subway, babble, car, exhibition hall, restaurant, street, airport terminal, train station).
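
For concreteness, the zero-mean and whitening step can be implemented as a PCA-style transform fit on the training features. This is a sketch of one common choice; the talk does not specify the exact transform used.

    import numpy as np

    def fit_whitener(X, eps=1e-8):
        # X: (N, D) training feature frames.
        mu = X.mean(axis=0)
        evals, evecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
        W = evecs / np.sqrt(evals + eps)  # unit variance along each eigen-direction
        return mu, W

    def whiten(X, mu, W):
        return (X - mu) @ W  # zero-mean, decorrelated, unit-variance features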

  9. HMM Test Results

  10. Overcoming the limitations of HMMs • HMMs do not take advantage of the dynamics of speech. • Well-known HMM limitations include: • Only the present state affects transition probabilities. • Successive observations are independent. • Static density models are assumed. We need an architecture that better captures the dynamics of speech.

  11. Echo State Network Recurrent neural network proposed by Jaeger (2001). • Recurrent “reservoir” of nonlinear processing elements with random, untrained weights W. • Random, untrained input weights Win. • Linear readout Wout with easily trained weights. [diagram: dx-dimensional input → reservoir (Win, W) → linear mapper (Wout) → dy-dimensional output] Note similarities to the Liquid State Machine.

  12. ESN Diagram & Equations [the slide's block diagram and equations are not in the transcript]
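
In Jaeger's standard formulation the state update and readout are x(n+1) = tanh(W x(n) + Win u(n+1)) and y(n) = Wout x(n), with only Wout trained. A numpy sketch of that formulation, interpreting slide 16's r as the reservoir spectral radius and rin as the input weight scale (both interpretations are assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    def make_esn(M=60, D=39, r=2.0, rin=0.1):
        # Random, untrained reservoir of M PEs and input weights for D-dim features.
        W = rng.standard_normal((M, M))
        W *= r / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius to r
        Win = rin * rng.standard_normal((M, D))
        return W, Win

    def run_esn(W, Win, U):
        # Drive the reservoir with input frames U (T, D); return states X (T, M).
        x = np.zeros(W.shape[0])
        X = np.empty((len(U), len(x)))
        for n, u in enumerate(U):
            x = np.tanh(W @ x + Win @ u)  # x(n+1) = tanh(W x(n) + Win u(n+1))
            X[n] = x
        return X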

  13. How to classify with predictors Build 10 word models, each trained to predict the future of one of the 10 digits. The best predictor determines the class (see the sketch below). [diagram: input through a z⁻¹ delay feeding predictors 0–9] Not a new idea!
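
In code, the decision rule is an argmin over prediction errors. A sketch assuming each word model exposes a predict(U) that estimates frames U[1:] from the preceding frames (a hypothetical interface, not the authors' API):

    import numpy as np

    def classify(U, models):
        # U: (T, D) utterance frames; models: dict label -> predict function.
        # Return the label of the word model with minimum prediction MSE on U.
        mse = {label: np.mean((predict(U) - U[1:]) ** 2)
               for label, predict in models.items()}
        return min(mse, key=mse.get)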

  14. ESN Training • Minimize the mean-squared error between output y(n) and desired signal d(n). Wiener solution: Wout = R⁻¹P, where R = E[x(n) x(n)ᵀ] is the state autocorrelation and P = E[x(n) d(n)ᵀ] the state–desired cross-correlation.
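
The same solution in sample form, using reservoir states X (T, M) and desired next frames D (T, 39). The small ridge term is an addition for numerical stability, not from the talk:

    import numpy as np

    def train_readout(X, D, ridge=1e-6):
        # Wiener/least-squares readout: minimize ||X Wout - D||^2 over Wout.
        R = X.T @ X + ridge * np.eye(X.shape[1])  # autocorrelation estimate R
        P = X.T @ D                               # cross-correlation estimate P
        return np.linalg.solve(R, P)              # Wout = R^{-1} P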

  15. Multiple Readout Filters • Need good predictors for separation of classes • One linear filter will give mediocre prediction. • Question: how to divide reservoir space and use multiple readout filters? • Answer: competitive network of filters • Question: how to train/test competitive network of K filters? • Answer: mimic HMM.
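
One plausible reading of the competitive scheme, in the spirit of segmental k-means: assign each frame to the readout filter that predicts it best, refit each filter on its own frames, and repeat. This is a sketch of the idea, not the authors' published procedure; it reuses train_readout from the slide 14 sketch.

    import numpy as np

    rng = np.random.default_rng(1)

    def train_competitive(X, D, K, n_iter=10):
        # X: (T, M) reservoir states; D: (T, d) desired frames; K filters.
        assign = rng.integers(0, K, size=len(X))  # random initial partition
        Wouts = [train_readout(X, D) for _ in range(K)]
        for _ in range(n_iter):
            for k in range(K):
                mask = assign == k
                if mask.any():                    # keep old filter if starved
                    Wouts[k] = train_readout(X[mask], D[mask])
            err = np.stack([((X @ W - D) ** 2).sum(axis=1) for W in Wouts])
            assign = err.argmin(axis=0)           # winner-take-all reassignment
        return Wouts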

  16. ASR Example • Same spoken digit experiment as before. • ESN: M=60 PEs, r=2.0, rin=0.1, 10 word models, various #states and #filters/state. • Identical pre-processing and input features • Desired signal: next frame of 39-dimension features

  17. ESN Results

  18. ESN/HMM Comparison

  19. Conclusions • ESN classifies by predicting. • Multiple filters mimic the sequential nature of HMMs. • The ESN classifier is noise robust compared to the HMM: • Averaged over all noise sources, 0–20 dB SNR: +21 percentage points. • Averaged over all sources: +9 dB SNR. • The ESN reservoir provides a dynamical model of the history of the speech. Questions?

  20. HMM vs. ESN Classifier
