Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling

Chuan-Wei Ting, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

Outline • Introduction • Cepstral Factor Analysis • FA Streamed Hidden Markov Model • Experiments • Conclusions & Future Works

Outline • Introduction • Stochastic modeling • Cepstral Factor Analysis • FA Streamed Hidden Markov Model • Experiments • Conclusions & Future Works

Introduction • The objective of constructing an acoustic model is to capture the characteristics of the speech signal. • Stochastic modeling • Hidden Markov model (HMM) • Multi-Stream HMM • Factorial HMM

Hidden Markov Model • Topology of HMM • Constraints • All features are “tied” together • Topology • Transition moment • Independence assumption

Multi-Stream HMM • Topology of Multi-stream HMM

Simplification of Multi-Stream HMM • Streams are assumed to be statistically independent • Weighted log-likelihood approach
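
The weighted log-likelihood approach can be illustrated with a minimal Python sketch. The function name, the stream weights, and the example scores are assumptions for illustration and are not taken from the slides. With all weights equal to 1, the score reduces to the plain sum of stream log-likelihoods, which is the independent-streams case.

```python
import numpy as np

def combined_log_likelihood(stream_log_likes, stream_weights):
    """Weighted sum of per-stream log-likelihoods (hypothetical helper)."""
    ll = np.asarray(stream_log_likes, dtype=float)
    w = np.asarray(stream_weights, dtype=float)
    if ll.shape != w.shape:
        raise ValueError("need one weight per stream")
    return float(np.dot(w, ll))

# Two hypothetical streams scored for the same model.
score = combined_log_likelihood([-12.0, -20.0], [0.7, 0.3])
# Unit weights: the product of independent stream likelihoods, in the log domain.
independent_score = combined_log_likelihood([-12.0, -20.0], [1.0, 1.0])
```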

Factorial HMM • Topology of FHMM

Outline • Introduction • Cepstral Factor Analysis • Features analysis • Factor analysis • FA Streamed Hidden Markov Model • Experiments • Conclusions & Future Works

Cepstral Factor Analysis • Feature analysis • Dynamics of different features • Correlations

Factor Analysis • Discover the correlations inherent in observation data. • Applications • Data compression • Signal processing • Acoustic modeling

Mathematical Definition of FA • FA conducts data analysis of the multivariate observations using the common factors and the specific factors. • For a feature vector of a given dimension, the general form of the FA model combines a factor loading matrix, the common factors, and the specific factors.
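
The symbols on this slide were lost in extraction. For orientation, the standard textbook form of the FA model is sketched below. The notation (x for the feature vector, μ for its mean, A for the factor loading matrix, f for the m common factors, ε for the specific factors, Ψ for the diagonal specific-variance matrix) is an assumption and may differ from the symbols the author used.

```latex
\mathbf{x} = \boldsymbol{\mu} + \mathbf{A}\,\mathbf{f} + \boldsymbol{\epsilon},
\qquad
\mathbf{f} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_m),
\quad
\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi}),
\quad
\operatorname{cov}(\mathbf{x}) = \mathbf{A}\mathbf{A}^{\top} + \boldsymbol{\Psi}
```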

Principal Component Solution • Find an estimator that approximates the fundamental expression • Decompose the covariance matrix of the observations • Estimate the FA parameters from the decomposition
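
A minimal sketch of the textbook principal component solution follows, assuming the fundamental expression is the covariance decomposition into loadings times their transpose plus a diagonal specific-variance matrix. The function name, the variable names, and the random example data are illustrative assumptions, not the author's code.

```python
import numpy as np

def pc_factor_solution(S, m):
    """Principal component solution for an m-factor model of covariance S."""
    eigvals, eigvecs = np.linalg.eigh(S)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]       # indices of the m largest
    lam = np.clip(eigvals[order], 0.0, None)    # guard against tiny negatives
    loadings = eigvecs[:, order] * np.sqrt(lam)  # columns: sqrt(lambda_j) * e_j
    # Specific variances: what the m factors leave unexplained on the diagonal.
    psi = np.diag(S) - np.sum(loadings**2, axis=1)
    return loadings, psi

# Hypothetical data: 500 frames of 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
S = np.cov(X, rowvar=False)
loadings, psi = pc_factor_solution(S, m=2)
```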

Principal Factor Analysis Solution • Start from an initial diagonal estimate of the specific variances and obtain the loading matrix from it. • Obtain a new estimate by performing a principal component analysis on the reduced covariance matrix. • Continue this process until the communality estimates converge.
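
The iteration described on this slide can be sketched as below, under stated assumptions. The default initial specific variances (the reciprocal diagonal of the inverse covariance) are a common textbook choice, not something the slides specify. The function name, the `psi0` argument, and the one-factor example are hypothetical. Negative specific variances (Heywood cases) are not handled.

```python
import numpy as np

def principal_factor(S, m, psi0=None, max_iter=200, tol=1e-6):
    """Iterated principal factor estimate of loadings and specific variances."""
    d = np.diag(S).copy()
    if psi0 is None:
        psi = 1.0 / np.diag(np.linalg.inv(S))   # assumed initial estimate
    else:
        psi = np.asarray(psi0, dtype=float).copy()
    for _ in range(max_iter):
        reduced = S - np.diag(psi)               # remove specific variances
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:m]
        lam = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(lam)
        communality = np.sum(loadings**2, axis=1)
        new_psi = d - communality
        converged = np.max(np.abs(new_psi - psi)) < tol
        psi = new_psi
        if converged:
            break
    return loadings, psi

# Hypothetical one-factor covariance built from known parameters.
true_loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
true_psi = 1.0 - true_loadings**2
S = np.outer(true_loadings, true_loadings) + np.diag(true_psi)
loadings, psi = principal_factor(S, m=1)
```

Starting the iteration at the true specific variances leaves them unchanged, which is a quick sanity check that the update has the intended fixed point.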

Maximum Likelihood Solution • When FA is carried out on the correlation matrix • Where , , , , and is a diagonal matrix.

Rotation of Loading Matrix • Rotate the loading matrix by an orthogonal matrix • Varimax rotation • The rotation matrix is obtained by maximizing the varimax criterion
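
A commonly used SVD-based iteration for raw varimax (without Kaiser row normalization) is sketched below. It is one standard way to find the orthogonal rotation and is not necessarily the procedure used in the talk. The function name and the random example loadings are assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Return (rotated loadings, orthogonal rotation matrix R)."""
    p, k = loadings.shape
    R = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        rotated = loadings @ R
        # Gradient-like matrix of the varimax criterion.
        col_ss = np.sum(rotated**2, axis=0)
        grad = loadings.T @ (rotated**3 - rotated @ np.diag(col_ss) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                                # nearest orthogonal matrix
        crit = np.sum(s)
        if crit_old != 0.0 and crit < crit_old * (1.0 + tol):
            break
        crit_old = crit
    return loadings @ R, R

# Hypothetical loading matrix: 6 features, 2 common factors.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2))
rotated, R = varimax(A)
```

Because R is orthogonal, the rotation leaves the communalities and the implied covariance of the common part unchanged; only the distribution of loadings across factors changes.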

Effectiveness of Rotation • Obtain greater discriminability

Outline • Introduction • Cepstral Factor Analysis • FA Streamed Hidden Markov Model • Survey of different HMMs • FASHMM • Experiments • Conclusions & Future Works

FA Streamed HMM • Using FA, the processes of observed features and hidden states are represented by common factors and residual factors.

Survey of Different HMMs (FAHMM) • Covariance matrix modeling • Full vs. diagonal • Sufficient data problem • FA representation • State/latent representation • Discrete vs. continuous

Survey of Different HMMs (Streamed HMM) • In the standard HMM, the joint probability of the observation sequence and the state sequence factorizes over a single Markov chain. • In the FHMM, the state at each time step is extended to a set of states, one per chain. • Likelihood combination • Multi-stream HMM → sub-word level • FHMM → frame level
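
For reference, the standard HMM joint probability is the initial-state probability times the product of transition and emission probabilities along the path. The sketch below evaluates it in the log domain. The function name, argument layout, and toy numbers are assumptions for illustration.

```python
import numpy as np

def hmm_joint_log_prob(log_pi, log_A, log_B, states):
    """log p(X, S) for one state path.

    log_pi[i]   : log initial probability of state i
    log_A[i, j] : log transition probability from state i to state j
    log_B[t, j] : log emission density of frame t under state j
    """
    states = np.asarray(states)
    t_idx = np.arange(len(states))
    lp = log_pi[states[0]]
    lp += np.sum(log_A[states[:-1], states[1:]])
    lp += np.sum(log_B[t_idx, states])
    return float(lp)

# Toy two-state model with three frames (hypothetical numbers).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.2, 0.9]])
lp = hmm_joint_log_prob(np.log(pi), np.log(A), np.log(B), [0, 0, 1])
```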

Likelihood Function of FHMM • State transition probability • Likelihood function (with a common covariance matrix)
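
The equations on this slide did not survive extraction. The sketch below follows the usual factorial HMM formulation, which is assumed here and not confirmed by the slides: each chain transitions independently, and the emission is a Gaussian whose mean is the sum of per-chain contributions with one covariance matrix shared by all state combinations. All names and numbers are hypothetical.

```python
import numpy as np

def fhmm_log_transition(log_As, prev_states, next_states):
    """Sum of per-chain log transition probabilities (chains are independent)."""
    return float(sum(log_A[i, j]
                     for log_A, i, j in zip(log_As, prev_states, next_states)))

def fhmm_log_emission(x, means, states, cov):
    """Gaussian log density with mean summed over chains and a common covariance."""
    mu = sum(mean[s] for mean, s in zip(means, states))
    diff = x - mu
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return float(-0.5 * (d * np.log(2.0 * np.pi) + logdet + quad))

# Two chains with two states each (hypothetical parameters).
A1 = np.array([[0.9, 0.1], [0.2, 0.8]])
A2 = np.array([[0.5, 0.5], [0.3, 0.7]])
log_trans = fhmm_log_transition([np.log(A1), np.log(A2)], (0, 1), (1, 1))

means = [np.array([[0.0, 0.0], [1.0, 1.0]]),   # chain 1 mean contributions
         np.array([[0.0, 0.0], [2.0, 0.0]])]   # chain 2 mean contributions
log_emit = fhmm_log_emission(np.array([3.0, 1.0]), means, (1, 1), np.eye(2))
```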

 Estimation Approaches for FHMM • Exact inference • Expectation maximization (EM) algorithm • Complexity • Approximations • Gibbs sampling • Variational inference

FASHMM • According to the FA method, each common factor is associated with a group of highly correlated features. • Correlated features are grouped together in a stream and share the same FA parameters. • The observed feature vector can be represented by
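
One simple way to group correlated features into streams is to assign each feature to the common factor on which it loads most heavily. This rule is an assumption for illustration; the slides do not state how the grouping was performed. The function name and the example loading matrix are hypothetical.

```python
import numpy as np

def group_features_into_streams(loadings):
    """Map each factor index to the feature indices that load most on it."""
    owner = np.argmax(np.abs(loadings), axis=1)   # dominant factor per feature
    return {k: np.flatnonzero(owner == k).tolist()
            for k in range(loadings.shape[1])}

# Hypothetical rotated loadings: 5 features, 2 common factors.
example_loadings = np.array([[0.9, 0.1],
                             [0.8, 0.2],
                             [0.1, 0.7],
                             [0.2, -0.9],
                             [0.6, 0.3]])
streams = group_features_into_streams(example_loadings)
```

Applying the rule to rotated loadings is the natural choice, since rotation is intended to make each feature load clearly on one factor.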

Topology of FASHMM • State transition probability

Outline • Introduction • Cepstral Factor Analysis • FA Streamed Hidden Markov Model • Experiments • Simulated data setup • HMM vs. FASHMM • Recognition results & discussion • Conclusions & Future Works

Experimental Setup • Simulated data • 4 classes, 5 variables • Training: 100 sentences, 5 “words” per sentence • Testing: 50 utterances, 4 “words” per sentence • Model structure • HMM • 7 states per class • One Gaussian per state • FASHMM • 3 states per class • One Gaussian per state

[Figure slides: Class 1, Class 2, Class 3, Class 4]

HMM vs. FASHMM • [Figure panels: HMM, FASHMM]

Recognition Results

Discussion

Outline • Introduction • Cepstral Factor Analysis • FA Streamed Hidden Markov Model • Experiments • Conclusions & Future Works

Conclusions • We have presented the FA approach, which: • Extracts the common factors and the residual factors in acoustic features. • Separates the Markov chains for these factors. • Represents the sophisticated dynamics in the stochastic process of the speech signal. • A new topology, the FA streamed HMM, was proposed.

Future Works • More acoustic features • Model selection • Streams • States • Mixtures • Large vocabulary continuous speech recognition (LVCSR) task
