TIME-FREQUENCY PRINCIPAL COMPONENTS OF SPEECH: APPLICATION TO SPEAKER IDENTIFICATION

Ivan Magrin-Chagnolleau * and Geoffrey Durou **
* Rice University, Houston, Texas - ** Faculté Polytechnique de Mons, Mons, Belgium
ivan@ieee.org - durou@tcts.fpms.ac.be

Introduction: Principle of the Vector Filtering of Spectral Trajectories:
- Goal: filtering of the spectral vectors in order to extract dynamic, speaker-characteristic information.
- Data-driven approach: the filtering is learned from the training data.
- Class-dependent filtering: one filter for each speaker.
- Principle: principal component analysis applied to spectral vectors augmented by their context, which yields the Time-Frequency Principal Components (TFPC).

Principle of the Time-Frequency Principal Components (TFPC) Analysis:
- Expanded vectors: spectral vectors augmented by their context.
- Contextual covariance matrix: covariance matrix of the expanded vectors.
- Time-frequency principal components: principal components of the contextual covariance matrix.

Experiments:
- Task: closed-set, text-independent speaker identification.
- Database: subset of the POLYCOST database
  - 112 speakers (64 females and 48 males)
  - 90 seconds of training speech per speaker (free text over the telephone)
  - 560 test utterances of 5 seconds on average.
- Spectrum: 13 Mel-scale filterbank coefficients.
- Cepstrum: 12 cepstral coefficients (the first one is discarded) augmented by their Δ parameters.
- TFPC filtering: one TFPC filter for each speaker, using several context sizes, compared with a single TFPC filter for all speakers.
- Modeling: Gaussian mixture models with 8 components and diagonal covariance matrices.

Results:

Examples of Vector Filtering of Spectral Trajectories:

Conclusion:
- The TFPC analysis performs better when applied to spectral coefficients.
- Using one TFPC filter for each speaker gives better results than using a single TFPC filter for all speakers.
- The best result is obtained by computing one TFPC filter for each speaker from spectral coefficients with a context of 3 vectors. This score outperforms the cepstral coefficients augmented by their Δ parameters by roughly 20 %.

Future directions:
- Choice of the components.
- Application of the TFPC filtering to speaker verification.
- Application of the TFPC filtering to other pattern recognition problems.
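
The three steps of the TFPC analysis (expanded vectors, contextual covariance matrix, principal components) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the use of a window of `context` consecutive frames as "context", the mean removal, and the number of retained components are all assumptions made for illustration; the poster does not specify them and lists the choice of components as future work. Random data stands in for the 13 Mel-scale filterbank coefficients.

```python
import numpy as np


def expand_with_context(frames, context):
    """Stack each frame with the frames that follow it inside a sliding window.

    frames: array of shape (T, d), one spectral vector per row.
    Returns an array of shape (T - context + 1, context * d).
    Assumption: "context of N vectors" means N consecutive frames.
    """
    n_frames, _ = frames.shape
    n_windows = n_frames - context + 1
    return np.stack([frames[t:t + context].reshape(-1) for t in range(n_windows)])


def tfpc_basis(frames, context, n_components):
    """Estimate a TFPC basis from one speaker's training frames (hypothetical helper)."""
    expanded = expand_with_context(frames, context)
    mean = expanded.mean(axis=0)
    # Contextual covariance matrix: covariance of the expanded vectors.
    cov = np.cov(expanded - mean, rowvar=False)
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]


def tfpc_filter(frames, context, mean, basis):
    """Project the expanded vectors onto the retained principal components."""
    expanded = expand_with_context(frames, context)
    return (expanded - mean) @ basis


# Synthetic stand-in for one speaker: 200 frames of 13 filterbank coefficients.
rng = np.random.default_rng(0)
train_frames = rng.standard_normal((200, 13))

# Context of 3 vectors, keeping 13 components (the latter is an arbitrary choice).
mean, basis, eigvals = tfpc_basis(train_frames, context=3, n_components=13)
filtered = tfpc_filter(train_frames, 3, mean, basis)
print(filtered.shape)
```

In a class-dependent setup like the one described in the poster, `tfpc_basis` would be called once per speaker on that speaker's training data, and each speaker's Gaussian mixture model would then be trained on the corresponding filtered vectors.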
