
Database and Visual Front End


Presentation Transcript


  1. Database and Visual Front End Makis Potamianos

  2. Active Appearance Model Visual Features Iain Matthews

  3. Acknowledgments • Cootes, Edwards, Taylor, Manchester • Sclaroff, Boston

  4. AAM Overview (diagram) • Landmarks define the shape • The shape gives a region of interest, which is warped to a reference frame to give the appearance • Shape & appearance are combined into the model

  5. Relationship to DCT Features (diagram: Face Detector → DCT features; AAM Tracker → AAM features) • External feature detector vs. model-based learned tracking • ROI ‘box’ vs. explicit shape + appearance modeling

  6. Training Data • 4,072 hand-labelled images = 2m 13s of video (out of 50h)

  7. Final Model (figure: mean shape and appearance, with model modes varied from −3 to +3 standard deviations)

  8. Fitting Algorithm (diagram) • Compute the current model projection and the image under the model, both warped to the reference frame • Their difference is the error image dI • Predicted update: dc = R dI, where c is all model parameters and R is a learned weight matrix • Apply the update and iterate until convergence
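
For concreteness, a minimal Python sketch of the fitting loop described on slide 8. The model interface (project, warp_to_reference) and the name R are hypothetical; R is assumed to have been learned offline by regressing known parameter perturbations against error images, as in the Cootes et al. formulation.

import numpy as np

def fit_aam(image, model, c0, max_iters=30, tol=1e-6):
    """Sketch of the iterative AAM fit; `model` is a hypothetical
    object providing project(c), warp_to_reference(image, c), and a
    precomputed update matrix R."""
    c = c0.copy()
    for _ in range(max_iters):
        # Appearance predicted by the current model parameters.
        model_app = model.project(c)
        # Image under the current model shape, warped to the reference frame.
        image_app = model.warp_to_reference(image, c)
        # Difference (error) image drives the predicted update.
        dI = image_app - model_app
        dc = model.R @ dI            # predicted update: dc = R dI
        c = c - dc                   # sign convention varies by formulation
        if np.linalg.norm(dc) < tol: # iterate until convergence
            break
    return c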

  9. Tracking Results • Worst sequence: mean mean-square error = 548.87 • Best sequence: mean mean-square error = 89.11

  10. Tracking Results • Full-face AAM tracker on a subset of the VVAV database • 4,952 sequences • 1,119,256 images @ 30fps = 10h 22m • Mean mean-square error per sentence = 254.21 • Tracking rate (m2p decode) ≈ 4 fps • Beard-area and lips-only models will not track • Do these regions lack the sharp texture gradients needed to locate the model?

  11. Features • Use AAM full-face features directly (86-dimensional)

  12. Audio Lattice Rescoring Results • Lattice random path = 78.14% • DCT with LM = 51.08% • DCT no LM = 61.06%
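
The rescoring step on slide 12 can be pictured as re-ranking lattice paths with a weighted combination of audio, visual, and language-model scores. The sketch below is an assumption about the combination form, not the workshop's exact recipe; visual_score and the (words, audio_loglik, lm_logprob) path tuples are hypothetical.

import math

def rescore_lattice(paths, visual_score, lam=0.7, lm_weight=1.0):
    """Pick the best lattice path under a weighted score combination.
    `paths`: iterable of (words, audio_loglik, lm_logprob) tuples;
    `visual_score(words)`: hypothetical visual-model log-likelihood;
    `lam` trades off audio vs. visual evidence."""
    best, best_score = None, -math.inf
    for words, audio_ll, lm_lp in paths:
        score = (lam * audio_ll
                 + (1.0 - lam) * visual_score(words)
                 + lm_weight * lm_lp)
        if score > best_score:
            best, best_score = words, score
    return best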

  13. Audio Lattice Rescoring Results (figure) • AAM vs. DCT vs. noise

  14. Tracking Error Analysis (figure) • AAM performance vs. tracking error

  15.–17. Analysis and Future Work (one slide, progressive build) • Models are undertrained; little more than face detection on 2m of training data • Reproject the face through a more compact model; retain only the useful articulation information? • Improve the reference shape; minimal information loss through the warping?

  18. Asynchronous Stream Modelling Juergen Luettin

  19. The Recognition Problem M* = argmax_M P(M | O_A, O_V) • M: word (phoneme) sequence • M*: most likely word sequence • O_A: acoustic observation sequence • O_V: visual observation sequence
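
Slide 19's criterion, written as code: a hedged sketch of MAP decoding over a candidate set, where loglik and logprior are hypothetical scoring callbacks for P(O_A, O_V | M) and P(M).

def map_decode(hypotheses, loglik, logprior):
    """MAP rule: M* = argmax_M P(O_A, O_V | M) P(M), taken in the
    log domain over an iterable of candidate word sequences."""
    return max(hypotheses, key=lambda M: loglik(M) + logprior(M))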

  20. Integration at the Feature Level • Assumptions: conditional dependence between modalities; integration at the feature level
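
A minimal sketch of what feature-level integration means in practice, assuming time-aligned (T, D) feature matrices per modality; the concatenated observations then feed a single recognizer, so no independence between modalities is assumed.

import numpy as np

def fuse_features(audio_feats, visual_feats):
    """Concatenate synchronized audio and visual feature vectors into
    one observation per frame. Both inputs are (T, D) arrays with one
    row per frame and are assumed time-aligned (e.g. visual features
    interpolated to the audio frame rate)."""
    assert len(audio_feats) == len(visual_feats)
    return np.hstack([audio_feats, visual_feats])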

  21. Integration at the Decision Level • Assumptions: conditional independence between modalities; integration at the unit level

  22. Multiple Synchronous Streams • Assumptions: conditional independence; integration at the state level • Two streams in each state • X: state sequence • a_ij: transition probability from state i to state j • b_j: observation probability density • c_jm: m-th mixture weight of the multivariate Gaussian N
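
The quantities defined on slide 22 combine, in the standard multi-stream formulation, into a per-state output probability that is a weighted product of per-stream GMM likelihoods: b_j(o_t) = ∏_s [ Σ_m c_jsm N(o_st; μ_jsm, Σ_jsm) ]^λ_s. A sketch under that assumption; the workshop system's exact form may differ.

import numpy as np
from scipy.stats import multivariate_normal

def state_output_prob(obs, streams, weights):
    """Synchronous multi-stream state output probability.
    `obs`: list of per-stream observation vectors for one frame;
    `streams`: per stream, a list of (c, mu, Sigma) GMM components;
    `weights`: stream exponents lambda_s."""
    b = 1.0
    for o_s, comps, lam in zip(obs, streams, weights):
        # Per-stream GMM likelihood: sum_m c_m N(o_s; mu_m, Sigma_m)
        gmm = sum(c * multivariate_normal.pdf(o_s, mean=mu, cov=S)
                  for c, mu, S in comps)
        b *= gmm ** lam
    return b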

  23. Multiple Asynchronous Streams • Assumptions: conditional independence; integration at the unit level • Decoding: individual best state sequences for audio and video

  24. Composite HMM Definition (figure: composite state lattice, states numbered 1–9) • Speech-noise decomposition (Varga & Moore, 1993) • Audio-visual decomposition (Dupont & Luettin, 1998)
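
A sketch of the composite state space implied by the figure: each composite state pairs an audio state with a video state, so two 3-state streams yield the nine numbered states. That the figure is a 3 × 3 product lattice, and that composite transitions factor into per-stream transitions, are assumptions here.

def composite_states(n_audio, n_video):
    """Enumerate composite states as (audio_state, video_state) pairs.
    Asynchrony within a unit is allowed because the two indices can
    advance independently; composite transition probabilities would be
    products of the per-stream transition probabilities (assumption)."""
    return [(i, j) for i in range(n_audio) for j in range(n_video)]

# 3 x 3 product lattice -> 9 composite states, as in the slide's figure
print(composite_states(3, 3))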

  25. Stream Clustering

  26. AVSR System • 3-state HMMs with 12 mixture components; 7-state HMMs for the composite model • context-dependent phone models (plus silence and short pause), tree-based state clustering • cross-word context-dependent decoding, using lattices computed at IBM • trigram language model • global stream weights in multi-stream models, estimated on a held-out set
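
The last bullet (global stream weights estimated on a held-out set) could be implemented, in the simplest case, as a grid search; decode_accuracy is a hypothetical callback, and the actual estimation procedure used in the system is not specified on the slide.

import numpy as np

def estimate_stream_weight(decode_accuracy, grid=np.linspace(0.0, 1.0, 21)):
    """Pick the global audio-stream weight lambda (video weight is
    1 - lambda) that maximizes held-out word accuracy.
    `decode_accuracy(lam)`: hypothetical callback that decodes the
    held-out set with the given weight and returns accuracy."""
    return max(grid, key=decode_accuracy)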

  27. Speaker-independent word recognition

  28. Conclusions • The AV two-stream asynchronous model beats the other models in noisy conditions • Future directions: • Transition matrices: context-dependent; pruning transitions with low probability; cross-unit asynchrony • Stream weights: model-based, discriminative • Clustering: taking stream-tying into account

  29. Phone Dependent Weighting Dimitra Vergyri

  30. Weight Estimation Hervé Glotin
