
Universal Attribute Characterization for Enhanced Spoken Language Recognition

This study explores a novel approach to spoken language recognition through universal attribute characterization. By proposing an alternative acoustic framework based on acoustic phonetic features, the research introduces a front-end processing module that tokenizes spoken utterances and utilizes universal attributes to define phonetic units. This method enhances robustness and cross-linguistic applicability. Experimental results from training on the OGI-TS corpus and CallFriend corpus demonstrate significant improvements in language modeling, culminating in tests against the NIST 2003 evaluation materials.

kaycee


Presentation Transcript


  1. Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition

  2. Outline • Introduction • UAR-FrontEnd • VSM-BackEnd • Experiment

  3. Introduction • Here we focus on the token-based approach. • We propose an alternative universal acoustic characterization of spoken languages based on acoustic-phonetic features. • The advantage of attribute-based units is that they can be defined universally across all languages.

  4. System Overview

  5. UAR-FrontEnd • The front-end processing module tokenizes all spoken utterances into sequences of speech units using a universal attribute recognizer. • Two phoneme-to-attribute tables are created: phoneme-to-manner and phoneme-to-place.
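
The slide does not show the two tables, so the following is only a minimal sketch of how phoneme-to-manner and phoneme-to-place lookups could turn a phoneme sequence into attribute token sequences. The table entries, the attribute labels, the `to_attributes` helper, and the `"other"` fallback are all illustrative assumptions, not the presenters' actual inventory.

```python
# Hypothetical, heavily abbreviated tables; the real tables are not given in the slides.
PHONEME_TO_MANNER = {
    "p": "stop", "t": "stop", "k": "stop",
    "s": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "a": "vowel", "i": "vowel",
}
PHONEME_TO_PLACE = {
    "p": "labial", "t": "coronal", "k": "velar",
    "s": "coronal", "f": "labial",
    "m": "labial", "n": "coronal",
    "a": "low", "i": "high",
}


def to_attributes(phonemes, table, unknown="other"):
    """Map each phoneme to its attribute label; unseen phonemes get `unknown`."""
    return [table.get(p, unknown) for p in phonemes]


# One phoneme sequence yields two parallel attribute sequences.
phonemes = ["s", "a", "t", "i", "n"]
manner_tokens = to_attributes(phonemes, PHONEME_TO_MANNER)
place_tokens = to_attributes(phonemes, PHONEME_TO_PLACE)
print(manner_tokens)
print(place_tokens)
```

Because many phonemes share the same manner or place label, the attribute inventory is much smaller than a phoneme inventory, which is consistent with the claim that such units can be defined across languages.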

  6. VSM-BackEnd • Each transcription is converted into a vector-based representation by applying latent semantic analysis (LSA).
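
As a rough illustration of the vector-based representation, the sketch below builds only the count matrix that LSA would start from: one row per transcription, one column per attribute n-gram. The use of bigrams, the function names, and the sample token sequences are assumptions made for the example. The dimensionality-reduction step of LSA (a truncated singular value decomposition of this matrix) is deliberately not shown.

```python
from collections import Counter


def ngram_counts(tokens, n=2):
    """Count the n-grams (as tuples) occurring in one token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def build_term_document_matrix(transcriptions, n=2):
    """Return (vocab, matrix): matrix[d][j] counts n-gram vocab[j] in document d."""
    counts = [ngram_counts(t, n) for t in transcriptions]
    vocab = sorted(set().union(*counts))  # all n-grams seen in any document
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


# Hypothetical manner-attribute transcriptions of two utterances.
transcriptions = [
    ["stop", "vowel", "nasal", "vowel"],
    ["vowel", "nasal", "vowel", "nasal"],
]
vocab, matrix = build_term_document_matrix(transcriptions)
print(vocab)
print(matrix)
```

In a full LSA pipeline, the rows of this matrix would typically be weighted and then projected onto a small number of singular vectors before being passed to the back-end classifier.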

  7. Experiment • The OGI-TS corpus is used to train the articulatory recognizer. This corpus has phonetic transcriptions for six languages.

  8. Experiment • The CallFriend corpus is used for training the back-end language models. • Tests are carried out on the NIST 2003 spoken language evaluation material.

  9. Experiment
