
Country Report - Singapore

Country Report - Singapore. Haizhou Li, Institute for Infocomm Research, Singapore; Eng Siong Chng, Nanyang Technological University, Singapore. November 2010, Oriental COCOSDA, Nepal. 1. Smart Meeting Room (I2R).


Presentation Transcript


  1. Country Report - Singapore. Haizhou Li, Institute for Infocomm Research, Singapore; Eng Siong Chng, Nanyang Technological University, Singapore. November 2010, Oriental COCOSDA, Nepal

  2. 1. Smart Meeting Room (I2R)
     • Collecting meeting speech data for participation in the NIST Speaker Diarization and Rich Transcription evaluation campaign
     • Users can select the audio/video channels to record
     • Audio/video signals from multiple channels are captured synchronously
     • Each meeting session is stored with metadata including: time and date, duration, number of participants, audio channels, video channels, and meeting type (presentation, discussion, spontaneous, scenarios, etc.)
     • Dedicated storage facilitates data transfer and handling
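
The per-session metadata listed above could be held in a small record type. The following is a minimal sketch only: the class name, field names, and the `to_record` helper are hypothetical and are not taken from the I2R system; only the list of metadata items comes from the slide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class MeetingSession:
    """Hypothetical record for one recorded meeting session."""
    start: datetime             # time and date of the session
    duration: timedelta         # length of the recording
    num_participants: int       # number of participants
    audio_channels: List[str]   # audio channels selected for recording
    video_channels: List[str]   # video channels selected for recording
    meeting_type: str           # e.g. "presentation", "discussion", "spontaneous"

    def to_record(self) -> dict:
        """Return a plain dict that can be stored next to the media files."""
        return {
            "start": self.start.isoformat(),
            "duration_seconds": self.duration.total_seconds(),
            "num_participants": self.num_participants,
            "audio_channels": list(self.audio_channels),
            "video_channels": list(self.video_channels),
            "meeting_type": self.meeting_type,
        }
```

The meeting type is kept as free text here because the slide's list of types ends with "etc.", so the full set of values is not known.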

  3. 1. Smart Meeting Room (setup)

  4. 2. SEAME (NTU): a Mandarin-English Code-switching Speech Corpus in South-East Asia
     • Motivation: code-switching speech recognition research (LVCSR, LID, etc.) in South-East Asia (SEA)
     • Goal: to develop 50 hours of intra-sentential Mandarin/English spontaneous code-switching speech in two locations:
       • Singapore: Nanyang Technological University (NTU)
       • Malaysia: Universiti Sains Malaysia (USM)
     • Process:
       • Recording (interviews and conversations)
       • Transcription (languages, discourse particles and non-speech signals, other languages, proper nouns and short pauses)
       • Language ID boundary labeling
     • Status: 25 hours of code-switching speech by Oct 2010
     • Reference: Dau-Cheng Lyu, Tien-Ping Tan, Eng-Siong Chng, Haizhou Li, "SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia", in Interspeech, Japan, September 2010

  5. 2. SEAME (summary)
     • ~25 hours of intra-sentential code-switching speech
     • ~20K utterances, 89 speakers
     • 2.5 language turns per intra-sentential code-switching utterance on average
     • Example: MAN ENG MAN ENG MAN ENG (5 language turns)
     • Speaking rate is measured as the average number of words per minute
     • Number of turns is measured as the average count of ENG-MAN and MAN-ENG switches per utterance
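
The two measurements on this slide can be illustrated with a short sketch. It assumes each utterance is available as a sequence of language tags ("MAN" or "ENG"), one per word or segment; that input format and both function names are assumptions made for illustration, not part of the SEAME tooling.

```python
from typing import Sequence


def count_language_turns(tags: Sequence[str]) -> int:
    """Count switches between adjacent language tags (ENG-MAN or MAN-ENG)."""
    return sum(1 for prev, cur in zip(tags, tags[1:]) if prev != cur)


def words_per_minute(num_words: int, duration_seconds: float) -> float:
    """Speaking rate in words per minute for one utterance or speaker."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return num_words * 60.0 / duration_seconds


# The slide's example sequence has six segments and therefore five turns.
example = ["MAN", "ENG", "MAN", "ENG", "MAN", "ENG"]
print(count_language_turns(example))  # 5
```

Averaging `count_language_turns` over all utterances would give a corpus-level figure comparable to the 2.5 turns reported above, provided the tags follow the corpus's language boundary labels.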

  6. 3. ASR and TTS corpus (I2R)
