Outline

Motivations Objectives QAST 2007 Tasks Participants Results QAST 2008 Conclusion. Outline. QAST Organization. Evaluation campaign is jointly organized by : UPC, Spain (J. Turmo, P. Comas) Coordinator ELDA, France (N. Moreau, C. Ayache, D. Mostefa)

Presentation Transcript


  1. Outline
     • Motivations
     • Objectives
     • QAST 2007
     • Tasks
     • Participants
     • Results
     • QAST 2008
     • Conclusion

  2. QAST Organization
     The evaluation campaign is jointly organized by:
     • UPC, Spain (J. Turmo, P. Comas), coordinator
     • ELDA, France (N. Moreau, C. Ayache, D. Mostefa)
     • LIMSI, France (S. Rosset, L. Lamel)

  3. Motivations
     • Much of human interaction is via spoken language.
     • QA research has developed techniques for written texts with correct syntactic and semantic structures.
     • Spoken data is very different from textual data:
         • Speech phenomena: false starts, speech corrections, truncated words, etc.
         • The grammatical structure of spontaneous speech is very particular.
         • No punctuation and no capitalization.
         • For meetings, interaction creates run-on sentences where the distance between the first part and the last one can be very long.

  4. Objectives
     • In general, motivating and driving the design of novel and robust factual QA architectures for automatic speech transcriptions.
     • Comparing the performance of systems dealing with both types of transcriptions and both types of questions (factual and definitional).
     • Measuring the loss of each system due to ASR.
     • Measuring the loss of each system due to the ASR output degradation.

  5. QAST 2007: Resources and tasks
     Corpus:
     • The CHIL corpus: 25 seminars of 1 hour each
         • Spontaneous speech
         • English spoken by non-native speakers
         • Domain of lectures: speech and language processing
         • Manual transcription done by ELDA
         • Automatic transcription provided by LIMSI
     • The AMI corpus: 168 meetings (100 hours)
         • Spontaneous speech
         • English
         • Domain of meetings: design of a television remote control
         • Manual transcription done by AMI
         • Automatic transcription provided by AMI
     4 tasks:
     • T1: QA in manual transcriptions of lectures
     • T2: QA in automatic transcriptions of lectures
     • T3: QA in manual transcriptions of meetings
     • T4: QA in automatic transcriptions of meetings

  6. QAST 2007: Development and evaluation
     For each task, 2 sets of questions were provided:
     • Development set:
         • Lectures: 10 seminars, 50 questions
         • Meetings: 50 meetings, 50 questions
     • Evaluation set:
         • Lectures: 15 seminars, 100 questions
         • Meetings: 118 meetings, 100 questions
     Factual questions only; no definition questions.
     Expected answers = named entities (NEs).
     List of NEs: person, location, organization, language, system/method, measure, time, color, shape, material.

  7. QAST 2007: Human judgment
     Assessors used QASTLE, an evaluation tool developed by ELDA, to evaluate the data.

  8. Task: Scoring
     Four possible judgments:
     • Correct
     • Incorrect
     • Non-Exact
     • Unsupported
     Two metrics were used:
     • Mean Reciprocal Rank (MRR): measures how highly the first correct answer is ranked.
     • Accuracy: the fraction of questions whose first-ranked answer, in the list of up to 5 answers, is correct.
     Participants could submit up to 2 submissions per task and 5 answers per question.
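The two metrics on this slide can be illustrated with a minimal sketch. It assumes the standard definition of MRR (the reciprocal of the rank of the first correct answer, averaged over questions, with 0 when no answer is correct) and assumes that only the "Correct" judgment counts as a right answer; the slides do not say how Non-Exact and Unsupported answers were scored. The function names and the sample judgments are hypothetical and are not taken from the QAST evaluation tools.

```python
from typing import Sequence

# Assumption: only this judgment label counts as a right answer.
CORRECT = "Correct"


def reciprocal_rank(judgments: Sequence[str]) -> float:
    """Return 1/rank of the first Correct answer, or 0.0 if there is none."""
    for rank, judgment in enumerate(judgments, start=1):
        if judgment == CORRECT:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(questions: Sequence[Sequence[str]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not questions:
        raise ValueError("at least one question is required")
    return sum(reciprocal_rank(j) for j in questions) / len(questions)


def accuracy(questions: Sequence[Sequence[str]]) -> float:
    """Fraction of questions whose first-ranked answer is Correct."""
    if not questions:
        raise ValueError("at least one question is required")
    hits = sum(1 for j in questions if j and j[0] == CORRECT)
    return hits / len(questions)


# Hypothetical judgments for four questions (up to 5 ranked answers each).
judged = [
    ["Correct", "Incorrect"],                            # first correct at rank 1
    ["Incorrect", "Correct"],                            # first correct at rank 2
    ["Non-Exact", "Incorrect", "Incorrect", "Correct"],  # first correct at rank 4
    [],                                                  # no answer returned
]

print(mean_reciprocal_rank(judged))  # (1 + 1/2 + 1/4 + 0) / 4 = 0.4375
print(accuracy(judged))              # 1 of 4 questions correct at rank 1 = 0.25
```

Under these assumptions, accuracy only rewards a correct answer in first position, while MRR also gives partial credit to correct answers ranked lower in the list.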

  9. Participants
     Five teams submitted results for one or more QAST tasks:
     • CLT, Center for Language Technology, Australia
     • DFKI, Germany
     • LIMSI, Laboratoire d’Informatique et de Mécanique des Sciences de l’Ingénieur, France
     • TOKYO, Tokyo Institute of Technology, Japan
     • UPC, Universitat Politècnica de Catalunya, Spain
     In total, 28 submission files were evaluated.

  10. Results for CHIL lectures (T1 and T2)

  11. Results for AMI meetings (T3 and T4)

  12. QAST 2008
     Extension of QAST 2007:
     • 3 languages: French, English, Spanish
     • 4 domains: broadcast news, parliament speeches, lectures, meetings
     • Different WER (word error rate) levels: 10%, 20% and 30%
     • Factual and definition questions
     • 5 corpora:
         • CHIL lectures
         • AMI meetings
         • TC-STAR05 EPPS English corpus
         • TC-STAR05 EPPS Spanish corpus
         • ESTER French broadcast news corpus
     Evaluation from June 15 to June 30.
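The slide does not define WER. As a rough illustration, the sketch below uses the usual definition of word error rate: the word-level edit distance (substitutions, deletions and insertions) between a reference transcription and an ASR hypothesis, divided by the number of reference words. The function name and the sample sentences are hypothetical, and the sketch applies no text normalization, so it should not be read as the scoring procedure actually used for the QAST 2008 transcriptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of ten reference words.
reference = "the remote control should have a small number of buttons"
hypothesis = "the remote control should have a small number of button"
print(word_error_rate(reference, hypothesis))  # 1 error / 10 words = 0.1
```

With this definition, a transcription at the 30% level contains roughly three word errors for every ten reference words.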

  13. QAST 2008 tasks
     • T1a: Question Answering in manual transcriptions of lectures (CHIL corpus)
     • T1b: Question Answering in automatic transcriptions of lectures (CHIL corpus)
     • T2a: Question Answering in manual transcriptions of meetings (AMI corpus)
     • T2b: Question Answering in automatic transcriptions of meetings (AMI corpus)
     • T3a: Question Answering in manual transcriptions of broadcast news for French (ESTER corpus)
     • T3b: Question Answering in automatic transcriptions of broadcast news for French (ESTER corpus)
     • T4a: Question Answering in manual transcriptions of European Parliament Plenary sessions in English (EPPS English corpus)
     • T4b: Question Answering in automatic transcriptions of European Parliament Plenary sessions in English (EPPS English corpus)
     • T5a: Question Answering in manual transcriptions of European Parliament Plenary sessions in Spanish (EPPS Spanish corpus)
     • T5b: Question Answering in automatic transcriptions of European Parliament Plenary sessions in Spanish (EPPS Spanish corpus)

  14. Conclusion and future work (1/2)
     We presented the framework of the Question Answering on Speech Transcripts (QAST) evaluation campaigns.
     QAST 2007:
     • 5 participants from 5 different countries (France, Germany, Spain, Australia and Japan), 28 runs
     • Encouraging results
     • High loss in accuracy with ASR output

  15. Conclusion and future work (2/2)
     • QAST 2008 is an extension of QAST 2007 (3 languages, 4 domains, definition and factual questions, multiple ASR outputs with different WERs).
     • It’s still time to join QAST 2008 (participation is free).
     • Future work aims at including:
         • Cross-lingual tasks
         • Oral questions
         • Other domains

  16. For more information
     The QAST website: http://www.lsi.upc.edu/~qast/
