
The contribution of speech recognition confidence and dialogue context to error detection and correction



  1. The contribution of speech recognition confidence and dialogue context to error detection and correction
     Jens Edlund and Gabriel Skantze

     Figure labels: User (speaks, listens), Untrained operator (reads, speaks), ASR, Vocoder.
     Figure labels: The dialogue so far, with user utterances in greyscale and operator utterances in black. Correction field for the judge. N-best list from the speech recognition, with the utterance confidence score in parentheses.

     Test corpus collection
     • Wizard of Oz using speech recognition
     • Let the operator act freely
     • Experiment with different operators
     Task: Navigate in a simulated campus

     Experiment design
     • Experiment: 8 subjects were asked to correct 4x3x15 speech recognition results
     • Domain: Pedestrian navigation, similar to Map Task
     • Question: To what extent can a human judge benefit from the information in:
       • 5-best list
       • Word confidence scores
       • Dialogue and task context

     Data analysis

     Results

     Conclusions
     • The presence of speech recogniser confidence at word level and the 5-best list help subjects to detect and correct errors
     • A short context, about one utterance, significantly increases subjects’ ability to detect, but not correct, errors
     • Longer context has no additional effect
     • The judges very rarely attempted to correct content words, and when they did the results were below average
     • Syntactic structure is easier to perceive than content words in poor recognitions?
     • Easier to identify speech act than propositional content?
     • A properly designed robust parser may correctly identify speech acts where a keyword spotter would fail
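
The kind of information shown to the judges (an N-best list, an utterance confidence score in parentheses, and word-level confidence) could be represented roughly as in the sketch below. This is only an illustration, not the authors' tool: the class names `Word` and `Hypothesis`, the functions `flag_low_confidence` and `render`, the 0.5 threshold, the assumption that scores lie in [0, 1], and the example utterances and scores are all invented here and do not come from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical word-level score, assumed to lie in [0, 1]


@dataclass
class Hypothesis:
    words: List[Word]
    confidence: float  # hypothetical utterance-level score, assumed in [0, 1]


def flag_low_confidence(hyp: Hypothesis, threshold: float = 0.5) -> List[str]:
    """Return the words whose confidence falls below the (assumed) threshold."""
    return [w.text for w in hyp.words if w.confidence < threshold]


def render(hyp: Hypothesis, threshold: float = 0.5) -> str:
    """Render one hypothesis with low-confidence words in upper case and
    the utterance confidence score in parentheses."""
    marked = [
        w.text.upper() if w.confidence < threshold else w.text
        for w in hyp.words
    ]
    return f"{' '.join(marked)} ({hyp.confidence:.2f})"


# Invented example values, for illustration only (not data from the experiment).
n_best = [
    Hypothesis(
        [Word("turn", 0.91), Word("left", 0.88), Word("at", 0.75),
         Word("the", 0.80), Word("three", 0.32)],
        0.71,
    ),
    Hypothesis(
        [Word("turn", 0.91), Word("left", 0.88), Word("at", 0.75),
         Word("the", 0.80), Word("tree", 0.30)],
        0.64,
    ),
]

if __name__ == "__main__":
    for hyp in n_best:
        print(render(hyp))
```

A fixed threshold is the simplest way to surface doubtful words; the poster itself does not say how confidence was presented to the judges beyond the labels listed above.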
