Is This Conversation on Track?

Is This Conversation on Track? Utterance Level Confidence Annotation in the CMU Communicator spoken dialog system. Presented by: Dan Bohus (dbohus@cs.cmu.edu). Work by: Paul Carpenter, Chun Jin, Daniel Wilson, Rong Zhang, Dan Bohus, Alex Rudnicky. Carnegie Mellon University – 2001.


Presentation Transcript


  1. Is This Conversation on Track? Utterance Level Confidence Annotation in the CMU Communicator spoken dialog system Presented by: Dan Bohus (dbohus@cs.cmu.edu) Work by: Paul Carpenter, Chun Jin, Daniel Wilson, Rong Zhang, Dan Bohus, Alex Rudnicky Carnegie Mellon University – 2001

  2. Outline • The Problem. The Approach • Training Data and Features • Experiments and Results • Conclusion. Future Work

  3. The Problem • Systems often misunderstand, take the misunderstanding as fact, and continue to act on invalid information • Repair costs • Increased dialog length • User frustration • Confidence annotation provides critical information for effective confirmation and clarification in dialog systems.

  4. The Approach • Treat the problem as a data-driven classification task. • Objective: accurately label misunderstood utterances. • Collect a training corpus. • Identify useful features. • Train classifiers and identify the best-performing one for this task.

  5. Data • Communicator Logs & Transcripts: • Collected over 2 months (Oct, Nov 1999). • Eliminated conversations with < 5 turns. • Manually labeled OK (67%) / BAD (33%) • BAD covers RecogBAD / ParseBAD / OOD / NONSpeech • Discarded mixed-label utterances (6%). • Cleaned corpus of 4550 utterances / 311 dialogs.

  6. Feature Extraction • 12 features from various levels: • Decoder Features: • Word Number, Unconfident Percentage • Parsing Features: • Uncovered Percentage, Fragment Transitions, Gap Number, Slot Number, Slot Bigram • Dialog Features: • Dialog State, State Duration, Turn Number, Expected Slots • Garble: handcrafted heuristic currently used by the CMU Communicator

  7. Experiments with 6 different classifiers • Decision Tree • Artificial Neural Network • Naïve Bayes • Bayesian Network • Several network structures attempted • AdaBoost • Individual feature-based binning estimators as weak learners, 750 boosting stages • Support Vector Machines • Dot, Polynomial, Radial, Neural, Anova
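
The slides do not describe the binning estimators used as AdaBoost weak learners in any detail. As a rough illustration only, one plausible reading is a single-feature learner that splits a feature's range into bins and predicts the more frequent label in each bin. The class name `BinningEstimator`, the bin edges, and the tie-breaking rule are all assumptions, and the per-example weights that boosting would supply are left out for brevity.

```python
from bisect import bisect_right


class BinningEstimator:
    """Hypothetical single-feature weak learner (not the authors' code).

    It assigns each feature value to a bin and predicts "BAD" when more
    than half of the training utterances in that bin were labeled "BAD".
    """

    def __init__(self, edges):
        # Bin boundaries; len(edges) + 1 bins in total.
        self.edges = sorted(edges)
        self.bad_rate = [0.5] * (len(self.edges) + 1)

    def _bin(self, value):
        # Index of the bin that contains `value`.
        return bisect_right(self.edges, value)

    def fit(self, values, labels):
        n_bins = len(self.edges) + 1
        total = [0] * n_bins
        bad = [0] * n_bins
        for value, label in zip(values, labels):
            b = self._bin(value)
            total[b] += 1
            bad[b] += 1 if label == "BAD" else 0
        # Empty bins keep an uninformative rate of 0.5.
        self.bad_rate = [
            bad[i] / total[i] if total[i] else 0.5 for i in range(n_bins)
        ]
        return self

    def predict(self, value):
        return "BAD" if self.bad_rate[self._bin(value)] > 0.5 else "OK"
```

A boosted ensemble would train many such learners on reweighted data and combine their votes; that part is not sketched here.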

  8. Evaluating performance • Classification Error Rate (FP + FN) • CDR = 1 - Fallout = 1 - (FP / NBAD) • Cost of misunderstanding in dialog systems depends on • Error type (FP vs. FN) • Domain • Dialog state • Ideally, build a cost function for each type of error, and optimize for that
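
The idea of a per-error-type cost function can be sketched as follows. This is a minimal illustration, not the cost model the authors have in mind: the function name `expected_cost` and the weights `cost_fp` and `cost_fn` are hypothetical, and real costs would also depend on domain and dialog state, as the slide notes.

```python
def expected_cost(fp, fn, total, cost_fp=1.0, cost_fn=1.0):
    """Average cost per utterance for given error counts.

    fp, fn   -- counts of false positives and false negatives
    total    -- number of utterances evaluated
    cost_fp, cost_fn -- assumed (hypothetical) cost of each error type
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (cost_fp * fp + cost_fn * fn) / total


def error_rate(fp, fn, total):
    # With both costs equal to 1, the expected cost reduces to the
    # plain classification error rate.
    return expected_cost(fp, fn, total, cost_fp=1.0, cost_fn=1.0)
```

With equal weights this is the ordinary error rate; raising `cost_fp` above `cost_fn` penalizes a classifier that accepts misunderstood utterances more heavily than one that rejects good ones.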

  9. Results – Individual Features • Baseline error 32.84% (when predicting the majority class) • All experiments involved 10-fold cross-validation
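
The majority-class baseline and the 10-fold split can be sketched as below. The helper names are hypothetical, and the folds here are contiguous index ranges; the slides do not say how folds were drawn, so in practice the data would likely be shuffled (or stratified) first.

```python
from collections import Counter


def majority_baseline_error(labels):
    """Error rate of always predicting the most frequent label."""
    counts = Counter(labels)
    majority_count = counts.most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)


def k_fold_indices(n, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds over n items."""
    # Spread any remainder over the first n % k folds.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

Each utterance appears in exactly one test fold, so averaging the per-fold error gives an estimate based on the whole corpus.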

  10. Results – Classifiers • T-test showed no statistically significant difference between the classifiers, except for Naïve Bayes • Explanation: the assumption of independence between features is violated • Baseline error 25.32% (GARBLE)
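
The slides do not say which form of t-test was used. One common choice when two classifiers are evaluated on the same cross-validation folds is a paired t-test over per-fold error rates; the sketch below computes that statistic under this assumption. The function name is hypothetical.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for two equal-length lists of per-fold error rates."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 folds")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))
```

With 10 folds there are 9 degrees of freedom, so a two-sided test at the 0.05 level would call the difference significant when the absolute value of the statistic exceeds about 2.262.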

  11. Future Work • Improve the classifiers • Additional features • Develop a cost model for understanding errors in dialog systems. • Study/optimize tradeoffs between FP and FN • Integrate value and confidence information to guide clarification in dialog systems

  12. Confusion Matrix • FP = False acceptance • FN = False detection/rejection • Fallout = FP / (FP + TN) = FP / NBAD • CDR = 1 - Fallout = 1 - (FP / NBAD)
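
The definitions above can be written out as a small helper. The reading assumed here is that "positive" means the utterance is accepted as OK, so an FP is a BAD utterance accepted as OK and NBAD = FP + TN; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # OK utterance labeled OK
    fn: int  # OK utterance labeled BAD (false detection/rejection)
    fp: int  # BAD utterance labeled OK (false acceptance)
    tn: int  # BAD utterance labeled BAD

    @property
    def total(self):
        return self.tp + self.fn + self.fp + self.tn

    @property
    def error_rate(self):
        # Fraction of all utterances that were misclassified.
        return (self.fp + self.fn) / self.total

    @property
    def fallout(self):
        # FP / NBAD, where NBAD = FP + TN is the number of BAD utterances.
        return self.fp / (self.fp + self.tn)

    @property
    def cdr(self):
        # CDR = 1 - Fallout, i.e. the share of BAD utterances that were caught.
        return 1.0 - self.fallout
```

For example, `Confusion(tp=60, fn=7, fp=8, tn=25)` describes 100 utterances of which 33 are BAD; 8 of those slip through, so the fallout is 8/33 and the CDR is 25/33. These counts are made up for illustration and are not results from the slides.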
