
Belief Updating in Spoken Dialog Systems

Belief Updating in Spoken Dialog Systems. Dan Bohus, www.cs.cmu.edu/~dbohus, dbohus@cs.cmu.edu, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 15217. The problem: spoken language interfaces lack robustness when faced with understanding errors. It stems mostly from speech recognition and spans most domains and interaction types.

Presentation Transcript


  1. Belief Updating in Spoken Dialog Systems
     Dan Bohus (www.cs.cmu.edu/~dbohus, dbohus@cs.cmu.edu)
     Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 15217

  2. problem
     spoken language interfaces lack robustness when faced with understanding errors
     • stems mostly from speech recognition
     • spans most domains and interaction types

  3. more concretely …
     S: What city are you leaving from?
     U: Urbana Champaign [OKAY IN THAT SAME PAY]
     S: Sorry I’m not sure I understood what you said. What city are you leaving from?
     U: Urbana Champaign [FOR MINUTE SINCE HEY]
     S: I’m still having trouble understanding you. To go on, I need you to answer the following question. What city are you leaving from?
     U: Chicago [CHICAGO]
     S: traveling from Chicago. Where would you like to go?
     U: Huntsville [SEOUL]
     S: traveling to Seoul. What day did you need to travel?
     U: no no I’m traveling to Birmingham [THE TRAVELING TO BERLIN P_M]
     S: traveling in the afternoon. Okay, what day would you be departing Chicago?
     U: /uh/ the tenth of august [AT THE TENTH OF AUGUST]
     S: a flight tomorrow… I have a Korean Airlines flight departing Chicago at 1:40pm, arrives Seoul at 5pm the next day. Do you want to take that?
     U: my destination is Birmingham [FLIGHT DESTINATION MR WEEKEND]
     S: traveling on Saturday, August 12th … I have a flight departing Chicago at 1:40pm arrives Seoul at ………

  4. non- and misunderstandings
     (the dialog from slide 3, shown again with two labels)
     • NON-understanding
     • MIS-understanding

  5. approaches for increasing robustness
     • fix recognition
     • gracefully handle errors through interaction
       • detect the problems
       • develop a set of recovery strategies
       • know how to choose between them (policy)

  6. six not-so-easy pieces …
     • misunderstandings: detection, strategies, policy
     • non-understandings: detection, strategies, policy

  7. belief updating (misunderstandings: detection)
     • construct more accurate beliefs by integrating information over multiple turns
     S: Where would you like to go?
     U: Huntsville [SEOUL / 0.65]
     destination = {seoul/0.65}
     S: traveling to Seoul. What day did you need to travel?
     U: no no I’m traveling to Birmingham [THE TRAVELING TO BERLIN P_M / 0.60]
     destination = {?}

  8. belief updating: problem statement
     • given:
       • an initial belief $P_{\mathrm{initial}}(C)$ over concept $C$
       • a system action $SA$
       • a user response $R$
     • construct an updated belief:
       • $P_{\mathrm{updated}}(C) \leftarrow f(P_{\mathrm{initial}}(C), SA, R)$
     destination = {seoul/0.65}
     S: traveling to Seoul. What day did you need to travel?
     [THE TRAVELING TO BERLIN P_M / 0.60]
     destination = {?}

  9. outline
     • related work
     • a restricted version
     • data
     • user response analysis
     • experiments and results
     • some caveats and future work
     related work : restricted version : data : user response analysis : experiment & results : caveats & future work

  10. confidence annotation + heuristic updates
      • confidence annotation
        • traditionally focused on word-level errors [Chase, Cox, Bansal, Ravishankar]
        • more recently: semantic confidence annotation [Walker, San-Segundo, Bohus]
        • machine learning approach
        • results fairly good, but not perfect
      • heuristic updates
        • explicit confirmation: no → don’t trust ; yes → trust
        • implicit confirmation: no → don’t trust ; otherwise → trust
        • suboptimal for several reasons
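
The heuristic rules on slide 10 can be sketched as a small function. This is an illustration only, not the rule set used in the system: the function name, the string labels, the mapping of "trust" and "don't trust" to confidences of 1.0 and 0.0, and the choice to leave the confidence unchanged after an OTHER response to an explicit confirmation are all assumptions, since the slide does not specify them.

```python
def heuristic_update(initial_conf, system_action, response_type):
    """Heuristic belief update after a confirmation action.

    system_action: "explicit" or "implicit" (confirmation type).
    response_type: "YES", "NO", or "OTHER".
    Returns the updated confidence of the top hypothesis.
    """
    if system_action not in ("explicit", "implicit"):
        raise ValueError("unknown system action: %r" % (system_action,))
    if response_type == "NO":
        # no -> don't trust (assumed here to mean confidence 0.0)
        return 0.0
    if system_action == "implicit":
        # implicit confirmation: anything other than "no" -> trust
        return 1.0
    if response_type == "YES":
        # explicit confirmation: yes -> trust
        return 1.0
    # Explicit confirmation with an OTHER response is not covered by the
    # slide; keeping the initial confidence is an assumption.
    return initial_conf


if __name__ == "__main__":
    print(heuristic_update(0.65, "implicit", "OTHER"))
```

Note how the implicit-confirmation rule trusts a value whenever the user does not say "no", which is one way such a rule can go wrong when users ignore an incorrect implicit confirmation.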

  11. correction detection
      • detect if the user is trying to correct the system [Litman, Swerts, Hirschberg, Krahmer, Levow]
      • machine learning approach
      • features from different knowledge sources in the system
      • results fairly good, but not perfect

  12. integration
      • confidence annotation and correction detection are useful tools
      • but separately, neither solves the problem
      • bring them together in a unified approach to accurately track beliefs

  13. outline
      • related work
      • a restricted version
      • data
      • user response analysis
      • experiments and results
      • some caveats and future work

  14. belief updating: general form
      • given:
        • an initial belief $P_{\mathrm{initial}}(C)$ over concept $C$
        • a system action $SA$
        • a user response $R$
      • construct an updated belief:
        • $P_{\mathrm{updated}}(C) \leftarrow f(P_{\mathrm{initial}}(C), SA, R)$

  15. restricted version: 2 simplifications
      • compact belief
        • system unlikely to “hear” more than 3 or 4 values
        • single vs. multiple recognition results
        • in our data: max = 3 values, only 6.9% have >1 value
        • confidence score of top hypothesis
      • updates after confirmation actions
      • reduced problem
        • $\mathrm{ConfTop}_{\mathrm{updated}}(C) \leftarrow f(\mathrm{ConfTop}_{\mathrm{initial}}(C), SA, R)$

  16. outline
      • related work
      • a restricted version
      • data
      • user response analysis
      • experiments and results
      • some caveats and future work

  17. data
      • collected with RoomLine
        • a phone-based mixed-initiative spoken dialog system
        • conference room reservation
        • search and negotiation
      • explicit and implicit confirmations
        • confidence threshold model (+ some exploration)
      • unplanned implicit confirmations
        • I found 10 rooms for Friday between 1 and 3 p.m. Would like a small room or a large one?

  18. corpus
      • user study
        • 46 participants (naïve users)
        • 10 scenario-based interactions each
        • compensated per task success
      • corpus
        • 449 sessions, 8848 user turns
        • orthographically transcribed
        • rich annotation: correct concepts, corrections, etc.

  19. outline
      • related work
      • a restricted version
      • data
      • user response analysis
      • experiments and results
      • some caveats and future work

  20. user response types
      • following Krahmer and Swerts
        • study on a Dutch train timetable information system
      • 3 user response types
        • YES: yes, right, that’s right, correct, etc.
        • NO: no, wrong, etc.
        • OTHER
      • cross-tabulated against correctness of confirmations

  21. user responses to explicit confirmations
      • from transcripts [numbers in brackets from Krahmer & Swerts]
      • from decoded
      (slide callout: ~10%; the response tables are not preserved in this transcript)

  22. other responses to explicit confirmations
      • ~70%: users repeat the correct value
      • ~15%: users don’t address the question
        • attempt to shift conversation focus

  23. user responses to implicit confirmations
      • from transcripts [numbers in brackets from Krahmer & Swerts]
      • from decoded
      (the response tables are not preserved in this transcript)

  24. ignoring errors in implicit confirmations
      • users correct later (40% of 118)
      • users interact strategically
        • correct only if essential

  25. outline
      • related work
      • a restricted version
      • data
      • user response analysis
      • experiments and results
      • some caveats and future work

  26. machine learning approach
      • need good probability outputs
        • low cross-entropy between model predictions and reality
        • cross-entropy = negative average log posterior
      • logistic regression
        • sample efficient
        • stepwise approach → feature selection
      • logistic model tree for each action
        • root splits on response-type
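
To make the evaluation criterion on slide 26 concrete, here is a minimal sketch of cross-entropy as the negative average log probability assigned to the true outcome, next to a thresholded error rate. The function names, the 0.5 threshold, and the clipping constant are assumptions. The results slides name a "hard error" and a "soft error" without defining them, so reading them as these two quantities is a guess rather than something the transcript confirms.

```python
import math


def cross_entropy(probs, labels, eps=1e-12):
    """Negative average log probability assigned to the true outcome.

    probs:  predicted probabilities that each value is correct.
    labels: 1 if the value was actually correct, 0 otherwise.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += math.log(p) if y else math.log(1.0 - p)
    return -total / len(labels)


def thresholded_error(probs, labels, threshold=0.5):
    """Fraction of cases where the thresholded prediction is wrong."""
    wrong = sum((p >= threshold) != bool(y) for p, y in zip(probs, labels))
    return wrong / len(labels)


if __name__ == "__main__":
    # Synthetic numbers for illustration only, not data from the corpus.
    probs = [0.9, 0.2, 0.6]
    labels = [1, 0, 0]
    print(cross_entropy(probs, labels), thresholded_error(probs, labels))
```

Cross-entropy rewards well-calibrated probabilities rather than just correct thresholded decisions, which is why it suits a model whose outputs are meant to be used as beliefs.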

  27. features. target.
      • initial situation
        • initial confidence score
        • concept identity, dialog state, turn number
      • system action
        • other actions performed in parallel
      • features of the user response
        • acoustic / prosodic features
        • lexical features
        • grammatical features
        • dialog-level features
      • target: was the value correct?
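
The model shape described on slides 26 and 27 (a logistic model tree whose root splits on response type, predicting whether the value was correct) can be sketched roughly as follows, for a single system action. Everything concrete here is hypothetical: the feature names `initial_conf` and `barge_in`, the dictionary layout, and above all the coefficients, which are arbitrary placeholders and not values learned from the RoomLine corpus.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# One logistic-regression leaf per response type (the root split).
# Placeholder coefficients for illustration only; NOT learned values.
LEAF_MODELS = {
    "YES": {"bias": 1.5, "initial_conf": 2.0, "barge_in": -0.5},
    "NO": {"bias": -2.0, "initial_conf": 1.0, "barge_in": -0.5},
    "OTHER": {"bias": 0.0, "initial_conf": 2.5, "barge_in": -1.0},
}


def updated_confidence(response_type, features):
    """Estimate P(value is correct) for one confirmation turn.

    response_type selects the leaf; features maps feature names to
    numeric values (features missing from the dict count as 0.0).
    """
    weights = LEAF_MODELS[response_type]
    z = weights["bias"]
    for name, w in weights.items():
        if name != "bias":
            z += w * features.get(name, 0.0)
    return sigmoid(z)


if __name__ == "__main__":
    print(updated_confidence("YES", {"initial_conf": 0.65}))
    print(updated_confidence("NO", {"initial_conf": 0.65}))
```

In a real system the leaf coefficients would be fitted per system action on annotated turns, with stepwise feature selection as the slide mentions.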

  28. baselines
      • initial baseline
        • accuracy of system beliefs before the update
      • heuristic baseline
        • accuracy of heuristic rule currently used in the system
      • oracle baseline
        • accuracy if we knew exactly when the user is correcting the system

  29. results: explicit confirmation (charts: hard error (%), soft error; values not preserved in this transcript)

  30. results: implicit confirmation (charts: hard error (%), soft error; values not preserved in this transcript)

  31. results: unplanned implicit confirmation (charts: hard error (%), soft error; values not preserved in this transcript)

  32. informative features
      • initial confidence score
      • prosody features
      • barge-in
      • expectation match
      • repeated grammar slots
      • concept id

  33. outline
      • related work
      • a reduced version. approach
      • data
      • user response analysis
      • experiments and results
      • some caveats and future work

  34. eliminate simplification 1
      • current restricted version
        • belief = confidence score of top hypothesis
        • only 6.9% of cases had more than 1 hypothesis
      • extend to
        • $N$ hypotheses + 1 (other), where $N$ is a small integer (2 or 3)
      • approach: multinomial generalized linear model
        • use information from multiple recognition hypotheses
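
One way to picture the extended belief on slide 34 is a distribution over the N recognized hypotheses plus an "other" bucket. The sketch below normalizes per-hypothesis scores with a softmax, which is the link function of a multinomial logit model. The slide only names a "multinomial generalized linear model", so the softmax choice, the function name, and the example scores are assumptions; in a fitted model the scores would be linear functions of the features.

```python
import math


def multinomial_belief(scores):
    """Turn per-hypothesis scores into a probability distribution.

    scores: dict mapping each hypothesis (plus "other") to a real-valued
    score, e.g. a linear predictor from a fitted model.
    """
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


if __name__ == "__main__":
    # Hypothetical scores for N = 2 hypotheses plus "other".
    belief = multinomial_belief({"seoul": 1.0, "berlin": 0.5, "other": 0.0})
    for value, prob in belief.items():
        print(value, round(prob, 3))
```

Keeping an explicit "other" category lets the belief express that none of the recognized values may be correct, which the top-hypothesis confidence alone cannot do.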

  35. eliminate simplification 2
      • current restricted version
        • only updates following system confirmation actions
        • users might correct the system at any point
      • extend to
        • updates after all system actions

  36. shameless self promotion
      (grid from slide 6: misunderstandings / non-understandings × detection / strategies / policy)
      • rejection threshold adaptation
      • non-understanding impact on performance [Interspeech-05]
      • comparative analysis of 10 recovery strategies [SIGdial-05]
      • wizard experiment
      • towards learning non-understanding recovery policies [SIGdial-05]

  37. shameless CMU promotion
      • Ananlada (Moss) Chotimongkol: automatic concept and task structure acquisition
      • Antoine Raux: turn-taking, conversation micro-management
      • Jahanzeb Sherwani: multimodal personal information management
      • Satanjeev Banerjee: meeting understanding
      • Stefanie Tomko: universal speech interface
      • Thomas Harris: multi-participant dialog
      • DoD / Young Researchers’ Roundtable

  38. thank you!

  39. a more subtle caveat
      • distribution of training data
        • confidence annotator + heuristic update rules
      • distribution of run-time data
        • confidence annotator + learned model
      • always a problem when interacting with the world
      • hopefully, distribution shift will not cause large degradation in performance
        • remains to validate empirically
        • maybe a bootstrap approach?
