Back Channel Communication

Back Channel Communication. Antoine Raux Dialogs on Dialogs 02/25/2005. Outline. From Back Channel to backchannels Function of the Back Channel Characteristics of the Back Channel The Back Channel in Spoken Dialogue Systems. From back channel….


Presentation Transcript


  1. Back Channel Communication Antoine Raux Dialogs on Dialogs 02/25/2005

  2. Outline • From Back Channel to backchannels • Function of the Back Channel • Characteristics of the Back Channel • The Back Channel in Spoken Dialogue Systems

  3. From back channel… • 1970s: Conversation Analysts attempt to describe systematic rules for turn-taking management • Goal: minimize gaps and overlaps between speakers • But natural speech contains many overlaps • E.g.: “mm-hmm”, “okay”, “yeah”… • “Back channel” (Yngve 1970): Parallel channel for communication (Duncan 1972) • “Back channel communication does not constitute a turn or a claim for a turn” • But it “may participate in a variety of communication functions, including the regulation of speaking turns.”

  4. …to backchannels • “Backchannel”: listener-produced signal such as “mm-hmm”, “yeah”…(“To backchannel”: to produce such signals) • Does not imply the will to take the turn • Implies some form of acknowledgment (in general)

  5. Front vs Back Channel

  6. Front-channel cues to back-channel signals • Koiso et al (1998) • Analyze the relationship between different syntactic and prosodic features and the occurrence of backchannels

  7. Koiso et al (Methodology) • Data: 8 dialogs from the Japanese Map Task corpus (a replica of the Edinburgh Map Task), face-to-face and speech-only (no difference) • Features: syntactic (POS); duration of last mora (normal/long/short); F0 pattern of last mora (flat-fall, rise…); peak F0 (low/high); energy pattern (late-decr, decr, no-decr); peak energy (low/high)

  8. Koiso et al (Results) • Frequency of feature values

  9. Koiso et al (Results) • Decision Tree analysis • Compare the loss in performance by not using each feature • POS: single best feature • Prosodic features altogether: as good as POS
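
The "loss in performance by not using each feature" comparison can be sketched as a leave-one-feature-out ablation. This is only an illustration: a majority-vote lookup table stands in for the decision tree, the rows are made-up toy data (not from the corpus), accuracy is measured on the training rows themselves, and the names `fit_lookup`, `accuracy`, and `ablation` are hypothetical.

```python
from collections import Counter, defaultdict

def fit_lookup(rows, labels, features):
    """Stand-in classifier (NOT a decision tree): majority label per feature combination."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[tuple(row[f] for f in features)][label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen combinations
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    return lambda row: table.get(tuple(row[f] for f in features), default)

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def ablation(rows, labels, features):
    """Accuracy lost when each feature is withheld, relative to using all features."""
    full = accuracy(fit_lookup(rows, labels, features), rows, labels)
    losses = {}
    for f in features:
        kept = [g for g in features if g != f]
        losses[f] = full - accuracy(fit_lookup(rows, labels, kept), rows, labels)
    return losses

# Toy data (invented for illustration): the label depends only on POS.
rows = [
    {"pos": "particle", "f0": "fall"},
    {"pos": "particle", "f0": "rise"},
    {"pos": "particle", "f0": "fall"},
    {"pos": "noun", "f0": "fall"},
    {"pos": "noun", "f0": "rise"},
    {"pos": "noun", "f0": "rise"},
]
labels = ["BC", "BC", "BC", "no-BC", "no-BC", "no-BC"]
losses = ablation(rows, labels, ["pos", "f0"])
```

The feature whose removal costs the most accuracy is the most informative one in this sense; on the toy rows that is `pos`, by construction.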

  10. Koiso et al (Discussion) • Some POS strongly inhibit BC • Individual prosodic features are not good indicators of BC occurrence • BC occurrence is conditioned by both POS and prosody (as a whole) • What about other languages? • What about BC overlapping with speech?

  11. BC cues in English and Japanese • Ward and Tsukahara (2000) • Tests one hypothesis (“BC are triggered by low pitch cues”) for two languages

  12. The Low Pitch Cue • Both in American English and Japanese, it appears that “after a region of low pitch lasting 110 ms the listener tends to produce back-channel feedback”. • Goal of this paper: quantitatively test this on naturally occurring conversations

  13. Ward and Tsukahara (Methodology) • Data: English: 8 conversations, 12 speakers (first author participates in 5 conversations!); Japanese: 18 conversations, 24 speakers • Prediction: every 10 ms, decide BC/no-BC by applying a hand-coded rule with 5 parameters tuned to the data
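
A frame-by-frame rule of this kind might look like the sketch below. Only the 10 ms decision interval and the 110 ms low-pitch duration come from the slides; the function name `predict_backchannels`, the absolute `low_threshold`, and the `refractory_ms` gap between predictions are assumptions made for illustration and are not the paper's five tuned parameters.

```python
from typing import List, Optional

FRAME_MS = 10      # decision interval (from the slide)
MIN_LOW_MS = 110   # required duration of the low-pitch region (from the slide)

def predict_backchannels(f0: List[Optional[float]],
                         low_threshold: float,
                         refractory_ms: int = 1000) -> List[int]:
    """Return the times (ms) at which a backchannel is predicted.

    f0 holds one pitch value per 10 ms frame, None for unvoiced frames.
    low_threshold and refractory_ms are illustrative assumptions.
    """
    needed = -(-MIN_LOW_MS // FRAME_MS)  # ceiling division: 11 frames
    predictions: List[int] = []
    run = 0                              # consecutive low-pitch frames so far
    last: Optional[int] = None           # time of the previous prediction
    for i, value in enumerate(f0):
        if value is not None and value < low_threshold:
            run += 1
        else:
            run = 0                      # unvoiced or higher pitch ends the region
        t = (i + 1) * FRAME_MS           # time at the end of this frame
        if run >= needed and (last is None or t - last >= refractory_ms):
            predictions.append(t)
            last = t
    return predictions

# 200 ms of higher pitch, 150 ms of low pitch, then higher pitch again.
example = predict_backchannels([200.0] * 20 + [90.0] * 15 + [200.0] * 10,
                               low_threshold=100.0)
```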

  14. Ward and Tsukahara (Results) • Each predicted BC was considered correct if it fell within 500ms of an actual BC • Low pitch region rule is better than chance both in English and Japanese
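
The tolerance-window scoring can be sketched as follows. The 500 ms window is from the slide; treating it as symmetric (before or after the actual BC), letting one actual BC validate several predictions, and the name `score_predictions` are assumptions.

```python
def score_predictions(predicted_ms, actual_ms, tolerance_ms=500):
    """Return (precision, coverage) under a symmetric tolerance window.

    precision: fraction of predicted BCs lying within tolerance of some actual BC.
    coverage:  fraction of actual BCs lying within tolerance of some prediction.
    """
    correct = sum(
        1 for p in predicted_ms
        if any(abs(p - a) <= tolerance_ms for a in actual_ms)
    )
    covered = sum(
        1 for a in actual_ms
        if any(abs(p - a) <= tolerance_ms for p in predicted_ms)
    )
    precision = correct / len(predicted_ms) if predicted_ms else 0.0
    coverage = covered / len(actual_ms) if actual_ms else 0.0
    return precision, coverage

# Made-up times in milliseconds, for illustration only.
precision, coverage = score_predictions([1000, 3000, 9000], [1200, 3600, 5000])
```

Widening `tolerance_ms` can only raise both scores, which is one reason the choice of window size matters when comparing results.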

  15. Ward and Tsukahara (Results) • Issues: • Evaluation (tolerance window size, speakers produce BCs with different frequencies…) • No actual comparison between languages • Are low pitch regions and BCs simply correlated with other phenomena (syntactic completion, disfluencies…) or is there a direct cause/consequence relationship?

  16. Effects of Native Language and Gender on BC • Feke (2003) • Conversation Analysis study of BC in native-English and native-Spanish, same- and mixed-gender dialogs

  17. Definition of BC • BC: responses of the participant who is “clearly not holding the floor”… • Very loose compared to previous papers: • e.g. “How did you find Quechua?” is a BC • Distinguishes In-Between BC and Overlap BC

  18. Feke (Methodology) • Recorded 8 non-scripted conversations between 8 different speakers (2 native languages x 2 genders x 2 subjects) • Manually coded In-Between BCs and Overlap BCs

  19. Feke (Results) • No differences observed across cultures • Participants of both genders tend to use more BC when conversing with someone of the opposite gender • Difference seems bigger for females than for males

  20. Feke (Discussion) • Interesting/surprising result from the ethnological/sociological point of view • Very few data points, no significance analysis • Only looked at number of BCs • Consequences on SDS? (e.g. using gender information in BC prediction, selecting the gender of an agent…)

  21. BC in Practical Systems… • Takeuchi et al (2003) • Method to determine the timing of turn transitions and aizuchi (≈BC) on a Japanese human-human corpus

  22. Takeuchi (Approach) • Similar to Koiso et al, but only using automatically extracted features • Every 100 ms decide between: • Take turn • Aizuchi (BC) • Leave turn (wait)

  23. Takeuchi (Approach) • Decision Tree using • Syntax (POS, content/function words) • Utterance duration • Pause duration/pause since last content wd • Content word duration • F0 • Power

  24. Takeuchi (Results) • Precision/recall of frame classification: around 80% on the training set, less than 50% on a test set • Subjective evaluation: BCs artificially inserted at the predicted times; timing was judged “good” in 70-80% of cases; on real utterances: 72% (!)
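
Frame-level precision and recall for one class can be computed as in this sketch. The label strings, the example sequences, and the name `precision_recall` are illustrative assumptions, not data or code from the study.

```python
def precision_recall(predicted, reference, label):
    """Precision and recall for one class over aligned per-frame label sequences."""
    true_pos = sum(p == label and r == label for p, r in zip(predicted, reference))
    n_predicted = sum(p == label for p in predicted)
    n_reference = sum(r == label for r in reference)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_reference if n_reference else 0.0
    return precision, recall

# One label per 100 ms frame (made-up sequences for illustration).
predicted = ["wait", "aizuchi", "aizuchi", "take", "wait"]
reference = ["wait", "aizuchi", "wait", "take", "aizuchi"]
aizuchi_scores = precision_recall(predicted, reference, "aizuchi")
```

A frame-level score like this penalizes a BC predicted one frame early or late as fully wrong, which may partly explain why listeners can still judge the timing acceptable.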

  25. Takeuchi (Discussion) • Found that syntactic information did not help (contradicts Koiso?) • Underscores the difficulty of evaluating turn-taking/backchanneling systems

  26. Conclusion • Hard to account for simultaneous turns in conversation • Back Channel framework offers one explanation • But most work remains very specific • Missing a good theory of conversation…
