Speech, Natural Language, & Affect in Tutorial Dialogue Systems

This article discusses the use of speech and natural language in tutorial dialogue systems, highlighting the challenges and advantages. It also explores the integration of affective computing in these systems. The author provides examples and outlines a research approach for speech-based computer tutors.

Presentation Transcript


  1. Speech, Natural Language, & Affect in Tutorial Dialogue Systems Prof. Diane J. Litman Computer Science Department, Intelligent Systems Program, & Learning Research and Development Center http://www.cs.pitt.edu/~litman

  2. A few words about me… • Currently • Professor in CS and ISP • Research Scientist at LRDC • ITSPOKE research group (3 PhD students, 1 CS ugrad, 2 postdocs, 1 programmer) • AI Research (speech and natural language, intelligent tutoring) • Discourse and dialogue • Prosody, spoken dialogue systems • Speech and language technology for education (take my spring seminar!) • Reinforcement learning, user simulation • Affective computing • AI and education • Cognitive science • Previously • Member Technical Staff, AT&T Labs Research, NJ • Assistant Professor, CS at Columbia University, NY • AI Research (speech and NLP, knowledge representation and reasoning, plan recognition)

  3. Speech-based Computer Tutors • What are they? • Example • Tutor: Well, if an object has non zero constant velocity, is it moving or staying still? • Student: Moving • Tutor: Yep. If it’s moving, then its position is changing. So then what will happen to the packet’s horizontal displacement from the point of its release? • Student: It will change • Intersection of two fields: • Intelligent Tutoring Systems (ITS) • Spoken Dialogue Systems (SDS)

  4. Intelligent Tutoring Systems (ITS) • Education • Classroom instruction [most frequent form] • Human (one-on-one) tutoring [most effective form] • Computer tutors – Intelligent Tutoring Systems • Not as good as human tutors • Ways to address the performance gap • Language technologies • Text-based dialogue • Talking heads • Speech-based dialogue: react to how in addition to what • Affective computing

  5. Adding speech to ITS • Spoken Dialogue Systems (SDS) • Systems that interact with users via speech • Advantages • Naturalness • Efficiency • Eye and hands free • Domains • Information access [Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000] • Tutoring [Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006] • Assistants [Allen et al., 2001; Rayner et al., 2005; Acomb et al., 2007]

  6. Challenges in ITS • What does it mean to teach a subject? • What to teach? • Designing instruction • Delivering instruction • Understanding the human learning process

  7. Challenges in SDS • Automated speech recognition (ASR) • Sphinx, Microsoft Speech, Dragon Naturally Speaking • Natural language understanding (NLU) • Dialogue Management (DM) • How to keep the conversation going? Best strategy? • How to detect errors in communication? • How to recover from errors? • Spoken language generation
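To make the ASR → NLU → DM → NLG pipeline above concrete, here is a minimal, self-contained sketch of one dialogue turn. Every component is a toy placeholder (ITSPOKE's actual modules, such as Sphinx2 for ASR and Cepstral for TTS, are not shown), and all function names are illustrative assumptions, not any system's real API.

```python
# Toy sketch of one turn through a spoken dialogue system pipeline:
# ASR -> NLU -> DM -> NLG (TTS omitted). All components are placeholders,
# not ITSPOKE's actual modules.

def recognize(audio):
    # ASR placeholder: a real system would call a recognizer such as Sphinx;
    # here the "audio" is already a transcript (possibly with errors).
    return audio

def understand(text):
    # NLU placeholder: map the student answer to a coarse correctness label.
    return "correct" if "moving" in text.lower() else "incorrect"

def decide(label, state):
    # DM placeholder: pick the next tutor move and update the dialogue state.
    move = "acknowledge_and_advance" if label == "correct" else "remediate"
    state.append(move)
    return move, state

def generate(move):
    # NLG placeholder: template-based realization of the chosen tutor move.
    templates = {
        "acknowledge_and_advance": "Yep. If it's moving, then its position is changing.",
        "remediate": "Well, let's look at that more carefully.",
    }
    return templates[move]

def run_turn(student_utterance, state):
    text = recognize(student_utterance)
    label = understand(text)
    move, state = decide(label, state)
    return generate(move), state

reply, state = run_turn("Moving", [])
print(reply)  # -> Yep. If it's moving, then its position is changing.
```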

  8. Outline • ITSPOKE • Main research tools & projects • Comparing systems • Modeling learning • Interactions between phenomena • Other projects

  9. How to do research in speech-based computer tutors • ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) [Litman and Silliman, 2004] • Speech-enabled version of the Why2-Atlas computer tutor [VanLehn, Jordan, Rose et al., 2002] • Domain: Qualitative physics • Sample ITSPOKE problem • Suppose a man is in a free-falling elevator and is holding his keys motionless right in front of his face. He then lets go. What will be the position of the keys relative to the man's face as time passes? Explain.

  10. Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002] • Sphinx2 speech recognition and Cepstral text-to-speech

  13. Human-Computer Excerpt
  Tutor26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
  Student27: dammit (ASR: it is)
  Tutor28: Could you please repeat that?
  Student29: same (ASR: i same)
  Tutor30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
  Student31: zero (ASR: the zero)
  Tutor32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
  Student33: oh gravity you already said this (ASR: beats gravity you're exert less)
  Tutor34: Fine. Are there any other forces acting on the apple as it falls?
  Student35: no why are you doing this again (ASR: no y and to it yes)
  Tutor36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
  Student37: downward you computer (ASR: downward you computer)

  14. How ITSPOKE/WHY works [Diagram: Problem → Essay → Dialogue with ITSPOKE (questions Q1, Q2, Q3)] • Simplified conversation structure • Question-answer format • Tutoring information authored in a hierarchical structure – KCDs [VanLehn, Jordan, Rosé et al., 2002]

  15. ITSPOKE behavior [Diagram: essay submission & analysis, followed by questions Q1–Q5, with a remediation subdialogue]

  16. Sample KCD (Knowledge Construction Dialogue)
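The hierarchical KCD authoring idea above (a main line of questions, with wrong answers pushing into a remediation subdialogue and then popping back up) can be pictured with a small data structure. This is a hypothetical sketch: the field names, content, and traversal are illustrative, not the Why2-Atlas authoring format.

```python
# Hypothetical sketch of a KCD as a tree of questions: an incorrect answer
# pushes into a remediation subdialogue, then pops back to the main line.
# Field names and content are illustrative, not the Why2-Atlas format.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class KCDQuestion:
    prompt: str
    expected: str
    remediation: List["KCDQuestion"] = field(default_factory=list)

def ask(question: KCDQuestion, answer_fn: Callable[[str], str]) -> None:
    """Ask one question; if the answer is wrong, recurse into its remediation subdialogue."""
    print("TUTOR:", question.prompt)
    answer = answer_fn(question.prompt)
    print("STUDENT:", answer)
    if answer.strip().lower() != question.expected:
        for sub in question.remediation:
            ask(sub, answer_fn)

kcd = KCDQuestion(
    prompt="What is the direction of the net force on the keys?",
    expected="down",
    remediation=[KCDQuestion("What is the only force acting on the keys?", "gravity")],
)

# Simulate a student who answers the main question wrong, triggering remediation.
scripted_answers = iter(["up", "gravity"])
ask(kcd, lambda prompt: next(scripted_answers))
```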

  17. Outline • ITSPOKE • Main research tools & projects • Comparing systems • Modeling learning • Interactions between phenomena • Other projects

  18. Comparing systems • Metrics • Subjective metrics • Questionnaire at the end – agreement with statements like: • “It was easy to learn from the tutor” • “I enjoyed working with the tutor” • “It was easy to lose track of where I was in the conversation” • Problems • Unreliable • Need for standardization (psychometrics)

  19. Comparing systems (2) • Objective metrics • Learning gain (the difference between pretest and posttest scores) • Time spent with the computer tutor • Correctness of student answers • Number of help/hint requests • Speech recognition performance
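A quick sketch of the learning metric above: raw gain is simply posttest minus pretest. The normalized variant (gain over the room left to improve) is a common convention added here for illustration only, not something taken from the slides.

```python
# Learning gain as used on the slide: posttest minus pretest.
# Normalized gain (gain over the room left to improve) is a common variant,
# included for illustration only.

def raw_gain(pretest: float, posttest: float) -> float:
    return posttest - pretest

def normalized_gain(pretest: float, posttest: float, max_score: float = 100.0) -> float:
    return (posttest - pretest) / (max_score - pretest) if max_score > pretest else 0.0

print(raw_gain(42.0, 71.0))         # 29.0
print(normalized_gain(42.0, 71.0))  # 0.5
```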

  20. Comparing systems - projects • Speech vs. typed tutoring [Litman, Rose, Forbes-Riley et al., 2006] • Human-speech vs. human-typed • Better posttest scores • Shorter instruction time • Computer-speech (ITSPOKE) vs. computer-typed (WHY) • No improvement in learning • No degradation due to ASR • Prerecorded vs. synthesized voice [Forbes-Riley et al., 2006] • No degradation due to the synthesized voice • With high ASR error rates, students preferred the synthesized voice more • Navigation Map ITSPOKE vs. regular ITSPOKE [Rotaru and Litman, 2007; Rotaru, PhD thesis] • Affective ITSPOKE vs. regular ITSPOKE [Forbes-Riley, Litman et al., 2008a, 2008b, in progress]

  21. The Navigation Map • Outline of the conversation • Theoretical concept: discourse structure • Hypotheses • Easier for users to follow the conversation • Preferred / learn more

  22. Discourse segment hierarchy
  [Slide annotations: discourse segments; intention/purpose structure; solution walkthrough. Shown alongside is the Navigation Map outline: two time frames (before release, after release); before release: man's velocity vs. keys' velocity; after release, the recipe Forces → Net force → Acceleration → Velocity, starting with the forces on the man and the net force on the man, …]
  TUTOR1: To analyze this problem we will first describe the motion of the person and his keys while he is holding them. Then we will look at the motion of the person and his keys after he lets go of them. Let's begin by looking at the motion of the man and his keys while he is holding them. How does his velocity compare to that of his keys?
  USER1: it’s the same
  TUTOR2: Yes. So while he is holding his keys they both have the same velocity. Now let's see what happens when he releases them. To do so we will first find the individual forces exerted on the man and his keys, then both net forces, both accelerations, and finally we will describe and compare the velocities. So what are the forces exerted on the man after he releases his keys? Please, specify their directions (e.g., vertically up).
  USER2: vertically down gravity
  TUTOR3: Good. Because there is just one force, gravity's, it is trivial to determine the NET force (i.e., the vector sum of all forces). So what is the direction of the NET force on the person? …

  23. Experiment procedure [Procedure diagram elements: Read, Pretest, Problem 1, Problem 2, Questionnaires, NM Survey, Posttest, Interview; the F and S conditions swap NM/noNM across the two problems to isolate differences due to the NM] • Experiment design • Within-subjects design • 1 problem with the NM; 1 without the NM (noNM) • Rate tutor after each problem • 16 questions, 1 (Strongly Disagree) – 5 (Strongly Agree) scale • Two conditions (to account for order and problem) • F (First): 1st problem NM; 2nd problem noNM • S (Second): 1st problem noNM; 2nd problem NM

  24. Experiment design (2) • ITSPOKE dialogue history was disabled • Compare Audio-Only versus Audio+Visual (NM) [Screenshots of the NM and noNM interfaces]

  25. Results – subjective metrics • The NM showed trend/significant effects on system perception during the dialogue (ratings on a scale from 1 = Strongly Disagree to 5 = Strongly Agree) [Results chart omitted]

  26. Outline • ITSPOKE • Main research tools & projects • Comparing systems • Modeling learning • Interactions between phenomena • Other projects

  27. Modeling learning [Diagram: measured events such as correctness and time spent, together with pretest and posttest scores, feed into a model of learning] • Problem: What contributes to/causes learning? • Correlations with learning • Events that significantly correlate with learning • Does not imply causality, but it is a requirement for it • What events to measure?

  28. What events? • Time on task (+), number of student words (+) [Litman, Rose, Forbes-Riley et al., 2006] [Forbes-Riley, Rotaru and Litman, 2008] • Student emotions [Forbes-Riley, Rotaru and Litman, 2008] • Neutral on certainty (-) • Neutral on frustration (-) • Type of turns – in human-human tutoring [Forbes-Riley et al., 2005] • Student: introduce new concept (+) • Tutor: control dialogue (-) • Discourse-structure-inspired parameters [Rotaru and Litman, 2006] • Computational implications?
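The correlational analyses above boil down to relating a per-student measure of some event to a per-student learning measure. Here is a toy sketch with made-up numbers; the published studies' statistical setup (for example, partial correlations controlling for pretest) may differ.

```python
# Toy correlational analysis: does a per-student event count (here, number of
# student words) correlate with learning gain? Data are made up; the cited
# studies' actual statistical setup may differ.
from scipy.stats import pearsonr

student_words = [120, 340, 210, 450, 300, 150, 380, 260]
learning_gain = [10, 28, 15, 35, 22, 12, 30, 18]

r, p = pearsonr(student_words, learning_gain)
print(f"r = {r:.2f}, p = {p:.3f}")  # a significant r makes the event a candidate cause
```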

  29. …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… Intuition 1 – Conditioning Student learned? Correctness: Incorrect Correct Correct Incorrect Correct Incorrect Incorrect Correct Correct Incorrect Incorrect Correct Correct Correct • It is more important to be correct at specific “places in the dialogue”. • Phenomena related to performance: • not uniformly important across the dialogue • have more weight at specific places in the dialogue. • Discourse structure can be used to define “places in the dialogue”

  30. Intuition 1 – Results [Diagram: discourse structure over questions Q1, Q2 (with subquestions Q2.1, Q2.2), Q3, annotated with correctness] • Transition–correctness parameters • PopUp–Correct, PopUp–Incorrect • Interpretation: capture successful learning events or failed learning opportunities • Generalizes across corpora • ITSPOKE modification: engage in an additional remediation dialogue
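One way to picture the transition–correctness parameters: annotate each student turn with its discourse transition and its correctness, then count the pairs. This is a toy sketch; only the PopUp and Push labels come from the slides, and the other transition names and the annotations are illustrative.

```python
# Toy computation of transition-correctness parameters such as PopUp-Correct:
# count how often each discourse transition co-occurs with each correctness
# label. Only PopUp and Push come from the slides; other labels are illustrative.
from collections import Counter

annotated_turns = [
    ("NewTopic", "Correct"), ("Push", "Incorrect"), ("Advance", "Correct"),
    ("PopUp", "Incorrect"), ("Advance", "Correct"), ("PopUp", "Correct"),
]

params = Counter(f"{transition}-{correctness}" for transition, correctness in annotated_turns)
print(params["PopUp-Correct"], params["PopUp-Incorrect"])  # 1 1
```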

  31. …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… …………… Intuition 2 – Discrimination Student that learned more Student that learned less Different discourse structure

  32. Intuition 2 – Results [Diagram: discourse structure over Q1, Q2 (with Q2.1, Q2.1.1, Q2.1.2, and Q2.2), Q3] • Transition–transition parameters • Push–Push • Interpretation: the system uncovers potential major knowledge gaps

  33. Other events • Psychology-inspired • Models of reading comprehension – Landscape Model [Ward and Litman, 2005] • Alignment model – lexical and prosodic convergence [Ward and Litman, 2007a, 2007b] • NLP-inspired • Cohesion – lexical co-occurrence [Ward and Litman, 2006]
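As a rough stand-in for the cohesion and lexical-convergence measures cited above, the overlap between tutor and student vocabularies can be computed directly. The published measures are more refined, so treat this only as an illustration of the idea.

```python
# Rough illustration of lexical cohesion/convergence: Jaccard overlap between
# the word types used in tutor turns and in student turns. The cited papers
# use more refined measures; this only sketches the idea.

def lexical_overlap(tutor_turns, student_turns):
    tutor_vocab = {w.lower() for turn in tutor_turns for w in turn.split()}
    student_vocab = {w.lower() for turn in student_turns for w in turn.split()}
    union = tutor_vocab | student_vocab
    return len(tutor_vocab & student_vocab) / len(union) if union else 0.0

print(lexical_overlap(["what happens to the velocity of the body"],
                      ["the velocity stays the same"]))  # ~0.22
```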

  34. From Correlations to Causality [Diagram: discourse structure over Q1, Q2 (Q2.1, Q2.2), Q3; Incorrect → more tutoring] • Correlation does not imply causality • But can inform modifications • E.g., more instruction after PopUp–Incorrect events • E.g., different instruction depending on student uncertainty

  35. Outline • ITSPOKE • Main research tools & projects • Comparing systems • Modeling learning • Interactions between phenomena • Other projects

  36. Interactions between phenomena • Things interact in a dialogue • Student correctness → tutor reply • Student emotion → tutor reply • Why look for interactions? • Capture human tutor behavior • Extract new patterns • Allow us to formulate hypotheses • How to find interactions? • Dependency tests: χ2 (chi-square) • Example with 2 windows
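The dependency test mentioned above can be run directly on a contingency table of counts. A toy example with made-up numbers, testing whether student (un)certainty and subsequent speech recognition errors are independent:

```python
# Toy chi-square dependency test: is student (un)certainty independent of
# whether the next turn has a speech recognition error? Counts are made up.
from scipy.stats import chi2_contingency

#                  ASR error   no ASR error
table = [[30, 70],    # student uncertain
         [15, 135]]   # student certain

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")  # small p -> reject independence
```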

  37. Projects • Certainty → human tutor reply [Forbes-Riley and Litman, 2005] • Student uncertainty associated with • Increase in Bottom-up replies • Decrease in Expansions • Student certainty associated with • Increase in Restatements • Speech recognition errors [Rotaru and Litman, 2005, 2006a, 2006b] • Speech recognition errors → next student state • Increase in frustration • Student state → speech recognition errors • Incorrect, Uncertain, Frustrated → more speech errors • Discourse structure → speech recognition errors

  38. Other projects • Affective computing (Kate Forbes-Riley's postdoc) • Emotion prediction • What are the important emotions in tutoring? • How to predict them? • Emotion adaptation/handling • Model human tutor behavior • Formulate hypotheses from empirical analysis • Reinforcement Learning and User Modeling • System learns the best way to react from rewards (Min Chi's PhD) • Needs a lot of data → user simulations (Hua Ai's PhD)
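The reinforcement-learning line of work can be pictured with a heavily simplified sketch: a table of action values updated from rewards produced by a crude simulated student. Everything here (states, actions, the simulator, and the one-step bandit-style update) is a toy assumption, not the actual systems built in those projects.

```python
# Toy sketch of learning a tutor policy from rewards given by a simulated
# student. A one-step, bandit-style simplification; states, actions, and the
# simulator are all made up, not the actual systems from these projects.
import random

states = ["correct_certain", "correct_uncertain", "incorrect"]
actions = ["advance", "give_hint", "remediate"]
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, epsilon = 0.1, 0.2

def simulated_reward(state, action):
    # Crude user simulation: remediation helps after errors, advancing otherwise.
    if state == "incorrect":
        return 1.0 if action == "remediate" else -0.5
    return 1.0 if action == "advance" else 0.0

for _ in range(5000):
    s = random.choice(states)
    if random.random() < epsilon:
        a = random.choice(actions)                 # explore
    else:
        a = max(actions, key=lambda x: Q[(s, x)])  # exploit
    Q[(s, a)] += alpha * (simulated_reward(s, a) - Q[(s, a)])

print(max(actions, key=lambda a: Q[("incorrect", a)]))  # typically "remediate"
```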

  39. Resources • Recommended classes • Introduction to Natural Language Processing • Foundations of Artificial Intelligence • Machine Learning • Knowledge Representation • Seminar classes • Advanced Topics in Artificial Intelligence (Speech and Language Technology for Educational Applications (this spring!), Affective Spoken Dialogue Systems, Spoken Dialogue Systems, etc.) • Other resources • ITSPOKE Group Meetings • NLP @ Pitt • DoD @ CMU • YRRSDS • ISP Forum • PSLC

  40. Further information • Visit my homepage and talk with me • http://www.cs.pitt.edu/~litman • Take my seminar (CS 3710), projects course (CS 2002) • Talk with members of the ITSPOKE group • http://www.cs.pitt.edu/~litman/itspoke.html
