Realizing the Interactive Speech Interface in a Multi-user Virtual Environment

Advisor: Tsai-Yen Li. Author: Chun-Feng Liao. NCCU Department of Computer Science, Intelligent Media Lab. July 2004.

Presentation Transcript


  1. Realizing the Interactive Speech Interface in a Multi-user Virtual Environment Advisor Tsai-Yen Li Author Chun-Feng Liao NCCU Department of Computer Science Intelligent Media Lab July 2004

  2. Agenda (ordered from more abstract / high level to more concrete / low level) • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  3. Agenda • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  4. Introduction • Applications of 3D virtual environments and voice user interfaces (VUI) have received significant attention recently. • Incorporating a VUI into virtual environments can enhance user interaction and immersion. • Most related research does not provide an effective mechanism for multi-user dialog management.

  5. Contributions of this Research MUVE = Multi-user Virtual Environment • Suggest a MUVE dialog model based on the VoiceXML dialog model. • Propose a way to integrate a speech interface into MUVE. • XAML-V: Extend XAML to provide a speech-enabled interactive animation scripting language. • Address implementation problems of XAML-V using software patterns as recipes.

  6. Agenda • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  7. VUI / VE Integration Problems • [McGlashan 95] identified 3 types of Virtual Environment – VUI integration problems. • Speech Recognition • Language Understanding • Interaction Metaphor • Scott McGlashan is the editor-in-chief of W3C VoiceXML 2.0.

  8. Integration Considerations • Client Interface • Ad hoc [Cernak02] • VRML – EAI - JSAPI[Wauchope03][O.Apaydin02][Descamps01] • Dialog Management • Database : [Wauchope03] • IDE : [Cernak02] • Scripting Language : • Based on VoiceXML: DialogXML [Nyberg02] and Galatea [Sagayama03] • Customize: MPML-VR [Descamps01]

  9. VUI Integration

  10. IMNet – A Client-Server MUVE System [Diagram: IMNet Server connected to IMClient A, IMClient B, and IMClient C; arrows labeled send, broadcast, broadcast]
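
The send/broadcast relationship on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, not IMNet's code: the class names `IMNetServer` and `IMClient`, the method names, and the choice to skip the sender when broadcasting are all assumptions.

```python
class IMClient:
    """Hypothetical client that records every message it receives."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def receive(self, sender, message):
        self.inbox.append((sender, message))


class IMNetServer:
    """Hypothetical server that relays one client's message to the others."""

    def __init__(self):
        self.clients = {}

    def connect(self, client):
        self.clients[client.name] = client

    def send(self, sender, message):
        # Assumption: the message is broadcast to everyone except the sender.
        for name, client in self.clients.items():
            if name != sender:
                client.receive(sender, message)


server = IMNetServer()
a, b, c = IMClient("A"), IMClient("B"), IMClient("C")
for client in (a, b, c):
    server.connect(client)

server.send("A", "hello")  # A sends; the server broadcasts to B and C
```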

  11. Animation Script Language • Using high-level scripts to control animated characters is not a new idea. • AML focuses on synchronization of facial expression and voice. • It lacks the ability to extract or modify an existing animation. • STEP can compose new animations from existing animation components. • It falls short on specifying detailed animation attributes. STEP = Scripting Technology for Embodied Persona AML = Avatar Markup Language

  12. XAML (eXtensible Animation Markup Language) • Describes character animations at various command levels. • Developers can compose a new animation from existing animation clips. • The syntax is extensible through plug-in modules.

  13. VoiceXML [Diagram: Server, Client] • VoiceXML 1.0 was proposed by W3C in 2000. • Used in telephony interactive applications. • Based on HTTP, using a form-based dialog model.

  14. VoiceXML: An Example
      <vxml version="2.0">
        <form>
          <field name="drink">
            <prompt>Would you like coffee, tea, milk, or nothing?</prompt>
          </field>
          <block>
            <submit next="http://www.drink.example.com/drink2.asp"/>
          </block>
        </form>
      </vxml>
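
To make the form-based model concrete, the sketch below reads the example document and lists what an interpreter would have to do: prompt for each field, then submit to the given URL. It uses Python's standard `xml.etree.ElementTree`; the function name `summarize_form` is hypothetical, and a real VoiceXML interpreter does far more (grammars, events, the form interpretation algorithm).

```python
import xml.etree.ElementTree as ET

VXML = """<vxml version="2.0">
  <form>
    <field name="drink">
      <prompt>Would you like coffee, tea, milk, or nothing?</prompt>
    </field>
    <block>
      <submit next="http://www.drink.example.com/drink2.asp"/>
    </block>
  </form>
</vxml>"""


def summarize_form(vxml_text):
    """Return ({field name: prompt text}, submit URL) for the first form."""
    form = ET.fromstring(vxml_text).find("form")
    prompts = {f.get("name"): f.findtext("prompt") for f in form.findall("field")}
    submit = form.find("block/submit")
    next_url = submit.get("next") if submit is not None else None
    return prompts, next_url


prompts, next_url = summarize_form(VXML)
```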

  15. VoiceXML Dialog Model Architectural View

  16. Agenda • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  17. Definitions & Notations • Dialog : Exactly two avatars concentrate on interacting with each other. • Subjects : Avatars in dialog. • Observers : Avatars not in a dialog. • U : Avatars controlled by human. • S : Avatars controlled by system. • Suffix s : Subject avatars. • Suffix i(i=1,2,3,…) : Observer avatars. Ss Us Subjects Ui Observer

  18. VoiceXML Dialog Model • VoiceXML was designed originally for dialogs in telephony systems. • In most cases there are 2 interactive instances in telephony applications.

  19. Problems with VoiceXML Dialog Model in MUVE (1) What is the dialog status with Us? Ss Document Server IMNet Server conceptually actually Us

  20. Problems with VoiceXML Dialog Model in MUVE (2) Who is talking with me ??? Document Server Ss actually actually conceptually VRML Browser Us1 Us2 What should I draw ? I’m talking with Ss.

  21. Problems with VoiceXML Dialog Model in MUVE (3) • Conceptually, Us is having a dialog with Ss. • Actually, Us is interacting with the Document Server, which carries Ss's dialog script. • VoiceXML lacks a dialog locking mechanism. • The VoiceXML dialog model does not fit a MUVE as it stands.

  22. Proposed Dialog Model in MUVE • We enhance the original VoiceXML dialog model to fix these problems. • Proxy Request • Dialog Lock • Dialog State • Dialog Negotiation

  23. Proxy Request • Ss proxies the HTTP request for Us. [Diagram: Ss, Document Server, Us; links labeled actually and conceptually]

  24. Benefits of Proxy Request Model • Applying this model in MUVE gives the following benefits: • Us is not aware of the Document Server, which provides the flexibility to switch between different roles. • Ss is aware of the dialog status with Us.
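
A minimal sketch of the proxy request idea, assuming an in-memory stand-in for the Document Server instead of real HTTP. The class names, the `proxy_request` method, and the script file name are hypothetical. The point it illustrates is that every request passes through Ss, so Ss can track the dialog while Us never addresses the Document Server directly.

```python
class DocumentServer:
    """Stand-in for the HTTP document server: maps URLs to dialog scripts."""

    def __init__(self, scripts):
        self.scripts = scripts

    def get(self, url):
        return self.scripts[url]


class SystemAvatar:
    """Ss: fetches dialog scripts on behalf of Us and records each request."""

    def __init__(self, document_server):
        self.document_server = document_server
        self.history = {}  # user avatar id -> list of URLs requested so far

    def proxy_request(self, user_id, url):
        # Because the request passes through Ss, Ss knows the dialog status.
        self.history.setdefault(user_id, []).append(url)
        return self.document_server.get(url)


docs = DocumentServer({"hello.xml": "<form>...</form>"})  # hypothetical script
ss = SystemAvatar(docs)
script = ss.proxy_request("Us", "hello.xml")
```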

  25. Dialogs without Dialog Lock • A will be confused if B and C talk to him at the same time. • It is impossible for A to accept speech input from multiple avatars at the same time. [Diagram: A, B, C; arrows labeled Speech Input to A and Speech Output from A]

  26. Dialog Lock • We propose that only two avatars can be in a dialog at the same time. • The Dialog Lock mechanism is used to enforce this constraint.
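
The two-party constraint can be sketched as a lock held jointly by the two subjects. This is an illustration under assumptions, not the system's implementation: the `Avatar` class and the `try_lock` and `unlock` names are hypothetical, and the sketch ignores networking and concurrency.

```python
class Avatar:
    def __init__(self, name):
        self.name = name
        self.partner = None  # the avatar this one is currently in dialog with

    def try_lock(self, other):
        """Start a dialog with `other`; succeeds only if both avatars are free."""
        if self is other or self.partner is not None or other.partner is not None:
            return False
        self.partner, other.partner = other, self
        return True

    def unlock(self):
        """End the current dialog and release the lock on both sides."""
        if self.partner is not None:
            self.partner.partner = None
            self.partner = None


a, b, c = Avatar("A"), Avatar("B"), Avatar("C")
first = a.try_lock(c)      # A and C enter a dialog
blocked = b.try_lock(a)    # B is refused while A is locked
a.unlock()
retry = b.try_lock(a)      # B can talk to A once the lock is released
```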

  27. Dialog with Dialog Lock • A is currently in dialog with C A C Dialog Scripts Dialog Lock Broadcasting Scripts Broadcasting Scripts B Speech Input to A Speech Output from A

  28. Dialog States

  29. Initialize a Dialog Enter negotiation-state Send dialog request message Enter negotiation-state Enter in-dialog-state Send dialog accept message Send dialog ack message Enter in-dialog-state Send first xaml-v script Fetch first xaml-v script

  30. Stop a Dialog Enter not-in-dialog-state Send end dialog message Enter not-in-dialog-state
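
The start and stop sequences on the last two slides can be read as a small state machine. The sketch below is one plausible reading of the flattened diagrams: the three state names come from the slides, while the event names and the assignment of steps to initiator and responder are assumptions.

```python
NOT_IN_DIALOG = "not-in-dialog-state"
NEGOTIATION = "negotiation-state"
IN_DIALOG = "in-dialog-state"

# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (NOT_IN_DIALOG, "send-request"): NEGOTIATION,  # initiator asks for a dialog
    (NOT_IN_DIALOG, "recv-request"): NEGOTIATION,  # responder, then sends accept
    (NEGOTIATION, "recv-accept"): IN_DIALOG,       # initiator, then sends ack
    (NEGOTIATION, "recv-ack"): IN_DIALOG,          # responder, then sends script
    (IN_DIALOG, "send-end"): NOT_IN_DIALOG,
    (IN_DIALOG, "recv-end"): NOT_IN_DIALOG,
}


def step(state, event):
    """Return the next state; events that are invalid in `state` are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=NOT_IN_DIALOG):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


initiator = run(["send-request", "recv-accept"])
responder = run(["recv-request", "recv-ack", "recv-end"])
```

Ignoring invalid events means a dialog request that arrives while an avatar is already in a dialog leaves its state unchanged, which matches the intent of the dialog lock.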

  31. Summary : Proposed Dialog Model in MUVE Ss Us

  32. Agenda • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  33. The XAML Scripting Language
      <AnimItem DEF="WaveWalk" cycle="2000">
        <AnimImport src="Walk"/>
        <AnimItem DEF="SimpleWave" cycle="1000">
          <Node target="r_shoulder">
            <OrientationInterpolator key="…" keyValue="…"/>
          </Node>
        </AnimItem>
      </AnimItem>

  34. XAML & XAML-V XAML XAML-V

  35. XAML-V Features • An extension of the XAML scripting language. • Incorporates a subset of VoiceXML. • Supports form-level and field-level animations. • Realizes the concepts discussed in the previous section: • Dialog negotiation • Proxy request • Broadcasting

  36. Nested Plug-in Syntax

  37. Dialog Negotiation

  38. How XAML-V Realizes the Proxy Request Model • Step 1: Send proxy request message • Step 2: Issue HTTP command: http://xxx/helloFormResponse.jsp?helpType=no%20thanks • Step 3: HTTP response • Step 4: Return requested dialog script [Diagram: Us, Ss, Document Server]
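
The HTTP command in step 2 carries the recognized field value as a query string, with the space in "no thanks" percent-encoded. A small sketch of building such a URL with Python's standard `urllib.parse` follows; the helper name `build_proxy_url` is hypothetical, and `http://xxx` is the slide's own placeholder host, left as-is.

```python
from urllib.parse import quote, urlencode


def build_proxy_url(base_url, fields):
    """Append the filled form fields to the URL as an encoded query string."""
    # quote_via=quote encodes spaces as %20; the default (quote_plus) uses '+'.
    return base_url + "?" + urlencode(fields, quote_via=quote)


url = build_proxy_url("http://xxx/helloFormResponse.jsp", {"helpType": "no thanks"})
print(url)  # http://xxx/helloFormResponse.jsp?helpType=no%20thanks
```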

  39. End Dialog

  40. Summary: Benefits of XAML-V • Extended from the XAML animation script, XAML-V inherits its strong animation functions. • Can be dynamically generated by various server-side scripting technologies (e.g., JSP or ASP). • The dialog model works in MUVE.

  41. Agenda • Introduction • Related Work • Dialog Management in MUVE (Multi-user Virtual Environment) • The Design of XAML-V Dialog Scripting Language • System Implementation and Design • Conclusion

  42. System Design and Implementation • System Architecture • XAML-V Component • Example Scenario • Video DEMO

  43. XAML-V Architecture in MUVE

  44. XAML-V Implementation • XAML Platform delegates XAML-V scripting elements to VoicePluginObject. • Embedded animations are sent back to the Animation Manager.
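
The delegation described on this slide can be sketched as a dispatcher that walks a script, hands voice elements to a plug-in object, and lets the plug-in pass embedded animations back to the animation manager. Everything below is an assumption made for illustration: the set of voice tags, the sample script, and the method names are not taken from the XAML-V implementation.

```python
import xml.etree.ElementTree as ET

VOICE_TAGS = {"form", "field", "prompt"}  # assumed XAML-V voice elements

SCRIPT = """<AnimItem DEF="Greeting">
  <form>
    <field name="helpType">
      <prompt>May I help you?</prompt>
      <AnimItem DEF="Wave"/>
    </field>
  </form>
</AnimItem>"""


class AnimationManager:
    def __init__(self):
        self.played = []

    def play(self, element):
        self.played.append(element.get("DEF"))


class VoicePluginObject:
    def __init__(self, animation_manager):
        self.animation_manager = animation_manager
        self.prompts = []

    def handle(self, element):
        for child in element:
            if child.tag == "prompt":
                self.prompts.append((child.text or "").strip())
            elif child.tag == "AnimItem":
                # Embedded animation: send it back to the Animation Manager.
                self.animation_manager.play(child)
            else:
                self.handle(child)


def dispatch(script, plugin, animation_manager):
    """Platform side: delegate voice elements, play plain animations directly."""
    for element in ET.fromstring(script):
        if element.tag in VOICE_TAGS:
            plugin.handle(element)
        elif element.tag == "AnimItem":
            animation_manager.play(element)


manager = AnimationManager()
plugin = VoicePluginObject(manager)
dispatch(SCRIPT, plugin, manager)
```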

  45. Implementing XAML-V Components with Software Patterns

  46. XAML-V Components Deployment Diagram

  47. Message Monitor • Intercepts all messages passed by the server.
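
One way to picture a message monitor is as an interceptor that sees every server message first and routes it to registered handlers. The sketch below assumes hypothetical message types and method names; the slides do not describe the actual interface.

```python
class MessageMonitor:
    """Sees every server message first and routes it to registered handlers."""

    def __init__(self):
        self.handlers = {}  # message type -> list of callbacks
        self.log = []       # every intercepted message, handled or not

    def register(self, msg_type, callback):
        self.handlers.setdefault(msg_type, []).append(callback)

    def intercept(self, msg_type, payload):
        self.log.append((msg_type, payload))
        for callback in self.handlers.get(msg_type, []):
            callback(payload)


monitor = MessageMonitor()
dialog_requests = []
monitor.register("dialog-request", dialog_requests.append)

monitor.intercept("dialog-request", "Us -> Ss")  # routed to the handler
monitor.intercept("chat", "hello")               # logged, no handler registered
```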

  48. Client

  49. Us talks to Ss

  50. Us talks to Us