Multimodal user interfaces: Implementation

Presentation Transcript


  1. Multimodal user interfaces: Implementation Chris Vandervelpen chris.vandervelpen@uhasselt.be

  2. Overview
  • Introduction
  • VoiceXML
  • X+V
  • From models to X+V
  • Demo: ACCESS NetFront
  • Conclusions
  • Questions

  3. Introduction
  • Focus on speech/direct manipulation on mobile devices
  • How can we deploy a multimodal UI?
    • Build our own framework using speech synthesizers/recognizers that interpret the designed models (reinventing the wheel)
    • Build software that generates standardized markup from the models (use existing technologies) → our starting point

  4. VoiceXML
  • Markup language for speech-only interfaces
    • Telephone interfaces
  • Uses grammars for speech recognition
    • Java Speech Grammar Format (JSGF)
    • Nuance Grammar Specification Language (NGSL)
  • Speech output (see the sketch below)
    • Synthesis
    • Prerecorded audio
  • http://www.voicexml.org
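  As a quick illustration of the two output options, a VoiceXML prompt can combine synthesized speech with prerecorded audio: <vxml:audio> plays the referenced file and falls back to synthesizing its text content if the file cannot be played (the file name here is only illustrative):

    <vxml:prompt>
      <vxml:audio src="welcome.wav">
        Welcome to the flight booking service.
      </vxml:audio>
    </vxml:prompt>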

  5. VoiceXML
  <vxml:form>
    <vxml:field name="departure_city">
      <vxml:grammar>
        <![CDATA[
          #JSGF V1.0;
          grammar cities;
          <city> = brussels | antwerp | amsterdam;
        ]]>
      </vxml:grammar>
      <vxml:prompt>What departure city would you like?</vxml:prompt>
      <vxml:catch event="help nomatch noinput">
        For example, brussels, antwerp or amsterdam
      </vxml:catch>
      <vxml:filled>
        <vxml:prompt>Your departure city is <vxml:value expr="departure_city"/></vxml:prompt>
      </vxml:filled>
    </vxml:field>
    <vxml:field name="destination_city">
      ...
    </vxml:field>
  </vxml:form>

  6. VoiceXML
  • Mixed-initiative forms
    • A single user utterance can fill several fields
    • Supports more natural language
  • For example (see the sketch below)
    • "I want to fly from brussels to amsterdam"
    • Fills in the departure_city and destination_city fields
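  A minimal sketch of such a mixed-initiative form, assuming a form-level grammar (the file name is illustrative) that can fill both city slots from one utterance:

    <vxml:form id="flight">
      <!-- form-level grammar: one utterance may fill several fields -->
      <vxml:grammar src="flight_cities.grxml"/>
      <vxml:initial name="start">
        <vxml:prompt>Where do you want to fly from and to?</vxml:prompt>
      </vxml:initial>
      <!-- fields the recognizer can fill directly; each keeps a fallback prompt -->
      <vxml:field name="departure_city">
        <vxml:prompt>What is your departure city?</vxml:prompt>
      </vxml:field>
      <vxml:field name="destination_city">
        <vxml:prompt>What is your destination city?</vxml:prompt>
      </vxml:field>
    </vxml:form>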

  7. X+V
  • X+V (XHTML + Voice)
    • XHTML: visual channel
    • VoiceXML snippets: speech channel
    • Synchronization between modalities using XML Events (document skeleton sketched below)
  • Multimodal browsers supporting X+V
    • ACCESS NetFront multimodal browser (PocketPC)
    • Opera
  • http://www.voicexml.org/specs/multimodal/x+v/12/
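  For context, the two snippets on the next two slides live in a single X+V document; a rough skeleton (namespace prefixes as used on these slides, exact wiring may differ per browser):

    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:vxml="http://www.w3.org/2001/vxml"
          xmlns:ev="http://www.w3.org/2001/xml-events">
      <head>
        <title>Flight booking</title>
        <!-- speech channel: VoiceXML form(s), see slide 9 -->
        <vxml:form id="voice_city"> ... </vxml:form>
      </head>
      <body>
        <!-- visual channel: XHTML form, see slide 8;
             XML Events (ev:event/ev:handler) connect its inputs to the voice form -->
        ...
      </body>
    </html>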

  8. X+V
  <html>
    <body>
      <form>
        <input id="from" name="from" size="20"
               ev:event="inputfocus" ev:handler="#voice_city_from"/>
        <input id="to" name="to" size="20"
               ev:event="inputfocus" ev:handler="#voice_city_to"/>
      </form>
    </body>
  </html>

  9. X+V
  <vxml:form id="voice_city">
    <vxml:field name="departure_city_field" id="voice_city_from">
      <vxml:grammar>
        <![CDATA[
          #JSGF V1.0;
          grammar cities;
          <city> = brussels | antwerp | amsterdam;
        ]]>
      </vxml:grammar>
      <vxml:prompt>What departure city would you like?</vxml:prompt>
      <vxml:catch event="help nomatch noinput">
        For example, brussels, antwerp or amsterdam
      </vxml:catch>
      <vxml:filled>
        <vxml:assign name="document.getElementById('from').value" expr="departure_city_field"/>
      </vxml:filled>
    </vxml:field>
    <vxml:field name="destination_city_field" id="voice_city_to">
      ...
    </vxml:field>
  </vxml:form>

  10. X+V
  • Also usable with XForms
  • VoiceXML snippets and XForms controls influence the same XForms instance model → synchronization (see the sketch below)
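  A rough sketch of the shared-instance idea (XForms markup only; exactly how the VoiceXML handler writes into the instance depends on the browser's X+V/XForms support, so that part is only indicated by a comment):

    <xf:model xmlns:xf="http://www.w3.org/2002/xforms">
      <xf:instance>
        <trip xmlns="">
          <departure_city/>
          <destination_city/>
        </trip>
      </xf:instance>
    </xf:model>
    ...
    <!-- visual channel: bound to the instance node -->
    <xf:input ref="departure_city">
      <xf:label>Departure city</xf:label>
    </xf:input>
    <!-- speech channel: the VoiceXML <filled> handler stores its result in the
         same /trip/departure_city node, so both modalities stay in sync -->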

  11. Models to X+V

  12. Models to X+V
  • Annotate the UI description for speech [Shao2003: Transcoding HTML to VoiceXML Using Annotations]
  • Extend this approach to UIML and X+V (see the sketch after this list)
    • Identify particular information structures
      • Text areas
      • Menu/list structures
      • Top-level visual region
    • Define their representation in XHTML and VoiceXML
    • Generate the synchronizing XML Events code
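  Purely as an illustration of the annotation idea (the attribute names below are invented for this sketch, not taken from [Shao2003] or from the actual tool), a list structure in the UI description could be marked with its speech role so the generator knows to render it as a VoiceXML menu:

    <!-- hypothetical speech annotation on a menu/list structure -->
    <ul id="toppings" speech:role="menu" speech:prompt="Which topping would you like?">
      <li>mozzarella</li>
      <li>pepperoni</li>
      <li>mushrooms</li>
    </ul>

    <!-- ...which a generator could transcode into (abbreviated): -->
    <vxml:menu id="voice_toppings">
      <vxml:prompt>Which topping would you like?</vxml:prompt>
      <vxml:choice next="#order_form">mozzarella</vxml:choice>
      <vxml:choice next="#order_form">pepperoni</vxml:choice>
      <vxml:choice next="#order_form">mushrooms</vxml:choice>
    </vxml:menu>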

  13. Models to X+V
  • Define a generic UIML widget vocabulary mapping for both GUI and speech [Plomp2002] (see the sketch below)
    • TextEntry
      • <field> (VoiceXML)
      • <input type="text"/> (XHTML)
      • System.Windows.Forms.TextBox
    • Collection
      • <form> (VoiceXML)
      • <form> (XHTML)
      • System.Windows.Forms.Panel
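  A rough UIML fragment using such a generic vocabulary might look like this (structure only; the <peers> section that maps each class onto the concrete VoiceXML/XHTML/WinForms widgets listed above is omitted, and the ids are illustrative):

    <uiml>
      <interface>
        <structure>
          <part class="Collection" id="trip_form">
            <part class="TextEntry" id="departure_city"/>
            <part class="TextEntry" id="destination_city"/>
          </part>
        </structure>
      </interface>
      <!-- <peers>/<presentation> would map Collection/TextEntry onto
           <form>/<field> (VoiceXML) or <form>/<input type="text"/> (XHTML) -->
    </uiml>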

  14. Demo
  • ACCESS NetFront multimodal browser
    • PocketPC
  • Ordering pizza
  • Ordering Chinese food

  15. Conclusions
  • X+V
    • Built-in modality synchronization
    • Alternative to building our own multimodal implementation
    • Declarative
    • Transformation from UIML is possible

  16. Questions?
