Presenting Information with Speech

Presenting Information with Speech. Jaakko Hakulinen (jh@cs.uta.fi). SUI/ASUI/SPI group since 1998. Experiences from several of our applications: Bussimies & Interact timetable systems, Ovimies (Doorman), Mailman (Postimies). Questions remaining for all areas.


Presentation Transcript


  1. Presenting Information with Speech Jaakko Hakulinen (jh@cs.uta.fi) • SUI/ASUI/SPI group since 1998 • Experiences from several of our applications • Bussimies & Interact timetable systems • Ovimies (Doorman) • Mailman (Postimies) • Questions remaining for all areas

  2. TOC • ”How do users know what to say” • help and guidance • Tutoring • Route descriptions • multimodality, language generation • E-mail reading • multilinguality, e-mail parsing

  3. Help and Guidance • ”How Do Users Know What To Say” (Yankelovich, Interactions, November + December 1996) • ”functionality of application is hidden” (in speech-only systems) • -> prompt design is the key • implicit vs. explicit prompts • incremental prompts / tapering • hints • how to convey the (complete) functionality, i.e. how to create power users?

  4. Help Messages • Don’t rely on help messages • comprehensive help requires a lot of text • speech is slow, way too slow • Using examples is effective; the problem is users ”getting stuck” in them • In incremental help, rephrase further messages completely • Postimies has context-sensitive, reactive help with a ”what next” command (roughly as in ”Universal Speech Interfaces”, Rosenfeld, Olsen, Rudnicky, Interactions, November + December 2001)
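The idea of incremental help with completely rephrased follow-up messages can be sketched in a few lines of Python. This is only an illustration, not the Postimies implementation: the `HELP_LEVELS` table, the `"inbox"` context name, the `HelpState` class and all message wordings are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical help texts for one dialogue context. Each level is phrased
# differently from the previous one instead of repeating it with additions.
HELP_LEVELS = {
    "inbox": [
        "You can ask what to do next.",
        "Try saying: read the first message.",
        "In the inbox you can say read, next, previous or delete.",
    ],
}


@dataclass
class HelpState:
    """Tracks how many times help was requested in each context."""
    counts: dict = field(default_factory=dict)

    def next_help(self, context: str) -> str:
        """Return the next, more explicit help message for this context."""
        levels = HELP_LEVELS[context]
        n = self.counts.get(context, 0)
        self.counts[context] = n + 1
        # After the last level, stay on the most explicit message.
        return levels[min(n, len(levels) - 1)]

    def reset(self, context: str) -> None:
        """Forget the help history, e.g. after a successful user command."""
        self.counts.pop(context, None)
```

Calling `next_help("inbox")` repeatedly walks from a short hint toward a fuller description, and `reset` would let the prompts taper back once the user succeeds.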

  5. Tutoring • Something we have planned for the Postimies application • the system tutors the user, user learns by doing • requires a user model to guide what to teach • questions: • should the tutor be a separate ”person”, different voice etc.? • will users try to speak to the tutor, should that be possible?

  6. Doorman • Outside, the user identifies the person (s)he is coming to see • The system opens the door (lock) • The system lets the user in (during day hours; in the evening it plays the doorbell) • Inside, the system guides the user to the room where the host is • Current version implemented by Prusi, more info and demo in the afternoon

  7. Route Descriptions • complex thing to describe [those directions] • more visual than linguistic information • multimodality, e.g. Ovimies has puppets with hand gestures • removes left/right hearer/speaker ambiguity

  8. Requirements for NLG • Natural descriptions require some information about surroundings, landmarks etc. • names for places, rooms etc. • information about what makes a good landmark (uniqueness) • information about visibility

  9. route description generation • system has 2D geometry of the premises with ”semantic annotation” and place names • ”unique” targets are defined • first, the shortest route is searched • the route is constructed of legs and turns • the start and each leg are described (landmarks, distances, directions, place names) + gestures • text is generated for TTS, gestures are timed
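The step from a route to legs, turns and text can be illustrated with a small Python sketch. It assumes the shortest route has already been found and is given as 2D waypoints in meters, with x pointing right and y pointing up; the function names, the 30 degree turn threshold and the sentence templates are assumptions for illustration, and landmarks, place names and gesture timing are left out.

```python
import math


def legs_and_turns(points, min_turn_deg=30.0):
    """Split a waypoint route into ("leg", meters) and ("turn", side) steps."""
    steps = []
    for i in range(len(points) - 1):
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        dx, dy = x1 - x0, y1 - y0
        if i > 0:
            px, py = points[i - 1]
            pdx, pdy = x0 - px, y0 - py
            # Signed angle from the previous heading to the new heading.
            cross = pdx * dy - pdy * dx
            dot = pdx * dx + pdy * dy
            angle = math.degrees(math.atan2(cross, dot))
            if abs(angle) >= min_turn_deg:
                # With y pointing up, a positive angle is a left turn.
                steps.append(("turn", "left" if angle > 0 else "right"))
        steps.append(("leg", round(math.hypot(dx, dy))))
    return steps


def describe(steps, destination):
    """Turn the steps into one sentence of text for a speech synthesizer."""
    parts = []
    for kind, value in steps:
        if kind == "turn":
            parts.append(f"turn {value}")
        else:
            parts.append(f"go forward about {value} meters")
    if not parts:
        return f"You are at {destination}."
    text = ", then ".join(parts)
    return text[0].upper() + text[1:] + f". You will arrive at {destination}."
```

For example, `describe(legs_and_turns([(0, 0), (0, 10), (-5, 10)]), "the meeting room")` yields a forward leg, a left turn and a second leg. The turn side is computed from the walker's own heading, which is the perspective a spoken description would need to fix.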

  10. route description questions • what if we have several robots? • guide to the next robot or all the way? • do users stop? • how to speak (how much)? • complex information -> slow presentation? • or keep it simple? • overall level of descriptions: • ”back there on the left side” • ”turn left, forward next to the meeting room...”

  11. reading e-mails • system reads e-mails over the telephone • problems of the e-mail domain • e-mail is made to be viewed, not heard • a large amount of e-mail is hard to navigate • multilingual content • Current Postimies version by Salonen & Helin, more info and demo in the afternoon

  12. Example problems
      Dear members,
      Here is your last issue including informations gathered in the last minutes while this issue was under preparation.
      Chris W.
      --
      ________________________________________________________________________________
      Professor Christian J. Wellekens    Tel: +33 (0) 4 93 00 26 28
      Dpt Multimedia Communications       Secr:+33 (0) 4 93 00 26 33
      EURECOM                             Fax: +33 (0) 4 93 00 26 27
      2229 route des Cretes BP 193
      F-06904 Sophia Antipolis            E-mail:welleken@eurecom.fr
      FRANCE                              Web: http://www.eurecom.fr
      ________________________________________________________________________________
      • dash (”-”) characters are skipped completely • telephone numbers are parsed (”area code” etc.) • web addresses are parsed (”world wide web”) • visual columns are lost

  13. What do we do to help this • e-mails are grouped into folders (if there are many of them) • e-mail is structured into XML • internet addresses • numbers (telephone, large digits etc.) • horizontal rulers • emoticons • language at paragraph level • the reading part decides how to read it all
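As a rough illustration of structuring e-mail text into XML before reading, the following Python sketch marks up rulers, internet addresses, telephone-like numbers and emoticons in a single line. The element names (`ruler`, `url`, `phone`, `emoticon`) and the regular expressions are assumptions made for this example; the slides do not specify the actual Postimies format, and language tagging at paragraph level is not covered.

```python
import re
from xml.sax.saxutils import escape

# A line consisting only of repeated ruler characters, e.g. "________".
RULER = re.compile(r"^\s*[_\-=*]{5,}\s*$")

# Illustrative patterns only; real e-mail needs far more careful parsing.
TOKEN = re.compile(
    r"(?P<url>(?:https?://|www\.)\S+)"
    r"|(?P<phone>\+?\d[\d ()]{6,}\d)"
    r"|(?P<emoticon>(?<!\S)[:;]-?[()DP](?!\S))"
)


def mark_up_line(line: str) -> str:
    """Wrap recognized items of one e-mail line in XML-style elements."""
    if RULER.match(line):
        return "<ruler/>"
    out, pos = [], 0
    for m in TOKEN.finditer(line):
        # Plain text between matches is escaped and passed through.
        out.append(escape(line[pos:m.start()]))
        tag = m.lastgroup  # name of the alternative that matched
        out.append(f"<{tag}>{escape(m.group())}</{tag}>")
        pos = m.end()
    out.append(escape(line[pos:]))
    return "".join(out)
```

With markup like this, the reading component can decide per element how to speak it, for instance skipping a ruler or reading a telephone number digit by digit.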

  14. questions about e-mail reading • how to inform the listener what is interpretation and what is straight text? • can we use different voices? • How about voice families (male/female, young/old ...) • navigation inside a mail • currently paragraph level, smaller units?
