This document investigates the potential of speech as a modality for electronic devices, assessing its applicability across various hardware and architecture designs. It delves into user interactions, exploring dialog structures that enhance communication with devices such as phones, PDAs, and computers. Key aspects such as session management, implicit navigation, and user-driven commands are analyzed through practical scenarios and research questions. The study aims to determine the effectiveness and efficiency of a subset language for users, ensuring a seamless experience across multiple devices.
Engineering Dialog for Gadgets
Thomas K Harris
September 12, 2003
???Questions???
• Motivation: Is speech a useful modality for electronic devices?
• Hardware: How would one get speech in “other” devices?
• Architecture: What should the system look like?
• Dialog: What should/will these conversations be like?
Current Speech Application Concept
[Diagram labels: Phone Client, PDA Client, Computer Client; Speech Application Backend]
Current Electronic Devices
[Diagram labels: ???, ???, ???; Speech Application Frontend]
Protocol-based Architecture
[Diagram labels: HAVi adapter, Speech Graffiti, Personal Universal Controller, X10 adapter]
Speech Graffiti Dialog
• Artificial subset language
• Tree-structured functions
• Universal conversational primitives
• User-directed
• Great for recognition
• Entirely declarative (and automatic)
Minimal Keywords
• hello-james
• options
• where-am-i, where-was-i
• go-ahead, ok
• status
• goodbye
• what-is, what-is-the
• how-do-i
• more
Session Management
• hello-james/goodbye
User: blah blah blah...
System: ignoring user
User: hello-james
System: stereo, digital camera
User: stereo
System: stereo here
User: goodbye
System: goodbye
User: blah blah blah...
System: ignoring user
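
The session behavior in the transcript above can be sketched as a small two-state gate. This is only an illustration under stated assumptions: the `Session` class, its `hear` method, and the use of `None` to stand for "ignoring user" are hypothetical names chosen here, not part of Speech Graffiti.

```python
class Session:
    """Hypothetical session gate: ignores speech until hello-james is heard."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.active = False  # no session is open until the user says hello-james

    def hear(self, utterance):
        words = utterance.strip().lower()
        if not self.active:
            if words == "hello-james":
                self.active = True
                # Opening a session lists the available devices.
                return ", ".join(self.devices)
            return None  # outside a session the system ignores the user
        if words == "goodbye":
            self.active = False
            return "goodbye"
        if words in self.devices:
            return f"{words} here"
        return None  # anything else is out of scope for this sketch


session = Session(["stereo", "digital camera"])
utterances = ["blah blah blah", "hello-james", "stereo", "goodbye", "blah blah blah"]
transcript = [session.hear(u) for u in utterances]
print(transcript)
```

Replaying the five user turns from the slide yields an ignored turn, the device list, "stereo here", "goodbye", and another ignored turn.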
Query
• what-is path/status
User: what-is-the am frequency
System: the am frequency is five hundred thirty
User: what-is random
System: random is off
User: what-is-the stereo
System: the stereo is tuner
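
One way to picture these queries is as lookups in a tree of named nodes, each of which may carry a status value. The sketch below is an assumption-laden illustration: `Node`, `resolve`, and `answer` are hypothetical names, and the rule of trying a path relative to the current device first and then from the root is a guess that happens to reproduce the transcript, not a documented Speech Graffiti rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    value: Optional[str] = None  # status reported by what-is
    children: Dict[str, "Node"] = field(default_factory=dict)


def find(node: Node, path: List[str]) -> Node:
    for name in path:
        node = node.children[name]  # raises KeyError on an unknown name
    return node


def resolve(root: Node, context: List[str], path: List[str]) -> Optional[Node]:
    # Assumed rule: try the path under the current context, then from the root.
    for start in (context, []):
        try:
            return find(root, list(start) + list(path))
        except KeyError:
            continue
    return None


def answer(root: Node, context: List[str], utterance: str) -> Optional[str]:
    keyword, *path = utterance.split()
    if keyword not in ("what-is", "what-is-the"):
        return None
    node = resolve(root, context, path)
    if node is None or node.value is None:
        return None
    article = "the " if keyword == "what-is-the" else ""
    return f"{article}{' '.join(path)} is {node.value}"


# Hypothetical device state mirroring the values spoken in the transcript.
root = Node(children={
    "stereo": Node("tuner", {
        "am": Node(children={"frequency": Node("five hundred thirty")}),
        "random": Node("off"),
    }),
})

print(answer(root, ["stereo"], "what-is-the am frequency"))
print(answer(root, ["stereo"], "what-is random"))
print(answer(root, ["stereo"], "what-is-the stereo"))
```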
help/exploration/implicit navigation
• how-do-i.../options/path options
User: control alarm clock radio options
System: alarm, clock, radio, sleep...
User: more
System: x10, stereo
User: stereo options
System: while turning stereo on: off, am, fm, auxiliary, cd...
invocation/specification/implicit exploration/navigation
• Path
User: stereo auxiliary
System: while turning the stereo on and switching to auxiliary: auxiliary
User: cd
System: while switching to cd mode: cd
User: play
System: while playing a cd: play
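
The confirmations above follow a visible pattern: the system names the side effect of each word in the path, then echoes the last word. A minimal sketch of that pattern, assuming a hypothetical `ACTIONS` table and `invoke` function (the phrases are copied from the transcript; how the real system derives them is not stated on the slide):

```python
# Hypothetical mapping from path words to the side effect each one triggers.
ACTIONS = {
    "stereo": "turning the stereo on",
    "auxiliary": "switching to auxiliary",
    "cd": "switching to cd mode",
    "play": "playing a cd",
}


def invoke(utterance):
    """Build the spoken confirmation for a path such as 'stereo auxiliary'."""
    words = utterance.split()
    steps = [ACTIONS[w] for w in words]  # unknown words raise KeyError in this sketch
    return f"while {' and '.join(steps)}: {words[-1]}"


for spoken in ["stereo auxiliary", "cd", "play"]:
    print(invoke(spoken))
```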
list navigation
• More
User: radio band am options
System: bracketed list [frequency, kabc, k001, k002, k003, k004, k005] [fm] [off, volume] [alarm, clock, sleep] [x10, stereo]
rendered: frequency, kabc, k001, k002...
User: more
System: k003, k004, k005...
User: more
System: fm, off, volume...
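
The slide does not state how the bracketed list is cut into spoken chunks. The sketch below uses an assumed rule that reproduces the rendering shown: at most four items per chunk, a group larger than four is split on its own, and smaller groups are merged only while they fit whole. The names `chunk_options` and `MAX_ITEMS` are hypothetical.

```python
MAX_ITEMS = 4  # assumed chunk size, inferred from the transcript only


def chunk_options(groups, max_items=MAX_ITEMS):
    """Split bracketed option groups into the chunks spoken per 'more'."""
    chunks = []
    current = []
    for group in groups:
        if len(group) > max_items:
            # Flush whatever is pending, then split the oversized group alone.
            if current:
                chunks.append(current)
                current = []
            for i in range(0, len(group), max_items):
                chunks.append(list(group[i:i + max_items]))
        elif len(current) + len(group) > max_items:
            # The whole group does not fit, so start a new chunk with it.
            chunks.append(current)
            current = list(group)
        else:
            current = current + list(group)
    if current:
        chunks.append(current)
    return chunks


groups = [
    ["frequency", "kabc", "k001", "k002", "k003", "k004", "k005"],
    ["fm"],
    ["off", "volume"],
    ["alarm", "clock", "sleep"],
    ["x10", "stereo"],
]
chunks = chunk_options(groups)
for chunk in chunks:
    print(", ".join(chunk))
```

Each printed line corresponds to one system turn; the first three match the rendered responses on the slide, and the remaining two are what this assumed rule would produce for further "more" requests.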
orientation
• where-am-i
User: what-is-the disc
System: the disc is three
User: where-am-i
System: stereo cd disc
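
The exchange above suggests that a query moves the user's position to the node that was queried, and that where-am-i reads that position back as a path. A minimal sketch under that assumption follows; the `Focus` class, the nested-dict tree, and the walk-up lookup rule are all hypothetical.

```python
class Focus:
    """Hypothetical tracker of the user's current position in the function tree."""

    def __init__(self, tree, path):
        self.tree = tree
        self.path = list(path)

    def _lookup(self, path):
        node = self.tree
        for name in path:
            node = node[name]
        return node

    def what_is_the(self, name):
        # Assumed rule: look for the name under the current position,
        # walking up toward the root until it is found.
        base = list(self.path)
        while True:
            candidate = base + [name]
            try:
                value = self._lookup(candidate)
            except (KeyError, TypeError):
                if not base:
                    raise KeyError(name)
                base.pop()
                continue
            self.path = candidate  # the query moves the focus to the node
            return f"the {name} is {value}"

    def where_am_i(self):
        return " ".join(self.path)


# Hypothetical state: the user is already in the stereo's cd mode.
focus = Focus({"stereo": {"cd": {"disc": "three"}}}, ["stereo", "cd"])
print(focus.what_is_the("disc"))
print(focus.where_am_i())
```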
Research Questions
• Is the subset language learnable?
• Once learned, is it efficient?
• Are user mistakes infrequent enough?
• Are system mistakes infrequent enough?
• Can one generalize from one device to another?
• Is the subset language well retained?