This course module focuses on the integration of speech interfaces in human-computer interaction (HCI), exploring the usability issues involved and providing a comprehensive overview of existing technologies. Students will gain hands-on experience in designing speech-centric interfaces while exercising project skills such as organization, collaboration, and presentation. The module covers dialog management, speech input and output technologies, multimodal interaction, and the complexities of natural language processing. It aims to prepare students for the future of conversational interfaces in various applications.
Module U1: Speech in the Interface
1: Introduction
Jacques Terken
HG room 2:40, tel. (247) 5254
j.m.b.terken@tue.nl
Contents
• 1. Aims and overview of course
• 2. Speech interfaces
• 3. Usability issues: introduction
• 4. Project
Aims
• Acquire insight into usability issues and obtain an overview of the state of the art for speech in the interface
• Obtain hands-on experience with the design of a speech-centric interface
• Exercise project skills (organisation, collaboration, report, presentation)
Overview of Module
• Introduction
• Dialog management
• Speech input technologies
• Speech output technologies
• Multimodal interaction
• Evaluation
• Human Communication
• Exercises and project
Contents
• 1. Aims and overview of course
• 2. Speech interfaces
• 3. Usability issues: introduction
• 4. Project
Speech in the interface
Markets and applications (figure: R. Moore, 2005)
Speech interfaces
• Conversational interfaces: natural language interaction with machines (Star Trek syndrome)
• Command & control applications: voice-based equivalent of command-line interfaces and button interfaces (utterances need to adhere to a strict grammar; a minimal grammar sketch follows below)
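To make the "strict grammar" point concrete, here is a minimal sketch (not from the course material) of a command & control grammar in Python; the command patterns and the audio-equipment example are illustrative assumptions.

# Minimal sketch of a strict command & control grammar (illustrative only).
# Utterances must match one of the fixed patterns; anything outside the
# grammar is rejected, unlike free conversational input.
import re

COMMANDS = {
    r"^(?:turn|switch) (?P<device>radio|cd player) (?P<state>on|off)$": "set_power",
    r"^volume (?P<direction>up|down)$": "change_volume",
    r"^play track (?P<number>\d+)$": "play_track",
}

def parse_command(utterance):
    """Map a recognised utterance onto an application command, or return None."""
    text = utterance.lower().strip()
    for pattern, action in COMMANDS.items():
        match = re.match(pattern, text)
        if match:
            return action, match.groupdict()
    return None  # out-of-grammar utterance: the system must reprompt

print(parse_command("turn radio on"))                   # ('set_power', {'device': 'radio', 'state': 'on'})
print(parse_command("put on some nice music please"))   # None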
Components of conversational interfaces
(diagram) Speech recognition → Natural Language Analysis → Dialogue Manager ↔ Application; Dialogue Manager → Language Generation → Speech Synthesis
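As a reading aid (not the course's own code), the component diagram can be summarised as the following loop; the function names are placeholders for real ASR, NLU, dialogue management, NLG and TTS modules.

# Sketch of one turn through the conversational-interface components above.
# Each stage is a stub; in a real system these would call an ASR engine,
# an NLU parser, a dialogue manager, an NLG module and a TTS engine.

def speech_recognition(audio):          # user audio -> recognised words
    ...

def natural_language_analysis(words):   # words -> meaning representation
    ...

def dialogue_manager(meaning, app):     # decide the next action, consult the application
    ...

def language_generation(response):      # system intention -> text (plus prosody)
    ...

def speech_synthesis(text):             # text -> audio played to the user
    ...

def conversational_turn(audio, app):
    words = speech_recognition(audio)
    meaning = natural_language_analysis(words)
    response = dialogue_manager(meaning, app)
    text = language_generation(response)
    return speech_synthesis(text)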
Spin-offs
• 1. Dictation systems: what you say
(diagram: the same components, with the application being e.g. MS-Word; recognised speech is passed to the application as literal text)
• 2. Command & control: what you mean
(diagram: speech recognition and (natural language) analysis map the utterance onto a command for the application, e.g. a stereo)
• 3. Text-to-speech conversion
(diagram: language generation (prosody) and speech synthesis render text from the application, e.g. e-mail, as speech)
Contents
• 1. Aims and overview of course
• 2. Speech interfaces
• 3. Usability issues: introduction
• 4. Project
Speech in HCI: “yes please”
• Among others, Zue (MIT): speech will be a key technology of the 21st century
Background
• Zue and colleagues:
• Aim: developing the conversational interface
• Motivation: natural language interaction is the most natural form of communication (learned at a very early age); among other things, it allows very efficient error handling
Advantages of speech
• direct access to functionality
• supports mobility
• suited for hands busy/dirty - eyes busy situations
• no special motor abilities needed, optimal compatibility with communicative abilities of users
• compatible with trend towards miniaturisation of equipment
Maturity hypothesis
• Speech interfaces not yet mature because of the complexity of the technology:
• R.K. Moore: “Spoken language interaction is the most sophisticated behaviour of the most complex organism in the known universe”
Phylogenetic argumentation
• First: direct manipulation (“you do what I want”)
• Later: symbolic manipulation (cf. management, commercials)
• Physical manipulation and violence considered primitive
Ontogenetic argumentation
• Russian educational psychology (Galperin):
• knowledge acquisition starts with direct manipulation
• later on, symbolic manipulation
• the “stay off” warning to children: “look with your eyes, not with your hands”
Therefore
• Direct manipulation is phylogenetically and ontogenetically more primitive and less complex
• Maturity hypothesis: HCI follows the same trajectory: first direct manipulation, then symbolic manipulation (speech)
However
• UI design principles (Shneiderman ’86):
• transparency: continuous representation of objects and actions
• fast, incremental and reversible operations with immediate effect
• physical actions or labelled buttons; avoid complex syntax/natural language as much as possible
• These design principles are difficult to realise in speech interfaces
• In addition, language and speech technology is not (yet) very robust, and development costs are high
• Getting at the application semantics is more complicated for (natural) language than for direct manipulation
• Finally: HCI is a domain in its own right, so there is no a priori reason to model HCI after HHI (human-human interaction)
• SO: avoid natural language
Speech interfaces: yes or no
• Speech not suited for all kinds of information or situations (e.g. “a picture is worth a thousand words”)
• Nevertheless, speech is useful under certain conditions, e.g.
• hands busy - eyes busy
• mobility, miniaturisation
• disabilities (CTS/RSI!)
• Use interface design guidelines for the design of speech interfaces, e.g. http://www.larson-tech.com/MMGuide.html
• And in return: offer human communication theory as a model for HCI
Speech interfaces (SI) and direct-manipulation interfaces
• Main problems with speech interfaces:
• no external support for functionality
• unreliability of the input technology
Dealing with unreliability
• Constrain the domain:
• restricted vocabulary
• restricted application / task domain
• restricted number of users: speaker-dependent speech recognition
• Extensive verification (in connection with error cost); a sketch of such a strategy follows below
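One possible way to make "extensive verification in connection with error cost" operational is sketched below; the thresholds and the strategy names are assumptions, not part of the course material.

# Sketch of a verification policy driven by recognition confidence and error cost.
# The thresholds (0.4, 0.8) and the strategy names are illustrative assumptions.

def confirmation_strategy(confidence, error_cost):
    """Pick how much verification to apply to one recognised value.

    confidence: recognition/understanding confidence in [0, 1]
    error_cost: "low" (e.g. choosing a radio station) or "high" (e.g. placing an order)
    """
    if error_cost == "high":
        return "explicit_confirm"     # expensive mistakes always get a yes/no question
    if confidence < 0.4:
        return "reprompt"             # too unreliable: ask the user to repeat
    if confidence < 0.8:
        return "implicit_confirm"     # echo the value back in the next prompt
    return "accept"                   # confident enough to proceed silently

print(confirmation_strategy(0.9, "high"))  # explicit_confirm
print(confirmation_strategy(0.6, "low"))   # implicit_confirm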
Dealing with the functionality problem
• Quick reference card
• Training
• System-driven dialogue; as users gain experience, this creates a need for adaptive systems (e.g. barge-in) (see the sketch below)
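One way the experience-driven adaptation could be realised is sketched here; the session counter, the prompt texts and the barge-in flag are illustrative assumptions, not course material.

# Sketch of an adaptive system-driven prompt: novices hear a full menu prompt
# without barge-in, experienced users get a terse prompt and may interrupt it.
# The threshold of three sessions and the prompt texts are assumptions.

def opening_prompt(n_previous_sessions):
    if n_previous_sessions < 3:
        return {
            "text": ("Welcome. You can ask for traffic information, choose music, "
                     "or adjust the temperature. What would you like to do?"),
            "allow_barge_in": False,   # let novices hear the full list of options
        }
    return {
        "text": "What would you like to do?",
        "allow_barge_in": True,        # experienced users may barge in over the prompt
    }

print(opening_prompt(0)["allow_barge_in"])   # False
print(opening_prompt(5)["text"])             # What would you like to do?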
Contents
• 1. Aims and overview of course
• 2. Speech interfaces
• 3. Usability issues: introduction
• 4. Project
Aim
• Provide hands-on experience with the design and implementation of a speech-centric interface, involving (at least) voice-based control and speech output.
• The topic: a speech/multimodal interface for in-car information and entertainment systems.
Tools
• Download the CSLU toolkit from http://www.cslu.ogi.edu/toolkit (requires registration)
Project stages
• Task analysis (requirements gathering)
• Design on paper (V0.1)
• Wizard of Oz
• Redesign, implementation of V1.0
• Validation
• Evaluation
• Report
Exercise for today
• CSLU exercises: McTear ch. 7, pizza application
• Extend the pizza application:
• Go to http://www.dominos.nl/
• Click “online bestellen”
• Extend the dialogue system to include all the topping options, the side dishes and the drinks (see “menukaart”)
• Test the system and discuss your experiences
• (A plain-Python sketch of the extended dialogue structure follows below)
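This is not CSLU RAD code: just a plain-Python sketch of how the extended dialogue could be structured before building it in the toolkit. The menu items are placeholders; take the actual toppings, side dishes and drinks from the "menukaart" on dominos.nl.

# Plain-Python sketch of the extended pizza dialogue (not CSLU RAD code).
# Menu items below are placeholders; replace them with the real options.

TOPPINGS = ["pepperoni", "mushrooms", "onions"]
SIDE_DISHES = ["garlic bread", "chicken wings"]
DRINKS = ["cola", "orange juice", "water"]

def ask(prompt, options):
    """One system-driven slot: prompt the user and accept only in-grammar answers."""
    while True:
        answer = input(f"{prompt} ({', '.join(options)}): ").strip().lower()
        if answer in options:
            return answer
        print("Sorry, I did not understand that.")   # simple error handling / reprompt

def pizza_dialogue():
    order = {
        "topping": ask("Which topping would you like", TOPPINGS),
        "side": ask("Would you like a side dish", SIDE_DISHES),
        "drink": ask("Anything to drink", DRINKS),
    }
    print(f"So that is a pizza with {order['topping']}, "
          f"{order['side']} and {order['drink']}. Is that correct?")
    return order

if __name__ == "__main__":
    pizza_dialogue()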
Composition of project teams