HLT Development

HLT Development. NESPOLE! Pittsburgh Meeting, December 4, 2000. Session Agenda: HLT Server demo; partner updates on HLT module development (SR, Analysis, Generation); status of HLT servers and architecture (functionality, coverage); status of data collection, transcription and annotation.

Presentation Transcript


  1. HLT Development NESPOLE! Pittsburgh Meeting December 4, 2000

  2. Session Agenda • HLT Server demo • Partner updates on HLT module development: SR, Analysis, Generation • Status of HLT servers and architecture: functionality, coverage • Status of data collection, transcription and annotation • Planning of D6: annotated data for 1st showcase • Development timelines

  3. Timeline for Integration and HLT Development • Mar 1, 2001: Demonstration at EC • Feb 23, 2001: Complete system fully tested • Feb 12, 2001: Start intensive tests between E/G/F clients and I agent at APT - initial user studies! • Jan 29, 2001: System integration complete, begin technical tests • Jan 15, 2001: Each site completes integration with Aethra mediator, starts tests of integration

  4. D6: Annotated Data for SC-1 • Description of scenario development • Description of data collection procedures • Summary of data collected • Annotated data for the four languages? (at least samples) [Fabio - check with PO]

  5. Discussion Issues for Tuesday • Prioritize scenarios, focus on ONE? • Functionality of mediator: audio transmission to both sides. • Status of mediator-HLT integration (timestamps etc.) • Lessons learned from data collection - Celine and Susi

  6. Nespole! HLT Objectives • Scalability - expansion of existing domain: • expanding coverage of IF to broader Travel Domain as required for first showcase • development of analysis and generation approaches that support easy expansion • new broad and general IF representation and • appropriate analysis and generation approaches

  7. Nespole! HLT Objectives • Portability - easy expansion into new domains: • extending existing IF with Domain Actions for other domains (Help Desk for 2nd showcase) • new broad IF representation • new analysis and generation approaches that are appropriate for the new broad IF

  8. Nespole! HLT Objectives • Robustness - ability to handle more corrupt input and graceful degradation of performance: • multiple alternative analysis/translation approaches • better identification of out-of-domain utterances and confidence measures

  9. HLT Server Components • Each HLT Server consists of an Analysis Chain and a Generation Chain • Analysis Chain: • Speech Recognition + analysis into IF • Generation Chain: • Generation from IF + Speech Synthesis • Each site free to develop its own analysis and generation technology • Communication between modules is primarily via IF, using the CommSwitch server and protocol
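
The two-chain structure on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: every class and function name is hypothetical, the IF is treated as an opaque string, and the toy stages (`make_toy_server`) stand in for real recognizers, analyzers, generators and synthesizers that each site would supply.

```python
from dataclasses import dataclass
from typing import Callable

# All names below are hypothetical; they mirror the slide's structure only.
IF = str  # the Interchange Format, treated here as an opaque string


@dataclass
class AnalysisChain:
    recognize: Callable[[bytes], str]  # speech recognition: audio -> text
    analyze: Callable[[str], IF]       # analysis: text -> IF

    def run(self, audio: bytes) -> IF:
        return self.analyze(self.recognize(audio))


@dataclass
class GenerationChain:
    generate: Callable[[IF], str]        # generation: IF -> target-language text
    synthesize: Callable[[str], bytes]   # speech synthesis: text -> audio

    def run(self, interchange: IF) -> bytes:
        return self.synthesize(self.generate(interchange))


@dataclass
class HLTServer:
    language: str
    analysis: AnalysisChain
    generation: GenerationChain


def translate(source: HLTServer, target: HLTServer, audio: bytes) -> bytes:
    # Only the IF crosses the boundary between the two servers.
    return target.generation.run(source.analysis.run(audio))


def make_toy_server(language: str) -> HLTServer:
    # Toy stages: "audio" is just UTF-8 text, and "IF<...>" is a placeholder,
    # not real IF syntax.
    return HLTServer(
        language=language,
        analysis=AnalysisChain(
            recognize=lambda audio: audio.decode("utf-8"),
            analyze=lambda text: f"IF<{text}>",
        ),
        generation=GenerationChain(
            generate=lambda interchange: f"[{language}] {interchange}",
            synthesize=lambda text: text.encode("utf-8"),
        ),
    )
```

Because each stage is passed in as a plain callable, a site could swap in its own technology for any stage without touching the others, which is the freedom the slide asks for.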

  10. Main Constraints and Requirements • Maintain site technology freedom and distributed HLT development as much as possible • Leverage off existing C-STAR technology • start with existing analysis and generation engines • use (extend) C-STAR CommSwitch protocol • New server architecture allows: • constant availability for testing and development • plug-and-play of new modules • separation of external API issues from required HLT communication

  11. CMU/UKA Approach • New analysis approach for domain-specific task-oriented language combines rule-based and statistical/trainable methods • New analysis engine for new style IF, using chunk parser followed by new combiner and mapper • Possible addition of MEMT direct translation approach for coverage and robustness • Effective combination and disambiguation of all of the above approaches • New generation from IF using GenKit

  12. New Approach: SALT (Statistical Analyzer for Language Translation) • Combines ML trainable and rule-based analysis methods for robustness and portability • Rule-based parsing restricted to well-defined set of argument-level phrases and fragments • Trainable classifiers (NN, Decision Trees, etc.) used to derive the DA (speech-act and concepts) from the sequence of argument concepts. • Phrase-level grammars are more robust and portable to new domains
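
A minimal sketch of the two-stage idea on this slide, not the SALT implementation itself. The regular expressions stand in for phrase-level grammar rules, and a trivial count-based classifier is swapped in for the NN or decision-tree classifiers the slide mentions. The concept names, the DA labels and the training examples are all invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical phrase-level "grammar": each regex stands in for an
# argument-level rule. Concept names are invented for this sketch.
ARGUMENT_RULES = [
    ("price", re.compile(r"\b(how much|cost|price)\b")),
    ("room", re.compile(r"\b(single|double) room\b")),
    ("time", re.compile(r"\b(today|tomorrow|monday)\b")),
]


def parse_arguments(utterance):
    """Rule-based step: return the argument concepts found, in order of occurrence."""
    text = utterance.lower()
    hits = []
    for concept, pattern in ARGUMENT_RULES:
        match = pattern.search(text)
        if match:
            hits.append((match.start(), concept))
    return tuple(concept for _, concept in sorted(hits))


class CountingClassifier:
    """Trainable step: a count-based stand-in for an NN or decision tree."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # concept sequence -> DA label counts
        self.labels = Counter()             # overall DA label counts

    def fit(self, examples):
        for concepts, label in examples:
            self.counts[tuple(concepts)][label] += 1
            self.labels[label] += 1

    def predict(self, concepts):
        seen = self.counts.get(tuple(concepts))
        if seen:
            return seen.most_common(1)[0][0]
        # Unseen concept sequence: back off to the most frequent label overall.
        return self.labels.most_common(1)[0][0]


def analyze(utterance, classifier):
    """Full pipeline: phrase-level parse, then classify the concept sequence."""
    concepts = parse_arguments(utterance)
    return classifier.predict(concepts), concepts


# Invented training data: (argument-concept sequence, DA label).
TRAINING = [
    (("price", "room"), "request+price"),
    (("room", "time"), "request+booking"),
    (("price", "room"), "request+price"),
]
```

The grammar only has to cover short argument phrases, so moving to a new domain mostly means adding phrase rules and retraining the classifier, which is the portability argument the slide makes.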

  13. Alternative Approach: MEMT (Multi-Engine Machine Translation) • Translates directly into target language (no IF) • Based on Pangloss/Diplomat translation system developed at CMU • Uses a combination of EBMT, phrase glossaries and a bilingual dictionary • English/German system operational • Good fall-back for uncovered utterances
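
The fall-back role described on this slide could look roughly like the sketch below. The function signatures, the confidence threshold and the toy translation paths are assumptions made for illustration; the slides do not specify how the IF-based path and the direct path are combined.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: the IF-based path returns (translation, confidence),
# or None when the utterance is not covered.
IFPath = Callable[[str], Optional[Tuple[str, float]]]
DirectPath = Callable[[str], str]


def translate_with_fallback(
    utterance: str,
    if_path: IFPath,
    direct_path: DirectPath,
    min_confidence: float = 0.5,  # assumed threshold, not taken from the slides
) -> Tuple[str, str]:
    """Prefer the IF-based translation; fall back to direct translation."""
    result = if_path(utterance)
    if result is not None and result[1] >= min_confidence:
        return result[0], "IF"
    return direct_path(utterance), "MEMT"


def toy_if_path(utterance: str) -> Optional[Tuple[str, float]]:
    # Toy coverage table standing in for analysis into IF plus generation.
    covered = {"i would like a room": ("<translation via IF>", 0.9)}
    return covered.get(utterance.lower())


def toy_direct_path(utterance: str) -> str:
    # Toy stand-in for a direct (no-IF) translation engine.
    return f"<direct translation of: {utterance}>"
```

Returning the name of the path alongside the translation makes it easy to log how often the fall-back fires, which is one simple way to track coverage.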

  14. Data Collection for First Showcase • Data collection with APT agent: • real dialogues between users and APT agents • monolingual dialogues • 28+8 English dialogues collected in 4 sessions • 28 dialogues transcribed • none annotated with IF (yet) • Lessons and Comments: • realistic scenario • uneven dialogues: agent dominates conversation • problems with recording/collection setup

  15. Data Transcription and Annotation • May-00 Goals and Timeline: • 50 dialogues per language, 4 dialogues per hour • data collection by end of August • transcription by end of September • Annotation with IF by end of October • Revised schedule...
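
As a rough check on the May-00 targets, the arithmetic below works out the effort they imply. It assumes the "4 dialogues per hour" rate applies to a single pipeline stage and that four languages are involved (the D6 slide refers to "the four languages"); neither assumption is confirmed by the slides.

```python
DIALOGUES_PER_LANGUAGE = 50  # from the slide
DIALOGUES_PER_HOUR = 4       # from the slide; which stage it applies to is an assumption
LANGUAGES = 4                # assumption, based on "the four languages" on the D6 slide

hours_per_language = DIALOGUES_PER_LANGUAGE / DIALOGUES_PER_HOUR
total_dialogues = DIALOGUES_PER_LANGUAGE * LANGUAGES
total_hours = hours_per_language * LANGUAGES

print(f"{hours_per_language} hours per language")
print(f"{total_dialogues} dialogues, {total_hours} hours in total")
```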

  16. Points for Discussion • Definition of the Scenario for SC-1 • Timeline for data annotation • Timeline for HLT module development • Planning D6

  17. Definition of Scenario (May-00) • Analysis of APT email data (Paolo) • 9 main categories • developed ~20 specific scenarios • APT will look at scenarios and prioritize them, and prioritize web pages (for translation to French) within 10 days • We will use existing web pages for APT (in I,G,E), and some translated into French • Goal is to focus on up to 10 scenarios
