Information Retrieval using Intelligent Speech Communication Interface

Presentation Transcript


  1. Information Retrieval using Intelligent Speech Communication Interface Institute of Informatics of the Slovak Academy of Sciences, Bratislava trnka@savba.sk

  2. Overview
     • Introduction
     • IRKR system
     • Architecture
     • Pilot applications
     • Realization of services
     WIKT 2006

  3. What is a Speech Communication Interface (SCI)?
     • An SCI, or Spoken Language Dialog System (SLDS), is a computer system that you can talk to in order to carry out some task
     • Contemporary SLDSs are typically of two kinds:
        • Transaction-based systems, which let the user carry out a transaction, such as buying or selling stocks, or reserving a seat on a plane
        • Information-provision systems, which provide information in response to a query, such as a request for timetable or weather information
     • The cycle of a typical speech dialog in an SCI also shows the main components of the SCI

  4. The Speech Dialog Circle in SLDS
     [Figure: the dialog cycle. Speech → Automatic Speech Recognition (ASR) → words spoken → Spoken Language Understanding (SLU) → meaning → Dialog Management (DM) → action → Response Generation (RG) → Text-to-Speech (TTS) → speech. The figure also carries the label "Data, Rules".]
     • Words spoken: “I need a flight from Košice to Bratislava roundtrip”
     • Meaning: ORIGIN_CITY: KOŠICE, DESTINATION_CITY: BRATISLAVA, FLIGHT_TYPE: ROUNDTRIP
     • Action: GET DEPARTURE DATE
     • Generated response: “Which date do you want to fly from Košice to Bratislava?”
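The flight example on this slide can be traced through the SLU, DM and RG stages with a small sketch. This is only an illustration under stated assumptions: the function names, the regular expression, the slot list and the response templates are hypothetical and are not taken from the IRKR system; ASR and TTS are left out, so the input is already text.

```python
import re

# Hypothetical slot list: the dialog manager asks for the first one missing.
REQUIRED_SLOTS = ["ORIGIN_CITY", "DESTINATION_CITY", "FLIGHT_TYPE", "DEPARTURE_DATE"]


def understand(words):
    """Toy SLU: turn recognized words into a meaning frame (slot -> value)."""
    frame = {}
    match = re.search(r"from (\w+) to (\w+)", words)
    if match:
        frame["ORIGIN_CITY"] = match.group(1).upper()
        frame["DESTINATION_CITY"] = match.group(2).upper()
    if "roundtrip" in words.lower():
        frame["FLIGHT_TYPE"] = "ROUNDTRIP"
    return frame


def next_action(frame):
    """Toy DM: request the first missing slot, else query the database."""
    for slot in REQUIRED_SLOTS:
        if slot not in frame:
            return "GET_" + slot
    return "QUERY_FLIGHTS"


def respond(action, frame):
    """Toy RG: fill a fixed template for the chosen action."""
    if action == "GET_DEPARTURE_DATE":
        return "Which date do you want to fly from {} to {}?".format(
            frame["ORIGIN_CITY"].title(), frame["DESTINATION_CITY"].title())
    if action == "QUERY_FLIGHTS":
        return "Searching for flights."
    return "Please tell me the {}.".format(action[4:].replace("_", " ").lower())


words = "I need a flight from Košice to Bratislava roundtrip"
frame = understand(words)
action = next_action(frame)
print(frame)
print(action)
print(respond(action, frame))
```

Each stage only sees the output of the previous one, which is the property the circle in the figure expresses.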

  5. IRKR
     • the first SLDS able to interact in the Slovak language
     • developed from July 2003 to June 2006
     • supported by the National program for R&D “Building of the information society”

  6. IRKR - partners
     • Technical University of Košice
     • Institute of Informatics, the Slovak Academy of Sciences
     • Slovak University of Technology in Bratislava
     • University of Žilina

  7. IRKR - specification
     • natural interaction
     • multi-user interaction
     • Slovak language
     • fixed and mobile telephone networks
     • access to distributed information (on the Internet)

  8. IRKR - architecture
     • DARPA Communicator architecture
     • ‘hub-and-spoke’
     • each module seeks services from and provides services to the other modules
     • modules communicate with each other through the central software router - the Galaxy hub
     • communicator.sourceforge.net

  9. Galaxy – basic overview
     • distributed, message-based, hub-and-spoke infrastructure optimized for constructing spoken dialogue systems;
     • available under a liberal open source license;
     • not an end-to-end dialogue system, but provides tools for constructing such a system out of a suite of servers;
     • provides a sophisticated and general transport layer for connecting servers and Hubs, as well as a message syntax (does not provide specifications about semantics);
     • the core Galaxy Communicator infrastructure is written in C;
     • support for defining server and connection initialization functions in C, Python, Java and Allegro Common Lisp.
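The hub-and-spoke idea can be sketched in a few lines. This is not the Galaxy Communicator API: the `Hub` class, its methods and the two stand-in servers are hypothetical names chosen for illustration, and a message is modeled as a plain dictionary. The point it shows is that servers register with the hub and only the hub forwards messages between them.

```python
class Hub:
    """Central router: servers register a handler and never call each other."""

    def __init__(self):
        self.servers = {}

    def register(self, name, handler):
        self.servers[name] = handler

    def send(self, target, frame):
        """Deliver one frame to one server and return its reply frame."""
        if target not in self.servers:
            raise KeyError("no server registered as " + repr(target))
        return self.servers[target](frame)

    def run(self, route, frame):
        """Pass a frame through the servers in `route`, one hop at a time."""
        for name in route:
            frame = self.send(name, frame)
        return frame


def fake_asr(frame):
    # Stand-in recognizer: pretends the audio decoded to this text.
    return dict(frame, text="weather bratislava tomorrow")


def fake_slu(frame):
    # Stand-in understanding: split the text into a service and two slots.
    service, place, day = frame["text"].split()
    return dict(frame, service=service, place=place, day=day)


hub = Hub()
hub.register("asr", fake_asr)
hub.register("slu", fake_slu)
result = hub.run(["asr", "slu"], {"audio": b"\x00\x01"})
print(result["service"], result["place"], result["day"])
```

Because every hop goes through `Hub.send`, a server can be replaced or moved without the others noticing, which is the main attraction of this layout.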

  10. IRKR - architecture

  11. Automatic speech recognition server
      • conversion of incoming speech to the corresponding text
      • two speech recognizers, both freely available for non-profit research:
         • ATK - htk.eng.cam.ac.uk/develop/atk.shtml
         • SPHINX - cmusphinx.sourceforge.net
      • Phoneme acoustic models:
         • built following the REFREC 0.96 training procedure
         • acoustic features were conventional 39-dimensional MFCCs, including energy and first- and second-order deltas
         • 3-state left-to-right HMMs
         • context-dependent (triphone) acoustic models
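How a 39-dimensional feature vector arises from static coefficients plus first- and second-order deltas can be sketched as follows. The slide does not state the split, so the sketch assumes the common layout of 13 static values per frame (12 MFCCs plus energy) and a regression window of two frames on each side; the function names are hypothetical, and the static values here are made-up numbers rather than real MFCCs.

```python
def deltas(frames, window=2):
    """Regression-based delta coefficients over +/- `window` frames.

    Frames beyond either end are replaced by the first or last frame.
    """
    n = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static):
    """Append first- and second-order deltas to each static frame."""
    first = deltas(static)
    second = deltas(first)
    return [s + a + b for s, a, b in zip(static, first, second)]


# Ten made-up frames of 13 static values that rise by 1.0 per frame.
static = [[float(t + d) for d in range(13)] for t in range(10)]
features = stack_features(static)
print(len(features), len(features[0]))
```

With 13 static values the stacked vector has 13 + 13 + 13 = 39 components per frame, matching the dimensionality quoted on the slide.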

  12. Databases used for ASR training
      • SpeechDat-E SK
         • 1000 speakers, PSTN (office, home, phone booth)
      • MobilDat SK
         • 1100 speakers, GSM networks (office, home, street, vehicle, public building)
      • Both are balanced for:
         • age, regional accent, and sex of the speakers
      • Every speaker recorded 50 files: numbers, names, dates, money amounts, embedded command words, geographical names, phonetically balanced words, phonetically balanced sentences, Yes/No answers and one longer non-mandatory spontaneous utterance

  13. Text-to-speech synthesis
      • TTS converts outgoing information in text form to speech
      • intelligibility, naturalness
      • we developed two TTS modules using two different approaches:
         • diphone
            • intelligible speech
            • flexible and totally domain-independent
            • computationally inexpensive
            • small memory footprint
            • sounds a bit robotic and tedious
         • unit-selection
            • better naturalness
            • some problems with intelligibility
            • limited domain
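The small footprint of the diphone approach follows from its unit inventory: every utterance is built from phoneme-to-phoneme transitions. A minimal sketch of that first step, assuming a made-up phoneme notation and an underscore as the silence symbol (neither comes from the IRKR synthesizer):

```python
def to_diphones(phonemes, silence="_"):
    """List the diphone units needed to speak a phoneme sequence.

    The sequence is padded with silence on both sides, so N phonemes
    always need N + 1 diphones.
    """
    padded = [silence] + list(phonemes) + [silence]
    return [left + "-" + right for left, right in zip(padded, padded[1:])]


print(to_diphones(["a", "h", "o", "j"]))
```

A diphone synthesizer would then look up one recorded unit per entry and concatenate them, whereas a unit-selection synthesizer chooses among many recorded candidates for each position.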

  14. TTS architecture
      [Figure: block diagrams of the diphone synthesizer and the unit-selection synthesizer]

  15. Dialogue manager
      • The dialogue manager controls the dialogue between the system and the user
      • The heart of the dialogue manager is an interpreter of the VoiceXML mark-up language, which:
         • simplifies speech application development
         • enables distributed application design
         • accelerates the development of interactive voice response (IVR) environments

  16. Dialogue manager architecture

  17. Audio server
      • provides the whole information system with a reliable multi-user connection to the telephone networks
      • supports telephone hardware - Dialogic D120/41JCT-LSEuro card
      • direct (broker) connection between the audio server and the ASR server or TTS server

  18. Dialogue manager architecture

  19. Information server - IS
      • the IS connects the system to information sources and retrieves the information required by the user
      • a special IS for every pilot application, each with its own web wrapper
      • a rule-based, ad hoc IS that searches only a few predefined web servers with a relatively well-known page structure does a much better job
      • returns the data in XML format
      • caches results with a user-defined expiration
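Caching with a user-defined expiration can be sketched as a small wrapper around whatever function retrieves the data. The class name, its parameters and the stand-in fetch function are hypothetical; the slide does not say how the IRKR information server implements its cache.

```python
import time


class ExpiringCache:
    """Keep each fetched result for `ttl_seconds`, then fetch it again."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch          # function that retrieves fresh data
        self.ttl = ttl_seconds      # user-defined expiration
        self.clock = clock          # injectable so the expiry can be tested
        self.store = {}             # key -> (time stored, value)

    def get(self, key):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]         # still fresh: no request to the web server
        value = self.fetch(key)
        self.store[key] = (now, value)
        return value


def fetch_forecast(town):
    # Stand-in for a real web request.
    return "forecast for " + town


cache = ExpiringCache(fetch_forecast, ttl_seconds=600)
print(cache.get("Bratislava"))
print(cache.get("Bratislava"))  # served from the cache
```

A short expiration suits data that changes during the day, such as weather, while a timetable could be kept much longer.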

  20. IS architecture

  21. Web wrapper
      • navigation through the web server
      • extraction from the web pages
      • mapping onto a structured format (XML)
      • data verification
      • as robust as possible against changes in the structure of the web pages
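The extraction, mapping and verification steps can be sketched with the Python standard library. The sample page, its `class` attributes and the XML element names are invented for illustration and do not reflect the layout of any real site or the format used by IRKR; navigation is omitted, so the page is given as a string.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Made-up page fragment; the second row carries an unusable temperature.
SAMPLE_PAGE = """
<table>
<tr><td class="town">Bratislava</td><td class="sky">sunny</td><td class="temp">32</td></tr>
<tr><td class="town">Košice</td><td class="sky">cloudy</td><td class="temp">n/a</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Extraction: collect the text of each <td>, keyed by its class."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.field = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append({})
        elif tag == "td":
            self.field = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.field = None

    def handle_data(self, data):
        if self.field and self.rows:
            row = self.rows[-1]
            row[self.field] = row.get(self.field, "") + data


def wrap(page):
    """Map extracted rows onto XML, dropping rows that fail verification."""
    parser = CellCollector()
    parser.feed(page)
    root = ET.Element("forecasts")
    for row in parser.rows:
        town = row.get("town", "").strip()
        temp = row.get("temp", "").strip()
        # Verification: a town name and an integer temperature are required.
        if not town or not temp.lstrip("-").isdigit():
            continue
        item = ET.SubElement(root, "forecast", town=town)
        ET.SubElement(item, "sky").text = row.get("sky", "").strip()
        ET.SubElement(item, "temp").text = temp
    return ET.tostring(root, encoding="unicode")


print(wrap(SAMPLE_PAGE))
```

Keying on the `class` attribute rather than on column position is one way to tolerate small layout changes, although any wrapper of this kind still breaks when the page is restructured.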

  22. Pilot applications
      • “Weather forecast in Slovakia”
         • www.meteo.sk; www.shmu.sk
         • weather forecast for about 80 Slovak district towns
         • Place: district town or holiday locality
         • Date: relative date / absolute date
      • “Timetable of Slovak Railways”
         • www.cp.sk
         • information about the Slovak railways timetable
         • Starting place: railway station in Slovakia
         • Destination place: railway station in Slovakia
         • Date: relative date (today, tomorrow etc.) / absolute date (“the twentieth of December” etc.)
         • Time: departure time (hour, minute)
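Both pilot applications accept relative and absolute dates, which have to be resolved to a calendar date before the information server is queried. A minimal sketch, assuming English input and a simplified absolute form such as "20 December" instead of spoken ordinals; the function name, the word list and the rule that a date already past refers to next year are assumptions, not IRKR behavior.

```python
from datetime import date, timedelta

RELATIVE_DAYS = {"today": 0, "tomorrow": 1}
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]


def resolve_date(expression, today):
    """Turn 'tomorrow' or '20 December' into a calendar date.

    Raises ValueError for unknown months or dates that do not exist.
    """
    text = expression.strip().lower()
    if text in RELATIVE_DAYS:
        return today + timedelta(days=RELATIVE_DAYS[text])
    day_text, month_text = text.split()
    month = MONTHS.index(month_text) + 1
    candidate = date(today.year, month, int(day_text))
    if candidate < today:
        # Assumed rule: a date already past means the same day next year.
        candidate = candidate.replace(year=today.year + 1)
    return candidate


reference = date(2006, 11, 20)
print(resolve_date("tomorrow", reference))
print(resolve_date("20 December", reference))
```

Passing the reference day in as an argument, instead of reading the clock inside the function, keeps the resolution step easy to test.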

  23. Realization of services
      • available at: +421 55 602 2297, +421 2 5941 1118 (T-com), +421 911 650 038 (T-Mobile), +421 918 717 491 (Orange), irkr_pub (skype)
      • IRKR on the web - irkr.fei.tuke.sk
      Here we show a typical dialogue between the user (U) and the system (S):
      S: Welcome to the IRKR portal. Would you like to play the introduction?
      U: No.
      S: Choose one of the services: Weather forecast or Railway timetable.
      U: Weather forecast.
      S: Please name a city and a day for which you want to get the weather forecast.
      U: Bratislava, tomorrow.
      S: Did you say Bratislava, tomorrow?
      U: Yes.
      S: The weather forecast for Bratislava for tomorrow is: sunny, 32 degrees Celsius...

  24. Thank you for your attention
