Speak to your customers loudly and clearly.
Elan Speech mission statement: Beyond the words
• As a leading world player in Text to Speech, Elan Speech focuses exclusively on the development and marketing of natural-language interfaces.
Elan Speech brings organisations new ways of interacting with their clients, providing new opportunities to speech-enable their world through revenue-generating applications.
"Our mission is to vocalize content to the end user with efficiency and accuracy, whatever the situation." Antoine Kauffeisen, CEO, Elan Speech
Elan Speech profile
• Private company, headquartered in Toulouse, France.
• Funded by venture capital (raised in 2002): IRDI, Part’Com, WT.
• Strong in-house R&D; owner of the Elan Sayso™ technology.
• Wide range of TTS technologies, with up to 12 languages and more to come.
• Large partner and customer network in Europe, Eastern Europe, North America, Latin America, Japan and India.
• New growth-oriented management, with a long-term vision and roadmap.
Worldwide speech provider
Elan Speech background
Elan Speech was created in June 2002 from the assets of the company previously named Elan Informatique.
• 1980: creation of Elan Informatique
• 1986: beginning of work on text to speech (LPC technology)
• 1996: exclusive focus on TTS (diphone concatenation technology: Elan Tempo™)
• 2000: company sold to Lernout & Hauspie (L&H)
• 2001: legal battle against L&H, won in November 2001
• February 2002: decision to go Chapter 11 (RJ)
• June 2002: creation of Elan Speech, acquisition of all Elan Informatique assets, new management, new capital structure
• July 2002: launch of the new high-end TTS technology, Elan Sayso™
Elan Speech: figures
• More than 3 million licenses in automotive and multimedia applications.
• More than 10,000 ports deployed in telephony services.
• More than 350 active customers.
• 12 languages already supported.
• 3 target markets: Telecom, Multimedia and Mobility.
• 2 text-to-speech technology families: Elan Tempo™ and Elan Sayso™
Elan Speech: geographical markets
• Focus #1: Europe (Germany, France, UK, Spain, Netherlands, Belgium, Italy, Switzerland …)
• Focus #2: North America
• Focus #3: Latin America (Brazil, Chile, Argentina, Venezuela)
• Focus #4: Middle East (Arabic), India, Japan, Korea, Australia
Elan Speech’s markets
Telecom: server-based vocalization of content for multiple users over the phone
• For enterprises: unified messaging, auto attendant, CRM
• For telcos: unified messaging, voice portal, SMS2Voice, directory and reverse directory
Automotive and mobile terminals: on-board and off-board speech solutions that free the user from reading instructions
• On-board and off-board car navigation systems
• Traffic information
• Telematics, RDS-TMC
Multimedia: personal software on PC & Mac
• Edutainment software
• Disabilities assistance
• Personal productivity
Requirements of Elan Speech’s markets
> TTS component for Telecom
• High quality
• High density (ports per server)
• High reliability (24/7)
• Support of markup languages and standard APIs
• Support of 3 major OS families: Windows NT/XP/2000, Solaris Sparc, Linux
> TTS component for Automotive and mobile terminals
• High quality
• Low footprint (2 to 16 MB depending on platform)
• Support of multiple RTOS (VxWorks, WinCE, PSOS, Neutrino, etc.)
• Support for multiple processors
• Support for phonetic input/output and phonetic lexicons of proper names
> TTS component for Multimedia
• High quality
• High flexibility (speed and pitch adjustment, voice customization)
• Availability for PC & Mac platforms
• Support of standard APIs
Elan Speech’s direct & indirect business model
Offer levels shown in the diagram:
• Core technology (TTS licences)
• Value-added services around the technology (custom voice, quality monitoring, expertise)
• Solution, platform
• Service / consumer product
Parties shown in the diagram:
• Elan Speech
• VAR & OEM (integrator, platform vendor, publisher)
• End customer (ASP, service provider, telco, car manufacturer)
• End customer (subscriber, mass-market user)
Elan Speech TTS technologies
> Diphone-based concatenative TTS
Advantages:
• High density (over 250 ports per server)
• Small footprint (2 to 6 MB)
• Flexible (pitch and speed adjustment, prosody copying)
• High intelligibility
• 12 languages supported
Disadvantage:
• Robotic sounding
Markets/applications targeted:
• Automotive & consumer electronics (low footprint)
• High-density, short-ROI server-based TTS (telephony), low cost of ownership
• Multimedia software products
Elan Speech TTS technologies
> Unit selection concatenative TTS
Advantages:
• Very high quality
• Highly natural
• Flexible (pitch and speed adjustment, timbre alteration, whisper feature)
• Support for custom voices (“Speech Brand” program)
Disadvantages:
• Lower density (50 ports per server)
• Larger footprint (16 to 70 MB)
Markets/applications targeted:
• High-end telephony applications
• Mass-market telco services (voice portal, news)
• Public address
• High-end multimedia software
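The port densities quoted for the two technology families (over 250 ports per server for diphone synthesis, 50 for unit selection) translate directly into a server count for a given call volume. The sketch below is only that arithmetic; the function name and the dictionary keys are illustrative, and 250 is used as a conservative value for "over 250".

```python
import math

# Ports per server as quoted on the slides; 250 is a lower bound ("over 250").
PORTS_PER_SERVER = {"diphone": 250, "unit_selection": 50}

def servers_needed(concurrent_calls: int, technology: str) -> int:
    """Return the number of servers required to serve the given number of calls."""
    if concurrent_calls < 0:
        raise ValueError("concurrent_calls must be non-negative")
    return math.ceil(concurrent_calls / PORTS_PER_SERVER[technology])

if __name__ == "__main__":
    for tech in PORTS_PER_SERVER:
        print(tech, servers_needed(1000, tech))
```

For 1,000 simultaneous calls this gives 4 servers with diphone synthesis and 20 with unit selection, which is the trade-off between density and naturalness that the two slides describe.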
Comparison of Elan’s TTS technologies
[Diagram: two parallel processing chains, labelled "Elan studio"]
• Stages common to both chains: pre-processing, text normalization, abbreviation & exception handling, phonetic transcription
• Diphone chain: prosody calculation, synthesizer, diphone database, audio output
• Unit selection chain: unit selection, decoder, units database, audio output
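The front-end stages named in the comparison (pre-processing, text normalization with abbreviations and exceptions, phonetic transcription) can be pictured as a chain of functions. The following is a toy sketch under stated assumptions, not Elan's implementation: the lexicon entries, the digit table and the letter-by-letter "transcription" are placeholders chosen only to make the chain runnable.

```python
import re
from typing import Dict, List

# Hypothetical lexicons; real products load these per language.
ABBREVIATIONS: Dict[str, str] = {"dr.": "doctor", "st.": "street", "km": "kilometres"}
DIGITS: Dict[str, str] = {"1": "one", "2": "two", "3": "three"}

def pre_process(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()

def normalize(text: str) -> List[str]:
    """Lower-case the text, split it into tokens, expand abbreviations and digits."""
    tokens = text.lower().split()
    return [ABBREVIATIONS.get(tok, DIGITS.get(tok, tok)) for tok in tokens]

def to_phonetic(tokens: List[str]) -> List[str]:
    """Placeholder transcription: spell out the letters of each word."""
    return ["-".join(ch for ch in tok if ch.isalpha()) for tok in tokens]

def run_front_end(text: str) -> List[str]:
    """Chain the three front-end stages in the order shown on the slide."""
    return to_phonetic(normalize(pre_process(text)))

if __name__ == "__main__":
    print(run_front_end("  Dr.  Smith lives 2 km away "))
```

The back end (prosody calculation plus synthesizer, or unit selection plus decoder) would consume the output of `run_front_end`; it is left out here because the slides give no detail on it.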
Positioning of the two technologies
[Chart: quality / naturalness plotted against footprint (axis marks at 4, 12 and 32 MB), with "human speaker" as the reference for naturalness]
• Elan Tempo™: 2-6 MB
• Elan Sayso™ Embedded: 10-16 MB
• Elan Sayso™: 25-50 MB
R&D approach: automation & tools
> Elan Studio: a strong set of R&D tools
Advantages:
• Automates most R&D tasks
• Built-in signal processing
• Built-in linguistic analysis
• Built-in phonetic analysis
• Built-in database generation
• Automatic segmentation
• Voice factory
• Fast and easy tuning and improvement
• Optimization tools
= Key component for R&D to roll out languages and voices rapidly.
Elan Speech products framework
• 5 APIs supported (SAPI 4, SAPI 5, NSC API, NVIF, JavaSpeech); a 6th, Speechmanager, to be discussed
• Product layer: OS-related level, native API
• Common product framework for Elan Tempo™ and Elan Sayso™, providing full compatibility
• Stages common to both technologies: pre-processing, text normalization, phonetic transcription
• Technology-specific stages: prosody model and synthesizer; unit selection and decoder (HNM)
• Audio layer
Elan Speech’s market (1)
Telecom: content vocalisation solutions for operators & enterprises.
Applications:
• Customer service automation
• IVR
• Voice portal
• SMS to voice
• Unified messaging and e-mail reading
Elan Speech offer: Elan Sayso™ Telecom & Elan Tempo™ Telecom
> Multilingual, multi-channel, carrier-grade TTS engine
> Client-server architecture, heterogeneous architectures supported
> Load balancing (multi-server architecture), centralized supervision
> Dynamic user lexicons (abbreviations, exceptions)
Elan Speech’s market (2)
Elan Sayso™ Telecom & Elan Tempo™ Telecom
• Available for Windows NT/2000/XP, Solaris Sparc, Linux x86
• Support for 12 languages, with male and female voices
• Support for 5 APIs (SAPI4, SAPI5, NVIF, Elan NSC API, JavaSpeech)
• Cross-platform integration with the Elan NSC API
• Client-server architecture, heterogeneous architectures supported
• Load balancing (multi-server architecture), centralized supervision
• Dynamic user lexicons (abbreviations, exceptions)
Specific modules included:
• E-mail pre-processing
• Automatic language identification
• Markup languages supported: SSML (VoiceXML), JSML
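Since SSML is listed among the supported markup languages, a short example of what an application might send can help. The sketch below builds a small SSML 1.0 document with Python's standard `xml.etree.ElementTree`; the element names (`speak`, `prosody`, `s`, `break`) come from the W3C SSML specification, while the helper name `build_ssml` and its defaults are illustrative. Which SSML elements the Elan engines actually honour is not stated in the slides, so treat this as an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # SSML 1.0 namespace

def build_ssml(sentences: List[str], rate: str = "medium", pitch: str = "medium",
               lang: str = "en-US", pause_ms: int = 300) -> str:
    """Wrap sentences in a prosody element, with a pause between sentences."""
    speak = ET.Element("speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang})
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
    for index, sentence in enumerate(sentences):
        s = ET.SubElement(prosody, "s")
        s.text = sentence
        if index < len(sentences) - 1:
            # Insert a pause between sentences, not after the last one.
            ET.SubElement(prosody, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")

if __name__ == "__main__":
    print(build_ssml(["You have two new messages.", "Press one to listen."], rate="slow"))
```

The `rate` and `pitch` attributes correspond to the speed and pitch adjustment that the slides list as a feature of both technologies.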
Elan Speech markets (3)
Multimedia & Web: products for personal communication and content enhancement.
Applications:
• Edutainment software
• Aid for the disabled
• Personal productivity
• Personal Web assistant (agent)
• Voice-enabled tutorials
• Consumer electronics vocal interface
Specific support:
> Elan Sayso™ for Multimedia & Elan Tempo™ for Multimedia, TTS software components for Windows and Mac OS X platforms.
> Elan Sayso™ PocketSpeech & Elan Tempo™ PocketSpeech, TTS software components for Pocket PC.
Elan Speech markets (4)
Automotive & mobile devices: multi-platform TTS for compact embedded solutions.
Applications:
• Embedded navigation aid
• Traffic information
• Navigation systems for PDAs
• Telematics services
• Vocal interface on professional devices
• Public address services
Elan Speech offer:
> A wide range of ports covering more than 10 RTOS and 20 processors, specifically adapted to customers’ platforms.
> PocketSpeech, a specific offer for PDAs running Windows CE.
Elan Speech markets (5)
Elan Sayso™ PocketSpeech, Elan Tempo™ PocketSpeech
• Multilingual TTS engine for PDA-based applications
• Support for both Tempo and Sayso technologies
• Available for WinCE 2.xx, WinCE 3.0 / PocketPC 2002 / WinCE.Net
• Support for 8 languages, with male and female voices
• Support for 3 APIs (SAPI4, SAPI5, Elan NSC API)
• Tempo PocketSpeech™: small-footprint engine, high quality: 3 to 6 MB
• Sayso PocketSpeech™: high quality and high naturalness: 8 to 16 MB
Elan Speech markets (6)
Elan Sayso™ Embedded, Elan Tempo™ Automotive
• Multilingual TTS engine for embedded platforms
• Supports Tempo technology in 8 languages, with male and female voices
• Available for WinCE 2.xx, WinCE 3.0 Automotive, QNX, Neutrino, VxWorks, PSOS, µITRON, RTXC, Embedded Linux
• Runs on Intel x86, Motorola 68332, Motorola 68360, Motorola PowerPC, Hitachi SuperH (SH3, SH4), Philips Trimedia, OKI 763X, OKI ML2110, StrongARM, MIPS…
• Support for 3 APIs (SAPI4, SAPI5, Elan NSC API)
• Unlimited vocabulary (names, numbers and currencies, dates, free text, e-mail, etc.)
• High-quality voice, smooth and natural intonation with concatenative synthesis
• Voice speed and voice pitch control
• User abbreviation lexicon for each language
• Text tags
• Phonetic input/output (SAMPA, IPA)
A-TTS (Applicative Text-to-Speech) for a mix of prompts and Elan Sayso™ text input
> A-TTS: Applicative Text to Speech means that prompts are fully tunable and updatable (application corpus) and are treated like “sound exceptions” within the generic TTS system.
[Diagram: processing chain]
• Pre-processing
• Text normalization, with textual abbreviations & exceptions
• Phonetic transcription
• Sound exceptions: application-dependent prompts (encoded in HNM frames)
• Unit selection from the generic units database
• TTS decoder (HNM)
• Audio output
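One way to picture prompts being treated as "sound exceptions" is a planner that scans the input text, hands any phrase found in the prompt corpus to the recorded prompt, and sends everything else to the generic TTS. The sketch below uses a greedy longest-match over words; this matching strategy, the function name and the `max_words` limit are assumptions made for illustration, since the slides do not describe how the engine performs the lookup.

```python
from typing import List, Set, Tuple

def plan_segments(text: str, prompts: Set[str], max_words: int = 8) -> List[Tuple[str, str]]:
    """Split text into ("prompt", phrase) and ("tts", phrase) segments.

    `prompts` holds lower-cased phrases for which a recording exists.
    """
    words = text.split()
    segments: List[Tuple[str, str]] = []
    pending: List[str] = []  # words waiting to be sent to generic TTS
    i = 0
    while i < len(words):
        match_len = 0
        # Try the longest candidate phrase first.
        for n in range(min(max_words, len(words) - i), 0, -1):
            if " ".join(words[i:i + n]).lower() in prompts:
                match_len = n
                break
        if match_len:
            if pending:
                segments.append(("tts", " ".join(pending)))
                pending = []
            segments.append(("prompt", " ".join(words[i:i + match_len])))
            i += match_len
        else:
            pending.append(words[i])
            i += 1
    if pending:
        segments.append(("tts", " ".join(pending)))
    return segments

if __name__ == "__main__":
    corpus = {"thank you for calling", "your balance is"}
    print(plan_segments("Thank you for calling Your balance is 42 euros", corpus))
```

In this example the two fixed phrases would be played from recordings and only "42 euros" would go through unit selection, which is the mix of prompts and free text that the slide title refers to.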
Applicative TTS and recorded messages included in the TTS system
• Recorded prompts for applicative TTS: recorded or TTS-generated applicative prompts stored at 1.6 Kbps (22 kHz, 15 ms frames)
• Prompts generated with the full Elan Sayso version
[Diagram: footprint with A-TTS]
• Sayso: 30-50 MB
• Sayso Embedded: 10-16 MB
• A-TTS: 1/3 to 1/4 of the size, 6 to 10 MB
• HNM frames of 10 ms to 15 ms: 50% of units removed (pruning)
• HNM frames of 15 ms to 20 ms: more than 70% of units removed (pruning)
D-TTS (Distributed Text-to-Speech) for Web and telematic applications
[Diagram: Elan Sayso TTS server and its clients]
• Web clients: ActiveX client or Java applet client, over a TCP/IP socket on an Internet connection; servlet (Java security) over HTTP
• Telematic clients: embedded Java client, over a TCP/IP socket on GPRS/UMTS, through a GPRS/UMTS gateway
• Application (server)
• Less than 16 Kbps of bandwidth used for streamed TTS at a 22 kHz sampling rate
Coder performance for applicative TTS and distributed TTS
• Skip 1: 5 ms frames: transparent
• Skip 2: 10 ms frames: no audible change
• Skip 3: 15 ms frames: slightly degraded acoustic quality, hard to perceive
• Skip 4: 20 ms frames: audibly degraded acoustic quality, acceptable
With Sayso Embedded, at a 22 kHz sampling rate, 1 hour of recorded prompts for applicative TTS takes less than 6.5 MB.
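The storage and bandwidth figures on these slides can be cross-checked with a little arithmetic. The sketch below converts between a bit rate and the size of one hour of audio; it assumes decimal units (1 Kbps = 1,000 bits per second, 1 MB = 1,000,000 bytes), which the slides do not specify, and the function names are illustrative.

```python
def megabytes_per_hour(kilobits_per_second: float) -> float:
    """Size in MB of one hour of audio encoded at the given bit rate."""
    bytes_per_second = kilobits_per_second * 1000 / 8
    return bytes_per_second * 3600 / 1_000_000

def kbps_for_budget(megabytes: float, seconds: float = 3600) -> float:
    """Bit rate in Kbps that fills the given number of MB in the given time."""
    return megabytes * 1_000_000 * 8 / seconds / 1000

if __name__ == "__main__":
    print(round(kbps_for_budget(6.5), 1))      # rate implied by 6.5 MB per hour
    print(round(megabytes_per_hour(16), 2))    # size at the 16 Kbps streaming ceiling
    print(round(megabytes_per_hour(1.6), 2))   # size if 1.6 Kbps means kilobits
    print(round(megabytes_per_hour(1.6 * 8), 2))  # size if it means 1.6 kilobytes/s
```

Under these assumptions, 6.5 MB per hour corresponds to about 14.4 Kbps, in line with the "less than 16 Kbps" streaming figure. The "1,6Kbps" storage rate quoted for prompts would give only 0.72 MB per hour if read as kilobits; read as 1.6 kilobytes per second (12.8 Kbps) it gives 5.76 MB per hour, which fits the "less than 6.5 MB" claim. That reading is an interpretation, not something the slides state.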
Elan Speech Web solutions
> “Digalo Cast”: distributed TTS over an IP network (D-TTS)
• High-quality server-based Sayso and Tempo TTS
• Small-footprint remote client, native Java (100 KB)
• Low-bandwidth connection (less than 15 Kbps)
• “HiFi” restitution quality (22 kHz, no degradation)
• Lip synchronization tags for animated Web agents (3D agents)
[Diagram]
• Digalo Cast server running Elan Tempo or Elan Sayso technology, serving 30 to 300 users simultaneously
• Java client for Digalo Cast (100 KB)
• IP connection: less than 15 Kbps of bandwidth used; TCP, UDP or HTTP encapsulated
Elan Speech tools
Elan Virtual Speaker: voice prompt creation tool for telephony or multimedia applications
• Quick and easy to use, available for audio updates 24/7
• Automatic generation
• Batch processing
• Editing features
• Multiple output formats
• Adjustable pitch and speed
• 8 kHz and 22 kHz sampling
• A-law, µ-law
Elan Speech technology tools
• Elan Prosel: applies natural intonation to synthetic speech
• Elan Lexitool: edits and enriches exception and abbreviation lexicons
Elan Speech services
• Proprietary voice, “Speech Brand”: an exclusive TTS voice based on an existing speaker of your choice. Based on Elan Sayso technology, the new voice will mimic the timbre, the intonations and the accent of the original speaker.
• Technology adaptation & porting: Elan’s core technology adapted to a specific platform (processor, RTOS), especially for embedded TTS.
• Quality monitoring: a global service offer to continuously improve the TTS output for a specific application. Audit of written content, specialization of the TTS.
Speech Brand: the process to create custom Sayso voices
Steps within the Elan Studio voice factory framework:
• Text corpus (5 weeks)
• Recordings (4 weeks)
• Autosegmentation (2 weeks)
• Manual segmentation verification (2 to 4 months depending on size)
• Database generation and optimization (2 weeks)
• Ready for integration
Notes from the diagram:
• Requires the speaker. Might be reduced for Latin languages.
• Reduced if the language is already available with Sayso.
• Longest part, required to achieve high quality. Currently investigating how to reduce and automate part of this task.
• Computer processing.
Overall: a 5 to 7 month process.
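The "5 to 7 month" total can be checked against the individual step durations. The sketch below simply adds them up; it assumes the steps run one after another with no overlap and converts weeks to months at 52/12 weeks per month, neither of which is stated on the slide.

```python
from typing import Dict, Tuple

WEEKS_PER_MONTH = 52 / 12  # assumed conversion factor

# Durations quoted on the slide for the steps measured in weeks.
STEP_WEEKS: Dict[str, int] = {
    "text corpus": 5,
    "recordings": 4,
    "autosegmentation": 2,
    "database generation and optimization": 2,
}
# Manual segmentation verification is quoted as a range in months.
VERIFICATION_MONTHS: Tuple[int, int] = (2, 4)

def total_months() -> Tuple[float, float]:
    """Return the (shortest, longest) total duration in months."""
    fixed_months = sum(STEP_WEEKS.values()) / WEEKS_PER_MONTH
    shortest = fixed_months + VERIFICATION_MONTHS[0]
    longest = fixed_months + VERIFICATION_MONTHS[1]
    return round(shortest, 1), round(longest, 1)

if __name__ == "__main__":
    print(total_months())
```

The four week-based steps add up to 13 weeks, about 3 months, so the total lands at roughly 5 to 7 months, with manual verification accounting for most of the spread.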
Elan’s marketing tools
Elan’s partner program: Web, news and trade show support
• Elan news & EvenTTS: a monthly newsletter dedicated to customers’ applications and deployments, sent out to a highly focused database of 13,000 e-mail addresses.
• Digalo.com: a website dedicated to promoting consumer speech-enabled applications.
• Joint marketing agreements: a program to refer qualified leads.