
BIOVI Text To Speech (TTS) project

BIOVI Text To Speech (TTS) project. Nordisk sprogmøde, 26–30 August 2013. Kristinn Halldór Einarsson, project manager and chairman of Blindrafélagið, the Icelandic organization of the visually impaired (BIOVI). Overview. Life quality taken for granted.


Presentation Transcript


  1. BIOVI Text To Speech (TTS) project • Nordisk sprogmøde, 26–30 August 2013 • Kristinn Halldór Einarsson, project manager and chairman of Blindrafélagið, the Icelandic organization of the visually impaired (BIOVI)

  2. Overview • Life quality taken for granted. • Visually impaired people and Text to Speech systems. • BIOVI Text to Speech project. • Listening examples and tools presentation.

  3. Quality of life • How would it affect us if we lost our ability to read? • This will most likely happen to some of us in our retirement years. • What can be done to limit the huge negative impact on the quality of life of those who are going to lose their ability to read in a conventional manner? • AND • Can it be accepted that a growing part of our population could lose the ability to enjoy reading independently?

  4. Who are they? • 5% of people aged 70 and older are affected by late-stage age-related macular degeneration (AMD). No effective treatments are available today. • AMD mainly affects the central vision (reading vision). • There are around 800 visually impaired individuals in Iceland as a result of late-stage AMD. By 2030 that number is expected to double to 1600. The total population of visually impaired people in Iceland is 1600. • The organization of people with dyslexia in Iceland claims that up to 25% of adults are dealing with dyslexia.

  5. ... a bit of history • 1003 The first known tales of efforts to build a talking machine. • 1968 The first computer speech synthesizer is built. • 1988 The Universities of Iceland and Stockholm start cooperation. • 1990 The Swedish company Infovox releases Sturla, the first Icelandic TTS voice. • 2000 Snorri, an updated and improved version of Sturla, is released. • 2006 Ragga, a new Icelandic TTS voice, is released by Nuance. • 2012 Dóra and Karl, new female and male voices, are released by Ivona.

  6. Text To Speech (TTS) technology? • TTS systems are linguistic tools that transform text in a digital format into speech. • Modern TTS systems need to operate on different operating systems and devices, such as computers, tablets, smart phones, AMDs, mp3 players and other computing tools. • TTS voices are built for each language and need to be available in different sizes and qualities. • The quality of a TTS voice is measured with respect to its listening qualities and its closeness to natural reading.

  7. ICT, accessibility & quality of life • ICT (information and communication technology) can tremendously increase the independence and quality of life of visually impaired people, as it opens up whole new educational, leisure and employment possibilities. • A key element is a well-designed TTS system in the mother tongue of those who are to benefit. The mother tongue is an essential part of every nation's identity and legal rights. • A TTS system benefits not only visually impaired people but also the much larger population with learning disabilities.

  8. TTS voices are marketing commodities • Producers of TTS voices expect a return on their investment. • Languages spoken by many people represent a market with big demand that can generate big supply and attractive business opportunities. • Languages spoken by few people represent a market with little demand and little or no supply, offering little or no business opportunity. • What is the situation for languages spoken by few people when it comes to the modern ICT linguistic tools that are becoming more and more important in modern communications?

  9. Mother tongue “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart” – Nelson Mandela.

  10. BIOVI Text-to-Speech project • The project was based on two pillars: improved quality of life and cultivation of the Icelandic language.

  11. Project's main definitions • Multiple usage options. • Very good listening qualities. • License fee arrangement. • Open to further development. • Some control over future development. • Sustainable business model.

  12. Selecting TTS producer • After exploring and taking stock of different TTS producers, the Polish company Ivona was selected to build the new Icelandic TTS voices. • The Royal National Institute of Blind People (RNIB) in the UK has enjoyed very good cooperation with Ivona. Ivona was finishing building Welsh TTS voices. • The Ivona voices have received many awards for their accuracy and listening quality.

  13. Ivona compared (arsnews.com)

  14. Technology - BrightVoice • BrightVoice - a new age for Text-to-Speech. • BrightVoice technology guarantees smooth, natural speech. • New language models provide intelligent text interpretation. • Up to 10 times faster speech generation. • Crystal clear sound due to noise and distortion reduction.

  15. Technology – Rapid Voice Development • Rapid Voice Development – fast building of IVONA Voices. • RVD (Rapid Voice Development) technology makes the process of building IVONA Voices fast and relatively cheap. • It uses a set of tools that model linguistic issues such as subvocalization, accentuation and intonation. • It also makes it possible to determine the speech signal in the original speech recordings efficiently, quickly and accurately.

  16. The Ivona technology

  17. Development in the number of Ivona voices • 18 languages

  18. Operating systems and the Ivona voices • The Ivona voices are capable of operating on: • Windows XP/Vista/7/8 • Mac • Unix • iOS (Apple iPhone & iPad) • Android • Windows Mobile

  19. The project in steps • December 2010 – March 2011: Ivona visited, agreement drawn up and signed. • Summer 2011: 10,000 sentences selected from the Icelandic corpus in Leipzig, voice talents selected, sentences recorded. Voices named Dóra & Karl. • February 2012: Beta 1, with ca. 900 sentences, released. Evaluation and feedback by a team of linguists and users. • April 2012: Evaluation and feedback on Beta 2 concluded. • June 2012: Beta 3 released and distribution starts. • October 2012: 10,000 additional pronunciation examples added to the corpus. • June 2013: Final version of Dóra and Karl released.

  20. Cost and plans • Total cost was 500,000 euros (85 million IKR). • The project was close to fully financed when the agreement was signed. • Costs and delivery times went according to plan, and the estimates turned out to be accurate.

  21. Financial contributors • Blindrafélagið (inheritance from Dora Stefánsdóttir): 25.0 million IKR, 29% • Lions, national collection "The Red Feather": 19.3 million IKR, 23% • Foundation for disability related projects: 15.0 million IKR, 17% • Ministries of welfare and education: 11.3 million IKR, 13% • The disability organization of Iceland: 10.0 million IKR, 12% • Blindravinafélagið (Friends of the blind): 5.0 million IKR, 6% • Total: 85.6 million IKR, 100%
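
The percentages on this slide sum to exactly 100, although rounding each share independently would give 18% for the third contributor and a total of 101%. The following is a minimal sketch, assuming largest-remainder rounding; the slide does not say how the figures were actually rounded. Amounts are stored as integer tenths of a million krónur, and the dictionary keys are shortened labels chosen for this example.

```python
# Contributions in tenths of a million krónur (integers avoid float rounding issues).
# Keys are shortened labels for this example, not official names.
contributions = {
    "Blindrafélagið": 250,
    "Lions": 193,
    "Foundation for disability related projects": 150,
    "Ministries": 113,
    "Disability organization": 100,
    "Blindravinafélagið": 50,
}


def largest_remainder_percentages(amounts):
    """Return whole-number percentage shares that sum to exactly 100."""
    total = sum(amounts.values())
    shares = {}
    remainders = {}
    for name, amount in amounts.items():
        # Integer part of the percentage, plus what is left over.
        shares[name], remainders[name] = divmod(amount * 100, total)
    # Hand the missing percentage points to the largest remainders.
    leftover = 100 - sum(shares.values())
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for name in ranked[:leftover]:
        shares[name] += 1
    return shares


shares = largest_remainder_percentages(contributions)
total_mkr = sum(contributions.values()) / 10

for name, pct in shares.items():
    print(f"{name}: {contributions[name] / 10:.1f} m.kr. ({pct}%)")
print(f"Total: {total_mkr:.1f} m.kr.")
```

Under this assumption the computed shares match the slide, and the total comes to 85.6 million krónur.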

  22. Valuable contributors • Among the valuable advisers, contributors and co-workers were: • Eiríkur Rögnvaldsson, professor of Icelandic at the University of Iceland, and his team. • Sigrún Helgadóttir at Árnastofnun. • The people behind the Icelandic corpus at the University of Leipzig. • Mrs Vigdís Finnbogadóttir, former president of Iceland, who acted as the project’s patron.

  23. Sustainable business model • The Icelandic Ivona voices, along with Ireader, are given free of charge to all Icelanders who are visually impaired or have a reading impairment. Others can buy the Ireader and the voices for around 50 euros. • BIOVI handles all sales of the Icelandic Ivona voices and of tools such as the text reader, recording studio and webreader. Customers are individuals, schools, institutions and businesses. Additional voices in other languages can easily be bought from Ivona and added to one’s voice portfolio. • Profits from sales of the Icelandic voices are meant to finance further development and any additions that might be needed.

  24. Linguistic challenges • Dialects: South or north pronunciation? • Emphasis in pronunciation: Compound words are difficult to deal with, as the rules for stress placement in Icelandic compounds are unclear. • Numbers: Difficult because of the many declension forms. • Abbreviations: Read them or interpret them? • Foreign words: Solved with an additional dictionary.
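
Choosing to interpret abbreviations rather than read them out can be pictured as a lookup step that runs before synthesis. The following is a minimal sketch, assuming a hand-written table of two common Icelandic abbreviations. The function name and the table are hypothetical and do not reflect how the Ivona voices handle text internally.

```python
import re

# Hypothetical expansion table with two common Icelandic abbreviations.
ABBREVIATIONS = {
    "t.d.": "til dæmis",            # "for example"
    "o.s.frv.": "og svo framvegis",  # "and so on"
}


def expand_abbreviations(text, table=ABBREVIATIONS):
    """Replace known abbreviations with their spoken form."""
    # Try longer abbreviations first so a short one never wins by accident.
    keys = sorted(table, key=len, reverse=True)
    # The lookbehind keeps a match from starting in the middle of a word.
    pattern = re.compile(r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + ")")
    return pattern.sub(lambda match: table[match.group(0)], text)


print(expand_abbreviations("Ávextir, t.d. epli"))
```

A sentence-final abbreviation loses its closing period in this sketch, since the period is part of the abbreviation itself; a real normalizer would have to decide whether that period also ends the sentence.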

  25. Main tools • SAPI 5 voices for Windows, plus a reader and mini reader. • Webreader that reads from the cloud. • Android voices for smart phones and tablets. • Recording studio. • Ivona SDK (software development kit) and voices for telephone answering, AMD and other computing tools.

  26. Ivona, an Amazon company • On 24 January 2013 Amazon announced that it had acquired the leading text-to-speech technology company IVONA. • This acquisition strengthens and protects Ivona's position in a market where there are some much bigger players than Ivona. • Amazon acquiring Ivona is in a way confirmation that others have seen the same potential in Ivona's TTS products as we did.

  27. Listening examples and tools presentation • Snorri • Ragga • IReader • Karl • Dóra

  28. Thank you
