W3C Workshop on SSML, Nov 2-3, 2005, Beijing
Issues of SSML in Japanese
Wataru IMATAKE (ANIMO LIMITED), Makoto AKABANE (Sony Computer Entertainment Inc.), Kazuyo TANAKA (Tsukuba University)
JEITA Technical Standardization Group on Speech Input/Output Systems
JEITA Speech Group
1-1 About JEITA
• JEITA (Japan Electronics and Information Technology Industries Association) is an industry association covering information systems, personal information devices, digital appliances, industrial and social system devices, and electronic parts.
• JEITA was established in November 2000 through the merger of the Japan Electronic Industry Development Association (JEIDA) and the Electronic Industries Association of Japan (EIAJ).
1-2 JEITA Speech Group, Activities
• The Expert Committee on Speech Input/Output Systems (JEIDA Speech Group) established "JEIDA-62-2000 Standard of Symbols for Japanese Text-to-Speech Synthesizer" as a JEIDA standard in March 2000.
• A revised version of JEIDA-62-2000 was published in March 2005 as "JEITA-IT-4002".
• JEIDA-62-2000 included control tags for synthesizers, defined in XML.
• However, the control tags were removed in "JEITA-IT-4002".
2-1 How to specify Japanese pronunciation in phoneme element
"JEITA IT-4002: Symbols for Japanese Text-to-Speech Synthesizer"
• Two levels of notation: kana-level notation in Japanese katakana, and phonemic-level notation in IPA or SAMPA.
We suggest identifying these notations with "x-JEITA-IT-4002-kana", "x-JEITA-IT-4002-ipa", and "x-JEITA-IT-4002-sampa" as values of the alphabet attribute.
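As an illustration of the suggestion above, the following Python sketch builds a phoneme element that carries one of the proposed alphabet values. The helper name `phoneme` and the `JEITA_ALPHABETS` table are hypothetical, and the ph value is plain katakana used as a placeholder: the actual JEITA IT-4002 kana notation may use additional symbols that are not shown on the slide.

```python
import xml.etree.ElementTree as ET

# Alphabet values proposed on the slide. The "x-" prefix marks them as
# vendor-defined alphabets in SSML 1.0.
JEITA_ALPHABETS = {
    "kana": "x-JEITA-IT-4002-kana",
    "ipa": "x-JEITA-IT-4002-ipa",
    "sampa": "x-JEITA-IT-4002-sampa",
}


def phoneme(text, ph, notation):
    """Return an SSML <phoneme> element as a string (hypothetical helper)."""
    attributes = {"alphabet": JEITA_ALPHABETS[notation], "ph": ph}
    element = ET.Element("phoneme", attributes)
    element.text = text
    return ET.tostring(element, encoding="unicode")


# Placeholder pronunciation string: plain katakana, not verified IT-4002 notation.
markup = phoneme("写真", "シャシン", "kana")
print(markup)
```

Keeping the three values in one table means an unknown notation name fails with a KeyError instead of silently producing an unregistered alphabet value.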
2-2 How to specify Japanese pronunciation in phoneme element
3-1 How to specify speaking rate in Japanese
• The basic unit of Japanese rhythm is the mora.
• A mora is called "拍" (haku) in Japanese. For example, a haiku is composed in a 5-7-5 haku pattern.
  "こんにちわ" /ko N ni chi wa/ → 5 moras
  "しゃしん" /sya si n/ → 3 moras in Japanese; /sya sin/ → 2 syllables when counted as in English
• Therefore, it is natural to specify the speaking rate and Japanese phoneme length as a number of moras.
• To specify the speaking rate in the rate attribute of the prosody element, use mora/sec as the unit.
• By the same token, to specify the pause time in the time attribute of the break element, use mora as the unit.
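The mora counts above can be reproduced with a minimal Python sketch. It assumes the input contains only hiragana or katakana, and it treats small kana such as ゃ as part of the preceding mora. The function names and the rate of 5 mora/sec are illustrative assumptions, not values from the slides.

```python
# Small kana that combine with the preceding kana into a single mora
# (for example しゃ or キュ). The small っ/ッ is deliberately absent:
# it counts as a mora of its own, as do ん and the long-vowel mark ー.
SMALL_KANA = set("ぁぃぅぇぉゃゅょゎァィゥェォャュョヮ")


def count_moras(kana):
    """Count moras in a string assumed to contain only kana."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)


def duration_seconds(kana, moras_per_second):
    """Estimate speaking time for a kana string at a rate given in mora/sec."""
    return count_moras(kana) / moras_per_second


print(count_moras("こんにちわ"))            # 5
print(count_moras("しゃしん"))              # 3
print(duration_seconds("こんにちわ", 5.0))  # 1.0
```

The same division converts a pause given in moras into seconds for a given speaking rate.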
3-2 How to specify speaking rate in Japanese
4-1 ruby element
• Many Japanese kanji words share the same notation but have different readings and meanings.
• Newspaper and magazine publishers have long used ruby to make kanji words easier for readers to understand.
• In addition, the word processors widely used in Japan (e.g. Microsoft Word, Justsystem ICHITARO, OpenOffice writer) provide a function for adding ruby, and it is in common use.
• Therefore, a lot of text content in Japan includes ruby.
• Japanese voice synthesis engines can reduce misreadings by actively using ruby.
• Ruby is usually written in Japanese katakana or hiragana. Therefore, ruby does not fit the phoneme element.
4-2 ruby element
• We are aware of "Ruby Annotation - W3C Recommendation 31 May 2001" (http://www.w3.org/TR/ruby/), but it specifies more than voice synthesis needs: layout information is unnecessary for voice synthesis.
• The simplest expression of ruby is sufficient for voice synthesis.
• Therefore, we propose that a new ruby element be defined.
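To show how little of a ruby annotation a synthesizer needs, the Python sketch below reduces markup to (base text, reading) pairs and ignores everything else. It does not show the syntax of the proposed element. It assumes input in the simple ruby form of the W3C Recommendation (ruby, rb, rt, rp) without XML namespaces; the function name `ruby_pairs` and the sample sentence are illustrative.

```python
import xml.etree.ElementTree as ET


def ruby_pairs(fragment):
    """Return (base text, reading) pairs from simple ruby markup.

    Only rb and rt are read; rp fallback parentheses and any layout
    information are ignored.
    """
    root = ET.fromstring(fragment)
    pairs = []
    for ruby in root.iter("ruby"):
        base = ruby.findtext("rb", default="")
        reading = ruby.findtext("rt", default="")
        pairs.append((base, reading))
    return pairs


# 今日 can be read きょう or こんにち; the ruby text settles the reading.
sample = (
    "<p><ruby><rb>今日</rb><rp>(</rp><rt>きょう</rt><rp>)</rp></ruby>"
    "は晴れです。</p>"
)
pairs = ruby_pairs(sample)
print(pairs)
```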
4-3 ruby element
5-1 Expansion of the say-as element
• In Japanese, a single notation with a single meaning can have different readings, all of them correct.
• For example, 「二十日」 can be read as 「ニジュウニチ」 (ni-jyu-ni-chi) or 「ハツカ」 (ha-tsu-ka). Both mean the 20th of the month.
• In this case, SSML should let the content creator choose whether a voice synthesis engine reads "10/20" as "ジューガツハツカ" (jyu-gatsu-ha-tsu-ka) or "ジューガツニジューニチ" (jyu-gatsu-ni-jyu-ni-chi).
• Therefore, we propose an attribute for the say-as element that selects the Japanese reading of a date.
• We are still examining this issue.
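The choice described above can be sketched in Python as a lookup keyed by a reading style. The style names "native" and "sino", the function `read_date`, and the two tables are hypothetical, since the slides do not name the proposed attribute or its values. The tables contain only the entries needed for the "10/20" example, spelled as in the text above.

```python
# Katakana readings, limited to the "10/20" example.
MONTH_READINGS = {10: "ジューガツ"}
DAY_READINGS = {
    20: {
        "native": "ハツカ",      # traditional Japanese reading
        "sino": "ニジューニチ",  # number-based reading
    },
}


def read_date(date, day_style):
    """Return a katakana reading for a "month/day" string.

    day_style is a hypothetical selector ("native" or "sino") standing in
    for the attribute proposed for say-as. Dates missing from the tables
    raise KeyError.
    """
    month_text, day_text = date.split("/")
    month, day = int(month_text), int(day_text)
    return MONTH_READINGS[month] + DAY_READINGS[day][day_style]


print(read_date("10/20", "native"))  # ジューガツハツカ
print(read_date("10/20", "sino"))    # ジューガツニジューニチ
```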
5-2 Expansion of the say-as element