Multilingual Speech Synthesis Techniques and Applications
Slides from Panasonic Beijing Laboratory showing SSML markup examples for multilingual speech synthesis: character pronunciation, word/phrase boundaries, dialect, sound effects, speaking styles, macros, and a translation extension to say-as.

Outline
• Character Pronunciation*
• Word/phrase Boundaries*
• Dialect
• Sound Effect
• Speaking Style
• Macro (Variable, Alias)
• Say-as Extension: Translation
Panasonic Beijing Laboratory

Character Pronunciation*
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Polyphone character "将". -->
  <p>你定购的衣服<phoneme alphabet="?" ph="jiang1" pos="adv">将</phoneme>按照地址送到您的家中。</p>
</speak>
English gloss: "The clothes you ordered will be delivered to your home at the address given."
Readings of 将 by part of speech: jiang1: p, v, adv. jiang4: n.
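The slide implies that the reading of a polyphone can be chosen from its part of speech. A minimal Python sketch of that lookup follows. The table contents come from the slide; the names `POLYPHONE_READINGS` and `pick_reading` are hypothetical and are not part of SSML or of any Panasonic API.

```python
# Hypothetical lookup table: character -> {part-of-speech tag -> pinyin reading}.
# The entries for "将" are the ones listed on the slide.
POLYPHONE_READINGS = {
    "将": {"p": "jiang1", "v": "jiang1", "adv": "jiang1", "n": "jiang4"},
}


def pick_reading(char, pos):
    """Return the pinyin reading for a polyphone given its part-of-speech tag.

    Returns None when the character or the tag is not in the table, so the
    caller can fall back to the synthesizer's default pronunciation.
    """
    return POLYPHONE_READINGS.get(char, {}).get(pos.strip())


print(pick_reading("将", "adv"))
print(pick_reading("将", "n"))
```

Stripping the tag before the lookup tolerates stray whitespace in attribute values such as `pos=" adv "`.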

Word/phrase Boundaries*
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Major phrase label: <L3/> -->
  <p><prosody>你定购的午餐<L3/>即将送达。</prosody></p>
</speak>
English gloss: "The lunch you ordered will arrive shortly."
Boundary labels:
• L0: syllable boundary
• L1: prosodic word boundary
• L2: minor phrase boundary
• L3: major phrase boundary
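To make the boundary labels concrete, the sketch below splits a marked-up fragment into text chunks and boundary markers using Python's standard `xml.etree.ElementTree`. It assumes the fragment is parsed without the SSML namespace so that tag names stay plain. `BOUNDARY_LEVELS` and `split_on_boundaries` are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Boundary labels as listed on the slide.
BOUNDARY_LEVELS = {
    "L0": "syllable boundary",
    "L1": "prosodic word boundary",
    "L2": "minor phrase boundary",
    "L3": "major phrase boundary",
}


def split_on_boundaries(fragment):
    """Split one element's content into ("text", ...) and ("boundary", ...) items.

    Only direct children of the root element are inspected; nested markup
    is outside the scope of this sketch.
    """
    root = ET.fromstring(fragment)
    items = []
    if root.text:
        items.append(("text", root.text))
    for child in root:
        if child.tag in BOUNDARY_LEVELS:
            items.append(("boundary", child.tag))
        if child.tail:
            items.append(("text", child.tail))
    return items


for kind, value in split_on_boundaries("<prosody>你定购的午餐<L3/>即将送达。</prosody>"):
    print(kind, value)
```

A synthesizer front end could map each boundary level to a pause length or pitch reset; that mapping is not specified on the slide and is left out here.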

Dialect
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="en-US">
  <!-- Use Szechwanese to synthesize the following sentence. -->
  <p xml:lang="zh-cn" ssml:lang2="cn-sc">欢迎来成都游览。</p>
</speak>
English gloss: "Welcome to Chengdu for sightseeing."
Dialect codes: sc: Si-Chuan (Szechwan); sx: Shan-Xi…

Sound Effect
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Use "some-filter" to render the synthesized sound. -->
  <p><prosody post-filter="some-filter">你定购的晚餐即将送达。</prosody></p>
</speak>
English gloss: "The dinner you ordered will arrive shortly."

Speaking Style
<?xml version="1.0"?>
<!--<template name="#1"><prosody rate="-10%" volume="soft"/></template>-->
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Speaking style template -->
  <p ssml:prosody-template="#1">你定购的午餐即将送达。</p>
</speak>
English gloss: "The lunch you ordered will arrive shortly."

Macro (Variable, Alias)
<?xml version="1.0"?>
<!--<macro name="date">2005/10/20</macro><macro name="article">CD随身听、剃须刀</macro>-->
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Macro -->
  <p>你定购的物品<macro>article</macro>将会在<macro>date</macro>送达。</p>
</speak>
English gloss, after expansion: "The items you ordered (portable CD player, shaver) will be delivered on 2005/10/20."
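The macro slide amounts to textual substitution before synthesis. The Python sketch below shows one way such an expansion step might work, assuming macro definitions have already been collected into a dictionary. `MACRO_REF` and `expand_macros` are hypothetical names, and leaving unknown macros untouched is a design choice of this sketch rather than behavior stated on the slide.

```python
import re

# Matches a macro reference such as <macro>date</macro>.
MACRO_REF = re.compile(r"<macro>\s*([^<\s]+)\s*</macro>")


def expand_macros(text, macros):
    """Replace each <macro>name</macro> reference with its defined value.

    References to undefined macros are left unchanged so that a later
    processing stage can report them.
    """
    def substitute(match):
        return macros.get(match.group(1), match.group(0))

    return MACRO_REF.sub(substitute, text)


# Definitions taken from the slide.
macros = {"date": "2005/10/20", "article": "CD随身听、剃须刀"}
print(expand_macros("你定购的物品<macro>article</macro>将会在<macro>date</macro>送达。", macros))
```

Expanding macros before any other text normalization lets the substituted values (here a date and an item list) pass through the normal say-as and pronunciation handling.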

Say-as Extension: Translation
<?xml version="1.0"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
       xml:lang="zh-cn">
  <!-- Translation say-as element. IBM -> 国际商业机器公司 -->
  <p><say-as interpret-as="translation">IBM</say-as>应用语音技术来促进儿童教育。</p>
</speak>
English gloss: "IBM applies speech technology to promote children's education."
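One possible reading of the translation extension is a lexicon lookup applied to the content of the say-as element. The Python sketch below illustrates that reading only. The slide does not say how the translation is obtained, and `SAY_AS_TRANSLATION`, `apply_translations`, and the lexicon are hypothetical.

```python
import re

# Matches <say-as interpret-as="translation">...</say-as>, tolerating
# optional whitespace around the "=" sign.
SAY_AS_TRANSLATION = re.compile(
    r'<say-as\s+interpret-as\s*=\s*"translation"\s*>(.*?)</say-as>'
)


def apply_translations(text, lexicon):
    """Replace each translation say-as element with its lexicon entry.

    Terms missing from the lexicon are emitted as their original text.
    """
    def substitute(match):
        term = match.group(1).strip()
        return lexicon.get(term, term)

    return SAY_AS_TRANSLATION.sub(substitute, text)


# Single-entry lexicon using the mapping given in the slide's comment.
lexicon = {"IBM": "国际商业机器公司"}
print(apply_translations(
    '<say-as interpret-as="translation">IBM</say-as>应用语音技术来促进儿童教育。',
    lexicon,
))
```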