Subproject III - Spoken Language Systems

Subproject III - Spoken Language Systems. Members: Lin-shan Lee (PI), Lee-Feng Chien (Co-PI), Hsin-min Wang (Co-PI), Berlin Chen (Co-PI). Other Participants: Sin-Horng Chen, Yih-Ru Wang

Presentation Transcript


  1. Subproject III - Spoken Language Systems
     Members: Lin-shan Lee (PI), Lee-Feng Chien (Co-PI), Hsin-min Wang (Co-PI), Berlin Chen (Co-PI)
     Other Participants: Sin-Horng Chen, Yih-Ru Wang, Yuan-Fu Liao, Jen-Tzung Chien

  2. Outline
     • Members
     • Research Theme
     • Current Achievements with Demos
     • Future Directions

  3. Members

  4. Research Theme
     • Information Extraction and Retrieval (IE & IR)
     • Spoken Dialogues
     • Spoken Document Understanding and Organization
     [Diagram labels: Users, Networks, Multimedia Network Content]

  5. Research Roadmap
     Current Achievements
     • Information Extraction and Retrieval (IE & IR)
       • Term Extraction/Organization
       • Term Translation/Indexing
       • Retrieval Modeling
     • Spoken Document Understanding and Organization
       • Title/Summary Generation
       • Topic Analysis/Organization
     • Spoken Dialogues
     • …..
     • Distributed Speech Recognition
     Future Directions
     • Information Navigation across Multimedia/Spoken Documents
     • Cross-language Information Processing
     • Knowledge Discovery and Web Mining
     [Other diagram labels: Speech & Language Understanding, Spoken Language Applications]

  6. Information Extraction & Retrieval (IE & IR)
     • Named Entity Extraction from Text/Spoken Documents
     • Taxonomy Generation
     • Term Translation
     • Retrieval Modeling for Text/Spoken Documents

  7. Named Entity Extraction from Text/Spoken Documents
     • Global Information for the Entire Document Extracted from Forward/Backward PAT-Trees
       • Some named entities cannot easily be identified from a single sentence, but can be extracted when the information in several sentences is considered jointly
     • Named Entity Matching using Retrieved Text Documents to Identify Some Out-of-Vocabulary (OOV) Words

  8. Automatic Taxonomy Generation (1/2)
     • Problem
       • Find relationships and associations between terms, and organize them into a hierarchical structure (i.e., a taxonomy)
       • Useful for identifying and analyzing the concepts embedded in documents and queries
     • Method
       • A proposed approach for clustering terms into comprehensive hierarchical clusters
       • Web mining techniques: relationships between terms are generated automatically from the relationships between the documents retrieved from the Web with those terms
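The clustering idea on this slide can be illustrated with a minimal sketch. It assumes each term is represented by the bag of words in the snippets retrieved for it, and it swaps in plain average-link agglomerative clustering with cosine similarity; the slide does not specify the clustering algorithm or the features actually used. The names `cosine`, `build_taxonomy`, and `term_snippets` are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_taxonomy(term_snippets):
    """Average-link agglomerative clustering of terms.

    term_snippets maps each term to the text snippets retrieved for it
    (hypothetical input; a real system would query a search engine).
    Returns a nested tuple that represents a binary cluster tree.
    """
    vectors = {t: Counter(" ".join(snips).lower().split())
               for t, snips in term_snippets.items()}
    # Each cluster is a pair (subtree, member_terms).
    clusters = [(t, [t]) for t in sorted(vectors)]

    def avg_link(c1, c2):
        sims = [cosine(vectors[a], vectors[b]) for a in c1[1] for b in c2[1]]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Merge the two most similar clusters.
        i, j = max(pairs, key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]
```

Terms whose retrieved snippets share vocabulary end up in the same subtree, which is the sense in which document relationships induce term relationships.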

  9. Automatic Taxonomy Generation (2/2)
     • A Typical Example for Term Taxonomy

  10. Automatic Term Translation (1/2)
      • Problem
        • Cross-language information retrieval systems usually rely on bilingual dictionaries; however, search terms are very often missing from them because they are proper nouns or OOV words
        • Discovering translations of unknown query terms in different languages
      • Method
        • Finding translations of query terms by mining huge quantities of data obtained from the Web
        • Correlation/association patterns are extracted from parallel bilingual pages retrieved from the Web, from the anchor texts of out-links pointing to multilingual pages, etc.
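As a rough illustration of the anchor-text idea, the sketch below assumes that anchor texts pointing to the same URL have been grouped into sets, and ranks candidate translations by how often they share a set with the source term. The Dice coefficient is a swapped-in association measure chosen for brevity; the slide does not say which measure the system uses. `rank_translations` and `anchor_sets` are hypothetical names.

```python
def rank_translations(anchor_sets, source_term, candidates):
    """Rank candidate translations by Dice association with source_term.

    anchor_sets: list of sets; each set holds the anchor texts (in any
    language) of links that point to one URL (hypothetical input).
    """
    def count(term):
        return sum(1 for s in anchor_sets if term in s)

    def co_count(a, b):
        return sum(1 for s in anchor_sets if a in s and b in s)

    n_source = count(source_term)
    scores = {}
    for cand in candidates:
        denom = n_source + count(cand)
        # Dice = 2 * |A and B| / (|A| + |B|); 0 if neither term occurs.
        scores[cand] = 2 * co_count(source_term, cand) / denom if denom else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A candidate that appears mostly alongside the source term scores higher than a frequent but unrelated word, which is why an association measure is preferred over raw co-occurrence counts.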

  11. Automatic Term Translation (2/2)
      • The Live Query Term Translation System (LiveTrans): http://wkd.iis.sinica.edu.tw/LiveTrans/lt.html
      [Screenshot: machine-extracted translations]

  12. Retrieval Modeling for Text/Spoken Documents (1/2)
      • Problem
        • Conventional retrieval models cannot be trained or improved through use
        • Word usage mismatch between the query and the documents
      • Method
        • Literal term matching: HMM/N-gram model trained with ML or MCE criteria
        • Concept matching: Topical mixture model (TMM), extended from PLSA, trained in either a supervised or an unsupervised manner

  13. Retrieval Modeling for Text/Spoken Documents (2/2)
      • HMM/N-gram retrieval model
        • A document is viewed as a probabilistic generative model for the query
        • Literal term matching
      • Topical Mixture Model (extended from PLSA)
        • A document is composed of a set of K latent topical distributions (unigrams) for predicting the query
        • Concept matching
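Both models score a document by the likelihood it assigns to the query. The sketch below shows one common unigram form of each, under stated assumptions: the interpolation weight `lam` is fixed by hand rather than trained with ML or MCE, the topic parameters for the mixture model are supplied by the caller rather than learned, and unseen words receive a small floor probability. All function and parameter names are hypothetical.

```python
from collections import Counter
from math import log

FLOOR = 1e-9  # assumed floor probability for words never seen in training


def unigram(tokens):
    """Maximum-likelihood unigram distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def hmm_score(query, doc_model, coll_model, lam=0.7):
    """Literal term matching.

    log P(Q|D) = sum over query words of
                 log( lam * P(w|D) + (1 - lam) * P(w|Collection) )
    """
    return sum(
        log(lam * doc_model.get(w, 0.0) + (1 - lam) * coll_model.get(w, FLOOR))
        for w in query
    )


def tmm_score(query, topic_word, doc_topic):
    """Concept matching.

    log P(Q|D) = sum over query words of log( sum_k P(w|T_k) * P(T_k|D) )
    topic_word[k] maps words to P(w|T_k); doc_topic[k] is P(T_k|D).
    """
    return sum(
        log(sum(p_t * topic_word[k].get(w, FLOOR)
                for k, p_t in enumerate(doc_topic)))
        for w in query
    )
```

With the mixture form, a document can score well on a query word it never contains, as long as one of its dominant topics gives that word high probability; this is how the word usage mismatch named on the previous slide is addressed.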

  14. Spoken Document Understanding & Organization (1/2)
      • Problem
        • The content of multimedia documents is very often described by the associated speech information
        • Unlike text documents, whose paragraphs and titles are easy to look through at a glance, multimedia/spoken documents are unstructured and difficult to retrieve/browse

  15. Spoken Document Understanding & Organization (2/2)
      • Spoken Document Transcription
      • Multimedia/Spoken Document Segmentation
      • Summarization for Multimedia/Spoken Documents
      • Title Generation for Multimedia/Spoken Documents
      • Topic Analysis and Organization for Multimedia/Spoken Documents

  16. Spoken Document Segmentation (Broadcast News)
      • Dividing a one-hour News Episode into News Stories
      • An improved audio segmentation technique integrating BIC and Divide-and-Conquer Approaches
      • Viterbi search over the Hidden Markov Model of text clusters
      [Figure label: … distance computation]
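The slide does not describe how BIC and divide-and-conquer are combined, so the following is only one plausible reading: find the most likely change point in a segment with the standard delta-BIC criterion, and if the criterion is positive, split there and recurse on both halves. The sketch assumes one-dimensional features modeled by a single Gaussian per segment; real audio segmentation would use multi-dimensional features such as MFCCs. The names `delta_bic`, `segment`, `min_len`, and `lam` are hypothetical.

```python
from math import log


def variance(xs):
    """Maximum-likelihood variance, floored to keep log() finite."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-6)


def delta_bic(xs, i, lam=1.0):
    """Delta-BIC for splitting xs at index i (1-D Gaussian models).

    Positive values favor two separate Gaussians over one.
    For dimension d = 1 the penalty 0.5 * (d + d(d+1)/2) * log N is log N.
    """
    n = len(xs)
    gain = (n / 2 * log(variance(xs))
            - i / 2 * log(variance(xs[:i]))
            - (n - i) / 2 * log(variance(xs[i:])))
    return gain - lam * log(n)


def segment(xs, start=0, min_len=5, lam=1.0):
    """Recursively split xs at change points; returns absolute indices."""
    n = len(xs)
    if n < 2 * min_len:
        return []
    best = max(range(min_len, n - min_len + 1),
               key=lambda i: delta_bic(xs, i, lam))
    if delta_bic(xs, best, lam) <= 0:
        return []  # no split is justified by the criterion
    left = segment(xs[:best], start, min_len, lam)
    right = segment(xs[best:], start + best, min_len, lam)
    return left + [start + best] + right
```

Raising `lam` makes the penalty larger and so yields fewer, more confident boundaries.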

  17. Title Generation for Spoken Documents (Broadcast News)
      • Training Phase
        • Developing statistical relationships between words in the training documents and their human-generated titles
      • Generation Phase (for New Spoken Documents)
        • Transcribing into term sequences
        • Identifying suitable terms, and using them to generate a readable title
      [Diagram labels]
      • Training Documents D={dj, j=1,2,…,N} (text form)
      • Human-generated Titles of Training Documents T={tj, j=1,2,…,N} (text form)
      • New Spoken Documents D={di, i=1,2,…,N} (speech form)
      • Computer-generated Titles of Spoken Documents T={ti, i=1,2,…,M} (text/speech form)

  18. Topic Analysis and Organization for Spoken Documents (Broadcast News)
      • Based on Probabilistic Latent Semantic Analysis (PLSA)
        • Terms (words, syllable pairs, etc.) and documents are analyzed by probabilities over a set of latent topics
        • Trained by the EM algorithm
        • Related documents don't have to share common sets of terms, and related terms don't have to co-exist in the same set of documents
      • Spoken Documents Clustered by the Latent Topics and Organized in a Two-dimensional Tree Structure, or a Two-layer Map
      [Figure: Two-dimensional Tree Structure for Organized Topics]
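For readers unfamiliar with PLSA, the EM training mentioned above can be sketched in a few lines. This is a textbook version on a small dense count matrix, with random initialization and no tempering; it is not the system's actual training code. The names `plsa` and `log_likelihood` and the tiny smoothing constant are assumptions of this sketch.

```python
import random
from math import log


def _normalize(row):
    total = sum(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d][w] is the count of term w in document d.

    Returns (p_w_z, p_z_d): p_w_z[z][w] = P(w|z), p_z_d[d][z] = P(z|d).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_w_z = [_normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [_normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    for _ in range(n_iter):
        # Accumulators start at a tiny constant to avoid exact zeros.
        new_wz = [[1e-12] * n_words for _ in range(n_topics)]
        new_zd = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                post = _normalize([p_w_z[z][w] * p_z_d[d][z]
                                   for z in range(n_topics)])
                # M-step: accumulate expected counts.
                for z in range(n_topics):
                    new_wz[z][w] += counts[d][w] * post[z]
                    new_zd[d][z] += counts[d][w] * post[z]
        p_w_z = [_normalize(row) for row in new_wz]
        p_z_d = [_normalize(row) for row in new_zd]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Sum over d, w of n(d,w) * log( sum_z P(w|z) * P(z|d) )."""
    return sum(
        n * log(sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z))))
        for d, row in enumerate(counts)
        for w, n in enumerate(row) if n
    )
```

Documents can then be clustered by their dominant topic in `p_z_d`. EM only finds a local optimum, so results depend on the initialization.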

  19. Spoken Dialogues
      • Analysis and Design Using Quantitative Simulations

  20. Analysis and Design Based on Quantitative Simulations
      • Problem
        • Dialogue performance cannot be predicted before the system is online
        • The effects of different factors, such as the system's dialogue strategies and the speech recognition and understanding conditions, cannot be quantitatively identified and analyzed
      • Method
        • Computer-aided analysis and design approaches based on quantitative simulations
      [Plots: transaction success rate, slot loss rate, misunderstanding rate]
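To make the simulation idea concrete, here is a toy Monte Carlo sketch of a slot-filling dialogue. The slides name the three metrics but do not define them, so the definitions below are assumptions of this sketch: a slot is "lost" if it is never understood within the allowed attempts, "misunderstood" if it is filled with a wrong value, and a transaction succeeds only if every slot is filled correctly. The retry count stands in for a dialogue strategy; all names and parameters are hypothetical.

```python
import random


def simulate(n_dialogues, n_slots, p_loss, p_mis, max_retries=0, seed=0):
    """Simulate slot-filling dialogues and return the three rates.

    p_loss: per-attempt probability that a slot is not understood.
    p_mis:  per-attempt probability that a slot is filled incorrectly.
    max_retries: how many times the system re-prompts for a lost slot.
    """
    rng = random.Random(seed)
    lost = wrong = successes = 0
    for _ in range(n_dialogues):
        dialogue_ok = True
        for _ in range(n_slots):
            outcome = "lost"
            for _ in range(max_retries + 1):
                r = rng.random()
                if r < p_loss:
                    continue  # not understood; re-prompt if attempts remain
                outcome = "wrong" if r < p_loss + p_mis else "correct"
                break
            if outcome == "lost":
                lost += 1
            elif outcome == "wrong":
                wrong += 1
            if outcome != "correct":
                dialogue_ok = False
        successes += dialogue_ok
    total_slots = n_dialogues * n_slots
    return {
        "transaction_success_rate": successes / n_dialogues,
        "slot_loss_rate": lost / total_slots,
        "misunderstanding_rate": wrong / total_slots,
    }
```

Sweeping `p_loss`, `p_mis`, and `max_retries` shows how recognition conditions and strategy trade off against each other before any system is deployed, which is the kind of question the slide raises.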

  21. Demo: Understanding and Organization of Chinese Broadcast News with Interactive Interface

  24. Future Directions
      • Information Navigation across Multimedia/Spoken Documents
        • The fast-growing quantities of multimedia/spoken documents are much more difficult to browse than text documents
        • Better approaches to navigate across huge quantities of multimedia/spoken documents using comprehensive presentation (e.g., topic taxonomy)
      • Cross-language Information Processing Technologies
        • Reducing language barriers in a future multilingual world
        • Seeking international collaboration and resource exchange
        • Collaboration between the two major non-English languages may be a good direction
      • Knowledge Discovery and Web Mining
        • The Web offers live, dynamic, and by far the most complete global knowledge that human beings have
        • Better approaches to explore Web resources and enhance language processing technologies
