
Knowledge Representation for Natural Language Understanding Chengqing ZONG



Presentation Transcript


  1. Knowledge Representation for Natural Language Understanding Chengqing ZONG Institute of Automation, Chinese Academy of Sciences cqzong@nlpr.ia.ac.cn

  2. Outline • CASIA and NLPR • Introduction • Some Linguistic Knowledge Bases • Approaches to NLU • Proposal

  3. CASIA • Institute of Automation (IA), Chinese Academy of Sciences (CAS) • Founded in 1956

  4. Personnel • Faculty members: 320, including 38 full-time professors • Post-doctoral research fellows: 30 • Students (Ph.D. and MSc): 600 • Visiting researchers: 40+

  5. NLPR National Laboratory of Pattern Recognition • Staff: 29 • Ph.D. candidates: 140 • MSc: 120 • Post-Doc.: 7

  6. NLPR • Directors • Academic Committee • Management Committee • General Office • Pattern Recognition and its Cognitive Mechanisms Group • Biometric Information Processing Group • Visual Information Processing Group • Speech and Language Technology Group

  7. 1. Introduction • Natural language understanding is a typical knowledge-processing task. [Diagram: Text or speech → Processor (with K. B.) → Text or speech]

  8. 1. Introduction • Different tasks and different approaches require different knowledge representations. E.g., document summarization and information extraction require knowledge for discourse analysis and topic understanding. [Figure labels: Title, Time]

  9. 1. Introduction For machine translation (MT), knowledge for sentence analysis and translation is necessary, e.g., "I saw a man with a telescope." • Rule-based MT: I saw [a man with a telescope]. / I [saw a man] with a telescope. Rules: NP → Det NN; NP → NP PP; NP → …… Translations: 我用望远镜看见一个男孩。 ("I saw a boy through a telescope.") / 我看见一个带望远镜的男孩。 ("I saw a boy who had a telescope.") • Statistical MT:
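
The attachment ambiguity on this slide can be made concrete with a small chart parser. The following is a minimal Python sketch, not the author's system: it takes the two NP rules shown on the slide and adds assumed rules for S, VP, and PP plus a toy lexicon (all hypothetical), then enumerates every parse of the example sentence.

```python
from collections import defaultdict

# Binary grammar rules as (parent, left child, right child).
# "NP -> Det NN" and "NP -> NP PP" come from the slide; the rest are assumptions.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "NN"),
    ("NP", "NP", "PP"),
    ("VP", "VV", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]

# Toy lexicon (hypothetical): word -> possible categories.
LEXICON = {
    "I": ["NP"],
    "saw": ["VV"],
    "a": ["Det"],
    "man": ["NN"],
    "with": ["P"],
    "telescope": ["NN"],
}


def parse(words):
    """CKY-style chart parser returning all bracketed trees that span the sentence."""
    n = len(words)
    chart = defaultdict(list)  # (start, end, label) -> list of bracketed trees
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[(i, i + 1, tag)].append(f"({tag} {word})")
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    for left_tree in chart.get((start, mid, left), []):
                        for right_tree in chart.get((mid, end, right), []):
                            chart[(start, end, parent)].append(
                                f"({parent} {left_tree} {right_tree})"
                            )
    return chart.get((0, n, "S"), [])


# Print every parse found for the slide's example sentence.
for tree in parse("I saw a man with a telescope".split()):
    print(tree)
```

Under these assumed rules the sentence receives two parses: one attaches the PP to the noun phrase ("a man with a telescope"), the other to the verb phrase ("saw ... with a telescope"). A rule-based MT system has to choose between them before it can choose a translation.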

  10. 1. Introduction Questions: • What is the state of the current linguistic K. B.s? • Is an algorithm designed according to the K. B., or is the representation designed for an algorithm?

  11. 2. Some Linguistic K. B. 2.1 WordNet (http://wordnet.princeton.edu) • Three basic preconditions: separability hypothesis, patterning hypothesis, comprehensiveness hypothesis • Takes the synset as the building block • Relationships: synonymy / antonymy / hypernymy / hyponymy / meronymy / entailment
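
To illustrate what "synset as the building block" can mean in practice, here is a minimal Python sketch of a synset store with synonymy and hypernymy lookups. The class, the function names, and the three toy entries are hypothetical and only mimic the idea; this is not WordNet's actual data or API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous lemmas plus links to more general synsets."""
    name: str
    lemmas: list
    hypernyms: list = field(default_factory=list)  # names of parent synsets


# Toy data for illustration only (not taken from WordNet).
SYNSETS = {
    "entity": Synset("entity", ["entity"]),
    "instrument": Synset("instrument", ["instrument"], ["entity"]),
    "telescope": Synset("telescope", ["telescope", "scope"], ["instrument"]),
}


def are_synonyms(a, b):
    """Two lemmas are synonyms if some synset contains both."""
    return any(a in s.lemmas and b in s.lemmas for s in SYNSETS.values())


def hypernym_chain(name):
    """Follow the first hypernym link upward until a root synset is reached."""
    chain = []
    current = SYNSETS[name]
    while current.hypernyms:
        parent = current.hypernyms[0]
        chain.append(parent)
        current = SYNSETS[parent]
    return chain


print(are_synonyms("telescope", "scope"))
print(hypernym_chain("telescope"))
```

The other relations listed on the slide (antonymy, meronymy, entailment) could be stored the same way, as additional named link lists on each synset.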

  12. 2. Some Linguistic K. B. 2.2 HowNet (http://www.keenage.com) • Knowledge, specifically, the form of knowledge that is computer-operable, is a system encompassing the varied relations amongst concepts as well as those amongst the attributes of concepts. As one acquires more concepts, or rather, captures more relations amongst concepts alongside the links between the attributes attached to the concepts, one simply becomes more knowledgeable; • On the creation of a knowledge base, a common-sense knowledge base constituting a knowledge system should first be constructed. This database shall describe general concepts and map out the relations among them.

  13. 2. Some Linguistic K. B. • Some concepts and relationships are defined.

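  14. 2. Some Linguistic K. B. 2.3 UPenn TreeBank http://www.cis.upenn.edu/~treebank/home.html [Figure: parse tree of 他还提出一系列具体措施和策略要点。 ("He also put forward a series of concrete measures and key points of strategy."); node labels: IP, NP-SBJ, VP, ADVP, NP-OBJ, QP, CLP, NP; tagged words: 他/PN 还/AD 提出/VV 一/CD 系列/M 具体/JJ 措施/NN 和/CC 策略/NN 要点/NN 。/PU]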
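
Treebank annotations of this kind are commonly stored as bracketed strings. The sketch below, in Python, reads such a string into a nested structure and lists the tagged words. The bracketing in FRAGMENT is a simplified, hypothetical fragment built from labels visible on the slide (the object noun phrase is omitted), not the treebank's actual annotation of the sentence.

```python
# Hypothetical, simplified bracketing using labels from the slide.
FRAGMENT = "(IP (NP-SBJ (PN 他)) (VP (ADVP (AD 还)) (VP (VV 提出))) (PU 。))"


def read_tree(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "each node must start with '('"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()


def tagged_leaves(tree):
    """Return (word, POS tag) pairs in sentence order."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        if not isinstance(child, str):
            pairs.extend(tagged_leaves(child))
    return pairs


print(tagged_leaves(read_tree(FRAGMENT)))
```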

  15. 2. Some Linguistic K. B. 2.4 FrameNet and Others • FrameNet(frame semantics) http://framenet.icsi.berkeley.edu • PropBank、NomBank http://nlp.cs.nyu.edu/meyers/NomBank.html

  16. 2. Some Linguistic K. B. Summary: • All the representations mentioned above are human-made and human-defined; • Different K. B.s are built at different levels and granularities, e.g., at the lexical level (tagged lexicons) or at the sentence level (annotated syntactic structure), and so on;

  17. 2. Some Linguistic K. B. • Generally, the K. B.s are developed for general purposes, and each K. B. expresses a single kind of linguistic knowledge; • However, are the representations sufficient, or even complete, for a natural language processing system?

  18. 3. Approaches to NLU Three methods: • Rationalistic • Empirical • Rationalistic + Empirical

  19. 3. Approaches to NLU Take MT as an example • Word-to-Word • Phrase-to-Phrase • Chunk-to-Chunk • Chunk-to-String • Tree-to-Tree (Learned, Syntactic or Semantic) • Tree-to-String • Logical-Form-to-Logical-Form p(t|s) vs. p(s|t)×p(t) [Figure: transfer levels between source language (SL) and target language (TL): Word, Phrase, Chunk, Syntactic-Tree, Semantic-Tree, Logical-Form, Inter-lingual]
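
The contrast "p(t|s) vs. p(s|t)×p(t)" refers to modeling the translation directly versus through Bayes' rule: since p(s) is constant for a given source sentence s, the target t that maximizes p(t|s) also maximizes p(s|t)×p(t). The Python sketch below shows the second formulation on the telescope sentence from the introduction. All probabilities are invented toy values, and the two-candidate "search" is an assumption made purely for illustration.

```python
import math

SOURCE = "I saw a man with a telescope"

# Hypothetical translation model p(s | t), keyed by (source, target).
TRANSLATION_MODEL = {
    (SOURCE, "我用望远镜看见一个男孩。"): 0.5,
    (SOURCE, "我看见一个带望远镜的男孩。"): 0.5,
}

# Hypothetical target language model p(t).
LANGUAGE_MODEL = {
    "我用望远镜看见一个男孩。": 0.6,
    "我看见一个带望远镜的男孩。": 0.4,
}


def noisy_channel_score(source, target):
    """log p(s | t) + log p(t); logs avoid underflow on longer sentences."""
    return math.log(TRANSLATION_MODEL[(source, target)]) + math.log(LANGUAGE_MODEL[target])


def decode(source, candidates):
    """Pick the candidate target with the highest noisy-channel score."""
    return max(candidates, key=lambda t: noisy_channel_score(source, t))


best = decode(SOURCE, list(LANGUAGE_MODEL))
print(best)
```

With these toy numbers the translation model cannot separate the two candidates, so the language model decides. In this formulation the ambiguity is resolved by statistics rather than by an explicit attachment rule.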

  20. 3. Approaches to NLU [Figure: performance over the years; labels: Rule base, Dictionary + Machine Learning, Corpus base] More data is better data.

  21. 3. Approaches to NLU Many hard nuts still remain to be cracked: • Word sense disambiguation • Syntactic disambiguation • Semantic analysis and translation • Automatic evaluation of translation … …

  22. 3. Approaches to NLU [Figure: Increasing Number of Chinese Webpages; data from the Information Center of China Internet] • The number of webpages is increasing exponentially • The highest accuracy of Chinese information retrieval (webpage search) in 2006 was only about 36.7% (from 863 report)

  23. 3. Approaches to NLU What is the problem?

  24. 3. Approaches to NLU “One should build the rocket, instead of climbing the tree, if he wants to reach the moon”, Martin Kay • Are we building the rocket or climbing the tree? • Are we currently building the rocket the right way?

  25. 3. Approaches to NLU • How does a human brain work when it translates a sentence? [Diagram labels: Input: Speech, Text; Affective Computing + Semantic Computing; Perception; Vision; K. B.; Output; Dynamic; Static]

  26. 3. Approaches to NLU • A human can infer an unknown word sense, sentence structure, etc. from common sense (limited knowledge), but a system cannot; • A human can dynamically and synthetically use multiple knowledge sources (lexical / syntactic / semantic / pragmatic) to process a specific language phenomenon, and can easily determine which knowledge is necessary and which is not, but a system usually cannot;

  27. 3. Approaches to NLU • A human can easily acquire new knowledge and update his or her memory, which is usually difficult for a system. However, a computer can memorize a large number of words and phrases, compute very fast, and so on, which a human cannot. Currently, the models for NLU mainly exploit computing power, but rarely simulate the human cognitive process.

  28. 4. Proposal • For a specific NLU task, such as word sense disambiguation, syntactic parsing, or translation, we need to model the cognitive process of the human brain; • Then build the task-oriented knowledge base according to these models.

  29. 4. Proposal E.g., for speech-to-speech (S2S) translation in a specific domain, the following aspects are addressed: • Investigate the effect of rhythm, tone, and accent; • Model translation in combination with a language model, a speech model, a common-sense model, etc.; • Build a knowledge base that describes linguistic, semantic, speech, and emotional knowledge as well as domain-related common sense, all oriented to S2S translation and based on the needs of the translation model.

  30. thanks 谢谢 !
