
Natural Language Understanding



Presentation Transcript


  1. Natural Language Understanding
     Outline:
     • Motivation
     • Structural vs Statistical Approaches
     • Syntax
     • Semantics
     • Semantic grammars
     • Augmented Transition Nets
     • NLU in Closed Worlds: Operational Semantics
     • The STONEWORLD program
     • Statistical NLP
     CSE 415 -- (c) S. Tanimoto, 2008 Natural Language Understanding

  2. Motivation
     • Make it easier for people to give commands to computers.
     • Allow computers to perform language translation.
     • Allow computers to listen to lectures and read books, in order to alleviate the knowledge acquisition bottleneck.
     • Improve information retrieval services, including search engines such as Google.
     • Integrate robots into human society.
     • Better understand human communication and linguistics.

  3. Structural vs Statistical Approaches
     • Structural Approach: Analytical approach based on the linguistic structure of language, especially syntax as studied by Chomsky. Encompasses handcrafted lexical analyzers, parsers, semantic interpreters, and knowledge bases. Example technique: Augmented Transition Nets based on semantic grammars.
     • Statistical Approach: Grows out of the availability of large language corpora via the Internet, and improvements in machine learning technology. Example technique: Latent Semantic Analysis.

  4. Levels of Analysis for NLU (for both structural and statistical approaches)
     (Read up from the acoustic level to the pragmatic level)
     • Pragmatic level (goals, intents, dialog, rhetorical structure, speech acts)
     • Semantic level (meaning, representation)
     • Syntactic level (grammar, phrase structure)
     • Lexical, Morphological level (words, inflections)
     • Phonological level (acoustic features -- phonemes)
     • Acoustic level (sensing, signal processing)

  5. Syntax, Semantics, Pragmatics
     By taking a more systematic approach to NLU at these levels (than was done in programs like ELIZA), we will be able to create more useful and reliable natural language interfaces.
     Issues to resolve:
     • What is the ultimate purpose of language, and how does that influence NLU?
     • How can the phrase structure of natural language be captured in a grammar?
     • How can meaning be interpreted and represented?
     • How can the syntax and semantics of a system be designed to match the needs of an application?

  6. Communicating with Language
     • Language is for communication.
     • Communication usually means sending and receiving information.
     • Sentences describe events, states of the world, objects and ideas, feelings and attitudes, and hypothetical situations.
     • Phrase-structure grammars provide a method of organizing the components of messages, allowing for a great variety of possible meanings.

  7. Syntax
     Syntax describes the form, not the meaning, of sentences in a language. It is traditionally described with formal systems called grammars.
     A context-free grammar can be specified with 4 components: G = (Σ, V, S, P), where
     • Σ is a finite set of terminal symbols called the alphabet.
     • V is a finite set of nonterminal symbols (“syntactic categories,” e.g., noun, noun-phrase, clause, etc.).
     • S is a distinguished member of V called the start symbol (or “the initial sentential form”).
     • P is a finite set of productions (rewrite rules). Each production has the form A → b_0 b_1 ... b_{n-1}, where A is a nonterminal symbol and each b_i is either a terminal or a nonterminal.
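The four-component definition above can be sketched as a small Python data structure. This is only an illustration: the class name `Grammar`, the method `is_well_formed`, and the toy grammar are hypothetical, and the disjointness check reflects the usual convention that terminals and nonterminals do not overlap, which the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar G = (Sigma, V, S, P)."""
    terminals: frozenset     # Sigma: the alphabet
    nonterminals: frozenset  # V: syntactic categories
    start: str               # S: must be a member of V
    productions: tuple       # P: pairs (A, (b_0, ..., b_{n-1}))

    def is_well_formed(self) -> bool:
        """Check the constraints stated in the definition."""
        symbols = self.terminals | self.nonterminals
        return (
            # Assumed convention: no symbol is both terminal and nonterminal.
            self.terminals.isdisjoint(self.nonterminals)
            and self.start in self.nonterminals
            and all(
                lhs in self.nonterminals and all(b in symbols for b in rhs)
                for lhs, rhs in self.productions
            )
        )


# Hypothetical toy grammar, chosen only to exercise the class.
toy = Grammar(
    terminals=frozenset({"a", "b"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("a", "S", "b")), ("S", ("a", "b"))),
)
print(toy.is_well_formed())  # True
```

Storing each right-hand side as a tuple of symbols, rather than a single string, keeps multi-character symbols such as noun-phrase unambiguous.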

  8. Example Grammar from a Formal Languages Context
     G = ({0, 1}, {S, A, B}, S, P), where
     P = { S → A
           S → B
           A → 0A0
           A → 1
           B → 1B1
           B → 0 }

  9. Example Grammar from a Computational Linguistics Context
     G = ({symbols, are, tools}, {S, N, V}, S, P), where
     P = { S → N V N
           N → symbols
           N → tools
           V → are }
     A derivation of a sentence from S:
     S ⇒ N V N ⇒ tools V N ⇒ tools are N ⇒ tools are symbols
     Each item in the sequence is a sentential form.
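The derivation on this slide can be replayed mechanically. The following Python sketch rewrites the leftmost nonterminal at each step; the function name `leftmost_derivation` and the idea of selecting productions by index are illustrative assumptions, not part of the course material.

```python
# Productions of the slide's grammar, keyed by nonterminal.
PRODUCTIONS = {
    "S": [["N", "V", "N"]],
    "N": [["symbols"], ["tools"]],
    "V": [["are"]],
}


def leftmost_derivation(start, choices):
    """Return the sentential forms produced by rewriting the leftmost
    nonterminal at each step; choices[i] is the index of the production
    to apply at step i."""
    form = [start]
    forms = [form]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in PRODUCTIONS)
        form = form[:i] + PRODUCTIONS[form[i]][choice] + form[i + 1:]
        forms.append(form)
    return forms


steps = leftmost_derivation("S", [0, 1, 0, 0])
for form in steps:
    print(" ".join(form))
# S
# N V N
# tools V N
# tools are N
# tools are symbols
```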

  10. Exercise
     For each of the strings below, determine whether or not it is in L(G), the language generated by G. If it’s in the language, give a derivation.
     • 01
     • λ
     • 011001
     • 01S10
     • 101S101
     G = ({0, 1}, {S}, S, P), where
     P = { S → 01S, S → 10S, S → 0S1, S → 1S0, S → 01, S → 10 }
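One way to check answers to an exercise like this is a brute-force search over sentential forms. The sketch below is a minimal illustration for this particular grammar only; the function name `in_language` is hypothetical. It relies on the fact that every production here makes the sentential form longer, so any form longer than the target can be discarded. The search can grow quickly for long strings.

```python
from collections import deque

# Right-hand sides of the exercise's productions; "S" is the only nonterminal.
RULES = ["01S", "10S", "0S1", "1S0", "01", "10"]


def in_language(target: str) -> bool:
    """Breadth-first search over sentential forms derivable from S."""
    if any(ch not in "01" for ch in target):
        return False  # strings in L(G) consist of terminal symbols only
    queue = deque(["S"])
    seen = {"S"}
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        if "S" not in form:
            continue  # a terminal string that does not match the target
        for rhs in RULES:
            new = form.replace("S", rhs, 1)
            # Forms never shrink, so anything longer than the target is useless.
            if len(new) <= len(target) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False


print(in_language("0110"))  # True:  S => 01S => 0110
print(in_language("00"))    # False
```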

  11. Semantics
     • The job of semantic analysis is to construct a representation of the meaning of a piece of NL text.
     • Meaning representations can be
        • descriptive: like definitions of words in a dictionary
        • operational: e.g., executable program code
        • anything in between
     • Semantic primitives: Often the meaning of a word or small phrase consists of a reference to a node in a semantic network, such as WordNet.
     • Semantic compounds: More complex meanings may be represented as case frames, or (relatively) small semantic networks whose nodes in turn reference nodes in a large semantic network or dictionary.

  12. Semantics (cont.)
     One approach: representation of meaning using case frames.
     A frame is an attribute-value structure. In a case frame, the frame has a type that usually corresponds to a verb. The particular kinds of attributes in the frame depend on the type.
     “Alexander took an exam.”
        Action: take (write, submit to)
        Agent: Alexander
        Object: examination
        Time: past
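As a rough illustration, the case frame for “Alexander took an exam.” could be held in an attribute-value structure like the one below. The class `CaseFrame`, the table `REQUIRED_SLOTS`, and the choice of which slots a "take" frame requires are assumptions made for this sketch, not definitions from the slides.

```python
from dataclasses import dataclass, field

# Hypothetical table: which attributes each frame type expects.
REQUIRED_SLOTS = {"take": ("agent", "object")}


@dataclass
class CaseFrame:
    frame_type: str                            # usually corresponds to a verb
    slots: dict = field(default_factory=dict)  # attribute -> value

    def missing_slots(self):
        """List required attributes for this frame type that are unfilled."""
        required = REQUIRED_SLOTS.get(self.frame_type, ())
        return [name for name in required if name not in self.slots]


frame = CaseFrame(
    "take",
    {"agent": "Alexander", "object": "examination", "time": "past"},
)
print(frame.slots["agent"])   # Alexander
print(frame.missing_slots())  # []
```

Keying the expected attributes by frame type mirrors the slide's point that the kinds of attributes depend on the type.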

  13. Semantic Analysis: Interpretation
     The process of semantic analysis starts with either NL text or a parse (e.g., a parse tree). It produces a representation of the meaning of the text. This process is also called “semantic interpretation” or simply “interpretation”.
     One successful approach to interpretation for some computer applications involves coordinating parsing and interpretation (similar to syntax-directed translation in some programming language compilers).
     For this approach, we usually need a “semantic grammar” ...

  14. Semantic Grammar
     A semantic grammar is a grammar whose syntactic categories correspond directly to groups of words whose meanings can be largely inferred from the parse.
        <command> → <do-word> the <job-word>
        <do-word> → do | perform | start | finish
        <job-word> → job | task | command | activity | operation
     Example sentences: “start the activity”, “do the operation”, “finish the job”
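Because the categories of a semantic grammar carry meaning, a recognizer for it can return an interpretation directly. A minimal Python sketch for the three productions on this slide follows; the function name `parse_command` and the dictionary keys "do" and "job" are hypothetical choices for illustration.

```python
DO_WORDS = {"do", "perform", "start", "finish"}
JOB_WORDS = {"job", "task", "command", "activity", "operation"}


def parse_command(text):
    """Match <command> -> <do-word> the <job-word>.

    Returns a small meaning structure on success, or None if the text
    is not in the language of the grammar."""
    words = text.lower().split()
    if (
        len(words) == 3
        and words[0] in DO_WORDS
        and words[1] == "the"
        and words[2] in JOB_WORDS
    ):
        return {"do": words[0], "job": words[2]}
    return None


print(parse_command("start the activity"))  # {'do': 'start', 'job': 'activity'}
print(parse_command("activity the start"))  # None
```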

  15. Controlled Language
     A controlled language is a subset of a natural language specified in a computer-based representation or formal system for the purpose of facilitating analysis or understanding by computer.
     The language generated by a semantic grammar is one type of controlled language.

  16. Augmented Transition Nets
     An ATN is a language processor that combines parsing and translation. It is based on a collection of transition diagrams.
     [Figure: transition diagrams for <command> (arcs labeled <do-word>, the, <job-word>), for <do-word> (do, etc.), and for <job-word> (job, etc.)]
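A heavily simplified sketch of the idea, in Python: states are connected by arcs, and following an arc may store the matched word in a register, so that parsing and translation happen in one pass. The state names (C1, C2, C3, DONE), the register names, and the arc order (taken from the <command> production of the semantic grammar) are assumptions. A full ATN would also support calls into sub-networks and backtracking, which this sketch omits.

```python
DO_WORDS = {"do", "perform", "start", "finish"}
JOB_WORDS = {"job", "task", "command", "activity", "operation"}

# state -> list of arcs; each arc is (accepted words, register or None, next state)
ARCS = {
    "C1": [(DO_WORDS, "action", "C2")],
    "C2": [({"the"}, None, "C3")],
    "C3": [(JOB_WORDS, "target", "DONE")],
}


def run_atn(text):
    """Traverse the network one word at a time, filling registers.

    Returns the registers if the input ends in the final state, else None."""
    registers = {}
    state = "C1"
    for word in text.lower().split():
        for accepted, register, next_state in ARCS.get(state, []):
            if word in accepted:
                if register is not None:
                    registers[register] = word  # the "augmentation": save meaning
                state = next_state
                break
        else:
            return None  # no arc out of this state accepts the word
    return registers if state == "DONE" else None


print(run_atn("finish the job"))  # {'action': 'finish', 'target': 'job'}
print(run_atn("finish job"))      # None
```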

  17. Stone World
     • A microworld: 2-D cellular space in which various objects can be placed.
     • An agent, “Mace”, that takes commands from the user and inhabits the microworld.
     • Stationary objects: pillars, wells, quarries.
     • Portable objects: stones, gems.
     • Actions: Mace can move and can carry objects.
     • A natural-language interface: Augmented transition network based on a semantic grammar.

  18. Stone World Motivation
     • Demonstrates a full combination of syntax, semantics, actions, and responses.
     • An artificial, closed world permits unambiguous interpretation.
     • Stone World offers a substrate upon which experiments and games can be constructed.
     • Stone World, while simple by comparison, shares these features with the well-known research system SHRDLU, developed by Terry Winograd at MIT.

  19. Stone World’s ATN
     [Figure: transition diagram for commands. States: G1, G2, G3, T2, T3, T4, P2, P3. Arc labels: SHOW, (GO-VERB), (TAKE-VERB), (PUT-VERB), TO, TOWARD, UP, DOWN, IT, LAST, (NP1), (DNP1), *]

  20. Stone World’s ATN (cont.)
     [Figure: transition diagrams for noun phrases. States NP1, NP2 with arcs (ARTICLE), (OBJ-NOUN), (OBJ-NOUN). States DNP1, DNP2 with arcs (ARTICLE), (DIRECTION-NOUN)]

  21. Demonstration of Stone World
     • The Python implementation of Stone World consists of two parts:
        • representation and methods for accessing and transforming the state of the microworld;
        • the Augmented Transition Network and other support for the natural-language interface.
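To give a feel for the first of these two parts, here is a minimal sketch of a microworld state with methods for moving and carrying. It is not the course's implementation: the class `StoneWorld`, its fields, the method names, and the rules (for example, that Mace cannot walk into an occupied cell) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Row/column offsets for each direction (row 0 is taken to be the north edge).
DIRECTIONS = {"NORTH": (-1, 0), "SOUTH": (1, 0), "EAST": (0, 1), "WEST": (0, -1)}
PORTABLE = {"stone", "gem"}


@dataclass
class StoneWorld:
    rows: int
    cols: int
    mace: tuple                                  # (row, col) of the agent
    objects: dict = field(default_factory=dict)  # (row, col) -> object name
    carrying: Optional[str] = None

    def _neighbor(self, direction):
        """Cell adjacent to Mace in the given direction, or None if off-grid."""
        dr, dc = DIRECTIONS[direction]
        r, c = self.mace[0] + dr, self.mace[1] + dc
        return (r, c) if 0 <= r < self.rows and 0 <= c < self.cols else None

    def move(self, direction):
        cell = self._neighbor(direction)
        if cell is None or cell in self.objects:
            return False  # blocked by the boundary or by an object
        self.mace = cell
        return True

    def take(self, direction):
        cell = self._neighbor(direction)
        if self.carrying is None and self.objects.get(cell) in PORTABLE:
            self.carrying = self.objects.pop(cell)
            return True
        return False

    def put(self, direction):
        cell = self._neighbor(direction)
        if self.carrying is not None and cell is not None and cell not in self.objects:
            self.objects[cell] = self.carrying
            self.carrying = None
            return True
        return False


world = StoneWorld(4, 4, mace=(1, 1), objects={(1, 2): "stone", (0, 1): "pillar"})
print(world.move("NORTH"))  # False: a pillar is in the way
print(world.take("EAST"))   # True: Mace now carries the stone
print(world.move("EAST"))   # True: the cell is now empty
print(world.put("SOUTH"))   # True: the stone is placed at (2, 2)
```

Each method returns a success flag, so a natural-language front end could turn a failed action into a spoken refusal rather than an error.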

  22. Sample Conversation
     WALK NORTH
     * I UNDERSTAND YOU.
     OK
     GO TO THE WEST
     * I UNDERSTAND YOU.
     OK
     GO WEST
     * I UNDERSTAND YOU.
     OK
     TAKE A STONE FROM THE QUARRY
     * I UNDERSTAND YOU.
     OK
     DROP THE STONE TOWARD THE EAST
     * I UNDERSTAND YOU.
     OK
     TAKE A STONE
     * I UNDERSTAND YOU.
     OK
     DROP IT TO THE NORTH
     * I UNDERSTAND YOU.
     OK
     GO SOUTH
     * I UNDERSTAND YOU.
     OK

  23. Statistical NLP
     Statistics has long been a part of computational linguistics. However, interest in the approach grew rapidly during the 1990s as the Internet grew.
     Subareas include corpus-based language description, applications in improving search-engine indexing and retrieval, question answering, and data mining.

  24. Statistical NLP (cont.)
     Latent Semantic Analysis: use of singular-value decomposition of large term-document matrices to create “semantic spaces” in which semantically related words and documents tend to be close together (to be presented later).
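The core step of Latent Semantic Analysis can be sketched with NumPy on a tiny, made-up term-document matrix. The terms, the counts, the choice of k = 2, and the helper `similarity` are all assumptions for illustration; real applications use far larger matrices and usually weight the counts first.

```python
import numpy as np

terms = ["stone", "gem", "quarry", "parser", "grammar", "syntax"]
# Rows are terms, columns are documents; the counts are invented.
X = np.array([
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

# Singular-value decomposition: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2                             # number of latent dimensions kept
term_vectors = U[:, :k] * s[:k]   # each row places a term in the k-dim space


def similarity(a, b):
    """Cosine similarity between two terms in the reduced space."""
    va = term_vectors[terms.index(a)]
    vb = term_vectors[terms.index(b)]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


print(similarity("stone", "quarry"))
print(similarity("stone", "parser"))
```

With this toy data, terms that occur in the same group of documents should come out with a cosine near 1, and terms from the two unrelated groups with a cosine near 0.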
