
Natural Language Processing

Natural Language Processing. Spring 2007 V. “Juggy” Jagannathan. Course Book. Foundations of Statistical Natural Language Processing. By Christopher Manning & Hinrich Schütze. Chapter 3. Linguistic Essentials January 22, 2007. Parts of Speech and Morphology.


Presentation Transcript


  1. Natural Language Processing Spring 2007 V. “Juggy” Jagannathan

  2. Course Book Foundations of Statistical Natural Language Processing By Christopher Manning & Hinrich Schütze

  3. Chapter 3 Linguistic Essentials January 22, 2007

  4. Parts of Speech and Morphology • Syntactic/Grammatical categories – Parts of Speech (POS) • Nouns – refer to people, animals, concepts & things • Verbs – express action in a sentence • Adjectives – describe properties of nouns • Substitution test for adjectives • Ex: The {sad, intelligent, green, fat…} one is in the corner.

  5. Word class/lexical categories • Open or lexical categories • Nouns, verbs and adjectives, which have a large membership that continually grows as new words are added to the language • Closed or functional categories • Prepositions and determiners • Ex. of, on, the, a • Words are listed in a “dictionary” referred to by linguists as the “lexicon”

  6. Tags • Parts of Speech tagging – traditionally 8 categories – whose labels are referred to as POS tags. • Corpus linguists use more fine-grained tagging • Various corpora have been tagged extensively; the pioneering one is the Brown corpus. • Adjectives in the Brown corpus are referred to by the tag “JJ”

  7. Morphological process • Source: http://www.sil.org/LINGUISTICS/GlossaryOfLinguisticTerms/WhatIsAMorphologicalProcess.htm • Definition: “A morphological process is a means of changing a stem to adjust its meaning to fit its syntactic and communicational context.” • Example • The plural form “dog-s” is derived from the stem “dog”

  8. Morphological processes • Major forms of morphological processes • Inflection • Systematic modification of a root (stem) form by means of prefixes and suffixes • Inflection does not significantly change the meaning or word class of the word, but does change word features such as tense and plurality. • All of the inflectional forms of a word are grouped as manifestations of a single “lexeme” • Derivation • Often changes the syntactic category and can dramatically change the meaning of the derived word. • Ex: Adverb “widely” derived from adjective “wide” • Ex: suffix use – weak-en; soft-en; understand-able; accept-able; teach-er; lead-er • Compounding • Merging of two or more words into a new word (concept) • Ex. disk drive, tea kettle, college degree, down market, mad cow disease, overtake
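The three processes on this slide can be contrasted with a minimal Python sketch. The function names and the tiny rule set are illustrative assumptions, not part of the course material, and real English morphology has many more rules and exceptions than shown here.

```python
# Toy illustration only: the rules below are assumptions chosen to cover
# the slide's own examples, not a complete model of English morphology.
IRREGULAR_PLURALS = {"woman": "women"}


def pluralize(noun: str) -> str:
    """Inflection: same lexeme, only the number feature changes."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def derive(stem: str, suffix: str) -> str:
    """Derivation: attach a suffix to form a new word (naive concatenation)."""
    return stem + suffix


def compound(*words: str) -> str:
    """Compounding: merge several words into one multi-word concept."""
    return " ".join(words)


print(pluralize("dog"), pluralize("speech"), pluralize("woman"))
print(derive("wide", "ly"), derive("weak", "en"), derive("teach", "er"))
print(compound("disk", "drive"))
```

Inflection is modeled as a function from a noun to another form of the same noun, while derivation and compounding return new lexical items.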

  9. Nouns and Pronouns • Nouns – refer to people, animals and things • Dog, tree, person, hat, speech, idea, philosophy • Inflection is a process by which the stem of a word is modified to create a new word form • In English, the only form of noun inflection is the one indicating whether a noun is singular or plural • Ex. dogs, trees, hats, speeches, persons • Irregular inflection example: women • Other languages use inflection to convey gender (masculine, feminine, neuter) and case (nominative, genitive, dative, accusative).

  10. Gender forms • Pronouns • Masculine (he), feminine (she), neuter (it) • Case relationship in English – the genitive case • Ex: the woman’s house; the students’ grievances • Possessive pronouns • Ex: my car • Second possessive form of pronoun: a friend of mine • Reflexive pronouns – ex. herself, myself • Ex: • Mary saw herself in the mirror. • Mary saw her in the mirror. • Reflexive pronouns are also referred to as “anaphors”; they must refer to something nearby in the text.

  11. Brown tags ** Examples from: http://www.tameri.com/edit/doubles.html

  12. Pronoun forms and Brown Tags

  13. Words that accompany nouns: determiners and adjectives • Determiners – describe the particular reference of a noun • Articles – refer to someone or something • “the” refers to someone or something we already know about • Ex. “the tree” refers to a known tree. • “a” or “an” introduces a new reference to something that has not appeared before or whose identity cannot be inferred from the context.

  14. Determiners and adjectives • Demonstratives • “this” or “that” • Adjectives • Describe properties of nouns • Ex: a red rose, this long journey, many intelligent children, a very trendy magazine. • The uses above are referred to as attributive or adnominal. • Predicative form of adjective (appearing after the verb, as its complement) • Ex. The rose is red. The journey will be long.

  15. Agreement • Agreement here refers to congruence in gender, case and number between the determiner, adjective and noun. In many languages, this can be quite complex.

  16. Adjectives and Brown tags • Positive – the basic form of an adjective [JJ] • Ex. rich, trendy, intelligent • Comparative [JJR] • Ex. richer, trendier • Superlative [JJT] • Ex. richest, trendiest • Semantically superlative adjectives [JJS] • Ex. chief, main and top • Numbers – a subclass of adjectives • Cardinals [CD] • Ex. one, two, and 6,000,000 • Ordinals [OD] • Ex. first, second, tenth • Periphrastic forms – forms made by using auxiliary words • Ex. more intelligent, most intelligent
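As a rough illustration of how the adjective tags on this slide relate to word form, here is a minimal Python sketch. The function name `brown_adjective_tag` and the suffix heuristic are assumptions made for this example. The Brown corpus was not tagged this way, and the heuristic misfires on words such as “clever” or “honest”.

```python
# Hypothetical helper: assigns one of the four adjective tags from the slide
# using a naive suffix heuristic. Only single-word adjectives are handled;
# periphrastic forms ("more intelligent") are out of scope.
SEMANTIC_SUPERLATIVES = {"chief", "main", "top"}


def brown_adjective_tag(word: str) -> str:
    w = word.lower()
    if w in SEMANTIC_SUPERLATIVES:
        return "JJS"  # semantically superlative
    if w.endswith("est"):
        return "JJT"  # morphologically superlative
    if w.endswith("er"):
        return "JJR"  # comparative
    return "JJ"       # positive (basic) form


for adjective in ["rich", "richer", "richest", "chief"]:
    print(adjective, brown_adjective_tag(adjective))
```

The lookup set is checked before the suffix rules so that semantically superlative words are not mistaken for positive forms.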

  17. Brown tags for determiners, quantifiers • Determiners • Articles [AT] • Singular determiners [DT] • This, that • Plural determiners [DTS] • These, those • Determiners that can be either singular or plural [DTI] • Some, any • Double conjunction determiners [DTX] • Either, neither • Quantifiers • Words that express ideas like “all”, “many”, “some” • Pre-quantifier [ABN] • All, many • Nominal pronoun [PN] • One, something, anything • Interrogative pronouns • [WDT] – wh-determiner – what, which • [WP$] – possessive wh-pronoun: whose • [WPO] – objective wh-pronoun: whom, which, that • [WPS] – nominative wh-pronoun: who, which, that

  18. Verbs

  19. Phrase Structure

  20. Phrase Structure • Noun phrases [NP] • The noun is the head of the noun phrase • Prepositional phrases [PP] • Headed by a preposition and contain an NP complement • Verb phrases [VP] • Headed by a verb • Ex. Getting to school on time was a struggle. • Adjective phrases [AP] • She is very sure of herself. • He seemed a man who was quite certain to succeed.

  21. Phrase Structure Grammars • Syntactic analysis allows us to infer meaning – the following two sentences use the same words but mean completely different things • Mary gave Peter a book. • Peter gave Mary a book. • In some languages the order of the words does not matter – these are free word order languages

  22. Rewrite rules
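The body of this slide did not survive in the transcript. As a hedged illustration of what rewrite rules are, the Python sketch below encodes a tiny context-free grammar and enumerates every sentence it derives. The grammar, the category labels (`Det`, `N`, `V`) and the function name `expand` are assumptions made for this example, not the rules shown in the original slide.

```python
from itertools import product

# Hypothetical toy grammar. Each key is a nonterminal and each value lists
# its possible right-hand sides, e.g. S -> NP VP. Anything that is not a key
# is treated as a terminal (a word).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"]],
    "N": [["children"], ["cake"]],
    "V": [["ate"]],
}


def expand(symbol):
    """Return every word sequence derivable from symbol.

    Assumes the grammar is non-recursive; a recursive rule would make this
    function loop forever.
    """
    if symbol not in RULES:
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

This grammar also derives “the cake ate the children”, which is syntactically well formed but semantically odd: rewrite rules on their own capture structure, not plausibility.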

  23. Labeled bracketing
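The body of this slide is also missing from the transcript. Labeled bracketing writes a phrase structure tree as nested brackets, each opening bracket carrying the category label of its constituent. The Python sketch below shows the idea. The nested-tuple tree encoding, the category labels and the function name `bracket` are assumptions made for illustration.

```python
# A tree is either a word (str) or a tuple (label, child, child, ...).
# This encoding is an assumption chosen for brevity.
def bracket(tree) -> str:
    """Render a tree as a labeled bracketing string."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


tree = (
    "S",
    ("NP", ("Det", "the"), ("N", "children")),
    ("VP", ("V", "ate"), ("NP", ("Det", "the"), ("N", "cake"))),
)

print(bracket(tree))
# [S [NP [Det the] [N children]] [VP [V ate] [NP [Det the] [N cake]]]]
```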

  24. Non-local and long-distance dependencies • Subject-verb agreement • The women who found the wallet were given a reward. • Long-distance relationship • Which book should Peter buy? • These dependencies impact statistical NLP approaches

  25. Dependency: Arguments and adjuncts • Dependency • Concept of dependents • “Sue watched the man at the next table” • Sue and man are dependents of watched. • The PP “at the next table” is a dependent of man. It modifies man. • The two noun phrases can be viewed as “arguments” of the verb “watched”. • Semantic roles • Agent of an action is the person or thing doing the action [also viewed as the subject] • Patient is the person or thing that is being acted on [also viewed as the object]

  26. Active & Passive voice • Example • Children eat candy. • Candy is eaten by children.

  27. Adjuncts

  28. Subcategorization Frame • The set of arguments that a verb can appear with is referred to as its subcategorization frame.
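One simple way to picture a subcategorization frame is as a lookup table from verbs to the complement patterns they allow. The Python sketch below is only an illustration. The verb entries, the choice to list only post-verbal complements (leaving the subject implicit) and the function name `licenses` are assumptions, not material from the slides.

```python
# Hypothetical mini-lexicon: each verb maps to the complement sequences it
# accepts. The subject argument is left implicit for brevity.
FRAMES = {
    "sleep": [()],                          # intransitive: no complement
    "eat": [(), ("NP",)],                   # optionally transitive
    "give": [("NP", "NP"), ("NP", "PP")],   # ditransitive patterns
}


def licenses(verb: str, complements) -> bool:
    """Return True if the verb's frames allow this complement sequence."""
    return tuple(complements) in FRAMES.get(verb, [])


print(licenses("give", ["NP", "NP"]))  # Mary gave Peter a book
print(licenses("sleep", ["NP"]))       # *The children slept the bed
```

A parser could use a check like this to reject analyses in which a verb appears with arguments it does not subcategorize for.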

  29. Selectional restrictions or selectional preferences

  30. X’ Theory • N’ – “N bar nodes” • http://en.wikipedia.org/wiki/X-bar_theory

  31. Phrase Structure Ambiguity

  32. Garden Paths • Parsing the following sentence • The horse raced past the barn fell. • Garden path parse is the phenomenon by which a parse that is generated from “the horse raced past the barn” will have to be abandoned to accommodate “fell”.

  33. Ungrammatical constructs • Parsing may fail, or may yield multiple parses, because of ungrammatical constructs • Slept children the • Some sentences may be grammatically correct but meaningless • Colorless green ideas sleep furiously. • The cat barked.

  34. Semantics and Pragmatics • Semantics covers both lexical semantics, the study of the meanings of individual words, and the study of how meanings of individual words are combined into the meaning of sentences. • Hypernymy vs hyponymy • animal is a hypernym of cat • cat is a hyponym of animal • Antonyms – words with opposite meanings • Meronymy – part belonging to a whole • tire is a meronym of car • Holonym – whole corresponding to a part • Synonyms – words with similar meanings • Homonyms – words that are spelled the same but have different, unrelated meanings • bank – river bank; bank – a financial institution • Senses • Polyseme – a word whose different senses (meanings) are related • Ex. “branch” could mean part of a tree or a dependent part of an organization. • Ambiguity – lexical ambiguity refers to both homonymy and polysemy • Homophony – homonyms that are also pronounced the same • “bass”, which could mean a fish or a low-pitched sound, is a homonym but NOT a homophone, because the two senses are pronounced differently.
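The lexical relations on this slide can be stored as simple lookup tables. The Python sketch below is a toy illustration. The dictionaries, the extra entries beyond the slide's cat/animal and tire/car examples, and the function names are assumptions made for this example.

```python
# Hypothetical toy lexicon. HYPERNYM maps a word to its immediate hypernym;
# MERONYMS maps a whole to its parts. The "dog" and "wheel" entries are
# additions for illustration only.
HYPERNYM = {"cat": "animal", "dog": "animal"}
MERONYMS = {"car": {"tire", "wheel"}}


def is_hyponym(word: str, ancestor: str) -> bool:
    """True if ancestor is reachable from word by following hypernym links."""
    while word in HYPERNYM:
        word = HYPERNYM[word]
        if word == ancestor:
            return True
    return False


def holonyms_of(part: str) -> set:
    """Invert the meronym table: which wholes contain this part?"""
    return {whole for whole, parts in MERONYMS.items() if part in parts}


print(is_hyponym("cat", "animal"))  # cat is a hyponym of animal
print(is_hyponym("animal", "cat"))  # the relation is not symmetric
print(holonyms_of("tire"))          # car is a holonym of tire
```

Hypernymy and hyponymy are inverses of one another, as are meronymy and holonymy, so each pair needs only one stored table.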
