Explore the complexities and models of vagueness in AI, with practical applications in natural language generation. Topics include linguistic theories, generative models, and implications for computational semantics.
Vagueness: a problem for AI Kees van Deemter University of Aberdeen Scotland, UK Harbin Institute of Technology, NLP Summer School, July 2008
Overview
1. Meaning: the received view
2. Vague expressions in NL
3. Why vagueness?
   a. vague objects
   b. vague classes
   c. vague properties
4. Vagueness and context
Overview (ctd.)
5. Models of vagueness
   a. Classical models
   b. Partial Logic
   c. Context-based models
   d. Fuzzy Logic
• Probabilistic models
• Generating vague descriptions
• Natural Language Generation
7. Project: advertising real estate
The plan
• The lectures will give you a broad overview of some of the key concepts and issues to do with vagueness
• Lots of theory (very briefly of course)
• Some simple/amusing examples
• The project will let you apply some of these concepts practically
• The project gives you a lot of freedom: you can do it in your own way!
Not Exactly: In Praise of Vagueness. Oxford University Press. To appear, 2009.
1. Meaning: the received view
• Linguistic Theory (e.g., Kripke, Montague): in a given “world”, each constituent denotes a set-theoretic object.
• For example:
  Domain = {a,b,c,d,e,f,g,h,i,j,k}
  [[cat]] = {a,b,f,g,h,i,j}
  [[dog]] = {c,d,e,k}
• There are no unclear cases. Everything is crisp. (Nothing is vague.) Sentences are always true or false.
1. Meaning: the received view
• Linguistic Theory (e.g., Kripke, Montague): in a given “world”, each constituent denotes a set-theoretic object.
• Example: Domain of animals {a,..,k} in a village
  [[black]] = {a,b,c,d,e}
  [[white]] = {f,g,h,i,j,k}
  [[cat]] = {a,b,f,g,h,i,j}
  [[dog]] = {c,d,e,k}
  [[healthy]] = {d,e,f}
  [[scruffy]] = {a,b,h,i}
  [[black cat]] = {a,b,c,d,e} ∩ {a,b,f,g,h,i,j} = {a,b}
  [[All black cats are scruffy]] = ({a,b} ⊆ {a,b,h,i}) = 1
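The crisp picture on this slide maps directly onto set operations. The following is a minimal sketch in Python, using the denotations given on the slide; the variable names are illustrative only and not part of the lecture.

```python
# Crisp (set-theoretic) denotations, copied from the village example.
domain = set("abcdefghijk")
black = set("abcde")
white = set("fghijk")
cat = set("abfghij")
dog = set("cdek")
healthy = set("def")
scruffy = set("abhi")

# [[black cat]]: an intersective adjective-noun combination.
black_cat = black & cat

# [[All black cats are scruffy]]: true iff [[black cat]] is a subset of [[scruffy]].
all_black_cats_scruffy = black_cat <= scruffy

print(sorted(black_cat))
print(int(all_black_cats_scruffy))
```

Every individual is either in a set or not, so every sentence built this way comes out as 1 or 0; there is no room for unclear cases.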
Complications
• This was only a rough sketch ...
• One needs syntax to know how to combine the meanings of constituents
• The example neglects intensionality:
  [[fake cats]] = [[fake]] ∩ [[cat]] ??
• These and other issues are tackled in the tradition started by Richard Montague
• (Not the focus of these lectures)
Complications
• There’s more to communication.
• Why was the sentence uttered?
  • To warn a child not to approach a particular cat?
• The same information could be expressed differently
  • “Cats are always scruffy”
  • “Don’t touch that cat”
• What motivates the choice?
• This is a key question when you try to let a computer generate sentences: Natural Language Generation (NLG)
Modern Computational Semantics
• ... is still broadly in line with this “crisp” picture
• Statistical methods do not really change matters
• Example: recent work on logical entailment
  • uses statistics to learn entailment relations and test theories by comparing with human judgments
  • Entailment itself is typically assumed to be crisp
• But what if the extension of a word is vague?
  • clearly scruffy: a,b,h,i
  • clearly not scruffy: f
  • unclear cases: d,g,j: maybe scruffy? a little scruffy?
• Does `x is scruffy` entail `x is not healthy`? (Perhaps just a bit?)
A subtler semantic theory is needed, which takes vagueness into account.
2. Vague expressions in NL
Vagueness occurs in every linguistic category. E.g.,
• adjectives: large, small, young, old, ...
• adverbs: quickly, slowly, well, badly, ...
• determiners/articles: many, few, ...
• nouns: girl, boy, castle, fortune, poison (?) ...
• verbs: run, improve (significantly), cause (?) ...
• prepositions: (right) after, (right) above, ...
• intensifiers: very, quite, somewhat, a little
In some categories, vagueness is normal:
• Adjectives: hard to find crisp ones
• Adverbs: hard to find crisp ones
British National Corpus (BNC): 7 of the 10 most frequent adjectives are vague; the others are ... borderline cases of adjectives (`last`, `other`, `different`).
British National Corpus (BNC)
last (140,063 occurrences) Adj?
other (135,185) Adj?
new (115,523)
good (100,652)
old (66,999)
great (64,369)
high (52,703)
small (51,626)
different (48,373) Adj?
large (47,185)
Non-lexical vagueness
• Generics: “Cats are scruffy”. All cats?? “Dutchmen are tall”. All Dutchmen??
• Temporal succession: “He came, he saw, he conquered”. In quick succession!
• Irony: “George is not the brightest spark”. George is probably quite stupid.
• Aside: Same mechanisms in Chinese? Many possible linguistic research projects!
3. Why do we use vagueness?
1. Vague objects
Example: Old Number One. (A court case made famous by a recent article by the philosopher Graeme Forbes.)
Old Number One [Details omitted here, but see the following animation]
Other objects are vague too ...
• Rowing boats are repaired by replacing planks. When no plank is the same, is it the same boat?
• Think of a book or PhD thesis. How is it written? Doesn’t it change all the time? What if it’s translated? What if it’s translated badly?
• How many languages are spoken in China? Every linguist gives a different answer.
• Are you the same person as you were 3 weeks after conception? After you’ve lost your memory?
• The cat is losing hair. Is that hair still part of the cat?
Why vagueness?
2. Vague classes
Biology example made famous by Richard Dawkins: the ring species called ensatina.
Standard principle: a and b belong to the same species iff a and b can produce fertile offspring.
It follows that species overlap, and that there are many more of them than is usually assumed!
The ensatina salamander [Details omitted here]
Why vagueness?
3. Vague predicates
Why do we use words like `large/small`, `dark/light`, `red/pink/...`, `tall/short`?
Question: would you believe me if I told you that Dutch has no vague word meaning `tall`, but only the word `lang`, which means `height ≥ 185cm`?
For example, `Kees is lang` means height(Kees) ≥ 185cm
Why vagueness?
`Kees is taller than 185cm`
The threshold of 185cm makes a crisp distinction between
A: height > 185cm
B: height ≤ 185cm
[Figure: height scale divided at 185cm into region A (above) and region B (below)]
Why vagueness?
`Kees is taller than 185cm`
Some elements of A and B are too close to be distinguished by anyone!
Example: x in A with height 185.001cm, y in B with height 184.999cm
Vagueness can have other reasons, including
• We may not have an objective measurement scale (e.g., `John is nice`)
• You may not share a scale with your audience (Celsius/Fahrenheit)
• You may want to add your own interpretation (e.g., `Blood pressure is too high`)
• You’re passing on words verbatim (e.g., I tell you what I heard in the weather forecast)
Many of these reasons are relevant for NLP (including NLG)!
4. Vagueness and Context
• Vague expressions often cannot be interpreted without context
• We know (roughly) how to apply `tall` to a person. Roughly: Tall(x) ⇔ Height(x) >> 170cm
• If we used the same standards for buildings as for people, then no building would ever fail to be tall!
• Context-dependence allows us to associate one word with very different standards. Very efficient! (Cf. Barwise & Perry, “Situations and Attitudes”)
Variant of old example
[[animal]] = {a,...,k}
[[black]] = {a,b,c,d,e}
[[cat]] = {a,b,f,g,h,i,j}
[[elephant]] = {c,d,e,k}
[[black cat]] = {a,b,c,d,e} ∩ {a,b,f,g,h,i,j} = {a,b}
[[small elephant]] = ?
[[small]] = ?
Any answer implies that: x is a small elephant ⇒ x is a small animal
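The point of this slide can be checked by brute force. The sketch below assumes that `small` is treated like `black`, i.e. as a fixed set combined with the noun by intersection; it then tries every possible set as [[small]] and looks for one that avoids the unwanted entailment. The helper `all_subsets` is a hypothetical name introduced for illustration.

```python
from itertools import combinations

# Denotations from the slide.
animal = set("abcdefghijk")
elephant = set("cdek")


def all_subsets(universe):
    """Yield every subset of `universe` (2**11 = 2048 sets for this domain)."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


# Assumption: [[small N]] = [[small]] ∩ [[N]], as for [[black cat]].
# A candidate [[small]] is a counterexample if some small elephant
# is NOT a small animal under that candidate.
counterexamples = [
    small
    for small in all_subsets(animal)
    if not ((small & elephant) <= (small & animal))
]

print(len(counterexamples))
```

No candidate set survives, because [[elephant]] is a subset of [[animal]]. Under the intersective treatment, every small elephant is forced to be a small animal, which is why a context-dependent account of `small` is needed.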
Other types of vagueness are also context dependent
• [[many]] = how many?
• [[(increase) rapidly]] = how rapidly?
• Temporal succession. Compare the time lags between the events reported:
  1. “John entered the room. [..] He looked around.” (less than a second)
  2. “Caesar came, [..] he saw, [..] he conquered.” (weeks or months)
  3. “The ice retreated. [..] Plants started growing. [..] New species appeared.” (many years)
5. Models of Vagueness
A problem that we would like our models to shed light on: the sorites paradox.
Oldest known version (Euboulides): 1 stone makes no heap; if x stones don’t make a heap then x+1 stones don’t make a heap. So, no finite heaps exist (consisting of stones only).
Henceforth:
`~` = indistinguishable
`<<` = observably smaller than
Sorites paradox (version 2)
Premises: Short(150.000cm), ¬Short(200.000cm)
150.000cm ~ 150.001cm, therefore Short(150.001cm)
150.001cm ~ 150.002cm, therefore Short(150.002cm)
...
therefore Short(200.000cm)
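To see how long the chain of inferences on this slide is, the argument can be replayed mechanically. The sketch below is only an illustration: it assumes that a difference of 0.001cm is always indistinguishable, and it stores heights as integers in thousandths of a centimetre to avoid floating-point drift. The function name is hypothetical.

```python
# Heights are integers in thousandths of a centimetre: 150.000cm == 150_000.
START = 150_000  # premise: Short(150.000cm)
END = 200_000    # premise: not Short(200.000cm)
STEP = 1         # 0.001cm, assumed to be below any noticeable difference


def derive_short_by_tolerance(start: int, end: int, step: int) -> tuple[int, int]:
    """Repeat the tolerance step; return (last height derived as short, steps used)."""
    height = start  # Short(start) holds by the first premise
    steps = 0
    while height + step <= end:
        # height ~ height + step, and Short(height), therefore Short(height + step)
        height += step
        steps += 1
    return height, steps


last_short, steps = derive_short_by_tolerance(START, END, STEP)

derived_short_200 = last_short == END  # the chain reaches 200.000cm
premise_not_short_200 = True           # second premise of the argument
contradiction = derived_short_200 and premise_not_short_200

print(f"{steps} tolerance steps derive Short({last_short / 1000:.3f}cm)")
print("contradiction with the second premise:", contradiction)
```

Each single step looks harmless, but 50,000 of them carry `Short` from 150cm all the way to 200cm, which clashes with the second premise.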
Sorites paradox
• We derive a contradiction
• Of course there is something wrong here. But what exactly is the error?
• One bit that we would like to “honour”:
  Principle of Tolerance: things that resemble each other so closely that people cannot notice the difference should not be assigned (very) different semantic values
5. Models of Vagueness
• Partial Logic. A small modification of Classical Logic. Three truth values: True, False, Undecided (in the “gap” between True and False).
Partial Logic
Partial Logic makes some formulas true, others false, yet others undecided.
In the partial model pictured, if Height(John) = 170cm then [[Tall(John)]] = undecided.
[Figure: height scale with `Tall` above 185cm, a `Gap` between 165cm and 185cm, and `Not tall` below 165cm]
Some connectives defined
[[p ∨ q]] = True iff [[p]] = True or [[q]] = True
[[p ∨ q]] = False iff [[p]] = [[q]] = False
[[p ∨ q]] = Undecided otherwise
[[¬p]] = True iff [[p]] = False
[[¬p]] = False iff [[p]] = True
[[¬p]] = Undecided otherwise
Consider ...
[[Tall(John) ∨ ¬Tall(John)]]
(an instance of the Law of Excluded Middle)
Is this true, false, or undecided?
Consider ...
[[Tall(John) ∨ ¬Tall(John)]]
[[Tall(John)]] = undecided, therefore
[[¬Tall(John)]] = undecided, therefore
[[Tall(John) ∨ ¬Tall(John)]] = undecided
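The partial model and the two connectives can be written out in a few lines. This is a minimal sketch, not the lecture's own code: `None` stands for Undecided, the names `tall`, `p_or` and `p_not` are hypothetical, and treating exactly 165cm and exactly 185cm as clear cases is an assumption, since the slides do not say where the boundary points fall.

```python
from typing import Optional

# Three truth values: True, False, and None for Undecided.
Truth = Optional[bool]


def tall(height_cm: float) -> Truth:
    """Partial model with a gap between 165cm and 185cm (boundary handling assumed)."""
    if height_cm >= 185:
        return True
    if height_cm <= 165:
        return False
    return None  # inside the gap


def p_or(p: Truth, q: Truth) -> Truth:
    """Disjunction: True if either side is True, False if both are False, else Undecided."""
    if p is True or q is True:
        return True
    if p is False and q is False:
        return False
    return None


def p_not(p: Truth) -> Truth:
    """Negation: swaps True and False, leaves Undecided unchanged."""
    if p is None:
        return None
    return not p


john = 170
excluded_middle = p_or(tall(john), p_not(tall(john)))
print(tall(john), p_not(tall(john)), excluded_middle)
```

For John at 170cm all three values come out as `None`, matching the slide: in Partial Logic this instance of the Law of Excluded Middle is undecided.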
Repair by means of supervaluations
Suppose I am uncertain about something (e.g., the exact threshold for “Tall”).
Suppose p is true regardless of how my uncertainty is resolved ...
Then I can conclude that p.
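The idea on this slide can be sketched as follows: evaluate a formula classically under every admissible way of resolving the uncertainty, and accept it only if it comes out true each time. The sketch assumes that the admissible thresholds for “Tall” are the whole centimetres from 165 to 185, and that Tall(x) means Height(x) is at least the threshold; both are illustrative assumptions, and `supervaluate` is a hypothetical helper name.

```python
from typing import Callable, Iterable, Optional


def supervaluate(formula: Callable[[int], bool],
                 thresholds: Iterable[int]) -> Optional[bool]:
    """True if the formula holds under every threshold, False if under none,
    None (undecided) if the thresholds disagree."""
    values = {formula(t) for t in thresholds}
    if values == {True}:
        return True
    if values == {False}:
        return False
    return None


# Assumption: every whole-centimetre threshold in the gap is admissible.
ADMISSIBLE_THRESHOLDS = range(165, 186)
JOHN_HEIGHT = 170


def tall_john(threshold: int) -> bool:
    """Classical Tall(John) once a threshold has been fixed."""
    return JOHN_HEIGHT >= threshold


def excluded_middle(threshold: int) -> bool:
    """Classical Tall(John) or not Tall(John) under a fixed threshold."""
    return tall_john(threshold) or not tall_john(threshold)


print(supervaluate(tall_john, ADMISSIBLE_THRESHOLDS))
print(supervaluate(excluded_middle, ADMISSIBLE_THRESHOLDS))
```

Under these assumptions Tall(John) stays undecided, because some thresholds make it true and others false, while the excluded-middle formula is true however the threshold is fixed, so it can be concluded.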