
HUMANITIES COMPUTING D: LEXICOGRAPHY AND COMPUTERS




Presentation Transcript


  1. HUMANITIES COMPUTING D: LEXICOGRAPHY AND COMPUTERS • Lexical semantics • Thesauri • WordNet

  2. LEXICAL SEMANTICS • In lecture 2 we began discussing how contemporary dictionaries characterize the meaning of words • In this lecture we look at these definitions in more detail, and at other kinds of dictionaries that try to characterize word meanings more precisely: thesauri and WordNet

  3. TYPES OF DEFINITION IN A DICTIONARY • GENUS AND DIFFERENTIA: • “stating the superordinate concept next to the definiendum together with at least one distinctive feature” • SYNONYMY • TYPICALITY • USAGE

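  4. GENUS AND DIFFERENTIA • horse noun 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads (New Oxford Dictionary of English) • GENUS: mammal • DIFFERENTIAE: solid-hoofed, plant-eating, domesticated, with a flowing mane and tail, used for riding, racing, and to carry and pull loads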

  5. LIMITS OF DEFINITION BY GENUS & DIFFERENTIA (lecture 2) • Putnam: • ‘beech’ / ‘elm’ • ‘diamond’ / ‘zircon’ • Jackson: happen vs occur vs befall vs transpire • Everything Is Illuminated: ‘harmonize’ vs ‘agree’

  6. TYPES OF DEFINITION IN A DICTIONARY • GENUS AND DIFFERENTIA • SYNONYMY • Many words, especially abstract ones, are hard to define analytically • In such cases synonyms are used • TYPICALITY • USAGE

  7. DEFINITION BY SYNONYMY: CIRCULARITY • miserable 1 very unhappy, wretched 2 causing misery 3 squalid 4 mean • unhappy 1 sad or depressed 2 unfortunate or wretched • wretched 1 miserable or unhappy 2 worthless • (Collins Pocket English Dictionary, 2000)
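The circularity in these three entries can be checked mechanically. The following is a minimal Python sketch, not a tool used in the lecture: each headword is a node, each defining word that is itself a headword is an edge, and a depth-first search looks for a path back to the starting word. The `DEFINITIONS` table is hand-copied from the entries above and reduced to words that are themselves headwords; the name `find_cycle` is an illustrative choice.

```python
# Toy definition graph: headword -> defining words that are themselves headwords.
# Hand-copied from the three Collins entries quoted on the slide.
DEFINITIONS = {
    "miserable": ["unhappy", "wretched"],
    "unhappy": ["wretched"],
    "wretched": ["miserable", "unhappy"],
}


def find_cycle(graph, start):
    """Return a path start -> ... -> start if one exists, otherwise None."""
    stack = [(start, [start])]
    while stack:
        word, path = stack.pop()
        for nxt in graph.get(word, []):
            if nxt == start:
                return path + [start]      # we are back where we began
            if nxt not in path:            # do not walk the same word twice
                stack.append((nxt, path + [nxt]))
    return None


cycle = find_cycle(DEFINITIONS, "miserable")
print(" -> ".join(cycle))
```

Any path that returns to its starting headword means the reader is never given a definition in independent terms.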

  8. TYPES OF DEFINITION IN A DICTIONARY • GENUS AND DIFFERENTIA • SYNONYMY • TYPICALITY • The definition states what is “typical” of the referent • USAGE

  9. DEFINITION BY TYPICALITY • day of rest a day set aside from normal activity, typically Sunday on religious grounds • measles an infectious viral disease causing fever and a red rash, typically occurring in childhood • (Concise Oxford Dictionary)

  10. TYPES OF DEFINITION IN A DICTIONARY • GENUS AND DIFFERENTIA • SYNONYMY • TYPICALITY • USAGE • The definition explains how a word is used • Especially typical of function words (articles, prepositions, etc.)

  11. MEANING RELATIONS • Many of these definitions fix the meaning of a word through meaning relations with other words: • HYPONYMY: dog / animal • SYNONYMY: fool / idiot • ANTONYMY: right / wrong • MERONYMY: horse / mane

  12. IPONIMIA • HYPONYMY is the relation between a subclass and a superclass: • CAR and VEHICLE • DOG and ANIMAL • BUNGALOW and HOUSE • Generally speaking, a hyponymy relation holds between X and Y whenever it is possible to substitute Y for X: • That is a X -> That is a Y • E.g., That is a CAR -> That is a VEHICLE. • HYPERNYMY is the opposite relation
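The substitution test on this slide can be illustrated with a small Python sketch. It assumes a toy table that maps each word to its immediate superclass. The first three pairs come from the slide; the `house -> building` link is an extra assumption added only to show that the relation can be followed over more than one step. The names `HYPERNYM`, `is_hyponym_of` and `entailed_sentence` are illustrative.

```python
# Toy hypernym table: each word maps to its immediate superclass.
HYPERNYM = {
    "car": "vehicle",
    "dog": "animal",
    "bungalow": "house",
    "house": "building",  # assumed extra link, not on the slide
}


def is_hyponym_of(x, y):
    """True if y is reached from x by following hypernym links upwards."""
    current = x
    while current in HYPERNYM:
        current = HYPERNYM[current]
        if current == y:
            return True
    return False


def entailed_sentence(x, y):
    """Apply the substitution test: 'That is a X' -> 'That is a Y'."""
    if is_hyponym_of(x, y):
        return f"That is a {x.upper()} -> That is a {y.upper()}"
    return None


print(entailed_sentence("car", "vehicle"))
print(entailed_sentence("vehicle", "car"))  # hypernymy is not symmetric
```

The second call returns `None` because the relation only runs from subclass to superclass.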

  13. SINONIMIA • Two words are SYNONYMS if they have the same meaning at least in some contexts • E.g., PRICE and FARE; CHEAP and INEXPENSIVE; LAPTOP and NOTEBOOK; HOME and HOUSE • I’m looking for a CHEAP FLIGHT / INEXPENSIVE FLIGHT • From Roget’s thesaurus: • OBLITERATION, erasure, cancellation, deletion • But few words are truly synonymous in ALL contexts: • I wanna go HOME / ?? I wanna go HOUSE • The flight was CANCELLED / ?? OBLITERATED / ??? DELETED

  14. ANTONYMY • The antonymy relation links lemmas with opposite meanings: • right / wrong; small / large • Sometimes also ‘extended’ antonymy • right / left; dog / cat

  15. ANTONYMY • artificial not real • conventional not spontaneous or sincere or original • vacant not occupied • (Concise Oxford Dictionary 9)

  16. MERONYMY • The relation between the parts and the whole: • mane / horse; wheel / car

  17. MERONYMY IN DEFINITIONS • horse noun 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads (New Oxford Dictionary of English) • HYPERNYM: mammal • PARTS: mane, tail

  18. HOW MANY SENSES? • horse noun • 1 a solid-hoofed plant-eating domesticated mammal with a flowing mane and tail, used for riding, racing, and to carry and pull loads • Equus caballus, family Equidae (the horse family), descended from the wild Przewalski’s horse. The horse family also includes the asses and zebras. • An adult male horse; a stallion or gelding. A wild mammal of the horse family • 2 a frame or structure on which something is mounted or supported, especially a sawhorse. • 3 [mass noun] informal heroin • 4 informal a unit of horsepower: the huge 63-horse 701-cc engine • 5 Mining an obstruction in a vein • New Oxford Dictionary of English

  19. HOW MANY SENSES? • horse n • 1 a domesticated perissodactyl mammal, Equus caballus, used for draught work and riding: family Equidae • 2 the adult male of this species; stallion • 3 wild horse: 3a a horse (Equus caballus) that has become feral; 3b another name for Przewalski’s horse • 4a any other member of the family Equidae, such as the zebra or ass; 4b (as modifier): the horse family • 5 (functioning as pl) horsemen, especially cavalry: a regiment of horse • 6 Also called: buck. Gymnastics: a padded apparatus on legs, used for vaulting, etc • 7 a narrow board supported by a pair of legs at each end, used as a frame for sawing or as a trestle, barrier, etc • 8 a contrivance on which a person may ride and exercise • 9 a slang word for heroin • 10 Mining: a mass of rock within a vein of ore • 11 Nautical: a rod, rope or cable, fixed at the ends, along which something may slide by means of a thimble, shackle, or other fitting; traveller • 12 Chess: an informal name for knight • 13 Informal: short for horsepower • 14 (modifier) drawn by horse or horses: a horse cart • Collins English Dictionary 4

  20. HOMONYMY AND POLYSEMY • HOMONYMY: the senses are clearly distinct (e.g., different etymologies) • BANK • Italian SCANNARE as ‘to tear to pieces’ / as an Italianization of TO SCAN; Italian GRU as a bird (crane) / as a machine for lifting weights (crane) • POLYSEMY: the senses are related • MOUTH • Italian VERDE (‘green’) as ‘having a certain colour’ and as ‘rich in vegetation’

  21. HOW MANY SENSES? • “The ‘lumpers’ like to lump meanings together and leave the user to extract the nuance of meaning that corresponds to a particular context, whereas the ‘splitters’ prefer to enumerate differences of meaning in more detail; the distinction corresponds to that between summarizing and analysing.” • Allen, R. Lumping and splitting, English Today, 16(4), 61-3

  22. CRITERIA? • GRAMMATICAL • Nominal vs verbal senses • Transitive & intransitive uses (Hirst, 1987) • Ross KEPT staring at Nadia’s decolletage • Nadia KEPT calm and made a cutting remark • Ross wrote of his embarrassment in the diary that he KEPT. • COLLOCATIONS • isometric, from CED4: • (of a crystal or system of crystallization) having three mutually perpendicular equal axes • (of a method of projecting a drawing in three dimensions) having the three axes equally inclined and all lines drawn to scale • ETYMOLOGY

  23. PROBLEMS • Already mentioned: sense distinctions are not always easy to draw • Circularity • Relations are not used consistently

  24. SEMANTICS & LEXICON: A SUMMARY • WORD-FORMS: “eat”, “eats”, “ate”, “eaten” • LEXEMES: EAT-LEX-1 • SENSES: eat0600, eat0700

  25. THE ORGANIZATION OF THE LEXICON • WORD-FORMS: “stock” • LEXEMES: STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3 • SENSES: stock0100, stock0200, stock0600, stock0700, stock0900, stock1000

  26. SYNONYMY • WORD-FORMS: “cheap”, “inexpensive” • LEXEMES: CHEAP-LEX-1, CHEAP-LEX-2, INEXP-LEX-3 • SENSES: cheap0100, …, cheapXXXX, inexp0900, inexpYYYY
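The three-level organization (word-forms, lexemes, senses) can be pictured as a small data structure. The Python sketch below is only one possible encoding. It uses the values given for “eat”, because the slides do not show which “stock” or “cheap” senses belong to which lexeme. The class name `Lexeme` and the helper `senses_of` are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Lexeme:
    """A lexeme groups inflected word-forms with the senses they can express."""
    name: str
    word_forms: list = field(default_factory=list)
    senses: list = field(default_factory=list)


# Values copied from the "eat" slide.
LEXICON = [
    Lexeme(
        "EAT-LEX-1",
        word_forms=["eat", "eats", "ate", "eaten"],
        senses=["eat0600", "eat0700"],
    ),
]


def senses_of(word_form, lexicon):
    """Collect the sense identifiers of every lexeme that owns this word-form."""
    found = []
    for lexeme in lexicon:
        if word_form in lexeme.word_forms:
            found.extend(lexeme.senses)
    return found


print(senses_of("ate", LEXICON))
```

In this encoding a homonymous form such as “stock” would simply be listed under several `Lexeme` entries, and synonymy would show up as two lexemes pointing at senses that are declared equivalent.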

  27. DICTIONARIES ORGANIZED BY MEANING • Thesauri • WordNet

  28. THESAURI • Dictionaries organized by topic appeared at the same time as alphabetically organized ones (Ælfric: Glossary, ~1000) • The most famous thematic dictionary: Peter Mark Roget, Thesaurus of English Words and Phrases, first published in 1852

  29. ROGET’S THESAURUS: CLASSES • ABSTRACT RELATIONS • Sections: existence, relation, quantity, order, number, time, change, causation • SPACE • MATTER • INTELLECT • VOLITION • AFFECTIONS

  30. ROGET’S THESAURUS: SECTIONS & WORD SETS • ABSTRACT RELATIONS • … IV. ORDER • 1. GENERAL: 58 Order, 59 Disorder, 60 Arrangement, 61 Derangement • 2. CONSECUTIVE: 62 Precedence, 63 Sequence, 64 Precursor, 65 Sequel, 66 Beginning, 67 End, 68 Middle

  31. OTHER THESAURI • A THESAURUS OF OLD ENGLISH (Roberts, 1995) • HISTORICAL THESAURUS OF ENGLISH (Christian Kay) • LONGMAN DICTIONARY OF SCIENTIFIC USAGE

  32. WORDNET • A lexical database created at Princeton • Freely available for research from the Princeton site • http://www.cogsci.princeton.edu/~wn/ • Information about a variety of SEMANTIC RELATIONS • Three sub-databases (supported by psychological research as early as Fillenbaum and Jones, 1965) • NOUNS • VERBS • ADJECTIVES and ADVERBS • Each database is organized around SYNSETS

  33. SYNSETS • Senses (or ‘lexicalized concepts’) are represented in WordNet by the set of words that can be used in AT LEAST ONE CONTEXT to express that sense / lexicalized concept: the SYNSET • E.g., {chump, fish, fool, gull, mark, patsy, fall guy, sucker, shlemiel, soft touch, mug} (gloss: person who is gullible and easy to take advantage of)
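A synset can be pictured as a set of words plus a gloss, with each word possibly belonging to several synsets. The Python sketch below is a toy model, not WordNet's actual storage format. The first synset is the one quoted on the slide; the second entry, including its gloss, is hypothetical and is there only to show one word with two senses. The names `SYNSETS`, `synsets_of` and `synonyms` are illustrative.

```python
# Toy model: a synset is a (set of words, gloss) pair.
SYNSETS = [
    (
        frozenset({"chump", "fish", "fool", "gull", "mark", "patsy",
                   "fall guy", "sucker", "shlemiel", "soft touch", "mug"}),
        "person who is gullible and easy to take advantage of",
    ),
    # Hypothetical second synset, added only to show a word with two senses.
    (frozenset({"fish"}), "aquatic vertebrate (illustrative gloss)"),
]


def synsets_of(word):
    """All synsets (senses) in which the word appears."""
    return [(words, gloss) for words, gloss in SYNSETS if word in words]


def synonyms(word):
    """Every other word that shares at least one synset with the word."""
    result = set()
    for words, _gloss in synsets_of(word):
        result |= words
    result.discard(word)
    return result


print(len(synsets_of("fish")))     # number of senses of "fish" in the toy data
print(sorted(synonyms("patsy")))
```

Because membership only requires interchangeability in at least one context, `synonyms` returns candidates, not words that can be swapped everywhere.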

  34. THE NOUN DATABASE • About 90,000 forms, 116,000 senses • Relations:

  35. HYPERNYMY • 2 senses of robin • Sense 1: robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast) • => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast) • => oscine, oscine bird -- (passerine bird having specialized vocal apparatus) • => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless) • => bird -- (warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings) • => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium) • => chordate -- (any animal of the phylum Chordata having a notochord or spinal column) • => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement) • => organism, being -- (a living thing that has (or can develop) the ability to act or function independently) • => living thing, animate thing -- (a living (or once living) entity) • => object, physical object • => entity, physical thing
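The nested `=>` listing is a walk up a chain of hypernym links. The Python sketch below reproduces that walk on a hand-copied table with one representative word per synset from the robin output; it does not read the WordNet database. The names `PARENT` and `hypernym_chain` are illustrative.

```python
# One representative word per synset, copied from the robin listing.
PARENT = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}


def hypernym_chain(word):
    """Follow hypernym links from the word up to the root of the hierarchy."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


# Print the chain with growing indentation, in the style of the wn output.
for depth, node in enumerate(hypernym_chain("robin")):
    prefix = "" if depth == 0 else "    " * depth + "=> "
    print(prefix + node)
```

The length of the chain gives the depth of a word in the hierarchy, a quantity often used when comparing how specific two nouns are.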

  36. MERONYMY • wn beak -holon • Holonyms of noun beak • 1 of 3 senses of beak • Sense 2: beak, bill, neb, nib • PART OF: bird

  37. VERBS • About 10,000 forms, 20,000 senses • Relations between verb meanings:

  38. RELATIONS BETWEEN VERB MEANINGS • V1 ENTAILS V2 when “Someone V1” (logically) entails “Someone V2”, e.g., snore entails sleep • TROPONYMY: when “to do V1” is “to do V2 in some manner”, e.g., limp is a troponym of walk
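Both verb relations are directed pairs, which a small lookup makes explicit. The Python sketch below stores only the two examples from the slide; the names `ENTAILS`, `TROPONYM_OF` and `relation` are illustrative and do not correspond to any WordNet interface.

```python
# Directed pairs (V1, V2), taken from the two examples on the slide.
ENTAILS = {("snore", "sleep")}
TROPONYM_OF = {("limp", "walk")}


def relation(v1, v2):
    """Name the relation that holds from v1 to v2 in the toy data, if any."""
    if (v1, v2) in TROPONYM_OF:
        return "troponym"   # to v1 is to v2 in some manner
    if (v1, v2) in ENTAILS:
        return "entails"    # someone v1 implies someone v2
    return None


print(relation("snore", "sleep"))
print(relation("limp", "walk"))
print(relation("sleep", "snore"))  # the relations do not hold in reverse
```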

  39. ADJECTIVES & ADVERBS • About 20,000 adjective forms, 30,000 senses • 4,000 adverbs, 5,600 senses • Relations:

  40. HOW TO USE IT • Online: http://cogsci.princeton.edu/cgi-bin/webwn • Download it, then from the command line: • Get synonyms: • wn bank -synsn • Get hypernyms: • wn robin -hypen • Get antonyms (also for adjectives and verbs): • wn right -antsa

  41. THE LIMITS OF WORDNET • Coverage • Words not in WordNet • crocidolite, spinoff (spin-off) • Missing information: MERONYMY • Context-dependent senses: • slump, crash, bust are all synonyms in the WSJ corpus • The structure of WordNet • Some information is encoded in complex ways (room, wall, floor) • But: it is a MOVING TARGET!

  42. MERONYMY IN WORDNET: AN EXPERIMENT • 100 bridging descriptions (BDs) in a mereological relation • Ran a script trying to find a direct link in WordNet (1.7) between one of the senses of the BD and one of the senses of any of the previous NPs • Results: in only 6 cases is there a direct lexical relation in WordNet between a BD and one of the CFs

  43. (Diagram: IS-A and PART-OF links among ARTIFACT, HOUSING, BUILDING, HOUSE, HOME, ROOM, WALL and FLOOR) • John looked at the HOUSE. The WALL was crumbling.
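The HOUSE / WALL example shows why a search for direct links finds so little: the part-whole information may only be reachable by combining PART-OF with IS-A links. The Python sketch below contrasts the two searches. The link table is an assumption, since the flattened diagram does not show which node each arrow connects: here HOUSE IS-A BUILDING, ROOM is PART-OF BUILDING, and WALL and FLOOR are PART-OF ROOM. All function names are illustrative.

```python
# Assumed reading of the diagram; the exact arrows are not recoverable.
IS_A = {"house": "building", "building": "artifact"}
PART_OF = {"room": "building", "wall": "room", "floor": "room"}


def ancestors(word):
    """The word itself plus everything it IS-A, transitively."""
    chain = [word]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def wholes(part):
    """Everything the part is PART-OF, transitively."""
    found = []
    while part in PART_OF:
        part = PART_OF[part]
        found.append(part)
    return found


def has_direct_link(part, whole):
    """The check described in the experiment: a single PART-OF link."""
    return PART_OF.get(part) == whole


def has_inherited_link(part, whole):
    """Looser check: the part belongs to something that the whole IS-A."""
    return any(w in ancestors(whole) for w in wholes(part))


print(has_direct_link("wall", "house"))     # no single link joins them
print(has_inherited_link("wall", "house"))  # found via room and building
```

Under these assumed links the bridging reference from “the WALL” to “the HOUSE” is only resolved by the second, inheritance-aware search.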

  44. SOLUTION: LEXICAL ACQUISITION • Partial (add information to WordNet, especially for specialized domains) • Total (build a new lexicon from scratch)

  45. READINGS • Jackson, ch. 8 • C. Fellbaum. WordNet: An Electronic Lexical Database. MIT Press, 1998 • ch. 1
