
Lexicons … and Complex Expressions: towards Multilingual Linking. Nicoletta Calzolari. Copenhagen, October 2001



Presentation Transcript


  1. Lexicons … and Complex Expressions: towards Multilingual Linking Nicoletta Calzolari Copenhagen, October 2001 Copenhagen, Oct. 2001

  2. What is SIMPLE? A set of 12 harmonised computational lexicons for HLT applications, geared for multilingual links. A common: • rich model • representation language • methodology of building the lexicon: common Template Types, with default obligatory info (type-defining) and indication of optional info • First time on a large scale, for so many languages • Lexical meaning represented in terms of integrated combinations of different sorts of information (semantic type, argument structure, relations, features, etc.) • Ontology-based information comes together with predicative representation and syntactic linking • A shared set of about 700 SemUs (from EWN), cross-lingually related across the 12 lexicons

  3. PAROLE/SIMPLE Architecture + CLIPS Italian National Project MuS SynU SemU SemU SemU SemU 60,000 lemmas 55,000 lemmas 55,000 SemU MuS SynU Sem Info Sem Info Sem Info Sem Info TEMPLATE Sem. Rel Sem. Feat Lexical Rel

  4. Semantic information in SIMPLE. Word senses are encoded as Semantic Units (SemUs), containing the following info: • Semantic type * • Domain * • Lexicographic gloss * • Extended Qualia structure • Reg. polysemy altern. • Event type • Derivation relations • Synonymy • Collocations • Argument structure for predicative SemUs * • Selection restrictions on the arguments * • Link of the arguments to the syntactic subcategorization frames (represented in the PAROLE lexicons) *

  5. Semantic Multidimensionality and NLP. NLP tasks (IE, WSD, NP recognition, etc.) need to access multidimensional aspects of word meaning, represented in SIMPLE with the Extended Qualia Relations: • Is_a_part_of: la pagina del libro (the page of the book) • Member_of: il difensore della Juventus (Juventus fullback) • Telic: il suonatore di liuto (the lute player) • Made_of: il tavolo di legno (the wooden table)

  6. SemU Predicate, arguments, Selection restrictions Qualia Derivation Polysemy Event Type Overall Organization ... Greek lexicon Danish lexicon Type Ontology 150 types Catalan lexicon Template Instantiation Italian lexicon Pred. Layer …

  7. Semantic Information The SIMPLE Way. The Core Ontology represents a first level of organization of the semantic type system. Each type is associated with a Template consisting of a cluster of information (relations, features, argument structure, event type, etc.) that defines the type. The information characterizing a Semantic Unit includes: a. the type-defining information (associated with the template the SemU instantiates); b. additional information (other relations or features, selectional restrictions, terminology, cross-part-of-speech relations, polysemy, etc.)

  8. Type System Coordinates Predicative Layer Qualia Structure Contextual/ Polysemy Information Template “redundancy”

  9. Template for “Perception” • Verb examples: hear, smell, etc. • Noun examples: sight, look, etc. • Linguistic tests: …. • Levin class: 30.1 (See verb, e.g. detect, see, notice), 30.4 (Stimulus subject, e.g. look, smell) • Comments: Processes involving an experiencing relation, …. • SemU: 1 <guardare_2> (look) • Usyn: • BC Number: 105 • Template_Type: [Perception] • Template_Supertype: [Psychological_event] • Domain: General • Semantic Class: Perception • Gloss: //free// osservare con attenzione • Event type: process • Pred_Rep.: Lex_Pred (<arg0>,<arg1>) • Derivation: <Derivational relation> • Selectional Restr.: arg0 = Animate //concept//, arg1: default = [Entity] • Formal: isa (1,<SemU>:[Perception]>) <percepire>:[Psych_ev] • Agentive: <Nil> • Constitutive: instrument (1, <SemU>:[Body_part]) <occhio> • intentionality = {yes,no} //optional// = {yes} • Telic: <Nil> • Collocates: Collocates (<SemU1>,...<SemUn>) • Complex: <Nil>
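
The template fields above map naturally onto a record type. The following is a minimal sketch in Python, not the SIMPLE encoding itself (the lexicons are specified through the PAROLE/SIMPLE DTD): the class name `SemU`, its field names, and the `relations` helper are hypothetical, and the values are copied from the <guardare_2> example on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class SemU:
    """Hypothetical record for one Semantic Unit; field names are illustrative."""
    name: str
    template_type: str
    template_supertype: str
    domain: str
    gloss: str
    event_type: str
    # Argument name -> semantic type that restricts it.
    selectional_restrictions: dict = field(default_factory=dict)
    # Qualia role -> list of (relation, target SemU) pairs; empty list stands for <Nil>.
    qualia: dict = field(default_factory=dict)

    def relations(self, role):
        """Return the (relation, target) pairs recorded under one qualia role."""
        return self.qualia.get(role, [])


# Values taken from the "Perception" template instance shown on the slide.
guardare_2 = SemU(
    name="guardare_2",
    template_type="Perception",
    template_supertype="Psychological_event",
    domain="General",
    gloss="osservare con attenzione",
    event_type="process",
    selectional_restrictions={"arg0": "Animate", "arg1": "Entity"},
    qualia={
        "formal": [("isa", "percepire")],
        "agentive": [],
        "constitutive": [("instrument", "occhio")],
        "telic": [],
    },
)
```

Calling `guardare_2.relations("constitutive")` returns the instrument link to <occhio>, while the roles marked <Nil> on the slide come back as empty lists.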

  10. Modular Representation of a SemU. Flexibility: an extendable framework to allow coherent future extensions & tuning for specific applications/text types. SemU: • Pred. Layer: predicate, arguments, selection restrictions, ... • Rel. Layer: relations betw. SemUs • Semantic Relations • Features • Qualia: multiple meaning dimensions in a sense • Derivation: cross-PoS relations • Polysemy: regular polysemous classes • Collocation: collocational information

  11. Top Telic Formal Constitutive Agentive Is_a Is_a_part_of Property Created_by Agentive_cause Indirect_telic Purpose ... Contains ... Instrumental Is_the_habit_of Used_for Used_as Semantic Relations .. Activity .. .. 100 Rels. • The targets of relations identify: • prototypical semantic information associated with a SemU • elements of dictionary definitions of SemUs • typical corpus collocates of the SemU

  12. Semantic Relations Ala (wing) <fabbricare> make Agentive SemU: 3232 Type: [Part] Part of an airplane <volare> fly Used_for Is_a_part_of <aeroplano> airplane Isa SemU: 3268 Type: [Part] Part of a building <parte> part Isa Used_for Isa SemU: D358 Type: [Body_part] Organ of birds for flying <edificio> building Is_a_part_of Is_a_part_of SemU: 3467 Type: [Role] Role in football <giocatore> player <uccello> bird Isa

  13. Relations and Predicates Pred_SELL<ARG0>, <ARG1>, <ARG2>, <ARG3> SemU Sell V Is_the_agent_of SemU Seller N SemU Sale N Event_noun

  14. Argument Structure Comprendere V Comprensione N SemU: 61725 Type: [Cognitive_event] To understand SemU: 61726 Type: [Cognitive_event] Understanding master SemU: 6962 Type: [Constitutive_state] To include verb_nominalization Comprendere#1 <Arg1 [Human]>, <Arg2 [ Semiotic]> Comprendere#2 <Arg1 [Group]>, <Arg2> master problems with selection restrictions !!!

  15. SIMPLE/CLIPS figures (now): 16,903 SemUs (11,000 Lex. Units) • Nouns: 12,161 • Verbs: 3,476 • Adjectives: 1,266 • Predicates: 4,368 • Templates: Instrument 734; Human 712; PsychologicalProperty 586; Profession 541; Purpose_Act 535; Part 503; Human_Group 502; Relational_Act 521; AgentTemporaryActivity 320; Domain 303 • Features & Relations: Agentive 1945; EventTypeProcess 1846; EventTypeTransition 1463; AgentiveCause 1175; Usedfor 1488; Synonym 1258; ResultingState 1197; Isapartof 909; Hasaspart 800; Istheactivityof 611; Objectoftheactivity 598; AntonymGrad 575; Createdby 525; Agentverb 454; Concerns 421

  16. Core Lexicons enlarged in National Projects. PAROLE/SIMPLE/EWN start providing the common platform • For the subsidiarity concept, the process started at the EU level is continued at the national level: extended in (at least) 9 National Projects (Danish, Greek, Italian, Portuguese, Swedish, ...) • (to be) used in applications • True infrastructure of harmonised LRs in EU • Basis for Multilingual LR ENABLER (coord. A. Zampolli)

  17. Harmonisation: Need for a Global View • Interaction/sharing of data & software/tools • Need of compatibility among various components • An “exemplary cycle”: Formalisms Grammars Software: Taggers, Chunkers, Parsers Representation Annotation Lexicon Corpora Software: Acquisition Systems I/O Interfaces Languages

  18. SIMPLE wrt EAGLES/ISLE Standards for Multilingual Lexical resources EAGLES guidelines for syntactic and semantic lexicons PAROLE/SIMPLE Lexicons MT systems ISLE recommendations for multilingual lexicons Multilingual Lexicons

  19. Mission (http://lingue.ilc.pi.cnr.it/EAGLES96/isle/ISLE_Home_Page.htm) • MT and multilingual HLT need to enhance production, maintenance & extension of computational lexical resources • ISLE goals • provide a common environment for the development, integration, interchange & sharing of lexical resources with various types of linguistic information • establish a virtuous circle betw. research, applications, & standardization process: lay down a bridge betw. the worlds of research and application • mark the boundary between well-consolidated practice and theoretical achievements in multilingual HLT, and areas still open to research but critical for future technological improvements • Crucial role of intercontinental cooperation for preparing ISLE recommendations and for their validation

  20. ISLE and MT • Academic and industrial members of the MT community actively involved in the ISLE group • Microsoft, NMSU, Sail Labs, Systran, UMIACS, UPenn, ISI, etc. • Survey phase: • a number of lexical resources for MT systems surveyed by ISLE • MT systems requirements provide the main reference points for ISLE work, to determine: • types of lexical information critical to SL → TL mapping • criteria to create bilingual resources from existing monolingual ones • common data structures to develop reusable multilingual resources • critical areas of the lexicon: MWEs, complex transfer cases, collocational/example-based information, etc. MWE parenthesis

  21. MWE in ISLE & XMELLT: 2 types of MWE. 1st: (Deverbal) nominalisations + support (light) verbs • make an acquisition1 (noun.act; verb.possession) • complete an acquisition1 • undertake an acquisition1 • make an application1 (noun/verb.communication) • have an application1 in • decide on an application1 (consider, hear) • get an application1 (receive, take) • submit an application1 (file). 2nd: Noun(/Adj/Poss)+Noun MW (Ital.: N+PP/N+Adj/N+Vinf/...), no equivalent structures • air pollution • job application • murder suspect • police action; police scandal • coltello da macellaio: butcher's knife • carta di credito: credit card • carta telefonica (adj): phone card • agenzia di viaggi: travel agency • film per adulti: adult movie (adj) • macchina da scrivere: typewriter (comp.)

  22. 1st The Boundaries: · Support Verbs: more than Light Verbs? · Nominalisations: …. to a broader set. Both verbs, combined with an event noun, whose subjects are: • participants in the event identified by the noun • related to some scenario associated with the event • Type 1: take an exam, give an exam • Type 2: pass an exam, fail an exam, grade (evaluate) an exam • Type 1: perform an operation, undergo an operation • Type 2: survive an operation. But also … enlarge the concept of nominalisation to • event/result/abstract nouns not morphologically derived • dare un ceffone (to slap) • provare rancore (to bear sb. a grudge) • fare una festa (to have a party) • fare festa (to have a holiday) • fare festa a qno (to give sb. a warm welcome) • prestare attenzione (to pay attention) • fare la guerra (to wage war) • fare una cessione (cedere) vs. make? a cession (…) • avere una cessazione (cessare) delle ostilità vs. have? a cessation of hostilities (…) No verb (for diachronic reason)

  23. 1st Hypothesis for encoding: “Mel’čuk type” Lexical Functions (LF) • to record semantic contribution and/or aspectual properties conveyed by the V • to express argument-sharing betw. 2 arg structures • Oper1: perform an operation; made an apology • Oper2: undergo an operation; merits discussion; had a visit • Func0: silence reign • Laborij: take into consideration • Incep: start the attack • Cont: maintain influence • Fin: complete the acquisition • Liqu: eradicate the disease • Real: keep the promise, approve the application • AntiReal: turn down, withdraw the application • ….

  24. 1st Nominalisations: examples from Corpus. accusa, supp-v: • formulare, lanciare, muovere, rivolgere, ... (Oper1) • subire [default], beccarsi, attirarsi, rischiare, ... (Oper2) • mettere, porre, ... sotto a. (Laborij) • rintuzzare, rigettare, smontare, … (Liqu) • Problematic?: ritorcere, rovesciare, … (...); sostenere, … (...); ripetere, … (...) ….. acquisizione, supp-v: • (fare) [default], condurre, curare, effettuare, ... (Oper1) • varare, ... (Incep) • perfezionare, completare, concludere, … (Fin) • evitare, compromettere, … (Liqu) • sfumare, … (LiquFunc0) • Problematic?: annunciare, dichiarare, … (say); decidere, proporre, promuovere, stimolare, … (...); consentire, permettere, proporre, garantire, … (...) ….. Automatic acquisition
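
The corpus examples above suggest a noun-indexed table of support verbs keyed by lexical function. A minimal sketch in Python follows, assuming a plain dictionary as the storage format; the names `SUPPORT_VERBS`, `support_verbs`, and `lexical_functions_of` are hypothetical, and only the verbs that the slide assigns to a single named function are included (the Laborij pattern, LiquFunc0, and the "Problematic?" verbs are left out).

```python
# Support verbs per noun, grouped by lexical function.
# Data copied from the slide; the dictionary layout itself is an assumption.
SUPPORT_VERBS = {
    "accusa": {
        "Oper1": ["formulare", "lanciare", "muovere", "rivolgere"],
        "Oper2": ["subire", "beccarsi", "attirarsi", "rischiare"],
        "Liqu": ["rintuzzare", "rigettare", "smontare"],
    },
    "acquisizione": {
        "Oper1": ["fare", "condurre", "curare", "effettuare"],
        "Incep": ["varare"],
        "Fin": ["perfezionare", "completare", "concludere"],
        "Liqu": ["evitare", "compromettere"],
    },
}


def support_verbs(noun, lexical_function):
    """Return the verbs listed for a noun under one lexical function."""
    return SUPPORT_VERBS.get(noun, {}).get(lexical_function, [])


def lexical_functions_of(noun, verb):
    """Return every lexical function under which the verb is listed for the noun."""
    entries = SUPPORT_VERBS.get(noun, {})
    return [lf for lf, verbs in entries.items() if verb in verbs]
```

A generation component could call `support_verbs("acquisizione", "Fin")` to pick a verb, while an analysis component could call `lexical_functions_of("accusa", "subire")` to recover the function a verb expresses.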

  25. 1st Support Verbs: what to list for multilingual lexicons? • Decide whether to include/list, for a noun: • all the verbs usable for a Mel’čukian LF • INCEP: cominciare [default] vs. varare, intraprendere, … • INCEP: begin [default] vs. open (an investigation), … • OPER1: say a prayer (not make, as with other speech act nouns) • OPER1: pay attention • only those lexically dedicated to that noun (needed for generation), not the general ones available by default for a LF • begin an exam/operation or finish an exam/operation • similar words preferentially select different verbs to express similar meanings (same lexical functions): lexical preference

  26. 2nd Complex nominals in a multilingual framework • Different syntactic patterns in L1 & L2 • N+Nh (= head noun) in English is usually Nh+PP in Italian • tooth brush: spazzolino da denti • & the syntactic pattern is not predictable • hair/clothes brush: spazzola per capelli/abiti • nail brush: spazzola per le unghie • travel agency: agenzia di viaggi • real estate agency: agenzia immobiliare • marriage bureau: agenzia matrimoniale • A MWE in L1 corresponding to a fully compositional phrase • cucchiaino da caffè: coffee spoon??? • For MT this implies some conceptual (interlingual?) representation • but the “encoding” process must find an appropriate MWE if it is called for • analogous to blocking/pre-emption: a regular/compositional process is not carried out (dispreferred) because the semantic space occupied by the concept associated with that formation is already claimed by some ready-made expression (Fillmore)

  27. 2nd Broader scope: extension to non-MWE? If we look at the devices in grammar that allow new MWEs to be produced, there is a continuum: N+PP > collocation > multi-word > idiom • productive mechanisms in the language • but idiosyncratic information at the borderline betw. grammar & lexicon. Amounts to: • describe productive modification relations of N in general • in particular those lexically selected/preferred by a N (its semantic paradigm); MWE are a subset of these (give good hints to discover most prominent relations??) • look at the semantic structure of Nouns, i.e. at the variety of modifiers they can select by virtue of their meaning (Fillmore)

  28. 2nd Noun Compounds/Complex Nominals … are pervasive (Fillmore, Busa) • There is a motivation in most N+N constructions: • the context provides it • The FrameNet (SIMPLE) way • appeal to specific frame structures (qualia structures) associated with the head noun • determine from corpus attestations which frame elements (qualia) can get instantiated as a modifier word • “container”: complex nominals can specify: • material (aluminium c., glass c., …) • contents (food c., trash c., …) • size (3 quart c., …) • function (shipping c., storage c., …) • ...

  29. 2nd Noun Compounds/Complex Nominals & multidimensional semantic approaches. a. FrameNet Container Frame: Frame Elements: Material, Contents, Size, Function • Material: aluminum container, glass c., metal c., tin c. • Contents: food container, beverage c., trash c., water c., milk c., fuel c. • Size: 3 quart container • Function: shipping container, storage c. b. SIMPLE Qualia Relations of "container" used in compounds: • Constitutive: made_of [MATERIAL] aluminum container, glass c., metal c., tin c. • Telic: contains [ENTITY] food container, beverage c., trash c., water c., milk c., fuel c. • Constitutive: size [QUANTITY] 3 quart container • Telic: is_used_for [EVENT] shipping container, storage c.
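
The two descriptions of "container" line up one to one, which can be written down as a small lookup table. The sketch below is illustrative only: the table `FRAME_ELEMENT_TO_QUALIA` restates the slide's correspondence, while `MODIFIER_FRAME_ELEMENT` is a hypothetical toy lexicon (in practice this information would come from corpus attestations), and `interpret` is an assumed helper name.

```python
# FrameNet frame element -> (SIMPLE qualia role, relation, semantic type of modifier),
# as read off the "container" example on the slide.
FRAME_ELEMENT_TO_QUALIA = {
    "Material": ("Constitutive", "made_of", "MATERIAL"),
    "Contents": ("Telic", "contains", "ENTITY"),
    "Size": ("Constitutive", "size", "QUANTITY"),
    "Function": ("Telic", "is_used_for", "EVENT"),
}

# Hypothetical toy lexicon: which frame element each modifier of "container" fills.
MODIFIER_FRAME_ELEMENT = {
    "aluminum": "Material",
    "glass": "Material",
    "food": "Contents",
    "trash": "Contents",
    "3 quart": "Size",
    "shipping": "Function",
    "storage": "Function",
}


def interpret(modifier):
    """Describe a '<modifier> container' compound in both vocabularies."""
    frame_element = MODIFIER_FRAME_ELEMENT[modifier]
    role, relation, sem_type = FRAME_ELEMENT_TO_QUALIA[frame_element]
    return {
        "frame_element": frame_element,
        "qualia_role": role,
        "relation": relation,
        "modifier_type": sem_type,
    }
```

For instance, `interpret("glass")` yields the Material frame element together with the Constitutive `made_of` relation, so the same compound can be read through either resource.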

  30. 2nd Complex Nominals/Lexical Constructions in a multilingual context … describe vs. list? • if a compound noun is clearly lexicalized, it's simply one of the words in L1 • but if it is an instance of some productive word-formation rule, we should describe it. Both describe & list: • list explicitly in the lexical entry what is idiomatic/idiosyncratic wrt generation, for • lexical selection • mucca pazza vs. matta • prestare attenzione vs. pay attention • structural pattern • travel agency: agenzia di viaggi • marriage bureau: agenzia matrimoniale (*di matrimonio) • real estate agency: agenzia immobiliare • but also an apparatus to describe how word semantics of Ns interact when they co-occur (co-selection, co-composition, ...)

  31. 2nd In a multilingual context … regularities in each language, but they don’t match • Both for decoding & encoding, we need both: • a linguistic apparatus for interpretation (e.g. to go to a language where it is not a MWE: cucchiaino da caffè, for a Japanese useful to know … “used for”) • lists for idioms …, for unpredictable/idiosyncratic • Same apparatus to interpret both MWE & regular N constructions (similar power of expressiveness): general principles of semantic constitution of lex. items & their combinatorics in terms e.g. of frames/qualia/…: • basic sem. notions & • a general schema to characterise the problem, e.g. • frame (qualia) structure of the head N • semantic Type of the modifier N • allow the head N to impose its interpretation on the modification rel. • ...

  32. 2nd Complex nominals, e.g. knife (coltello) triggers • a “cutting frame” (FrameNet) • specific SIMPLE dimensions of meaning • extensively evaluate whether qualia roles (already) encoded in SIMPLE correspond to what is necessary to interpret N-N modification relations. SIMPLE Extended Qualia structure for the interpretation of the semantic relation betw. Ns (internal relational structure of MWE) • butcher’s knife (coltello da macellaio) TELIC (used_by) Y [Human] PPda • plastic knife (coltello di plastica) CONST (made_of) X [Material] PPdi • table knife (coltello da tavola) TELIC (used_in) Z [Location] PPda • hunting knife (coltello da caccia) TELIC (used_in_activity) E [Activity] PPda • piatto di legno CONST (made_of) X [Material] PPdi • piatto di pasta CONST (contains) X [Food] PPdi PP disambig.
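
The "PP disambig." note can be illustrated with a small rule table: the same preposition (di) is read as made_of or contains depending on the semantic type of the modifier noun. The following Python sketch is an assumption-laden toy, not the SIMPLE mechanism: `RULES` restates the six examples on the slide, `SEM_TYPE` is a hypothetical type lexicon, and the sketch ignores the head noun, whereas the slides propose letting the head noun's qualia structure constrain the reading.

```python
# Semantic type of the modifier -> (qualia role, relation, Italian preposition),
# restating the knife/piatto examples on the slide.
RULES = {
    "Human": ("TELIC", "used_by", "da"),
    "Material": ("CONST", "made_of", "di"),
    "Location": ("TELIC", "used_in", "da"),
    "Activity": ("TELIC", "used_in_activity", "da"),
    "Food": ("CONST", "contains", "di"),
}

# Hypothetical toy lexicon of semantic types for the modifier nouns.
SEM_TYPE = {
    "macellaio": "Human",
    "plastica": "Material",
    "tavola": "Location",
    "caccia": "Activity",
    "legno": "Material",
    "pasta": "Food",
}


def interpret_pp(prep, modifier):
    """Return (qualia role, relation) for 'N prep modifier', or None if no rule fits."""
    sem_type = SEM_TYPE.get(modifier)
    if sem_type is None:
        return None
    role, relation, expected_prep = RULES[sem_type]
    # The rule applies only when the preposition matches the one attested for this type.
    if prep != expected_prep:
        return None
    return (role, relation)
```

With this table, `interpret_pp("di", "legno")` and `interpret_pp("di", "pasta")` give different relations for the same surface pattern "piatto di N", which is the disambiguation the slide points at.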

  33. 2nd In SIMPLE: possible extension • Deverbal nominalisation: • noun murder (uccisione, delitto, omicidio (different sem. pref.)) PPdi PRED: MURDER (uccidere) PPda_parte_di, di ARG1: agent [Hum/Anim?] • verb murder (uccidere) ARG2: patient [Hum/Anim?] subj: NP MOD1: instr [Weapon] obj: NP MOD2: means [Action] MOD3: ... [...] :instr: PPcon [Weapon] (knife m., con coltello) :means: PPper [Action] (strangulation m., per strangolamento) :loc: Ppploc|di [Location] (Kent State murders, nel ...) :time: Ppptime|di [Time] (1983 murders, del 1983) As if it were a Situation

  34. … Monolingual Linguistic Representation. Strategy: • consider as the starting point for MILE the edited union of the basic notions represented in the existing syntactic/semantic lexicons (their models) • evaluate their notions wrt EAGLES recommendations for syntax and semantics • evaluate their usefulness & adequacy for multilingual tasks • evaluate integrability of their notions in a unitary MILE • look for deficient areas, e.g. MWE • ... To be decided: should ISLE reach a consensus at the level of the “types” of information only, or also at the level of their “token” values? …. different answers for diff. notions

  35. … the Multilingual ISLE Lexical Entry (MILE) • General methodological principles (from EAGLES): • Basic requirements for the MILE: • Discover and list the (maximal) set of basic notions needed to describe the MILE (up to which level is standardisation feasible?) • Granularity • The leading principle for the design of the MILE: the edited union of existing lexicons/models (redundancy is not a problem) • Modular and layered: various degrees of specification possible • Allow for underspecification (& hierarchical structure)

  36. The MILE • Main features • factor out primitive units of lexical information • explicit representation of information to be targeted by multilingual NLP tools • rely on lexical analyses with the highest degree of inter-theoretical agreement • avoid framework-specific representational solutions • open to different paradigms of multilinguality • oriented to the creation of large-scale lexical databases

  37. MILE • Objective: definition of the MILE • as a meta-entry to act as a common format for resource sharing and integration/architecture for lexical data encoding • its basic notions • general architecture • formalized as an entity-rel. model (XML, RDF, etc.) • with a tool to support it • open to task- & system-dependent parameterisation

  38. Agreed Principles • MILE builds on the monolingual entry & expands it • MILE incorporates previous EAGLES recommendations • is the “complete” entry • adopt as starting point the PAROLE/SIMPLE DTD • to be revised, augmented, ... We consider 2 broad categories of applications: • MT • CLIR (linking module may be simpler/ontology based) • (label info types wrt application)

  39. Modularity in MILE • Advantages: • Flexibility of representation • Easy to customise and update • Easy integration of existing resources • High versatility towards different applications. Modularity in at least three respects: • in the macrostructure and general architecture of the MILE • in the microstructure of the MILE • monolingual linguistic representation (previous EAGLES revised/updated) • collocational/corpus-driven information (new) • multilingual apparatus (e.g. transfer conditions and actions; interlingua) (new) • in the specific microstructure of the MILE word-sense

  40. Meta-information Architecture 1. Coarse-grained 2. Fine-grained 1. Monolingual 2. Collocational 3. Multilingual Modularity in MILE A. MILE Macrostructure C. Word-Sense Microstructure MILE B. MILE Microstructure

  41. The MILE Architecture: Monolingual Lexical Description • three independent and yet linked layers characterising the MILE in a source language • possibly corresponds to the typology of information contained in major existing lexicons, such as PAROLE-SIMPLE, (Euro)WordNet, COMLEX, FrameNet, etc. • simple and complex lexical unit (to account for MWEs) • various degrees of granularity of lexical units representation semantic layer correspondence conditions syntactic layer morphological layer

  42. The MILE Architecture: Multilingual Layer • acts as an (independent) interface layer between monolingual lexicons multilingual layer semantic layer correspondence conditions syntactic layer Lexicon 1 Lexicon 2 morphological layer

  43. The MILE Multilingual Layer …. (NEW) • Correspondences can be established between different types of linguistic objects (strings, syntactic descriptions, semantic elements, predicates, etc.) • Transfer tests and actions to target various types of lexical information in the monolingual layers • constrain syntactic positions and their fillers • lexicalize syntactic positions • add positions or arguments • add new features to define more fine-grained sense distinctions relevant at the multilingual level • restructuring argument configurations • collocational information • ...

  44. Paths to Discover the Basic Notions of MILE: a list of critical information types that will compose each module of the MILE • clues in dictionaries to decide on target equivalent • guidelines for lexicographers • clues (to disambiguate/translate) in corpus concordances • lexical requirements from various types of transfer conditions and actions in MT systems • lexical requirements from interlingua-based systems • …

  45. Organisational Proposal: division of labour • Highlighted some hot issues & assigned tasks: • sense indicators (EU) • selection preferences (EU) • lexicographic relevance (EU) • argument structure (US) • MWE (EU & US) • collocations & parallel corpora (US) • modifiers (EU) • semantic relations (EU) • transfer conditions (EU & US) • collocational patterns (US) • ontology (US) • metaphors (EU) • interlingua requirements (US) • spoken lexicon (EU) • meta-representation (US & EU) • ...

  46. Organisational Proposal. The tasks will lead to: • an in-depth analysis of each area aiming at identifying: • the most stable solutions adopted in the community • linguistic specifications and criteria • possible representational solutions, their compatibility, etc. • evaluation of their respective weight/importance in a multilingual lexicon (towards a layered approach to recommendations) • open issues and current boundaries of the state-of-the-art (which cannot be standardised yet) • model limitations through creation of a sample dictionary • … • see how the various pieces fit together & can be merged in a unified proposal • evaluate if we can combine in a “hybrid super-model” the transfer & interlingua approaches

  47. Information Types: examples. Selectional preferences: • How to represent them (e.g. features, reference to an ontology, word-senses, etc.) • Different status of the preferences • Criteria to identify them • Expressive limits of existing formal resources. Ontology: • Architectural issues (types of ontologies: e.g. taxonomies, “Qualia”-based type systems, etc.) • Inheritance • Which roles for ontologies in the MILE • Representational issues • Customisation and development criteria. Transfer conditions and actions: • Identification of categories of transfer phenomena • Ranking of hard cases • Possible parameterisation wrt language types • How to formalise them • Types of actions

  48. CLWG Ongoing Activities … to prepare a preliminary proposal of the MILE: • existing models for lexical representation and data interchange (Genelex, Olif, etc.) are explored • model limitations and expressive power are tested through creation of sample entries in a few languages • groups at work • lexical description and information: types of relevant info • lexicographic exploration: systematic summary & classification of types of transfer tests (also extracted from MRDs) • multilingual correspondences • lexical data modeling: format & representation issues • tool development

  49. Representation issues • Working with GENELEX, lexicon development work is (can be) affected by: • impossibility (or difficulty) of defining abstract and general classes or types of objects • lack of inheritance mechanisms • lack of default expression and default rewriting mechanisms. Cf. lexical templates in SIMPLE: • not included in the GENELEX data-structure • implemented in the editing sw. tool • very useful to capture relevant lexical generalizations, enhance consistency in encoding, speed up lexicographers’ work, etc.

  50. CLWG Ongoing Activity MILE Lexical Objects Formal Specifications MILE Lexical Entry Formal Specifications MILE Shared Lexical Objects User Defined Lexical Objects Monolingual & Multilingual Lexicons
