Outline

KYOTO (ICT-211423): Yielding Ontologies for Transition-Based Organization. Intelligent Content and Semantics. WordNet-LMF. Monica Monachini – CNR-ILC. Outline: Background: a KYOTO format for lexical resources; WordNet-LMF; The KYOTO Lexical Grid. KYOTO: the lexical resource perspective.

Presentation Transcript


  1. KYOTO (ICT-211423): Yielding Ontologies for Transition-Based Organization • Intelligent Content and Semantics • WordNet-LMF • Monica Monachini – CNR-ILC • Monica Monachini – 1° KYOTO review – Luxembourg 3/17/2009

  2. Outline • Background: a KYOTO format for lexical resources • WordNet-LMF • The KYOTO Lexical Grid

  3. KYOTO: the lexical resource perspective • KYOTO objectives: • “… facilitating the exchange of information across languages, domains and cultures” • “… allow definition of word meaning in a shared Wiki platform” • From the point of view of linguistic resources, this implies the need to share lexical and knowledge bases, both general and domain-related, in the form of lexical repositories and ontologies

  4. A common representation format for WordNets • [Diagram labels: WnJP, WnIT, WnNL, WnEN, WnES, WnEU, WnCH] • Seven WordNets: • similar but not identical → hampered interoperability • to be accessed both intra- and inter-linguistically → need to support easier integration • Endow WordNet with a representation format that allows easy access, integration and interoperability among resources

  5. Standards and Interoperability: State of the Art (SoA) • To be achieved: • Existing standards developed in isolation (not widely accepted) • Disagreement concerning theories/linguistic annotation • Lack of standard representation format(s)/framework(s) • Lack of accessibility • Achievements: • Subcommittee devoted to standards for linguistic annotation • Catalogues of linguistic categories and annotation schemas • Interest group (ACL) for developing standard annotation of language data • Efforts towards interlinked resources • Harmonized systems and frameworks • International conferences/workshops • EU-funded common resources and technology infrastructure; roadmap for achieving interoperability

  6. LMF • Specifically designed to accommodate as many models of lexical representation as possible • Its pros: • Meta-model: a high-level specification (ISO 24613) • Data Category Registry: low-level specifications (ISO 12620)

  7. Main Features • Not a monolithic model but rather a modular framework • The LMF library provides the hierarchy of lexical objects (with structural relations among them) • The Data Category Registry provides a library of descriptors to encode the linguistic information associated with lexical objects (N.B. Data Categories can also be user-defined)

  8. [Diagram] • Structural skeleton to represent the basic hierarchy of a lexicon • Components required to describe additional classes and relations

  9. DCR

  10. Centralized WordNet DC Registry • A list of 85 semantic relations, resulting from a mapping of the KYOTO WordNet grid • Intra-WN • Inter-WN

  11. Principles of WordNet-LMF • A balance between: • Maintaining adherence to the architectural principles of LMF • The main conceptual building blocks and the structural relationships between them are maintained • The expression of linguistic information (synset relations) falls in the realm of Data Categories (DCs) • Adapting standard LMF to suit efficiency needs • Promote feat-att structures to element attributes • Use of bracketing elements
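
The "promote feat-att structures to element attributes" principle can be illustrated with a small sketch. It assumes that standard LMF carries information in generic `feat` child elements with `att` and `val` attributes; the sample values (`fire`, `n`) and the helper name `promote_feats` are hypothetical and not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Assumed standard-LMF style: generic <feat att="..." val="..."/> children.
# The values below are illustrative only.
LMF_STYLE = """
<Lemma>
  <feat att="writtenForm" val="fire"/>
  <feat att="partOfSpeech" val="n"/>
</Lemma>
"""


def promote_feats(element):
    """Turn each direct <feat att=... val=.../> child into an attribute of its parent."""
    for feat in list(element.findall("feat")):
        element.set(feat.get("att"), feat.get("val"))
        element.remove(feat)
    element.text = None  # drop the whitespace left behind by the removed children
    return element


lemma = promote_feats(ET.fromstring(LMF_STYLE))
# The result is a compact, attribute-based Lemma element in the WordNet-LMF style.
print(ET.tostring(lemma, encoding="unicode"))
```

The attribute-based form is shorter and can be constrained directly by `ATTLIST` declarations, which is presumably the efficiency gain the slide refers to.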

  13. Diagram of the WordNet-LMF format • Data Categories • [Class diagram, elements by level: LexicalResource; GlobalInformation, Lexicon, SenseAxes; Meta, LexicalEntry, Synset, SenseAxis; Lemma, Sense, Definition, SynsetRelations, MonolingualExternalRefs, InterlingualExternalRefs; Statement, SynsetRelation, MonolingualExternalRef, InterlingualExternalRef; edges labelled with the multiplicities 0..1, 1..1, 0..* and 1..*]

  14. WordNet-LMF administrative and core packages • Representation of synset variants • The triplets encode the basic building blocks • Example: WN3.0 <footprint_1 footmark_1> 06645039-n • MonolingualExternalRef links a Sense to another resource
      <?xml version='1.0' encoding="UTF-8"?>
      <!ELEMENT LexicalResource (GlobalInformation, Lexicon+, SenseAxes?)>
      <!ELEMENT GlobalInformation EMPTY>
      <!ATTLIST GlobalInformation
          label CDATA #IMPLIED>
      <!ELEMENT Lexicon (LexicalEntry+, Synset*)>
      <!ATTLIST Lexicon
          languageCoding CDATA #FIXED "ISO 639-3"
          label CDATA #IMPLIED
          language CDATA #REQUIRED
          owner CDATA #REQUIRED
          version CDATA #REQUIRED>
      <!ELEMENT LexicalEntry (Meta?, Lemma, Sense*)>
      <!ATTLIST LexicalEntry
          id ID #IMPLIED>
      <!ELEMENT Lemma EMPTY>
      <!ATTLIST Lemma
          writtenForm CDATA #IMPLIED
          partOfSpeech CDATA #REQUIRED>
      <!ELEMENT Sense (Meta?, MonolingualExternalRefs?)>
      <!ATTLIST Sense
          id ID #REQUIRED
          synset IDREF #REQUIRED>
      <!ELEMENT MonolingualExternalRefs (MonolingualExternalRef+)>
      <!ELEMENT MonolingualExternalRef (Meta?)>
      <!ATTLIST MonolingualExternalRef
          externalSystem CDATA #REQUIRED
          externalReference CDATA #REQUIRED
          relType (at|plus|equal) #IMPLIED>
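
As a rough illustration of how synset variants are encoded, the sketch below parses a small hand-written instance and groups senses by the synset they point to. The instance is only parsed, not validated against the DTD, and the `Synset` children are omitted for brevity. The `eng-30-` identifier prefix (XML IDs cannot start with a digit), the attribute values and the function name `variants_by_synset` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance modelled on the slide's example
# WN3.0 <footprint_1 footmark_1> 06645039-n.
SAMPLE = """
<LexicalResource>
  <GlobalInformation label="hypothetical sample"/>
  <Lexicon languageCoding="ISO 639-3" language="eng" owner="example" version="0.1">
    <LexicalEntry id="footprint">
      <Lemma writtenForm="footprint" partOfSpeech="n"/>
      <Sense id="footprint_1" synset="eng-30-06645039-n"/>
    </LexicalEntry>
    <LexicalEntry id="footmark">
      <Lemma writtenForm="footmark" partOfSpeech="n"/>
      <Sense id="footmark_1" synset="eng-30-06645039-n"/>
    </LexicalEntry>
    <Synset id="eng-30-06645039-n" baseConcept="1"/>
  </Lexicon>
</LexicalResource>
"""


def variants_by_synset(root):
    """Map each synset id to the (writtenForm, sense id) pairs that reference it."""
    variants = {}
    for entry in root.iter("LexicalEntry"):
        written_form = entry.find("Lemma").get("writtenForm")
        for sense in entry.findall("Sense"):
            key = sense.get("synset")
            variants.setdefault(key, []).append((written_form, sense.get("id")))
    return variants


resource = ET.fromstring(SAMPLE)
variants = variants_by_synset(resource)
for synset_id, members in variants.items():
    print(synset_id, members)
```

Each variant lives in its own `LexicalEntry`; the synset membership is expressed only through the `synset` reference on `Sense`.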

  15. WordNet-LMF semantic level • Representation of synsets and synset relations • Synset clusters together senses of different Lexical Entries • Example: WN3.0 <footprint_1 footmark_1> 06645039-n • SynsetRelations represents the various relations holding between synsets • relType takes its values from the harmonized KYOTO Data Categories
      <!ELEMENT Synset (Meta?, Definition?, SynsetRelations, MonolingualExternalRefs)>
      <!ATTLIST Synset
          id ID #REQUIRED
          baseConcept (1|2|3) #REQUIRED>
      <!ELEMENT Definition (Statement*)>
      <!ATTLIST Definition
          gloss CDATA #REQUIRED>
      <!ELEMENT Statement EMPTY>
      <!ATTLIST Statement
          example CDATA #REQUIRED>
      <!ELEMENT SynsetRelations (SynsetRelation+)>
      <!ELEMENT SynsetRelation (Meta?)>
      <!ATTLIST SynsetRelation
          target IDREF #REQUIRED
          relType CDATA #REQUIRED>
      <!ELEMENT MonolingualExternalRefs (MonolingualExternalRef+)>
      <!ELEMENT MonolingualExternalRef (Meta?)>
      <!ATTLIST MonolingualExternalRef
          externalSystem CDATA #REQUIRED
          externalReference CDATA #REQUIRED
          relType (at|plus|equal) #IMPLIED>
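
The semantic level can be sketched in the same way. The fragment below is hypothetical: the synset ids, glosses and the relation names `has_hyperonym` and `has_hyponym` are placeholders, not values confirmed by the slides, and the fragment is parsed without DTD validation (it omits `LexicalEntry` and `MonolingualExternalRefs`). The helper names are likewise invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with two synsets that point at each other.
SAMPLE = """
<Lexicon languageCoding="ISO 639-3" language="eng" owner="example" version="0.1">
  <Synset id="syn-footprint" baseConcept="3">
    <Definition gloss="placeholder gloss for the footprint synset">
      <Statement example="placeholder example sentence"/>
    </Definition>
    <SynsetRelations>
      <SynsetRelation target="syn-mark" relType="has_hyperonym"/>
    </SynsetRelations>
  </Synset>
  <Synset id="syn-mark" baseConcept="3">
    <SynsetRelations>
      <SynsetRelation target="syn-footprint" relType="has_hyponym"/>
    </SynsetRelations>
  </Synset>
</Lexicon>
"""


def relation_triples(root):
    """List (source synset, relation type, target synset) for every SynsetRelation."""
    return [
        (synset.get("id"), rel.get("relType"), rel.get("target"))
        for synset in root.iter("Synset")
        for rel in synset.iter("SynsetRelation")
    ]


def dangling_targets(root):
    """Return relation targets that do not match any Synset id in the fragment."""
    known = {synset.get("id") for synset in root.iter("Synset")}
    return [target for (_, _, target) in relation_triples(root) if target not in known]


lexicon = ET.fromstring(SAMPLE)
triples = relation_triples(lexicon)
dangling = dangling_targets(lexicon)
print(triples)
print(dangling)
```

Because `target` is declared as `IDREF`, a validating parser would reject dangling targets; `dangling_targets` performs the same check by hand, since `xml.etree.ElementTree` does not validate.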

  16. WordNet-LMF multilingual level • Representation of cross-lingual synset relations • SenseAxis groups together monolingual synsets that correspond to each other and share the same relations to English • relType specifies the type of correspondence • InterlingualExternalRefs: link to ontology/(ies) • Example: IWN <fuoco_1, fiamma_1> 00001251-n • SWN <fuego_3, llama_1> 09686541-n • WN3.0 <fire_1 flame_1 flaming_1> 13480848-n
      <!ELEMENT SenseAxes (SenseAxis+)>
      <!ELEMENT SenseAxis (Meta?, Target+, InterlingualExternalRefs?)>
      <!ATTLIST SenseAxis
          id ID #REQUIRED
          relType CDATA #REQUIRED>
      <!ELEMENT Target EMPTY>
      <!ATTLIST Target
          ID CDATA #REQUIRED>
      <!ELEMENT InterlingualExternalRefs (InterlingualExternalRef+)>
      <!ELEMENT InterlingualExternalRef (Meta?)>
      <!ATTLIST InterlingualExternalRef
          externalSystem CDATA #REQUIRED
          externalReference CDATA #REQUIRED
          relType (at|plus|equal) #IMPLIED>
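
A minimal sketch of the multilingual level, built on the slide's fire/fuoco/fuego example. The language prefixes on the `Target` ids (`ita-`, `spa-`, `eng-30-`), the axis id `sa_fire`, the `relType` value `eq_synonym` and the function name `equivalents` are assumptions introduced for illustration; the slides give only the bare synset offsets.

```python
import xml.etree.ElementTree as ET

# Hypothetical SenseAxes instance linking the Italian, Spanish and English synsets
# shown on the slide (00001251-n, 09686541-n, 13480848-n).
SAMPLE = """
<SenseAxes>
  <SenseAxis id="sa_fire" relType="eq_synonym">
    <Target ID="ita-00001251-n"/>
    <Target ID="spa-09686541-n"/>
    <Target ID="eng-30-13480848-n"/>
  </SenseAxis>
</SenseAxes>
"""


def equivalents(root, synset_id):
    """Return the other synsets that share a SenseAxis with synset_id."""
    result = []
    for axis in root.iter("SenseAxis"):
        targets = [target.get("ID") for target in axis.findall("Target")]
        if synset_id in targets:
            result.extend(t for t in targets if t != synset_id)
    return result


axes = ET.fromstring(SAMPLE)
italian_equivalents = equivalents(axes, "ita-00001251-n")
print(italian_equivalents)
```

Since one `SenseAxis` can hold any number of `Target` elements, a single axis can tie together all seven wordnets without requiring pairwise links.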

  17. Kyoto Knowledge Base • [Diagram labels: Ontology; Domain; WnJP, WnIT, WnNL, WnES, WnEN, WnEU, WnCH]
