This paper argues that ontologies need linguistic grounding, both to make them easier for developers to build and to support automatic information extraction. W3C standards such as RDFS, OWL, and SKOS do not adequately capture the linguistic relationships this requires. The authors propose a unified model, LexInfo, which integrates the earlier LingInfo and LexOnto models on top of the Lexical Markup Framework (LMF), keeping the linguistic and ontological levels separate while allowing them to be coupled flexibly. They present LexInfo as a basis for future Semantic Web standardization.
Towards Linguistically Grounded Ontologies
Paul Buitelaar, Philipp Cimiano, Peter Haase, and Michael Sintek
Proceedings of the 6th European Semantic Web Conference (ESWC’09), Heraklion, Greece, May/June 2009, pp. 111-125
Buitelaar et al.

1 Introduction
• Ontologies need linguistic grounding because:
  • Easier for human developers
  • Automatic information extraction is easier
  • Helps in “verbalizing” an ontology
• RDFS, OWL, and SKOS (W3C standards) are not adequate
• Present a unified model, LexInfo, based on:
  • LingInfo
  • LexOnto
  • Lexical Markup Framework (LMF) – ISO standard
• Basis for future Semantic Web standardization

2 Motivation
• Separation between Linguistic and Ontological Level
• Flexible Coupling of the Ontological and Language Systems
• Subcategorization and Predicate-Argument Structure
• Why Related Work is Not Enough

Separation of Levels
• rdfs:label is not good enough:
    <rdfs:Class rdf:about="#Cat">
      <rdfs:label xml:lang="en">cat</rdfs:label>
      <rdfs:label xml:lang="en">cats</rdfs:label>
      <rdfs:label xml:lang="de">Katze</rdfs:label>
      <rdfs:label xml:lang="de">Katzen</rdfs:label>
    </rdfs:Class>
• Fails to capture linguistic relationships (e.g., that “cats” is the plural of “cat”)
• Linguistic data does not belong in the domain ontology
• Capture it in a separate linguistic model: a lexicon
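
The separation argued for on this slide can be sketched in a few lines of Python. This is an illustration only, not the LexInfo or LMF data model: the class names `LexicalEntry` and `WordForm`, the `example.org` URI, and the `entries_for` helper are all assumptions made for the example. The point is that singular/plural relations live in the lexicon, while the ontology side is just a class identifier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class WordForm:
    written: str   # surface form as it appears in text
    number: str    # "singular" or "plural"


@dataclass
class LexicalEntry:
    lemma: str
    language: str
    part_of_speech: str
    sense: str     # identifier of the ontology class the entry refers to
    forms: List[WordForm] = field(default_factory=list)


# Ontology side (hypothetical URI): only the class, no labels attached to it.
CAT = "http://example.org/ontology#Cat"

# Lexicon side: morphology is recorded here, outside the domain ontology.
LEXICON = [
    LexicalEntry("cat", "en", "noun", CAT,
                 [WordForm("cat", "singular"), WordForm("cats", "plural")]),
    LexicalEntry("Katze", "de", "noun", CAT,
                 [WordForm("Katze", "singular"), WordForm("Katzen", "plural")]),
]


def entries_for(surface: str) -> List[LexicalEntry]:
    """Return every lexical entry that has `surface` as one of its word forms."""
    return [e for e in LEXICON if any(f.written == surface for f in e.forms)]
```

With this layout, `entries_for("Katzen")` yields the German entry with lemma "Katze", and both entries point at the same ontology class, which flat `rdfs:label` values cannot express.
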

Flexible Coupling of Layers
• Options for ‘Schweineschnitzel’ (pork cutlet):
  • ‘Schweineschnitzel’ => class Schweineschnitzel
  • ‘Schweineschnitzel’ =>
    • ‘schnitzel’ => class schnitzel
    • ‘schnitzel’ => class schnitzel and ‘Schweine’ => pork
• Need flexibility in how linguistic units relate to ontology elements
• The two levels need not be “fully synchronized”
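
One way to picture the flexible coupling is a lookup that grounds the whole term when the ontology has a class for it and otherwise falls back to whichever components are modeled. The sketch below is a toy under stated assumptions: the `COMPONENTS` table, the `ground` function, and the `ex:` identifiers are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical lexicon fragment: a compound and its components, in order.
COMPONENTS: Dict[str, List[str]] = {
    "Schweineschnitzel": ["Schweine", "schnitzel"],
}


def ground(term: str, senses: Dict[str, str]) -> List[str]:
    """Return the ontology classes that `term` can be linked to.

    `senses` maps lexical units to ontology classes for one particular
    ontology. The whole term wins if it is mapped; otherwise every
    component that happens to be mapped is returned.
    """
    if term in senses:
        return [senses[term]]
    return [senses[c] for c in COMPONENTS.get(term, []) if c in senses]


# Three ontologies with different granularity (identifiers are made up).
fine_grained = {"Schweineschnitzel": "ex:Schweineschnitzel"}
head_only = {"schnitzel": "ex:Schnitzel"}
head_and_modifier = {"schnitzel": "ex:Schnitzel", "Schweine": "ex:Pork"}
```

The same lexicon entry works against all three ontologies: `ground("Schweineschnitzel", fine_grained)` returns the dedicated class, `head_only` yields just the schnitzel class, and `head_and_modifier` yields both the pork and the schnitzel class.
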

Subcategorization and Predicate Arguments
• Part-of-speech information is essential:
  • (Germany, capital, Berlin) – capital is a noun
• Need subcategorization frames:
  • (Rhein, flowsThrough, Karlsruhe) – flow is intransitive, requires a “through” phrase, flow => flows
• Must capture variation of expression:
  • locatedAt: passes by, connects, goes through
• Map verb arguments to predicate arguments:
  • [The A8: subject] connects [Karlsruhe: direct object] => (Karlsruhe, locatedAt, A8)
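
The argument mapping on this slide can be sketched as a small frame object that records, for each syntactic slot of a verb, which position of the ontology triple it fills. The names `SubcatFrame`, `arg_map`, `to_triple`, and the slot labels such as `pp_through` are assumptions made for this illustration, not LexInfo vocabulary.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SubcatFrame:
    lemma: str                # verb lemma, e.g. "connect"
    predicate: str            # ontology property the verb expresses
    arg_map: Dict[str, str]   # syntactic slot -> "domain" or "range" of the property


def to_triple(frame: SubcatFrame, args: Dict[str, str]) -> Tuple[str, str, str]:
    """Turn parsed verb arguments into a (subject, predicate, object) triple."""
    by_position = {frame.arg_map[slot]: value for slot, value in args.items()}
    return (by_position["domain"], frame.predicate, by_position["range"])


# "connect" is transitive; its syntactic subject fills the *range* of locatedAt.
connect = SubcatFrame("connect", "locatedAt",
                      {"subject": "range", "direct_object": "domain"})

# "flow" is intransitive and needs a prepositional phrase headed by "through".
flow = SubcatFrame("flow", "flowsThrough",
                   {"subject": "domain", "pp_through": "range"})

print(to_triple(connect, {"subject": "A8", "direct_object": "Karlsruhe"}))
# -> ('Karlsruhe', 'locatedAt', 'A8')
print(to_triple(flow, {"subject": "Rhein", "pp_through": "Karlsruhe"}))
# -> ('Rhein', 'flowsThrough', 'Karlsruhe')
```

Keeping the slot-to-position mapping in the frame rather than in the parser is what lets several verbs with different argument orders ("passes by", "connects", "goes through") express the same property.
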

Why Related Work is Not Enough
• More expressive models are needed:
  • Capture morphology separately
  • Represent decomposition and linking of components
  • Model complex linguistic patterns, e.g. subcategorization frames
  • Specify meaning with respect to a domain ontology
  • Clearly separate linguistic and ontological levels
• SKOS, LMF, LexOnto, NLP frameworks, and LWF each fail to meet some of these requirements

3 Towards an Ontological and Linguistic Joint Model
• Previous Work
  • LingInfo – direct connection of linguistic information to classes and properties
  • LexOnto – subcategorization frames and relation to properties
  • Lexical Markup Framework (LMF) – core package plus extensions for morphology, syntax, and semantics
• The LexInfo Model – built on LMF, integrates LingInfo and LexOnto models

The LexInfo Model
• Req. 1: Morphology Relations
  • Already covered by LMF
• Req. 2: Decomposition of Complex Terms
  • ListOfComponents extends LMF morphology
  • Make owl:Entity a subclass of lmf:Sense
• Req. 3: Subcategorization Frames
  • Link lmf:SyntacticBehavior to lmf:PredicativeRepresentation
  • Additional subclasses for LMF classes
• Req. 4: Relate to Domain Ontologies
  • Follows automatically from linking to domain ontologies
• Req. 5: Separation Between Linguistics and Ontologies
  • Fully separate, related by OWL2 meta-ontology

4 Conclusions
• Language/knowledge interface too complex for RDFS/OWL/SKOS alone
• LingInfo allows publishing reusable models
• Other models fall short of requirements
• LexInfo integrates LingInfo and LexOnto models using LMF as the “glue”
• Ontologies and Java API available on the Web
• Intend to continue development and to work with the LMF working group
• Basis for further standardization