Modeling language resources in OWL
A missing link to ISOCat?
Laurent Romary – INRIA & HUB-IDSL
Return to basics
• XML document models
• <gen>f</gen>
• Controlled by DTDs, schemas, Schematron rules, etc.
• "One model, one spec" syndrome (cf. TEI/ODD)
• The ISO 11179 view…
• Data elements vs. data element concepts
• Cf. TMF (ISO 16642): one implements the other
• <gen> vs. /grammaticalGender/
• How to make this operational
• Describing models
• Checking model subsumption/compatibility (interoperability)
• Inferring models (e.g. from samples)
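The distinction between data elements and data element concepts, and the idea of checking one model against another, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a model is reduced to a mapping from concrete element names to data categories, "subsumption" is reduced to plain set inclusion, and the names `MODEL_A`, `MODEL_B`, `categories`, `subsumes` and `shared_categories` are hypothetical, not taken from ISO 11179 or ISO 16642.

```python
# Hypothetical models: each maps a concrete data element (an XML tag name)
# to the data element concept (data category) it implements.
MODEL_A = {"gen": "/grammaticalGender/", "pos": "/partOfSpeech/"}
MODEL_B = {
    "gender": "/grammaticalGender/",
    "POS": "/partOfSpeech/",
    "register": "/register/",
}


def categories(model):
    """Return the set of data categories a model covers."""
    return set(model.values())


def subsumes(general, specific):
    """True if every category used by `specific` is also covered by `general`.

    Tag names are ignored on purpose: <gen> and <gender> count as the same
    thing because both implement /grammaticalGender/.
    """
    return categories(specific) <= categories(general)


def shared_categories(a, b):
    """Data categories on which two models can interoperate."""
    return categories(a) & categories(b)
```

Under these assumptions, `subsumes(MODEL_B, MODEL_A)` holds even though the two models share no tag name, which is the point of comparing models at the level of data categories rather than of markup.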
What about using OWL?
• OWL-DL combines
• Expressive power (hierarchy of concepts, relations)
• Simplicity
• Inferential capacities (Description Logic)
• Available tools: Protégé
Example application: checking model compatibility
• My LMF app.
Modelling in ISO TC 37: TMF
  <termEntry>
    <subjectField>Medicine</subjectField>
    <definition>An acute viral infection involving the respiratory tract. It is marked by inflammation of the NASAL MUCOSA; the PHARYNX; and conjunctiva, and by headache and severe, often generalized, myalgia. (MeSH)</definition>
    <langSet>
      <tig xml:lang='fr'>
        <term>grippe</term>
        <POS>N</POS>
        <register>vernacular</register>
      </tig>
    </langSet>
    <langSet>
      <tig xml:lang='en'>
        <term>influenza</term>
        <POS>N</POS>
        <register>all</register>
      </tig>
    </langSet>
  </termEntry>
Data categories: /language/ /subjectField/ /definition/ /term/ /partOfSpeech/ /register/
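To make the link between the XML elements and the data categories listed on this slide concrete, here is a minimal Python sketch that reads such an entry and re-keys each term record by data category. It assumes the term information groups are written as well-formed `<tig xml:lang='…'>` elements and uses an abridged copy of the entry (the definition is left out). The mapping `ELEMENT_TO_DATCAT` and the function `term_records` are hypothetical names introduced for illustration; the mapping is read off the slide, not from the TMF specification.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predeclared xml:lang attribute under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed element-to-category mapping, read off the slide.
ELEMENT_TO_DATCAT = {
    "subjectField": "/subjectField/",
    "definition": "/definition/",
    "term": "/term/",
    "POS": "/partOfSpeech/",
    "register": "/register/",
}

# Abridged version of the entry shown on the slide.
ENTRY = """
<termEntry>
  <subjectField>Medicine</subjectField>
  <langSet>
    <tig xml:lang='fr'>
      <term>grippe</term><POS>N</POS><register>vernacular</register>
    </tig>
  </langSet>
  <langSet>
    <tig xml:lang='en'>
      <term>influenza</term><POS>N</POS><register>all</register>
    </tig>
  </langSet>
</termEntry>
"""


def term_records(xml_text):
    """Return one dict per <tig>, keyed by data category instead of tag name."""
    root = ET.fromstring(xml_text)
    records = []
    for tig in root.iter("tig"):
        record = {"/language/": tig.get(XML_LANG)}
        for child in tig:
            datcat = ELEMENT_TO_DATCAT.get(child.tag)
            if datcat is not None:
                record[datcat] = (child.text or "").strip()
        records.append(record)
    return records
```

Calling `term_records(ENTRY)` yields two records, one per language, whose keys no longer depend on the concrete tag names chosen by this particular XML serialisation.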
Consequences for ISOCat
• Maintain stable identifiers
• /grammaticalGender/ !!
• Define basic OWL exports
• Simple and complex DCs as elementary declarations
• Simple-complex DC relations: hasDCValue
• Integrate components as part of the DCR
• Provide a space for further OWL declarations
• Further constraints between DatCats
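What a "basic OWL export" with a hasDCValue relation might look like can be sketched by generating a few Turtle statements from Python. Everything below is an assumption made for illustration: the namespace `http://example.org/dcr#`, the class names `dcr:SimpleDataCategory` and `dcr:ComplexDataCategory`, the function `export_turtle`, and the choice to model data categories as individuals linked by an object property. Only the relation name hasDCValue comes from the slide, and the gender values in the usage note are hypothetical.

```python
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix dcr: <http://example.org/dcr#> .\n"  # hypothetical namespace
)


def local_name(datcat):
    """Turn '/grammaticalGender/' into 'grammaticalGender'."""
    return datcat.strip("/")


def export_turtle(complex_dc, simple_dcs):
    """Emit Turtle declaring one complex DC and its permitted simple DCs."""
    if not simple_dcs:
        raise ValueError("a complex data category needs at least one value")
    lines = [PREFIXES, "dcr:hasDCValue a owl:ObjectProperty ."]
    # Each simple DC is an elementary declaration of its own.
    for value in simple_dcs:
        lines.append(f"dcr:{local_name(value)} a dcr:SimpleDataCategory .")
    # The complex DC points at its values through hasDCValue.
    values = ", ".join(f"dcr:{local_name(v)}" for v in simple_dcs)
    lines.append(f"dcr:{local_name(complex_dc)} a dcr:ComplexDataCategory ;")
    lines.append(f"    dcr:hasDCValue {values} .")
    return "\n".join(lines)
```

For example, `export_turtle("/grammaticalGender/", ["/masculine/", "/feminine/", "/neuter/"])` produces a small Turtle document that could be loaded into an OWL editor and extended there with further declarations.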
Relations: drawing the limits for the DCR
• Could/should be in
• ROT: Core relations – wide usage
• Simple – complex (open?) relation
• Generic-specific relations: provide general entry points for simple applications or for further refinements (e.g. /determiner/, /preposition/)
• Should probably be out
• ROT: Difficult to get a consensus on content – fewer usages
• Record instances (except in specific cases such as standardised instances; cf. language codes)
• Constraints across categories: very language-specific
• Need for an OWL-based registry of constraint sentences
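The role of generic-specific relations as "general entry points" can be illustrated with a small Python sketch. The table `BROADER` and the categories /article/ and /definiteArticle/ are hypothetical refinements invented for this example; only /determiner/ and /preposition/ appear on the slide. The helper names `ancestors` and `is_a` are likewise assumptions.

```python
# Hypothetical generic-specific links: each category points to its broader one.
BROADER = {
    "/definiteArticle/": "/article/",
    "/article/": "/determiner/",
}


def ancestors(datcat):
    """Return the chain of broader categories, nearest first."""
    chain = []
    seen = {datcat}
    while datcat in BROADER:
        datcat = BROADER[datcat]
        if datcat in seen:
            raise ValueError(f"cycle in generic-specific relation at {datcat}")
        seen.add(datcat)
        chain.append(datcat)
    return chain


def is_a(specific, generic):
    """True if `specific` equals `generic` or is a refinement of it."""
    return specific == generic or generic in ancestors(specific)
```

A simple application that only knows /determiner/ can then still accept data annotated with a finer category, since `is_a("/definiteArticle/", "/determiner/")` holds under these assumed links.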
The linguistic view
• What is gender?
• “a classification of nominals, as shown by agreement”
• E.g. die Katze – der Hund
• Determiners, adjectives, numerals, verbs
• E.g. control by anaphoric pronouns (cf. en)
• Die Katze… sie…
• Not present in all languages
• Number of genders (Greville G. Corbett)
Misc.
• UML vs. OWL