
Linguistic Web Services for Semantic Web

Linguistic Web Services for Semantic Web. BT Short Term Research Fellowship. Dr. Vassil T. Vassilev London Metropolitan University. Part I Semantic Web and Linguistic Data Processing. Content. 1 Project Background: Semantic Web and NLP 2 RDF – Lingua Franca of Semantic Web


Presentation Transcript


  1. Linguistic Web Services for Semantic Web
     BT Short Term Research Fellowship
     Dr. Vassil T. Vassilev, London Metropolitan University
     July - October 2003

  2. Part I Semantic Web and Linguistic Data Processing

  3. Content
     1 Project Background: Semantic Web and NLP
     2 RDF – Lingua Franca of Semantic Web
     3 The need for linguistic support of Semantic Web
     4 WordNet: Universal Linguistic Resource
       • WordNet as a model of the word semantics
       • WordNet as an online thesaurus
       • WordNet as a relational database
     5 Step One: Putting WordNet on the Web
     6 Step Two: Extending WordNet
     7 Step Three: LinguaShare
     8 Problems and Directions

  4. 1 Project Background: Semantic Web and NLP
     Semantic Web: Model-driven framework for semantically rich data processing over the Web
     • RDF – Dublin Core (1999), W3C (1999)
     • DAML – DARPA (2000); OIL – FP5 (2000)
     http://www.w3c.org/2001/sw/
     http://www.dublincore.org/documents/dces/
     Semantic Thesaurus: Linguistic database containing word meanings and semantic relations
     • WordNet – George Miller, Princeton Univ. (1990)
     • EuroWordNet – FP4 (1997); BalkaNet – FP5 (2000)
     http://www.cogsci.princeton.edu/~wn/
     http://www.hum.uva.nl/~ewn#EuroWordnet

  5. 1.1. Semantic data processing over the Web
     • Syntactic markup of the data (RDF, Topic Maps)
     • Use of a meta-language (schema) to provide the intended semantics of the data represented (RDFS, DAML)
     • Specification of domain ontologies that represent the restrictions, dependencies, regularities and rules for inference (KIF, OIL, OWL)

  6. Layer Cake (McGuinness, 2002)

  7. 1.2. Computer-based semantic thesaurus
     • Explaining the meaning of words
     • Finding other words with the same meaning (synonyms)
     • Finding other words with a similar meaning in the same context (synonymous usage)
     • Finding semantically independent, related or dependent word forms (semantic referencing)

  8. Determining ontological information using lexical information
     EXAMPLE: Type inference through analysis of the argument structure of verb phrases and their syntactic appearance in texts:
     • The varieties of argument structure for EVENT-verbs suggest seven major subtypes: PHENOMENON, ASPECTUAL, STATE, ACT, PSYCHOLOGICAL_EVENT, CHANGE and CAUSE_CHANGE
     • Based on them, we can differentiate COGNITIVE_EVENT (experiencer is syntactic subject, e.g. fear) from ACT (experiencer is syntactic object, e.g. frighten)

  9. 1.3 Project definition
     Aims:
     • utilizing the full potential of the WordNet multilingual thesauri as a universal linguistic ontology for semantic verification of specialist terminology
     • embedding it in applications for semantic data processing over the Web
     • using contemporary Semantic Web Services technologies and tools
     Methodology:
     • Analytical research (WordNet)
     • Modeling (relational models, UML)
     • Software prototyping (Tomcat, MySQL)

  10. 2 RDF – Lingua Franca of Semantic Web
      • A language with defined semantics for describing resources, primarily on the Web; it can also be used off the Web, e.g. Dublin Core for library catalogues
      • Uses XML as the serialization syntax for RDF statements; there are alternative serializations (e.g. triples), but XML is the most popular
      • The language can formulate statements about the language itself (meta-description): RDF Schema, or RDFS
      • The statements can be stored, processed and transported over the Web (data persistence)

  11. 2.1 RDF Model
      Resources – Things being described by RDF expressions. Resources are named by URIs. Examples: HTML document, XML element within the document, collection of pages, book
      Properties – Specific attributes or relations used to describe a resource. Attributes and relations can also be used as resources. Examples: Creator, Title, Name
      Values – Literals or references to resources
      Statements – e.g. Subject (Resource), Predicate (Property), Object (Value)

  12. Example
      "Vassil Vassilev whose e-mail is v.vassilev@londonmet.ac.uk is the creator of web page http://www.lgu.ac.uk/~vassil/index.html"
      Subject (Resource): 'http://www.lgu.ac.uk/~vassil/index.html'
      Predicate (Property): 'Creator'
      Object (Value): 'Vassil Vassilev'
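The example statement can be sketched as plain data. This is only an illustration, not part of the presentation: the `Statement` type, the `values_of` helper and the blank-node label `_:creator` are hypothetical names, and the e-mail address is attached to an unnamed person resource in the way the XML serialization on the later slide does it.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement: subject (resource), predicate (property), object (value)."""
    subject: str
    predicate: str
    obj: str


PAGE = "http://www.lgu.ac.uk/~vassil/index.html"

# "_:creator" is a hypothetical label for the unnamed person resource
# that carries the name and the e-mail address.
statements = [
    Statement(PAGE, "Creator", "_:creator"),
    Statement("_:creator", "Name", "Vassil Vassilev"),
    Statement("_:creator", "Email", "v.vassilev@londonmet.ac.uk"),
]


def values_of(subject: str, predicate: str) -> list[str]:
    """Return every object stated for the given subject and predicate."""
    return [s.obj for s in statements
            if s.subject == subject and s.predicate == predicate]
```

Looking up `values_of(PAGE, "Creator")` returns the person resource, and a second lookup on that resource returns its name or e-mail; following such links is what makes the set of statements a graph.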

  13. Graphical representation

  14. Serialized representation in XML
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:vcard="http://imc.org/vCard/3.0#">
        <rdf:Description rdf:about="http://www.lgu.ac.uk/~vassil/index.html">
          <dc:creator>
            <rdf:Description>
              <vcard:FN>Vassil Vassilev</vcard:FN>
              <vcard:EMAIL>v.vassilev@londonmet.ac.uk</vcard:EMAIL>
            </rdf:Description>
          </dc:creator>
        </rdf:Description>
      </rdf:RDF>
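As a rough illustration of how an application might read this serialization, the sketch below parses a well-formed copy of the slide's XML with Python's standard `xml.etree.ElementTree`. It assumes the namespace-qualified `rdf:about` attribute and the `index.html` page URL from the preceding example; the variable names are hypothetical. It treats the document as ordinary XML and is not a general RDF parser.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the slide's XML.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "vcard": "http://imc.org/vCard/3.0#",
}

DOCUMENT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:vcard="http://imc.org/vCard/3.0#">
  <rdf:Description rdf:about="http://www.lgu.ac.uk/~vassil/index.html">
    <dc:creator>
      <rdf:Description>
        <vcard:FN>Vassil Vassilev</vcard:FN>
        <vcard:EMAIL>v.vassilev@londonmet.ac.uk</vcard:EMAIL>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(DOCUMENT)

# The outer Description is the subject; its rdf:about attribute holds the URI.
page = root.find("rdf:Description", namespaces=NS)
subject = page.get("{%s}about" % NS["rdf"])

# The nested Description is the unnamed creator resource with vCard properties.
creator = page.find("dc:creator/rdf:Description", namespaces=NS)
name = creator.findtext("vcard:FN", namespaces=NS)
email = creator.findtext("vcard:EMAIL", namespaces=NS)

print(subject, name, email)
```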

  15. 2.2 Semantic Web Applications
      • Context-based Information Retrieval (search for semantic patterns)
      • Personalized Information Delivery (data presentation based on user profiles)
      • User tracking (dynamic construction of user profiles based on log analysis)
      • Document summarizing (text generation based on models of the meaning)
      • Automatic translation (text transformation which uses meaning models)

  16. 2.3 Semantic Web Tools
      • Persistent storage and query interpreters (XML databases/XQuery, RDF repositories/RQL)
      • Ontology visualizers and editors (OntoEdit, Protégé, etc.)
      • Ontology navigators and semantic search engines (AskJeeves, RDF Quiz, OntoSearch)
      • Ontology-based inference engines (Cyc, Kaon, OMM)

  17. Some observations
      • Layers separation (data storage, data communication, information description, terminology definition, fact inference)
      • Layers isolation (syntactic wrapping vs. semantic mapping)
      • Information processing concentrated on the most abstract level (ontology)
      • Hierarchy of languages: SQL → XML → RDF → RDFS → OWL

  18. 3 The Need for Linguistic Support of Semantic Web
      Why:
      • For combining multiple namespaces and reconciling syntactic names
      • For word disambiguation in text analysis
      • For semantic indexing of text corpora
      • For resolving semantic inaccuracies in texts (esp. similarity, alternatives, exclusion, generalization, etc.)
      • For representing text meaning in transformations which use an intermediate model of the meaning

  19. 4 WordNet as Universal Linguistic Resource
      • Word forms (nouns, verbs, adjectives and adverbs) and lexical relations between them
      • Synsets and meaning relations (synonymy, antonymy, hyponymy, meronymy, troponymy, etc.)
      • Lexical database (set of indexed files or a database)
      • Command language interface (originally Tcl/Tk scripts for direct file manipulation, but APIs for Java and other languages are also available)
      • Multilingual thesauri (network of WordNet databases for most of the languages)

  20. 4.1 WordNet semantics
      • Relational model with both standard (ATTRIBUTE, ANTONYM, ENTAILMENT, CAUSE) and transitive relations (HYPERNYM, HOLONYM, MERONYM)
      • Formally can be interpreted in first-order relational structures (Kripke structures) – requires modal logic
      • Adequate representation of the relations requires either an object-relational database or a relational database with additional indexing of the transitive relations (transitive closure)
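The transitive closure mentioned above can be illustrated with a small sketch. It is not the project's implementation: the word pairs are invented sample data rather than WordNet content, and `transitive_closure` is a hypothetical helper. The result corresponds to the extra index a purely relational store would need so that indirect HYPERNYM links can be found with a single lookup.

```python
from collections import defaultdict

# Hypothetical direct hypernym pairs (child, parent); sample data only.
DIRECT_HYPERNYMS = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("carnivore", "mammal"),
    ("cat", "feline"),
    ("feline", "carnivore"),
]


def transitive_closure(pairs):
    """Return every (descendant, ancestor) pair reachable through the relation."""
    parents = defaultdict(set)
    for child, parent in pairs:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])
        seen = set()
        # Depth-first walk up the hierarchy from each starting word.
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


closure = transitive_closure(DIRECT_HYPERNYMS)
print(("dog", "mammal") in closure)
```

Storing the closure trades space for query speed: the five direct pairs here expand to eleven, but "is a dog a mammal?" becomes a single membership test instead of a recursive query.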

  21. Fig. 1 WordNet Relations

  22. 4.2 Relational schema of the original WordNet thesaurus
      word – represents the syntactic word forms, divided into four main categories: noun phrases, verb phrases, adjectives and adverbs
      synset – defines the different meaning sets used for giving a semantic interpretation of the word forms
      sense – many-to-many relationship between word forms and synsets
      lexrel – purely lexical relationships which hold between the word forms
      semrel – semantic relationships between the word forms, which make up the semantic thesaurus
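A minimal sketch of how these five tables might fit together is given below. Only the table names come from the slide; every column name is an assumption, `semrel` is modelled here as linking synsets (also an assumption), the sample rows are invented, and SQLite from Python's standard library stands in for the MySQL database used in the project.

```python
import sqlite3

# Table names follow the slide; all column names are hypothetical.
SCHEMA = """
CREATE TABLE word   (wordid INTEGER PRIMARY KEY, lemma TEXT NOT NULL, pos TEXT NOT NULL);
CREATE TABLE synset (synsetid INTEGER PRIMARY KEY, definition TEXT NOT NULL);
CREATE TABLE sense  (wordid INTEGER NOT NULL REFERENCES word(wordid),
                     synsetid INTEGER NOT NULL REFERENCES synset(synsetid),
                     PRIMARY KEY (wordid, synsetid));
CREATE TABLE lexrel (word1id INTEGER NOT NULL REFERENCES word(wordid),
                     word2id INTEGER NOT NULL REFERENCES word(wordid),
                     reltype TEXT NOT NULL);
CREATE TABLE semrel (synset1id INTEGER NOT NULL REFERENCES synset(synsetid),
                     synset2id INTEGER NOT NULL REFERENCES synset(synsetid),
                     reltype TEXT NOT NULL);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Invented sample rows: two word forms that share one synset.
conn.executemany("INSERT INTO word VALUES (?, ?, ?)",
                 [(1, "car", "noun"), (2, "automobile", "noun")])
conn.execute("INSERT INTO synset VALUES (1, 'a motor vehicle with four wheels')")
conn.executemany("INSERT INTO sense VALUES (?, ?)", [(1, 1), (2, 1)])


def synonyms(connection, lemma):
    """Return the other word forms that share at least one synset with lemma."""
    query = """
        SELECT DISTINCT w2.lemma
        FROM word w1
        JOIN sense s1 ON s1.wordid = w1.wordid
        JOIN sense s2 ON s2.synsetid = s1.synsetid
        JOIN word w2 ON w2.wordid = s2.wordid
        WHERE w1.lemma = ? AND w2.wordid <> w1.wordid
        ORDER BY w2.lemma
    """
    return [row[0] for row in connection.execute(query, (lemma,))]


print(synonyms(conn, "car"))
```

The self-join on `sense` is what the many-to-many design implies: synonymy is not stored as a relation of its own but falls out of two word forms pointing at the same synset.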

  23. Fig. 2 Relational schema of WordNet

  24. 5 Putting WordNet on the Web
      • Synchronous query/response model of working (CGI calls)
      • Purely relational database for storing the thesaurus (MySQL)
      • Front-end implemented as a set of servlets which query the thesaurus on behalf of other applications
      • XML format of the data returned as a result of the queries
      • Separation from the applications through the use of an independent server (Tomcat)
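The response side of this query/response model might look roughly like the sketch below. The presentation does not give the XML format returned by the servlets, so the element and attribute names (`response`, `word`, `relation`) and the function `relation_response` are assumptions made for illustration; Python is used here in place of the Java servlets of the actual prototype.

```python
import xml.etree.ElementTree as ET


def relation_response(word, relation, related_words):
    """Wrap the result of one thesaurus query in a small XML document.

    The element and attribute names are hypothetical; the real servlet
    output format is not described in the slides.
    """
    root = ET.Element("response", {"word": word, "relation": relation})
    for related in related_words:
        ET.SubElement(root, "word").text = related
    # ElementTree escapes special characters such as & and < on output.
    return ET.tostring(root, encoding="unicode")


xml_text = relation_response("car", "synonym", ["automobile", "motorcar"])
print(xml_text)
```

Returning XML rather than HTML keeps the service usable by other programs: a client parses the response and decides for itself how to present or combine the related words.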

  25. Table 1 Servlets to explore word relations

  26. Part II LinguaShare: Linguistic Web Service for Semantic Web
