
Hermes: News Personalization Using Semantic Web Technologies

Hermes is a framework for personalizing news using Semantic Web technologies. It allows users to query and retrieve news items based on their interests and temporal constraints. The framework includes news classification, querying, and results presentation components. This text provides an overview of the Hermes framework, its architecture, and an example of its implementation.

Presentation Transcript


  1. Hermes: News Personalization Using Semantic Web Technologies Flavius Frasincar frasincar@ese.eur.nl Erasmus University Rotterdam

  2. Contents
     • Motivation
     • Hermes Framework:
        • News Classification
        • News Querying
        • Results Presentation
     • Hermes News Portal:
        • An example
     • Conclusions
     • Future Work

  3. Motivation
     • Large quantity of news on the Web:
        • Difficult to find the items of interest
        • News messages have a strong impact on stock prices
     • Limited annotation of RSS feeds:
        • Broad categories (business, cars, entertainment, etc.)
     • Google Finance shows news that pertains directly to a given portfolio:
        • Indirect news (e.g., about competitors of Google such as Microsoft) is not presented
     • Not possible to ask time-related queries about news

  4. Hermes Framework
     • Input:
        • News items from RSS feeds
        • Domain ontology linked to a semantic lexicon (e.g., WordNet)
        • User query
     • Output:
        • News items as answers to the user query
     • Three phases:
        1. News Classification
           • Relate news items to ontology concepts
        2. News Querying
           • Allow the user to express their concepts of interest and the temporal constraints
        3. Results Presentation
           • Present the news items that match the user's query

  5. Hermes Architecture

  6. 1. News Classification
     • A concept is defined in the ontology (as a class or an individual)
     • Multiple lexical representations for the same concept:
        • Ontology synonyms (e.g., New York → New York, Big Apple)
        • Semantic lexicon synonyms (e.g., buy → acquire)
     • Concepts without subclasses or instances:
        • Semantic lexicon hyponyms (e.g., company → dot-com)
     • Look up ontology concepts in news items
     • A longer match supersedes a shorter match (European Central Bank supersedes European)
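The "longer match supersedes a shorter match" rule can be illustrated with a small greedy lookup. This is only a sketch under stated assumptions, not the Hermes implementation: the `LEXICON` table, the concept names, and the `find_hits` helper are hypothetical, and tokenization is reduced to a simple word split.

```python
import re

# Hypothetical lexicon: lexical representation -> ontology concept.
# Synonyms (ontology or semantic lexicon) map to the same concept.
LEXICON = {
    "european central bank": "EuropeanCentralBank",
    "european": "European",
    "new york": "NewYork",
    "big apple": "NewYork",
    "buy": "Buy",
    "acquire": "Buy",
}


def find_hits(text, lexicon):
    """Return (start_token, end_token, concept) hits, preferring longer matches."""
    tokens = re.findall(r"\w+", text.lower())
    max_len = max(len(key.split()) for key in lexicon)
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first so it supersedes shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                hits.append((i, i + n, lexicon[phrase]))
                i += n  # skip past the match so "european" is not counted again
                matched = True
                break
        if not matched:
            i += 1
    return hits
```

For a sentence mentioning the European Central Bank, the lookup records one hit for the three-word phrase and none for the embedded word "European".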

  7. 1. News Classification
     1.1 Tokenization (words, punctuation signs)
     1.2 Sentence splitting (sentences)
     1.3 Part-of-speech tagging (e.g., noun, verb, adj., etc.)
     1.4 Morphological analysis (e.g., lemma “read” for “reading” as a verb)
     1.5 Word sense disambiguation (e.g., Structural Semantic Interconnection (SSI) based on word context)
     1.6 Adding “hits” between news items and the domain ontology

  8. 2. News Querying
     2.1 Query Formulation
     • Present the domain knowledge as a directed labeled multigraph:
        • with the additional constraint that arcs between the same two nodes are not allowed to share the same label (called the conceptual graph)
     • The user selects the concepts of interest in the conceptual graph (e.g., Google)
     • The user can add to the selection concepts related to the concepts of interest through specified relations (e.g., hasCompetitors: Microsoft, eBay, and Yahoo)
     • The selected concepts are presented in a separate graph (called the search graph)
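A minimal sketch of the conceptual graph and of extending a selection through a relation, assuming a plain in-memory structure. The `ConceptualGraph` class and its method names are hypothetical; only the Google example and the `hasCompetitors` relation come from the slide.

```python
class ConceptualGraph:
    """Directed labeled multigraph in which two arcs between the same
    pair of nodes may not share a label."""

    def __init__(self):
        # source node -> set of (label, target) pairs
        self._arcs = {}

    def add_arc(self, source, label, target):
        arcs = self._arcs.setdefault(source, set())
        if (label, target) in arcs:
            raise ValueError(f"duplicate arc {source} -{label}-> {target}")
        arcs.add((label, target))

    def related(self, source, label):
        """Concepts reachable from `source` over arcs carrying `label`."""
        return {t for (l, t) in self._arcs.get(source, ()) if l == label}


def extend_selection(graph, selected, label):
    """Return the selection plus every concept related to it via `label`."""
    result = set(selected)
    for concept in selected:
        result |= graph.related(concept, label)
    return result


graph = ConceptualGraph()
for competitor in ("Microsoft", "eBay", "Yahoo"):
    graph.add_arc("Google", "hasCompetitors", competitor)

search_concepts = extend_selection(graph, {"Google"}, "hasCompetitors")
```

Here `search_concepts` holds the nodes that would populate the search graph for the Google example.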

  9. 2. News Querying
     • News items are timestamped
     • The user can specify that only news from a certain time interval should be retrieved
     • Time constraints:
        • Last hour
        • Last day
        • Last year
        • [2007-03-01T00:00:00.000+00:01, 2007-05-31T00:00:00.000+00:01]
        • [Future: order constraints (e.g., order by time)]
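The time constraints above can all be reduced to a closed interval before querying. The sketch below assumes that approach; the `PRESETS` table and both helper functions are hypothetical, and "last year" is approximated as 365 days, which is an assumption rather than something the slides specify.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical presets mirroring the constraints listed on the slide.
PRESETS = {
    "last hour": timedelta(hours=1),
    "last day": timedelta(days=1),
    "last year": timedelta(days=365),  # assumption: a year taken as 365 days
}


def resolve_interval(constraint, now):
    """Map a preset name or an explicit (start, end) pair to an interval."""
    if isinstance(constraint, str):
        return (now - PRESETS[constraint], now)
    start, end = constraint
    if start > end:
        raise ValueError("interval start must not be after its end")
    return (start, end)


def in_interval(timestamp, interval):
    """True when the news timestamp falls inside the closed interval."""
    start, end = interval
    return start <= timestamp <= end


now = datetime(2009, 7, 31, 12, 0, tzinfo=timezone.utc)
last_day = resolve_interval("last day", now)
```

Passing `now` explicitly keeps the helpers deterministic, which makes them easier to check than code that reads the system clock.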

  10. 2. News Querying
      2.2 Query Execution
      • Generate the query in a semantic query language:
         • Map concepts of interest to query restrictions (current: disjunctive queries)
         • Map temporal constraints to query restrictions
      • Execute the semantic query
      • The order of the relevant news items is not important here
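As a rough illustration of query generation, the sketch below maps a list of concepts to a disjunctive restriction and a time interval to a date restriction. The `build_query` helper is hypothetical. The property names (`hermes:title`, `hermes:time`, `hermes:relatedTo`) and the namespace are taken from the example query in this presentation, while the use of equality tests and `xsd:dateTime` typed literals in the filters is an assumption about the intended semantics.

```python
def build_query(concepts, start, end,
                namespace="http://hermes-news.org/news.owl#"):
    """Build a SPARQL query string for news about any of `concepts`
    published strictly between `start` and `end` (ISO 8601 strings)."""
    if not concepts:
        raise ValueError("at least one concept of interest is required")
    # Disjunctive restriction: a news item matches if it relates to any concept.
    concept_filter = " || ".join(
        f"?concept = hermes:{name}" for name in concepts
    )
    return "\n".join([
        f"PREFIX hermes: <{namespace}>",
        "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>",
        "SELECT DISTINCT ?title WHERE {",
        "  ?news hermes:title ?title .",
        "  ?news hermes:time ?date .",
        "  ?news hermes:relatedTo ?concept .",
        f"  FILTER ( {concept_filter} )",
        f'  FILTER ( ?date > "{start}"^^xsd:dateTime'
        f' && ?date < "{end}"^^xsd:dateTime )',
        "}",
    ])


query = build_query(
    ["Google", "Microsoft", "Ebay", "Yahoo"],
    "2009-02-01T00:00:00Z",
    "2009-07-31T00:00:00Z",
)
```

The sketch assumes every concept name is a valid local name in the `hermes:` namespace; a real generator would need to validate or escape them.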

  11. 3. Results Presentation
      3.1 News Sorting
      • Return news items that match a query
      • Sort the news items based on their relevance degree to the query
      • The relevance degree is determined empirically:
         • based on a weighted sum of the number of hits in the title (higher weight) and body (lower weight) of the news item
      • News items that have the same relevance degree are sorted in descending timestamp order
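The sorting rule can be sketched as follows. The weights are placeholders chosen only for illustration (the slides say the weights are determined empirically and do not give values), and the `NewsItem` record is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed weights for illustration; the title weight must exceed the body weight.
TITLE_WEIGHT = 2.0
BODY_WEIGHT = 1.0


@dataclass
class NewsItem:
    title: str
    timestamp: datetime
    title_hits: int  # number of concept hits found in the title
    body_hits: int   # number of concept hits found in the body


def relevance(item):
    """Weighted sum of title hits and body hits."""
    return TITLE_WEIGHT * item.title_hits + BODY_WEIGHT * item.body_hits


def sort_news(items):
    """Most relevant first; equal relevance falls back to newest first."""
    return sorted(items, key=lambda it: (relevance(it), it.timestamp), reverse=True)


items = [
    NewsItem("A", datetime(2009, 3, 1), title_hits=1, body_hits=0),
    NewsItem("B", datetime(2009, 5, 1), title_hits=0, body_hits=2),
    NewsItem("C", datetime(2009, 1, 1), title_hits=2, body_hits=1),
]
ranked = [item.title for item in sort_news(items)]
```

Items A and B tie on relevance under these weights, so the more recent B is placed ahead of A, while C ranks first despite being the oldest.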

  12. 3. Results Presentation
      3.2 News Presentation
      • Present the concepts involved in the query
      • For each news item, show a summary:
         • Title
         • Source
         • Date
         • The first few lines of the news item ([Future: snippet])
      • Emphasize the hits (concepts from the ontology that were found) in the retrieved news items
      • Show the icon of the most important query concept found in a news item:
         • based on a weighted sum of the number of hits of a concept in the title (higher weight) and body (lower weight) of the news item

  13. Hermes News Portal
      • Hermes News Portal (HNP) is an implementation of the Hermes framework
      • Implementation language: Java
      • Ontology representation language: OWL (e.g., cardinality restrictions, inverses, etc.)
      • Semantic lexicon: WordNet
      • Graph visualization: Prefuse (OWL2Prefuse)
      • Query language: SPARQL
         • SPARQL extended with custom time functions (e.g., currentDate(), currentTime(), etc.)
      • Natural language processing: GATE

  14. An Example • Query: Which are the news items about Google or one of its competitors from the past six months?

  15. 1. News Classification – Import News

  16. 1. News Classification – Conceptual Graph

  17. 2. News Querying - Search Graph
      [Figure legend: individuals; classes; selected concepts; concepts related to the selected node; concepts from keyword search]

  18. 2. News Querying - Search Graph

  19. 2. News Querying - SPARQL
      • SPARQL query:

        PREFIX hermes: <http://hermes-news.org/news.owl#>
        SELECT ?title
        WHERE {
          ?news hermes:title ?title .
          ?news hermes:time ?date .
          ?news hermes:relation ?relation .
          ?news hermes:relatedTo ?concept .
          FILTER (
            ?concept hermes:relatedTo hermes:Google ||
            ?concept hermes:relatedTo hermes:Microsoft ||
            ?concept hermes:relatedTo hermes:Ebay ||
            ?concept hermes:relatedTo hermes:Yahoo
          )
          FILTER (
            ?date > "2009-02-01T00:00:00.000+00:01" &&
            ?date < "2009-07-31T00:00:00.000+00:01"
          )
        }

  20. 2. News Querying- tSPARQL • Custom time functions:

  21. 2. News Querying - tSPARQL
      • tSPARQL query:

        PREFIX hermes: <http://hermes-news.org/news.owl#>
        SELECT ?title
        WHERE {
          ?news hermes:title ?title .
          ?news hermes:time ?date .
          ?news hermes:relation ?relation .
          ?news hermes:relatedTo ?concept .
          FILTER (
            ?concept hermes:relatedTo hermes:Google ||
            ?concept hermes:relatedTo hermes:Microsoft ||
            ?concept hermes:relatedTo hermes:Ebay ||
            ?concept hermes:relatedTo hermes:Yahoo
          )
          FILTER (
            ?date > hermes:dateTime-substract(hermes:now(), P0Y6M) &&
            ?date < hermes:now()
          )
        }
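To make the `P0Y6M` subtraction concrete, the sketch below shifts a timestamp back by a number of calendar months. It is only an illustration of what such a function might compute: the `subtract_months` helper is hypothetical, and clamping the day to the end of a shorter month is an assumption, since the slides do not say how the Hermes time functions treat month ends.

```python
import calendar
from datetime import datetime


def subtract_months(moment, months):
    """Return `moment` shifted back by `months` calendar months.

    The day is clamped to the last day of the target month
    (an assumption, e.g. 31 August minus 6 months -> 28 February).
    """
    # Count months from year 0 so that year borrow is handled by divmod.
    index = moment.year * 12 + (moment.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


now = datetime(2009, 7, 31)
window_start = subtract_months(now, 6)
```

With `now` fixed at 31 July 2009, the query window for "the past six months" would start on 31 January 2009 under these assumptions.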

  22. 3. Results Presentation

  23. Conclusions
      • Hermes Framework: presents news items that match the user's interests
      • Hermes Framework:
         • News Classification
         • News Querying
         • Results Presentation
      • Hermes News Portal (HNP): an implementation of the Hermes framework
      • HNP is based on:
         • WordNet semantic lexicon, OWL ontology, (extended) SPARQL queries, Prefuse visualization, GATE natural language processing

  24. Future Work
      • Word Sense Disambiguation:
         • GAMBL (supervised learning algorithm)
      • Ontology updates:
         • Learning from news items
         • Check if the extracted information obeys the ontology axioms:
            • Faulty extraction
            • Ontology axioms update
      • Simplify the query interface:
         • Allow users to ask English queries based on a limited vocabulary
      • Evaluate the tool outside the university lab
