Clustering and Exploring Search Results using Timeline Constructions

Clustering and Exploring Search Results using Timeline Constructions. Presenter: Tsai Tzung Ruei. Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates. 國立雲林科技大學 National Yunlin University of Science and Technology. CIKM 2009.


Presentation Transcript


  1. Clustering and Exploring Search Results using Timeline Constructions • Presenter: Tsai Tzung Ruei • Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates • 國立雲林科技大學 National Yunlin University of Science and Technology • CIKM 2009

  2. Outline • Motivation • Objective • Time annotated document model • Methodology • Experiments • Conclusion • Comments

  3. Motivation • None of the current search engines exploits the temporal information embedded in documents. • Do you think current timelines for organizing or clustering search results (such as Google’s timeline) are useful for some of your daily search activities? • Do you use (or would you use) timelines to explore search results? • Please indicate some search scenarios where you use, or would like to use, timelines to organize search results. • Please give some examples of search scenarios where current search engines do not sufficiently support the concept of timelines to organize and explore search results. • What other features would you like to see in the context of timelines? • Keyword: timeline

  4. Objective • To present an add-on to traditional information retrieval applications in which we exploit various temporal information associated with documents to present and cluster documents along timelines.

  5. TIME ANNOTATED DOCUMENT MODEL • Time and Timelines: our base timeline, denoted Td, is an interval of consecutive day chronons, e.g. “March 12, 2002; March 13, 2002; March 14, 2002”. • Temporal Expressions: explicit (e.g. “December 2004”), implicit (e.g. “Valentine's Day 2006”), and relative (e.g. “today”). • Temporal Document Profiles [Figure labels: explicit, implicit, relative, timestamps]
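The day-chronon timeline and the three kinds of temporal expressions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `IMPLICIT_DAYS` lookup table, and the choice to resolve relative expressions against a reference date (such as the document timestamp) are assumptions made here for illustration.

```python
from datetime import date, timedelta

# Hypothetical lookup table for implicit expressions (named days).
# A real system would need a much larger, curated resource.
IMPLICIT_DAYS = {"valentine's day": (2, 14)}

# Hypothetical offsets (in days) for a few relative expressions.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def day_chronons(start: date, end: date) -> list[date]:
    """Return the consecutive day chronons from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def resolve_implicit(name: str, year: int) -> date:
    """Map a named day plus a year onto the timeline, e.g. Valentine's Day 2006."""
    month, day = IMPLICIT_DAYS[name.lower()]
    return date(year, month, day)


def resolve_relative(expression: str, reference: date) -> date:
    """Anchor a relative expression to a reference date (assumed: the document timestamp)."""
    return reference + timedelta(days=RELATIVE_OFFSETS[expression.lower()])


# The slide's example interval: March 12 to March 14, 2002.
timeline = day_chronons(date(2002, 3, 12), date(2002, 3, 14))
```

An explicit expression such as “December 2004” maps onto the timeline directly, whereas implicit and relative expressions need extra knowledge (a calendar of named days, or a reference date) before they can be placed.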

  6. Methodology • PROTOTYPE • Process Overview [Figure labels: Corpora, Alembic (POS tagger), GUTime temporal tagger, XML document (tdp), Oracle]

  7. Methodology • TCluster, applied to a hit list Lq = [d1, d2, ..., dk] of k documents • Constructing a time outline for the documents in the hit list Lq • Document clustering • Ranking documents in a cluster
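The three steps above (time outline, clustering, ranking) can be sketched in Python as follows. This is a simplified illustration and not the TCluster algorithm itself: the function names are hypothetical, documents are assumed to arrive with their temporal expressions already resolved to dates, a document is allowed to appear in several clusters, and ranking by mention count is a stand-in for whatever ranking function the paper uses.

```python
from collections import defaultdict
from datetime import date


def to_chronon(day: date, granularity: str) -> str:
    """Map a day chronon to a label at the requested time granularity."""
    if granularity == "year":
        return f"{day.year:04d}"
    if granularity == "month":
        return f"{day.year:04d}-{day.month:02d}"
    if granularity == "day":
        return day.isoformat()
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(hit_list, granularity="year"):
    """Cluster a hit list of (doc_id, [date, ...]) pairs along a timeline.

    Returns a dict {chronon_label: [doc_id, ...]} in chronological order.
    Within a cluster, documents are ranked by how often they mention that
    chronon (an assumed ranking); ties keep the original hit-list order.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for doc_id, dates in hit_list:
        for day in dates:
            counts[to_chronon(day, granularity)][doc_id] += 1

    clusters = {}
    for label in sorted(counts):  # zero-padded labels sort chronologically
        docs = counts[label]
        clusters[label] = sorted(docs, key=lambda d: -docs[d])  # stable sort
    return clusters


# Made-up hit list for illustration only.
hits = [
    ("d1", [date(2006, 6, 9), date(2006, 7, 9), date(2002, 6, 30)]),
    ("d2", [date(2006, 7, 9)]),
    ("d3", [date(2002, 5, 31), date(2002, 6, 30)]),
]
by_year = cluster_by_time(hits, "year")
# by_year == {"2002": ["d3", "d1"], "2006": ["d1", "d2"]}
```

Calling `cluster_by_time(hits, "month")` on the same hit list yields a finer-grained timeline, which is one way to picture exploring clusters at different levels of time granularity.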

  8. Experiments • DMOZ • Introduction: a multilingual open content directory • World Cup documents: each World Cup document has a single event as the main theme • Document clusters: 2010, 2006, 2002, 1998 and 1994 • Result: documents are well classified by users in terms of the actual event; pre-defined categories (5) < TCluster (21)

  9. Experiments • The TimeBank 1.2 corpus • It contains news articles that have been annotated using TimeML with temporal expressions related to events, times, and temporal links between events and times. • Result: a 50% increase in the number of clusters discovered by TCluster

  10. Experiments • Relevance Evaluation using AMT • AMT (Amazon Mechanical Turk) is a crowdsourcing platform • Result: the average response was 4.04 (with an 80% agreement level)

  11. Conclusion • MAJOR CONTRIBUTION • The TCluster algorithm provides great flexibility and allows users to explore clusters of search result documents that are organized along well-defined timelines, supporting different levels of time granularity. • The work shows the utility of time-based clustering over existing approaches that cluster documents based only on document timestamps. • FUTURE WORK • To study the weighting of relative temporal expressions, as well as different sentence distance functions for determining the rank of documents in a cluster.

  12. Comment • Advantage • Provides a new method of time searching • Drawback • Some mistakes • Application • information retrieval • Clustering
