
Finding Support Sentences for Entities

Roi Blanco, Hugo Zaragoza. SIGIR '10. Advisor: Jia Ling, Koh. Speaker: Yu Cheng, Hsieh.

Presentation Transcript


  1. Finding Support Sentences for Entities • Roi Blanco, Hugo Zaragoza • SIGIR '10 • Advisor: Jia Ling, Koh • Speaker: Yu Cheng, Hsieh

  2. Outline • Introduction • Notations • Features for Ranking Support Sentences • Entity Ranking • Sentence Ranking with Entity Ranking Information • Experiment

  3. Introduction • Ranking entities (e.g. experts, locations, companies) has become a standard information retrieval task. • Showing entities without any explanation is not enough for users to identify how each entity relates to the query. • Goal: retrieve and rank entity support sentences that explain the relevance of an entity with respect to a query.

  4. Notations • : a collection of sentences (or paragraphs, or text windows of fixed size) • : the context surrounding a sentence • : sentence-entity matrix whose entry (i, j) is 1 if entity j is present in sentence i, and 0 otherwise • :

  5. Notations • : the top k relevant sentences for query q • : the top-k set augmented by adding the context of each of its sentences • : candidate support sentences in the top-k set for an entity e • : candidate support sentences in the augmented set for an entity e
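The sentence-entity matrix from the notation slides can be illustrated with a small sketch. The function name `build_sentence_entity_matrix`, the toy sentences, and the choice of a dense list-of-lists are assumptions made for illustration only; a collection with millions of sentences would more plausibly use a sparse representation.

```python
from typing import Dict, List, Set, Tuple


def build_sentence_entity_matrix(
    sentence_entities: List[Set[str]],
) -> Tuple[List[List[int]], Dict[str, int]]:
    """Return a binary sentence-entity matrix and the entity -> column map.

    Hypothetical helper: sentence_entities[i] is the set of entities
    annotated in sentence i.
    """
    # Give every distinct entity a column, sorted for a deterministic layout.
    entities = sorted(set().union(*sentence_entities)) if sentence_entities else []
    column = {entity: j for j, entity in enumerate(entities)}

    # Entry (i, j) is 1 if entity j is present in sentence i, and 0 otherwise.
    matrix = [[0] * len(entities) for _ in sentence_entities]
    for i, present in enumerate(sentence_entities):
        for entity in present:
            matrix[i][column[entity]] = 1
    return matrix, column


# Toy data: three sentences, two entities (illustrative only).
matrix, column = build_sentence_entity_matrix(
    [{"Picasso", "Paris"}, {"Paris"}, set()]
)
```

Row i of `matrix` lists the entities found in sentence i, and column j lists the sentences that mention entity j, which is the lookup needed to collect candidate support sentences for an entity.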

  6. Features for Ranking Support Sentences • : the original retrieval score of the sentence (measured by BM25) • : a context-aware model using BM25F • Both features consider only the relevance between the query and the sentences.

  7. Entity Ranking • : the number of sentences containing the entity e (like tf) • : penalizes very frequent entities (like idf) • : discovers special entities
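The three entity-ranking signals on this slide can be sketched as follows. The slide's own formulas were lost, so the definitions below are assumptions: `frequency` counts top-k sentences containing the entity, `rarity` divides that count by the entity's collection-wide sentence count, and `kld` is a pointwise Kullback-Leibler term comparing the entity's rate in the top-k set with its rate in the collection. The labels are borrowed from the result slide; the function name and toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def entity_features(
    top_k: List[Set[str]],
    collection: List[Set[str]],
) -> Dict[str, Dict[str, float]]:
    """Score each entity seen in the top-k sentences (assumed formulas).

    Assumes every top-k sentence also belongs to the collection, so each
    entity has a collection count of at least 1.
    """
    top_counts = Counter(e for sentence in top_k for e in sentence)
    coll_counts = Counter(e for sentence in collection for e in sentence)

    features: Dict[str, Dict[str, float]] = {}
    for entity, freq in top_counts.items():
        p_top = freq / len(top_k)                          # rate in the top-k set
        p_coll = coll_counts[entity] / len(collection)     # rate in the collection
        features[entity] = {
            "frequency": float(freq),                      # tf-like count
            "rarity": freq / coll_counts[entity],          # idf-like penalty
            "kld": p_top * math.log(p_top / p_coll),       # pointwise KL term
        }
    return features


# Toy collection of four sentences; two of them form the top-k set.
collection = [{"Paris", "Picasso"}, {"Paris"}, {"Paris"}, {"Picasso"}]
top_k = [collection[0], collection[3]]
features = entity_features(top_k, collection)
```

In this toy data "Paris" is common across the collection but appears in only one of the two top-k sentences, so its `kld` value is negative, while "Picasso" is concentrated in the top-k set and scores higher on every signal.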

  8. Sentence Ranking with Entity Ranking Information

  9. Position • Consider the distance between the last match of the query and the entity
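One possible reading of this position feature is sketched below. The slide does not say how distance is measured, so this sketch assumes a distance in tokens, a single-token entity, case-insensitive matching, and that the closest entity mention is the one that counts. The function name `last_match_distance` is hypothetical.

```python
from typing import List, Optional


def last_match_distance(
    tokens: List[str], query_terms: List[str], entity: str
) -> Optional[int]:
    """Token distance between the last query-term match and the entity.

    Returns None when the sentence has no query match or no entity mention.
    """
    query = {term.lower() for term in query_terms}
    lowered = [token.lower() for token in tokens]

    query_positions = [i for i, tok in enumerate(lowered) if tok in query]
    entity_positions = [i for i, tok in enumerate(lowered) if tok == entity.lower()]
    if not query_positions or not entity_positions:
        return None

    last_query = query_positions[-1]  # position of the last query-term match
    # Assumption: measure against the nearest mention of the entity.
    return min(abs(pos - last_query) for pos in entity_positions)


tokens = "Picasso moved to Paris and painted cubist works".split()
distance = last_match_distance(tokens, ["cubist", "painted"], "Picasso")
```

A small distance suggests the sentence talks about the query topic and the entity together, which is the intuition the slide points at.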

  10. Experiment • Uses the Semantically Annotated Snapshot, which contains - 1.5M documents - 75M sentences - 20.3M unique named entities (using the 12 first-level Wall Street Journal entity types) • A dataset of 226 (query, entity) pairs covering 45 unique queries was built manually.

  11. Experiment • Assessors produce queries about topics they know well. • The system produces a set of candidate entities. • Assessors eliminate the entities that are not relevant to the query. • The system produces candidate sentences for each (query, entity) pair.

  12. Experiment • Assessors evaluate four levels of relevance: 1. Non-relevant 2. Fairly relevant 3. Relevant 4. Very relevant • A triple is considered relevant iff

  13. Experiment • Measurement - MRR - NDCG - P@1 - MAP Tie-aware evaluation is used
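The four measures can be sketched per ranked list as below. This is a plain, tie-unaware sketch: the tie-aware variants the slide mentions are not reproduced here. The binary measures take a list of booleans because the slide's relevance threshold is not given, and `average_precision` assumes every relevant sentence appears somewhere in the list. MRR and MAP are the means of `reciprocal_rank` and `average_precision` over all evaluated lists.

```python
import math
from typing import List


def reciprocal_rank(relevant: List[bool]) -> float:
    """1 / rank of the first relevant item, or 0.0 if there is none."""
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def precision_at_1(relevant: List[bool]) -> float:
    """1.0 if the top-ranked item is relevant, else 0.0."""
    return 1.0 if relevant and relevant[0] else 0.0


def average_precision(relevant: List[bool]) -> float:
    """Mean of precision@rank taken at each relevant item's rank."""
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg(gains: List[float]) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(values: List[float]) -> float:
        return sum(g / math.log2(rank + 1) for rank, g in enumerate(values, start=1))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


ranked = [False, True, True]  # toy judgments in ranked order
rr = reciprocal_rank(ranked)
p1 = precision_at_1(ranked)
ap = average_precision(ranked)
```

NDCG takes graded gains rather than booleans, so it can use the four relevance levels directly once they are mapped to numeric gains.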

  14. Experiment • functions operate on a top-k set for a given query that can be augmented with a context • The context of a sentence was defined as - The surrounding four sentences - The title of its Wikipedia entry • Represent each sentence in three fields - First: the sentence s - Second: the surrounding sentences - Third: Wikipedia title
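The three-field representation lends itself to a BM25F-style scorer, sketched below in its commonly described form: per-field term frequencies are length-normalized, weighted, and summed into one pseudo-frequency before saturation. The field names, weights, `b` values, `k1`, and idf values are illustrative assumptions, not the settings used in the experiments.

```python
from typing import Dict, List

FIELDS = ("sentence", "context", "title")  # hypothetical field names


def bm25f_score(
    query_terms: List[str],
    doc: Dict[str, List[str]],
    avg_len: Dict[str, float],
    idf: Dict[str, float],
    weights: Dict[str, float],
    b: Dict[str, float],
    k1: float = 1.2,
) -> float:
    """BM25F-style score of one three-field sentence representation."""
    score = 0.0
    for term in query_terms:
        pseudo_tf = 0.0
        for field in FIELDS:
            tokens = doc.get(field, [])
            tf = tokens.count(term)
            if tf == 0:
                continue
            # Per-field length normalization, then per-field weighting.
            norm = 1.0 - b[field] + b[field] * len(tokens) / avg_len[field]
            pseudo_tf += weights[field] * tf / norm
        if pseudo_tf > 0:
            # Saturate the combined frequency once, across all fields.
            score += idf.get(term, 0.0) * pseudo_tf / (k1 + pseudo_tf)
    return score


doc = {
    "sentence": ["he", "moved", "there"],
    "context": ["picasso", "lived", "in", "paris"],
    "title": ["pablo", "picasso"],
}
params = dict(
    avg_len={"sentence": 3.0, "context": 4.0, "title": 2.0},
    idf={"picasso": 1.5},
    weights={"sentence": 1.0, "context": 0.5, "title": 0.3},
    b={"sentence": 0.75, "context": 0.75, "title": 0.75},
)
with_context = bm25f_score(["picasso"], doc, **params)
sentence_only = bm25f_score(["picasso"], {"sentence": doc["sentence"]}, **params)
```

The toy sentence never names the query term, so it scores zero on its own field, yet it receives a positive score once the surrounding sentences and the title are taken into account.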

  15. Result • Combination > KLD > Frequency > Rarity • Sum > Average

  16. The Role of Context • Given a fixed query q and a fixed entity e - Correct support sentence for (q, e) - The context in the ranking function itself

  17. Conclusions & Future work • Developed several features embracing different paradigms to tackle the problem. • The context of a sentence can be effectively exploited using BM25F. • The methods might be biased toward longer sentences; sentence-length normalization could be applied. • Other linguistic features of sentences remain to be explored.
