
Section Based Relevance Feedback


Presentation Transcript


  1. Section Based Relevance Feedback Student: Nat Young Supervisor: Prof. Mark Sanderson

  2. Relevance Feedback • Search engine (SE) user marks document(s) as relevant • E.g. “find more like this” • Terms are extracted from the full document • The whole document may not be relevant • Could marking a sub-section as relevant be better?
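
To make the contrast concrete, here is a minimal sketch in Python of the two feedback variants. The names (extract_feedback_terms, expand_query) and the plain term-frequency weighting are illustrative assumptions, not the system described in the talk.

    import re
    from collections import Counter

    def extract_feedback_terms(text, top_k=10):
        # Very simple stand-in for feedback term weighting: take the top_k
        # most frequent terms from the text the user marked as relevant.
        terms = re.findall(r"[a-z]+", text.lower())
        return [t for t, _ in Counter(terms).most_common(top_k)]

    def expand_query(query_terms, relevant_text, top_k=10):
        # Standard RF: relevant_text is the whole marked document.
        # Section-based RF: relevant_text is only the marked sub-section.
        return query_terms + extract_feedback_terms(relevant_text, top_k)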

  3. Test Collections • Simulate a real user’s search process • Submit queries in batch mode • Evaluate the result sets • Relevance Judgments • QREL: <topicId, docId> pairs (1 … n) • Traditionally produced by human assessors
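
A rough sketch of how such a collection might be used in batch mode, assuming the QRELs are stored one <topicId, docId> pair per line; the file layout and function names are assumptions rather than the project's actual code.

    from collections import defaultdict

    def load_qrels(path):
        qrels = defaultdict(set)  # topicId -> set of relevant docIds
        with open(path) as f:
            for line in f:
                topic_id, doc_id = line.split()
                qrels[topic_id].add(doc_id)
        return qrels

    def relevant_returned(results, qrels, k=20):
        # results: topicId -> ranked list of docIds returned for that topic
        return {t: len(set(docs[:k]) & qrels[t]) for t, docs in results.items()}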

  4. Building a Test Collection • Documents • 1,388,939 research papers • Stop words removed • Porter Stemmer applied • Topics • 100 random documents • Their sub-sections (6 per document)
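
A minimal sketch of that preprocessing step, using NLTK's PorterStemmer and a toy stop-word list; the actual stop list and tokeniser used in the project are not given here, so both are stand-ins.

    import re
    from nltk.stem import PorterStemmer  # the Porter Stemmer named on the slide

    STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for"}  # toy list
    stemmer = PorterStemmer()

    def preprocess(text):
        tokens = re.findall(r"[a-z]+", text.lower())
        return [stemmer.stem(t) for t in tokens if t not in STOP_WORDS]

    preprocess("Section based relevance feedback for research papers")
    # -> ['section', 'base', 'relev', 'feedback', 'research', 'paper']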

  5. Building a Test Collection • In-edges • Documents that cite paper X • Found 943 using the CiteSeerX database • Out-edges • Documents cited by paper X • Found 397 using pattern matching on titles
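
One way the "pattern matching on titles" step could look; this is only an assumption about the approach (normalise each cited title and look it up against the titles of documents in the collection), and the real matching rules may well differ.

    import re

    def normalise_title(title):
        # Lower-case and strip punctuation so small formatting differences
        # in reference lists do not prevent a match.
        return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

    def find_out_edges(cited_titles, collection):
        # collection: normalised title -> docId for every paper in the collection
        edges = []
        for title in cited_titles:
            doc_id = collection.get(normalise_title(title))
            if doc_id is not None:
                edges.append(doc_id)
        return edges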

  6. QRELs • Total • 1,340 QRELs (943 in-edges + 397 out-edges) • Avg. 13.4 QRELs per document • Previous work: • Anna Ritchie et al. (2006) • 82 Topics, Avg. 11.4 QRELs • 196 Topics, Avg. 4.5 QRELs • Last year • 71 Topics, Avg. 2.9 QRELs

  7. Section Queries • RQ1 Do the sections return different results?

  8. Section Queries • RQ2 Do the sections return different relevant results? • Avg. = the average number of relevant results returned @ 20 • E.g. Abstract queries returned 2 QRELs on average
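
A sketch of how this Avg. figure could be computed, assuming a run structure that maps each section and topic to a ranked result list (the data structures here are assumptions). On this collection the Abstract entry would come out at roughly 2, matching the figure above.

    def avg_relevant_at_k(run, qrels, k=20):
        # run[section][topicId]: ranked list of docIds for that section query
        # qrels[topicId]: set of relevant docIds for that topic
        averages = {}
        for section, topics in run.items():
            counts = [len(set(docs[:k]) & qrels[t]) for t, docs in topics.items()]
            averages[section] = sum(counts) / len(counts)  # mean over topics
        return averages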

  9. Section Queries • Average intersection sizes of relevant results • E.g. Avg(|Abstract ∩ All|) = 0.63 and Avg(|Abstract \ All|) = 1.37 • Since Abstract queries return 2 relevant results on average, this gives 100 - ((0.63 / 2) * 100) = 68.5% difference
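
A sketch of the overlap computation behind these numbers; the input shapes are assumptions. With the Abstract vs. All figures above (intersection 0.63 against an average of 2 relevant results), the difference works out to 100 - (0.63 / 2) * 100 = 68.5%.

    def avg_overlap(rel_x, rel_y):
        # rel_x, rel_y: topicId -> set of relevant results (within the top 20)
        # returned by two different section queries.
        topics = rel_x.keys() & rel_y.keys()
        inter = sum(len(rel_x[t] & rel_y[t]) for t in topics) / len(topics)
        only_x = sum(len(rel_x[t] - rel_y[t]) for t in topics) / len(topics)
        return inter, only_x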

  10. Section Queries • Average set complement % of relevant results • E.g. n% of the relevant results returned by section X were not returned by section Y

  11. Next • Practical Significance • Does section-based relevance feedback (SRF) provide benefits over standard relevance feedback (RF)?
