This document explores various methodologies for assessing relevance in information retrieval, emphasizing user feedback and system performance. It discusses personal assessment of relevance, the extension of dialog with relevance feedback (RelFbk), and aggregated assessments of search engine performance. Key components include cognitive assumptions affecting user opinions, the use of nonmetric relevance scales, and effective measures for evaluating retrieval success such as precision and recall metrics. The insights aim to enhance the effectiveness of retrieval systems and user satisfaction.
Assessing The Retrieval A.I Lab 2007.01.20 박동훈
Contents • 4.1 Personal Assessment of Relevance • 4.2 Extending the Dialog with RelFbk • 4.3 Aggregated Assessment : Search Engine Performance • 4.4 RAVE : A Relevance Assessment Vehicle • 4.5 Summary
4.1 Personal Assessment of Relevance • 4.1.1 Cognitive Assumptions • Users trying to do ‘object recognition’ • Comparison with respect to prototypic document • Reliability of user opinions? • Relevance Scale • RelFbk is nonmetric
RelFbk is nonmetric • Users naturally provide only preference information • Not a (metric) measurement of how relevant a retrieved document is!
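To make the point concrete, here is a hypothetical sketch of how such feedback might be represented: not as numeric relevance scores, but as bare "di preferred to dj" pairs with no magnitude attached (the names preference_pairs, preferred, and not_preferred are assumptions, not the chapter's notation).

```python
from itertools import product

def preference_pairs(preferred, not_preferred):
    """RelFbk as nonmetric preferences: every liked doc outranks every disliked one."""
    return set(product(preferred, not_preferred))

print(preference_pairs({"d2", "d5"}, {"d1", "d3"}))
# e.g. {('d2', 'd1'), ('d2', 'd3'), ('d5', 'd1'), ('d5', 'd3')}
```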
4.2 Extending the Dialog with RelFbk • RelFbk Labeling of the Retr Set
4.2.2 Document Modifications due to RelFbk • Fig 4.7: Change the documents!? Make them match more/less the queries that did / did not successfully match them
4.3 Aggregated Assessment : Search Engine Performance • 4.3.1 Underlying Assumptions • RelFbk(q,di) assessments independent • Users’ opinions will all agree with single ‘omniscient’ expert’s
4.3.2 Consensual relevance • Consensually relevant
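One simple way to operationalize this, sketched under an assumed majority-vote rule (not necessarily the chapter's exact procedure): pool several users' binary RelFbk judgments and call a document consensually relevant when most assessors agree.

```python
def consensually_relevant(judgments, threshold=0.5):
    """judgments: doc id -> list of binary RelFbk votes from different users."""
    return {doc for doc, votes in judgments.items()
            if sum(votes) / len(votes) > threshold}

print(consensually_relevant({"d1": [1, 1, 0], "d2": [0, 1, 0], "d3": [1, 1, 1]}))
# {'d1', 'd3'} (set order may vary)
```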
4.3.4 Basic Measures • Relevant versus Retrieved Sets
Contingency table • NRet : the number of retrieved documents • NNRet : the number of documents not retrieved • NRel : the number of relevant documents • NNRel : the number of irrelevant documents • NDoc : the total number of documents
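As a concrete illustration of these counts, the sketch below (Python; the cell names a, b, c, d and the example document ids are assumptions) builds the contingency table from a retrieved set and a relevant set, then derives precision and recall from it.

```python
def contingency(retrieved, relevant, all_docs):
    """Contingency counts for one query, using set arithmetic."""
    a = len(retrieved & relevant)              # relevant and retrieved
    b = len(retrieved - relevant)              # irrelevant but retrieved
    c = len(relevant - retrieved)              # relevant but not retrieved
    d = len(all_docs - retrieved - relevant)   # irrelevant and not retrieved
    return a, b, c, d                          # a+b = NRet, a+c = NRel, a+b+c+d = NDoc

a, b, c, d = contingency({1, 2, 3, 4}, {2, 3, 4, 7, 9}, set(range(1, 11)))
precision = a / (a + b)   # 3/4 of what was retrieved is relevant
recall    = a / (a + c)   # 3/5 of what is relevant was retrieved
```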
4.3.5 Ordering the Retr set • Each document assigned hitlist rank Rank(di) • Descending Match(q,di) • Rank(di)<Rank(dj) ⇔ Match(q,di)>Match(q,dj) • Rank(di)<Rank(dj) ⇔ Pr(Rel(di))>Pr(Rel(dj)) • Coordination level : document’s rank in Retr • Number of keywords shared by doc and query • Goal: Probability Ranking Principle
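A minimal sketch of hitlist ordering (the names rank_hitlist and match_scores are assumed): documents are sorted by descending Match(q, d), so a lower rank corresponds to a higher match score, in line with the Probability Ranking Principle.

```python
def rank_hitlist(match_scores):
    """match_scores: doc id -> Match(q, d). Returns (rank, doc, score), rank 1 = best match."""
    ordered = sorted(match_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, doc, score) for rank, (doc, score) in enumerate(ordered, start=1)]

print(rank_hitlist({"d1": 0.2, "d2": 0.9, "d3": 0.5}))
# [(1, 'd2', 0.9), (2, 'd3', 0.5), (3, 'd1', 0.2)]
```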
A tale of two retrievals (Query 1 vs. Query 2)
Recall/precision curve (Query 1)
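One way to produce such a curve, sketched with assumed names: walk down the ranked hitlist and record a (recall, precision) point each time a relevant document is encountered. Running it on two different orderings of the same relevant documents reproduces the "tale of two retrievals" contrast.

```python
def recall_precision_curve(ranked_docs, relevant):
    """Emit one (recall, precision) point per relevant document found in the hitlist."""
    points, hits = [], 0
    for i, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / i))
    return points

rel = {"a", "b", "c"}
print(recall_precision_curve(["a", "b", "x", "c"], rel))            # precision stays high
print(recall_precision_curve(["x", "a", "y", "b", "z", "c"], rel))  # precision decays faster
```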
4.3.6 Normalized recall • Compares the actual ranking against the best and worst possible orderings • ri : hitlist rank of the i-th relevant document
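A sketch of normalized recall under its usual formulation (function and argument names are assumed): the sum of the relevant documents' hitlist ranks is compared with the best possible sum (ranks 1..NRel) and scaled by the worst-case spread, giving 1.0 for the best ordering and 0.0 for the worst.

```python
def normalized_recall(rel_ranks, n_docs):
    """rel_ranks: hitlist ranks r_i of the relevant documents (1-based).
    R_norm = 1 - (sum(r_i) - sum(i)) / (N_Rel * (N_Doc - N_Rel))."""
    n_rel = len(rel_ranks)
    best = n_rel * (n_rel + 1) // 2            # ranks 1..N_Rel
    return 1.0 - (sum(rel_ranks) - best) / (n_rel * (n_docs - n_rel))

print(normalized_recall([1, 2, 3], n_docs=10))   # 1.0, best case
print(normalized_recall([8, 9, 10], n_docs=10))  # 0.0, worst case
```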
4.3.8 One-Parameter Criteria • Combining recall and precision • Classification accuracy • Sliding ratio • Point alienation
Combining recall and precision • F-measure • [Jardine & van Rijsbergen, 1971] • [Lewis & Gale, 1994] • Effectiveness • [van Rijsbergen, 1979] • E = 1 - F, α = 1/(β² + 1) • α = 0.5 ⇒ harmonic mean of precision & recall
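A sketch of these combinations (function names assumed): F weights precision against recall via β, and van Rijsbergen's E is simply 1 - F; with β = 1 (α = 0.5), F reduces to the harmonic mean of precision and recall.

```python
def f_measure(precision, recall, beta=1.0):
    """F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 is the harmonic mean."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (b2 + 1) * precision * recall / (b2 * precision + recall)

def effectiveness(precision, recall, beta=1.0):
    """van Rijsbergen's E = 1 - F (lower is better)."""
    return 1.0 - f_measure(precision, recall, beta)

print(f_measure(0.75, 0.6))      # ~0.667
print(effectiveness(0.75, 0.6))  # ~0.333
```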
Classification accuracy • Accuracy: correct identification of both relevant and irrelevant documents
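In terms of the contingency counts sketched earlier (reusing the assumed cell names a, b, c, d): accuracy is the fraction of all documents handled correctly, i.e. relevant-and-retrieved plus irrelevant-and-not-retrieved over the whole collection.

```python
def accuracy(a, b, c, d):
    """(relevant & retrieved + irrelevant & not retrieved) / NDoc."""
    return (a + d) / (a + b + c + d)

print(accuracy(3, 1, 2, 4))  # 0.7 for the earlier example (NDoc = 10)
```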
Sliding ratio • Imagine a nonbinary, metric Rel(di) measure • Rank1, Rank2 computed by two separate systems
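A sketch of the sliding ratio under one common formulation (the names and the exact normalization are assumptions): at each hitlist cutoff, the relevance accumulated by one ranking is divided by the relevance accumulated by the other (typically the ideal ordering), so the ratio climbs toward 1.0 as the two orderings converge.

```python
def sliding_ratio(rel_scores, ranking_a, ranking_b):
    """rel_scores: doc id -> nonbinary, metric Rel(d).
    Ratio of cumulated relevance of ranking_a to ranking_b at each cutoff."""
    ratios, cum_a, cum_b = [], 0.0, 0.0
    for doc_a, doc_b in zip(ranking_a, ranking_b):
        cum_a += rel_scores[doc_a]
        cum_b += rel_scores[doc_b]
        ratios.append(cum_a / cum_b if cum_b else 0.0)
    return ratios

rel = {"d1": 3, "d2": 2, "d3": 0, "d4": 1}
ideal  = ["d1", "d2", "d4", "d3"]          # ordered by decreasing Rel(d)
system = ["d2", "d3", "d1", "d4"]
print(sliding_ratio(rel, system, ideal))   # [0.67, 0.4, 0.83, 1.0] (approx.)
```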
Point alienation • Developed to measure human preference data • Capturing fundamental nonmetric nature of RelFbk
4.3.9 Test corpora • More data required for “test corpus” • Standard test corpora • TREC: Text REtrieval Conference • TREC’s refined queries • TREC constantly expanding, refining tasks
More data required for “test corpus” • Documents • Queries • Relevance assessments Rel(q,d) • Perhaps other data too • Classification data (Reuters) • Hypertext graph structure (EB5)
TREC constantly expanding, refining tasks • Ad hoc query task • Routing/filtering task • Interactive task
Other Measures • Expected search length (ESL) • Length of “path” as the user walks down the HitList • ESL = number of irrelevant documents examined before each relevant document • ESL for random retrieval • ESL reduction factor
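A sketch of ESL in the spirit of Cooper's measure (the function name and the stopping rule "first n_wanted relevant documents" are assumptions): count the irrelevant documents the user must pass over while walking down the hitlist.

```python
def expected_search_length(ranked_docs, relevant, n_wanted=1):
    """Number of irrelevant documents examined before the n_wanted-th relevant one."""
    irrelevant_seen, found = 0, 0
    for doc in ranked_docs:
        if doc in relevant:
            found += 1
            if found == n_wanted:
                return irrelevant_seen
        else:
            irrelevant_seen += 1
    return irrelevant_seen   # hitlist exhausted before n_wanted relevant docs were found

print(expected_search_length(["x", "a", "y", "b"], {"a", "b"}, n_wanted=2))  # 2
```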
4.5 Summary • Discussed both metric and nonmetric relevance feedback • The difficulties in getting users to provide relevance judgments for documents in the retrieved set • Quantified several measures of system performance