
Information Retrieval

This article discusses the importance of evaluating information retrieval (IR) systems and outlines Lancaster's four levels of evaluation: effectiveness, benefits, cost effectiveness, and cost benefits. It introduces various measures of effectiveness, such as recall, precision, relevance, pertinence, and timeliness, and highlights the criteria commonly used to evaluate retrieval performance. The article also explores the roles of objective and subjective knowledge in IR evaluation and examines the components of an evaluation process. Finally, it discusses the limitations of the classic IR model and other factors that influence the use of IR systems.


Presentation Transcript


  1. Information Retrieval Evaluation and the Retrieval Process

  2. Why evaluate an IR system? • To select between alternative systems • To determine if a system meets expressed and unexpressed needs of current users and non-users • To improve IR systems and determine if improvement actually occurred • To develop cost models

  3. 4 levels of evaluation - Lancaster • Effectiveness • Benefits • Cost effectiveness • Cost benefits

  4. Effectiveness • What a system does well, e.g., percentage of reference questions answered accurately, the recall and precision of a literature search • There are a number of measures of effectiveness

  5. Measuring Effectiveness

  6. Measures of Effectiveness • Recall • Precision • Relevance • Pertinence or utility • Novelty ratio • Fallout and noise • Timeliness • Coverage • Generality

  7. Recall and Precision Ratios • In the standard 2×2 contingency table, a = relevant items retrieved, b = non-relevant items retrieved, c = relevant items not retrieved, and d = non-relevant items not retrieved • Recall a/(a+c): proportion of relevant items retrieved out of the total number of relevant items contained in a database • Precision a/(a+b): a signal-to-noise ratio, the proportion of retrieved materials that are relevant to a query • Used together, the two ratios express the filtering capacity of the system • Recall and precision tend to be inversely related
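As a minimal sketch, both ratios can be computed directly from the contingency counts defined above (the function names and example figures are illustrative, not from the original presentation):

```python
def recall(a: int, c: int) -> float:
    """Proportion of all relevant items that were retrieved: a / (a + c)."""
    return a / (a + c) if (a + c) else 0.0

def precision(a: int, b: int) -> float:
    """Proportion of retrieved items that are relevant: a / (a + b)."""
    return a / (a + b) if (a + b) else 0.0

# Example: 40 relevant items retrieved, 10 non-relevant items retrieved,
# and 60 relevant items left unretrieved in the database.
print(recall(a=40, c=60))     # 0.4
print(precision(a=40, b=10))  # 0.8
```

The inverse tendency shows up here: broadening the query to pull in more of the 60 missed items would raise recall, but typically drags extra non-relevant items in with them and lowers precision.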

  8. Relevance and Pertinence • Relevance (or generality ratio) (a+c)/(a+b+c+d): the proportion of materials in the system that are relevant to a query. Can be hard to ascertain without scanning the entire database. • Pertinence: the relationship between a document and an information need. Utility refers to the subset of a that is actually used.

  9. Novelty Ratio, Fallout and Noise • Novelty ratio: the subset of a that is actually new to the person evaluating relevance • Fallout b/(b+d): the proportion of non-relevant items in the database that are retrieved • Noise b/(a+b): the proportion of retrieved items that are not relevant (the complement of precision)
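The remaining set-based ratios follow the same pattern; this sketch is illustrative, and the `new_to_user` count (how many of the a relevant retrieved items the user had not seen before) is a hypothetical input that the system cannot compute on its own:

```python
def generality(a: int, b: int, c: int, d: int) -> float:
    """Proportion of the whole database that is relevant: (a + c) / (a + b + c + d)."""
    total = a + b + c + d
    return (a + c) / total if total else 0.0

def fallout(b: int, d: int) -> float:
    """Proportion of non-relevant items that were retrieved: b / (b + d)."""
    return b / (b + d) if (b + d) else 0.0

def novelty_ratio(new_to_user: int, a: int) -> float:
    """Share of the relevant retrieved items that are new to the user."""
    return new_to_user / a if a else 0.0

# 1000-item database: 100 relevant items, of which 40 were retrieved.
print(generality(a=40, b=10, c=60, d=890))  # 0.1
print(fallout(b=10, d=890))                 # ~0.011
print(novelty_ratio(new_to_user=25, a=40))  # 0.625
```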

  10. Timeliness, Coverage, and Generality • Timeliness and coverage: factors that affect assessments of relevance and pertinence • Generality: the proportion of documents in the entire database that are related to a particular request. The higher the generality ratio, the easier a search should be • Accuracy

  11. Criteria Commonly Used to Evaluate Retrieval Performance • Recall • Precision • User effort • Amount of time a user spends conducting a search • Amount of time a user spends negotiating his inquiry and then separating relevant from irrelevant items • Response time • Benefits • Search costs • Cost effectiveness • Cost benefits

  12. Objective vs. Subjective Knowledge • Factual or artifactual knowledge vs. how knowledge is constructed or modeled within an individual’s mind • Subjective knowledge (and therefore relevance judgments) varies from person to person, e.g., individual aesthetic judgments or problem solving methods

  13. Benefits • What good a system does, e.g., how an information system benefits its users • Hard to measure

  14. Search Costs • Economics of using different databases • Using natural language indexing can shift effort onto the searcher

  15. Cost Effectiveness • Relationship of cost criteria to quality criteria, e.g., unit cost per relevant or new item retrieved
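A back-of-the-envelope sketch of the unit-cost idea (the dollar figures and counts below are invented for illustration):

```python
def cost_per_relevant_item(search_cost: float, relevant_retrieved: int) -> float:
    """Unit cost: total cost of the search divided by the relevant items it found."""
    return search_cost / relevant_retrieved if relevant_retrieved else float("inf")

# Comparing two hypothetical systems on the same query:
print(cost_per_relevant_item(search_cost=30.00, relevant_retrieved=40))  # 0.75 per relevant item
print(cost_per_relevant_item(search_cost=18.00, relevant_retrieved=12))  # 1.50 per relevant item
```

Note that the cheaper search is not the more cost-effective one here: the first system costs more in total but less per relevant item retrieved.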

  16. Cost Benefits • Cost savings through use of one information system over another • Increased, or avoidance of loss of, productivity • Improved decision-making or reduction of personnel needed to make decisions • Avoidance of duplication of effort

  17. Components of an Evaluation 1. Defining the scope of the evaluation - Formative vs. summative 2. Designing the evaluation program 3. Execution of the evaluation 4. Analysis and interpretation of the results 5. Modifying the system based on the results 6. Iteration if necessary (go back to step 3)

  18. Real Life vs. Experimental Systems • Experiments and benchmark tests: standardized collections, queries, and relevance judgments; tested against multiple systems; evaluated on recall and precision; biases often built into system design • Predictive evaluation: expert reviews; usage simulation such as walkthroughs • Real life: observing users' interactions with the system; eliciting users' opinions
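A minimal sketch of the benchmark pattern described above: a fixed query set, fixed relevance judgments, and macro-averaged recall and precision. The collection, judgments, and the `run_query` stub are all stand-ins for a real test collection and the system under test:

```python
def run_query(query: str) -> set[str]:
    """Stub for the system under test; a real system returns retrieved document IDs."""
    return {"d1", "d3"}

judgments = {                  # standardized queries -> IDs of the relevant documents
    "economics of IR": {"d1", "d2", "d3"},
    "berrypicking":    {"d3", "d4"},
}

recalls, precisions = [], []
for query, relevant in judgments.items():
    retrieved = run_query(query)
    hits = len(retrieved & relevant)          # the 'a' cell for this query
    recalls.append(hits / len(relevant))
    precisions.append(hits / len(retrieved) if retrieved else 0.0)

print(sum(recalls) / len(recalls))        # macro-averaged recall
print(sum(precisions) / len(precisions))  # macro-averaged precision
```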

  19. Classic IR Model - Bates • Document --> Document representation • Information need --> Query • The document representation is matched against the query
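A toy illustration of the matching step, not Bates's own formulation: documents are reduced to bag-of-words representations, the information need is expressed as a query in the same vocabulary, and the two are matched by term overlap:

```python
def represent(text: str) -> set[str]:
    """Crude document representation: the set of lowercased terms."""
    return set(text.lower().split())

documents = {
    "doc1": "evaluation of information retrieval systems",
    "doc2": "berrypicking and evolving search queries",
}

query = represent("information retrieval evaluation")  # Query <-- Information need

# Match each document representation against the query by term overlap.
scores = {doc_id: len(represent(text) & query) for doc_id, text in documents.items()}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
# [('doc1', 3), ('doc2', 0)]
```

The rigidity of this matching step is exactly what the next slide's criticisms target: the user must phrase the need in the system's vocabulary, and the need itself is treated as static.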

  20. Problems with the Classic IR Model • Users cannot use their own language • Different users have different needs • Users have different information needs at different times • Users are not always able to read and write • Information need may evolve during the search process • Some users are not concerned about precision and recall • Users may want to eliminate known items • Users may want more cues to assist in assessing relevance

  21. Other factors influencing use • Accessibility - physical, intellectual, and psychological - and ease of use are the most important determinants of whether an information service is used • Principle of Least Effort • Perceived technical quality also affects the choice of first source • Perceptions of accessibility are influenced by experience

  22. Berrypicking • Search queries are not static, but evolve • Searchers gather information in bits and pieces • Searchers use a variety of search techniques • Searchers use a variety of other sources as well as databases

  23. Search Strategies • Footnote chasing • Citation searching • Journal run • Area scanning • Subject searches in bibliographies, abstracts, and indexes • Author searching

  24. Making Retrieval More Effective • The more techniques used, the more effective a search is likely to be • Users should be able to search in ways that are already familiar or that they have found to be effective • A visual representation of the contents of a system may aid users in orienting themselves
