
WIRED Week 3

Presentation Transcript


  1. WIRED Week 3 • Syllabus Update (next week) • Readings Overview • Quick Review of Last Week’s IR Models (if time) • Evaluating IR Systems • Understanding Queries • Assignment Overview & Scheduling • Leading WIRED Topic Discussions • Web Information Retrieval System Evaluation & Presentation • Projects and/or Papers Discussion • Initial Ideas • Evaluation • Revise & Present

  2. Evaluating IR Systems • Recall and Precision • Alternative Measures • Reference Collections • What • Why • Trends

  3. Why Evaluate IR Systems? • Leave it to the developers? • No bugs • Fully functional • Let the market (users) decide? • Speed • (Perceived) accuracy • Relevance is relevant • Different types of searches, data and users • “How precise is the answer set?” p 73

  4. Retrieval Performance Evaluation • Task • Batch or Interactive • Each needs a specific interface • Setting • Context • New search • Monitoring • Usability • Lab tests • Real world (search log) analysis

  5. Recall and Precision • Basic evaluation measures for IR system performance • Recall: the fraction of relevant documents that are retrieved • 100% is perfect recall • Every document that is relevant is found • Precision: the fraction of retrieved documents that are relevant • 100% is perfect precision • Every document that is retrieved is relevant
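
For reference, the two measures can be written compactly. With R the set of relevant documents in the collection, A the answer set returned by the system, and R_a = R ∩ A, the usual textbook formulation (this notation is assumed here, not taken from the slides) is:

```latex
\mathrm{Recall} = \frac{|R_a|}{|R|}
\qquad
\mathrm{Precision} = \frac{|R_a|}{|A|}
```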

  6. Recall and Precision Goals • Everything relevant is found (recall) • The right set of documents is pulled from the found set (precision) • What about ranking? • Is ranking an absolute measure of relevance for the query? • Ranking is ordinal in almost all cases

  7. Recall and Precision Considered • 100 documents have been analyzed • 10 documents relevant to the query in the set • 4 documents are found and all are relevant • ??% recall, ??% precision • 8 documents are found, but 4 are relevant • ??% recall, ??% precision • Which is more important?
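
A minimal sketch working the arithmetic through for the two scenarios above; the counts come straight from the slide, and the helper function is just for illustration:

```python
# Recall/precision arithmetic for the two scenarios on this slide (illustrative sketch).
relevant_in_collection = 10            # relevant documents in the 100-document set

def recall_precision(retrieved, relevant_retrieved):
    recall = relevant_retrieved / relevant_in_collection
    precision = relevant_retrieved / retrieved
    return recall, precision

print(recall_precision(4, 4))   # 4 found, all 4 relevant  -> (0.4, 1.0)
print(recall_precision(8, 4))   # 8 found, only 4 relevant -> (0.4, 0.5)
```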

  8. Are Recall and Precision Appropriate? • Disagreements over the “perfect” answer set • User errors in using results • Redundancy of results • Result diversity • Metadata • Dynamic data • Indexability • Recency of information may be key • Is a single combined measure better? • User evaluation

  9. Back to the User • User evaluation • Is one answer good enough? Rankings • Satisficing • Studies of Relevance are key

  10. Other Evaluation Measures • Harmonic Mean • Single, combined measure • Between 0 (none) & 1 (all) • Only high when both P & R are high • Still a percentage • E measure • A user-set parameter weights the relative importance of R & P • Different tasks (legal, academic) • An interactive search?
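
A sketch of the usual formulations of these two measures, assuming r(j) and P(j) are the recall and precision at the j-th document in the ranking and b is the user-chosen parameter mentioned above (this follows the standard textbook presentation):

```latex
F(j) = \frac{2}{\dfrac{1}{r(j)} + \dfrac{1}{P(j)}}
\qquad
E(j) = 1 - \frac{1 + b^{2}}{\dfrac{b^{2}}{r(j)} + \dfrac{1}{P(j)}}
```

With b = 1 the E measure reduces to 1 minus the harmonic mean; other values of b shift the weight between recall and precision.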

  11. Coverage and Novelty • System effects • Relative recall • Relative effort • More natural, user-understandable measures • The user already knows that some % of the documents are relevant • Coverage = % of those known relevant documents that the system finds • Novelty = % of the relevant documents found that the user didn’t already know of • Content of document • Document itself • Author of document • Purpose of document
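
Stated a little more precisely (the notation is assumed here, following the usual textbook definitions): let U be the set of relevant documents the user already knows about, R_k the retrieved documents that are in U, and R_u the retrieved relevant documents that were previously unknown to the user. Then:

```latex
\mathrm{coverage} = \frac{|R_k|}{|U|}
\qquad
\mathrm{novelty} = \frac{|R_u|}{|R_u| + |R_k|}
```

High coverage means the system finds most of what the user expected to see; high novelty means most of the relevant documents it finds are new to the user.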

  12. Reference Collections • Testbeds for IR evaluation • TREC (Text Retrieval Conference) set • Industry focus • Topic-based or General • Summary tables for tasks (queries) • R & P averages • Document analysis • Measures for each topic • CACM (general CS) • ISI (academic, indexed, industrial)

  13. Trends in IR Evaluation • Personalization • Dynamic Data • Multimedia • User Modeling • Machine Learning (CPU/$)

  14. Understanding Queries • Types of Queries: • Keyword • Context • Boolean • Natural Language • Pattern Matching • More like this… • Metadata • Structural Environments

  15. Boolean • AND, OR, NOT • Used individually or in combination • Decision-tree parsing for the system • Not so easy for users to build advanced queries • Hard to backtrack and see differences in results
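
A minimal sketch of how a system can evaluate Boolean queries over an inverted index; the index contents here are made-up toy data, but the point is that AND, OR, and NOT map directly onto set intersection, union, and difference:

```python
# Toy inverted index: term -> set of document IDs containing the term (hypothetical data).
index = {
    "web":        {1, 2, 3, 5},
    "retrieval":  {2, 3, 4},
    "evaluation": {3, 5},
}
all_docs = {1, 2, 3, 4, 5}

print(index["web"] & index["retrieval"])         # web AND retrieval        -> {2, 3}
print(index["web"] | index["evaluation"])        # web OR evaluation        -> {1, 2, 3, 5}
print(index["retrieval"] - index["evaluation"])  # retrieval NOT evaluation -> {2, 4}
print(all_docs - index["web"])                   # NOT web                  -> {4}
```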

  16. Keyword • Single word (most common) • Sets of words • “Phrases” • Context • “Phrases” • Near (within a given number of characters, words, documents, or links)

  17. Natural Language • Asking • Quoting • Fuzzy matches • Different evaluation methods might be needed • Dynamic data “indexing” problematic • Multimedia challenges

  18. Pattern Matching • Words • Prefixes “comput*” • Suffixes “*ology” • Substrings “*exas*” • Ranges “four ?? years ago” • Regular Expressions (GREP) • Error threshold • User errors
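
A small sketch of how the patterns on this slide can be expressed as regular expressions; Python’s re module is used only as an example, and the word list is illustrative:

```python
import re

words = ["computer", "computing", "biology", "geology", "Texas", "taxes"]

prefix    = re.compile(r"^comput")   # prefix query    comput*
suffix    = re.compile(r"ology$")    # suffix query    *ology
substring = re.compile(r"exas")      # substring query *exas*

print([w for w in words if prefix.search(w)])     # ['computer', 'computing']
print([w for w in words if suffix.search(w)])     # ['biology', 'geology']
print([w for w in words if substring.search(w)])  # ['Texas']

# Range-style pattern: "four ?? years ago" with exactly two characters in the gap
phrase = re.compile(r"four .. years ago")
print(bool(phrase.search("four 20 years ago")))   # True
```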

  19. Query Protocols • HTTP • Z39.50 • Client–server APIs • WAIS • Information/database connections • ODBC • JDBC • P2P
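
Of these, HTTP is the one seen most directly on the Web: a keyword query is typically just an HTTP GET with the terms URL-encoded in the query string. A minimal sketch, with a hypothetical endpoint and parameter name used purely for illustration:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical search endpoint and parameter name, for illustration only.
query = urlencode({"q": "web information retrieval evaluation"})
url = "https://search.example.com/search?" + query

with urlopen(url) as response:                 # the query is just an HTTP GET
    page = response.read().decode("utf-8", errors="replace")
print(page[:200])
```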

  20. Assignment Overview & Scheduling • Leading WIRED Topic Discussions • # in class = # of weeks left? • Web Information Retrieval System Evaluation & Presentation • 5 page written evaluation of a Web IR System • technology overview (how it works) • a brief history of the development of this type of system (why it works better) • intended uses for the system (who, when, why) • (your) examples or case studies of the system in use and its overall effectiveness

  21. How can (Web) IR be better? • Better IR models • Better User Interfaces • More to find vs. easier to find • Scriptable applications • New interfaces for applications • New datasets for applications • Projects and/or Papers Overview

  22. Project Idea #1 – simple HTML • Graphical Google • What kind of document? • When was the document created?

  23. Project Ideas • Google History: keeps track of what I’ve seen and not seen • Searching when it counts: Financial and Health information requires guided, quality search
