


Presentation Transcript


  1. Interfaces for Information Retrieval
     Ray Larson & Warren Sack
     IS202: Information Organization and Retrieval
     Fall 2001, UC Berkeley, SIMS
     Lecture authors: Marti Hearst, Ray Larson, Warren Sack
     IS202: Information Organization & Retrieval

  2. Today
     • What is HCI?
     • Interfaces for IR using the standard model of IR
     • Interfaces for IR using new models of IR and/or different models of interaction

  3. Human-Computer Interaction (HCI): What is HCI?
     • Human
       • the end-user of a program
     • Computer
       • the machine the program runs on
     • Interaction
       • the user tells the computer what they want
       • the computer communicates results
     (slide adapted from James Landay)

  4. What is HCI?
     [Diagram labels: Organizational & Social Issues, Task, Design, Technology, Human]
     (slide by James Landay)


  6. Shneiderman on HCI
     • Well-designed interactive computer systems:
       • promote positive feelings of success, competence, and mastery
       • allow users to concentrate on their work, rather than on the system

  7. Usability Design Goals
     • Ease of learning
       • faster the second time and so on...
     • Recall
       • remember how from one session to the next
     • Productivity
       • perform tasks quickly and efficiently
     • Minimal error rates
       • if they occur, good feedback so user can recover
     • High user satisfaction
       • confident of success
     (slide by James Landay)

  8. Who builds UIs?
     • A team of specialists
       • graphic designers
       • interaction / interface designers
       • technical writers
       • marketers
       • test engineers
       • software engineers
     (slide by James Landay)

  9. How to Design and Build UIs
     • Task analysis
     • Rapid prototyping
     • Evaluation
     • Implementation
     Iterate at every stage!
     [Diagram labels: Design, Prototype, Evaluate]
     (slide adapted from James Landay)

  10. Task Analysis
      • Observe existing work practices
      • Create examples and scenarios of actual use
      • Try out new ideas before building software

  11. Task = Information Access
      • The standard interaction model for information access
        (1) start with an information need
        (2) select a system and collections to search on
        (3) formulate a query
        (4) send the query to the system
        (5) receive the results
        (6) scan, evaluate, and interpret the results
        (7) stop, or
        (8) reformulate the query and go to step 4
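
The eight steps of the standard model can be read as a loop. The following is a minimal sketch, not part of the lecture: the toy `COLLECTION`, the `run_query` matcher (plain conjunctive term matching), and the `satisfied` and `reformulate` callbacks are all hypothetical stand-ins for the system and for the user's judgment.

```python
from typing import Callable, Dict, List, Set

# Toy collection standing in for step 2 (hypothetical data).
COLLECTION: Dict[str, str] = {
    "d1": "user interface design for information retrieval",
    "d2": "database reliability and recovery",
    "d3": "visualization of retrieval results",
}

def run_query(query: Set[str]) -> List[str]:
    """Steps 4-5: send the query, receive ids of documents containing every term."""
    return sorted(doc_id for doc_id, text in COLLECTION.items()
                  if query <= set(text.split()))

def search_session(query: Set[str],
                   satisfied: Callable[[List[str]], bool],
                   reformulate: Callable[[Set[str], List[str]], Set[str]],
                   max_rounds: int = 5) -> List[str]:
    """Steps 3-8: query, inspect the results, then stop or reformulate and retry."""
    results: List[str] = []
    for _ in range(max_rounds):
        results = run_query(query)            # steps 4-5
        if satisfied(results):                # steps 6-7: evaluate, maybe stop
            break
        query = reformulate(query, results)   # step 8, then back to step 4
    return results

# The first query is too narrow, so the simulated user drops a term.
hits = search_session(
    {"retrieval", "reliability"},
    satisfied=lambda r: len(r) > 0,
    reformulate=lambda q, r: q - {"reliability"},
)
print(hits)
```

The interface questions on the next slide map onto the pieces of this loop: choosing `COLLECTION`, building the initial `query`, implementing `satisfied`, and implementing `reformulate`.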

  12. HCI Interface questions using the standard model of IR
      • Where does a user start? Faced with a large set of collections, how can a user choose one to begin with?
      • How will a user formulate a query?
      • How will a user scan, evaluate, and interpret the results?
      • How can a user reformulate a query?

  13. Interface design: Is it always HCI or the highway?
      • No, there are other ways to design interfaces, including using methods from
        • Art
        • Architecture
        • Sociology
        • Anthropology
        • Narrative theory
        • Geography

  14. Information Access: Is the standard IR model always the model?
      • No, other models have been proposed and explored, including
        • Berrypicking (Bates, 1989)
        • Sensemaking (Russell et al., 1993)
        • Orienteering (O’Day and Jeffries, 1993)
        • Intermediaries (Maglio and Barrett, 1996)
        • Social Navigation (Dourish and Chalmers, 1994)
        • Agents (e.g., Maes, 1992)
      • And don’t forget experiments like (Blair and Maron, 1985)

  15. IR+HCI Question 1: Where does the user start?

  16. Dialog box for choosing sources in the old Lexis-Nexis interface

  17. Where does a user start?
      • Supervised (Manual) Category Overviews
        • Yahoo!
        • HiBrowse
        • MeSHBrowse
      • Unsupervised (Automated) Groupings
        • Clustering
        • Kohonen Feature Maps


  19. Incorporating Categories into the Interface
      • Yahoo is the standard method
      • Problems:
        • Hard to search, meant to be navigated
        • Only one category per document (usually)

  20. More Complex Example: MeSH and MedLine
      • MeSH Category Hierarchy
        • Medical Subject Headings
        • ~18,000 labels
        • manually assigned
        • ~8 labels/article on average
        • avg depth: 4.5, max depth: 9
      • Top Level Categories: anatomy, diagnosis, related disc, animals, psych, technology, disease, biology, humanities, drugs, physics

  21. MeSHBrowse (Korn & Shneiderman 95)
      Only the relevant subset of the hierarchy is shown at one time.

  22. HiBrowse (Pollitt 97)
      Browsing several different subsets of category metadata simultaneously.

  23. Large Category Sets
      • Problems for User Interfaces
        • Too many categories to browse
        • Too many docs per category
        • Docs belong to multiple categories
        • Need to integrate search
        • Need to show the documents

  24. Text Clustering
      • Finds overall similarities among groups of documents
      • Finds overall similarities among groups of tokens
      • Picks out some themes, ignores others

  25. Scatter/Gather (Cutting, Pedersen, Tukey & Karger 92, 93; Hearst & Pedersen 95)
      • How it works
        • Cluster sets of documents into general “themes”, like a table of contents
        • Display the contents of the clusters by showing topical terms and typical titles
        • User chooses subsets of the clusters and re-clusters the documents within
        • Resulting new groups have different “themes”
      • Originally used to give collection overview
      • Evidence suggests more appropriate for displaying retrieval results in context
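
The scatter, summarize, and gather cycle described above can be sketched in a few lines. This is an illustration only: the clustering here is a deliberately naive nearest-seed assignment on Jaccard similarity, not the algorithm from the cited papers, and the function names and sample documents are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

def tokens(text: str) -> Set[str]:
    return set(text.lower().split())

def jaccard(a: Set[str], b: Set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def scatter(docs: Dict[str, str], k: int) -> List[List[str]]:
    """Scatter: assign every document to the most similar of k seed documents."""
    ids = sorted(docs)
    seeds = ids[:k]  # naive seed choice (assumption made for this sketch)
    clusters: Dict[str, List[str]] = {s: [] for s in seeds}
    for doc_id in ids:
        best = max(seeds, key=lambda s: jaccard(tokens(docs[doc_id]), tokens(docs[s])))
        clusters[best].append(doc_id)
    return list(clusters.values())

def topical_terms(docs: Dict[str, str], cluster: List[str], n: int = 3) -> List[str]:
    """Summarize a cluster by the terms that occur in the most of its documents."""
    counts = Counter(t for d in cluster for t in tokens(docs[d]))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

def gather(docs: Dict[str, str], clusters: List[List[str]], chosen: List[int]) -> Dict[str, str]:
    """Gather: merge the clusters the user picked into a new, smaller collection."""
    return {d: docs[d] for i in chosen for d in clusters[i]}

docs = {
    "a": "star galaxy telescope",
    "b": "film star actor",
    "c": "galaxy telescope orbit",
    "d": "actor film award",
    "e": "planet orbit telescope",
}
clusters = scatter(docs, k=2)                # first pass over the whole collection
summary = topical_terms(docs, clusters[0])   # what the user sees for cluster 0
sub = scatter(gather(docs, clusters, [0]), k=2)  # user picks cluster 0 and re-clusters
```

Re-clustering only the gathered documents is what produces the different "themes" at each round: the seeds and similarities are recomputed over the smaller set.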

  26. Another use of clustering
      • Use clustering to map the entire huge multidimensional document space into a huge number of small clusters.
      • “Project” these onto a 2D graphical representation
        • Group by doc: SPIRE/Kohonen maps
        • Group by words: Galaxy of News/HotSauce/Semio

  27. Clustering Multi-Dimensional Document Space (image from Wise et al. 95)

  28. Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))

  29. Summary: Clustering
      • Advantages:
        • Get an overview of main themes
        • Domain independent
      • Disadvantages:
        • Many of the ways documents could group together are not shown
        • Not always easy to understand what they mean
        • Different levels of granularity

  30. IR+HCI Question 2: How will a user formulate a query?

  31. Query Specification
      • Interaction Styles (Shneiderman 97)
        • Command Language
        • Form Fill
        • Menu Selection
        • Direct Manipulation
        • Natural Language
      • What about gesture, eye-tracking, or implicit inputs like reading habits?

  32. Command-Based Query Specification
      • command attribute value connector …
        • find pa shneiderman and tw user#
      • What are the attribute names?
      • What are the command names?
      • What are allowable values?
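
To make the `command attribute value connector …` pattern concrete, here is a minimal parsing sketch. It is not the parser of any real catalog system: the connector set, the single-word values, and the names `Clause` and `parse_command` are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple

CONNECTORS = {"and", "or", "not"}  # assumed connector vocabulary

class Clause(NamedTuple):
    connector: Optional[str]  # connector joining this clause to the previous one
    attribute: str
    value: str

def parse_command(line: str) -> Tuple[str, List[Clause]]:
    """Split 'command attribute value [connector attribute value]...' into parts."""
    words = line.split()
    if len(words) < 3:
        raise ValueError("expected: command attribute value")
    command, rest = words[0], words[1:]
    clauses: List[Clause] = []
    connector: Optional[str] = None
    i = 0
    while i < len(rest):
        if i + 1 >= len(rest):
            raise ValueError("attribute without a value")
        clauses.append(Clause(connector, rest[i], rest[i + 1]))
        i += 2
        if i < len(rest):
            if rest[i].lower() not in CONNECTORS:
                raise ValueError(f"expected a connector, got {rest[i]!r}")
            connector = rest[i].lower()
            i += 1
            if i >= len(rest):
                raise ValueError("dangling connector")
    return command, clauses

command, clauses = parse_command("find pa shneiderman and tw user#")
```

Even this small parser shows why the slide's three questions matter: nothing in the syntax tells a user which commands, attribute names, or values are legal, so the interface has to.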

  33. Form-Based Query Specification (Altavista)

  34. Form-Based Query Specification (Melvyl)

  35. Form-Based Query Specification (Infoseek)

  36. Direct Manipulation Specification: VQUERY (Jones 98)

  37. Menu-Based Query Specification (Young & Shneiderman 93)

  38. IR+HCI Question 3: How will a user scan, evaluate, and interpret the results?

  39. Display of Retrieval Results
      Goal: minimize time/effort for deciding which documents to examine in detail
      Idea: show the roles of the query terms in the retrieved documents, making use of document structure

  40. Putting Results in Context
      • Interfaces should
        • give hints about the roles terms play in the collection
        • give hints about what will happen if various terms are combined
        • show explicitly why documents are retrieved in response to the query
        • summarize compactly the subset of interest

  41. Putting Results in Context
      • Visualizations of Query Term Distribution
        • KWIC, TileBars, SeeSoft
      • Visualizing Shared Subsets of Query Terms
        • InfoCrystal, VIBE, Lattice Views
      • Table of Contents as Context
        • Superbook, Cha-Cha, DynaCat
      • Organizing Results with Tables
        • Envision, SenseMaker
      • Using Hyperlinks
        • WebCutter

  42. KWIC (Keyword in Context)
      • An old standard, ignored by internet search engines
        • used in some intranet engines, e.g., Cha-Cha
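
A keyword-in-context display simply shows each hit with a few words on either side. The sketch below is one minimal way to do that; the function name `kwic`, the word-based window, and the bracket marking are illustrative choices, not taken from any of the systems named above.

```python
from typing import List

def kwic(text: str, keyword: str, width: int = 3) -> List[str]:
    """Return one line per occurrence of keyword with `width` words of context."""
    words = text.split()
    lines: List[str] = []
    for i, word in enumerate(words):
        # Ignore case and trailing punctuation when matching the keyword.
        if word.lower().strip(".,;:!?") == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

sample = ("Reliability matters in every database system. "
          "A database without backups is fragile.")
for line in kwic(sample, "database", width=2):
    print(line)
```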

  43. TileBars
      • Graphical Representation of Term Distribution and Overlap
      • Simultaneously Indicate:
        • relative document length
        • query term frequencies
        • query term distributions
        • query term overlap
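
The data behind a TileBars-style display is a small grid: one row per query term, one column per text segment, each cell holding a term count. The sketch below computes that grid under a simplifying assumption: fixed-size word windows stand in for whatever segmentation a real system would use, and the names `tilebar`, `overlap_tiles`, and `render` are hypothetical.

```python
from typing import Dict, List

def tilebar(text: str, terms: List[str], tile_size: int = 5) -> Dict[str, List[int]]:
    """Count each query term in consecutive fixed-size word windows ("tiles")."""
    words = [w.lower().strip(".,;:!?") for w in text.split()]
    tiles = [words[i:i + tile_size] for i in range(0, len(words), tile_size)]
    return {t: [tile.count(t.lower()) for tile in tiles] for t in terms}

def overlap_tiles(rows: Dict[str, List[int]]) -> List[int]:
    """Indices of tiles in which every query term occurs at least once."""
    return [i for i, column in enumerate(zip(*rows.values())) if all(column)]

def render(rows: Dict[str, List[int]]) -> str:
    """Text rendering: a darker character means a higher count in that tile."""
    shades = " .:#"
    return "\n".join(
        f"{term:<12}|" + "".join(shades[min(c, 3)] for c in counts) + "|"
        for term, counts in rows.items()
    )

doc = ("DBMS reliability matters most here. Every DBMS logs its writes. "
       "Banks mostly care about uptime.")
rows = tilebar(doc, ["dbms", "reliability"])
print(render(rows))
```

Each of the four properties on the slide is readable from the grid: the number of columns reflects relative document length, cell values give term frequency, the pattern along a row gives distribution, and columns where all rows are non-zero show overlap.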

  44. TileBars Example
      Query terms: DBMS (Database Systems), Reliability
      What roles do they play in retrieved documents?
      • Mainly about both DBMS & reliability
      • Mainly about DBMS, discusses reliability
      • Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability
      • Mainly about high-tech layoffs


  46. SeeSoft: Showing Text Content using a linear representation and brushing and linking (Eick & Wills 95)

  47. David Small: Virtual Shakespeare

