
User Interfaces for Information Access

This course covers the design and challenges of user interfaces for information search and retrieval, including querying, browsing, and visualization techniques. Topics include understanding user needs, optimizing search precision and recall, and supporting information sensemaking.


Presentation Transcript


  1. User Interfaces for Information Access Marti Hearst IS202, Fall 2005

  2. Outline • Introduction • What do people search for (and how)? • Why is designing for search difficult? • How to Design for Search • HCI and iterative design • What works? • Small details matter • Scaffolding • The Role of DWIM • Core Problems • Query specification and refinement • Browsing and searching collections • Information Visualization for Search • Summary

  3. What Do People Search For?(And How?)

  4. A Spectrum of Information Needs [Diagram: a spectrum running from Question/Answer through Browse and Build to Text Data Mining] • What is the typical height of a giraffe? • What are some good ideas for landscaping my client’s yard? • What are some promising untried treatments for Raynaud’s disease?

  5. Questions and Answers • What is the height of a typical giraffe? • The result can be a simple answer, extracted from existing web pages. • Can specify with keywords or a natural language query • However, most search engines are not set up to handle questions properly. • Get different results using a question vs. keywords

  6. Classifying Queries • Query logs only indirectly indicate a user’s needs • One set of keywords can mean many different things • “barcelona” • “dog pregnancy” • “taxes” • Idea: pair up query logs with which search result the user clicked on. • “taxes” followed by a click on tax forms • Study performed on AltaVista logs • Author noted afterwards that Yahoo logs appear to have a different query balance. Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04
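The click-pairing idea is easy to sketch. Below is a minimal, hypothetical Python version (the log records and field layout are invented; real query logs are messier): group each query string with the results users clicked, so that an ambiguous query like “taxes” can be read through its click targets.

```python
from collections import defaultdict

# Each record: (session_id, query, clicked_url).
# Field names and entries are invented for illustration.
log = [
    ("s1", "taxes", "irs.gov/forms"),
    ("s2", "taxes", "en.wikipedia.org/wiki/Tax"),
    ("s3", "barcelona", "fcbarcelona.com"),
    ("s4", "barcelona", "barcelonaturisme.com"),
]

# Pair each query with the results users clicked on.
clicks_by_query = defaultdict(list)
for session_id, query, url in log:
    clicks_by_query[query].append(url)

# "taxes" followed by a click on tax forms suggests a form-finding goal;
# a click on an encyclopedia article suggests an informational one.
for query, urls in clicks_by_query.items():
    print(query, "->", urls)
```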

  7. Classifying Web Queries Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04

  8. What are people looking for? Check out Google Answers

  9. Information Seeking Behavior • Two parts of a process: • search and retrieval • analysis and synthesis of search results

  10. Standard Model • Assumptions: • Maximizing precision and recall simultaneously • The information need remains static • The value is in the resulting document set
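The first assumption can be made concrete with a toy computation (the document IDs are invented, not from the slides): precision is the fraction of retrieved documents that are relevant, recall is the fraction of relevant documents that were retrieved, and broadening the result set tends to trade one for the other.

```python
def precision_recall(retrieved, relevant):
    # Precision: fraction of retrieved docs that are relevant.
    # Recall: fraction of relevant docs that were retrieved.
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {"d1", "d2", "d3", "d4"}
# A narrow result set: perfectly precise, but misses half the relevant docs.
print(precision_recall({"d1", "d2"}, relevant))                     # (1.0, 0.5)
# A broader result set: better recall, worse precision.
print(precision_recall({"d1", "d2", "d3", "d9", "d10"}, relevant))  # (0.6, 0.75)
```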

  11. Alternative to the Standard Model • Users learn during the search process: • Scanning titles of retrieved documents • Reading retrieved documents • Viewing lists of related topics/thesaurus terms • Navigating hyperlinks • The “berry-picking” model • Interesting information is scattered like berries among bushes • The query is continually shifting Bates, “The Berry-Picking Search: UI Design”, in User Interface Design, Thimbleby (Ed.), Addison-Wesley, 1990

  12. A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89) [Diagram: a meandering berry-picking path through successive queries Q0–Q5] Bates, “The Design of Browsing and Berry-Picking Techniques for the On-line Search Interface”, Online Review 13(5), 1989

  13. Implications • Interfaces should provide clues for where to go next • Interfaces should make it easy to store intermediate results • Interfaces should make it easy to follow trails with unanticipated results • Different types of information needs require different kinds of search tools and interfaces • Lists of ranked results and snippets • Collection browsing tools • Comparison tables • We’ve only begun to scratch the surface!

  14. What People do AFTER the Search • Look for Trends • Make Comparisons • Aggregation and Scaling • Identify a Critical Subset • Assess • Interpret • The rest: • cross-reference • summarize • find evocative visualizations • miscellaneous O’Day & Jeffries, Orienteering in an information landscape: how information seekers get from here to there, Proceedings of InterCHI ’93.

  15. SenseMaking • The process of encoding retrieved information to answer task-specific questions • Combine • internal cognitive resources • external retrieved resources • Create a good representation • an iterative process • contend with a cost/benefit tradeoff Russell, Stefik, Pirolli, Card, The Cost Structure of Sensemaking, Proceedings of InterCHI ’93.

  16. Why is Supporting Search Difficult?

  17. Why is Supporting Search Difficult? • Everything is fair game • Abstractions are difficult to represent • The vocabulary disconnect • Users’ lack of understanding of the technology

  18. Everything is Fair Game • The scope of what people search for is all of human knowledge and experience. • Other interfaces are more constrained (word processing, formulas, etc.) • Interfaces must accommodate human differences in: • Knowledge / life experience • Cultural background and expectations • Reading / scanning ability and style • Methods of looking for things (pilers vs. filers)

  19. Abstractions Are Hard to Represent • Text describes abstract concepts • Difficult to show the contents of text in a visual or compact manner • Exercise: • How would you show the preamble of the US Constitution visually? • How would you show the contents of Joyce’s Ulysses visually? How would you distinguish it from Homer’s The Odyssey or McCourt’s Angela’s Ashes? • The point: it is difficult to show text without using text

  20. Vocabulary Disconnect • If you ask a set of people to describe the same set of things, there is little overlap in the terms they use.
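One way to make this concrete is to measure the overlap between two people’s term choices with Jaccard similarity (the terms below and the choice of Jaccard are illustrative, not from the slides).

```python
def jaccard(a, b):
    # Overlap between two term sets: |intersection| / |union|.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Two people describing the same operation (terms invented for illustration).
person1 = {"delete", "remove", "erase"}
person2 = {"trash", "discard", "remove"}
print(jaccard(person1, person2))  # 0.2 -- little overlap: the vocabulary disconnect
```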

  21. Lack of Technical Understanding • Most people don’t understand the underlying methods by which search engines work.

  22. People Don’t Understand Search Technology A study of 100 randomly chosen people found: • 14% never type a URL directly into the address bar • Several tried to use the address bar, but did it wrong • Put spaces between words • Combinations of dots and spaces • “nursing spectrum.com” “consumer reports.com” • Several use the search form with no spaces • “plumber’slocal9” “capitalhealthsystem” • People do not understand the use of quotes • Only 16% use quotes • Of these, some use them incorrectly • Around all of the words, making results too restrictive • “lactose intolerance –recipies” • Here the – excludes the recipes • People don’t make use of “advanced” features • Only 1 used “find in page” • Only 2 used Google cache Hargittai, Classifying and Coding Online Actions, Social Science Computer Review 22(2), 2004, 210-227.

  23. People Don’t Understand Search Technology Without appropriate explanations, most of 14 people had strong misconceptions about: • ANDing vs ORing of search terms • Some assumed an ANDing search engine indexed a smaller collection; most had no explanation at all • For empty results for the query “to be or not to be” • 9 of 14 could not offer an explanation that even remotely resembled stop word removal • For term order variation “boat fire” vs. “fire boat” • Only 5 out of 14 expected different results • Understanding was vague, e.g.: • “Lycos separates the two words and searches for the meaning, instead of what’re your looking for. Google understands the meaning of the phrase.” Muramatsu & Pratt, “Transparent Queries: Investigating Users’ Mental Models of Search Engines”, SIGIR 2001.
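A toy Boolean retrieval sketch (documents, index, and stop word list all invented) makes these misconceptions concrete: ANDing intersects posting sets rather than searching a smaller collection, stop word removal can leave “to be or not to be” with no results, and a bag-of-words engine returns the same documents for “boat fire” and “fire boat”.

```python
# Toy inverted index: term -> set of document IDs (all invented).
index = {"boat": {1, 3}, "fire": {2, 3}, "giraffe": {4}}
STOP_WORDS = {"to", "be", "or", "not"}  # illustrative stop word list

def search(query, mode="AND"):
    terms = [t for t in query.lower().split() if t not in STOP_WORDS]
    if not terms:
        return set()  # "to be or not to be" reduces to nothing
    result = index.get(terms[0], set())
    for t in terms[1:]:
        postings = index.get(t, set())
        result = result & postings if mode == "AND" else result | postings
    return result

print(search("boat fire", "AND"))    # {3}: a narrower result set, not a smaller index
print(search("boat fire", "OR"))     # {1, 2, 3}
print(search("fire boat", "AND"))    # {3}: term order makes no difference here
print(search("to be or not to be"))  # set(): every term was a stop word
```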

  24. Outline • Introduction • What do people search for (and how)? • Why is designing for search difficult? • How to Design for Search • HCI and iterative design • What works? • Small details matter • Scaffolding • The Role of DWIM • Core Problems • Query specification and refinement • Browsing and searching collections • Information Visualization for Search • Summary

  25. HCI Design Process and Principles

  26. HCI Principles • We design for the user • Not for the designers • Not for the system • AKA: user-centered design • Make use of cognitive principles where available • Important guidelines for search: • Reduce memory load • Speak the user’s language • Provide helpful feedback • Respect perceptual principles

  27. User-Centered Design • Needs assessment • Find out • who users are • what their goals are • what tasks they need to perform • Task Analysis • Characterize what steps users need to take • Create scenarios of actual use • Decide which users and tasks to support • Iterate between • Designing • Evaluating

  28. User Interface Design is an Iterative Process [Diagram: a cycle of Design → Prototype → Evaluate, feeding back into Design] Slide by James Landay

  29. Rapid Prototyping • Build a mock-up of the design • Low-fidelity techniques • paper sketches • cut, copy, paste • video segments

  30. Telebears example

  31. Telebears example: Task 4: Adding a course

  32. Why Do We Prototype? • Get feedback on our design faster • Experiment with alternative designs • Fix problems before code is written • Keep the design centered on the user Slide adapted from James Landay

  33. Evaluation • Test with real users (participants) • Formally or Informally • “Discount” techniques • Potential users interact with paper computer • Expert evaluations (heuristic evaluation) • Expert walkthroughs

  34. What Works?

  35. What Works for Search Interfaces? • Query term highlighting • in results listings • in retrieved documents • Sorting of search results according to important criteria (date, author) • Grouping of results according to well-organized category labels (see Flamenco) • DWIM only if highly accurate: • Spelling correction/suggestions • Simple relevance feedback (more-like-this) • Certain types of term expansion • So far: not really visualization Hearst et al., Finding the Flow in Web Site Search, CACM 45(9), 2002.
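The sorting and grouping items in this list are simple to sketch (the result records below are invented for illustration).

```python
from itertools import groupby
from operator import itemgetter

results = [  # invented search results
    {"title": "Tax forms", "date": "2005-04-01", "category": "Government"},
    {"title": "Tax history", "date": "2003-06-12", "category": "Reference"},
    {"title": "Filing online", "date": "2005-01-20", "category": "Government"},
]

# Sorting by an important criterion (here, date, newest first).
by_date = sorted(results, key=itemgetter("date"), reverse=True)
print([r["title"] for r in by_date])

# Grouping under category labels (groupby needs the list sorted by that key).
for category, items in groupby(sorted(results, key=itemgetter("category")),
                               key=itemgetter("category")):
    print(category, [r["title"] for r in items])
```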

  36. Highlighting Query Terms • Boldface or color • Adjacency of terms with relevant context is a useful cue.
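A minimal highlighting sketch using a regular expression (the function and the choice of <b> tags are illustrative, not any particular engine’s implementation):

```python
import re

def highlight(snippet, query_terms):
    # Wrap each whole-word occurrence of a query term in <b>...</b>,
    # matching case-insensitively.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in query_terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(r"<b>\1</b>", snippet)

print(highlight("Giraffe height varies; an adult giraffe is about 5 m tall.",
                ["giraffe", "height"]))
# Bolding every hit lets the reader judge term adjacency and context at a glance.
```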

  37. [Screenshot: highlighted query term hits using the Google toolbar, for the queries “Microsoft”, “US Blackout”, and “PGA”; annotations mark some hits “found!” and others “don’t know”]

  38. Small Details Matter • UIs for search especially require great care in small details • In part due to the text-heavy nature of search • A tension between more information and introducing clutter • How and where to place things important • People tend to scan or skim • Only a small percentage reads instructions

  39. Small Details Matter • UIs for search especially require endless tiny adjustments • In part due to the text-heavy nature of search • Example: • In an earlier version of the Google Spellchecker, people didn’t always see the suggested correction • Used a long sentence at the top of the page: “If you didn’t find what you were looking for …” • People complained they got results, but not the right results. • In reality, the spellchecker had suggested an appropriate correction. • Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html

  40. Small Details Matter • The fix: • Analyzed logs, saw people didn’t see the correction: • clicked on the first search result, • didn’t find what they were looking for (came right back to the search page), • scrolled to the bottom of the page, did not find anything, • and then complained directly to Google • Solution was to repeat the spelling suggestion at the bottom of the page. • More adjustments: • The message is shorter, and different on the top vs. the bottom • Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html

  41. Using DWIM • DWIM – Do What I Mean • Refers to systems that try to be “smart” by guessing users’ unstated intentions or desires • Examples: • Automatically augment my query with related terms • Automatically suggest spelling corrections • Automatically load web pages that might be relevant to the one I’m looking at • Automatically file my incoming email into folders • Pop up a paperclip that tells me what kind of help I need. • THE CRITICAL POINT: • Users love DWIM when it really works • Users DESPISE it when it doesn’t
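A minimal “did you mean” sketch of the spelling-correction flavor of DWIM, with difflib standing in for a real spelling model (the dictionary and cutoff are invented). The cutoff encodes the critical point above: suggest only when confident, and stay silent otherwise.

```python
import difflib

DICTIONARY = ["giraffe", "barcelona", "pregnancy", "intolerance", "recipes"]

def suggest(term, cutoff=0.75):
    # Offer a correction only when the match is strong; a wrong guess
    # is worse than no guess.
    matches = difflib.get_close_matches(term, DICTIONARY, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(suggest("recipies"))  # 'recipes' -- confident, so show "Did you mean...?"
print(suggest("xqzv"))      # None -- no confident match, so say nothing
```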
