Designing Information Architecture for Search. Marti Hearst University of California, Berkeley www.sims.berkeley.edu/~hearst NSF CAREER Grant, NSF9984741. Tutorial: SIGIR 2001. Outline. Motivation Search Interfaces: Web search vs Site Search Search UIs: What works; what doesn’t
Designing Information Architecture for Search Marti Hearst University of California, Berkeley www.sims.berkeley.edu/~hearst NSF CAREER Grant, NSF9984741 Tutorial: SIGIR 2001
Outline • Motivation • Search Interfaces: • Web search vs Site Search • Search UIs: What works; what doesn’t • Methodology • Information Architecture Defined • Faceted Metadata • Integrating Search into IA via Faceted Metadata • Results of Usability Studies • Tools • Conclusions
Contributors to the Research • Dr. Rashmi Sinha • Graduate Students • Ame Elliott • Jennifer English • Kirsten Swearingen • Ping Yee • Research funded by • NSF CAREER Grant, NSF9984741
Claims • Web Search is OK • Gets people to the right starting points • Web SITE search is NOT ok • The best way to improve site search is • NOT to make new fancy algorithms • Instead …
The best way to improve search: Improve the User Interface
Recent Study by Vividence Research • Spring 2001, 69 web sites • 70% eCommerce • 31% Service • 21% Content • 2% Community • The most common problems: 53% had poorly organized search results 32% had poor information architecture 32% had slow performance 27% had cluttered home pages 25% had confusing labels 15% invasive registration 13% inconsistent navigation
Vividence findings: effects on users • Poorly organized search results • Frustration and wasted time • Poor information architecture • Confusion • Dead ends • "back and forthing" • Forced to search
Vividence findings: effects on users • Cluttered home pages • Creates disinterest • Wastes time • No contrast: everything has equal weight • Don’t know where to start • Failure to engage • No call to action • Failure to establish navigation • Layout reflects company organization chart • Investor centeredness
Vividence findings: characteristics • Inconsistent Navigation • Primary navigation bar is, in fact, really secondary • Un-scalable designs • Poor transitions between company divisions • "Junk Drawer" navigation bars • Random links • Shoe-horned functions • Heavy need to hit the "back-button"
Vividence Study • Breakdown of most common search problems • 41% - of searches encountered no problems • 20% - had search problems not named below • 14% - of searches were not “advanced” enough • 12% - did not organize results well • 10% - of searches yielded inaccurate/unrelated results • 9% - were too slow • 8% - of searches had insufficient instructions • 7% - engine was too difficult to locate • 7% - of searches produced too few results • 7% - of searches were too limiting • 3% - of searches produced an error message • 3% - were too difficult to use
Other Relevant Studies • Commercial studies (are not usually scientific, do not supply full details) • CreativeGood.com Holiday 2000 ecommerce report • UIE, and Jared Spool’s talks: http://world.std.com/~uieweb • Scientific studies (often less relevant to real web situations) • Many papers from the CHI proceedings http://www.acm.org/dl/ • Papers from Human Factors and the Web http://www.optavia.com/hfweb/ • See the extensive bibliography from my textbook chapter (in this package).
The Philosophy • Information architecture should be designed to integrate search throughout • Search results should reflect the information architecture. • This supports an interplay between navigation and search • This supports the most common human search strategies.
The Approach • Assign faceted metadata to content items • Allow users to navigate through the faceted metadata in a flexible manner • Organize search results according to the faceted metadata so navigation looks similar throughout • Give previews of next choices • Allow access to previous choices
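The steps of this approach can be illustrated with a minimal sketch. It is not the implementation used in the research; the recipe items, the facet names (`course`, `cuisine`, `ingredient`), and the helper functions `select`, `previews`, and `back_up` are all hypothetical, chosen only to show how one set of facet assignments can drive filtering, previews of next choices, and reversal of earlier choices.

```python
from collections import Counter

# Hypothetical content items; facet names and values are illustrative only.
ITEMS = [
    {"title": "Roast chicken", "facets": {"course": "main", "cuisine": "french", "ingredient": "chicken"}},
    {"title": "Chicken curry", "facets": {"course": "main", "cuisine": "indian", "ingredient": "chicken"}},
    {"title": "Lentil soup", "facets": {"course": "starter", "cuisine": "indian", "ingredient": "lentils"}},
    {"title": "Apple tart", "facets": {"course": "dessert", "cuisine": "french", "ingredient": "apple"}},
]


def select(items, chosen):
    """Return the items that match every (facet, value) pair chosen so far."""
    return [
        item for item in items
        if all(item["facets"].get(facet) == value for facet, value in chosen.items())
    ]


def previews(items, chosen):
    """For each facet not yet chosen, count the matching items under each value."""
    counts = {}
    for item in select(items, chosen):
        for facet, value in item["facets"].items():
            if facet not in chosen:
                counts.setdefault(facet, Counter())[value] += 1
    return counts


def back_up(chosen, facet):
    """Undo one earlier choice while keeping all the others."""
    return {f: v for f, v in chosen.items() if f != facet}


if __name__ == "__main__":
    chosen = {"ingredient": "chicken"}
    print([item["title"] for item in select(ITEMS, chosen)])
    print(previews(ITEMS, chosen))
    # Add a second constraint, then drop the first one.
    chosen = back_up({"ingredient": "chicken", "cuisine": "indian"}, "ingredient")
    print([item["title"] for item in select(ITEMS, chosen)])
```

Because both the result list and the preview counts come from the same facet assignments, search results and navigation choices stay consistent with each other, which is the point of the approach.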
Advantages of the Approach • Supports different task types • Highly constrained known-item searches use one interface • Open-ended, browsing tasks use another interface • Both types of interface use the same underlying structure • Can easily switch from one interface type to the other midstream
Advantages of the Approach • Honors many of the most important usability design goals • User control • Provides context for results • Reduces short term memory load • Allows easy reversal of actions • Provides consistent view
Advantages of the Approach • Allows different people to add content without breaking things • Can make use of standard technology
Web Search is Working! • Survey finds high user satisfaction • Study by NPD Group: http://www.searchenginewatch.com/reports/npd.html
Why is Web Search Working? • Web Search is Successful at Finding Good Starting Points (home pages) • Evidence: • Search engines using • Link analysis • Page popularity • Interwoven categories • These all find dominant home pages
Organizing Search Results: What works, What Doesn’t • There is a lot of prior work on this • Cha-Cha (Chen et al. 1999) • Scatter-Gather clustering (Cutting et al. 93, Hearst et al. 1996) • Becoming more prevalent in web search too. • Teoma • Vivisimo • Northern Light
Web Search Results Grouping • Drill down one category • Cannot mix and match categories • Not clear if it is useful or not • Can help differentiate different meanings of the same word. • But …what about site search?
If Web search engines are providing source selection … what happens when the user gets to the site? • Follow Links … or … • Search
Following Hyperlinks • Works great when it is clear where to go next • Frustrating when the desired directions are undetectable or unavailable • Site search is not getting good reviews
An Analogy • Text search • Hypertext
Analogy • Hypertext: • A fixed number of choices of where to go next; • A glance at the map tells you where you are; • But may not go where you want to go. • To get from Topeka to Santa Fe, may have to go through Frostbite Falls • Site Search: • Can go anywhere; • But may get stuck, disoriented, in a crevasse!
Goal: An All-Terrain Vehicle • The best of both techniques • A vehicle that magically lays down track to suggest choices of where you want to go next based on what you’ve done so far and what you are trying to do • The tracks follow the lay of the land and go everywhere, but cross over the crevasses • The tracks allow you to back up easily
What works, what doesn’t • There is negative evidence for • Clustering • Fancy visualizations • There is positive evidence for • Grouping into meaningful, consistent categories • Relevance feedback • Depends how you do it • Showing similar items
Study of Kohonen Feature Maps • H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) • Comparison: Kohonen Map and Yahoo • Task: • “Window shop” for interesting home page • Repeat with other interface • Results: • Starting with map could repeat in Yahoo (8/11) • Starting with Yahoo unable to repeat in map (2/14) UWMS Data Mining Workshop
Study (cont.) • Participants liked: • Correspondence of region size to # documents • Overview (but also wanted zoom) • Ease of jumping from one topic to another • Multiple routes to topics • Use of category and subcategory labels
Study (cont.) • Participants wanted: • hierarchical organization • other ordering of concepts (alphabetical) • integration of browsing and search • correspondence of color to meaning • more meaningful labels • labels at same level of abstraction • fit more labels in the given space • combined keyword and category search • multiple category assignment (sports+entertain)
Visualization of Clusters • Huge 2D maps may be inappropriate focus for information retrieval • Can’t see what documents are about • Documents forced into one position in semantic space • Space is difficult to use for IR purposes • Hard to view titles • Perhaps more suited for pattern discovery • problem: often only one view on the space
Summary: Clustering (Based on other studies as well) • Advantages: • Get an overview of main themes • Domain independent • Disadvantages: • Many of the ways documents could group together are not shown • Not always easy to understand what they mean • Different levels of granularity • Probably best for scientists only • Take heart – there is good evidence for organizing via categories!
The DynaCat System • Decide on important question types in advance • What are the adverse effects of drug D? • What is the prognosis for treatment T? • Make use of MeSH categories • Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M., and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99: Proceedings of the Sixteenth National Conference on Artificial Intelligence, Orlando, Florida, 1999.
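The idea of keeping only the category types that suit a question type can be sketched as follows. This is an illustration under stated assumptions, not DynaCat's actual code: the question-type names, the category types (`symptom`, `disease`, `outcome`), the sample documents, and the function `organize` are all hypothetical stand-ins for the MeSH-based knowledge the system draws on.

```python
from collections import defaultdict

# Hypothetical mapping from a question type to the category types worth keeping.
USEFUL_CATEGORY_TYPES = {
    "adverse_effects": {"symptom", "disease"},
    "prognosis": {"outcome"},
}

# Hypothetical retrieved documents; each indexing term carries a category type.
DOCS = [
    {"title": "Doc A", "terms": [("nausea", "symptom"), ("survival rate", "outcome")]},
    {"title": "Doc B", "terms": [("nausea", "symptom"), ("anemia", "disease")]},
    {"title": "Doc C", "terms": [("survival rate", "outcome")]},
]


def organize(docs, question_type):
    """Group document titles under the terms whose category type suits the question."""
    keep = USEFUL_CATEGORY_TYPES[question_type]
    groups = defaultdict(list)
    for doc in docs:
        for term, category_type in doc["terms"]:
            if category_type in keep:
                groups[term].append(doc["title"])
    return dict(groups)


if __name__ == "__main__":
    print(organize(DOCS, "adverse_effects"))
    print(organize(DOCS, "prognosis"))
```

In this sketch a document can appear under more than one category, and terms of an unsuitable type are simply not shown for that question.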
DynaCat Study • Design • Three queries • 24 cancer patients • Compared three interfaces • ranked list, clusters, categories • Results • Participants strongly preferred categories • Participants found more answers using categories • Participants took same amount of time with all three interfaces
Cha-Cha (intranet search) Cha-Cha: A System for Organizing Intranet Search Results, by Chen, Hearst, Hong, and Lin, Proceedings of 2nd USENIX Symposium on Internet Systems, Boulder, CO, Oct 1999. cha-cha.berkeley.edu
The Standard Model • Assumptions: • Maximizing precision and recall simultaneously • The information need remains static • The value is in the resulting document set
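For reference, precision and recall, the two measures the standard model tries to maximize, can be computed as in this small sketch. The function name `precision_recall` and the document identifiers are illustrative assumptions; the definitions themselves are the usual set-based ones (precision is the fraction of retrieved documents that are relevant, recall is the fraction of relevant documents that were retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one fixed query and one fixed result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets so the function never divides by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document ids: four retrieved, three relevant, two in common.
    p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
    print(f"precision={p:.2f} recall={r:.2f}")
```

Both measures are defined against a single, fixed set of relevant documents, which is exactly the static information need that the standard model assumes.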
“Berry-Picking” as an Information Seeking Strategy (Bates 90) • Berry-picking model • Interesting information is scattered like berries among bushes • The user learns as they progress, thus • The query is continually shifting