Designing Information Architecture for Search

Designing Information Architecture for Search Marti Hearst University of California, Berkeley www.sims.berkeley.edu/~hearst NSF CAREER Grant, NSF9984741 Tutorial: SIGIR 2001

Outline • Motivation • Search Interfaces: • Web search vs Site Search • Search UIs: What works; what doesn’t • Methodology • Information Architecture Defined • Faceted Metadata • Integrating Search into IA via Faceted Metadata • Results of Usability Studies • Tools • Conclusions

Contributors to the Research • Dr. Rashmi Sinha • Graduate Students • Ame Elliott • Jennifer English • Kirsten Swearingen • Ping Yee • Research funded by • NSF CAREER Grant, NSF9984741

Motivation and Background

Claims • Web Search is OK • Gets people to the right starting points • Web SITE search is NOT ok • The best way to improve site search is • NOT to make new fancy algorithms • Instead …

The best way to improve search: Improve the User Interface

Recent Study by Vividence Research • Spring 2001, 69 web sites • 70% eCommerce • 31% Service • 21% Content • 2% Community • The most common problems: • 53% had poorly organized search results • 32% had poor information architecture • 32% had slow performance • 27% had cluttered home pages • 25% had confusing labels • 15% had invasive registration • 13% had inconsistent navigation

Vividence findings: effects on users • Poorly organized search results • Frustration and wasted time • Poor information architecture • Confusion • Dead ends • "back and forthing" • Forced to search

Vividence findings: effects on users • Cluttered home pages • Creates disinterest • Wastes time • No contrast: everything has equal weight • Don’t know where to start • Failure to engage • No call to action • Failure to establish navigation • Layout reflects company organization chart • Investor centeredness

Vividence findings: characteristics • Inconsistent Navigation • Primary navigation bar is, in fact, really secondary • Un-scalable designs • Poor transitions between company divisions • "Junk Drawer" navigation bars • Random links • Shoe-horned functions • Heavy need to hit the "back-button"

Vividence Study • Breakdown of most common search problems • 41% - of searches encountered no problems • 20% - had search problems not named below • 14% - of searches were not “advanced” enough • 12% - did not organize results well • 10% - of searches yielded inaccurate/unrelated results • 9% - were too slow • 8% - of searches had insufficient instructions • 7% - engine was too difficult to locate • 7% - of searches produced too few results • 7% - of searches were too limiting • 3% - of searches produced an error message • 3% - were too difficult to use

Other Relevant Studies • Commercial studies (are not usually scientific, do not supply full details) • CreativeGood.com Holiday 2000 ecommerce report • UIE, and Jared Spool’s talks: http://world.std.com/~uieweb • Scientific studies (often less relevant to real web situations) • Many papers from the CHI proceedings http://www.acm.org/dl/ • Papers from Human Factors and the Web http://www.optavia.com/hfweb/ • See the extensive bibliography from my textbook chapter (in this package).

The Philosophy • Information architecture should be designed to integrate search throughout • Search results should reflect the information architecture. • This supports an interplay between navigation and search • This supports the most common human search strategies.

The Approach • Assign faceted metadata to content items • Allow users to navigate through the faceted metadata in a flexible manner • Organize search results according to the faceted metadata so navigation looks similar throughout • Give previews of next choices • Allow access to previous choices
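The steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the system described in the tutorial: the recipe items, the facet names (course, cuisine, ingredient), and the helper functions select, previews, and back_up are all hypothetical. It shows items carrying faceted metadata, narrowing by any combination of facets, previewing how many items each next choice would leave, and backing out of an earlier choice.

```python
from collections import Counter

# Hypothetical collection: every item carries a value for several facets.
ITEMS = [
    {"title": "Apple tart", "facets": {"course": "dessert", "cuisine": "french", "ingredient": "apple"}},
    {"title": "Apple strudel", "facets": {"course": "dessert", "cuisine": "german", "ingredient": "apple"}},
    {"title": "Coq au vin", "facets": {"course": "main", "cuisine": "french", "ingredient": "chicken"}},
    {"title": "Chicken schnitzel", "facets": {"course": "main", "cuisine": "german", "ingredient": "chicken"}},
]


def select(items, chosen):
    """Return the items that match every (facet, value) pair chosen so far."""
    return [
        item for item in items
        if all(item["facets"].get(facet) == value for facet, value in chosen.items())
    ]


def previews(items, chosen):
    """Count remaining items under each value of every facet not yet constrained."""
    counts = {}
    for item in select(items, chosen):
        for facet, value in item["facets"].items():
            if facet not in chosen:
                counts.setdefault(facet, Counter())[value] += 1
    return counts


def back_up(chosen, facet):
    """Undo one earlier choice while keeping all the others."""
    return {f: v for f, v in chosen.items() if f != facet}


if __name__ == "__main__":
    chosen = {"course": "dessert"}
    print([item["title"] for item in select(ITEMS, chosen)])
    print(previews(ITEMS, chosen))
    chosen["cuisine"] = "french"
    print([item["title"] for item in select(ITEMS, back_up(chosen, "course"))])
```

Because search results would be passed through the same previews function as a browsing session, the navigation could look the same in both cases, which is the point of the third bullet.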

Advantages of the Approach • Supports different task types • Highly constrained known-item searches use one interface • Open-ended, browsing tasks use another interface • Both types of interface use the same underlying structure • Can easily switch from one interface type to the other midstream

Advantages of the Approach • Honors many of the most important usability design goals • User control • Provides context for results • Reduces short term memory load • Allows easy reversal of actions • Provides consistent view

Advantages of the Approach • Allows different people to add content without breaking things • Can make use of standard technology

Web Search vs. Site Search

Web Search is Working! • Survey finds high user satisfaction • Study by NPD Group: http://www.searchenginewatch.com/reports/npd.html

Why is Web Search Working? • Web Search is Successful at Finding Good Starting Points (home pages) • Evidence: • Search engines using • Link analysis • Page popularity • Interwoven categories • These all find dominant home pages

Organizing Search Results: What Works, What Doesn’t • There is a lot of prior work on this • Cha-Cha (Chen et al. 1999) • Scatter-Gather clustering (Cutting et al. 93, Hearst et al. 1996) • Becoming more prevalent in web search too. • Teoma • Vivisimo • Northern Light

Putting Results into Clusters

Drilldown – what does it mean?

Vivisimo – same idea

Yahoo lists category matches

Web Search Results Grouping • Drill down one category • Cannot mix and match categories • Not clear if it is useful or not • Can help differentiate different meanings of the same word. • But …what about site search?

If Web search engines are providing source selection … … what happens when the user gets to the site? Follow Links … or … Search

Following Hyperlinks • Works great when it is clear where to go next • Frustrating when the desired directions are undetectable or unavailable • Site Search is not getting good reviews

An Analogy • Text search • Hypertext

Analogy • Hypertext: • A fixed number of choices of where to go next; • A glance at the map tells you where you are; • But may not go where you want to go. • To get from Topeka to Santa Fe, may have to go through Frostbite Falls • Site Search: • Can go anywhere; • But may get stuck, disoriented, in a crevasse!

Goal: An All-Terrain Vehicle • The best of both techniques • A vehicle that magically lays down track to suggest choices of where you want to go next based on what you’ve done so far and what you are trying to do • The tracks follow the lay of the land and go everywhere, but cross over the crevasses • The tracks allow you to back up easily

Organizing Search Results: What works; what doesn’t

What works, what doesn’t • There is negative evidence for • Clustering • Fancy visualizations • There is positive evidence for • Grouping into meaningful, consistent categories • Relevance feedback • Depends how you do it • Showing similar items

Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))

Study of Kohonen Feature Maps • H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) • Comparison: Kohonen Map and Yahoo • Task: • “Window shop” for interesting home page • Repeat with other interface • Results: • Starting with map could repeat in Yahoo (8/11) • Starting with Yahoo unable to repeat in map (2/14) UWMS Data Mining Workshop

Study (cont.) • Participants liked: • Correspondence of region size to # documents • Overview (but also wanted zoom) • Ease of jumping from one topic to another • Multiple routes to topics • Use of category and subcategory labels

Study (cont.) • Participants wanted: • hierarchical organization • other ordering of concepts (alphabetical) • integration of browsing and search • correspondence of color to meaning • more meaningful labels • labels at same level of abstraction • fit more labels in the given space • combined keyword and category search • multiple category assignment (sports+entertain)

Visualization of Clusters • Huge 2D maps may be inappropriate focus for information retrieval • Can’t see what documents are about • Documents forced into one position in semantic space • Space is difficult to use for IR purposes • Hard to view titles • Perhaps more suited for pattern discovery • problem: often only one view on the space

Summary: Clustering (Based on other studies as well) • Advantages: • Get an overview of main themes • Domain independent • Disadvantages: • Many of the ways documents could group together are not shown • Not always easy to understand what they mean • Different levels of granularity • Probably best for scientists only • Take heart: there is good evidence for organizing via categories!

The DynaCat System • Decide on important question types in advance • What are the adverse effects of drug D? • What is the prognosis for treatment T? • Make use of MeSH categories • Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M., and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99: Proceedings of the Sixteenth National Conference on Artificial Intelligence, Orlando, Florida, 1999.
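The idea of keeping only the category types known to be useful for a query type can be illustrated with a short Python sketch. This is not DynaCat's implementation: the query-type names, the category types, the sample documents, and the organize function are all hypothetical, and the indexing terms merely stand in for MeSH categories.

```python
from collections import defaultdict

# Hypothetical table: which kinds of categories are worth showing per query type.
USEFUL_TYPES = {
    "adverse_effects": {"disease", "symptom"},
    "prognosis": {"outcome"},
}

# Hypothetical retrieved documents; each indexing term is (term, category_type).
DOCS = [
    {"title": "Doc A", "terms": [("nausea", "symptom"), ("survival rate", "outcome")]},
    {"title": "Doc B", "terms": [("nausea", "symptom"), ("anemia", "disease")]},
    {"title": "Doc C", "terms": [("recurrence", "outcome")]},
]


def organize(docs, query_type):
    """Group document titles under the terms whose type suits the query type."""
    keep = USEFUL_TYPES[query_type]
    groups = defaultdict(list)
    for doc in docs:
        for term, category_type in doc["terms"]:
            if category_type in keep:
                groups[term].append(doc["title"])
    return dict(groups)


if __name__ == "__main__":
    print(organize(DOCS, "adverse_effects"))
    print(organize(DOCS, "prognosis"))
```

Note that a document can appear under more than one category, and that terms of an irrelevant type (for example "survival rate" in an adverse-effects query) are simply not shown.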

DynaCat

DynaCat Study • Design • Three queries • 24 cancer patients • Compared three interfaces • ranked list, clusters, categories • Results • Participants strongly preferred categories • Participants found more answers using categories • Participants took same amount of time with all three interfaces

Cha-Cha (intranet search) Cha-Cha: A System for Organizing Intranet Search Results, by Chen, Hearst, Hong, and Lin, Proceedings of 2nd USENIX Symposium on Internet Systems, Boulder, CO, Oct 1999. cha-cha.berkeley.edu

Cha-Cha (intranet search)

How People Search

The Standard Model • Assumptions: • Maximizing precision and recall simultaneously • The information need remains static • The value is in the resulting document set
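For reference, the two measures named in the standard model can be computed as follows. This is a minimal Python sketch using the usual set-based definitions; the function name and the document identifiers are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, treating both inputs as sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Four documents retrieved, three relevant overall, two in common.
    print(precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"]))
```

Both numbers are computed against a fixed set of relevant documents, which is exactly the "information need remains static" assumption listed above.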

“Berry-Picking” as an Information Seeking Strategy (Bates 90) • Berry-picking model • Interesting information is scattered like berries among bushes • The user learns as they progress, thus • The query is continually shifting
