Grouper: A Dynamic Clustering Interface to Web Search Results

Grouper: A Dynamic Clustering Interface to Web Search Results • Erdem Sarıgil - 21000089 • Oğuz Yılmaz - 21000082

Grouper • An interface to the results of HuskySearch • Dynamically groups the search results into clusters using the Suffix Tree Clustering (STC) algorithm • Goal: make search engine results easy to browse by clustering them • Grouper receives hits from different search engines and looks only at the top hits from each one

Post-retrieval Clustering • Based on the returned document set • Produces better results than pre-retrieval clustering • Some key requirements: • Coherent Clusters • Efficiently Browsable • Speed: Algorithmic Speed and Snippet-Tolerance

Suffix Tree Clustering (STC) • Linear-time clustering algorithm • STC has three logical steps: • Document cleaning • Identifying base clusters using a suffix tree • Merging these base clusters into clusters • STC has several novel characteristics: • Overlapping clusters • Uses phrases rather than a bag-of-words model • Well suited for Web document clustering • Robust in “noisy” situations, such as clustering snippets
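The three STC steps listed above can be illustrated with a small sketch. It is not the implementation used by Grouper: to stay short it finds shared phrases by brute-force n-gram enumeration instead of a suffix tree, so it is not linear-time. The function names, the tiny stopword list, the maximum phrase length, and the 0.5 overlap threshold are all assumptions made for illustration.

```python
import re
from itertools import combinations

# Illustrative stopword list (an assumption, not taken from the slides).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "and", "for", "is"}


def clean(snippet):
    """Step 1: lowercase the snippet and keep word tokens only."""
    return re.findall(r"[a-z0-9]+", snippet.lower())


def base_clusters(snippets, max_len=4, min_docs=2):
    """Step 2 (simplified): map each shared phrase to the documents containing it.

    A real STC implementation reads these sets off a suffix tree; here every
    n-gram up to max_len words is enumerated instead.
    """
    phrase_docs = {}
    for doc_id, snippet in enumerate(snippets):
        words = clean(snippet)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                phrase = tuple(words[i:i + n])
                if all(w in STOPWORDS for w in phrase):
                    continue  # skip phrases made only of stopwords
                phrase_docs.setdefault(phrase, set()).add(doc_id)
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}


def merge(clusters, threshold=0.5):
    """Step 3: link base clusters whose document sets overlap strongly in both
    directions, then return the connected components as final clusters."""
    phrases = list(clusters)
    parent = list(range(len(phrases)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(phrases)), 2):
        a, b = clusters[phrases[i]], clusters[phrases[j]]
        common = len(a & b)
        if common / len(a) > threshold and common / len(b) > threshold:
            parent[find(i)] = find(j)

    merged = {}
    for i, phrase in enumerate(phrases):
        group = merged.setdefault(find(i), {"phrases": [], "docs": set()})
        group["phrases"].append(" ".join(phrase))
        group["docs"] |= clusters[phrase]
    return list(merged.values())


if __name__ == "__main__":
    snippets = [
        "suffix tree clustering",
        "suffix tree construction",
        "web search engine",
        "meta search engine",
    ]
    for cluster in merge(base_clusters(snippets)):
        print(sorted(cluster["docs"]), cluster["phrases"])
```

Because a document can belong to several base clusters, it can also end up in more than one final cluster, which is the overlapping behaviour the slide mentions.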

User Interface

User Interface (cont’d)

Making the Clusters Easy to Browse • Three heuristics to identify redundant phrases: • Word Overlap • Sub- and Super-Strings • Most General Phrase with Low Coverage
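The slide names the heuristics without defining them, so the following is only one plausible reading of the first one, word overlap. It assumes each phrase comes with a coverage value (the fraction of the cluster's documents containing it) and treats a phrase as redundant when most of its words already appear in a kept phrase with higher coverage. The function name and the 0.6 threshold are hypothetical.

```python
def filter_word_overlap(phrases, max_overlap=0.6):
    """Drop phrases whose words mostly appear in a higher-coverage phrase.

    phrases: list of (phrase, coverage) pairs.
    max_overlap: hypothetical threshold, an assumption for this sketch.
    """
    kept = []
    # Visit phrases from highest to lowest coverage so that the more
    # representative phrase of an overlapping pair is the one that survives.
    for phrase, coverage in sorted(phrases, key=lambda pc: pc[1], reverse=True):
        words = set(phrase.split())
        if not words:
            continue  # ignore empty phrases
        redundant = any(
            len(words & set(other.split())) / len(words) > max_overlap
            for other, _ in kept
        )
        if not redundant:
            kept.append((phrase, coverage))
    return kept


if __name__ == "__main__":
    candidates = [
        ("vice president", 0.8),
        ("vice president gore", 0.5),
        ("white house", 0.4),
    ]
    print(filter_word_overlap(candidates))
```

In the usage example, "vice president gore" shares two of its three words with the higher-coverage phrase "vice president", so it is filtered out, while "white house" is kept.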

Speeeeed • [Figure: search quality versus time; example phrases “the vice president of” and “vice president”]

Coherent Clusters

Comparison • Number of documents followed • Time Spent • Click Distance

Comparison (cont’d)

Thanks for your patience
