Georg Buscher, Andreas Dengel, Ludger van Elst German Research Center for AI (DFKI)

Query Expansion Using Gaze-Based Feedback on the Subdocument Level Georg Buscher, Andreas Dengel, Ludger van Elst German Research Center for AI (DFKI) Knowledge Management Department Kaiserslautern, Germany SIGIR '08

Outline • Motivation • Reading detection and document annotation technique • Implicit feedback methods • Study design • Results

Background and Motivation • Relevance feedback à la Rocchio is well understood • Feedback is mostly applied to entire documents • Precision presumably improves when feedback is acquired on the subdocument level • Drawbacks of such fine-grained feedback: • Too much cognitive load for explicit feedback • Too little implicit feedback data through explicit interactions (e.g. highlighting) • [Figure: relevance feedback on the document level vs. relevance feedback on the subdocument level] • Use eye gaze as a source for implicit feedback on the subdocument level

Outline • Motivation • Reading detection and document annotation technique • Implicit feedback methods • Study design • Results

Eye Tracking • Unobtrusive • Relatively precise (accuracy: 1° of visual angle) • Expensive • Mostly used as a "passive" tool for behavior analysis, e.g. visualized by heatmaps • We use eye tracking for immediate implicit feedback, taking into account temporal fixation patterns

Reading Detection • Starting point: Noisy gaze data from the eye tracker. • Fixation detection and saccade classification • Reading (red) and skimming (yellow) detection line by line See G. Buscher, A. Dengel, L. van Elst: “Eye Movements as Implicit Relevance Feedback”, in CHI '08
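The slide does not say how fixations are detected, so the following is only a minimal sketch of one common approach, a dispersion-threshold scheme: consecutive gaze samples that stay within a small spatial window are grouped into one fixation. It is not necessarily the method used in the cited CHI '08 work. The function names, the pixel threshold, and the minimum sample count are assumptions made for illustration.

```python
def _dispersion(points):
    """Spread of a window of gaze points: x-range plus y-range."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=30.0, min_samples=5):
    """Group raw (x, y) gaze samples into fixations.

    max_dispersion (pixels) and min_samples are hypothetical values,
    not taken from the slides. Returns (centroid_x, centroid_y, n_samples).
    """
    fixations = []
    i, n = 0, len(samples)
    while i + min_samples <= n:
        j = i + min_samples
        if _dispersion(samples[i:j]) <= max_dispersion:
            # Grow the window while the points stay close together.
            while j < n and _dispersion(samples[i:j + 1]) <= max_dispersion:
                j += 1
            window = samples[i:j]
            cx = sum(x for x, _ in window) / len(window)
            cy = sum(y for _, y in window) / len(window)
            fixations.append((cx, cy, len(window)))
            i = j
        else:
            # No fixation starts here; slide the window by one sample.
            i += 1
    return fixations


samples = [(100, 100), (101, 100), (100, 101), (102, 100), (101, 101), (100, 100),
           (300, 100), (301, 101), (300, 100), (302, 100), (301, 100), (300, 101)]
fixations = detect_fixations(samples)
```

The jumps between successive fixation centroids would then be the saccades that a reading or skimming classifier could inspect line by line.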

Gaze-Based Document Meta Data • Line-matching by applying optical character recognition • Store reading information as document annotations in a semantic Wiki See G. Buscher, A. Dengel, L. van Elst, F. Mittag: “Generating and Using Gaze-Based Document Annotations”, in CHI '08

Implicit Relevance Feedback for Query Expansion • Input: documents viewed with one specific task in mind • Find terms that best describe the user's current interest. • Use these terms for query expansion. • [Diagram labels: terms describing the user's current interest / context; task / information need; context]

Three Implicit Feedback Methods to Evaluate • Input: viewed documents • Gaze-Filter: TF x IDF based on read or skimmed passages • Gaze-Length-Filter: Interest(t) x TF x IDF based on length of coherently read text

Gaze-Length-Filter • Long passages are passages containing at least 230 characters (i.e. more than the following two lines). • The heuristic assumes that shorter text parts only rarely convey sophisticated concepts to the reader. • It further assumes that readers are generally not very interested in the contents of short read or skimmed text parts. Therefore all terms contained in short read or skimmed text parts get a lower interest value. • Interest(t) = \frac{\#\ \text{long read or skimmed passages containing } t}{\#\ \text{all read or skimmed passages containing } t}
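The Interest(t) ratio can be sketched in a few lines. This is a minimal illustration that assumes passages are plain strings and that terms come from whitespace tokenisation. Only the 230-character threshold is taken from the slide; the function name and the sample passages are hypothetical.

```python
from collections import Counter

LONG_PASSAGE_MIN_CHARS = 230  # threshold stated on the slide


def interest_scores(passages):
    """Interest(t) = (# long passages containing t) / (# all passages containing t).

    `passages` holds the text of every read or skimmed passage.
    """
    long_counts = Counter()
    all_counts = Counter()
    for text in passages:
        terms = set(text.lower().split())  # naive tokenisation (assumption)
        all_counts.update(terms)
        if len(text) >= LONG_PASSAGE_MIN_CHARS:
            long_counts.update(terms)
    return {t: long_counts[t] / all_counts[t] for t in all_counts}


passages = [
    "cats see well in dim light " * 10,  # 270 characters, counts as long
    "cats purr",                         # short passage
]
scores = interest_scores(passages)
```

In this toy input, a term seen only in the long passage gets interest 1.0, a term seen only in the short one gets 0.0, and "cats", which occurs in both, gets 0.5.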

Three Implicit Feedback Methods to Evaluate • Input: viewed documents • Gaze-Filter: TF x IDF based on read or skimmed passages • Gaze-Length-Filter: Interest(t) x TF x IDF based on length of coherently read text • Reading Speed: ReadingScore(t) x TF x IDF based on read vs. skimmed passages containing term t

Reading Speed • P_t is the set of all read or skimmed passages containing term t. • The heuristic assumes that more thoroughly read text parts (and therefore their terms) are more likely to be of interest to the user than cursorily viewed parts. • ReadingScore(t) = \frac{1}{|P_t|} \sum_{p \in P_t} r(p)
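ReadingScore(t) is the average of r(p) over the passages that contain t. The slide does not define r(p), so the sketch below takes it as a caller-supplied function. The `example_r` mapping (1.0 for read, 0.0 for skimmed) and the passage dictionaries are purely hypothetical stand-ins chosen to make the averaging visible.

```python
def reading_score(term, passages, r):
    """Average r(p) over all read or skimmed passages p that contain `term`."""
    containing = [p for p in passages if term in p["terms"]]
    if not containing:
        return 0.0  # term never seen in a read or skimmed passage
    return sum(r(p) for p in containing) / len(containing)


def example_r(passage):
    # Hypothetical r(p): the slides do not state the actual values.
    return 1.0 if passage["mode"] == "read" else 0.0


passages = [
    {"terms": {"whiskers", "cats"}, "mode": "read"},
    {"terms": {"whiskers"}, "mode": "read"},
    {"terms": {"whiskers", "sharks"}, "mode": "skimmed"},
    {"terms": {"sharks"}, "mode": "read"},
]
score = reading_score("whiskers", passages, example_r)
```

Passing r as a parameter keeps the averaging separate from the open question of how reading thoroughness is scored.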

Three Implicit Feedback Methods to Evaluate • Input: viewed documents • Gaze-Filter: TF x IDF based on read or skimmed passages • Gaze-Length-Filter: Interest(t) x TF x IDF based on length of coherently read text • Reading Speed: ReadingScore(t) x TF x IDF based on read vs. skimmed passages containing term t • Baseline: TF x IDF based on opened entire documents
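All four variants share a TF x IDF core and differ mainly in which text feeds the term frequencies: whole opened documents for the baseline, only read or skimmed passages for the gaze-based variants. A minimal sketch of that core follows, assuming whitespace tokenisation and a smoothed IDF over a small background corpus. Both choices, and all names, are assumptions and not details from the slides.

```python
import math
from collections import Counter


def top_expansion_terms(feedback_texts, corpus, k=50):
    """Rank terms from the feedback texts by TF x IDF and return the top k."""
    tf = Counter(w for text in feedback_texts for w in text.lower().split())
    doc_sets = [set(d.lower().split()) for d in corpus]
    n_docs = len(doc_sets)

    def idf(term):
        df = sum(term in s for s in doc_sets)
        return math.log((n_docs + 1) / (df + 1)) + 1.0  # smoothed IDF (assumption)

    scored = {t: tf[t] * idf(t) for t in tf}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]


corpus = ["cats hunt at night", "sharks hunt by smell", "dogs hunt in packs"]
read_passages = ["cats see at night", "night vision in cats"]
top_two = top_expansion_terms(read_passages, corpus, k=2)
```

For the baseline, `feedback_texts` would hold the full text of every opened document. For the Gaze-Length-Filter and Reading-Speed variants, each term's score would additionally be multiplied by Interest(t) or ReadingScore(t).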

Study Design • Informational task given • 2 different tasks • Task description in simulated email • Participants had to imagine being journalists • Read pre-selected documents • Email attachments • Document structure carefully chosen • Search for more information on Wikipedia • 3 different queries: main topic, sub-topic, related topic • Give relevance feedback for the first 20 result entries per query • [Procedure diagram: 2x: read about topic in email; look through 4 email attachments to get started with the topic; 3x: find more information by querying search engine; give explicit relevance feedback]

Task Example • Topic: perceptual organs of animals • Pre-selected documents: 4 Wikipedia articles about cats, sharks, dogs, bats • The articles described all facets of the species. • Each article contained several paragraphs dealing with perception-related issues. • 3 different queries • Main topic query: more material about perception • Sub-topic query: more material about visual perception • Related-topic query: perceptual organs for the earth‘s magnetic field

Result List Generation • Create basic result list • Create expanded queries (+ top 50 terms) • Re-rank that list for every query expansion variant • Merge the re-ranked result lists in a balanced, ordered way • Present merged list to the participant • [Diagram: the user query yields a result list; from the viewed documents, each variation (Baseline, Gaze-Filter, Gaze-Length-Filter, Reading-Speed) produces an expanded query (1 to 4) and a re-ranked list (1 to 4); the re-ranked lists are combined into the merged result list]
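The slide does not spell out what merging "in a balanced, ordered way" means. One plausible reading is a round-robin interleaving that takes the next not-yet-shown entry from each re-ranked list in turn. The sketch below assumes exactly that; the function name is hypothetical, and the default limit of 20 simply mirrors the 20 entries that participants rated per query.

```python
def merge_balanced(ranked_lists, limit=20):
    """Round-robin merge of several rankings, skipping duplicates.

    This interleaving scheme is an assumption, not the documented procedure.
    """
    merged, seen = [], set()
    positions = [0] * len(ranked_lists)
    while len(merged) < limit:
        progressed = False
        for i, ranking in enumerate(ranked_lists):
            # Skip entries that another list has already contributed.
            while positions[i] < len(ranking) and ranking[positions[i]] in seen:
                positions[i] += 1
            if positions[i] < len(ranking):
                doc = ranking[positions[i]]
                merged.append(doc)
                seen.add(doc)
                positions[i] += 1
                progressed = True
                if len(merged) == limit:
                    break
        if not progressed:
            break  # every list is exhausted
    return merged


lists = [["a", "b", "c"], ["b", "a", "d"], ["c", "d", "e"]]
merged = merge_balanced(lists)
```

A fuller version would probably rotate which list goes first from query to query, so that no variant is systematically favoured at rank 1.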

Overview • 21 participants • 60-80 minutes per participant • 111 issued user queries • 2220 explicit relevance ratings • Distribution of the relevance ratings

Precision and Discounted Cumulative Gain (DCG)
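The slide names the two measures without giving formulas. As a reminder, the sketch below computes precision at rank k from binary judgments and DCG from graded ratings. The log2(rank + 1) discount is one widely used DCG formulation and is an assumption here, since the slides do not say which variant was applied.

```python
import math


def precision_at_k(relevant_flags, k):
    """Fraction of the top-k results judged relevant (flags are 0 or 1)."""
    return sum(relevant_flags[:k]) / k


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


p_at_2 = precision_at_k([1, 0, 1, 1], k=2)
gain = dcg([3, 2, 0, 1])
```

Both functions would be applied to each variant's re-ranked list, using the explicit ratings collected for the first 20 entries per query.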

Mean Average Precision • Strong improvement of all gaze-based variants over the baseline • Reading-Speed variant is less effective than GF and GLF • GLF might be a bit better than GF? • Significance levels: ** p < 0.01, * p < 0.05, (*) p < 0.1 (two-tailed paired t-test)
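For reference, mean average precision can be sketched as follows. Average precision here is normalised by the number of relevant entries found in the rated list, which is an assumption: the slides do not state whether unretrieved relevant pages were counted.

```python
def average_precision(relevant_flags):
    """Mean of precision values taken at each rank holding a relevant result."""
    hits = 0
    precisions = []
    for rank, flag in enumerate(relevant_flags, start=1):
        if flag:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_query_flags):
    """Average of the per-query average precision values."""
    return sum(average_precision(f) for f in per_query_flags) / len(per_query_flags)


ap = average_precision([1, 0, 1, 0])   # hits at ranks 1 and 3
map_score = mean_average_precision([[1, 0, 1, 0], [0, 1]])
```

Computing one MAP value per variant over the same queries gives the paired samples that a two-tailed paired t-test compares.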

Query Type Differentiation • Legend: B: Baseline, GF: Gaze-Filter, GLF: Gaze-Length-Filter, RS: Reading-Speed • Generally similar trend within each query type • MAP consistently decreases from main topic to sub-topic to related-topic queries • Narrow information needs, especially for related-topic queries • Wikipedia did not contain too many relevant pages • MAP of the Baseline decreases much more (-0.25) compared to GF (-0.17) and GLF (-0.18) • Asterisks mark significance of improvement over the baseline

Inappropriate Context • The baseline method extracts terms that might be far away from the user's current topic of interest. • Expanding the query with these terms can lead in a wrong direction that is unpredictable for the user. • The more distant the topic of the user's next query is (i.e. related-topic query), the more negative the effect of unsuitable expansion terms. • [Diagram labels: Pages about animal species; Gaze-based methods; Parts of animal perception (e.g. only visual and auditory perception); Animal perception; Baseline method; Animal species]

Conclusion • Gaze data can effectively be analyzed and used as a source for implicit feedback • Reading behavior detection on its own provides useful information for query expansion and re-ranking • Precision can be improved just by adding those terms to a query that have been read before Future Work • More realistic web search scenarios (e.g. not only on Wikipedia) • More sophisticated heuristics for interpreting gaze-based feedback • Gaze also for long-term implicit feedback (e.g. desktop search)

Interested? • Interested in implicit feedback for personalization? • E.g. scrolling behavior, click-through, mouse movements, eye tracking, EEG, bio sensors, emotions, magic, … • Please let me know! • georg.buscher@dfki.de •  Workshop?

Thank you for your attention! Special thanks for the travel grant by • ACM SIGIR • Amit Singhal, made in honor of Donald B. Crouch • Microsoft Research, made in honor of Karen Spärck Jones
