Dynamic information filtering. Patrick Baudisch Xerox PARC March 26, 2001. Contents. Introduction Requirements and related work The TV Scout …as a retrieval system …and as a filtering system How it works The QuerySet Architecture Building QuerySet filtering systems
Dynamic information filtering Patrick Baudisch Xerox PARC March 26, 2001
Contents • Introduction • Requirements and related work • The TV Scout • …as a retrieval system • …and as a filtering system • How it works • The QuerySet Architecture • Building QuerySet filtering systems • Manual profile editing • Conclusions
Motivation: Information overload • Too many: research papers, books, movies, web pages, … even TV programs! • Goal: alleviate information overload
IF, IR, and dynamic filtering • Analytic information seeking strategies • Retrieval (IR): changing interests, stable database • Filtering (IF): changing sources, stable interests • Many applications fit in: dictionaries => IR; music => IF • Others fit into neither niche: high source and need change rate • Example: stock market • [Oard 96]: “Grand challenge” • [Figure: information source change rate plotted against information need change rate, with regions labeled Filtering, Retrieval, and Dynamic information filtering]
Objective of dynamic filtering • Adaptation speed is crucial • (user profile = interest) is crucial for filtering accuracy • Interest changes: (profile ≠ interest) => filtering quality drops • Adapt profile as fast as possible • Subject of this thesis: Filtering architecture for maximum adaptation speed
Requirements • Requirement 1: Exhaustiveness (arbitrary interests) • (King and Sacramento), but not (King and Queen), INFOS [Mock 96] • Requirement 2: Output style (single ranking preferred) • Boolean output, Info. Lens [Malone 87]; Categories, SIFT [Yan 95] • Requirements 3-5: Adapt to interest changes
R3: Learning from relevance feedback • [Figure: user profile and actual interests plotted over time; labels: interest, delayed profile, error] [Jennings 91, p.207] • Newt [Sheth and Maes 93] • WebMate [Chen and K. Sycara] • GroupLens [Konstan et al 97]
R4: Limitations of manual profile editing • Problems with gradual changes • [Figure: user interest over time; labels: error, delayed reaction] • Rule-based systems • Information Lens [Malone et al 87] • ISCREEN [Pollock 88] • INFOSCOPE [Fischer 91]
Resulting design guideline • Build a filtering system that allows • learning from relevance feedback (for gradual changes) • users to edit their profiles directly (for abrupt changes) • and that uses a “meaningful” model for the user profiles, so that users understand how to edit them
[Screenshot labels: query frame, content frame]
Q1. Select a query • [Screenshot labels: exact match, best match]
Q2. Read & retain program descriptions • …print them out, take them home • [Screenshot labels: retention menus, laundry list, video labels, program description table, program description list]
Q3. Suggestions • [Screenshot label: suggest queries]
QuerySet profile: personal program per single mouse click • [Screenshot labels: QuerySet profile editor (expert mode), best match profile (QuerySet profile)]
Summary • TV Scout interface with starting page • [Diagram labels: QSA profile editor (experts), viewing time profile editor, channel profile editor, QSA profile editor, suggest queries, QSA menu, query menus, text search, retention menus, program description table, laundry list, video labels, program description list]
Incremental usage • [State diagram: start, queries (one-shot state), bookmarks (reuse state), QSA profile (filtering state); transition labels T1, T2, T3; system action labels S1, S2, S3 (system provides, system suggests, system compiles, system learns); user action labels U1, U2, U3 (user writes, user defines, user updates)]
Studies done on the TV Scout so far • Comparison of individual query classes • > 13,000 registered users • Predefined queries (genres) covered most interests • Text search for what genres do not cover • Search for actors, series, topics • “Opinion leader” recommendation was 5th most popular query • Long term study still outstanding
A QuerySet profile vs. other user profiles • Queries in a QSA profile are intended to represent different interests, e.g. news, sports, comedy shows • != query representation nodes • != concepts (or facets) that are part of a query/interest • != IR query that represents a single interest only • This is not (necessarily) an inference network • How does the user like news compared to sports…? • [Diagram labels: d1, d2, …, dj-1, dj; r1, r2, r3, …, rm; q1 … qn (QSA profile); user profile]
Objective of that decomposition • Several kinds of interest change can be handled with minor profile changes • “I am not in the mood for action movies today” => Update only the query weight in the aggregation function. Benefit: all queries remain unaffected • “My taste in action movies has changed” => Edit only the action movies query. Benefit: all other queries remain unaffected
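The two kinds of change above can be illustrated with a minimal Python sketch. It is not the TV Scout implementation: the class names `Query` and `QuerySetProfile`, the methods `set_weight` and `replace_query`, and the program fields `genre` and `lead` are all hypothetical, and the weighted sum is only one possible aggregation function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Query:
    """One interest: a rating function plus a weight (hypothetical structure)."""
    name: str
    rate: Callable[[dict], float]  # returns a query rating in [0, 1]
    weight: float = 1.0


@dataclass
class QuerySetProfile:
    """A profile as a set of independent, weighted queries (assumed model)."""
    queries: Dict[str, Query] = field(default_factory=dict)

    def add(self, query: Query) -> None:
        self.queries[query.name] = query

    def set_weight(self, name: str, weight: float) -> None:
        # Mood change: only the aggregation weight changes, no query is edited.
        self.queries[name].weight = weight

    def replace_query(self, name: str, rate: Callable[[dict], float]) -> None:
        # Taste change: only this one query is edited, the others stay as-is.
        self.queries[name].rate = rate

    def output_rating(self, program: dict) -> float:
        # Assumed aggregation: weighted sum of the query ratings.
        return sum(q.weight * q.rate(program) for q in self.queries.values())


profile = QuerySetProfile()
profile.add(Query("action movies", lambda p: 1.0 if p["genre"] == "action" else 0.0))
profile.add(Query("comedies", lambda p: 1.0 if p["genre"] == "comedy" else 0.0))

# Illustrative program record; the field names are made up for this sketch.
film = {"title": "Terminator 2", "genre": "action", "lead": "Schwarzenegger"}

rating_before = profile.output_rating(film)

# "I am not in the mood for action movies today": weight update only.
profile.set_weight("action movies", 0.0)
rating_no_mood = profile.output_rating(film)

# "My taste in action movies has changed": edit only that query.
profile.set_weight("action movies", 1.0)
profile.replace_query(
    "action movies",
    lambda p: 1.0 if p["genre"] == "action" and p["lead"] != "Schwarzenegger" else 0.0,
)
rating_new_taste = profile.output_rating(film)
```

In both cases the comedies query is never touched, which is the benefit the slide claims for the decomposition.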
Make queries correspond to interests • Selection principle • Make a query out of whatever will change as a whole • It is interests that change • => Use queries corresponding to interests • Negative examples • Data fusion (e.g. [Fox 94, Lee 97]) => redundancy • Automated collaborative filtering => overlap • Positive example • The incremental usage supported by QSA systems: use as query, then bookmark, then use as profile • [State diagram: start, queries (one-shot state), bookmarks (reuse state), QSA profile (filtering state); labels T1–T3, S1–S3, U1–U3]
How to build QSA systems? Reuse! • Query-executing IR/IF subsystems: Sybase, FreeWAIS, Print import, <more> • IR/IF subsystem running the aggregation function • [Architecture diagram labels: query ratings, pre-conversion, output rating, post-conversion, relevance ratings; relevance feedback, re-post-conversion, aggregation feedback, re-pre-conversion, query feedback]
Aggregation subsystem • Example • User profile = {action movies, comedies, Tips by Lars} • Aggregation: turn these three rankings into a single ranking • Is a program {0.4 action movie, 0.3 comedy, “excellent” by Lars} better than {0 action movie, 0.8 comedy, “ok” by Lars}? • Notion of tradeoffs similar to IR/IF systems on term frequencies • Query = {“information”, “retrieval”} • Is a web page {0.4 information, 0.3 retrieval} better than another web page {0 information, 0.8 retrieval}? • => Reuse IR/IF systems • Weighted request and indexing retrieval model • Output rating(object) = Sum of query ratings • TV Scout: Overlap between queries was small enough => This model is sufficient
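The sum model named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TV Scout code: the function names are hypothetical, and the numeric values assigned to the verbal ratings “excellent” and “ok” in `VERBAL_TO_NUMERIC` are invented here to stand in for a pre-conversion step. The outcome of the comparison depends on those assumed values.

```python
from typing import Dict, List, Tuple, Union

Rating = Union[float, str]

# Assumed pre-conversion table: verbal ratings mapped onto the numeric scale.
# These numbers are illustrative and do not come from the slides.
VERBAL_TO_NUMERIC = {"excellent": 1.0, "ok": 0.5}


def pre_convert(rating: Rating) -> float:
    """Bring a query rating into a common numeric form before aggregation."""
    if isinstance(rating, str):
        return VERBAL_TO_NUMERIC[rating]
    return float(rating)


def output_rating(query_ratings: Dict[str, Rating]) -> float:
    """Output rating(object) = sum of the (pre-converted) query ratings."""
    return sum(pre_convert(r) for r in query_ratings.values())


def rank(programs: Dict[str, Dict[str, Rating]]) -> List[Tuple[str, float]]:
    """Turn per-query ratings into a single ranking, best program first."""
    scored = [(name, output_rating(ratings)) for name, ratings in programs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# The two programs from the example above, under hypothetical names.
programs = {
    "program A": {"action movies": 0.4, "comedies": 0.3, "Tips by Lars": "excellent"},
    "program B": {"action movies": 0.0, "comedies": 0.8, "Tips by Lars": "ok"},
}
ranking = rank(programs)
```

With the assumed conversion values, program A is ranked above program B; a different conversion table could reverse the order, which is exactly the tradeoff the aggregation function has to encode.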
Simple case: “Rate a query” • What is the general concept behind profile editors? • Rate a query as a whole: “How do you like science fiction movies?” • => This is fast, because users can take experience with and expectations about the query into account • But what if the user loves news programs, but wants only a few top-ranked ones? (redundancy between news)
General case: “Rate a set” • Generalization • Ask user to rate arbitrary set of objects • Example: “How do you like {Back to the future, Brazil, Blade runner, 1984…Metropolis}?” • User-aggregated relevance feedback • The user mentally assigns a rating to each object • The user aggregates these and tells the system the result • This saves the effort of communicating individual ratings • Benefit • “Rate a query” is a special case of “Rate a set” • This makes both compatible with relevance feedback
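One way to see why “rate a set” is compatible with ordinary relevance feedback is the following Python sketch. It rests on a simplifying assumption that is not stated in the slides: the system hands the single aggregated rating back to every member of the set. The function names `expand_set_rating` and `rate_query` and the rating values are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def expand_set_rating(objects: Iterable[str], rating: float) -> Dict[str, float]:
    """Turn one user-aggregated rating into per-object relevance feedback.

    Assumption for this sketch: every member of the set receives the
    aggregated rating unchanged.
    """
    return {obj: rating for obj in objects}


def rate_query(query: Callable[[str], bool], catalog: List[str], rating: float) -> Dict[str, float]:
    """'Rate a query' as a special case: the set is the query's result set."""
    matching = [obj for obj in catalog if query(obj)]
    return expand_set_rating(matching, rating)


# Titles named in the example above (the elided ones are left out).
titles = ["Back to the Future", "Brazil", "Blade Runner", "1984", "Metropolis"]

# One rating for the whole set instead of five individual ratings.
set_feedback = expand_set_rating(titles, 0.9)

# The same mechanism driven by a (toy) query instead of a hand-picked set.
query_feedback = rate_query(lambda title: title.startswith("B"), titles, 0.2)
```

Both calls produce the same kind of per-object feedback, so a learner that already consumes relevance feedback would need no separate code path.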
Combine both • Goal: find a way • as simple and fast as “rate a query” • as flexible as “rate a set” • Solution • Use top and bottom ranks of queries (and others) • Extensible to arbitrary ranks -> Histogram-based interfaces • “How much do you like top-ranked news programs?” • “How much do you like bottom-ranked news programs?”
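As a rough sketch of how the two answers could be used, the Python function below interpolates between the rating given for bottom-ranked and for top-ranked results of a query. Linear interpolation is an assumption made for illustration; the slides do not specify the mapping. The function name and the preference scale from -1 to 1 are likewise hypothetical.

```python
def preference_for_rank(query_rating: float, top_pref: float, bottom_pref: float) -> float:
    """Estimate how much the user wants a program, given its query rating.

    query_rating: rating of the program within the query, 0 (bottom) to 1 (top).
    top_pref / bottom_pref: the user's answers for top- and bottom-ranked results.
    Assumption: preferences in between are obtained by linear interpolation.
    """
    if not 0.0 <= query_rating <= 1.0:
        raise ValueError("query_rating must lie in [0, 1]")
    return bottom_pref + query_rating * (top_pref - bottom_pref)


# A user who loves top-ranked news but rejects bottom-ranked news.
news_top = preference_for_rank(1.0, top_pref=1.0, bottom_pref=-1.0)
news_middle = preference_for_rank(0.5, top_pref=1.0, bottom_pref=-1.0)
news_bottom = preference_for_rank(0.0, top_pref=1.0, bottom_pref=-1.0)

# "Rate a query" as the special case where both answers are equal.
flat = preference_for_rank(0.9, top_pref=0.8, bottom_pref=0.8)
```

Asking for more ranks than just top and bottom would replace the single line segment with a piecewise curve, which is the direction the histogram-based interfaces take.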
Profile editor framework • Query-wise: preferable if few queries (e.g. query inserted) • Property-wise: preferable if many queries (e.g. mood change) • Few URF samples (simplicity): form-based interfaces; paintable interfaces • Many URF samples (accuracy): histogram-based interfaces • [Screenshots: profile editor with query labels Information, Sports, Golf, C. music, Theater, Series, Soap, Basketball, Action movies, Comedy, M. Arts, Sitcom, Schwarz.., B. Hills, M.A.S.H., Simpsons and buttons history, undo, save, execute; program list (Dead Poets Society, Bayern-Manchester, Amazons on Mars, Le Grand Bleu, Sat1 ran) with buttons Skip and Skip all]
Multiple select applied to interest • [Two screenshots of the profile editor; query labels: Theater, Classic music, Golf, Information, Sports, Basketball, Series, Endorsed by Paul, Comedy, “Action AND Comedy”, Beverly Hills 90210, M.A.S.H., Endorsed by Lars, Action movies, Schwarzenegger]
Multiple select versus painting • Multiple select: pixel selection first, then function selection • Painting: function (tool) selection first, then pixel selection (painting) • Immediate visual feedback allows differentiated input
Layout by co-occurrence • [Figure: example layout of menu items: Chick Sand, Milk, Fruit Pie, Coffee, Iced Tea, Danish, Cookie, Fish sand, Bacon, Milk Shake, French Toast, Orange Juice, Onion Rings, Cola, French Fries, Pancakes, English muffin, Eggs, Cheese Burger, Hash Browns, Ham, Sundae, Ham Burger, Root Beer, TOTAL]
A paintable profile editor • Insertion of “sitcom” • [Two screenshots of the editor before and after the insertion; query labels: Information, Sports, Golf, C. music, Theater, Series, Soap, Basketball, Action movies, Comedy, M. Arts, Sitcom, Schwarz.., B. Hills, M.A.S.H., Simpsons; buttons: history, undo, save, execute]
Paintable time and channel editors • Interval sliders are split into segments • no handles, just paint the addition • Intervals labeled as entities to reduce cluttering
QSA vs. requirements • Requirement 1: Exhaustiveness: arbitrary interests • Requirement 2: Output style: single ranking • Requirements 3-5: Adapt to interest changes: user-aggregated relevance feedback; reuse of old queries (weight set to zero); relevance feedback
Achievements of the dissertation • (1) a new generic IF system architecture designed for the efficient handling of highly dynamic interests (the QuerySet Architecture) • (2) a new paradigm of high-level access to user profiles (user-aggregated relevance feedback) • (3) a framework of new user interface interaction styles providing users with this high-level access • (4) a proof of concept implementation (TV Scout)
Future work • (1) new application areas • (2) new query classes • (3) improved aggregation functions • (4) new profile editor user interfaces • (5) empirical work.
Image processing • [Figure: luminance histogram (x-axis: luminance; y-axis: number of pixels) showing the current state and the desired state of the image; in the current state there are no black pixels, only rather dark pixels, and there are no white pixels] • Black handle assigns 0% luminance • Gray handle assigns 50% luminance • White handle assigns 100% luminance
Slide rule • Merge histograms “zipper style” • [Figure: two scales, one for action movies and one for comedies, each marked 0, ¼, ½, ¾, 1]
Histogram-based interfaces • Sports: 32 out of 333 sports programs per week selected • Entertainment: 512 out of 914 movies per week selected • Comedy shows: 536 out of 536 comedy shows per week selected • Martial arts: 14 out of 14 martial arts programs per week selected • Overall: 1094 out of 1797 programs per week selected • [Screenshot: program list (Terminator 2, Dead Poets Society, Amazons on Mars, separator line, Le Grand Bleu, Back to the Future); legend: hot!, selected, rejected; buttons: Undo, Save]
The jelly interface • Overall: 32 out of 59 programs per week selected • [Screenshot labels: Selected for output; Action, Comedy, News; buttons: Save, Undo, Auto]