Letizia: Passive, Personalized Web Browsing Assistant

incorporating personal information
brent chun
sims296a-3

letizia
• recommends web pages during browsing based on user profile
• learns user profile using simple heuristics
• passive observation, recommend on request
• provides relative ordering of link interestingness
• assumes recommendations “near” current page are more valuable than others
[figure labels: user, letizia, heuristics, recommendations, user profile]
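The last two bullets (relative ordering, preference for nearby pages) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the keyword-weight profile, the `rank_links` name, the page names, and the `decay` discount per link hop are hypothetical and are not taken from Letizia itself.

```python
from typing import Dict, List, Tuple

# (page name, keywords found on the page, link hops from the current page)
Candidate = Tuple[str, List[str], int]

def rank_links(profile: Dict[str, float],
               candidates: List[Candidate],
               decay: float = 0.5) -> List[str]:
    """Order candidate pages by profile match, discounted by distance.

    Assumption: interest is the sum of profile weights for the page's
    keywords, and each extra hop multiplies the score by `decay`, so
    pages "near" the current page rank higher at equal interest.
    """
    scored = []
    for name, keywords, hops in candidates:
        interest = sum(profile.get(k, 0.0) for k in keywords)
        scored.append((interest * decay ** hops, name))
    # Highest score first; only the relative order is returned.
    return [name for score, name in sorted(scored, key=lambda s: -s[0])]
```

Returning only an ordering, not absolute scores, mirrors the "relative ordering of link interestingness" bullet.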

why is this useful?
• tracks and learns user behavior, provides user “context” to the application (browsing)
• completely passive: no work for the user
  • consequences?
• useful when user doesn’t know where to go
• no modifications to application: letizia interposes between the web and the browser
  • consequences?

consequences of passiveness
• weak heuristics
  • example: user clicks through multiple uninteresting pages en route to an interesting one
  • example: user browses to an uninteresting page, then heads to Nefeli for a coffee
  • example: hierarchies tend to get more hits near the root
• cold start
• no ability to fine tune profile or express interest without visiting “appropriate” pages

open issues
• how far can passive observation get you?
• for what types of applications is passiveness sufficient?
• profiles are maintained internally and used only by the application. some possibilities:
  • expose to the user (e.g. fine tune profile)?
  • expose to other applications (e.g. reinforce belief)?
  • expose to other users/agents (e.g. collaborative filtering)?
  • expose to web server (e.g. cnn.com custom news)?
• personalization vs. closed applications
• others?

lifestreams
• lifestream = time ordered stream of documents + filters + “agents”
• filters provide views (like rdbms) called substreams
• “agents” attach to the ui, streams, and documents
  • provide (condition,action) pairs
• no machine learning
[figure: a lifestream shown as a time-ordered row of lifestreams documents dated Oct 19, 1998 to Oct 21, 1998; lifestream operations: new, clone, xfer, find, summ]
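A minimal sketch of the model on this slide, assuming a hypothetical in-memory representation. The names `Document`, `Lifestream`, `substream`, and `attach_agent` are illustrative only and are not the Lifestreams system's API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List, Tuple

@dataclass
class Document:
    created: date
    kind: str
    text: str

Condition = Callable[[Document], bool]
Action = Callable[[Document], None]

@dataclass
class Lifestream:
    docs: List[Document] = field(default_factory=list)
    agents: List[Tuple[Condition, Action]] = field(default_factory=list)

    def attach_agent(self, condition: Condition, action: Action) -> None:
        """Register a (condition, action) pair on the stream."""
        self.agents.append((condition, action))

    def add(self, doc: Document) -> None:
        """Insert a document, keep the stream time ordered, run agents."""
        self.docs.append(doc)
        self.docs.sort(key=lambda d: d.created)
        for condition, action in self.agents:
            if condition(doc):
                action(doc)

    def substream(self, predicate: Condition) -> List[Document]:
        """A filter yields a view of the stream, still in time order."""
        return [d for d in self.docs if predicate(d)]
```

Note that both the filter and the agent condition are exact predicates, which matches the "no machine learning" bullet: the user has to specify them precisely.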

lifestreams assessment
• linear stream of documents is a poor metaphor if used alone
  • don’t tell me to abandon my hierarchies!
  • problems: managing complexity, large “working sets”, etc.
• stated problem: too many apps, too many file xfers, too many format xlations, too many hierarchies
  • lifestreams don’t help with the first three and simply replace the fourth
• most of the techniques used apply equally well to hierarchies
• no machine learning = more work for the user

lifestreams assessment cont.
• filters are nice, but how do you write one?
  • application-specific, but we already knew this
  • example: “all the email I haven’t responded to”
• agents are nice, but how do you write one?
  • application-specific, but we already knew this
  • agents have limited applicability

open issues
• new metaphors to manage complexity
• easy ways to create filters/agents
• allow “fuzzy” filters
  • lifestreams: filters need to be precisely specified
  • use machine learning + user feedback to relax this
• associate actions with filters
• tight integration of filters, agents w/ applications
• apply ideas in lifestreams to hierarchies
• others?

learning interface agents
• add agents in the ui, delegate tasks to them
• use machine learning to improve performance
  • learn user behavior, preferences
• useful when:
  • 1) past behavior is a useful predictor of the future
  • 2) wide variety of behaviors amongst users
• examples:
  • mail clerk: sort incoming messages into the right mailboxes
  • calendar manager: automatically schedule meeting times?

advantages
• 1) less work for user and application writer
  • compare w/ other agent approaches
  • no user programming
  • significant a priori domain-specific and user knowledge not required
• 2) adaptive behavior
  • agent learns user behavior, preferences over time
• 3) user and agent build trust relationship gradually
  • claimed advantage: user constructs model of how agent makes decisions over time
  • real users: do the right thing!

machine learning
• 1) learn by observation
  • observe user, record (situation,action) pairs
  • use “similar” past (situation,action) pairs to predict action for new situations
  • similarity = weighted difference of situation features
  • weights assigned based on feature/action correlations
• algorithm
  • take n closest situations, compute scores for associated actions
  • recommend (or perform) action with highest score
  • use (situation,action) pairs to explain recommendations
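The algorithm on this slide is a form of nearest-neighbor prediction over remembered (situation,action) pairs. The sketch below is one possible reading, under stated assumptions: situations are dictionaries of categorical features, feature weights are supplied by hand rather than derived from feature/action correlations, and the `1 / (1 + distance)` scoring rule is an illustrative choice, not the one used by the agents described in the talk.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Situation = Dict[str, str]
Pair = Tuple[Situation, str]  # (situation, action)

def distance(a: Situation, b: Situation, weights: Dict[str, float]) -> float:
    """Weighted difference: sum the weights of features that disagree."""
    return sum(w for f, w in weights.items() if a.get(f) != b.get(f))

def recommend(memory: List[Pair], weights: Dict[str, float],
              new: Situation, n: int = 3) -> Tuple[Optional[str], List[Pair]]:
    """Return the best-scoring action and the pairs that explain it."""
    if not memory:
        return None, []
    # Take the n closest remembered situations.
    nearest = sorted(memory, key=lambda p: distance(p[0], new, weights))[:n]
    # Closer situations contribute more to their action's score (assumed rule).
    scores: Dict[str, float] = defaultdict(float)
    for situation, action in nearest:
        scores[action] += 1.0 / (1.0 + distance(situation, new, weights))
    best = max(scores, key=lambda a: scores[a])
    return best, nearest
```

Returning the nearest pairs alongside the action corresponds to the last bullet: the same remembered examples that produced the recommendation can be shown to the user as its explanation.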

machine learning cont.
• 2) learn by user feedback
  • indirect feedback (e.g. ignore recommendation)
  • direct feedback (e.g. don’t do this again)
  • database of priority ratings
• 3) learn by being trained
  • train agent by giving examples of desired behavior
  • e.g. save all messages from bnc@cs.berkeley.edu in the sims296a-3 mailbox

open issues
• how far can black box treatment of apps get you?
  • example: mail clerk integration w/ ui requires access to application internals; what if this weren’t the case?
  • tight integration with application user interface
  • access to internal events/state of significance
• easy way to enable third-party developers to write personalization modules for applications?
• chaining (situation,action) pairs to perform complex tasks
  • e.g. monitor ACM digital library -> look for interesting papers -> download them -> file them -> notify me via email -> print out
• others?

sonia
• automatic construction of document clusters
• categorization based on full-text comparisons
• automatically classify new docs into existing clusters
• multiple cluster hierarchies imposed on same data
• examples: categorize search results into clusters, categorize files in user’s home directory
[figure: create clusters: documents, stemmer, feature selector, clusterer; classify documents: documents, classifier; classes: cs298-1, is290-2, is296a-3, project, discussion]

creating clusters
• stemmer: e.g. walking, walked, walk -> walk
• feature selector
  • 1) remove stopwords, e.g. the, and, is, ...
  • 2) remove terms with freq < 3 or freq > 1000
• clusterer
  • 1) hierarchical agglomerative clustering
  • 2) iterative clustering technique
  • document similarity based on term overlap
  • cluster similarity = pairwise ave. of document similarities
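The pipeline on this slide can be sketched end to end. This is a toy illustration under several assumptions: `stem` is a crude suffix stripper standing in for a real stemmer, "freq" is read as total occurrences in the collection, term overlap is measured with Jaccard similarity, and only the agglomerative step is shown (not the iterative technique). None of the function names come from SONIA.

```python
from collections import Counter
from itertools import combinations
from typing import List, Set

STOPWORDS = {"the", "and", "is", "a", "of", "to", "in"}

def stem(word: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def features(docs: List[str], min_freq: int = 3, max_freq: int = 1000) -> List[Set[str]]:
    """Drop stopwords, stem, then drop terms with freq < min or > max."""
    tokenized = [[stem(w) for w in d.lower().split() if w not in STOPWORDS]
                 for d in docs]
    freq = Counter(t for doc in tokenized for t in doc)
    keep = {t for t, c in freq.items() if min_freq <= c <= max_freq}
    return [set(doc) & keep for doc in tokenized]

def doc_sim(a: Set[str], b: Set[str]) -> float:
    """Term overlap as Jaccard similarity (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_sim(c1: List[int], c2: List[int], feats: List[Set[str]]) -> float:
    """Pairwise average of document similarities across two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(doc_sim(feats[i], feats[j]) for i, j in pairs) / len(pairs)

def agglomerate(feats: List[Set[str]], k: int) -> List[List[int]]:
    """Merge the two most similar clusters until k clusters remain."""
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > max(k, 1):
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: cluster_sim(clusters[p[0]], clusters[p[1]], feats))
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return clusters
```

On a tiny collection the slide's thresholds would discard every term, so a small example needs `min_freq=1`; the defaults only make sense for collections of realistic size.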

classifying documents
• pachinko machine (bayesian classification)
• uses 50 “most informative features” for each cluster
  • significant reduction in computational cost
  • claim: often sufficient for accurate classification
• obvious trade-off between compute time and accuracy
  • most accurate case: compare new document with every document in every cluster before assigning it; the compute time may not justify the gain in accuracy
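To make the feature-reduction idea concrete, here is a hedged sketch of Bayesian classification over a reduced feature set. It is not SONIA's classifier: the slide does not say how "most informative" is measured, so a smoothed log-odds ranking is assumed; the per-cluster top features are pooled into one shared vocabulary; clusters are assumed non-empty; and the hierarchical "pachinko" routing between levels is not modeled. The `train` and `classify` names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

def train(clusters: Dict[str, List[List[str]]], top_n: int = 50) -> Dict[str, dict]:
    """Fit a multinomial naive Bayes model on the top_n terms per cluster."""
    counts = {c: Counter(t for doc in docs for t in doc)
              for c, docs in clusters.items()}
    vocab = set().union(*counts.values())

    def prob(term_counts: Counter, t: str) -> float:
        # Laplace-smoothed term probability.
        return (term_counts[t] + 1) / (sum(term_counts.values()) + len(vocab))

    selected = set()
    for c in counts:
        rest = Counter()
        for other, other_counts in counts.items():
            if other != c:
                rest.update(other_counts)
        # Assumed informativeness: log-odds of the term inside vs. outside c.
        ranked = sorted(
            vocab,
            key=lambda t: (-math.log(prob(counts[c], t) / prob(rest, t)), t))
        selected.update(ranked[:top_n])

    total_docs = sum(len(docs) for docs in clusters.values())
    model = {}
    for c, docs in clusters.items():
        kept = Counter({t: n for t, n in counts[c].items() if t in selected})
        denom = sum(kept.values()) + len(selected)
        model[c] = {
            "prior": math.log(len(docs) / total_docs),
            "loglik": {t: math.log((kept[t] + 1) / denom) for t in selected},
        }
    return model

def classify(model: Dict[str, dict], tokens: List[str]) -> str:
    """Pick the cluster with the highest posterior; unselected terms are ignored."""
    def score(c: str) -> float:
        m = model[c]
        return m["prior"] + sum(m["loglik"][t] for t in tokens if t in m["loglik"])
    return max(model, key=score)
```

The cost saving shows up in `classify`: each new document is scored against a small table of term likelihoods per cluster instead of being compared with every stored document.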

why is this useful?
• helps the user understand the contents of a large collection of documents (e.g. results from a database query)
• automatically constructs multiple categorizations of the same data
  • e.g. user may take the time to categorize personal files in a single hierarchy, but is unlikely to do this in multiple ways
• saves time by automatically classifying documents
• most applicable when consequences of error are low

open issues
• adding importance, confidence to the system
• using document structure for weighting terms (e.g. terms in abstract vs. terms in text)
• support for different document types (e.g. PS!)
• others?
