300 likes | 433 Vues
Personalised Search on the World Wide Web. Presented by: Team Grape. About. The Paper: Personalized Search on the World Wide Web Alessandro Micarelli Fabio Gasparetti Filippo Sciarrone Susan Gauch. Team Grape: Jin Wu Kewei Duan Linh Duy To Miaolai Han Takazumi Matsumoto.
E N D
Personalised Search on the World Wide Web Presented by: Team Grape
About The Paper: Personalized Search on the World Wide Web Alessandro Micarelli Fabio Gasparetti Filippo Sciarrone Susan Gauch Team Grape: Jin Wu KeweiDuan LinhDuy To Miaolai Han Takazumi Matsumoto The interactive stuff: MOT lesson Grapple lessons: Text only, Depth first
Overview • Introduction • A Short Overview on Personalised Search • Contextualised Search • Personalisation Based on Search Histories • Personalisation Based on Rich Representations of User Needs • Collaborative Search Engines • Adaptive Result Clustering • Hyperlink-Based Personalisation • Combined Approaches to Personalisation • Conclusions
Introduction Personalisation “adapting the results according to each user’s information needs” (Micarelli et al., 2007, p. 195) • Searching the WWW • Dealing with the information overload • Limitations of traditional search engines • Information access paradigms: • Searching by surfing (hyperlink directories) • Searching by query (Information Retrieval) • Recommendation (suggested items)
Content and Collaborative-based Personalisation • Originally: information retrieval • Content-based: • Consider individuals - mostly used • Polysemy & synonymy leads to vocabulary problem → irrelevant information • Collaborative-based: • Consider models of different users • User similarity → similar information needs • Social navigation • Not employed in search engines
User Modelling in Personalised Systems • User modelling/profiling techniques: • Track visited pages & search history → important feature learned → more relevant information • Simplest cases: registration form or questionnaire • More complex cases: user model consists of a dynamic information structure • Examples: • Google Alert: explicit approach & routing query → limited • Google Personalized Search: deliver customised search based on user profile • User modelling components affect search in 3 distinct phases: • Part of retrieval process • Re-ranking • Query modification
Source of Personalisation • Data mining & machine learning • Relevant feedback & query expansion • Explicit relevant feedback • Implicit relevant feedback • Further sources: desktop search systems
An Overview on Personalisation Approaches • Current context: based on implicit feedback using client-based software • Search History: • Limited to web search history • Done during retrieval process → fast response • Rich user models: explicit feedback → build rich representation of user needs • Collaborative approach: relevant resources based on previous ratings by user with similar tastes & preferences • Result clustering: results grouped into clusters, each related to same topic • Hyper textual data: include additional factors in ranking algorithm
Contextual Search • A new approach for search • The information system proactively suggests information based on a person’s working context • Just-in-Time IR (JITIR) • Rhodes
JITIR • Monitors the user’s actions • Non-intrusive • Automatically identify relevant information • Retrieve resources automatically
Based on Agents • Remembrance Agent • Margin Notes Agent • Jimminy Agent
Personalisation Based on Search Histories Flight Visa Credit Card Citizenship Travel
Online Approaches • Capture history information as soon as they are available, affecting user models and providing personalised results taking into consideration the last interactions of the user • Two different types of information are collected: • submitted queries • snippets
Offline Approaches • Exploit history information in a distinct pre-processing step, usually analysing relationships between queries and documents visited by users • CubeSVD Algorithm based on the click-through algorithm • Time-consuming
Personalisation Based on Rich Representations of User Needs Three prototypes ifWeb, Wifs, InfoWeb • Based on complex representations of user needs (user models) • Built using explicit user feedback on results • Based on frames and semantic networks (AI)
ifWeb • User model-based intelligent agent • Weighted semantic network for user profile • Autonomous focused crawling to find related documents based on previously identified documents • Updates user profile using user feedback • Reduces the weight of unused concepts (rent)
Wifs • Content-based approach • Filters HTML and text documents from AltaVista, reordering links based on UM • Frame-based user model structure A frame has slots which contains terms (topics), associated with other terms (co-keywords), forming a semantic network • The terms are stored in a Terms DataBase that is created beforehand (by experts) • Instead of traditional IR, the relevance of a document is calculated from the occurrence and relevance of terms in the document
Wifs • Content-based approach • Frame-based user model structure A frame has slots which contains terms (topics), associated with other terms (co-keywords), forming a semantic network • The terms are stored in a Terms DataBase that is created beforehand (by experts) • Filters HTML and text documents from AltaVista, reordering links based on UM • Instead of traditional IR, the relevance of a document is calculated from the occurrence and relevance of terms in the document Representations of the User model (a) and Document model (b) (From Micarelli et al., 2007)
InfoWeb • Content-based approach • Adaptive retrieval of documents in digital libraries, based on Vector Space (IR) • Stereotype knowledge base Contains most significant documents for a specific category of user (domain), created beforehand (by an expert) • k-means clustering on document collection beforehand Each cluster is seeded by a representative document for each class of user • User model starts as a stereotype, evolves based on feedback
Collaborative Search Engine • ‘SearchParty’ module • Social filtering • Stores user queries and the results users clicked • Knowledge Sea • Social adaptive navigation system • Exploits both traditional IR and social navigation approaches • Results represented by colour lightness
Collaborative Search Engine • Calculate similarity measures among user needs • Identified by queries, selected resources • Two queries might contain no common terms but returns similar results • E.g. ‘PDA’ and ‘handheld computer’ • Statistical model • Based on the probability a page was selected for a given query • Focus on relative frequency instead of content-analysis techniques
Collaborative Search Engine • Compass Filter • Based on web communities • Pre-processing the web structure • If user frequently visit a community, the results in the same community are boosted
Adaptive Result Clustering • Traditional Search Engines • Rank the list by similarity of query and page • Might take a long time • Important that users clearly describe what they are looking for • Organise the results • By grouping pages into folders and sub folders • On a graphical interactive map
Adaptive Result Clustering • Clustering • Query process needs to be fast • Usually performed after retrieval of query results • Does not require pre-defined categories • Provides concise and accurate descriptions • Further clustering systems • SnakeT • Scatter / Gather
Hyperlink-Based Personalisation • Main algorithms: • PageRank: PR value • HubFinder: hub value • HubRank: PR value & hub value
Combined Approaches to Personalisation Perform personalisation using multiple adaptive approaches • Outride: Browsing history & current context • infoFACTORY: Integrate web tools & services
Outride Outride includes: • Contextualisaion Interrelated conditions that occur within an activity • Individualisation Characteristics that distinguish an individual
infoFACTORY • A large set of integrated web tools and services that are able to evaluate and classify documents retrieved following a user profile • New • Has potential • Interesting
Conclusions • Information is crucial to users • Need to filter and personalise resources to deal with information overload successfully • Increases search engine accuracy and reduces time wasted sorting through irrelevant results • Can be extended e.g. targeted advertising • Some systems already in use, others under development (e.g. Semantic Web) • Future directions: • Predicting future user behaviour (plan-recognition) • Language semantic analysis (Natural Language Processing)
Thanks for listening Any questions?