Web Search with Variable User Model

Peter Gurský, Stanislav Krajči, Tomáš Horváth, Róbert Novotný, Jozef Jirásek, Veronika Vaneková, Peter Vojtáš. PF UPJŠ Košice, MFF UK Praha. Datakon, 22.10.2007.


Presentation Transcript


  1. Web Search with Variable User Model
     Peter Gurský, Stanislav Krajči, Tomáš Horváth, Róbert Novotný, Jozef Jirásek, Veronika Vaneková, Peter Vojtáš
     PF UPJŠ Košice, MFF UK Praha
     Datakon, 22.10.2007

  2. Problem: Information Overload
     • Multiple sources
     • Different structure, layout, usage
     • Various software tools with different sets of answers

  3. Objectives
     • Integrate data from heterogeneous sources
     • Find an adequate number of answers that match user preferences
     • Find a suitable representation of user preferences

  4. System Architecture
     [Diagram; labels: Corporate memory, Ontology, HTML files, annotation, crawler, Top-k objects, query, WEB, evaluation, Middleware system, best objects]

  5. Text-Oriented Annotation
     • Regular expressions
     • Analysis of visual representation
     • Structural differences:
         • Element hierarchy
         • HTML attributes
         • HTML node values
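
The regular-expression approach on this slide can be illustrated with a minimal Python sketch. The HTML fragment, the class names, and the field names (`name`, `price`, `currency`) are all hypothetical; the slides do not show the actual patterns or page structure used by the authors.

```python
import re

# Hypothetical HTML fragment; the markup and field names are assumptions.
HTML = ('<div class="hotel"><h2>Hotel Slavia</h2>'
        '<span class="price">450 EUR</span></div>')

# Patterns tied to the assumed markup above.
NAME_RE = re.compile(r'<h2>(.*?)</h2>')
PRICE_RE = re.compile(r'<span class="price">\s*(\d+)\s*([A-Z]{3})\s*</span>')


def annotate(html):
    """Return a dict of attribute values extracted by regular expressions."""
    record = {}
    name = NAME_RE.search(html)
    if name:
        record["name"] = name.group(1).strip()
    price = PRICE_RE.search(html)
    if price:
        record["price"] = int(price.group(1))
        record["currency"] = price.group(2)
    return record


print(annotate(HTML))
```

Patterns like these break as soon as the page layout changes, which is presumably why the slide also lists visual and structural analysis.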

  6. Graphic-Oriented Annotation
     • Preliminary exploration.
     • Web pages may contain pictures, Flash animations, ... This information is not available from the web page source.
     • We use OCR processing and analysis of color, position, ...

  7. User Dependent Querying
     [Diagram; labels: Object display and evaluation, Evaluate, Display, Find Rules, Suitable Object Search (Top-k), Learning Preferences (IGAP), Find Top-k Objects, RDF repository, Preferences]

  8. Retrieving Preferences from User
     • Direct user specification
     • Collaborative filtering
     • Learning preferences from sample objects evaluated by the user
     • Iterative method: repeat the evaluation until the relevant objects are found

  9. Learning Preference from Evaluation

  10. Learning Preference from Evaluation

  11. Basic Fuzzy Set Types
     • Lower values are better
     • Higher values are better
     • Middle values are better
     • Either high or low, but not middle
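
The four fuzzy set types listed on this slide can be sketched as membership functions. This is only an illustration: the piecewise-linear shapes and the parameter names (`lo`, `peak`, `hi`) are assumptions, since the slides do not give the exact functions.

```python
def lower_is_better(x, lo, hi):
    """1.0 at or below lo, 0.0 at or above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def higher_is_better(x, lo, hi):
    """Mirror image of lower_is_better."""
    return 1.0 - lower_is_better(x, lo, hi)


def middle_is_better(x, lo, peak, hi):
    """Triangular shape: 0.0 at lo and hi, 1.0 at peak."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)


def extremes_are_better(x, lo, peak, hi):
    """Either high or low, but not middle: inverted triangle."""
    return 1.0 - middle_is_better(x, lo, peak, hi)
```

For example, with an assumed price range of 100 to 500, `lower_is_better(300, 100, 500)` gives a partial relevance halfway between the best and worst price.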

  12. Aggregation
     Each fuzzy set relates to one attribute, e.g. the number of stars, so we obtain a partial relevance for every attribute. The overall relevance is the result of aggregation:
     • Weighted average (continuous range):
         good_U = 2/3 * cheap_U + 1/3 * high-class_U
     • Rules (discretized range):
         evaluation_U = good IF (price ≤ 500 AND stars ≥ ***)
         evaluation_U = excellent IF (distance ≤ 1 km)
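
Both aggregation styles from this slide can be written down directly. In this sketch the function names are hypothetical, "***" is read as three stars, and two details the slide leaves open are filled with assumptions: when both rules fire the better label wins, and when no rule fires the result is `None`.

```python
def weighted_good(cheap, high_class):
    """Weighted average from the slide: 2/3 * cheap + 1/3 * high-class."""
    return 2 / 3 * cheap + 1 / 3 * high_class


def rule_evaluation(price, stars, distance_km):
    """Rule-based aggregation over discretized attribute values."""
    # Assumption: the better label wins when both rules apply.
    if distance_km <= 1:
        return "excellent"
    if price <= 500 and stars >= 3:
        return "good"
    # Assumption: the slide does not say what happens if no rule fires.
    return None


print(weighted_good(0.9, 0.3))
print(rule_evaluation(price=400, stars=3, distance_km=5))
```

The weighted average yields a continuous score suitable for ranking, while the rules assign each object to a discrete class.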

  13. [Figure; labels: User 1, User 2, User 3, User 4, Close, Far, Middle distance, Border, Middle price, Cheap, Middle price, Border]

  14. Relevant Object Search
     • Having retrieved local and global preferences, we can find the top-k objects according to user preferences
     • Do not browse and compute over all the data; use only the objects that are necessary
     • Use the 3-phased No Random Access algorithm, an improvement of Fagin's algorithm
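
To show how top-k objects can be found without reading all the data, here is a minimal sketch of the basic No Random Access (NRA) idea. It is not the 3-phased variant named on the slide, whose details are not given here. It assumes each attribute has a list of (object, score) pairs sorted by score in descending order, that scores lie in [0, 1], that every object appears in every list, and that the aggregation function is monotone. All names are hypothetical.

```python
def average(scores):
    return sum(scores) / len(scores)


def nra_top_k(lists, k, agg=average):
    """Return k object ids with the highest aggregated score, best first.

    Uses sorted access only and stops once no other object can still
    overtake the current top k.
    """
    m = len(lists)
    n = len(lists[0])
    seen = {}  # object id -> list of known scores (None if not yet seen)
    ranked = []
    for depth in range(n):
        # One round of sorted access: read the next entry of every list.
        last = []
        for i, lst in enumerate(lists):
            obj, score = lst[depth]
            seen.setdefault(obj, [None] * m)[i] = score
            last.append(score)
        # Lower bound: unknown scores count as 0.
        worst = {o: agg([0.0 if s is None else s for s in known])
                 for o, known in seen.items()}
        # Upper bound: unknown scores are at most the last value read.
        best = {o: agg([last[i] if s is None else s
                        for i, s in enumerate(known)])
                for o, known in seen.items()}
        ranked = sorted(seen, key=lambda o: worst[o], reverse=True)
        if len(ranked) >= k:
            threshold = worst[ranked[k - 1]]
            unseen_best = agg(last)
            others_best = max((best[o] for o in ranked[k:]), default=0.0)
            if threshold >= max(unseen_best, others_best):
                return ranked[:k]
    return ranked[:k]


# Hypothetical partial relevances for four objects and two attributes.
by_price = [("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.1)]
by_stars = [("b", 0.9), ("a", 0.7), ("d", 0.6), ("c", 0.2)]
print(nra_top_k([by_price, by_stars], k=2))
```

With this sample data the loop stops after two rounds of sorted access, so objects "c" and "d" are never read.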

  15. User Independent Querying
     • Text-based vector model
     • A document is defined as a vector of TF-IDF weights of the document terms
     • Weights are stored in a database index
     • Similarity of the query and the document collection is determined by the cosine measure
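
The vector model on this slide can be sketched in plain Python. The weighting used here (raw term frequency times log(N / document frequency)) is one common TF-IDF variant and is an assumption, as are the function names and the sample documents. The slides do not state the exact formula, and the database index is replaced by in-memory dictionaries.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (sparse vectors, idf table)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()}
               for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, docs):
    """Return document indices ordered by similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    query = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(query_tokens).items()}
    scores = [cosine(query, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Hypothetical, already tokenized documents.
DOCS = [["cheap", "hotel", "prague"],
        ["luxury", "hotel", "kosice"],
        ["cheap", "flight"]]
print(rank(["cheap", "hotel", "prague"], DOCS))
```

Because the ranking depends only on the query and the documents, the same query gives the same order for every user, which matches the "user independent" label of this slide.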

  16. Thank You for Your Attention. Questions?
