Contextualized Search with Directory-Based Clustering and Wordnet Knowledge
Explore a Semantically Enhanced Desktop Search using metadata and Wordnet extensions to enhance search accuracy and result expressiveness. Gain insights into the structure, purpose, prototype, and advantages of this innovative approach.
Presentation Transcript
Semantically Enhanced Desktop Search Using Directory-Based Clustering and Wordnet Knowledge Ştefania GHIŢĂ
Content • Project Overview • Google • Purpose • Structure • Photo Prototype • Offline Content Prototype • Conclusions Hannover
Project Overview • Background • Search • No personalization to user preferences • No context • Topic classification in DMOZ • Purpose • Contextualize / personalize search using additional metadata • Advantages • Precision of search • Expressiveness of search results
Google • A possible solution – indexing data on the PC (Google): • Increases search efficiency • Doesn’t use user-specific characteristics such as: • Folder hierarchies • Browser caches
Purpose • Finding new solutions for: • Increasing precision of search according to the user’s profile • Expressiveness of search results by adding information to the search • Ranking the search results • Metadata as the answer to these problems
Structure • How to characterize and obtain a user profile • Define metadata models for different types of information • Automatically generating such metadata • Enriching data by adding additional information: Wordnet • Extending additional information using file structure and user behaviour • Search engine that uses the metadata
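As an illustration of the last point, a search engine that uses the metadata could look roughly like the sketch below. The slides do not describe the actual ranking algorithm, so the `FileMetadata` record, the `score` function, and the two weights are assumptions made only for this example: a query that matches a folder or file name directly counts more than one that matches only through an added sense.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical per-file record: path terms plus added senses."""
    path: str
    location_terms: list
    senses: list = field(default_factory=list)


def score(query, meta, term_weight=2.0, sense_weight=1.0):
    """Score one file; the weights are illustrative assumptions."""
    q = query.lower()
    total = 0.0
    if q in (t.lower() for t in meta.location_terms):
        total += term_weight   # direct match on a folder or file name
    if q in (s.lower() for s in meta.senses):
        total += sense_weight  # match only via an enrichment term
    return total


def search(query, index):
    """Return paths of matching files, best score first, ties by path."""
    hits = [(score(query, m), m.path) for m in index]
    hits = [(s, p) for s, p in hits if s > 0]
    hits.sort(key=lambda sp: (-sp[0], sp[1]))
    return [p for _, p in hits]


index = [
    FileMetadata("a/vacation/x.jpg", ["vacation", "x"]),
    FileMetadata("b/Holidays/y.jpg", ["Holidays", "y"], senses=["vacation"]),
    FileMetadata("c/z.jpg", ["z"]),
]
print(search("vacation", index))
```

With this toy index, the file stored under `Holidays` is found for the query `vacation` only because of its added sense, which is the effect the enrichment step is meant to achieve.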
Photo prototype • /My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg
  <location_info>Holidays</location_info>
  …
  <location_info>building</location_info>
  <lastModified>date</lastModified>
  <sizeBytes>XX</sizeBytes>
  <resolution>0</resolution>
  <sizeX>(pixels)</sizeX>
  <sizeY>(pixels)</sizeY>
  <colorScheme>X</colorScheme>
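A minimal sketch of how the `location_info` entries on this slide could be derived from a file path. The function names are hypothetical, and skipping the top-level folder ("My Pictures") is an assumption based on the slide, whose entries start at "Holidays" and end with the file name without its extension.

```python
from pathlib import PurePosixPath
from xml.sax.saxutils import escape


def location_terms(path):
    """Folder names below the top-level folder, plus the file stem."""
    p = PurePosixPath(path)
    parts = [part for part in p.parts if part != "/"]
    # Assumption: the first folder ("My Pictures") is the collection
    # root and is not emitted, as on the slide.
    return parts[1:-1] + [p.stem]


def photo_metadata(path, size_bytes, last_modified):
    """Render the path terms and basic file attributes as XML lines."""
    lines = [
        f"<location_info>{escape(term)}</location_info>"
        for term in location_terms(path)
    ]
    lines.append(f"<lastModified>{escape(last_modified)}</lastModified>")
    lines.append(f"<sizeBytes>{size_bytes}</sizeBytes>")
    return "\n".join(lines)


example_path = "/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg"
print(photo_metadata(example_path, 1024, "date"))
```

The image-specific fields (`resolution`, `sizeX`, `sizeY`, `colorScheme`) would need an image library to read and are left out of this sketch.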
Enriching Data with Wordnet • Holidays/Germany/Hannover RDF • Add Wordnet extensions: • Synonyms • Holonyms (Germany is a part of …) • Meronyms (Germany has part …) • Hypernyms (Holiday is a kind of …) • Hyponyms (… is a kind of Holiday) • Troponyms
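The enrichment step can be sketched as a lookup from each path term to its related words. The lexicon below is a small hand-written stand-in, not real Wordnet data, and `normalize` and `enrich` are hypothetical helper names; the plural handling is deliberately naive.

```python
# Toy stand-in for Wordnet: hand-written entries for illustration only.
TOY_LEXICON = {
    "holiday": {"synonyms": ["vacation"], "hypernyms": ["leisure"]},
    "germany": {"synonyms": ["Deutschland"], "holonyms": ["Europe"]},
}


def normalize(term):
    """Lowercase and strip a trailing 's' (naive; not a real lemmatizer)."""
    t = term.lower()
    return t[:-1] if t.endswith("s") and len(t) > 3 else t


def enrich(terms, lexicon, relations=("synonyms", "hypernyms", "holonyms")):
    """Map each term to the related words the lexicon knows, per relation."""
    enriched = {}
    for term in terms:
        entry = lexicon.get(normalize(term), {})
        enriched[term] = {
            rel: list(entry[rel]) for rel in relations if rel in entry
        }
    return enriched


result = enrich(["Holidays", "Germany", "Hannover"], TOY_LEXICON)
for term, related in result.items():
    print(term, related)
```

Terms missing from the lexicon (here "Hannover") simply get no extensions. A real implementation would query Wordnet itself, for instance through NLTK's `nltk.corpus.wordnet` interface; that dependency is left out so the sketch stays self-contained.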
Example
<rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\cat.jpg">
  <j.0:location_info>C:\Stefi\</j.0:location_info>
  <j.0:location_info>C:\Stefi\L3S\</j.0:location_info>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\">
      <j.0:sense>beautiful</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info rdf:resource="file:\\C:\Stefi\L3S\beautiful\home\"/>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\">
      <j.0:sense>plant</j.0:sense>
      <j.0:sense>establish</j.0:sense>
      <j.0:sense>implant</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info>cat</j.0:location_info>
  <j.0:sense>cat</j.0:sense>
  <j.0:sense>kat</j.0:sense>
  <j.0:sense>guy</j.0:sense>
  <j.0:sense>cat-o'-nine-tails</j.0:sense>
  <j.0:sense>big_cat</j.0:sense>
  <j.0:sense>vomit</j.0:sense>
  <j.0:sense>Caterpillar</j.0:sense>
  <j.0:sense>computerized_tomography</j.0:sense>
  <j.0:lastModified>Tue Oct 26 17:36:44 CEST 2004</j.0:lastModified>
  <j.0:sizeBytes>291851</j.0:sizeBytes>
</rdf:Description>
</rdf:RDF>
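In this example each `location_info` entry corresponds to one folder level of the file's path, from the outermost folder inward. A minimal sketch of producing those cumulative folder prefixes follows; `folder_prefixes` is a hypothetical helper, and the trailing backslash is added only to match the values shown on the slide.

```python
from pathlib import PureWindowsPath


def folder_prefixes(file_path):
    """List every ancestor folder of a Windows path, outermost first."""
    p = PureWindowsPath(file_path)
    current = PureWindowsPath(p.anchor)  # drive root, e.g. C:\
    prefixes = []
    for part in p.parts[1:-1]:           # skip the anchor and the file name
        current = current / part
        prefixes.append(str(current) + "\\")
    return prefixes


for prefix in folder_prefixes(r"C:\Stefi\L3S\beautiful\home\plant\cat.jpg"):
    print(prefix)
```

Folders whose names carry senses ("beautiful", "plant") would then be expanded into nested descriptions, while the others stay as plain values or resource references.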
Offline Content Prototype • Additional information for the user’s profile • Browsing behaviour • Relevant results • Additional context for results • Structure: • ID of the page • Date of access • Link from which the user came • Links accessed on the page • Other annotations of the content
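The structure listed on this slide maps naturally onto a small record type. The sketch below is one possible shape, with hypothetical field names; `referrer_chain` is an illustrative helper showing how the "link from which the user came" could be followed backwards to reconstruct browsing context.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class PageVisit:
    """One visited page, with the fields listed on the slide."""
    page_id: str                                          # ID of the page
    accessed_at: datetime                                 # date of access
    referrer: Optional[str] = None                        # page the user came from
    links_followed: list = field(default_factory=list)    # links accessed on the page
    annotations: dict = field(default_factory=dict)       # other annotations


def referrer_chain(visits, page_id):
    """Follow referrers back from page_id; stops at unknown pages or loops."""
    chain, seen = [], set()
    current = page_id
    while current in visits and current not in seen:
        seen.add(current)
        chain.append(current)
        current = visits[current].referrer
    return chain


now = datetime(2004, 10, 26, 17, 36, 44)
visits = {
    "a": PageVisit("a", now, links_followed=["b"]),
    "b": PageVisit("b", now, referrer="a", links_followed=["c"]),
    "c": PageVisit("c", now, referrer="b"),
}
print(referrer_chain(visits, "c"))
```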
Conclusion • Metadata models for contextualized search for different types of files • Tools for automatically generating metadata • Tools for enriching metadata • Search engine and algorithms that use the metadata