Using Ontological Relationships to Provide Indexing of Plain Text Searches

Research by Fletcher Liverance (fletcher.liverance@gmail.com), November 14th, 2011.


Presentation Transcript


  1. Using Ontological Relationships to Provide Indexing of Plain Text Searches Research by Fletcher Liverance fletcher.liverance@gmail.com November 14th, 2011

  2. How Does a Search Engine Work? 1. User submits a keyword-based query to the search engine 2. The indexer locates all relevant pages containing those keywords 3. The database returns all pages found in the index 4. Pages are ranked and returned to the user
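
The four steps on this slide can be sketched with a toy inverted index. This is a minimal illustration, not the search engine the talk describes: the page ids, the page text, and the ranking rule (count of matching keywords) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus; page ids and text are illustrative only.
PAGES = {
    "page1": "winnie the pooh is a yellow bear",
    "page2": "piglet is a small pig",
    "page3": "a bear in a red shirt is a friend of a pig",
}

def build_index(pages):
    """Map each keyword to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return page ids ranked by how many query keywords they contain."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for page_id in index.get(word, ()):
            scores[page_id] += 1
    # Highest score first; ties broken by page id for stable output.
    return sorted(scores, key=lambda p: (-scores[p], p))

INDEX = build_index(PAGES)
print(search(INDEX, "yellow bear red shirt"))
```

A page that shares no keyword with the query is never returned, however relevant it is, which is the pattern-matching limitation the next slides discuss.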

  3. How Does a Search Engine Work? Benefits: • Fast • Machine learnable • Straightforward Drawbacks: • Pattern matching • Keyword based • Garbage in, garbage out

  4. Garbage in, Garbage out Scenario: You saw this television series and you’d like to find out more about it, but you don’t know the name of the series or of any of its characters. What do you do? Image: http://www.dan-dare.org/FreeFun/Images/CartoonsMoviesTV/WinnieThePoohWallpaper1024.jpg

  5. Garbage in, Garbage out POOR RESULTS!

  6. Garbage in, Garbage out GOOD RESULTS!

  7. Semantic Relationships • Ontology: “An ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents.” http://www-ksl.stanford.edu/kst/what-is-an-ontology.html • Resource Description Framework (RDF): “RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link. Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.” http://www.w3.org/RDF/ [Diagram: graph of semantic relationships around Winnie the Pooh. Nodes: Disney, Bear, Piglet, Shirt, Yellow, Pig, Red. Link labels: isMadeBy, isA, hasFriend, hasClothing, hasColor.]
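
The node and link labels on this slide can be written down as subject-predicate-object triples, the basic shape of RDF data. The sketch below uses plain Python tuples rather than a real RDF library, and the pairing of nodes with links is one plausible reading of the flattened diagram, so treat the triples as an assumption. `match` is a hypothetical helper written for this example.

```python
# Triples are one plausible reading of the slide's diagram (an assumption).
TRIPLES = [
    ("Winnie the Pooh", "isA", "Bear"),
    ("Winnie the Pooh", "isMadeBy", "Disney"),
    ("Winnie the Pooh", "hasFriend", "Piglet"),
    ("Winnie the Pooh", "hasClothing", "Shirt"),
    ("Winnie the Pooh", "hasColor", "Yellow"),
    ("Piglet", "isA", "Pig"),
    ("Shirt", "hasColor", "Red"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

print(match(TRIPLES, predicate="hasColor"))
print(match(TRIPLES, subject="Piglet"))
```

Querying by pattern rather than by keyword is what lets a description such as "yellow bear with a red shirt" lead back to a named character.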

  8. Semantic Relationships How can we locate useful semantic relationships? • Link Distance • Link Direction • Link Relationship [Diagram: extended graph of semantic relationships around Winnie the Pooh. Nodes: Winnie the Pooh, Bear, Mammal, Brown, Disney, Company, Piglet, Pig, Shirt, Red, Yellow, 0xFFFF00. Link labels: isA, isMadeBy, hasFriend, hasClothing, hasColor, hasRGB.]
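
Link distance, direction, and relationship can all be collected in one breadth-first walk over the graph. The sketch below is an assumption-laden illustration: the triples are one plausible reading of the slide's diagram, and `neighbors` and `related` are hypothetical helpers, not functions from the research.

```python
from collections import deque

# Triples are one plausible reading of the slide's diagram (an assumption).
TRIPLES = [
    ("Winnie the Pooh", "isA", "Bear"),
    ("Winnie the Pooh", "isMadeBy", "Disney"),
    ("Winnie the Pooh", "hasFriend", "Piglet"),
    ("Winnie the Pooh", "hasClothing", "Shirt"),
    ("Winnie the Pooh", "hasColor", "Yellow"),
    ("Bear", "isA", "Mammal"),
    ("Bear", "hasColor", "Brown"),
    ("Disney", "isA", "Company"),
    ("Piglet", "isA", "Pig"),
    ("Shirt", "hasColor", "Red"),
    ("Yellow", "hasRGB", "0xFFFF00"),
]

def neighbors(triples, node):
    """Yield (other_node, relation, direction) for every link touching node."""
    for s, p, o in triples:
        if s == node:
            yield o, p, "out"
        if o == node:
            yield s, p, "in"

def related(triples, start, max_distance):
    """Breadth-first walk; maps each reached node to
    (distance, relation of the last link, direction of the last link)."""
    found = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_distance:
            continue  # do not expand beyond the distance limit
        for other, relation, direction in neighbors(triples, node):
            if other not in seen:
                seen.add(other)
                found[other] = (dist + 1, relation, direction)
                queue.append((other, dist + 1))
    return found

for node, info in related(TRIPLES, "Shirt", 2).items():
    print(node, info)
```

Starting from "Shirt", the walk reaches "Winnie the Pooh" at distance 1 by following a `hasClothing` link backwards, which shows why direction matters: a search that only followed outgoing links would never get from the shirt to its wearer.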

  9. Modified Search Indexing 1. User submits a keyword-based query to the search engine 2. Search analyzer creates additional searches based on ontological information 3. Search engine performs parallel searches of top search terms 4. Searches are ranked and returned to the user as additional search suggestions
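
The modified pipeline on this slide can be sketched as query expansion followed by parallel searches. Everything concrete here is an assumption for illustration: the `RELATED_TERMS` table stands in for real ontology lookups, `DOCUMENTS` is a made-up corpus, and ranking suggestions by hit count is a simple placeholder for whatever ranking the research uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical related-term table standing in for ontology lookups.
RELATED_TERMS = {
    "bear": ["winnie the pooh", "mammal"],
    "pig": ["piglet"],
}

# Hypothetical plain-text corpus.
DOCUMENTS = [
    "winnie the pooh wears a red shirt",
    "piglet is the best friend of winnie the pooh",
    "a mammal is a warm-blooded animal",
]

def expand(query):
    """Step 2: build additional searches by swapping in related terms."""
    words = query.lower().split()
    searches = []
    for i, word in enumerate(words):
        for term in RELATED_TERMS.get(word, []):
            searches.append(" ".join(words[:i] + [term] + words[i + 1:]))
    return searches

def hit_count(search):
    """Count documents that contain every word of the search."""
    words = search.split()
    return sum(all(w in doc.split() for w in words) for doc in DOCUMENTS)

def suggest(query):
    """Steps 3 and 4: run the searches in parallel, then rank by hit count."""
    searches = expand(query)
    with ThreadPoolExecutor() as pool:
        counts = list(pool.map(hit_count, searches))
    ranked = sorted(zip(searches, counts), key=lambda pair: (-pair[1], pair[0]))
    # Suggestions that match nothing are dropped.
    return [search for search, count in ranked if count > 0]

print(suggest("bear shirt"))
```

The original query "bear shirt" matches no document in this corpus, yet the expanded search "winnie the pooh shirt" does, so it is returned as a suggestion.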

  10. Current Work • NASA SWEET Ontologies: 6000 concepts; 200 ontologies; scientific; loose relationships • National Oceanic and Atmospheric Administration: 30+ years of scientific research; text based; unsorted; 2+ gigabytes; domain-specific terminology

  11. Challenges & Future Work • How to rank plain text: no links or history; no ‘page views’ • Limited ontology coverage: 6000 concepts in NASA SWEET ontologies; ~170,000 words in the English language; many more unique names and scientific terms; how can ontologies be automatically generated? • Graph matching: identifying related terms in a large graph is difficult; multiple links per node, must identify appropriate links

  12. Q & A
