
“Find What I Mean, Not What I Say”

“Find What I Mean, Not What I Say”. Mike Moran, IBM Distinguished Engineer, November 2007. Why do companies use search? How does IBM OmniFind meet those needs? OmniFind Enterprise Edition. OmniFind Yahoo! Edition. Scalable and Secure Enterprise Search for Corporate Intranets.


Presentation Transcript


  1. “Find What I Mean, Not What I Say”. Mike Moran, IBM Distinguished Engineer. November 2007.

  2. Why do companies use search?

  3. How does IBM OmniFind meet those needs?
     • OmniFind Enterprise Edition: Scalable and Secure Enterprise Search for Corporate Intranets
     • OmniFind Yahoo! Edition: Basic, No-Charge Search
     • OmniFind Discovery Edition: Search for Self-Service and eCommerce
     • Insight Solutions with OmniFind: Content Analytics

  4. Why is search so difficult?
     • It is harder to think of words than to make choices
     • Choosing the same words as the author is not easy
     • Words are ambiguous
     Result count shown on the slide: “1 to 10 of 10 zillion”

  5. The classic search model
     • Task: “I need to tell Pat.”
     • Information Need: “How do I contact Pat?”
     • Verbal form: “What’s Pat’s phone number?”
     • Query: “Pat phone”
     • Search Engine
     What can go wrong along the way: Misconception, Mistranslation, Misformulation, Ambiguity

  6. Sometimes your word is used too often
     Searching for “neon” finds signs and cars

  7. Sometimes your word isn’t used at all
     Searching for “Pat phone” finds nothing

  8. Analytics bridge unstructured and structured data
     Unstructured Information (Text, Chat, Email, Audio, Video)
     • High-value
     • Most current
     • Fastest growing
     ...BUT...
     • Buried in huge volumes (noise)
     • Implicit semantics
     • Inefficient search
     Text Analysis
     Structured Information (Indices, DBs, KBs)
     • Explicit semantics
     • Efficient search
     • Focused content
     ...BUT...
     • Slow growing
     • Narrow coverage
     • Less current/relevant

  9. Find what I mean, not what I say
     SEARCH: Going rate for leasing a billboard near Triborough Bridge
     Query annotations: Rate, Rate for, Billboard, Located in, Bronx
     No keywords in common, but a good answer:
     “…We were offered $250,000/year in 2001 for an outdoor sign in Hunts Point overlooking the Bruckner expressway. …”
     Answer annotations: Rate, Rate for, Billboard, Located in, Bronx

  10. Without semantic search, it’s not a pretty picture
     SEARCH: Going rate for leasing a billboard near Triborough Bridge
     Query annotations: Rate, Rate for, Billboard, Located in, Bronx
     Common keywords, bad semantic match:
     “…Simon and Garfunkel's "The 59th Street Bridge Song" was rated highly by the Billboard magazine in the 60's…”
     Passage annotations: Song Title, Queens, Magazine
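The contrast between slides 9 and 10 can be checked with a small keyword-overlap sketch. This is only an illustration of plain keyword matching, not of how OmniFind scores results; the stopword list and the `keyword_overlap` function are assumptions made up for this example.

```python
import re

# Assumed, deliberately tiny stopword list (not from the presentation).
STOPWORDS = {"a", "an", "the", "for", "in", "by", "and", "was", "were", "we"}

QUERY = "Going rate for leasing a billboard near Triborough Bridge"
GOOD_ANSWER = (
    "We were offered $250,000/year in 2001 for an outdoor sign "
    "in Hunts Point overlooking the Bruckner expressway."
)
BAD_ANSWER = (
    "Simon and Garfunkel's \"The 59th Street Bridge Song\" was rated "
    "highly by the Billboard magazine in the 60's"
)

def keywords(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def keyword_overlap(query, passage):
    """Count the distinct keywords shared by the query and the passage."""
    return len(keywords(query) & keywords(passage))

print(keyword_overlap(QUERY, GOOD_ANSWER))  # 0: the good answer shares no keywords
print(keyword_overlap(QUERY, BAD_ANSWER))   # 2: "bridge" and "billboard" match
```

A ranker based only on this count would put the song passage first, which is the failure the slide describes.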

  11. News example
     • Search: “Bush trip to Middle East”
     Sentence: President Bush visits shrine in Israel
     Syntactic Annotator: NP, VP, PP
     Named Entity Annotator: Title, Person, Gov Official, Country
     Relationship Annotator: Located At (Arg1:Entity, Arg2:Location)

  12. Financial services example
     • Search: “Fred Center’s title”
     • Search: “head of Center Micros”
     Sentence: Fred Center is the CEO of Center Micros
     Parser: NP, VP, PP
     Named Entity: Person, Organization
     Relationship: CeoOf (Arg1:Person, Arg2:Org)

  13. Law enforcement example
     • Search: Neon car
     • Search: “Higgins’ car”
     Sentence: A Neon was driven by Higgins Timothy
     Syntactic Annotator: NP, VP, PP
     Named Entity Annotator: Car, Person
     Relationship Annotator: Driven By (Arg1:Car, Arg2:Person)
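The law enforcement example can be sketched roughly as follows. This is a toy illustration, not UIMA or OmniFind code: the entity sets stand in for a named entity annotator, a single regular expression stands in for the syntactic and relationship annotators, and the sentence is shortened to “A Neon was driven by Higgins”. All names here are hypothetical.

```python
import re

# Hypothetical entity dictionaries standing in for a named entity annotator.
CARS = {"Neon"}
PEOPLE = {"Higgins"}

# A single pattern standing in for the syntactic and relationship annotators.
DRIVEN_BY = re.compile(r"(\w+) was driven by (\w+)")

def extract_relations(sentence):
    """Return (relation, arg1, arg2) triples for 'Car was driven by Person'."""
    relations = []
    for match in DRIVEN_BY.finditer(sentence):
        car, person = match.group(1), match.group(2)
        # Keep the relation only if both arguments have the expected types.
        if car in CARS and person in PEOPLE:
            relations.append(("DrivenBy", car, person))
    return relations

def cars_driven_by(person, relations):
    """Answer a query like "Higgins' car" from the extracted relations."""
    return [arg1 for name, arg1, arg2 in relations
            if name == "DrivenBy" and arg2 == person]

relations = extract_relations("A Neon was driven by Higgins")
print(relations)
print(cars_driven_by("Higgins", relations))
```

Because the relation records that Neon is a Car, the same index entry can serve both “Neon car” and “Higgins’ car” without matching neon signs.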

  14. How does semantic search find a phone number?

  15. When you search for “IBM phone number”
     Expanded Query (with Synonyms): @xmlf2::‘ibm <.or>phone <#phonenumber/> "phone nbr" "telephone nbr" "telephone number" </.or> <.or>number <#phonenumber/> "phone nbr" "telephone nbr" "telephone number" </.or>'
     Results
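The synonym part of the expanded query above can be approximated with a short sketch. The synonym table, the function names, and the AND/OR output syntax are assumptions for illustration; the sketch does not produce the `@xmlf2` syntax shown on the slide, and it leaves out the `<#phonenumber/>` term, which appears to match text that an annotator has tagged as a phone number.

```python
# Hypothetical synonym table modelled on the phrases in the expanded query.
SYNONYMS = {
    "phone": ["phone nbr", "telephone nbr", "telephone number"],
    "number": ["phone nbr", "telephone nbr", "telephone number"],
}

def expand_query(query):
    """Return one list of alternatives per query term (term first, then synonyms)."""
    return [[term] + SYNONYMS.get(term, []) for term in query.lower().split()]

def render(groups):
    """Render the groups in a generic boolean syntax, quoting multi-word phrases."""
    parts = []
    for alternatives in groups:
        if len(alternatives) == 1:
            parts.append(alternatives[0])
        else:
            quoted = [f'"{a}"' if " " in a else a for a in alternatives]
            parts.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(parts)

print(render(expand_query("IBM phone number")))
```

Each original term stays in its OR-group, so documents that do use the searcher's exact words still match.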

  16. Customers need a platform, not just samples
     • To create domain-specific knowledge, create a new annotator or modify one already shipped
     • Or configure any regular expression with no coding
     • And it needs to work in many natural languages
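A regular-expression annotator of the kind mentioned above might look like this minimal sketch. It assumes rules are supplied as a plain mapping from annotation type to pattern; the `Annotation` class, the `annotate` function, and the phone pattern are hypothetical and are not the OmniFind configuration format.

```python
import re
from dataclasses import dataclass

@dataclass
class Annotation:
    type: str   # annotation type name, e.g. "phonenumber"
    begin: int  # start offset in the text
    end: int    # end offset in the text (exclusive)
    text: str   # the covered text

# Assumed rule set: only the patterns change, not the code.
RULES = {
    "phonenumber": r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b",
}

def annotate(text, rules):
    """Apply every configured pattern and record where each match occurs."""
    annotations = []
    for type_name, pattern in rules.items():
        for match in re.finditer(pattern, text):
            annotations.append(
                Annotation(type_name, match.start(), match.end(), match.group())
            )
    return annotations

for annotation in annotate("Call Pat at 914-555-0100 today.", RULES):
    print(annotation)
```

Adding a new annotation type means adding one entry to `RULES`, which is the sense in which a pattern can be configured without writing annotator code.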

  17. Customers need an open, extensible framework
     • Text analysis is a complex, multi-step process
     • No one vendor can satisfy every need you’ll have in text analysis
     • That’s why you need an open framework
     Diagram labels: Text; OmniFind Enterprise Edition; UIMA; Parse Words; Identify Language; Categorize; Search Index; Annotate

  18. UIMA is an open standard framework
     • IBM has submitted the Unstructured Information Management Architecture (UIMA) specification to the Organization for the Advancement of Structured Information Standards (OASIS)
     • The UIMA source code has been contributed to the Apache Software Foundation, and an Apache Incubator project has been established to foster collaborative, consensus-based development of new software based on UIMA

  19. Support for UIMA and OmniFind
     • Provide applications that leverage text analysis and enhanced search
     • Deliver content to platform for analysis
     • Provide components that perform text analysis

  20. Read all about it
     The search marketing best seller
     • “Buy this book, read it, and then read it again.” --Chris Sherman, Search Engine Watch
     • “Indispensable guide” --Kirkus Reports
     • Updated every printing
     Internet Marketing
     • “Act now and read it” --Bryan Eisenberg
     • “Great book” --Robert Scoble
     • “Bravo” --Search Engine Watch
     For more information about the books, and for the free Biznology newsletter and blog: www.mikemoran.com
