
Effective Keyword Search in Relational Databases

Fang Liu (University of Illinois at Chicago), Clement Yu (University of Illinois at Chicago), Weiyi Meng (Binghamton University), Abdur Chowdhury (America Online, Inc.)


Presentation Transcript


  1. Effective Keyword Search in Relational Databases • Fang Liu (University of Illinois at Chicago) • Clement Yu (University of Illinois at Chicago) • Weiyi Meng (Binghamton University) • Abdur Chowdhury (America Online, Inc.)

  2. Effective Keyword Search in Relational Databases • Introduction • IR ranking in text databases • Our ranking strategy in RDBs • Experiments • Conclusions and future work SIGMOD 2006: Effective Keyword Search in Relational Databases

  3. Introduction: Why keyword search in relational databases? • We want to search text data in relational databases • SQL with the “contains” operator is not suitable for non-expert users • Keyword search is tremendously successful in text databases because it ranks documents by similarity, and it is usable by non-expert users

  4. Introduction • Text data in relational databases

  5. Introduction • Suppose a user is looking for albums titled “off the wall”

  6. Introduction • Keyword search is very successful in text databases because it ranks documents by similarity; Google, Yahoo and MSN Search are examples. • So, let’s do keyword search in relational databases! (DBXplorer, BANKS, DISCOVER & IR-style DISCOVER, ObjectRank, Ranking Objects)

  7. Introduction • Let’s do it, but how? • What are answers to be ranked? • How should we rank these answers?

  8. Introduction: an answer • An answer for a given query Q is a tuple tree in which every leaf node contains at least one keyword in Q.
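
The leaf-node condition on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the tuple tree is represented as a dictionary of tuple ids to their text plus a list of undirected join edges, keywords are matched by whitespace tokenization, and the sample tuples (`artist1`, `link1`, `album1`) are hypothetical. The function names `leaf_nodes` and `is_answer` are invented for this sketch, and the code does not verify that the edges actually form a tree.

```python
from collections import defaultdict


def leaf_nodes(tuples, edges):
    """Return ids of tuples with at most one neighbour in the undirected tree."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return [t for t in tuples if degree[t] <= 1]


def is_answer(tuples, edges, query):
    """True if every leaf tuple contains at least one keyword of the query."""
    keywords = set(query.lower().split())
    return all(
        keywords & set(tuples[t].lower().split())
        for t in leaf_nodes(tuples, edges)
    )


# Hypothetical data: an artist tuple joined to an album tuple via a link tuple.
tuples = {
    "artist1": "michael jackson",
    "link1": "",  # connecting tuple with no text of its own
    "album1": "off the wall",
}
edges = [("artist1", "link1"), ("link1", "album1")]

print(is_answer(tuples, edges, "jackson wall"))
print(is_answer(tuples, edges, "wall"))
```

Note that the interior tuple `link1` may contain no keyword at all; only the leaves are constrained, which is what lets an answer join matching tuples through non-matching ones.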

  9. Introduction • Use a slightly modified algorithm [DISCOVER] to produce all answers for a given query.

  10. Introduction: Ranking • Our focus is on the effectiveness problem of ranking answers: the more relevant an answer is to the user query, the higher it should be ranked.

  11. Introduction: Contributions • We identify four new factors that are critical to effective ranking and propose a new ranking strategy • We design and conduct comprehensive experiments on the effectiveness problem • Experimental results show that our strategy is significantly more effective than existing work

  12. Effective Keyword Search in Relational Databases • Introduction • IR ranking in text databases • Our ranking strategy in RDBs • Experiments • Conclusions and future work

  13. IR Ranking • Q=(k1, k2, ..., kn), D is a document, Sim(Q,D) is the ranking score of D. • Example values: tf=2, ntf=1.53; tf=10, ntf=2.2 • half: idf=0.69; 1/100: idf=4.6; 1/200,000: idf=12 • s=0.2: 1: ndl=1; half: ndl=0.9; 1/10: ndl=0.8; 2: ndl=1.2; 10: ndl=2.8; 3.3
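
The formula image for this slide did not survive in the transcript, but the example values (tf/ntf, idf, ndl with s=0.2) are numerically consistent with the standard pivoted-normalization weighting from IR: ntf = 1 + ln(1 + ln(tf)), idf as the natural log of the inverse fraction of documents containing the term, and ndl = (1 - s) + s * dl/avgdl. The sketch below assumes exactly those forms and reads "half", "1/100", and so on as ratios. The function names and the `weight` combination (ntf * idf / ndl) are assumptions for illustration, not text from the slide.

```python
import math


def ntf(tf):
    """Normalized term frequency, assumed form: 1 + ln(1 + ln(tf)), tf >= 1."""
    return 1 + math.log(1 + math.log(tf))


def idf(fraction_with_term):
    """Assumed idf: ln of the inverse fraction of documents containing the term."""
    return math.log(1 / fraction_with_term)


def ndl(length_ratio, s=0.2):
    """Assumed normalized document length; length_ratio is dl / avgdl."""
    return (1 - s) + s * length_ratio


def weight(tf, fraction_with_term, length_ratio, s=0.2):
    """Assumed per-term weight of a document: ntf * idf / ndl."""
    return ntf(tf) * idf(fraction_with_term) / ndl(length_ratio, s)


# Recompute the example values listed on the slide.
for tf in (2, 10):
    print(f"tf={tf}: ntf={ntf(tf):.2f}")
for fraction in (1 / 2, 1 / 100, 1 / 200_000):
    print(f"fraction={fraction:g}: idf={idf(fraction):.2f}")
for ratio in (1, 0.5, 0.1, 2, 10):
    print(f"dl/avgdl={ratio}: ndl={ndl(ratio):.2f}")
```

Under these assumptions the dampening is visible directly: raising tf from 2 to 10 raises ntf only from about 1.5 to about 2.2, while a document ten times the average length is penalized by a factor of 2.8 rather than 10.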

  14. Effective Keyword Search in Relational Databases • Introduction • IR ranking in text databases • Our ranking strategy in RDBs • Experiments • Conclusions and future work

  15. Our Ranking Strategy • T=(D1, D2, ..., Dn), so Sim(Q,D) → Sim(Q,T)

  16. Our Ranking Strategy • T=(D1, D2, ..., Dn), so Sim(Q,D) → Sim(Q,T)

  17. Our Ranking Strategy • Tuple Tree Size Normalization • # of tuples in a tuple tree T

  18. Our Ranking Strategy • Document Length Normalization Reconsidered • Document length of Di • Average document length of the text column of Di

  19. Our Ranking Strategy • Document Frequency Normalization

  20. Our Ranking Strategy • T=(D1, D2, ..., Dn) • maxWgt is the maximum weight(k, Di) • sumWgt is the sum of weight(k, Di)
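
As a small illustration of the two quantities defined on this slide, the sketch below computes maxWgt and sumWgt for one keyword k over the tuples D1..Dn of a tuple tree T. The transcript does not preserve how the two are combined, so the `combine` function is purely an assumed log-dampened form chosen for illustration. It should not be read as the formula from the slide, and the sample weights are hypothetical.

```python
import math


def max_and_sum_weight(weights):
    """weights: weight(k, Di) for one keyword k across the tuples of T."""
    return max(weights), sum(weights)


def combine(max_wgt, sum_wgt):
    """Assumed combination: start from max_wgt and add a dampened bonus
    for the extra weight contributed by the other tuples."""
    if max_wgt <= 0:
        return 0.0
    return max_wgt * (1 + math.log(1 + math.log(sum_wgt / max_wgt)))


# Hypothetical weights of keyword k in three tuples of a tuple tree.
weights = [2.0, 1.0, 1.0]
max_wgt, sum_wgt = max_and_sum_weight(weights)
print(max_wgt, sum_wgt, combine(max_wgt, sum_wgt))
```

With non-negative weights, this assumed form returns exactly maxWgt when only one tuple contains the keyword and grows slowly as further tuples contribute, so a keyword repeated across many tuples cannot dominate the score.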

  21. Our Ranking Strategy • T=(D1, D2, ..., Dn), so Sim(Q,D) → Sim(Q,T)

  22. Our Ranking Strategy • Schema Terms in Query (e.g., “lyrics for How come by D12”, “usher the singer's lyrics to burn”) • Phrase-based Ranking: using position information to boost phrase matching • Concept-based Ranking: can improve effectiveness; can assign semantics to answers

  23. Effective Keyword Search in Relational Databases • Introduction • IR ranking in text databases • Our ranking strategy in RDBs • Experiments • Conclusions and future work

  24. Experiments – data set • A Lyrics Database • 50 Queries from an AOL query log • Relevance Judgment: pooling + logs

  25. Experiments: some queries • to me lyrics by lionel richie • inner smile texas lyrics • lionel richie lyrics • lionel richie lyrics you mean more to me • avril lavigne lyrics for the album under this skin • avril lavigne lyrics

  26. Experiments – measure • Reciprocal rank: measures how well the system returns the first relevant answer; it is the reciprocal of the rank of that answer. • MAP (mean average precision): a precision value is computed after each relevant answer is retrieved, and these values are averaged to give a single number per query; averaging over all queries measures overall effectiveness.
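
Both measures on this slide are easy to compute from a ranked list of relevance judgments. The sketch below is a minimal illustration, assuming each query's result list is given as booleans in rank order (True means relevant). Following the slide's wording, average precision here averages over the relevant answers that actually appear in the list; some evaluation tools divide by the total number of known relevant answers instead. The function names and sample rankings are hypothetical.

```python
def reciprocal_rank(relevance):
    """1 / rank of the first relevant answer, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of the precision values taken at each relevant answer."""
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical judged rankings for two queries.
runs = [
    [False, True, False, True],
    [True, False, True],
]
print("MRR:", mean([reciprocal_rank(r) for r in runs]))
print("MAP:", mean([average_precision(r) for r in runs]))
```

Reciprocal rank rewards only the position of the first relevant answer, while average precision also reflects how early the remaining relevant answers appear.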

  27. Experiments – results • Our ranking strategy: the four new factors.

  28. Experiments – results • Comparison with related works

  29. Effective Keyword Search in Relational Databases • Introduction • IR ranking in text databases • Our ranking strategy in RDBs • Experiments • Conclusions and future work

  30. Conclusions • Effectiveness is as important as efficiency • The four new factors are critical to search effectiveness • Our strategy is significantly more effective than related work

  31. Future Work • Utilize link analysis • Combine non-text columns • Efficiency problem • More real-world data sets

  32. Questions?
