
Form-Based Proxy Caching for Database-Backed Web Sites



Presentation Transcript


  1. Form-Based Proxy Caching for Database-Backed Web Sites Qiong Luo Joint work with Jeffrey F. Naughton University of Wisconsin-Madison

  2. Web Caching for Databases • Goal • Proxy caching for db-backed web sites • Gap • RDBMS answers SQL queries • Web caching proxies cache web pages • Our proposal: a new caching proxy • Query result caching, plus • Query processing Qiong Luo @ VLDB 2001

  3. Outline • Introduction • HTML Forms and Query Templates • Form-Based Active Caching • Experiments • Conclusions

  4. Database-Backed Web Sites [Diagram: Browser (HTML Forms) connected over HTTP to the Web/Application Server and its Application, backed by the DB Server and Database; request steps numbered (1) to (4).] • Forms allow user input • Queries go through multiple tiers • DB Server is often the bottleneck

  5. Caching Proxies [Diagram: a Caching Proxy on the HTTP path between the Browser (HTML Forms) and the Web/Application Server, Application, DB Server, and Database; steps numbered (1) to (5). Callout at the proxy: "Today’s proxies do URL-matching; we want to add query processing here!"]

  6. Why Proxy Caching? • Flexibility and easy deployment • Don’t change servers or clients • Server workload sharing • On a hit, save all steps on the server • Response time improvement • By sharing server workload, • By bringing content closer to users, or • By both

  7. Prior Research: Web Caching [Slide annotation: "Exact Match"] • IBM’s Olympics web site [CID99] • Dynamic Content Caching Protocol [SAYZ99] • Web view materialization [LR00] • Caching search engine results [Mar00] • CachePortal project at NEC [CLL+01] • Active Cache Protocol [CZB98]

  8. Our Focus • Feasibility • How can proxies cache HTML form queries? • Different server collaboration levels • Efficiency • Which caching schemes are more efficient? • Active caching vs. passive caching • Flexibility • How can we still keep things simple and easy? • Declarative specification of form semantics

  9. An HTML Form Query • At a browser: the Search Request Page, with a "Search by:" field and the input "Java Programming" • At the web site db: SELECT TOP 50 i_title, i_id, a_fname, a_lname FROM item, author WHERE a_id = i_a_id AND (i_title LIKE '%Java Programming%') ORDER BY i_title • HTML form queries have database semantics.

  10. Another HTML Form Query • At a browser: the Search Request Page, with a "Search by:" field and the input "Network Programming" • At the web site db: SELECT TOP 50 i_title, i_id, a_fname, a_lname FROM item, author WHERE a_id = i_a_id AND (i_title LIKE '%Network Programming%') ORDER BY i_title • Queries from the same HTML form have a common structure!

  11. Proxy Side Query Template • Form: the Search Request Page (tpcwSearchForm.html), with a "Search by:" field • Query template @ proxy: SELECT TOP 50 i_title, i_id, a_fname, a_lname FROM item, author WHERE a_id = i_a_id AND (i_title LIKE $search_string) ORDER BY i_title • Form = Parameterized Queries (Language Independent)
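The template idea on slide 11 can be illustrated with a minimal Python sketch. Everything here beyond the SQL text and the form name is an assumption made for illustration: the `QueryTemplate` class, the `instantiate` helper, the sample rows, and the use of SQLite (which has no `TOP n`, so the top-50 cut is written as `LIMIT 50`). It is not the authors' implementation.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """A form's query with one named parameter (hypothetical structure)."""
    form_name: str
    sql: str
    param: str


# Template for the TPC-W title search form. SQLite has no TOP n, so the
# top-50 cut is written as LIMIT 50; that substitution is an assumption.
TPCW_SEARCH = QueryTemplate(
    form_name="tpcwSearchForm.html",
    sql=(
        "SELECT i_title, i_id, a_fname, a_lname "
        "FROM item, author "
        "WHERE a_id = i_a_id AND i_title LIKE :search_string "
        "ORDER BY i_title LIMIT 50"
    ),
    param="search_string",
)


def instantiate(template, form_fields):
    """Bind the submitted form value to the template's parameter."""
    value = form_fields[template.param]
    return template.sql, {template.param: f"%{value}%"}


def demo_titles(search_string):
    """Run the instantiated query against a tiny made-up in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE author (a_id INTEGER PRIMARY KEY, a_fname TEXT, a_lname TEXT);
        CREATE TABLE item (i_id INTEGER PRIMARY KEY, i_title TEXT, i_a_id INTEGER);
        INSERT INTO author VALUES (1, 'A', 'Author');
        INSERT INTO item VALUES (10, 'Advanced Java Programming', 1);
        INSERT INTO item VALUES (11, 'Unix Network Programming', 1);
        """
    )
    sql, params = instantiate(TPCW_SEARCH, {"search_string": search_string})
    titles = [row[0] for row in db.execute(sql, params)]
    db.close()
    return titles
```

Because the form value is bound as a parameter rather than spliced into the SQL text, two submissions of the same form differ only in the bound value, which is what lets a proxy group them under one template.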

  12. Class of Queries Handled • SELECT TOP n selection_list FROM target_relations WHERE search_predicate(search_field, $search_string) AND other_predicates ORDER BY orderby_fields • SPJ queries with • a parameterized search predicate, • an order-by clause, and • a top-n operation. • selection_list ⊇ search_field ∪ orderby_fields

  13. Challenges of Form Queries • Unordered domain of search predicates • Very different from range predicates • Top-n operation • Answering subsumed queries needs care. • Remainder queries to the server [Diagram: overlapping queries Q1 and Q2; the remainder query Q2 − Q1 is defined by the remainder predicate.]
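To make the remainder query Q2 − Q1 concrete, here is a small Python sketch that builds a remainder predicate for substring (`LIKE '%...%'`) searches. The function name `remainder_predicate` and the `?` placeholder style are assumptions for illustration; the slides do not show how the remainder predicate is actually generated.

```python
def remainder_predicate(field, new_string, cached_strings):
    """Build a WHERE fragment for Q2 - Q1 over substring searches.

    Returns (sql_fragment, params): tuples that match new_string but match
    none of the cached search strings. This is only safe when the cache
    holds the WHOLE result set of each cached query, not just its top-n.
    """
    clauses = [f"{field} LIKE ?"]
    params = [f"%{new_string}%"]
    for cached in cached_strings:
        # Tuples matching a cached query are already at the proxy,
        # so the server does not need to send them again.
        clauses.append(f"NOT ({field} LIKE ?)")
        params.append(f"%{cached}%")
    return " AND ".join(clauses), params
```

For example, with "Java" already cached, a new search for "Programming" only needs titles that contain "Programming" but not "Java" from the server; the overlap is selected from the cached tuples.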

  14. Form-Based Active Caching • Keep queries from the same form together • Cache the whole result set, not only the top-N • Process a new query at the cache • Exact matches: trivial • Subsumed queries: selections at the cache • Otherwise: for a collaborating server, send the remainder query; otherwise, send the original query • Eliminate duplicate tuples in the cache
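The lookup steps on slide 14 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: the class `ActiveFormCache`, the callback `fetch_all`, and the sample titles are hypothetical; subsumption is tested by substring containment of search strings; and only the non-collaborating branch (forward the original query) is shown.

```python
class ActiveFormCache:
    """Per-form cache that answers exact and subsumed queries locally."""

    def __init__(self, fetch_all, top_n=50):
        # fetch_all(search_string) is a hypothetical callback that returns
        # the FULL result set as (i_id, i_title) pairs, not just the top n.
        self.fetch_all = fetch_all
        self.top_n = top_n
        self.cached_strings = set()   # search strings already answered
        self.tuples = {}              # i_id -> i_title, so duplicates collapse

    def _select(self, search_string):
        """Selection, ORDER BY i_title, and top-n, all done at the cache."""
        needle = search_string.lower()
        hits = [(title, i_id) for i_id, title in self.tuples.items()
                if needle in title.lower()]
        return sorted(hits)[: self.top_n]

    def lookup(self, search_string):
        key = search_string.lower()
        if key in self.cached_strings:
            return "exact", self._select(search_string)
        # If a cached string is a substring of the new one, every match of
        # the new query also matches the cached query: it is subsumed.
        if any(cached in key for cached in self.cached_strings):
            return "subsumed", self._select(search_string)
        # Miss: forward the original query and cache its whole result set.
        for i_id, title in self.fetch_all(search_string):
            self.tuples[i_id] = title
        self.cached_strings.add(key)
        return "miss", self._select(search_string)


# Made-up stand-in for the origin site, for demonstration only.
DEMO_TITLES = {
    1: "Advanced Java Programming",
    2: "Java Network Programming",
    3: "Unix Network Programming",
}


def demo_server(search_string):
    needle = search_string.lower()
    return [(i, t) for i, t in DEMO_TITLES.items() if needle in t.lower()]
```

Caching the whole result set is what makes the subsumed branch correct: if only the top-n tuples of "Programming" were kept, a later "Network Programming" query could miss matching titles that fell outside that cut.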

  15. Form-based Active Cache Organization [Diagram: cache organization example with three linked components: Cached Queries (predicates on i_title), Cached Tuples (only i_title shown), and Cached Lexicons (index on i_title). Sample entries are book-title strings such as "Java Programming", "Network Programming", and "Unix Network Programming".]
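One possible reading of the three components on slide 15 is sketched below in Python. The slide names the components but not how they are linked, so the layout here is an assumption: `CacheStore` is a hypothetical class, the lexicon is modeled as a word-to-tuple-id inverted index, and search strings are assumed to consist of whole words.

```python
from collections import defaultdict


class CacheStore:
    """Cached queries, cached tuples, and a word lexicon over i_title."""

    def __init__(self):
        self.queries = set()              # predicates on i_title
        self.tuples = {}                  # i_id -> i_title
        self.lexicon = defaultdict(set)   # word -> ids of tuples containing it

    def add_result(self, search_string, rows):
        """Record a query and index its (i_id, i_title) rows once each."""
        self.queries.add(search_string.lower())
        for i_id, title in rows:
            if i_id in self.tuples:
                continue  # duplicate tuple from an overlapping query
            self.tuples[i_id] = title
            for word in title.lower().split():
                self.lexicon[word].add(i_id)

    def select(self, search_string):
        """Use the lexicon to find candidates, then verify the full phrase."""
        words = search_string.lower().split()
        if not words:
            return []
        postings = [self.lexicon.get(word, set()) for word in words]
        candidates = set.intersection(*postings)
        phrase = search_string.lower()
        return sorted(self.tuples[i] for i in candidates
                      if phrase in self.tuples[i].lower())
```

The lexicon narrows the selection to tuples containing every query word, and the final phrase check removes titles such as "Java Network Programming" for the search "Java Programming", where the words appear but not adjacently.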

  16. Experiments Overview • Setup • TPC-W book title search workload • Adding overlap in queries • Adding overlap in datasets • Real user trace over real web sites (omitted)

  17. TPC-W Search Times (in milliseconds) • One signature word per tuple • One signature word per query • Five result tuples per query • 10K-query trace • 2K distinct queries • 10K-tuple cache • Both caching schemes perform well on TPC-W.

  18. Adding Overlap in Queries [Chart: response times of noun traces on the 100K TPC-W database; y-axis: time in milliseconds (0 to 450); traces: Noun100, Noun80, Noun60, Noun40; series: Direct, PQ, AQ0.] • Active outperforms Passive.

  19. Adding Overlap in Datasets • Remainder predicates can help…

  20. Conclusions • Form-based proxy caching framework • Enabling declaration of query templates • Answering HTML form-based queries • Caching schemes • Passive caching is sufficient for the TPC-W trace. • Active caching is more promising for other traces. • Full semantic caching is probably not worthwhile. • Each needs a different server collaboration level: none, some, and remainder query handling, respectively. • Full paper at http://www.cs.wisc.edu/niagara/
