
Updating Web views distributed over wide area networks

Sidiropoulos Antonis and Katsaros Dimitrios, Aristotle Univ. of Thessaloniki, Greece. Presentation by: Katsaros Dimitrios.


Presentation Transcript


  1. Updating Web views distributed over wide area networks • Sidiropoulos Antonis, Katsaros Dimitrios • Aristotle Univ. of Thessaloniki, Greece • Presentation by: Katsaros Dimitrios

  2. Content Distribution Networks • [Figure: a Web client, the origin Web server, and CDN cache servers connected through the Internet, with arrows numbered 1-4]

  3. Content Distribution Networks • Advantages • prevention of the flash crowd problem • avoidance of network congestion • reduction of user-perceived latency • e.g., Akamai • launched in early 1999 • 12,000 servers • in 1,000 networks

  4. Disseminating Updates

  5. Outline • Related work & Motivation • Proposed method • Preliminary performance evaluation • Conclusions & Future work

  6. Presentation Outline • Related work & Motivation • Proposed method • Preliminary performance evaluation • Conclusions & Future work

  7. Best-effort cache coherency • Lack of bandwidth to disseminate all updates • Many caches • Single point of update generation

  8. Related work • Static Web object caching/prefetching (Katsaros & Manolopoulos, ACM SAC’04) (Nanopoulos, Katsaros & Manolopoulos, IEEE TKDE’03) • Dynamic Web object caching/prefetching • cache plays the central role, i.e., prefetching (Cho & Garcia-Molina, SIGMOD’00) and (Gal & Eckstein, J.ACM’01) • minimizing the bandwidth consumption and query latency in the presence of constraints on the age or accuracy of cached objects (Bright & Raschid, VLDB’02; Cohen & Kaplan, Computer Networks’02; Olston & Widom, SIGMOD’01) • strong cache coherence maintenance (Challenger, Iyengar & Dantzig, INFOCOM’99) • update dissemination, best-effort but with a single cache (Labrinidis & Roussopoulos, VLDB’01) • caches and sources cooperate, best-effort caching (Olston & Widom, SIGMOD’02) • optimal transmission of updates, but with fixed assumptions about update rates and transmission capabilities (Wang, Evans & Kwok, Information Systems Frontiers’03)

  9. Presentation Outline • Related work & Motivation • Proposed method • Preliminary performance evaluation • Conclusions & Future work

  10. Web object freshness • Freshness of object O over period [t_i, t_j] • Freshness of database D with N objects
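
The formulas on this slide did not survive in the transcript. As a hedged sketch only, the definitions below follow the freshness metric of Cho & Garcia-Molina (SIGMOD’00), which the deck cites; the instantaneous indicator F(O, t) and the exact normalization are assumptions, not necessarily the authors' own notation.

```latex
% Assumed instantaneous freshness: 1 if the cached copy of O is up to date at time t, else 0.
F(O, t) =
  \begin{cases}
    1 & \text{if the cached copy of } O \text{ is up to date at time } t \\
    0 & \text{otherwise}
  \end{cases}

% Freshness of object O over the period [t_i, t_j]: the fraction of time it is fresh.
F(O, [t_i, t_j]) = \frac{1}{t_j - t_i} \int_{t_i}^{t_j} F(O, t)\, dt

% Freshness of a database D with N objects: the unweighted mean over its objects.
F(D, [t_i, t_j]) = \frac{1}{N} \sum_{k=1}^{N} F(O_k, [t_i, t_j])
```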

  11. Weighted Web object freshness • The access pattern of Web objects is skewed • Objects with higher access rates contribute more to what is perceived as database freshness • For a database with N objects O_i, each with popularity f_{O_i}, the freshness is defined as:
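
The defining formula is missing from the transcript. One plausible form, offered as an assumption rather than the authors' exact definition, is a popularity-weighted average of the per-object freshness values F(O_i, T), where T denotes the observation period:

```latex
% Assumed popularity-weighted freshness of database D over period T.
% If the popularities f_{O_i} already sum to 1, the denominator is 1.
F(D, T) = \frac{\sum_{i=1}^{N} f_{O_i}\, F(O_i, T)}{\sum_{i=1}^{N} f_{O_i}}
```

With equal popularities this reduces to a plain average over the N objects.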

  12. Maintain best-effort coherency • Devise a sequence of update disseminations so as to maximize F(D,T) • Hence: The “best-effort” cache coherence maintenance is a nonpreemptive scheduling problem

  13. FIFO scheduling • Assume that there are sufficient • network resources • processing resources • Use FIFO scheduling (First-Come-First-Served) • Visualize our scheduling problem with 2-dimensional Gantt charts (Goemans & Williamson, SIAM Journal on Discrete Mathematics’00)

  14. Example of updates • We have three pending refreshes in the server's queue, i.e., Refresh1, Refresh2 and Refresh3, which arrived in that order

  15. 2-D Gantt chart for FIFO • [Figure: popularity on the vertical axis, cost on the horizontal axis; refreshes scheduled in the order 1, 2, 3] • Divergence = 1 - Freshness = Area under the thick polygonal line = 64

  16. [Figure: the same 2-D Gantt chart for FIFO] • Can we do better?

  17. [Figure: the same 2-D Gantt chart for FIFO] • Can we do better?

  18. [Figure: 2-D Gantt chart with the refreshes scheduled in the order 3, 2, 1] • Yes! Schedule the max(pop/cost) • Divergence = 1 - Freshness = Area under the thick polygonal line = 58 (a gain of roughly 10%, even for this small example)

  19. Largest Slope Rule (LSR) scheduling • Always disseminate next the pending update with the largest popularity/cost ratio • This rule can be proved optimal when updates are independent • It is no longer optimal in the presence of dependencies • It remains a very efficient heuristic even when dependencies exist
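
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes independent updates and models divergence as the popularity-weighted completion time of each update. The `Update` type, the function names, and the three sample updates are hypothetical and are not the values from the slides. Sorting by weight/cost ratio is also known in scheduling theory as Smith's rule.

```python
from itertools import permutations
from typing import NamedTuple, Sequence


class Update(NamedTuple):
    """Hypothetical pending update: popularity of the view and cost to refresh it."""
    name: str
    popularity: float
    cost: float


def divergence(schedule: Sequence[Update]) -> float:
    """Sum of popularity * completion time (assumed divergence model)."""
    elapsed = 0.0
    total = 0.0
    for update in schedule:
        elapsed += update.cost              # this update finishes at time `elapsed`
        total += update.popularity * elapsed
    return total


def lsr_schedule(pending: Sequence[Update]) -> list[Update]:
    """Order updates by decreasing popularity/cost ratio (largest slope first)."""
    return sorted(pending, key=lambda u: u.popularity / u.cost, reverse=True)


# Illustrative data only; listed in arrival (FIFO) order.
pending = [Update("A", 2, 4), Update("B", 3, 3), Update("C", 6, 1)]

fifo_divergence = divergence(pending)
lsr_divergence = divergence(lsr_schedule(pending))
# Brute force over all orderings to check optimality on this tiny instance.
best_divergence = min(divergence(order) for order in permutations(pending))

print(fifo_divergence, lsr_divergence, best_divergence)
```

On this small instance the LSR order matches the brute-force optimum, in line with the optimality claim for independent updates. With dependencies between updates, a plain sort like this would no longer be guaranteed optimal.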

  20. Presentation Outline • Related work & Motivation • Proposed method • Preliminary performance evaluation • Conclusions & Future work

  21. Simulated System Hardware • [Figure: a Parasol node (MasterCDN) with Parasol CPUs CPU:0, CPU:1 and CPU:2, connected by Parasol network links through routers/gateways to CDN server 1, CDN server 2, ..., CDN server n]

  22. Simulated System Model • [Figure: components shown are the scheduler algorithm, Dispatcher, ViewUpdater and DBMS of the Master CDN, the per-server updaters CDN1 updater, CDN2 updater, ..., CDNn updater, and the servers CDN1, CDN2, ..., CDNn; labeled flows are relation updates, DB updates and request for view update, with steps numbered 1-6]

  23. masterCDN components • [Figure: node MasterCDN with CPU:0, CPU:1 and CPU:2; components shown are the scheduler algorithm, the pool of views to be updated, the relation queue receiving relation updates, the Dispatcher, the DBMS, the ViewUpdater, and one pool of views to transmit for each of CDN1updater, CDN2updater, ..., CDNnupdater]

  24. Methodology • Synthetic (sample CDN with 10 edge servers) • Synthetic data generator • Modeling network nodes, network bandwidth, size of documents, relations, views, view derivation hierarchy, update rates, popularity • Examine the impact of: • update rate • number of relations

  25. Freshness vs. Update rate

  26. Freshness vs. Update rate

  27. Freshness vs. Update rate

  28. Freshness vs. #Relations

  29. LSR Freshness vs. update rate

  30. Freshness vs. (#Rel, dep_density) • Top: 100 Rels • Bottom: 500 Rels • Left: Sparse dep. • Right: Dense dep.

  31. Presentation Outline • Related work & Motivation • Proposed method • Preliminary performance evaluation • Conclusions & Future work

  32. Conclusions & Future work • Conclusions • we proposed a best-effort cache coherence maintenance scheme for the edge servers of a CDN • it is a pure push-based dissemination method • the scheme is based on the LSR scheduling algorithm • we presented preliminary results to justify its efficiency • Future work • Organize the edge servers into a (possibly) deep hierarchy, so as to parallelize the update dissemination

  33. References • L. Bright and L. Raschid, Using Latency-Recency Profiles for Data Delivery on the Web, Proc. of the VLDB, pp. 550-561, 2002. • J. Challenger, A. Iyengar, and P. Dantzig, A Scalable System for Consistently Caching Dynamic Web Data, Proc. of the IEEE INFOCOM, 1999. • J. Cho and H. Garcia-Molina, Synchronizing a Database to Improve Freshness, Proc. of the ACM SIGMOD, pp. 117-128, 2000. • E. Cohen and H. Kaplan, Refreshment Policies for Web Content Caches, Computer Networks, 38(6), pp. 795-808, 2002. • A. Gal and J. Eckstein, Managing Periodically Updated Data in Relational Databases: A Stochastic Modeling Approach, Journal of the ACM, 48(6), pp. 1141-1183, 2001. • M.X. Goemans and D.P. Williamson, Two-Dimensional Gantt Charts and a Scheduling Algorithm of Lawler, SIAM Journal on Discrete Mathematics, 13(3), pp. 281-294, 2000. • D. Katsaros and Y. Manolopoulos, Caching in Web Memory Hierarchies, Proc. of the ACM SAC, 2004. • A. Labrinidis and N. Roussopoulos, Update Propagation Strategies for Improving the Quality of Data on the Web, Proc. of the VLDB, 2001. • A. Nanopoulos, D. Katsaros, and Y. Manolopoulos, A Data Mining Algorithm for Generalized Web Prefetching, IEEE Trans. on Knowledge and Data Engineering, 15(5), pp. 1155-1169, 2003. • C. Olston and J. Widom, Adaptive Precision Setting for Cached Approximate Values, Proc. of the ACM SIGMOD, pp. 355-366, 2001. • C. Olston and J. Widom, Best-Effort Cache Synchronization with Source Cooperation, Proc. of the ACM SIGMOD, pp. 73-84, 2002. • J.W. Wang, D. Evans, and M. Kwok, On Staleness and the Delivery of Web Pages, Information Systems Frontiers, 5(2), pp. 129-136, 2003.

  34. Contact information • Sidiropoulos Antonis, Dept. of Informatics, Aristotle University, Thessaloniki, 54124, Greece, asidirop@csd.auth.gr, http://users.auth.gr/~asidirop • Katsaros Dimitrios, Dept. of Informatics, Aristotle University, Thessaloniki, 54124, Greece, dkatsaro@csd.auth.gr, http://skyblue.csd.auth.gr
