Multicache-Based Content Management for Efficient Web Caching

Multicache-Based Content Management for Web Caching
Kai Cheng and Yahiko Kambayashi
Graduate School of Informatics, Kyoto University, Kyoto, Japan

Outline of the Presentation
• Introduction
• Why Content Management
• Contributions of Our Work
• Multicache-Based Content Management
• Content Management Scheme for LRU-SP
• Experimental Evaluation
• Concluding Remarks
(C)chengk@kuis.kyoto-u.ac.jp

1.1. Why Content Management
[Figure: numbered flows ① to ④ among User, Network, and Servers]
Maximize Hit Rate (r = ②/①) or Weighted Hit Rate

Can the Web Do Without Caching?
• Bandwidth Scarcity = Weakest Part
• Unrealistic to Update All Resources
• "Hot-Spot" Servers
• Unpredictable Server Overload
• Inherent Latency: Bounded by Light Speed and Distance
• Present Even With Sufficient Bandwidth and Server Capacity
• Transoceanic Data Transfer: 200 ms to 300 ms
Caching Is Necessary To Adaptively Reduce Remote Data Requests

1.2. Why Content Management
Replacement policies based on empirical formulas have difficulty dealing with these!

Deploying Content Management
• To Support
  • Larger Cache Space
  • Sophisticated Control Logic
• To Support
  • Sophisticated Replacement Policies With
  • User-Oriented Performance Metrics
  • Document Treated as Semantic Unit

1.3. Contributions of This Work
• A Multicache Architecture for Implementing Sophisticated Content Management, Including a New Cache Definition
• A Study of Content Management for LRU-SP
• Simulations to Compare LRU-SP Against Others

Previous Work
• Classifications in Approximate Implementations of Complicated Caching Schemes
  • LRV, LNC-W3-U, etc.
• Segmentation in Traditional Caching as a Tradeoff Between Performance and Complexity
  • Segmented FIFO, FBR, 2Q, etc.
• Disadvantages
  • Both Are Built-In, Ad Hoc Implementations Rather Than Independent Mechanisms
  • Neither Can Support Sophisticated Category-Based or Semantic-Based Classification

Managing LFU Contents in Multiple Priority Queues
[Figure: cached objects labeled with reference counts, A(10), B(8), C(6), D(3), E(2), F(1), G(1), H(1), held in separate queues for 1, 2, and more than 2 references; each queue is kept in first-in-first-out order, with hits and outs marked]

Cache Components
• Space
  • Limited Storage Space
• Contents
  • Objects Selected for Caching
• Policies
  • Replacement Policies
• Constraints
  • Special Conditions

Constraints for Cache
• Admission Constraints
  • Define Conditions for Objects Eligible for Caching
    e.g. (size < 2MB) && !(Source = local)
• Freshness Constraints
  • Define Conditions for Objects Fresh Enough for Re-Use
    e.g. (Type = news) && (Last-Modified < 1week)
• Miscellaneous Constraints
    e.g. (Time = end-of-day), (Total-Size < 95% * Cache-Size)
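The two example constraints on this slide can be read as simple predicates over object metadata. The following is a minimal Python sketch of that reading, not the authors' implementation. The `WebObject` record and all of its field names are hypothetical, and the freshness test assumes that "Last-Modified < 1week" means the object was modified less than one week ago.

```python
from dataclasses import dataclass

MB = 1024 * 1024
WEEK_SECONDS = 7 * 24 * 3600


@dataclass
class WebObject:
    # Hypothetical record; none of these field names come from the slides.
    url: str
    size: int             # bytes
    source: str           # e.g. "local" or "remote"
    doc_type: str         # e.g. "news"
    last_modified: float  # seconds since the epoch


def admissible(obj):
    """Admission constraint: (size < 2MB) && !(Source = local)."""
    return obj.size < 2 * MB and obj.source != "local"


def fresh(obj, now):
    """Freshness constraint: (Type = news) && (Last-Modified < 1week).

    Assumed reading: the last modification is less than one week old.
    """
    return obj.doc_type == "news" and (now - obj.last_modified) < WEEK_SECONDS
```

Keeping each constraint as a separate predicate lets a cache check admission once on a miss and freshness on every re-use.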

Multicache Architecture
Web Cache With Multiple Subcaches
[Figure: components shown are Client, Web Servers, Request/Response, Central Router, In-Cache Constraints, Cache Knowledge Base (CKB), three Subcaches, and Judge]

Components of the Architecture
• Central Router
  • Controls and Mediates the Cache
• Cache Knowledge Base (CKB)
  • A Set of Rules for Allocating Objects
    R1. Allocate(X, 1) :- url(X, U), match(U, *.jp), content(X, baseball)
• Subcaches
  • Caches for Keeping Objects With Special Properties
• Cache Judge
  • Makes the Final Decision From a Set of Eviction Candidates
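Rule R1 can be approximated outside a logic language as a predicate paired with a subcache ID. The sketch below is one possible Python rendering under stated assumptions: objects are plain dictionaries, the `topic` key stands in for `content(X, ...)`, the pattern `*.jp` is matched against the host name, and the default subcache 0 for objects that match no rule is hypothetical.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def rule_r1(obj):
    """R1: the host matches *.jp and the content is about baseball."""
    host = urlparse(obj["url"]).hostname or ""
    return fnmatch(host, "*.jp") and obj.get("topic") == "baseball"


# Each rule is (predicate, subcache ID); the first rule that fires wins.
RULES = [(rule_r1, 1)]
DEFAULT_SUBCACHE = 0  # assumption: the slides do not say where unmatched objects go


def allocate(obj, rules=RULES):
    """Return the subcache ID chosen by the first matching rule."""
    for predicate, subcache_id in rules:
        if predicate(obj):
            return subcache_id
    return DEFAULT_SUBCACHE
```

For example, `allocate({"url": "http://www.example.co.jp/sports", "topic": "baseball"})` selects subcache 1, while an object on a `.com` host falls through to the default.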

The Procedural Description
The Central Router services each request. Suppose the current request is for document p:
• Locate p via the in-cache index;
• If p is not in the cache, download p;
• Validate the constraints; if they do not hold, loop;
• Fire the rules in the CKB and let the subcache ID be K;
• While there is not enough space in subcache K for p:
  • Subcache K selects an eviction candidate;
  • If space is shared, the other subcaches do the same;
  • The Judge assesses the eviction candidates;
  • Purge the victim;
• Cache p in subcache K;
• If p is already in a subcache, do i) - iv), then re-cache p.
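The procedure can be sketched as a small Python class. This is an illustration under several assumptions, not the authors' code: space is shared by all subcaches, each subcache nominates its oldest entry, a hit is handled by removing and re-caching the object, and the `allocate`, `admit`, and `judge` callables are hypothetical stand-ins for the CKB rules, the constraints, and the Judge.

```python
from collections import OrderedDict


class MultiCache:
    """Central router over several subcaches that share one space budget."""

    def __init__(self, capacity, n_subcaches, allocate, admit, judge):
        self.capacity = capacity
        self.used = 0
        # One ordered map per subcache: key -> size, oldest entry first.
        self.subcaches = [OrderedDict() for _ in range(n_subcaches)]
        self.index = {}  # in-cache index: key -> subcache ID
        self.allocate, self.admit, self.judge = allocate, admit, judge

    def _remove(self, key):
        k = self.index.pop(key)
        self.used -= self.subcaches[k].pop(key)

    def request(self, key, size):
        """Serve one request; return True on a hit, False on a miss."""
        hit = key in self.index
        if hit:
            self._remove(key)             # will be re-cached below
        elif not self.admit(key, size):   # constraints fail: do not cache
            return False
        if size > self.capacity:          # can never fit
            return hit
        k = self.allocate(key, size)      # fire the CKB rules
        while self.used + size > self.capacity:
            # Every non-empty subcache nominates its oldest (key, size) entry.
            candidates = [next(iter(sc.items())) for sc in self.subcaches if sc]
            self._remove(self.judge(candidates))  # the judge returns the victim key
        self.subcaches[k][key] = size
        self.index[key] = k
        self.used += size
        return hit
```

A judge that evicts the largest candidate could be passed as `lambda cands: max(cands, key=lambda kv: kv[1])[0]`.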

Content Management for LRU-SP
• LRU (Least Recently Used)
  • Primarily Designed for Equal-Sized Objects; Uses Only Recency of Reference
• Extended LRUs
  • Size-Adjusted LRU (SzLRU)
  • Segmented LRU (SgLRU)
• LRU-SP (Size-Adjusted and Popularity-Aware LRU)
  • Makes SzLRU Aware of Popularity Degree

Probability of Re-Reference As a Function of Current Reference Times

Cost-to-Size Ratio Model
• An Object A in Cache Saves Cost nref * (1/atime)
  • nref is the frequency of reference
  • atime is the time since last access; (1/atime) is the dynamic frequency of A
• When Put in Cache, It Takes Up Space size
• Cost-to-size ratio = nref / (size * atime)
• The Object With the Least Ratio Is the Least Beneficial One
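The ratio can be checked with a small worked example. The Python sketch below applies the slide's formula to three made-up objects; the object names and their numbers are purely illustrative and do not come from the presentation.

```python
def cost_to_size_ratio(nref, size, atime):
    """Cost-to-size ratio = nref / (size * atime), as defined on the slide."""
    return nref / (size * atime)


# Hypothetical objects: name -> (nref, size in bytes, atime in seconds).
objects = {
    "A": (10, 2000, 5.0),   # often referenced, recently used
    "B": (2, 500, 50.0),    # small but cold
    "C": (1, 8000, 20.0),   # large and referenced once
}

ratios = {name: cost_to_size_ratio(*attrs) for name, attrs in objects.items()}

# The object with the least ratio is the least beneficial one.
least_beneficial = min(ratios, key=ratios.get)
```

Here A scores 10 / (2000 * 5) = 0.001, B scores 2 / (500 * 50) = 0.00008, and C scores 1 / (8000 * 20) = 0.00000625, so C is the least beneficial object.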

Content Management of LRU-SP
• CKB Rule:
    Allocate(X, log(size/nref)) :- Size(X, size), Freq(X, nref)
• Subcaches
  • Least Recently Used (LRU)
• Judge
  • Finds the One With the Largest (size * atime) / nref
  • The Larger, Older, and Colder an Object Is, the Faster It Will Be Purged
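Put together, the rule, the LRU subcaches, and the judge might look like the following Python sketch. It is an illustration only, with assumptions the slides leave open: the logarithm is taken base 2 and floored (clamped at 0), time is a logical clock that ticks once per request, and on a hit the stored size is used. The class and method names are hypothetical.

```python
import math
from collections import OrderedDict


class LRUSP:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.clock = 0    # logical time: one tick per request (assumption)
        self.meta = {}    # key -> [size, nref, last_access]
        self.queues = {}  # subcache ID -> OrderedDict of keys, LRU first

    def _subcache_id(self, size, nref):
        # CKB rule Allocate(X, log(size/nref)); base 2 and floor are assumptions.
        return max(0, math.floor(math.log2(size / nref)))

    def _detach(self, key):
        size, nref, _ = self.meta[key]
        del self.queues[self._subcache_id(size, nref)][key]

    def _evict_one(self):
        def score(key):  # judge criterion: (size * atime) / nref
            size, nref, last = self.meta[key]
            return size * (self.clock - last) / nref

        # Each non-empty LRU subcache nominates its least recently used key.
        candidates = [next(iter(q)) for q in self.queues.values() if q]
        victim = max(candidates, key=score)
        self._detach(victim)
        self.used -= self.meta.pop(victim)[0]
        return victim

    def request(self, key, size):
        """Serve one request; return True on a hit, False on a miss."""
        self.clock += 1
        hit = key in self.meta
        if hit:
            self._detach(key)  # leave the old subcache before nref changes
            entry = self.meta[key]
            entry[1] += 1
        else:
            if size > self.capacity:
                return False
            while self.used + size > self.capacity:
                self._evict_one()
            entry = self.meta[key] = [size, 1, self.clock]
            self.used += size
        entry[2] = self.clock
        sid = self._subcache_id(entry[0], entry[1])
        self.queues.setdefault(sid, OrderedDict())[key] = None
        return hit
```

Because the judge only compares one candidate per subcache, the work per eviction depends on the number of subcaches rather than on the number of cached objects.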

Predicted Results
• A higher Hit Rate is expected for LRU-SP, because it uses three indicators of document popularity.
• However, higher Hit Rates usually come at the cost of lower Byte Hit Rates, because smaller documents contribute fewer bytes to the hit data.
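The tension between the two metrics shows up in a small calculation. The Python sketch below computes both rates from a request log; the log format, a list of (size, was_hit) pairs, and the sample numbers are hypothetical, and the function assumes a non-empty log.

```python
def hit_rates(log):
    """Return (hit rate, byte hit rate) for a non-empty request log.

    log: iterable of (size_in_bytes, was_hit) pairs, one per request.
    """
    requests = hits = total_bytes = hit_bytes = 0
    for size, was_hit in log:
        requests += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    return hits / requests, hit_bytes / total_bytes


# Three small documents hit; one large document misses.
sample_log = [(1_000, True), (1_000, True), (1_000, True), (97_000, False)]
hit_rate, byte_hit_rate = hit_rates(sample_log)
```

In this sample three of four requests hit, a Hit Rate of 0.75, but only 3,000 of 100,000 bytes are served from the cache, a Byte Hit Rate of 0.03.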

Experiment Results

Explanations
• LRU-SP did obtain a much higher Hit Rate than SzLRU, SgLRU, or LRV.
• LRU-SP also obtained a higher Byte Hit Rate when cache space exceeds 3% of the total required space.
• LRU-SP incurs only O(1) time complexity in content management.
• LRU-SP is a significantly improved algorithm.

Concluding Remarks
• The Multicache-Based Architecture Has Proved Ideal for Realizing a Good Balance Between High Performance and Low Overhead
• It Is Capable of Incorporating Semantic Information as Well as User Preferences in Caching
• It Can Work With Data Management Systems to Support Web Information Integration
