This paper explores the effectiveness of content management techniques in web proxy caches, highlighting issues with existing replacement policies that lead to low hit rates and additional delays. It presents a systematic evaluation, including a critique of current methods, data collection techniques utilizing long-term traces, and experimental design considerations. The authors propose new replacement algorithms aimed at improving efficiency, while also introducing the concept of virtual caches. Key findings emphasize the importance of frequency and recency factors in cache management.
Evaluating Content Management Techniques for Web Proxy Caches
Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories)
(in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99)
Cho Joon-ho (CA Lab, CS department, KAIST), 2001. 11. 6
Agenda
• Problems
• Quick Tour (Summary)
• Critique
• Design & Design Rationale
• Data Collection and Reduction
• Key Workload Characteristics
• Experimental Design
• Simulation Results
• Virtual Cache
Evaluating Content Management Tech for Web Proxy Caches
Problems
• Current Web proxy caches utilize simple replacement policies
  • Relatively low hit rates
  • Additional delays
• So what?
  • Developing a quantitative understanding of Web traffic
    • How effective are current proxy cache replacement policies for real workloads?
    • Focus on two performance metrics: hit rate and byte hit rate
  • Designing new replacement policies
    • Utilize frequency for higher performance
    • Neither susceptible to cache pollution nor dependent on parameterization
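The two metrics named above can be computed from a simulated request log as in the following minimal sketch. The record layout `(was_hit, size_in_bytes)` and the sample data are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Iterable, Tuple


def hit_rates(requests: Iterable[Tuple[bool, int]]) -> Tuple[float, float]:
    """Return (hit rate, byte hit rate) for (was_hit, size_in_bytes) records."""
    total_reqs = hit_reqs = total_bytes = hit_bytes = 0
    for was_hit, size in requests:
        total_reqs += 1
        total_bytes += size
        if was_hit:
            hit_reqs += 1
            hit_bytes += size
    if total_reqs == 0 or total_bytes == 0:
        return 0.0, 0.0
    # Hit rate counts requests; byte hit rate weights each request by its size.
    return hit_reqs / total_reqs, hit_bytes / total_bytes


# Hypothetical sample: three small hits and one large miss.
sample = [(True, 1_000), (True, 2_000), (True, 1_000), (False, 96_000)]
hr, bhr = hit_rates(sample)
```

In this made-up sample most requests hit, yet most bytes miss, which is why a policy can score well on one metric and poorly on the other.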
Quick Tour (Summary) – 1/3
• The problems of existing studies
  • They use short-term traces of busy proxies or long-term traces of relatively inactive proxies
  • Long-term traces from busy environments are needed
• Trace-driven simulation
  • A total of 117,652,652 requests collected over five months
  • Uses a smaller, more compact log
• The points to be considered
  • Object size
  • Recency of reference
  • Frequency of reference
  • Turnover
Quick Tour (Summary) – 2/3
• Existing replacement policies
  • LRU (Least-Recently-Used)
  • Size – replaces the largest object
  • GD-Size (GreedyDual-Size) – replaces the object with the lowest utility, $K_i = C_i / S_i + L$
  • LFU – replaces the least frequently used object
• New replacement policies
  • GDSF (GreedyDual-Size with Frequency) – GD-Size + a frequency factor, $K_i = F_i \cdot C_i / S_i + L$
  • LFU-DA (Least Frequently Used with Dynamic Aging) – LFU-Aging + a dynamic mechanism (running age $L$), $K_i = C_i \cdot F_i + L$
• Virtual Caches
  • Logically partition the cache into N virtual caches
Quick Tour (Summary) – 3/3
[Figure: Comparison of Proposed Policies to Existing Replacement Policies]
[Figure: Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA]
Critique
• Pros
  • Quantitative understanding of Web traffic
  • Long-term trace-driven simulation in busy proxy servers
  • Providing two new replacement algorithms that run efficiently
  • Providing a new cache management method, ‘Virtual Cache’
• Cons
  • Not fresh
  • No consideration of dynamic data
  • No consideration of processing overhead for these more complex algorithms
  • Performance improvements are insignificant
Data Collection and Reduction
• Data collection
  • Long-term trace-driven simulation
  • A total of 117,652,652 requests were handled during a five-month period
  • Data include
    • Client IP address, request time, response status, the time required for the proxy to complete its response…
• Data reduction
  • Smaller, more compact log
    • Due to storage constraints
    • To ensure that analyses and simulations could be completed in a reasonable amount of time
  • Reduction by
    • Storing data in a more efficient manner
    • Removing information of little value
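One way to store log data more efficiently is to pack each record into a fixed-width binary layout instead of a text line. The sketch below is only an illustration under stated assumptions: the field order, field widths and the sample record are hypothetical, and the slides do not describe the format the authors actually used. Only the four fields named above are included.

```python
import ipaddress
import struct

# Hypothetical layout (network byte order, no padding):
# request time (uint32), client IPv4 (uint32), status (uint16), elapsed ms (uint32).
RECORD = struct.Struct("!IIHI")


def pack_record(ts: int, client_ip: str, status: int, elapsed_ms: int) -> bytes:
    """Encode one log record as a fixed-width binary blob."""
    ip_as_int = int(ipaddress.IPv4Address(client_ip))
    return RECORD.pack(ts, ip_as_int, status, elapsed_ms)


def unpack_record(blob: bytes):
    """Decode a blob produced by pack_record back into readable fields."""
    ts, ip_as_int, status, elapsed_ms = RECORD.unpack(blob)
    return ts, str(ipaddress.IPv4Address(ip_as_int)), status, elapsed_ms


# Made-up record, shown both as a text line and in packed form.
text_line = "915148800 192.0.2.17 200 135"
blob = pack_record(915148800, "192.0.2.17", 200, 135)
```

Here the packed record is half the size of the equivalent text line, and fixed-width records are also faster to scan during repeated simulation runs.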
Key Workload Characteristics
• Cacheable objects
  • Most client requests are for cacheable objects (96%)
• Object set size
  • Total 389 GB
• Object sizes
  • Variable – median: 4 KB, maximum: 148 MB (a video clip)
• Recency of reference
  • 1/3 of all re-references occurred within one hour
• Frequency of reference
  • Web referencing patterns are non-uniform
• Turnover
  • Objects that were once popular are no longer requested
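A recency statistic such as "1/3 of all re-references occurred within one hour" could be measured from a trace roughly as follows. This is a minimal sketch: the function name, the `(timestamp_seconds, url)` trace format and the sample trace are all assumptions for illustration, not data from the paper.

```python
from typing import Dict, Iterable, Tuple


def fraction_rereferenced_within(
    trace: Iterable[Tuple[int, str]], window_seconds: int = 3600
) -> float:
    """Fraction of re-references that arrive within window_seconds
    of the previous reference to the same object."""
    last_seen: Dict[str, int] = {}
    rerefs = within = 0
    for ts, url in trace:
        if url in last_seen:
            rerefs += 1
            if ts - last_seen[url] <= window_seconds:
                within += 1
        last_seen[url] = ts
    return within / rerefs if rerefs else 0.0


# Hypothetical trace: (timestamp in seconds, requested URL).
trace = [(0, "/a"), (10, "/b"), (600, "/a"), (5000, "/b"), (9000, "/a")]
share = fraction_rereferenced_within(trace)
```

In this made-up trace one of the three re-references falls inside the one-hour window.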
Experimental Design – 1/2
• Least-Recently-Used (LRU)
  • Replaces the object requested least recently
  • Considers only a single workload characteristic
• Size
  • Replaces the largest object
  • Tries to minimize the miss ratio (targets hit rate rather than byte hit rate)
  • Susceptible to cache pollution
• GreedyDual-Size (GD-Size)
  • GD-Size(1) for hit rate
  • GD-Size(Packets) for byte hit rate
  • Key value: $K_i = C_i / S_i + L$
    • $C_i$ – the cost associated with bringing object i into the cache
    • $S_i$ – the object size
    • $L$ – a running age factor
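The GD-Size key $K_i = C_i / S_i + L$ can be turned into a small simulator along the following lines. This is a sketch under assumptions, not the authors' implementation: the class and method names are hypothetical, the heap with lazy deletion is one implementation choice among several, and the packet-based cost `2 + size / 536` is assumed from the GreedyDual-Size literature rather than stated in these slides.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def packets_cost(size: int) -> float:
    """Assumed cost for GD-Size(Packets): estimated packets per transfer."""
    return 2 + size / 536


class GDSizeCache:
    def __init__(self, capacity: int,
                 cost: Callable[[int], float] = lambda size: 1.0):
        self.capacity = capacity      # total bytes available
        self.cost = cost              # C_i as a function of object size
        self.used = 0
        self.age = 0.0                # running age factor L
        self.entries: Dict[str, Tuple[float, int]] = {}  # name -> (K_i, S_i)
        self.heap: List[Tuple[float, str]] = []          # may hold stale keys

    def _set_key(self, name: str, size: int) -> None:
        key = self.cost(size) / size + self.age          # K_i = C_i / S_i + L
        self.entries[name] = (key, size)
        heapq.heappush(self.heap, (key, name))

    def _evict_one(self) -> None:
        while self.heap:
            key, name = heapq.heappop(self.heap)
            current = self.entries.get(name)
            if current is not None and current[0] == key:  # skip stale entries
                del self.entries[name]
                self.used -= current[1]
                self.age = key        # L becomes the key of the evicted object
                return

    def access(self, name: str, size: int) -> bool:
        """Return True on a hit, False on a miss (object sizes must be > 0)."""
        if name in self.entries:
            self._set_key(name, self.entries[name][1])   # refresh key on a hit
            return True
        if size > self.capacity:
            return False                                 # too large to cache
        while self.used + size > self.capacity:
            self._evict_one()
        self.used += size
        self._set_key(name, size)
        return False


hit_cache = GDSizeCache(capacity=100)                      # GD-Size(1)
byte_cache = GDSizeCache(capacity=100, cost=packets_cost)  # GD-Size(Packets)
```

With a constant cost of 1, large objects receive small keys and are evicted first, which favors hit rate; a size-dependent cost narrows the gap between large and small objects.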
Experimental Design – 2/2
• LFU
  • Replaces the least frequently used object
  • LFU-Aging = LFU + aging → avoids cache pollution
  • The parameterization problem still remains
• GreedyDual-Size with Frequency (GDSF)
  • GD-Size doesn’t take frequency into account
  • Key value: $K_i = F_i \cdot C_i / S_i + L$
• Least Frequently Used with Dynamic Aging (LFU-DA)
  • LFU-Aging requires parameterization to perform well
  • LFU-DA uses an inflation factor as well as the frequency count
  • Key value: $K_i = C_i \cdot F_i + L$
• $F_i$ – a frequency count; $L$ – a running age factor
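A short worked example may help show what the frequency count and the running age factor contribute to the GDSF and LFU-DA keys. The function names and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def gd_size_key(cost: float, size: float, age: float) -> float:
    return cost / size + age             # K_i = C_i / S_i + L


def gdsf_key(freq: int, cost: float, size: float, age: float) -> float:
    return freq * cost / size + age      # K_i = F_i * C_i / S_i + L


def lfu_da_key(freq: int, cost: float, age: float) -> float:
    return cost * freq + age             # K_i = C_i * F_i + L


# Two objects of equal size and cost; A was referenced 5 times, B once (L = 0).
a_gd, b_gd = gd_size_key(1, 4, 0.0), gd_size_key(1, 4, 0.0)      # 0.25 and 0.25
a_gdsf, b_gdsf = gdsf_key(5, 1, 4, 0.0), gdsf_key(1, 1, 4, 0.0)  # 1.25 and 0.25

# Dynamic aging: A stopped being referenced while L grew to 4.5 through
# later evictions; a new object C arrives with a single reference.
a_lfu_da = lfu_da_key(5, 1, 0.0)         # 5.0, key fixed at A's last reference
c_lfu_da = lfu_da_key(1, 1, 4.5)         # 5.5, so A is evicted before C
```

GD-Size cannot tell A and B apart, while GDSF keeps the more frequently referenced object. In the LFU-DA case the age term lets a newly arrived object outrank a formerly popular one without a tuned aging parameter.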
Simulation Results – 1/2
Figure 1. Comparison of Existing Replacement Policies
Simulation Results – 2/2
Figure 2. Comparison of Proposed Policies to Existing Replacement Policies
Virtual Cache – 1/2
• An approach that can focus on both hit rate and byte hit rate simultaneously
• Mechanism
  • Logically partitions the cache into N virtual caches
  • Each virtual cache (VC) is managed with its own replacement policy
• Steps
  • Initially all objects are in VC0
  • Replacements from VCi are moved to VCi+1
  • Replacements from the last virtual cache are evicted from the cache
  • When reaccessed, objects are reinserted in VC0
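The steps above can be sketched as a chain of partitions in which a victim from one virtual cache is demoted to the next. To keep the example short, two simplifying assumptions are made that differ from the slides: every partition uses LRU rather than its own policy such as GDSF or LFU-DA, and capacities are counted in objects rather than bytes. The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import List, Optional


class VirtualCaches:
    """N logical partitions; each uses LRU here purely for brevity."""

    def __init__(self, capacities: List[int]):
        self.capacities = capacities            # max objects per virtual cache
        self.vcs = [OrderedDict() for _ in capacities]

    def _insert(self, level: int, name: str) -> None:
        if level == len(self.vcs):
            return                              # left the last VC: evicted
        vc = self.vcs[level]
        vc[name] = True
        if len(vc) > self.capacities[level]:
            victim, _ = vc.popitem(last=False)  # least recently used in this VC
            self._insert(level + 1, victim)     # demote to VC(i+1)

    def access(self, name: str) -> bool:
        """Return True on a hit in any virtual cache, False on a miss."""
        hit = False
        for vc in self.vcs:
            if name in vc:
                del vc[name]
                hit = True
                break
        self._insert(0, name)                   # new and reaccessed objects go to VC0
        return hit

    def level_of(self, name: str) -> Optional[int]:
        for i, vc in enumerate(self.vcs):
            if name in vc:
                return i
        return None


cache = VirtualCaches([2, 2])
for obj in ["a", "b", "c", "d"]:
    cache.access(obj)
# "a" and "b" have now been demoted to VC1; "c" and "d" are in VC0.
```

Swapping the per-partition LRU for a key-based policy would give configurations like the ones evaluated in the deck, for example one policy in VC0 and another in VC1.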
Virtual Cache – 2/2
Figure 3. Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA
Figure 4. Analysis of Virtual Cache Performance; VC0 using LFU-DA, VC1 using GDSF-Hits