Web Cache

Web Cache Yee Vang

Introduction • The Internet has many users • Issues with access latency (lag) • Servers crashing under load • How to solve? • One solution: web caching

Web Cache • What is a web cache? • Cache: “a place of storage” • Web cache: “a place to store websites or web objects”

Web Caching • A technique that can: • Reduce access latency • “the time it takes for a request to be completed” • Reduce network congestion • “occurs when a link or node is carrying so much data that its quality of service deteriorates”

Web Caching • How does it reduce user access latency and network congestion? • No cache example • Movie storage room in the next building • Contains one copy of every movie • One worker

Web Caching • Cache example • The same as the previous example • Movie storage room in the next building • Contains one copy of every movie • One worker • A movie rack that can hold five movies at a time, to simulate a movie cache.

Web Caching • In the example • Customer -> User • Movie -> Web Pages • Worker -> ISP • Movie Storage Room -> Origin Server • Movie Rack -> Web Cache

Cache Hit/Cache Hit Rate • Cache hit • Occurs when a request can be satisfied by the web cache. • In the movie store example • Hit? • Cache hit rate • The percentage of requests that are satisfied by the web cache

Cache Miss • Cache miss • Occurs when a request cannot be satisfied by the web cache. • In the movie example • Miss?
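The movie rack example can be sketched in a few lines of Python to show how hits, misses, and the hit rate are counted. The function name `simulate_rack`, the request sequence, and the oldest-first removal rule are assumptions made for illustration; the slides do not say which movie leaves the rack when it is full.

```python
from collections import deque

def simulate_rack(requests, capacity=5):
    """Count hits and misses for a rack that holds `capacity` movies."""
    rack = deque()  # oldest movie sits on the left
    hits = misses = 0
    for movie in requests:
        if movie in rack:
            hits += 1    # cache hit: the movie is already on the rack
        else:
            misses += 1  # cache miss: the worker walks to the storage room
            if len(rack) == capacity:
                rack.popleft()  # assumed rule: remove the oldest movie
            rack.append(movie)
    return hits, misses

requests = ["A", "B", "A", "C", "A", "B", "D"]
hits, misses = simulate_rack(requests)
hit_rate = hits / (hits + misses)
print(f"hits={hits} misses={misses} hit rate={hit_rate:.0%}")
```

The hit rate here is simply hits divided by total requests, so it depends on both the rack size and how often customers repeat earlier requests.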

Web Caching • Pros • Can reduce Internet bandwidth usage • If a request can be satisfied by the web cache • Reduces the workload of the origin server • By storing previously requested web objects in a web cache • Reduces user access latency • When a cache hit occurs

Web Caching • Cons • Not every web object is cacheable • Websites that generate dynamic data • Content that requires an active connection • https:// • Stale cache • Cached objects that are out of date • Bottleneck at the proxy server (in proxy caching)

Types of web cache • Browser Cache • Proxy Cache • Reverse Proxy Cache

Browser Cache • Cache stored at the client level • Meaning the cache is actually stored on the user’s computer • e.g. Temporary Internet Files [http://www.holgermetzger.de/pdl.html]

Browser Cache • Advantages of Browser Cache • Stored locally • On a cache hit it saves bandwidth • Decrease in access latency • User pattern • The same user has a higher probability of browsing the same websites each day.

Browser Cache • Disadvantages of Browser Cache • Takes up hard drive space • Stale objects • Caching always carries a risk of serving stale objects. • Stored locally • Only serves one computer.

Proxy Cache • Cached objects are stored at a proxy server • The proxy server usually serves more than one user • Acts as a gateway to the Internet for a large company or institution http://www.codeproject.com/KB/web-cache/ExploringCaching/cache_array.jpg

Proxy Cache • Requests are directed to the proxy server instead of the origin server. • On cache hit • The proxy returns the requested object to the user. • On cache miss • The request is forwarded to the origin server.

Proxy Cache • Advantages • Serves more than one client • A cache hit can occur even when different users make the same request. • Gateway • Companies can limit what users can access. • Disadvantages • Serves more than one client • Can be overloaded. • Gateway • When the proxy server is down, all of its users are disconnected from the Internet.

Reverse Proxy Cache • Serves the origin server • Basically a proxy server that sits in front of the origin server. http://odino.org/images/proxy-cache.jpg

Reverse Proxy Cache • When a request is made • It is directed to the reverse proxy cache server • On cache hit • The object is returned to the user • On cache miss • The request is forwarded to the origin server • A copy is stored on the reverse proxy server • A copy is sent back to the user
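The hit and miss handling of a reverse proxy cache can be sketched as a small read-through cache. This is a minimal illustration, not a real proxy: `ReverseProxyCache`, `fake_origin`, and the `origin_requests` counter are hypothetical names, and the origin server is replaced by a plain function.

```python
class ReverseProxyCache:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> body
        self.store = {}            # url -> cached body
        self.origin_requests = 0   # how often the origin had to do work

    def get(self, url):
        if url in self.store:
            return self.store[url]          # cache hit: origin is not contacted
        body = self.fetch_from_origin(url)  # cache miss: forward to the origin
        self.origin_requests += 1
        self.store[url] = body              # keep a copy on the proxy
        return body                         # and send a copy to the user

def fake_origin(url):
    # Hypothetical stand-in for the origin server.
    return f"content of {url}"

proxy = ReverseProxyCache(fake_origin)
first = proxy.get("/logo.png")    # miss, fetched from the origin
second = proxy.get("/logo.png")   # hit, served from the proxy
print(proxy.origin_requests)
```

However many clients ask for the same object, the origin is contacted only once in this sketch, which is the workload reduction the slides describe.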

Reverse Proxy Cache • Advantages • Reduces the workload of the origin server • An object can be requested once, cached on the reverse proxy server, and served to many clients without contacting the origin server again • Static files can be cached • e.g. CSS files, JavaScript files, logos • Allows the origin server to better process dynamic content

Reverse Proxy Cache • Disadvantages • Bottleneck • Many users making requests at the same time • Stale cache/old files • Risk of cache hits on stale objects; cached static files can also be outdated

Web Caching Architecture • Two main web caching architectures • Hierarchical • Distributed • Both use the network shown below [3]

Hierarchical Caching Architecture • There is more than one level of cache between the users and the origin server • Typically employs more than one type of cache • There are parent, child, and sibling relationships between caches.

Hierarchical Caching Architecture • First level of cache – Institutional Network • Second level of cache – Regional Network • Third level of cache – National Network • Parents? Child? Siblings? [3]

Hierarchical Caching Architecture • When a request is made • It is sent to the level one cache • If the level one cache cannot satisfy the request • Then it is forwarded to the level two cache • If the level two cache cannot satisfy the request • Then it is forwarded to the next level. • If the request reaches the last level and still cannot be satisfied, it is forwarded to the origin server

Hierarchical Caching Architecture • Advantages • Multiple levels of cache offer more chances for a cache hit • Leads to decreased access latency • Also reduces the workload on the origin servers • Disadvantages • Every level added to the hierarchy adds delay • On a cache miss there is a slight increase in latency • Higher-level cache servers are expensive
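The level-by-level lookup can be sketched as a loop over the caches, ordered from the institutional level up to the national level. The function `hierarchical_lookup` is a hypothetical name, each cache is modelled as a plain dictionary, and copying the object into every level that missed is an assumption that the slides do not state.

```python
def hierarchical_lookup(url, levels, fetch_from_origin):
    """levels: list of dict caches, ordered from level one upward."""
    for depth, cache in enumerate(levels):
        if url in cache:
            body, source = cache[url], f"level {depth + 1}"
            break
    else:
        # Every level missed, so the request goes to the origin server.
        body, source = fetch_from_origin(url), "origin"
        depth = len(levels)
    # Assumption: each level that missed keeps a copy on the way back.
    for cache in levels[:depth]:
        cache[url] = body
    return body, source

institutional, regional, national = {}, {"/a": "A"}, {}
levels = [institutional, regional, national]
origin = lambda url: f"origin copy of {url}"

r1 = hierarchical_lookup("/a", levels, origin)  # found at the regional level
r2 = hierarchical_lookup("/a", levels, origin)  # now found at level one
r3 = hierarchical_lookup("/b", levels, origin)  # misses everywhere
print(r1, r2, r3)
```

Each extra level is one more lookup before the origin is reached, which is the added delay listed among the disadvantages of this architecture.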

Distributed Caching Architecture • Caches are placed only at the institutional level • Regional and national levels are eliminated • Each institutional cache in the distributed system is a sibling of the others. [3]

Distributed Caching Architecture • What is special about the distributed caching architecture? • Each institutional cache can contact its sibling caches • So each cache can know what is in the other caches • They can receive objects from their siblings

Distributed Caching Architecture • When a request is made • Query-Based Approach – Internet Cache Protocol (ICP) • The request is sent to the configured institutional cache server • On cache miss, the request is broadcast to the institutional cache’s sibling caches. • If a sibling cache contains the requested object, the sibling cache sends the object to the immediate institutional cache. The immediate institutional cache then stores a copy and sends another copy to the client • If no sibling contains the requested object, a timeout occurs. • At that point the immediate institutional cache forwards the request to the origin server.
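The query-based flow can be sketched with in-memory objects. This is a simplified model, not an ICP implementation: `InstitutionalCache` and its `has` method are hypothetical, the sibling query is a direct method call rather than a network message, and the timeout is only represented by falling through the loop.

```python
class InstitutionalCache:
    def __init__(self, name):
        self.name = name
        self.store = {}      # url -> body
        self.siblings = []   # other InstitutionalCache objects

    def has(self, url):
        # Stands in for answering a sibling's query.
        return url in self.store

    def get(self, url, fetch_from_origin):
        if url in self.store:
            return self.store[url], "local"
        for sibling in self.siblings:          # ask every sibling on a miss
            if sibling.has(url):
                body = sibling.store[url]
                self.store[url] = body         # keep a copy locally
                return body, f"sibling {sibling.name}"
        # No sibling has it (a real deployment would wait for a timeout here).
        body = fetch_from_origin(url)
        self.store[url] = body                 # assumed: origin copy is cached
        return body, "origin"

a, b = InstitutionalCache("a"), InstitutionalCache("b")
a.siblings, b.siblings = [b], [a]
b.store["/x"] = "X"

origin = lambda url: f"origin copy of {url}"
from_sibling = a.get("/x", origin)
from_origin = a.get("/y", origin)
print(from_sibling, from_origin)
```

The cost of this approach is that every miss triggers a round of sibling queries before the origin is contacted.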

Distributed Caching Architecture • When a request is made • Directory-Based Approach – Cache Digest (Squid) • In this approach metadata is used. • Each cache is aware of its siblings’ content. • When a request is made, it is sent to the immediate institutional cache. • On cache miss, the institutional cache checks its metadata to see if any of its sibling caches contains the requested object. • If so, it retrieves the object from that sibling. • If not, it forwards the request to the origin server
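The directory-based flow can be sketched by letting each cache hold a summary of what its siblings store, so that no query is sent on a miss. In this sketch the summary is a plain Python set of URLs, which is an assumption chosen for readability; Squid's Cache Digests use a compact Bloom filter instead. `DigestCache`, `digest`, and `learn` are hypothetical names.

```python
class DigestCache:
    def __init__(self, name):
        self.name = name
        self.store = {}            # url -> body
        self.sibling_digests = {}  # sibling -> set of URLs it advertised

    def digest(self):
        # Summary of local content (assumed form: a set of URLs).
        return set(self.store)

    def learn(self, sibling):
        # Periodic exchange of metadata between siblings.
        self.sibling_digests[sibling] = sibling.digest()

    def get(self, url, fetch_from_origin):
        if url in self.store:
            return self.store[url], "local"
        for sibling, advertised in self.sibling_digests.items():
            # The digest may be out of date, so confirm with the sibling.
            if url in advertised and url in sibling.store:
                self.store[url] = sibling.store[url]
                return self.store[url], f"sibling {sibling.name}"
        self.store[url] = fetch_from_origin(url)
        return self.store[url], "origin"

a, b = DigestCache("a"), DigestCache("b")
b.store["/x"] = "X"
a.learn(b)   # a now knows what b holds

origin = lambda url: f"origin copy of {url}"
via_digest = a.get("/x", origin)
via_origin = a.get("/y", origin)
print(via_digest, via_origin)
```

Compared with the query-based approach, a miss costs only a local metadata check, at the price of keeping the digests reasonably fresh.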

Distributed Caching Architecture • Advantages • When sibling cache servers are grouped by common interests • More chance of a cache hit • When sibling cache servers are assigned based on proximity • Faster response time

Distributed Caching Architecture • Disadvantages • When sibling cache servers are grouped by common interests • The servers may be too far apart • Increase in access latency • When sibling cache servers are assigned based on proximity • The servers may not share common interests • Less chance of a cache hit

Web Cache Coherency • Web cache coherency • Is the cache up to date? • Web cache coherency mechanism • Validation check • When a web object is first received • It is time stamped • When the cached object is used, the cache server makes a validation check by sending the time stamp to the origin server
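A validation check can be sketched as follows. The classes `Origin` and `ValidatingCache`, the integer timestamps, and the use of the origin's last-modified time as the stored time stamp are all assumptions for illustration; the idea resembles a conditional HTTP request with `If-Modified-Since`.

```python
class Origin:
    def __init__(self):
        self.objects = {}  # url -> (body, last_modified)

    def publish(self, url, body, now):
        self.objects[url] = (body, now)

    def fetch(self, url):
        return self.objects[url]

    def if_modified_since(self, url, timestamp):
        """Return a fresh (body, last_modified) pair, or None if unchanged."""
        body, last_modified = self.objects[url]
        if last_modified > timestamp:
            return body, last_modified
        return None

class ValidatingCache:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}  # url -> (body, timestamp)

    def get(self, url):
        if url not in self.store:
            self.store[url] = self.origin.fetch(url)  # first receipt
            return self.store[url][0]
        body, timestamp = self.store[url]
        fresh = self.origin.if_modified_since(url, timestamp)
        if fresh is not None:   # origin copy is newer: replace ours
            self.store[url] = fresh
            body = fresh[0]
        return body

origin = Origin()
cache = ValidatingCache(origin)
origin.publish("/page", "v1", now=1)
before = cache.get("/page")
origin.publish("/page", "v2", now=2)
after = cache.get("/page")
print(before, after)
```

The cache never serves a stale object in this sketch, but every use of a cached object still costs one round trip to the origin server.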

Web Cache Coherency • Web cache coherency mechanism • Callback • When a web object is cached, the cache server receives a callback promise for the object from the origin server. • Callback promise – a promise that the origin server will notify the cache server if the object has been updated • So the cached object is up to date as long as the cache server has not received a notification from the origin server

Web Cache Coherency • Web cache coherency mechanism • Expiration • When an object is cached, an expiration date is assigned to it • The object is valid until its expiration date • The first request for the object after its expiration date is sent to the origin server again. • At this time a new expiration date is assigned to the object
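Expiration-based coherency can be sketched as a cache that stores an expiry time next to each object. `ExpiringCache`, the fixed `ttl`, and the injected `clock` function are hypothetical choices; the clock is passed in so that the behaviour can be shown without waiting for real time to pass.

```python
class ExpiringCache:
    def __init__(self, fetch_from_origin, ttl, clock):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> body
        self.ttl = ttl        # lifetime assigned to every cached object
        self.clock = clock    # callable returning the current time
        self.store = {}       # url -> (body, expires_at)

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]                      # still valid: serve from cache
        body = self.fetch_from_origin(url)       # missing or expired: refetch
        self.store[url] = (body, now + self.ttl) # assign a new expiration date
        return body

current_time = [0]
origin_calls = []

def fake_origin(url):
    origin_calls.append(url)
    return f"{url} fetched at t={current_time[0]}"

cache = ExpiringCache(fake_origin, ttl=10, clock=lambda: current_time[0])
first = cache.get("/news")   # t=0, fetched from the origin
current_time[0] = 5
second = cache.get("/news")  # t=5, still valid
current_time[0] = 10
third = cache.get("/news")   # t=10, expired, fetched again
print(first, second, third, len(origin_calls))
```

Between fetches the cache cannot tell whether the origin copy changed, so a shorter lifetime trades more origin traffic for fresher objects.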

Cache Placement and Replacement Policies • How cached objects are replaced • Random • A random cached object is replaced. • Size • The largest cached object is replaced first • FIFO – First In First Out • The oldest cached object is eliminated first
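These three policies differ only in how the victim is chosen, which can be sketched in one helper. `choose_victim`, the `(url, size)` tuples, and the sample entries are hypothetical; the list is assumed to be kept in insertion order, oldest first.

```python
import random

def choose_victim(entries, policy, rng=random):
    """entries: list of (url, size_in_bytes), oldest first."""
    if policy == "random":
        return rng.choice(entries)[0]              # any cached object
    if policy == "size":
        return max(entries, key=lambda e: e[1])[0] # the largest object
    if policy == "fifo":
        return entries[0][0]                       # the oldest object
    raise ValueError(f"unknown policy: {policy}")

entries = [("/a.html", 10), ("/video.mp4", 500), ("/b.css", 20)]
by_size = choose_victim(entries, "size")
by_fifo = choose_victim(entries, "fifo")
by_random = choose_victim(entries, "random")
print(by_size, by_fifo, by_random)
```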

Cache Placement and Replacement Policies • LRU – Least Recently Used • The cached object that has not been requested for the longest time is eliminated first • LRU/MIN – Least Recently Used Minimum • In LRU order, the first document whose size is larger than or equal to the size of the new document is removed • HLRU – History Least Recently Used • Records how many times each cached object is used • Elimination based on • LRU • Least used
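Plain LRU can be sketched with `collections.OrderedDict`, which remembers insertion order and can move a key to the end when it is used. The class name `LRUCache` and the object-count capacity are assumptions; a web cache would more likely limit total bytes.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity     # maximum number of cached objects
        self.items = OrderedDict()   # least recently used entry comes first

    def get(self, key):
        if key not in self.items:
            return None              # cache miss
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("/a", "A")
cache.put("/b", "B")
cache.get("/a")        # "/a" is now more recently used than "/b"
cache.put("/c", "C")   # evicts "/b"
print(list(cache.items))
```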

Cache Placement and Replacement Policies • LFU – Least Frequently Used • Cached objects are sorted by how frequently they are used • On a cache hit, the counter for the hit object is incremented by one. • The list is then re-ordered • The web object with the lowest count is replaced first • LFU – Aging • Same as LFU • The average count of all cached objects is monitored • When the average count reaches a threshold, all counts are reset back to zero
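LFU with aging, as described on this slide, can be sketched with a counter per object. `LFUAgingCache`, `average_threshold`, and the object-count capacity are hypothetical names and choices. Instead of keeping a sorted list, this sketch finds the lowest count at eviction time, which selects the same victim.

```python
class LFUAgingCache:
    def __init__(self, capacity, average_threshold):
        self.capacity = capacity
        self.average_threshold = average_threshold
        self.values = {}  # key -> cached object
        self.counts = {}  # key -> number of hits

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1   # a hit increments the counter by one
        self._age_if_needed()
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # lowest count
            del self.values[victim], self.counts[victim]
        self.values[key] = value
        self.counts.setdefault(key, 0)

    def _age_if_needed(self):
        average = sum(self.counts.values()) / len(self.counts)
        if average >= self.average_threshold:
            for key in self.counts:
                self.counts[key] = 0  # reset all counts, as on the slide

cache = LFUAgingCache(capacity=2, average_threshold=100)
cache.put("/a", "A")
cache.put("/b", "B")
cache.get("/a"); cache.get("/a"); cache.get("/b")
cache.put("/c", "C")   # "/b" has the lowest count and is replaced
print(sorted(cache.values))
```

The reset keeps objects that were popular long ago from holding their place in the cache forever.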

Cache Placement and Replacement Policies • LRV – Lowest Relative Value • Each cached object is assigned a cost value • Objects with the lowest cost value are replaced first • GD – Greedy Dual • Each cached object is assigned a cost value • The lowest-cost object is replaced first • Then every remaining cached object has its cost lowered by the replaced object’s cost • Each time a cached object is accessed, its cost is reset to its original value
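The Greedy Dual steps can be sketched directly from the slide. `GreedyDualCache` is a hypothetical name, the costs are supplied by the caller, and capacity is counted in objects; how costs are derived (for example from fetch latency) is left open here.

```python
class GreedyDualCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}    # key -> cached object
        self.original = {}  # key -> cost assigned when the object was cached
        self.current = {}   # key -> cost after reductions

    def get(self, key):
        if key not in self.values:
            return None
        self.current[key] = self.original[key]  # access restores the cost
        return self.values[key]

    def put(self, key, value, cost):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.current, key=self.current.get)  # lowest cost
            lowest = self.current[victim]
            for table in (self.values, self.original, self.current):
                del table[victim]
            for other in self.current:
                self.current[other] -= lowest  # lower everyone by that cost
        self.values[key] = value
        self.original[key] = cost
        self.current[key] = cost

cache = GreedyDualCache(capacity=2)
cache.put("/a", "A", cost=5)
cache.put("/b", "B", cost=2)
cache.put("/c", "C", cost=4)   # evicts "/b"; "/a" drops from 5 to 3
after_eviction = dict(cache.current)
cache.get("/a")                # "/a" is restored to 5
print(after_eviction, cache.current)
```

Objects that are expensive to refetch survive longer, while the gradual cost reduction lets unused objects eventually be replaced.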

Conclusion • Web caching helps reduce: • Network congestion • User access latency • Workload on the origin server

Questions?

Reference • [1] Barish, G., & Obraczka, K. (2000). World Wide Web caching: Trends and techniques. IEEE Communications Magazine, 38(5), 178-184. doi:10.1109/35.841844. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=841844&isnumber=18201 • [2] Bakiras, S., Loukopoulos, T., Papadias, D., & Ahmad, I. (2005). Adaptive schemes for distributed web caching. Journal of Parallel and Distributed Computing. Retrieved from http://www.cs.ust.hk/~dimitris/PAPERS/JPDC05-DWC.pdf • [3] Biersack, E. W., Rodriguez, P., & Spanner, C. (2001). Analysis of Web caching architectures: Hierarchical and distributed caching. IEEE/ACM Transactions on Networking, 9(4), 404-418. doi:10.1109/90.944339. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=944339&isnumber=20434 • [4] Das, S., Dykes, S. G., & Jeffery, C. L. (1999). Taxonomy and design analysis for distributed Web caching. Proceedings of the 32nd Annual Hawaii International Conference on System Sciences (HICSS-32), 8, 10. doi:10.1109/HICSS.1999.773040. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=773040&isnumber=16788 • [5] Davison, B. D. (2001). A Web caching primer. IEEE Internet Computing, 5(4), 38-45. doi:10.1109/4236.939449. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=939449&isnumber=20329

Reference • [6] Dubois, M., & Jeong, J. (2002, June). Cost-sensitive cache replacement algorithms. In R. Bianchini (Chair), Second Workshop on Caching, Coherence, and Consistency, New York, NY, USA. Retrieved from http://www.research.rutgers.edu/~wc3/papers/dubois.pdf.gz • [7] Geetha, K., Gounden, N. A., & Monikandan, S. (2009). SEMALRU: An implementation of modified web cache replacement algorithm. Nature & Biologically Inspired Computing, 1406-1410. doi:10.1109/NABIC.2009.5393711. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5393711&isnumber=5393306 • [8] Hassanein, H., Liang, Z., & Liang, P. (2002). Performance comparison of alternative Web caching techniques. Proceedings of the Seventh International Symposium on Computers and Communications (ISCC 2002), 213-218. doi:10.1109/ISCC.2002.1021681. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1021681&isnumber=21983 • [9] (n.d.). Reverse Proxy Caching. In Cisco ACNS Caching and Streaming Configuration Guide (5th ed., pp. 6-1). San Jose, CA: Cisco Systems, Inc. doi:OL-4070-01. Retrieved from http://www.cisco.com/en/US/docs/app_ntwk_services/waas/acns/v51/configuration/local/guide/a51cag.pdf • [10] Tay, T. T., & Wijesundara, M. N. (2002). Distributed Web caching. The 8th International Conference on Communication Systems (ICCS 2002), 2, 1142-1146. doi:10.1109/ICCS.2002.1183311. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1183311&isnumber=26554

Reference • [11] Vakali, A. (2000). LRU-based algorithms for web cache replacement. In K. Bauknecht, S. Kumar Madria & G. Pernul (Eds.), Electronic Commerce and Web Technologies, First International Conference (pp. 409-418). Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.5504&rep=rep1&type=pdf
