Web Cache Yee Vang
Introduction • The Internet has many users • Issues with access latency (lag) • Servers crashing • How to solve? • One solution: web caching
Web Cache • What is a web cache? • Cache – “a place of storage” • Web cache – “a place to store websites or web objects”
Web Caching • A technique that can: • Reduce access latency • “the time it takes for a request to be completed” • Reduce network congestion • “occurs when a link or node is carrying so much data that its quality of service deteriorates”
Web Caching • How does it reduce user access latency and network congestion? • No cache example • Movie storage room in the next building • Contains one copy of every movie • One worker, who must walk to the storage room for every request
Web Caching • Cache example • The same as the previous example • Movie storage room in the next building • Contains one copy of every movie • One worker • A movie rack that can hold five movies at a time, to simulate a movie cache.
Web Caching • In the example • Customer -> User • Movie -> Web Pages • Worker -> ISP • Movie Storage Room -> Origin Server • Movie Rack -> Web Cache
Cache Hit/Cache Hit Rate • Cache hit • Occurs when a request can be satisfied by the web cache. • In the movie store example • Hit? • Cache hit rate • The percentage of requests that are satisfied by the web cache
Cache Miss • Cache miss • Occurs when a request cannot be satisfied by the web cache. • In the movie example • Miss?
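The movie rack example can be turned into a small simulation that counts hits and misses and derives the hit rate. This is only a sketch: the slides do not say which movie leaves the rack when it is full, so the code assumes the least recently requested one is removed, and `simulate_rack` and the request list are made-up names and data.

```python
from collections import OrderedDict

def simulate_rack(requests, rack_size=5):
    """Count hits and misses for a rack that holds `rack_size` movies.

    Assumption: when the rack is full, the least recently requested
    movie is taken off to make room (the slides do not specify this).
    """
    rack = OrderedDict()
    hits = misses = 0
    for movie in requests:
        if movie in rack:
            hits += 1                      # movie is on the rack: cache hit
            rack.move_to_end(movie)
        else:
            misses += 1                    # worker walks to the storage room
            rack[movie] = True
            if len(rack) > rack_size:
                rack.popitem(last=False)   # drop the least recently requested
    return hits, misses

requests = ["A", "B", "A", "C", "A", "B", "D"]
hits, misses = simulate_rack(requests)
hit_rate = hits / (hits + misses)
print(f"hits={hits} misses={misses} hit rate={hit_rate:.0%}")
```

With this request sequence, three of the seven requests are served from the rack.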
Web Caching • Pros • Can reduce Internet bandwidth usage • If a request can be satisfied by the web cache • Reduces the workload of the origin server • By storing previously requested web objects in a web cache • Reduces user access latency • When a cache hit occurs
Web Caching • Cons • Not every web object is cacheable • Websites that generate dynamic data • Content that requires an active connection • e.g. https:// • Stale cache • Cached objects that are out of date • Bottleneck at the proxy server (in proxy caching)
Types of web cache • Browser Cache • Proxy Cache • Reverse Proxy Cache
Browser Cache • Cache stored at the client level • Meaning the cache is actually stored on the user’s computer • e.g. Temporary Internet Files [http://www.holgermetzger.de/pdl.html]
Browser Cache • Advantages of Browser Cache • Stored locally • On a cache hit it saves bandwidth • Decrease in access latency • User pattern • The same user has a high probability of browsing the same websites each day.
Browser Cache • Disadvantages of Browser Cache • Takes up hard drive space • Stale objects • Caching always carries the risk of running into a stale object. • Stored locally • Only serves one computer.
Proxy Cache • Cached objects are stored on a proxy server • The proxy server usually serves more than one user • Acts as a gateway to the Internet for a large company or institution http://www.codeproject.com/KB/web-cache/ExploringCaching/cache_array.jpg
Proxy Cache • Requests are directed to the proxy server instead of the origin server. • On cache hit • Returns the requested object to the user. • On cache miss • The request is then forwarded to the origin server.
Proxy Cache • Advantages • Serves more than one client • A cache hit can occur even if different users make the same request. • Gateway • Companies can limit what users can access. • Disadvantages • Serves more than one client • Can be overloaded. • Gateway • When the proxy server is down, all the users are disconnected from the Internet.
Reverse Proxy Cache • Serves the origin server • Basically a proxy server that sits in front of the origin server. http://odino.org/images/proxy-cache.jpg
Reverse Proxy Cache • When a request is made? • Directed to the reverse proxy cache server • On cache hit • Object is returned to user • On cache miss • Request is forwarded to the origin server • A copy is stored on the Reverse proxy server • A copy is sent back to the user
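The hit and miss paths above can be sketched in a few lines. `ReverseProxyCache`, `origin`, and the `/logo.png` path are hypothetical names chosen for illustration; a real reverse proxy would also handle headers, expiry, and concurrency.

```python
class ReverseProxyCache:
    """Minimal sketch of the request flow on this slide (illustrative only)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object
        self._store = {}                  # cached copies, keyed by URL
        self.origin_requests = 0          # how often the origin was contacted

    def get(self, url):
        if url in self._store:            # cache hit: answer from the cache
            return self._store[url]
        self.origin_requests += 1         # cache miss: forward to the origin
        obj = self._fetch(url)
        self._store[url] = obj            # keep a copy on the reverse proxy
        return obj                        # and send a copy back to the user

def origin(url):
    # Stand-in for the real origin server.
    return f"<content of {url}>"

proxy = ReverseProxyCache(origin)
first = proxy.get("/logo.png")    # miss: fetched from the origin
second = proxy.get("/logo.png")   # hit: served from the cache
print(first == second, proxy.origin_requests)
```

The counter shows the point of the design: two client requests reach the origin only once.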
Reverse Proxy Cache • Advantages • Reduces the workload of the origin server • An object can be requested once, cached on the reverse proxy server, and served to many clients without contacting the origin server again • Static files can be cached • e.g. CSS files, JavaScript files, logos • Allows the origin server to better process dynamic content
Reverse Proxy Cache • Disadvantages • Bottleneck • Many users making requests at the same time • Stale cache/old files • Risk of cache hits on stale objects; static files can also be outdated
Web Caching Architecture • Two main web caching architectures • Hierarchical • Distributed • Both use the network shown below [3]
Hierarchical Caching Architecture • There is more than one level of cache between the users and the origin server • Typically employs more than one type of cache • There are parent, child, and sibling relationships between caches.
Hierarchical Caching Architecture • First level of cache – Institutional Network • Second level of cache – Regional Network • Third level of cache – National Network • Parents? Child? Siblings? [3]
Hierarchical Caching Architecture • When a request is made • It is sent to the level one cache • If the level one cache cannot satisfy the request • Then it is forwarded to the level two cache • If the level two cache cannot satisfy the request • Then it is forwarded to the next level. • If the request reaches the last level and still cannot be satisfied, it is forwarded to the origin server
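The level-by-level lookup can be sketched as a loop over the caches. The function name `hierarchical_get` and the dictionaries standing in for cache servers are illustrative. The slides do not say whether lower levels keep a copy on the way back; this sketch assumes they do.

```python
def hierarchical_get(url, levels, origin):
    """Look `url` up level by level, then fall back to the origin.

    levels: list of dicts ordered institutional -> regional -> national.
    Returns (object, index of the level that answered); the index equals
    len(levels) when the origin server had to be contacted.
    """
    for depth, cache in enumerate(levels):
        if url in cache:                 # this level can satisfy the request
            obj, source = cache[url], depth
            break
    else:                                # missed at every level
        obj, source = origin(url), len(levels)
    # Assumption: every level below the one that answered stores a copy.
    for cache in levels[:source]:
        cache[url] = obj
    return obj, source

institutional, regional, national = {}, {"/a": "A"}, {}
levels = [institutional, regional, national]
origin = lambda url: "origin:" + url

r1 = hierarchical_get("/a", levels, origin)   # answered by the regional cache
r2 = hierarchical_get("/a", levels, origin)   # now answered institutionally
r3 = hierarchical_get("/b", levels, origin)   # goes all the way to the origin
print(r1, r2, r3)
```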
Hierarchical Caching Architecture • Advantages • Multiple levels of cache offer more chances for a cache hit • Leads to decreased access latency • Also reduces the workload on the origin servers • Disadvantages • Every level added to the hierarchy adds delay • On a cache miss there is a slight increase in latency • Higher-level cache servers are expensive
Distributed Caching Architecture • Caches are located at the institutional level • The regional and national levels are eliminated • The institutional caches in the distributed system are siblings of each other. [3]
Distributed Caching Architecture • What is special in the distributed caching architecture? • Each institutional cache can contact its sibling caches • So each cache can know what is in the other caches • They can receive objects from their siblings
Distributed Caching Architecture • When a request is made? • Query-Based Approach – Internet Cache Protocol (ICP) • The request is sent to the configured institutional cache server • On a cache miss, the request is broadcast to the institutional cache’s sibling caches. • If a sibling cache contains the requested object, the sibling cache sends the object to the immediate institutional cache. The immediate institutional cache then stores a copy itself and sends the client another copy • If no sibling contains the requested object, a timeout will occur. • At that point the immediate institutional cache forwards the request to the origin server.
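A query-based lookup can be sketched with in-process objects. Nothing here speaks the real ICP wire format: the `has` method merely stands in for a query and reply, the timeout is represented by the loop finishing without an answer, and all class and variable names are made up.

```python
class InstitutionalCache:
    """Sketch of a cache that queries its siblings on a miss (no real networking)."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.siblings = []

    def has(self, url):
        # Stands in for a sibling answering a query message.
        return url in self.store

    def get(self, url, origin):
        if url in self.store:                   # local hit
            return self.store[url], self.name
        for sibling in self.siblings:           # miss: ask every sibling
            if sibling.has(url):
                obj = sibling.store[url]
                self.store[url] = obj           # keep a copy locally
                return obj, sibling.name
        # No sibling answered; a real deployment would wait for a timeout here.
        obj = origin(url)
        self.store[url] = obj
        return obj, "origin"

origin = lambda url: "origin:" + url
a, b = InstitutionalCache("a"), InstitutionalCache("b")
a.siblings, b.siblings = [b], [a]
b.store["/x"] = "X"

q1 = a.get("/x", origin)   # served by sibling b
q2 = a.get("/x", origin)   # served locally from the stored copy
q3 = a.get("/y", origin)   # nobody has it: origin server
print(q1, q2, q3)
```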
Distributed Caching Architecture • When a request is made? • Directory-Based Approach – Cache Digest (Squid) • In this approach metadata is used. • Each cache is aware of its siblings’ content. • When a request is made, it is sent to the immediate institutional cache. • On a cache miss, the institutional cache checks its metadata to see if any of its sibling caches contains the requested object. • If so, it retrieves the object from that sibling • If not, it forwards the request to the origin server
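The directory-based idea can be sketched with a compact summary of each sibling's contents. Squid's Cache Digests are built on Bloom filters; the tiny `Digest` class below is a simplified stand-in with arbitrary parameters, not Squid's actual format, and `lookup` is a hypothetical helper.

```python
import hashlib

class Digest:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.array = 0                      # Python int used as a bit array

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array |= 1 << p

    def might_contain(self, key):
        return all((self.array >> p) & 1 for p in self._positions(key))

def lookup(url, local, sibling_digests, sibling_stores, origin):
    """Check the local cache, then sibling metadata, then the origin."""
    if url in local:
        return local[url], "local"
    for name, digest in sibling_digests.items():
        if digest.might_contain(url):           # metadata says: probably there
            obj = sibling_stores[name].get(url)
            if obj is not None:                 # guard against false positives
                local[url] = obj
                return obj, name
    obj = origin(url)                           # no sibling has it
    local[url] = obj
    return obj, "origin"

sibling_store = {"/x": "X"}
digest = Digest()
for cached_url in sibling_store:
    digest.add(cached_url)

local = {}
origin = lambda url: "origin:" + url
d1 = lookup("/x", local, {"sib": digest}, {"sib": sibling_store}, origin)
d2 = lookup("/x", local, {"sib": digest}, {"sib": sibling_store}, origin)
d3 = lookup("/y", local, {"sib": digest}, {"sib": sibling_store}, origin)
print(d1, d2, d3)
```

Unlike the query-based approach, no message is sent to a sibling unless its digest suggests the object is there.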
Distributed Caching Architecture • Advantages • If sibling cache servers are grouped by common interests • More chance of a cache hit • If sibling cache servers are assigned based on proximity • Faster response time
Distributed Caching Architecture • Disadvantages • If sibling cache servers are grouped by common interests • The servers may be far apart • Increase in access latency • If sibling cache servers are assigned based on proximity • The servers may not share common interests • Less chance of a cache hit
Web Cache Coherency • Web cache coherency • Is the cache up to date? • Web cache coherency mechanism • Validation check • When a web object is first received • It gets time stamped • When the cached object is used, the cache server makes a validation check by sending the time stamp to the origin server • The origin server replies whether the object has changed since that time
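A validation check can be sketched with logical integer timestamps. This is in the spirit of HTTP conditional requests such as `If-Modified-Since`, but the `Origin` and `ValidatingCache` classes are hypothetical, and the sketch assumes the cache stamps each object with the origin's last-modified time.

```python
class Origin:
    """Stand-in origin server that tracks a last-modified time per object."""

    def __init__(self):
        self.objects = {}                  # url -> (body, last_modified)

    def put(self, url, body, now):
        self.objects[url] = (body, now)

    def validate(self, url, cached_timestamp):
        """Return None if unchanged since cached_timestamp, else the new entry."""
        body, modified = self.objects[url]
        if modified <= cached_timestamp:
            return None                    # "not modified"
        return body, modified

class ValidatingCache:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}                    # url -> (body, timestamp)

    def get(self, url):
        if url in self.store:
            body, stamp = self.store[url]
            fresh = self.origin.validate(url, stamp)   # the validation check
            if fresh is None:
                return body                # cached copy is still up to date
            self.store[url] = fresh        # replace the stale copy
            return fresh[0]
        entry = self.origin.objects[url]   # first request: fetch and time stamp
        self.store[url] = entry
        return entry[0]

server = Origin()
server.put("/page", "v1", now=1)
cache = ValidatingCache(server)
v_first = cache.get("/page")     # fetched from the origin
v_second = cache.get("/page")    # validated, unchanged
server.put("/page", "v2", now=2)
v_third = cache.get("/page")     # validation detects the update
print(v_first, v_second, v_third)
```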
Web Cache Coherency • Web cache coherency mechanism • Callback • When a web object is cached, the cache server receives a callback promise for the object from the origin server. • Callback promise – a promise that the origin server will notify the cache server if the object has been updated • So the cached object is up to date as long as the cache server has not received a notification from the origin server
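The callback mechanism can be sketched as an origin server that remembers which caches hold a promise for each object. The class names are hypothetical, and the sketch assumes a notification simply makes the cache discard its copy.

```python
class CallbackOrigin:
    """Origin server that remembers which caches hold a callback promise."""

    def __init__(self):
        self.objects = {}
        self.promises = {}                 # url -> set of caches to notify

    def fetch(self, url, cache):
        self.promises.setdefault(url, set()).add(cache)   # issue the promise
        return self.objects[url]

    def update(self, url, body):
        self.objects[url] = body
        for cache in self.promises.pop(url, set()):       # keep the promise
            cache.invalidate(url)

class CallbackCache:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def invalidate(self, url):
        # Assumption: a notification simply discards the cached copy.
        self.store.pop(url, None)

    def get(self, url):
        if url not in self.store:          # no valid copy: fetch again
            self.store[url] = self.origin.fetch(url, self)
        return self.store[url]             # no notification seen: up to date

server = CallbackOrigin()
server.objects["/page"] = "v1"
cache = CallbackCache(server)
before = cache.get("/page")
server.update("/page", "v2")               # origin notifies the cache
dropped = "/page" not in cache.store
after = cache.get("/page")
print(before, dropped, after)
```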
Web Cache Coherency • Web cache coherency mechanism • Expiration • When an object is cached, an expiration date is assigned to it • The object is valid until its expiration date • The first request for the object after its expiration date is sent to the origin server again. • At this time a new expiration date is assigned to the object
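Expiration can be sketched with a fixed lifetime per object. The `ttl` parameter and the explicit `now` argument are illustrative choices that keep the example deterministic; a real cache would take lifetimes from the server and read the clock itself.

```python
class ExpiringCache:
    """Sketch of expiration-based coherency with a fixed lifetime per object."""

    def __init__(self, fetch, ttl):
        self.fetch, self.ttl = fetch, ttl
        self.store = {}                    # url -> (body, expires_at)
        self.origin_requests = 0

    def get(self, url, now):
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]                # still valid: serve from the cache
        self.origin_requests += 1          # missing or expired: ask the origin
        body = self.fetch(url)
        self.store[url] = (body, now + self.ttl)   # assign a new expiration
        return body

cache = ExpiringCache(fetch=lambda url: "body of " + url, ttl=10)
cache.get("/page", now=0)     # fetched, expires at time 10
cache.get("/page", now=5)     # valid, served from the cache
cache.get("/page", now=10)    # expired, fetched again, expires at time 20
print(cache.origin_requests, cache.store["/page"][1])
```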
Cache Placement and Replacement Policies • How cached objects are replaced • Random • A randomly chosen cached object is replaced. • Size • The largest cached object is replaced first • FIFO – First In First Out • The oldest cached object is eliminated first
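The three policies on this slide differ only in how the victim is chosen, which a single helper can show. `pick_victim` and the sample entries are illustrative names and data.

```python
import random

def pick_victim(entries, policy, rng=random):
    """Choose which cached object to replace.

    entries: list of (key, size) pairs in the order they were cached.
    """
    if policy == "random":
        return rng.choice(entries)[0]                 # any object
    if policy == "size":
        return max(entries, key=lambda e: e[1])[0]    # the largest object
    if policy == "fifo":
        return entries[0][0]                          # the oldest object
    raise ValueError(f"unknown policy: {policy}")

entries = [("a.css", 10), ("video.mp4", 50), ("logo.png", 5)]
by_size = pick_victim(entries, "size")
by_fifo = pick_victim(entries, "fifo")
by_random = pick_victim(entries, "random")
print(by_size, by_fifo, by_random)
```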
Cache Placement and Replacement Policies • LRU – Least Recently Used • The cached object that has not been requested for the longest time is eliminated first • LRU/MIN – Least Recently Used Minimum • The least recently used document whose size is larger than or equal to the size of the new document is removed first • HLRU – History Least Recently Used • Records how many times each cached object is used • Elimination is based on • LRU • Least used
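Plain LRU can be sketched with an ordered dictionary. The capacity is counted in objects rather than bytes to keep the example short, and `LRUCache` is an illustrative name; the LRU/MIN and HLRU variants are not covered.

```python
from collections import OrderedDict

class LRUCache:
    """Least Recently Used replacement, with capacity counted in objects."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()         # least recently used entry first

    def get(self, key):
        if key not in self.items:
            return None                    # cache miss
        self.items.move_to_end(key)        # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False) # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" becomes the most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(list(cache.items))
```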
Cache Placement and Replacement Policies • LFU – Least Frequently Used • Cached objects are sorted by how frequently they are used • On a cache hit, the counter for the hit object is incremented by one. • The list is then re-ordered • The web object with the lowest count is replaced first • LFU – Aging • Same as LFU • The average count of all cached objects is monitored • When the average count reaches a threshold, all counts are reset back to zero
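LFU and its aging variant can be sketched with a per-object counter. The aging rule follows the slide (reset all counts to zero once the average reaches a threshold); `LFUCache` and `aging_threshold` are made-up names, and instead of re-sorting a list the sketch simply picks the minimum count at eviction time.

```python
class LFUCache:
    """Least Frequently Used replacement with optional aging."""

    def __init__(self, capacity, aging_threshold=None):
        self.capacity = capacity
        self.aging_threshold = aging_threshold   # None disables aging
        self.values = {}
        self.counts = {}

    def _age(self):
        if self.aging_threshold is None or not self.counts:
            return
        average = sum(self.counts.values()) / len(self.counts)
        if average >= self.aging_threshold:
            for key in self.counts:              # reset every counter
                self.counts[key] = 0

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1                    # one more hit for this object
        self._age()
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # lowest count
            del self.values[victim], self.counts[victim]
        self.values[key] = value
        self.counts.setdefault(key, 0)

lfu = LFUCache(capacity=2)
lfu.put("a", 1)
lfu.put("b", 2)
lfu.get("a"); lfu.get("a"); lfu.get("b")
lfu.put("c", 3)                  # "b" has the lowest count and is replaced

aged = LFUCache(capacity=2, aging_threshold=2)
aged.put("a", 1)
aged.get("a"); aged.get("a")     # average count reaches 2: counts reset
print(sorted(lfu.values), aged.counts)
```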
Cache Placement and Replacement Policies • LRV – Lowest Relative Value • Each cached object is assigned a cost value • The objects with the lowest cost value are replaced first • GD – Greedy Dual • Each cached object is assigned a cost value • The lowest-cost object is replaced first • Then all remaining cached objects have their cost lowered by the replaced object’s cost • Each time a cached object is accessed, its cost is reset back to its original cost
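The Greedy Dual steps listed on this slide can be followed literally in a short sketch. How the cost of an object is chosen is left to the caller, capacity is counted in objects, and `GreedyDualCache` is an illustrative name.

```python
class GreedyDualCache:
    """Greedy Dual replacement as described on the slide (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.original_cost = {}
        self.current_cost = {}

    def get(self, key):
        if key not in self.values:
            return None
        # On access, the cost is reset back to its original value.
        self.current_cost[key] = self.original_cost[key]
        return self.values[key]

    def put(self, key, value, cost):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.current_cost, key=self.current_cost.get)
            reduction = self.current_cost[victim]
            for table in (self.values, self.original_cost, self.current_cost):
                del table[victim]                 # replace the cheapest object
            for other in self.current_cost:       # lower everyone else's cost
                self.current_cost[other] -= reduction
        self.values[key] = value
        self.original_cost[key] = cost
        self.current_cost[key] = cost

gd = GreedyDualCache(capacity=2)
gd.put("a", "A", cost=5)
gd.put("b", "B", cost=2)
gd.put("c", "C", cost=4)          # "b" is replaced; "a" drops from 5 to 3
cost_after_eviction = gd.current_cost["a"]
gd.get("a")                       # access resets "a" back to 5
cost_after_access = gd.current_cost["a"]
print(sorted(gd.values), cost_after_eviction, cost_after_access)
```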
Conclusion • Web caching helps reduce: • Network congestion • User access latency • Workload on the origin server
Questions?
References • [1] Barish, G., & Obraczka, K. (2000). World Wide Web caching: trends and techniques. Communications Magazine, IEEE, 38(5), 178-184. doi:10.1109/35.841844 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=841844&isnumber=18201 • [2] Bakiras, S., Loukopoulos, T., Papadias, D., & Ahmad, I. (2005). Adaptive schemes for distributed web caching. Journal of Parallel and Distributed Computing. Retrieved from http://www.cs.ust.hk/~dimitris/PAPERS/JPDC05-DWC.pdf • [3] Biersack, E. W., Rodriguez, P., & Spanner, C. (2001). Analysis of Web caching architectures: hierarchical and distributed caching. Networking, IEEE/ACM Transactions on, 9(4), 404-418. doi:10.1109/90.944339 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=944339&isnumber=20434 • [4] Das, S., Dykes, S. G., & Jeffery, C. L. (1999). Taxonomy and design analysis for distributed Web caching. System Sciences, 1999. HICSS-32. Proceedings of the 32nd Annual Hawaii International Conference on, 8, 10. doi:10.1109/HICSS.1999.773040 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=773040&isnumber=16788 • [5] Davison, B. D. (2001). A Web caching primer. Internet Computing, IEEE, 5(4), 38-45. doi:10.1109/4236.939449 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=939449&isnumber=20329
References • [6] Dubois, M., & Jeong, J. (2002, June). In R. Bianchini (Chair). Cost-sensitive cache replacement algorithms. Paper presented at Second workshop on caching, coherence, and consistency, New York, NY, USA. Retrieved from http://www.research.rutgers.edu/~wc3/papers/dubois.pdf.gz • [7] Geetha, K., Gounden, N. A., & Monikandan, S. (2009). SEMALRU: An Implementation of modified web cache replacement algorithm. Nature & Biologically Inspired Computing, 1406-1410. doi:10.1109/NABIC.2009.5393711 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5393711&isnumber=5393306 • [8] Hassanein, H., Liang, Z., & Liang, P. (2002). Performance comparison of alternative Web caching techniques. Computers and Communications, 2002. Proceedings. ISCC 2002. Seventh International Symposium on, 213-218. doi:10.1109/ISCC.2002.1021681 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1021681&isnumber=21983 • [9] (n.d.). Reverse Proxy Caching. In Cisco ACNS Caching and Streaming Configuration Guide. (5th ed.). (pp. 6-1). San Jose, CA: Cisco Systems, Inc. doi:OL-4070-01 Retrieved from http://www.cisco.com/en/US/docs/app_ntwk_services/waas/acns/v51/configuration/local/guide/a51cag.pdf • [10] Tay, T. T., & Wijesundara, M. N. (2002). Distributed Web caching. Communication Systems, 2002. ICCS 2002. The 8th International Conference on, 2(25-28), 1142-1146 vol.2. doi:10.1109/ICCS.2002.1183311 Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1183311&isnumber=26554
References • [11] Vakali, A. (2000). LRU-based algorithms for web cache replacement. In K. Bauknecht, S. Kumar Madria & G. Pernul (Eds.), Electronic Commerce and Web Technologies, First International Conference (pp. 409-418). Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.5504&rep=rep1&type=pdf