
An Analysis of Facebook Photo Caching

An Analysis of Facebook Photo Caching, by Huang et al., SOSP 2013. Presented by Phuong Nguyen. Some animations and figures are borrowed from the original paper and presentation.


Presentation Transcript


  1. An Analysis of Facebook Photo Caching, by Huang et al., SOSP 2013. Presented by Phuong Nguyen. Some animations and figures are borrowed from the original paper and presentation.

  2. Photos on Facebook: Overview • Album • Feed • Profile • 250 billion photos, as of Sep 2013

  3. Photos on Facebook: Overview (diagram labels: Akamai CDN, FB Cache Layers, Storage Backend) • Full-stack Study

  4. FACEBOOK PHOTO CACHING: HOW DOES IT WORK?

  5. Client-based Browser Cache • Diagram: Client with a Browser Cache, served by local fetch

  6. Geo-distributed Edge Cache (FIFO) • Diagram: Browser Cache at clients (millions), then Edge Cache at PoPs (tens)

  7. Single Global Origin Cache (FIFO) • Requests are routed by Hash(url) • Diagram: Browser Cache at clients (millions), then Edge Cache at PoPs (tens), then Origin Cache at data centers (four)
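The Hash(url) label on this slide suggests that each photo URL maps to one Origin Cache location, so the four data centers behave like one global cache. A minimal sketch of that idea follows; the data-center names, the choice of SHA-1, and the function name `origin_for` are illustrative assumptions, not details from the slides.

```python
import hashlib

# Hypothetical labels: the slide only states that there are four data centers.
DATA_CENTERS = ["dc-0", "dc-1", "dc-2", "dc-3"]


def origin_for(url: str) -> str:
    """Map a photo URL to one Origin Cache location via a stable hash."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(DATA_CENTERS)
    return DATA_CENTERS[index]


if __name__ == "__main__":
    # The same URL always lands on the same location, so a photo is cached once.
    print(origin_for("https://example.com/photo/123_n.jpg"))
```

Because the mapping depends only on the URL, any Edge Cache that misses on a photo would contact the same Origin location, which avoids duplicate copies across data centers.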

  8. Haystack Backend • Diagram: Browser Cache at clients (millions), then Edge Cache at PoPs (tens), then Origin Cache at data centers (four), then Backend (Haystack)

  9. FULL-STACK CACHE STUDY: DATA COLLECTION

  10. Trace Collection: Instrumentation Scope (diagram: Browser Cache at clients, Edge Cache at PoPs, Origin Cache at data centers, Backend (Haystack)) • Objective: collect a representative sample that permits correlation of events related to the same request

  11. Sampling Strategies • Request-based: sample requests at random • Biased toward popular content • Object-based: focus on a subset of photos selected by a deterministic test on photoId • Fair coverage of unpopular photos • Enables cross-stack analysis
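The object-based strategy on this slide can be sketched as a hash test on photoId: every layer applies the same test independently, so all events for a sampled photo are kept at every layer. The hash function (SHA-256), the 1-in-100 rate, and the name `is_sampled` are assumptions made for illustration; the slides do not specify the actual test.

```python
import hashlib


def is_sampled(photo_id: str, one_in: int = 100) -> bool:
    """Deterministic test: keep a photo if its hash falls in bucket 0 of `one_in`."""
    digest = hashlib.sha256(photo_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0


if __name__ == "__main__":
    ids = [f"photo-{i}" for i in range(100_000)]
    kept = [p for p in ids if is_sampled(p)]
    # Roughly 1% of photos are kept, independent of how often each is requested.
    print(len(kept))
```

Since the decision depends only on the photo and not on the request, an unpopular photo is as likely to be selected as a popular one, which is the fair-coverage property the slide mentions.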

  12. WORKLOAD ANALYSIS

  13. Analysis Objectives • Traffic sheltering effects of caches • Photo popularity distribution • Geographic traffic distribution & collaborative caching • Can we make the cache better? • Impact of cache sizes & algorithms • Could we know which photos to cache?

  14. ANALYSIS: TRAFFIC SHELTERING

  15. Traffic Sheltering • Browser Cache: 77.2M requests, 65.5% hit ratio, 65.5% traffic share • Edge Cache: 26.6M requests, 58.0% hit ratio, 20.0% traffic share • Origin Cache: 11.2M requests, 31.8% hit ratio, 4.6% traffic share • Backend (Haystack): 7.6M requests, 9.9% traffic share
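The traffic-share figures on this slide can be reproduced from the per-layer hit ratios. The sketch below reads 65.5%, 58.0%, and 31.8% as the Browser, Edge, and Origin hit ratios and 77.2M as the total request count; that reading is an interpretation of the slide's numbers, and the variable names are illustrative.

```python
# Requests entering the browser layer, in millions (from the slide).
TOTAL_REQUESTS_M = 77.2

# Per-layer hit ratios, in the order requests traverse the stack.
HIT_RATIOS = {"browser": 0.655, "edge": 0.580, "origin": 0.318}


def traffic_shares(total, hit_ratios):
    """Return (requests reaching each layer, share of all requests served there)."""
    reaching, share = {}, {}
    incoming = total
    for layer, ratio in hit_ratios.items():
        reaching[layer] = incoming
        hits = incoming * ratio
        share[layer] = hits / total
        incoming -= hits  # misses fall through to the next layer
    # Whatever misses every cache is served by the backend.
    reaching["backend"] = incoming
    share["backend"] = incoming / total
    return reaching, share


if __name__ == "__main__":
    reaching, share = traffic_shares(TOTAL_REQUESTS_M, HIT_RATIOS)
    for layer in reaching:
        print(f"{layer:8s} {reaching[layer]:6.1f}M requests, {share[layer]:.1%} of traffic")
```

Each layer only sees the misses of the layer before it, which is why the Origin Cache serves under 5% of all requests despite a hit ratio above 30%.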

  16. ANALYSIS: PHOTO POPULARITY IMPACT

  17. Popularity Distribution • Skewness is reduced after each layer of cache

  18. Popularity Impact on Caches

  19. ANALYSIS: GEOGRAPHIC TRAFFIC DISTRIBUTION & COLLABORATIVE CACHING

  20. Substantial Remote Traffic at Edge • Map of cities: Miami, Chicago, NYC, Atlanta, LA, Dallas • Locally served shares shown on the map: 35%, 35%, 60%, 18%, 50%, 20%

  21. Substantial Remote Traffic at Edge • Atlanta has 80% of its requests served by remote Edges • Breakdown for Atlanta: 35% D.C., 20% Miami, 20% local, 10% Chicago, 5% NYC, 5% California, 5% Dallas

  22. Collaborative Edge

  23. Impact of Using Collaborative Edge • Collaborative Edge increases hit ratio by 18%

  24. ANALYSIS: IMPACTS OF CACHE SIZE & ALGORITHM

  25. Potential Improvement Study • Methodology: cache simulation • Replay the trace, using the first 25% as warm-up • Evaluate on the remaining 75% • Improvement factors: • Cache size • Caching algorithm • Evaluation metric: hit ratio
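The simulation methodology on this slide can be sketched as a trace replay that warms the cache on the first 25% of requests and measures hit ratio on the rest. The function name `simulate`, the object-count capacity (rather than bytes), and the choice of FIFO and LRU as the two policies are assumptions for illustration; the slides do not list the exact algorithms compared.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy="fifo", warmup_fraction=0.25):
    """Replay `trace` through a cache and return the hit ratio after warm-up.

    capacity is a positive number of objects; policy is "fifo" or "lru".
    """
    cache = OrderedDict()  # oldest entry first
    warmup = int(len(trace) * warmup_fraction)
    hits = measured = 0
    for i, key in enumerate(trace):
        hit = key in cache
        if i >= warmup:  # only count requests after the warm-up window
            measured += 1
            hits += hit
        if hit:
            if policy == "lru":
                cache.move_to_end(key)  # refresh recency; FIFO leaves order alone
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the entry at the front
            cache[key] = True
    return hits / measured if measured else 0.0


if __name__ == "__main__":
    trace = ["a", "b", "a", "c", "a", "b", "a", "c"]
    for policy in ("fifo", "lru"):
        print(policy, simulate(trace, capacity=2, policy=policy))
```

Running the same trace at several capacities and with both policies gives the two axes the slide names as improvement factors: cache size and caching algorithm.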

  26. Edge Cache with Different Sizes & Algorithms • Reference line: Infinite Cache • The same hit ratio can be achieved with a smaller cache and a higher-performing algorithm

  27. Edge Cache with Different Sizes & Algorithms • Reference line: Infinite Cache • A more sophisticated algorithm can achieve a better hit ratio with the same cache size

  28. ANALYSIS: WHICH PHOTOS TO CACHE?

  29. Intuitions • Properties that are intuitively associated with photo traffic: • The age of photos • The number of Facebook followers associated with the owner

  30. Content Age Effect • An age-based cache replacement algorithm could be effective • Fresh content is popular and tends to be effectively cached throughout the hierarchy

  31. Social Effect • The more popular a photo's owner is, the more likely the photo is to be accessed • Browser caches tend to have lower hit ratios for popular users ("viral" effect)

  32. DISCUSSIONS

  33. Discussions • Evaluation method: • Only desktop clients are considered; mobile clients are excluded • Trends by mobility of users • Sampling: object-based sampling might not represent a realistic workload • Impact of caching done by Akamai CDN • The method for correlating requests is not perfect • Latency issue: • Evaluation mainly focuses on hit ratio & traffic sheltering, not latency • Latency of collaborative caching is not evaluated

  34. Discussions (cont.) • Other potential improvements: • Improved caching algorithm that takes photo metadata into account • Optimal placement of resizing functionality along the stack • Clairvoyant caching might be approximated by predicting future accesses • E.g., photos from the same album, photos that appear on news feed, etc. • Address geographical diversity by improving the routing policy (e.g., put more weight on locality)
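Clairvoyant caching is usually understood as Belady's policy: on a miss with a full cache, evict the object whose next access lies furthest in the future. The sketch below computes its hit ratio on a fully known trace, which gives an upper bound for a given cache size; a deployed system would have to replace the exact next-access lookup with a prediction, as the slide suggests. The function name and the object-count capacity are illustrative assumptions.

```python
def clairvoyant_hit_ratio(trace, capacity):
    """Hit ratio of Belady's policy on a known trace; capacity is a positive object count."""
    # next_use[i] = index of the next request for trace[i], or infinity if none.
    next_use = [0.0] * len(trace)
    upcoming = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = upcoming.get(trace[i], float("inf"))
        upcoming[trace[i]] = i

    cache = {}  # cached key -> index of its next request
    hits = 0
    for i, key in enumerate(trace):
        if key in cache:
            hits += 1
        elif len(cache) >= capacity:
            # Evict the cached object that will be needed furthest in the future.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[key] = next_use[i]
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    trace = ["a", "b", "a", "c", "a", "b", "a", "c"]
    print(clairvoyant_hit_ratio(trace, capacity=2))
```

Comparing this bound with the hit ratio of a practical policy at the same capacity shows how much room a prediction-based algorithm has to improve.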

  35. THANK YOU!
