
Web Search for a Planet: The Google Cluster Architecture

Web Search for a Planet: The Google Cluster Architecture. Eugenio De Hoyos, 6175 Computer Science Seminar, October 4, 2011. Introduction: “… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles…”


Presentation Transcript


  1. Web Search for a Planet: The Google Cluster Architecture. Eugenio De Hoyos, 6175 Computer Science Seminar, October 4, 2011

  2. introduction

  3. introduction “… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles…” IO: 500 MB @ 20 MB/s → 25 sec; CPU: 10 × 10^9 cycles @ 2 GHz → 5 sec

  4. introduction “… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles…” IO: 500 MB @ 20 MB/s → 25 sec; CPU: 10 × 10^9 cycles @ 2 GHz → 5 sec
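The single-machine estimate on the introduction slides can be checked with a few lines of arithmetic. The sketch below reproduces the two figures from the slide; the helper names and the `parallel_seconds` function are illustrative assumptions, and the even split across machines is an idealization that is not stated on the slide.

```python
# Back-of-the-envelope check of the slide's single-machine estimate.
MB = 10**6  # bytes per megabyte (decimal, as on the slide)


def io_seconds(bytes_read, bytes_per_second):
    """Time to read the data at a given sustained disk bandwidth."""
    return bytes_read / bytes_per_second


def cpu_seconds(cycles, hertz):
    """Time to execute a number of cycles at a given clock rate."""
    return cycles / hertz


def parallel_seconds(serial_seconds, machines):
    """Hypothetical: assumes the work divides perfectly evenly."""
    return serial_seconds / machines


io_time = io_seconds(500 * MB, 20 * MB)           # 500 MB at 20 MB/s
cpu_time = cpu_seconds(10 * 10**9, 2 * 10**9)     # 10 x 10^9 cycles at 2 GHz

print(f"IO:  {io_time:.0f} s")
print(f"CPU: {cpu_time:.0f} s")
print(f"Split over 1000 machines: {parallel_seconds(io_time + cpu_time, 1000):.3f} s")
```

On one machine the query would take tens of seconds, which is the motivation for spreading each query over many machines.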

  5. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  6. a single query http://www.googlefalle.com

  7. a single query [Diagram: a Hardware Load Balancer in front of several Google Web Servers]

  8. [Diagram: Google Web Servers, with steps numbered 1 to 4, in front of Index Servers and Document Servers; each group is split into Shards, and each Shard is backed by several PCs]
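Going by the labels on this diagram, a web server fans a query out to every index shard, merges the partial results, and then asks the document servers for the matching documents. The following is a minimal sketch of that flow under stated assumptions: the toy data, the scores, `REPLICAS_PER_SHARD`, and all function names are hypothetical, and in-process dictionaries stand in for remote machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor

REPLICAS_PER_SHARD = 4  # assumption: several identical PCs serve each shard

# Hypothetical toy index: each shard maps a word to (doc_id, score) pairs.
INDEX_SHARDS = [
    {"cluster": [(1, 0.9), (4, 0.2)], "google": [(1, 0.5)]},
    {"cluster": [(7, 0.7)], "planet": [(8, 0.4)]},
]

# Hypothetical toy document store: each shard maps doc_id to a title.
DOC_SHARDS = [
    {1: "Web Search for a Planet", 4: "Cluster notes"},
    {7: "Commodity clusters", 8: "Planet-scale search"},
]


def search_shard(shard_id, word):
    """Ask one replica of an index shard for its matches."""
    replicas = [INDEX_SHARDS[shard_id]] * REPLICAS_PER_SHARD
    replica = random.choice(replicas)  # any replica can answer
    return replica.get(word, [])


def lookup_title(doc_id):
    """Find the document shard that holds doc_id and return its title."""
    for shard in DOC_SHARDS:
        if doc_id in shard:
            return shard[doc_id]
    return None


def handle_query(word, top_k=3):
    """Fan out to all index shards in parallel, merge, then fetch titles."""
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda s: search_shard(s, word), range(len(INDEX_SHARDS)))
        )
    hits = sorted(
        (hit for part in partials for hit in part),
        key=lambda h: h[1],
        reverse=True,
    )
    return [(doc_id, lookup_title(doc_id)) for doc_id, _ in hits[:top_k]]


print(handle_query("cluster"))
```

Because every replica of a shard holds the same data, the choice of replica does not change the answer; it only spreads load and tolerates the loss of individual PCs.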

  9. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  10. philosophy Service C Service B Service A

  11. philosophy

  12. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  13. the power problem RAM/BOARD HD

  14. A Google data center, circa 2000. Note the fan on the floor to cool servers. (Credit: Stephen Shankland-CNET News.com/Jeff Dean-Google)

  15. their observation Equipment Cost Power & Cooling Scale

  16. are their numbers right? Min. Amortization Requires $ 1,500 Operating Costs Min. Cost Requires $ 20,000 Amortization Cost of inefficiency

  17. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  18. hardware index server RAM CPU Hard Drive

  19. hardware [Diagram: instructions 0 to 9 advancing one clock cycle at a time through a short pipeline (Pentium III) and a long pipeline (Pentium IV)]

  20. hardware [Diagram: instructions 0 to 9 advancing one clock cycle at a time through a short pipeline (Pentium III) and a long pipeline (Pentium IV)]

  21. hardware: instruction level parallelism; thread level parallelism [Diagram: numbered instructions 1 to 5 under each form of parallelism]

  22. hardware: simultaneous multithreading (SMT) [Diagram: one CPU with L1 and L2 caches running several instruction streams]

  23. hardware: chip multiprocessor (CMP) [Diagram: two CPUs, each with an L1 cache, and one L2 cache]

  24. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  25. memory & scalability: Unpredictable memory access; large cache lines and prefetch help; memory bandwidth OK [Diagram labels: RAM, line length, Cache, CPU, cache length]

  26. outline: A Single Query; Philosophy; Power; Index Hardware; Index Memory; Conclusion

  27. conclusion: Cluster architecture is ideal and least expensive; Maximize throughput; Software Reliability

  28. conclusion Service C Service B Service A

  29. a discussion question… HDMI Monitor; USB Keyboard; 700 MHz ARM11; 128 MB RAM; OpenGL ES 2.0; 1080p -- David Braben, UK game developer

  30. questions?
