
eScience Open Mic: Cloud Computing

eScience Open Mic: Cloud Computing. Bill Howe, PhD, eScience Institute, UW. http://escience.washington.edu. eScience is about data. Old model: “Query the world” (data acquisition coupled to a specific hypothesis)



Presentation Transcript


  1. eScience Open Mic: Cloud Computing. Bill Howe, PhD, eScience Institute, UW

  2. http://escience.washington.edu

  3. eScience is about data
     Old model: “Query the world” (data acquisition coupled to a specific hypothesis)
     New model: “Download the world” (data acquisition supports many hypotheses)
     • Astronomy: high-resolution, high-frequency sky surveys (SDSS, LSST, PanSTARRS)
     • Biology: lab automation, high-throughput sequencing
     • Oceanography: high-resolution models, cheap sensors, satellites
     Slide callouts: 40TB / 2 nights; ~1TB / day; 100s of devices

  4. eScience is married to the Cloud: scalable computing and storage for everyone

  5. Generator [Slide source: Werner Vogels]

  6. “... computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry.” -- John McCarthy, 1961 (Emeritus at Stanford, inventor of LISP)

  7. Economies of Scale (src: Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, 2009)

  8. Economies of Scale (src: James Hamilton, Amazon.com)

  9. Elasticity: provisioning for peak load (src: Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, 2009)

  10. Elasticity: underprovisioning (src: Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, 2009)

  11. Elasticity: underprovisioning, more realistic (src: Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, 2009)
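
The three elasticity slides contrast a fixed amount of capacity with a load that varies over time. A minimal sketch of that arithmetic, assuming a made-up hourly demand curve (the numbers are illustrative and not taken from Armbrust et al.) and a hypothetical helper named provisioning_report:

```python
def provisioning_report(demand, capacity):
    """Compare a fixed capacity against a demand curve (same units, e.g. servers)."""
    served = [min(d, capacity) for d in demand]
    unused = sum(capacity - s for s in served)            # paid for but idle
    unmet = sum(d - s for d, s in zip(demand, served))    # demand turned away
    utilization = sum(served) / (capacity * len(demand))
    return {"utilization": utilization, "unused": unused, "unmet": unmet}


# Hypothetical demand: servers needed in each of ten hours (assumed values).
demand = [2, 2, 3, 5, 8, 10, 8, 5, 3, 2]

peak = provisioning_report(demand, capacity=max(demand))  # provision for peak load
under = provisioning_report(demand, capacity=5)           # underprovision
elastic_server_hours = sum(demand)                        # pay only for what is used

print("peak :", peak)
print("under:", under)
print("elastic server-hours:", elastic_server_hours)
```

With this assumed curve, provisioning for peak leaves about half of the paid-for capacity idle, while capping capacity at 5 turns away 11 server-hours of demand. Elastic capacity would track the 48 server-hours actually used.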

  12. Animoto [Werner Vogels, Amazon.com]

  13. Periodic [Deepak Singh, Amazon.com]

  14. Growth

  15. Growth

  16. Amazon [Werner Vogels, Amazon.com]

  17. [Werner Vogels, Amazon.com]

  18. History

  19. Timeline (figure): 2000, 2001, 2004, 2005+, 2006, 2008, 2009; Application Service Providers

  20. Exemplars • Software as a Service • Platform as a Service • Infrastructure as a Service

  21. Grid Computing: Grid vs. Cloud (Foster 2002)
     • WAN vs. centralized
     • Heterogeneous vs. data center
     • Physical vs. virtualized
     • Fewer, larger, dedicated allocations vs. more, smaller, shared allocations

  22. Cloud Services (figure): Infrastructure-aaS, Platform-aaS, Software-aaS; labels “Constrained” and “Automation”; services shown: Windows Azure, Google App Engine, Google Docs, EC2, Force.com, SQL Azure, SalesForce.com, S3, Elastic MapReduce

  23. Microsoft Azure

  24. Azure • FC Owns this Hardware • History • Highly-available Fabric Controller (FC) [Roger Barga, Microsoft]

  25. [Roger Barga, Microsoft]

  26. [Roger Barga, Microsoft]

  27. [Roger Barga, Microsoft]

  28. [Roger Barga, Microsoft]

  29. [Roger Barga, Microsoft]
     At minimum:
     • CPU: 1.5-1.7 GHz x64
     • Memory: 1.7GB
     • Network: 100+ Mbps
     • Local storage: 500GB
     Up to:
     • CPU: 8 cores
     • Memory: 14.2GB
     • Local storage: 2+ TB

  30. (Diagram) Web Role: HTTP through the Load Balancer to IIS (ASP.NET, WCF, etc.). Worker Role: main() { … }. Each role has an Agent and runs in a VM on the Fabric. [Roger Barga, Microsoft]

  31. (Diagram) Application, Compute, Storage, Fabric; storage services: Blobs, Drives, Tables, Queues; HTTP … [Roger Barga, Microsoft]
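
Slides 30 and 31 name a Web Role, a Worker Role with a main() loop, and Queues among the storage services. One common way to wire these together, assumed here rather than stated on the slides, is for the web role to enqueue work and return quickly while the worker role drains the queue. The sketch below imitates that pattern with Python's standard-library queue and threading modules; it does not use any Azure SDK, and web_role, worker_role, jobs, and results are hypothetical names.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a cloud storage queue (assumption)
results = {}           # stand-in for table or blob storage (assumption)


def web_role(request_id, payload):
    """Front end: accept a request, enqueue it, and return without doing heavy work."""
    jobs.put((request_id, payload))
    return "accepted"


def worker_role():
    """Back end: loop forever, pulling jobs off the queue and storing results."""
    while True:
        item = jobs.get()
        if item is None:                        # sentinel value: shut down
            jobs.task_done()
            break
        request_id, payload = item
        results[request_id] = payload.upper()   # placeholder for real work
        jobs.task_done()


worker = threading.Thread(target=worker_role)
worker.start()
for i, text in enumerate(["alpha", "beta"]):
    web_role(i, text)
jobs.put(None)   # ask the worker to stop once the queue is drained
worker.join()
print(results)
```

The point of the split is that front-end and back-end instances can be scaled independently, since they share only the queue.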

  32. AzureScope • http://azurescope.cloudapp.net/ • Performance measurements [Roger Barga, Microsoft]

  33. My 2 Favorite Use Cases

  34. Use Case 1: “Google Docs for developers”
     • The cloud is the ultimate collaborative development environment
     • A shared environment outside the jurisdiction of over-protective (or otherwise non-responsive) sysadmins
     • No bugs closed as “can’t replicate”
     • Example: new software for serving oceanographic model results, requiring collaboration between UW, OPeNDAP.org, and OOI

  35. • Waited two weeks for credentials to be established
     • Gave up, spun up an EC2 instance, and was rolling within an hour
     Similarly, Seattle’s Institute for Systems Biology uses EC2/S3 for sharing computational pipelines

  36. Use Case 2: Reproducible Research
     • Protocols, assays, experiments, and workflows are increasingly computational
     • Paradoxically, these activities are often harder to reproduce than “manual” protocols
     • Why?

  37. Software dependencies (diagram): Python2.5, MATLAB, Proj4, PostGIS, Java 1.5, EJB, PostgreSQL, SAX, SOAP Libs, config, XML-RPC Libs, TomCat, S3/EC2, Apache, SQL Server Data Services, mod_python, config, VTK, security, Google App Engine, OpenGL, Mesa, account management, 3D Drivers

  38. Division of Responsibility
     Q: Where should we place the division of responsibility between developers and users?
     Need to consider skillsets:
     • Can they install packages?
     • Can they compile code?
     • Can they write DDL statements?
     • Can they configure a web server?
     • Can they troubleshoot network problems?
     • Can they troubleshoot permissions problems?
     Frequently the answer is “No”
     Plus: tech support is hard. Usually easier to “fix it yourself.”

  39. Division of Responsibility: Is there anything busy users are willing to do?

  40. Example in the classroom
     • Dr. Randy LeVeque, AMATH 574, Winter 2009
     • Virtual machines with Clawpack software pre-installed, along with data, models, and analysis tools
     • See a How To at http://escience.washington.edu/ (search for “virtual machine”)
     • (or go here: http://bit.ly/eMOcle )

  41. Use Case 3: Data Sharing
     • The days of FTP are over
     • It takes days to transfer 1TB over the Internet, and it isn’t likely to succeed
     • Need to push the computation to the data, rather than push the data to the computation
     • Cloud is perfect
     • Globally shared storage
     • Equipped with arbitrary, on-demand computation by anyone
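
The claim that 1TB takes days to move can be checked by dividing size by bandwidth. A minimal sketch, assuming a decimal terabyte and sustained end-to-end rates of 10, 20, and 100 Mbit/s (illustrative values, not figures from the talk); transfer_days is a hypothetical helper:

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # bytes in a decimal terabyte (assumption)


def transfer_days(size_bytes, bandwidth_bits_per_s, efficiency=1.0):
    """Days needed to move size_bytes at a sustained rate, ignoring retries."""
    seconds = size_bytes * 8 / (bandwidth_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


# Assumed sustained rates in Mbit/s; real wide-area throughput varies widely.
for mbps in (10, 20, 100):
    days = transfer_days(TB, mbps * 10**6)
    print(f"{mbps:>4} Mbit/s: {days:.1f} days")
```

At a sustained 10 Mbit/s the transfer takes roughly nine days, and any interruption or protocol overhead only adds to that, which is the case for moving the computation rather than the data.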

  43. Case Studies

  46. FoldIt
     • Database, fileserver, multiple webservers
     • < $30k for a 3-year term
     • Database replicated in multiple zones
     • Web servers scale automatically with usage
     • Includes 1TB of storage

  48. Many more
     • Computational Fluid Dynamics
     • Astronomy
     • GPGPUs
     • HIPAA-protected applications
     • National Security applications
     • It’s Mainstream!
