
Cloud Computing: What, why, how?

Presentation Transcript


  1. Cloud Computing: What, why, how? Noam Bercovici, Renata Dividino

  2. Motivation • Count how frequently each word appears in the MEDLINE corpus (18 million texts)

  3. Motivation • I want to extend my research to another corpus • Need more computing resources

  4. Agenda • Introduction • Data Grid vs. Computing Grid • Grid Computing • Cloud Computing • Data Grid (Hadoop File System) • Computing Grid (MapReduce) • Conclusion

  5. Data Grid vs. Computing Grid Grid Computing • Data Grid: • distributed data storage • controlled sharing and management of large amounts of distributed data. • Computing Grid: • Parallel execution • divide pieces of a program among several computers • Data Grid + Computing Grid

  6. Grid Computing [Diagram of the grid; labels: Master, Task, Slaves, The Grid]

  7. Grid Computing • Motivation: high performance, improved resource utilization • Aims to create the illusion of a simple yet powerful computer out of a large number of heterogeneous systems • Tasks are submitted and distributed to nodes in the grid
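
The master/slave task distribution described on slides 6 and 7 can be sketched roughly as follows. This is a minimal illustration, not a grid framework: local threads stand in for grid nodes, and the names `run_on_grid` and `square` are hypothetical, introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_grid(task, work_items, num_workers=4):
    """Master: hand each work item to a pool of workers, collect results in input order.

    Assumption: worker threads play the role of slave nodes; a real grid
    would dispatch tasks to separate machines over the network.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(task, work_items))


def square(x):
    # Hypothetical task; stands in for any independent unit of work.
    return x * x


if __name__ == "__main__":
    print(run_on_grid(square, range(5)))
```

The caller sees a single function call, which mirrors the "illusion of a simple computer" idea: how the work is spread across workers stays hidden behind `run_on_grid`.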

  8. Cloud Computing • “The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do.” • Larry Ellison, during Oracle’s Analyst Day

  9. Cloud Computing • Pay-as-you-go • No initial investments • Reduced operation costs • Scalability • Availability

  10. Grid vs. Cloud Computing

  11. Cloud Computing - Open Issues • Bandwidth and latency • Lack of standards and portability • “Black-box” implementations • Security and lack of control • Immature tools and framework support • Legal issues (ownership, auditing, etc.) • Limited Service Level Agreements (SLAs)

  12. Data Grid vs. Computing Grid Grid Computing • Data Grid: • distributed data storage • controlled sharing and management of large amounts of distributed data. • Computing Grid: • Parallel execution • divide pieces of a program among several computers • Data Grid + Computing Grid

  13. Data Grid (Hadoop FS - Overview) • Caching of Data Index [Diagram labels: Namenode (master node); Metadata (Name, .., ..); Client; Ask specific text …; Block ops; Datanodes (Slave node); Replication]

  14. Data Grid (HDFS - Replication Data)
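
As a rough illustration of the namenode/datanode split and block replication from slides 13 and 14, the toy model below cuts a file into blocks and records which datanodes hold each replica. It is a sketch under stated assumptions: the round-robin placement, the function names, and the datanode names are hypothetical and do not reproduce HDFS's actual placement policy or API.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, datanodes, replication=3):
    """Toy namenode metadata: map each block index to the datanodes holding a copy.

    Assumption: simple round-robin placement, chosen only so that each
    block lands on `replication` distinct datanodes.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many datanodes as replicas")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", block_size=4)
    metadata = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
    for index, nodes in metadata.items():
        print(index, blocks[index], nodes)
```

In this model the metadata dictionary plays the namenode's role (it knows where blocks live but stores no data), while the block contents would sit on the datanodes. Losing any single datanode still leaves two copies of every block.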

  15. Counting Words in Text Files [Diagram: Split-Operation, then Map-Operation (countWords(File) applied to each split), then Reduce-Operation; example result: w1: 6, w2: 14, w3: 15, w4: 17, w5: 1]
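
The split, map, and reduce steps for word counting on slide 15 can be sketched in plain Python. This is an in-process illustration of the pattern, not Hadoop code: `count_words` loosely mirrors the slide's countWords(File), while `split`, `merge_counts`, and `word_count` are hypothetical helper names, and whitespace tokenization with lowercasing is an assumption.

```python
from collections import Counter
from functools import reduce


def split(texts, num_splits):
    """Split operation: partition the corpus into num_splits chunks."""
    return [texts[i::num_splits] for i in range(num_splits)]


def count_words(chunk):
    """Map operation: count the words within one chunk of texts."""
    counts = Counter()
    for text in chunk:
        # Assumption: lowercase, whitespace-separated tokens.
        counts.update(text.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce operation: add up two partial word counts."""
    return left + right


def word_count(texts, num_splits=2):
    partial_counts = [count_words(chunk) for chunk in split(texts, num_splits)]
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    corpus = ["the cell divides", "the protein binds the cell"]
    print(word_count(corpus).most_common(2))
```

Because each chunk is counted independently, the map step could run on different machines; only the small partial counts need to be brought together in the reduce step.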

  16. Advantages of Hadoop • Written purely in Java; requires installation of Cygwin under Windows • Available under LGPL and Apache 2.0 license • Usually offers only one implementation for the different features of a grid framework • May also use file systems other than Hadoop FS • Very flexible implementation of MapReduce • For the split operation, only supports FileSplit out of the box • Better suited for computations where … • … large data collections should be handled • … the reduce operation is more than a simple aggregation of the map's output

  17. Thank you! • Questions?
