
MapReduce: Simplified Data Processing on Large Clusters


Presentation Transcript


  1. MapReduce: Simplified Data Processing on Large Clusters Authors: Jeffrey Dean and Sanjay Ghemawat Presenter: Guangdong Liu Jan 28th, 2011

  2. Presentation Outline • Motivation • Goal • Programming Model • Implementation • Refinement

  3. Motivation • Large-scale data processing • Many data-intensive applications process huge amounts of raw data and produce large amounts of derived data • Such applications share certain common themes • Hundreds or thousands of machines are used • Two basic operations on the input data: 1) Map(): process a key/value pair to generate a set of intermediate key/value pairs 2) Reduce(): merge all intermediate values associated with the same key

  4. Goal • MapReduce: an abstraction that lets users perform simple computations across large data sets distributed over large clusters of commodity PCs, while hiding the details of parallelization, data distribution, load balancing and fault tolerance • User-defined functions • Automatic parallelization and distribution • Fault tolerance • I/O scheduling • Status monitoring

  5. Programming Model • Inspired by the Lisp primitives map and reduce • Map(key, val) • Written by the user • Processes a key/value pair to generate intermediate key/value pairs • The MapReduce library groups all intermediate values associated with the same key together and passes them to the reduce function • Reduce(key, vals) • Also written by the user • Merges all intermediate values associated with the same key
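A minimal, single-process sketch of this model (illustrative only, not the paper's C++ library; the function name run_mapreduce is made up here):

```python
from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy illustration of the programming model on one machine."""
    # Map phase: each input (key, value) pair yields intermediate (key, value) pairs.
    intermediate = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            intermediate[ikey].append(ivalue)
    # The library groups all values with the same intermediate key,
    # then hands each group to the user's reduce function.
    return {ikey: reduce_fn(ikey, ivalues) for ikey, ivalues in intermediate.items()}
```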

  6. Programming Model

  7. Programming Model • Example: count word occurrences in a collection of documents • Input consists of (doc_url, doc_contents) pairs • Map(key=doc_url, val=doc_contents): for each word w in doc_contents, emit (w, “1”) • Reduce(key=word, vals=counts_list): sum all the “1”s in the value list and emit (word, sum)
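The word-count example from the slide, sketched in Python (it emits the integer 1 instead of the string "1" so that sum() applies directly, and strips punctuation for a cleaner count):

```python
import string

def word_count_map(doc_url, doc_contents):
    # Emit (word, 1) once per word occurrence.
    for token in doc_contents.split():
        word = token.strip(string.punctuation)
        if word:
            yield word, 1

def word_count_reduce(word, counts):
    # Sum the partial counts for one word.
    return sum(counts)

# Reusing run_mapreduce from the sketch above:
docs = [("doc1", "Hello World, Bye World!"),
        ("doc2", "Welcome to UNL, Goodbye to UNL.")]
print(run_mapreduce(docs, word_count_map, word_count_reduce))
# {'Hello': 1, 'World': 2, 'Bye': 1, 'Welcome': 1, 'to': 2, 'UNL': 2, 'Goodbye': 1}
```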

  8. Programming Model • Word-count example (figure): three input records are read from the DFS, each is processed by a map task (M1–M3), the intermediate (word, 1) pairs are partitioned and merged by two reduce tasks (R1, R2), and the final counts are written back to the DFS • Inputs: “Hello World, Bye World!”, “Welcome to UNL, Goodbye to UNL.”, “Hello MapReduce, Goodbye to MapReduce.” • Final output: (Hello, 2) (Bye, 1) (Welcome, 1) (to, 3) from R1; (World, 2) (UNL, 2) (Goodbye, 2) (MapReduce, 2) from R2

  9. Implementation • User to-do list • Specify input and output files • M: number of map tasks • R: number of reduce tasks • W: number of machines • Write the map and reduce functions • Submit the job • This requires no knowledge of parallel/distributed systems!
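A hypothetical job specification that captures this to-do list; the field names and the gfs:// paths are illustrative, not the real API, while the M/R/machine counts are the example sizes quoted in the paper (M = 200,000, R = 5,000, 2,000 workers). The map and reduce functions are the word-count ones from the earlier sketch.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class MapReduceSpec:
    input_files: List[str]   # where to read input from
    output_dir: str          # where the final output files go
    M: int                   # number of map tasks
    R: int                   # number of reduce tasks
    machines: int            # number of worker machines (W)
    map_fn: Callable         # user-defined map(key, value) -> iterable of (k, v)
    reduce_fn: Callable      # user-defined reduce(key, values) -> result

spec = MapReduceSpec(["gfs://docs/part-*"], "gfs://out/", M=200000, R=5000,
                     machines=2000, map_fn=word_count_map, reduce_fn=word_count_reduce)
```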

  10. Implementation • Execution overview (figure): input blocks (B1…Bn) are read from the DFS and assigned by the master to map tasks (M1…Mn); each mapper writes its intermediate output to local disk, partitioned into R regions (P1…Pr); reduce tasks (R1…Rr) remotely read the partitions assigned to them and write the final output files (Output 1…Output r) back to the DFS

  11. Implementation 1. Input files are split into M pieces • Each piece is typically 16–64 MB 2. Start up many copies of the user program on a cluster of machines • Master & workers • One special copy becomes the master • Workers are assigned tasks by the master • There are M map tasks and R reduce tasks to assign • The master finds idle workers and assigns a map or reduce task to each
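A small sketch of step 1, choosing M by cutting the input into fixed-size splits (64 MB is just an example size within the 16–64 MB range mentioned above; the function and argument names are made up):

```python
import math

SPLIT_BYTES = 64 * 1024 * 1024   # example split size

def make_splits(file_sizes):
    """Return (file, offset, length) splits of at most SPLIT_BYTES each."""
    splits = []
    for name, size in file_sizes.items():
        for i in range(math.ceil(size / SPLIT_BYTES)):
            offset = i * SPLIT_BYTES
            splits.append((name, offset, min(SPLIT_BYTES, size - offset)))
    return splits   # M = len(splits) map tasks
```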

  12. Implementation 3. Map tasks • A map worker reads the contents of its input split • Performs the user-defined map computation to create intermediate <key, value> pairs • The intermediate pairs produced by the map function are buffered in memory 4. Intermediate data written to local disk (R regions) • Buffered pairs are written to local disk periodically • Partitioned into R regions by a partitioning function • The locations of these buffered pairs on the local disk are passed back to the master
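The paper's default partitioning function is hash(key) mod R. A sketch of how a map worker might bucket its buffered output into R regions (crc32 stands in for a deterministic hash so that every mapper sends a given key to the same region):

```python
import zlib

def partition(key, R):
    # Deterministic hash(key) mod R.
    return zlib.crc32(key.encode("utf-8")) % R

def spill_buffer(buffered_pairs, R):
    """Bucket in-memory intermediate pairs into R regions. In the real system each
    region is written to a local file whose location is reported to the master."""
    regions = [[] for _ in range(R)]
    for key, value in buffered_pairs:
        regions[partition(key, R)].append((key, value))
    return regions
```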

  13. Implementation 5. Read & sort • Reduce workers use remote procedure calls to read the buffered data from the local disks of the map workers • Sort the intermediate data by intermediate key 6. Reduce tasks • The reduce worker iterates over the sorted intermediate data • For each unique key encountered, the key and the corresponding set of values are passed to the user's reduce function • The output of the reduce function is appended to an output file on the global file system • When all tasks have completed, the master wakes up the user program
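A sketch of one reduce task under these steps; the fetch callables stand in for the remote procedure calls that read one region from each map worker, and the local file write stands in for the global file system output:

```python
from itertools import groupby
from operator import itemgetter

def run_reduce_task(region_fetchers, reduce_fn, output_path):
    """Fetch this task's region from every mapper, sort by key, and apply the
    user's reduce function once per unique key."""
    pairs = []
    for fetch in region_fetchers:        # stand-in for RPC reads from map workers
        pairs.extend(fetch())
    pairs.sort(key=itemgetter(0))        # brings equal keys together
    with open(output_path, "w") as out:
        for key, group in groupby(pairs, key=itemgetter(0)):
            result = reduce_fn(key, (value for _, value in group))
            out.write(f"{key}\t{result}\n")
```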

  14. Implementation • Fault tolerance: in a word, redo • The master pings every worker periodically • No response = failed worker • Failed tasks are rescheduled on other workers • Note: map tasks completed by the failed worker must also be re-executed, because their output is stored on that machine's local disk (completed reduce output is already in the global file system)
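A sketch of the "redo" policy; the task_table and queue objects are hypothetical, the point is which tasks go back to the pending state:

```python
def handle_worker_failure(worker, task_table, pending_queue):
    """Re-queue the failed worker's tasks. Completed reduce tasks are not redone
    (their output is in the global file system), but completed map tasks are,
    because their output lived on the failed machine's local disk."""
    for task in task_table.tasks_on(worker):           # hypothetical lookup
        if task.kind == "map" or task.state == "in_progress":
            task.state = "idle"
            pending_queue.append(task)
```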

  15. Implementation • Locality • Input data is managed by GFS and stored in several replicas • The master schedules a map task on a machine that holds a local replica of its input, or near one, to conserve network bandwidth • Task granularity • M map tasks and R reduce tasks • Make M and R much larger than the number of worker machines, which improves dynamic load balancing and speeds up recovery when a worker fails

  16. Implementation • Backup tasks • Straggler: a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation • Causes: a bad disk, competition for CPU … • Resolution: schedule backup executions of the remaining in-progress tasks when a MapReduce operation is close to completion; a task is marked complete whenever either the primary or the backup execution finishes
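A rough sketch of that resolution; the task and worker objects and the 5% threshold are illustrative assumptions, not values from the paper:

```python
def maybe_schedule_backups(tasks, idle_workers, threshold=0.05):
    """When only a small fraction of tasks remain, schedule duplicates of the
    in-progress tasks on idle workers; whichever copy finishes first wins."""
    remaining = [t for t in tasks if t.state == "in_progress"]
    if tasks and len(remaining) / len(tasks) <= threshold:
        for task, worker in zip(remaining, idle_workers):
            worker.assign(task)   # duplicate execution of a possible straggler
```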

  17. Source • The example is quoted from: • Wei Wei, Juan Du, Ting Yu, and Xiaohui Gu, "SecureMR: A Service Integrity Assurance Framework for MapReduce," Annual Computer Security Applications Conference (ACSAC '09), pp. 73–82, Dec. 7–11, 2009

  18. Making Cluster Applications Energy-Aware Authors: Nedeljko Vasic, Martin Barisits and Vincent Salzgeber Jan 28th, 2011

  19. Outline • Introduction • Case Study • Approach

  20. Introduction • Power consumption • A critical issue in large-scale clusters • Data centers consume as much energy as a city • 7.4 billion dollars per year • Current techniques for efficiency • Consolidate the workload onto fewer machines • Minimize energy consumption while keeping the same overall performance level • Problems • Cannot operate at multiple power levels • Cannot deal with energy consumption limits

  21. Case Study • Google’s Server Utilization and Energy Consumption

  22. Case Study • Hadoop Distributed File System (HDFS)

  23. Case Study • Hadoop Distributed File System (HDFS)

  24. Case Study • MapReduce

  25. Case Study • Conclusion • Aggregating load onto fewer machines is a sound way to save energy • Distributed applications must actively participate in power management in order to avoid poor performance

  26. Approach

  27. On the Energy (In)efficiency of Hadoop Clusters Authors: Jacob Leverich, Christos Kozyrakis Jan 28th, 2011

  28. Introduction • Improving the energy efficiency of a cluster • Place some nodes into low-power standby modes • Avoid wasting energy on oversized components in each node • Problems

  29. Approach • Hadoop data layout overview • Replicas are distributed across different nodes to improve performance and reliability • The user specifies a block replication factor n so that n identical copies of each data block are stored across the cluster (typically n = 3) • With this layout, the largest number of nodes that can be disabled without impacting data availability is n-1
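A small sketch of why n-1 is the safe bound: each block's n replicas live on n distinct nodes, so removing any n-1 nodes still leaves at least one replica reachable, while removing n arbitrary nodes may not (the node and block names below are made up):

```python
# Each block maps to the set of nodes holding its replicas (n = 3 distinct nodes).
replicas = {
    "block-0": {"node1", "node2", "node3"},
    "block-1": {"node2", "node4", "node5"},
}

def all_blocks_available(replicas, disabled_nodes):
    # A block is available if at least one replica is on a node that is still up.
    return all(nodes - disabled_nodes for nodes in replicas.values())

assert all_blocks_available(replicas, {"node1", "node2"})            # any 2 nodes: safe
assert not all_blocks_available(replicas, {"node1", "node2", "node3"})  # 3 nodes: block-0 lost
```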

  30. Approach • Covering subset • At least one replica of every data block must be stored in a designated subset of nodes called the covering subset • This ensures that a large number of nodes (those outside the covering subset) can be gracefully removed from the cluster without affecting data availability or interrupting normal operation
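A sketch of covering-subset-aware placement, assuming other_nodes is disjoint from the covering subset; the node names and the function are illustrative, not Hadoop's actual block-placement policy:

```python
import random

def place_replicas(block_id, covering_subset, other_nodes, n=3):
    """Pin one replica inside the covering subset; place the remaining n-1
    replicas on nodes outside it, which can later be powered down safely."""
    placement = {random.choice(sorted(covering_subset))}
    placement |= set(random.sample(sorted(other_nodes), n - 1))
    return placement

print(place_replicas("block-7", {"nodeA", "nodeB"}, {"node1", "node2", "node3", "node4"}))
```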
