
Dryad: Distributed Data-Parallel Programs for Sequential Building Blocks


Presentation Transcript


  1. Dryad: Distributed Data-Parallel Programs for Sequential Building Blocks • Presented by: Theodoros Ioannou

  2. Why Dryad • An efficient way to build parallel and distributed applications • Takes advantage of multicore servers • Data parallelism • Motivation: GPUs, MapReduce, and parallel databases

  3. What is Dryad • A dataflow graph with • Vertices and • Channels • Vertices execute and communicate through channels

  4. What is Dryad (cont’d) • Vertices: sequential programs supplied by the programmer • Channels: file, TCP pipe, or shared-memory FIFO
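The two slides above describe Dryad's core abstraction: a dataflow graph whose vertices wrap sequential user code and whose channels carry data between them. The sketch below is a toy Python model of that idea, not Dryad's actual C++ API; the names (`Vertex`, `Channel`, `Graph`) and the in-process topological execution are illustrative assumptions — in the real system each vertex runs on a separate cluster machine.

```python
# Toy model of a Dryad-style dataflow graph (illustrative, not Dryad's API).
from collections import deque

class Vertex:
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn          # sequential program supplied by the programmer
        self.inputs = []      # incoming channels

class Channel:
    """Stands in for a file, TCP pipe, or shared-memory FIFO."""
    def __init__(self, src, dst):
        self.src, self.dst = src, dst
        self.data = None
        dst.inputs.append(self)

class Graph:
    def __init__(self):
        self.vertices, self.channels = [], []

    def add(self, v):
        self.vertices.append(v)
        return v

    def connect(self, src, dst):
        ch = Channel(src, dst)
        self.channels.append(ch)
        return ch

    def run(self, source_data):
        # Execute vertices in topological order; source vertices
        # (no incoming channels) read the initial input data.
        results = {}
        out_edges = {v: [c for c in self.channels if c.src is v]
                     for v in self.vertices}
        indeg = {v: len(v.inputs) for v in self.vertices}
        ready = deque(v for v in self.vertices if indeg[v] == 0)
        while ready:
            v = ready.popleft()
            args = [c.data for c in v.inputs] or [source_data]
            results[v.name] = v.fn(*args)
            for c in out_edges[v]:
                c.data = results[v.name]
                indeg[c.dst] -= 1
                if indeg[c.dst] == 0:
                    ready.append(c.dst)
        return results
```

A two-vertex pipeline then looks like `g.connect(g.add(Vertex("map", f)), g.add(Vertex("reduce", h)))` followed by `g.run(data)`.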

  5. Differences from other systems • Allows developers to define the communication between vertices • More difficult, but provides better options • A programmer can master it in a few weeks • Not as restrictive as MapReduce • Multiple inputs and outputs • Scales from multicore computers to clusters (~1800 machines)

  6. System Overview • Everything is based on the communication flow • Every vertex runs on a CPU of the cluster • Channels are the data flows between the vertices • Logical communication graph • Mapped to physical resources at run-time

  7. System Organization Schema

  8. Operators of the Graph Description Language
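The original slide is an image, but the Dryad paper's graph-description language builds large graphs from small ones with a handful of composition operators: cloning (`^`), pointwise connection (`>=`), complete bipartite connection (`>>`), and merge (`||`). The sketch below mimics those operators via Python operator overloading on a toy `G` class that only tracks vertex names and edge pairs; the class and its round-robin connection rule are simplifying assumptions, not the paper's full semantics.

```python
# Toy mimic of Dryad's graph-composition operators (illustrative only).
class G:
    def __init__(self, vertices, edges=()):
        self.vertices = list(vertices)
        self.edges = list(edges)

    def __xor__(self, n):        # G ^ n : n independent copies of the graph
        vs, es = [], []
        for i in range(n):
            vs += [f"{v}.{i}" for v in self.vertices]
            es += [(f"{a}.{i}", f"{b}.{i}") for a, b in self.edges]
        return G(vs, es)

    def __ge__(self, other):     # A >= B : pointwise (round-robin) connection
        n = max(len(self.vertices), len(other.vertices))
        es = [(self.vertices[i % len(self.vertices)],
               other.vertices[i % len(other.vertices)]) for i in range(n)]
        return G(self.vertices + other.vertices,
                 self.edges + other.edges + es)

    def __rshift__(self, other): # A >> B : complete bipartite connection
        es = [(a, b) for a in self.vertices for b in other.vertices]
        return G(self.vertices + other.vertices,
                 self.edges + other.edges + es)

    def __or__(self, other):     # A | B : merge, deduplicating shared vertices
        vs = self.vertices + [v for v in other.vertices
                              if v not in self.vertices]
        return G(vs, self.edges + other.edges)
```

For example, `(G(["M"]) ^ 3) >> G(["R"])` fans three mapper clones into one reducer with a complete bipartite connection.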

  9. SQL Example “It finds all the objects in the database that have neighboring objects within 30 arc seconds such that at least one of the neighbors has a color similar to the primary object’s color.”

  10. SQL Example(cont’d)

  11. SQL Example: Job’s Skeleton

  12. Execution • Input - the data file is a distributed file • The graph is changed dynamically based on the positions of the data file partitions • Output - the result is again a distributed file • The scheduler on the JM keeps a history of each vertex • On failure, the job is terminated • Replication of vertices is used to avoid that • Versioning is used to get the right result • The job only fails if a vertex is re-run more than a threshold number of times
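The fault-handling policy on this slide — re-execute a failed vertex with a new version number, and only terminate the job once a vertex exceeds a retry threshold — can be sketched as follows. This is an illustration of the policy, not Dryad's code; the threshold value and function names are assumptions.

```python
# Illustrative retry-with-versioning policy (not Dryad's actual code).
MAX_RETRIES = 3  # assumed threshold; the real value is a system parameter

def run_with_retries(vertex_fn, inputs, max_retries=MAX_RETRIES):
    """Run a vertex, re-executing on failure with an incremented version."""
    version = 0
    while True:
        try:
            # The version tags which execution produced the result, so
            # downstream vertices can pick a consistent set of outputs.
            return vertex_fn(inputs), version
        except Exception:
            version += 1
            if version > max_retries:
                raise RuntimeError(
                    "vertex failed more than the threshold; job terminated")
```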

  13. Execution (cont’d) • The JM assumes it is the only job running on the cluster • Uses a greedy scheduling algorithm • Vertex programs are deterministic • Same result whenever you run them • If a vertex fails, the JM is notified or gets a heartbeat timeout • If FIFOs or pipes are used, kill all the connected vertices and re-execute them

  14. Execution (cont’d) • Run vertices on the machines (or cluster) as close as possible to the data they use • Because the JM cannot know the amount of intermediate data in advance, a dynamic solution is needed
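The locality preference above — run a vertex as close as possible to its data — can be sketched as a simple placement rule: prefer a free machine that already holds a replica of the input, then a machine in the same rack as a replica, then any free machine. The three-tier rule, the `rack_of` map, and the function name are assumptions for illustration, not the scheduler's actual algorithm.

```python
# Illustrative data-locality placement rule (an assumption, not Dryad's code).
def place_vertex(data_replicas, free_machines, rack_of):
    """Pick a machine for a vertex, preferring proximity to its input data."""
    # 1. A free machine that already stores a replica of the data.
    for m in free_machines:
        if m in data_replicas:
            return m
    # 2. A free machine in the same rack as some replica.
    data_racks = {rack_of[m] for m in data_replicas}
    for m in free_machines:
        if rack_of[m] in data_racks:
            return m
    # 3. Otherwise, any free machine at all.
    return free_machines[0] if free_machines else None
```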

  15. Experiments • First: an SQL query as a Dryad application (compared to SQL Server, varying the number of machines used) • Second: a simple MapReduce data-mining operation as a Dryad application (10.2 TB of data on 1800 machines) • Uses horizontal partitioning of data, pipelined parallelism within processes, and inter-partition exchange operations to move partial results

  16. Results

  17. Shortcomings - Future Work • The programmer can manipulate inter-process communication - risk of deadlocks • The programmer should know the physical resources of the system - breaks the abstraction • Assumption of one job on the cluster - only one job runs at a time • SQL experiment - fewer capabilities than SQL Server • MapReduce experiment - only shows that the system works “sufficiently well” for those cases - no detailed results • Use statistics to predict resources before executing a known program - “we may be able to...” • Sacrifices simplicity - a more relaxed style of code compared with MapReduce

  18. The End. Questions?
