
Work Queue: A Scalable Master/Worker Framework

Peter Bui, June 29, 2010.


Presentation Transcript


  1. Peter Bui, June 29, 2010. Work Queue: A Scalable Master/Worker Framework

  2. Master/Worker Model
     • Central Master application
       • Divides work into tasks
       • Sends tasks to Workers
       • Gathers results
     • Distributed collection of Workers
       • Receives input and executable files
       • Runs executable files
       • Returns output files

  3. Work Queue versus MPI
     MPI
     • Number of workers static
     • Scales up to a limited number of workers (16, 32, 64)
     • Reliable at application level but no fault tolerance
     • Requires homogeneous deployment environment
     • Workers can communicate with anyone
     Work Queue
     • Number of workers dynamic
     • Scales up to a large number of workers (100s - 1000s)
     • Reliable and fault tolerant at the task level
     • Allows for heterogeneous deployment environments
     • Workers communicate only with Master

  4. Success Stories: All-Pairs, Makeflow, Wavefront, SAND

  5. Architecture (Overview)

  6. Architecture (Master)
     • Uses Work Queue library
     • Creates a Queue
     • Submits Tasks
       • Command
       • Input files
       • Output files
     • Library keeps track of Tasks
       • When a Worker is available, the library sends Tasks
       • When Tasks complete
         • Retrieve output files

  7. Architecture (Workers)
     • Users start workers on any machine
     • Contact Master and request work
     • When Task is received, perform computation, return results
     • After set idle timeout, quit and clean up

  8. API Overview (Work Queue)
     Simple C API
     • Work Queue
       • work_queue_create(int port): Create a new work queue.
       • work_queue_delete(struct work_queue *q): Delete a work queue.
       • work_queue_empty(struct work_queue *q): Determine whether there are any known tasks queued, running, or waiting to be collected.

  9. API Overview (Task)
     Simple C API
     • Task
       • work_queue_task_create(const char *command): Create a new task specification.
       • work_queue_task_delete(struct work_queue_task *t): Delete a task specification.
       • work_queue_task_specify_input_file(struct work_queue_task *t, const char *fname, const char *rname): Add input file specification.
       • work_queue_task_specify_output_file(struct work_queue_task *t, const char *rname, const char *fname): Add output file specification.

  10. API Overview (Execution)
      Simple C API
      • Execution
        • work_queue_submit(struct work_queue *q, struct work_queue_task *t): Submit a job to a work queue.
        • work_queue_wait(struct work_queue *q, int timeout): Wait for tasks to complete.

  11. Software Configuration
      Web Information
        http://cse.nd.edu/~ccl/software/installed.shtml
      AFS
        $ setenv PATH ~ccl/software/cctools/bin:$PATH
        $ setenv PATH ~condor/software/bin:$PATH
      CRC
        $ module use /afs/nd.edu/user37/ccl/software/modulefiles
        $ module load cctools
        $ module load condor

  12. Example 1: DConvert
      • Goal: convert a set of input images to a specified format in parallel
      • Input: <format> <input_image1> <input_image2> ...
      • Output: converted images in the specified format
      • Skeleton:
        • ~pbui/www/scratch/workqueue-tutorial.tar.gz

  13. DConvert (Preparation)
      Set up scratch workspace
        $ mkdir /tmp/$USER-scratch
        $ cd /tmp/$USER-scratch
        $ pwd
      Copy source tarball and extract it
        $ cp ~pbui/www/scratch/workqueue-tutorial.tar.gz .
        $ tar xzvf workqueue-tutorial.tar.gz
        $ cd workqueue-tutorial
        $ ls
      Open dconvert.c source file for editing
        $ gedit dconvert.c &

  14. DConvert (TODO 1, 2, and 3)
      // TODO 1: include work queue header file
      #include "work_queue.h"
      // TODO 2: declare work queue and task structs
      struct work_queue *q;
      struct work_queue_task *t;
      // TODO 3: create work queue using default port
      q = work_queue_create(0);

  15. DConvert (TODO 4, 5, 6)
      // TODO 4: create task, specify input and output file, submit task
      t = work_queue_task_create(command);
      work_queue_task_specify_input_file(t, input_file, input_file);
      work_queue_task_specify_output_file(t, output_file, output_file);
      work_queue_submit(q, t);
      // TODO 5: while work queue is not empty, wait for task, then delete returned task
      while (!work_queue_empty(q)) {
          t = work_queue_wait(q, 10);
          if (t) work_queue_task_delete(t);
      }
      // TODO 6: delete work queue
      work_queue_delete(q);
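The snippet on slide 15 uses `command`, `input_file`, and `output_file` without showing where they come from. The following is a minimal sketch of how they might be derived for one image. The helper names `make_output_name` and `make_command` are hypothetical, and the use of ImageMagick's `convert` as the worker-side command is an assumption; the tutorial skeleton itself is not shown in the slides and may differ. File names containing spaces or shell metacharacters are not handled here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<stem>.<format>" into out, where <stem> is
 * input without its final extension. Returns 0 on success, -1 if the
 * result does not fit in size bytes. */
int make_output_name(const char *input, const char *format, char *out, size_t size)
{
    assert(input && format && out);
    const char *slash = strrchr(input, '/');
    const char *base = slash ? slash + 1 : input;
    const char *dot = strrchr(input, '.');
    size_t stem = strlen(input);
    /* Only strip a dot that is inside the file name and not its first character. */
    if (dot && dot > base)
        stem = (size_t)(dot - input);
    int n = snprintf(out, size, "%.*s.%s", (int)stem, input, format);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: build the command a worker would run.
 * Assumes ImageMagick's `convert` is available on the worker. */
int make_command(const char *input, const char *output, char *out, size_t size)
{
    assert(input && output && out);
    int n = snprintf(out, size, "convert %s %s", input, output);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With these helpers, the master would call `make_output_name` and `make_command` once per input image and pass the resulting strings to the task calls shown on the slide.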

  16. DConvert (Demonstration)
      Build and prepare application
        $ make
        $ cp /usr/share/pixmaps/*.png .
      Start batch of workers
        $ condor_submit_workers `hostname` 9123 5
      Start application
        $ ./dconvert jpg *.png

  17. Tips and Tricks (Debugging)
      Debugging
      • Enable cctools debugging system
        • In master application:
          • debug_flags_set("wq");
          • debug_flags_set("debug");
        • In workers:
          • work_queue_worker -d debug -d wq <hostname> <port>
      • Incrementally test number of workers
      Failed Execution
      • Include executable and dependencies as input files
      • Right target platform (32-bit vs 64-bit, OS, etc.)

  18. Tips and Tricks (Tasks)
      Tag Tasks
      • Give a task an identifying tag so Master can keep track of it
      Use input and output buffers
      • work_queue_task_specify_input_buf
        • Contents of buffer will be materialized as a file at worker
      • task->output
        • Buffer that contains standard output of task
      Check task results
      • task->result: result of task
      • task->return_status: exit code of command line

  19. Tips and Tricks (Batch)
      Custom Worker Environment
      • Modify batch system specific submit scripts
        • condor_submit_workers
          • Set requirements
        • sge_submit_workers
          • Set environment
          • Set modules

  20. Tips and Tricks (CRC)
      Submit master, find host, submit workers
      • qsub myscript.sh
        where myscript.sh contains:
          #!/bin/csh
          master
      • qstat -u <afsid> | grep myscript.sh
      • sge_submit_workers <hostname> <port>

  21. Example 2: Mandelbrot Generator
      • Goal: generate Mandelbrot image
      • Input: <width> <height> <xmin> <xmax> <ymin> <ymax> <max_iterations>
      • Output: Mandelbrot image in PPM format
      • Skeleton:
        • ~pbui/www/scratch/workqueue-tutorial.tar.gz

  22. Mandelbrot (Overview)
      z(n+1) = z(n)^2 + c
      Escape Time Algorithm
      • For each pixel (r, c) in the image, calculate whether the corresponding point (x, y) escapes the boundary
      • Iterative algorithm where each pixel computation is independent
      Application design
      • Master partitions image into tasks
      • Workers compute Escape Time Algorithm on partitions
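The escape time computation for a single point can be sketched as follows. This follows the standard formulation (iterate z = z^2 + c from z = 0 until |z| exceeds 2 or an iteration limit is reached); the function name `escape_time` is an illustrative choice, not something taken from the tutorial sources.

```c
#include <assert.h>

/* Return the number of iterations performed for the point c = x0 + i*y0
 * before |z| exceeds 2, capped at max_iterations. A return value equal to
 * max_iterations means the point did not escape within the limit. */
int escape_time(double x0, double y0, int max_iterations)
{
    assert(max_iterations >= 0);
    double x = 0.0, y = 0.0;   /* z starts at 0 */
    int iteration = 0;
    /* |z|^2 <= 4 is equivalent to |z| <= 2 and avoids a square root. */
    while (x * x + y * y <= 4.0 && iteration < max_iterations) {
        double xtemp = x * x - y * y + x0;  /* real part of z^2 + c */
        y = 2.0 * x * y + y0;               /* imaginary part of z^2 + c */
        x = xtemp;
        iteration++;
    }
    return iteration;
}
```

Because the function depends only on its arguments, any pixel can be computed on any worker in any order, which is what makes the problem a good fit for the master/worker model.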

  23. Mandelbrot (Naive Approach) Master
      • For each pixel (r, c) in image (width x height)
        • Compute corresponding x, y
        • Submit task for pixel with x, y
          • Pass x, y parameters as input buffer
          • Tag task with r, c values
      • Wait for each task to complete:
        • Retrieve output of worker from task->output
        • Retrieve r, c from task->tag
        • Store pixel[r, c] = output
      • Output pixels in PPM format
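Two pieces of the master's bookkeeping above can be sketched without the Work Queue library: mapping a pixel index to a coordinate, and packing r, c into a tag string that can be parsed again when the task returns. The function names and the "r c" tag format are assumptions made for illustration; the tutorial's actual master may encode these differently.

```c
#include <assert.h>
#include <stdio.h>

/* Map pixel index in [0, count) linearly onto [min, max).
 * Use (c, width, xmin, xmax) for x and (r, height, ymin, ymax) for y. */
double pixel_to_coord(int index, int count, double min, double max)
{
    assert(count > 0 && index >= 0 && index < count);
    return min + index * (max - min) / count;
}

/* Hypothetical tag format: "<r> <c>". Returns 0 on success, -1 if the
 * tag does not fit in size bytes. */
int encode_tag(int r, int c, char *tag, size_t size)
{
    int n = snprintf(tag, size, "%d %d", r, c);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Recover r and c from a tag written by encode_tag.
 * Returns 0 on success, -1 if the tag is malformed. */
int decode_tag(const char *tag, int *r, int *c)
{
    return sscanf(tag, "%d %d", r, c) == 2 ? 0 : -1;
}
```

The tag is what lets the master store each result in the right place even though tasks may complete in any order.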

  24. Mandelbrot (Naive Approach) Worker
      • Read in parameters from input file:
        • x0, y0, max_iterations, black_value
      • Perform Mandelbrot computation as specified on Wikipedia:
        • http://en.wikipedia.org/wiki/Mandelbrot_set#For_programmers
      • Output result (iterations) to standard out

  25. Mandelbrot (Analysis)
      Problem
      • Processing each pixel as a single task is inefficient
        • Too fine-grained
        • Overhead of sending parameters, running tasks, and retrieving results exceeds computation time
      Work Queue Golden Rule:
        Computation Time > Data Transfer Time + Task setup overhead
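The Golden Rule can be turned into a small back-of-the-envelope check. The sketch below uses integer microseconds to keep the arithmetic exact, and assumes a fixed per-task overhead that does not grow with task size; both function names and any timings plugged into them are hypothetical, not measurements from the tutorial.

```c
#include <assert.h>

/* Golden Rule check: is the computation time strictly greater than the
 * data transfer time plus the task setup overhead? All times are in
 * microseconds. */
int task_is_worthwhile(long compute_us, long transfer_us, long setup_us)
{
    assert(compute_us >= 0 && transfer_us >= 0 && setup_us >= 0);
    return compute_us > transfer_us + setup_us;
}

/* Smallest number of pixels per task for which the computation time
 * exceeds a fixed per-task overhead (transfer plus setup). */
long min_pixels_per_task(long per_pixel_us, long overhead_us)
{
    assert(per_pixel_us > 0 && overhead_us >= 0);
    return overhead_us / per_pixel_us + 1;
}
```

For instance, if a pixel took 500 microseconds to compute and each task cost 150,000 microseconds of overhead, a single-pixel task would fail the rule while a 512-pixel row (256,000 microseconds) would pass it.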

  26. Mandelbrot (Better Approach) Send Rows • Process groups of pixels rather than individual ones: • Send a row and have the worker return a series of results • Perhaps send multiple rows? • Should take execution time from minutes to seconds
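Grouping rows into tasks comes down to simple index arithmetic. The following sketch shows one way to split an image of `height` rows into tasks of `rows_per_task` rows each, with a shorter final task when the height does not divide evenly. The function names are illustrative assumptions and are not taken from the tutorial sources.

```c
#include <assert.h>

/* Number of tasks needed to cover height rows (ceiling division). */
int row_task_count(int height, int rows_per_task)
{
    assert(height > 0 && rows_per_task > 0);
    return (height + rows_per_task - 1) / rows_per_task;
}

/* Compute the first row and the number of rows handled by a given task.
 * The last task may cover fewer than rows_per_task rows. */
void row_task_range(int height, int rows_per_task, int task,
                    int *first_row, int *row_count)
{
    assert(task >= 0 && task < row_task_count(height, rows_per_task));
    *first_row = task * rows_per_task;
    int remaining = height - *first_row;
    *row_count = remaining < rows_per_task ? remaining : rows_per_task;
}
```

For a 512 x 512 image, one row per task yields 512 tasks instead of 262,144 single-pixel tasks; raising `rows_per_task` trades parallelism for lower per-task overhead.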

  27. Mandelbrot (Demonstration)
      Build application
        $ make
      Start batch of workers
        $ condor_submit_workers `hostname` 9123 10
      Start application
        $ ./mandelbrot_master 512 512 -2 1 -1.5 1.5 250 > output.ppm
        $ display output.ppm

  28. Advanced Features
      Fast Abort
      • Allow Work Queue to pre-emptively kill slow tasks
      • work_queue_activate_fast_abort(q, X)
        • X is the fast abort multiplier
        • if (runtime >= average_runtime * X) fast_abort
      Scheduling
      • Change how workers are selected
        • FCFS: first come, first served
        • FILES: has the most cached files
        • TIME: fastest average turnaround time
      • Can be set for queue or for task
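The fast abort rule on the slide can be illustrated with a small standalone sketch. This is not the library's implementation: the `runtime_stats` struct and both function names are hypothetical, and treating a non-positive multiplier or an empty history as "never abort" is an assumption made here so the check is well defined.

```c
#include <assert.h>

/* Hypothetical running statistics over completed tasks. */
struct runtime_stats {
    double total_runtime;  /* sum of runtimes of completed tasks, in seconds */
    int completed;         /* number of completed tasks */
};

/* Record the runtime of a task that finished normally. */
void record_completion(struct runtime_stats *s, double runtime)
{
    assert(s && runtime >= 0.0);
    s->total_runtime += runtime;
    s->completed++;
}

/* Apply the rule from the slide: abort when
 * runtime >= average_runtime * multiplier.
 * Assumption: with no history or a non-positive multiplier, never abort. */
int should_fast_abort(const struct runtime_stats *s, double runtime, double multiplier)
{
    assert(s);
    if (multiplier <= 0.0 || s->completed == 0)
        return 0;
    double average = s->total_runtime / s->completed;
    return runtime >= average * multiplier;
}
```

With completed runtimes of 8 and 12 seconds (average 10) and a multiplier of 3, a task that has been running for 30 seconds would be aborted, while one at 29 seconds would not.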

  29. Advanced Features (More)
      Automatic Master Detection
      • Start master with a project name:
        • setenv WORK_QUEUE_NAME "project_name"
      • Enable master auto selection mode with workers
        • work_queue_worker -a -N "project_name"
        • work_queue_pool -T condor -a -N "project_name"
      • Check out master at http://chirp.cse.nd.edu
      Shut down workers
      • work_queue_shut_down_workers

  30. Web Resources
      Website
        http://www.nd.edu/~ccl/software/workqueue/
        • User manual and C API documentation
      Bug Reports and Suggestions
        http://www.cse.nd.edu/~ccl/software/help.shtml
      Python API
        http://bitbucket.org/pbui/python-workqueue/
        • Experimental Python binding
