
Optimizing Grid Execution with Pegasus: Effective Planning for Complex Workflows

Pegasus is a powerful framework designed for planning and executing workflows in grid computing environments. It efficiently captures and manages relationships among data, programs, and execution needs, applying algorithmic and AI-based techniques for resource allocation. By utilizing tools like Globus RLS and MDS, Pegasus automates the discovery of data and resources, enhances workflow organization, and provides data provenance. Features such as deferred planning, node aggregation, and integration with Condor's job retry mechanism minimize scheduling overhead and enhance reliability in dynamic systems.


Presentation Transcript


  1. Pegasus: Planning for Execution in Grids
     http://pegasus.isi.edu

     Virtual Data Concepts
     -- Capture and manage information about relationships among
        -- Data (of widely varying representations)
        -- Programs (& their execution needs)
        -- Computations (& execution environments)
     -- Apply this information to, e.g.
        -- Discovery: Data and program discovery
        -- Workflow: Structured paradigm for organizing, locating, specifying, & requesting data
        -- Explanation: provenance
     -- Research is part of the NSF-funded GriPhyN project

     Pegasus: Planning for Execution in Grids
     -- Maps from abstract to concrete workflow
        -- Algorithmic and AI-based techniques
     -- Automatically locates physical locations for both components (transformations) and data
        -- Uses Globus RLS and the Transformation Catalog
     -- Finds appropriate resources on which to execute
        -- via Globus MDS
     -- Reuses existing data products where applicable
     -- Publishes newly derived data products
        -- Chimera virtual data catalog & MCS
     -- Uses Globus CoG Kit for authentication

     Planning and Scheduling Granularity
     Deferred Planning
     -- Partitioning
        -- Sets the granularity for planning ahead.
     -- Node Aggregation
        -- Combines nodes in the workflow and schedules them as one unit.
        -- Minimizes scheduling overhead and planning overhead
     -- Related But Separate Concepts
        -- Small Jobs > High level of Node Aggregation > Large Partitions
        -- Very Dynamic System > Small Partitions

     Re-planning
     -- Leverages Condor's job retry mechanism to trigger a retry on a partition in case of failure.
     -- Parses Condor log files to determine the sites at which a job failed.
     -- Subsequent invocations of Pegasus on the same partition are aware of the bad sites.

     Pegasus Portal

     Grid Setup

     People Involved: USC/ISI: Ewa Deelman, Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan Vahi, James Blythe, Yolanda Gil
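
The mapping and re-planning ideas in the transcript can be illustrated with a small Python sketch. This is not Pegasus code and uses none of its APIs: `Job`, `plan`, the dictionary-based catalogs, and the site names are all hypothetical stand-ins chosen for illustration. The sketch assumes jobs arrive in dependency order, treats a plain dictionary as the replica catalog (the role Globus RLS plays) and another as the Transformation Catalog, and models "bad sites" as a set that a caller would fill in after inspecting failure logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One node of an abstract workflow (hypothetical structure, not a Pegasus type)."""
    name: str
    transformation: str   # logical name of the program to run
    inputs: tuple         # logical file names consumed
    outputs: tuple        # logical file names produced


def plan(jobs, replica_catalog, transformation_catalog, bad_sites=frozenset()):
    """Map abstract jobs to (job name, site) pairs.

    jobs                   -- abstract jobs, assumed to be in dependency order
    replica_catalog        -- dict: logical file name -> set of sites holding a copy
    transformation_catalog -- dict: transformation name -> list of sites that can run it
    bad_sites              -- sites to avoid, e.g. ones where an earlier attempt failed
    """
    concrete = []
    available = set(replica_catalog)  # logical files that already exist somewhere
    for job in jobs:
        # Data reuse: skip a job when every output is already registered.
        # Simplification: ancestors of a skipped job are not pruned here.
        if all(out in available for out in job.outputs):
            continue
        candidates = [
            site
            for site in transformation_catalog.get(job.transformation, [])
            if site not in bad_sites
        ]
        if not candidates:
            raise RuntimeError(f"no usable site for job {job.name!r}")
        # Naive site selection: take the first usable site.
        concrete.append((job.name, candidates[0]))
        available.update(job.outputs)
    return concrete


# Hypothetical two-step workflow: extract -> analyze.
jobs = [
    Job("extract", "extract", ("raw.dat",), ("clean.dat",)),
    Job("analyze", "analyze", ("clean.dat",), ("result.dat",)),
]
replicas = {"raw.dat": {"siteA"}, "clean.dat": {"siteB"}}
tc = {"extract": ["siteA"], "analyze": ["siteA", "siteB"]}

first_plan = plan(jobs, replicas, tc)
# Re-planning after a failure at siteA: the same partition is planned again
# with siteA excluded.
second_plan = plan(jobs, replicas, tc, bad_sites={"siteA"})
print(first_plan)
print(second_plan)
```

In this example the `extract` job is dropped because `clean.dat` is already registered, which mirrors the "reuses existing data products" bullet. The second call mimics a retry in which the planner already knows that `siteA` failed. A real planner would also weigh resource information (for example from Globus MDS) rather than taking the first usable site.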
