This document explores the use of workflows as a key methodology in scientific computing, highlighting the work of Ian Foster at the Computation Institute, Argonne National Lab, and the University of Chicago. It delves into the execution of workflows in biology, particularly focusing on querying and analyzing protein data against vast gene sequences. Examples from the PUMA knowledge base and Cancer Bioinformatics Grid illustrate the execution environment, job scheduling, and tools used. A detailed examination of data services is also included, demonstrating the integration of various analytic services.
Workflow as the Methodology of Science
Ian Foster
Computation Institute, Argonne National Lab & University of Chicago
Science as Workflow
[Diagram labels: Executed; Executing; Query; Executable; Not yet executable; What I Did; What I Am Doing; Edit; …; What I Want to Do; Execution environment; Schedule]
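The four states named on this slide (executed, executing, executable, not yet executable) can be illustrated with a small sketch. It assumes a workflow is a set of tasks with dependencies, and that a task becomes executable once all of its dependencies have been executed; the slide does not state this rule, and the task names and the `classify` helper are hypothetical.

```python
from enum import Enum


class State(Enum):
    EXECUTED = "executed"
    EXECUTING = "executing"
    EXECUTABLE = "executable"
    NOT_YET_EXECUTABLE = "not yet executable"


def classify(deps, done, running):
    """Label each task from its dependencies and the finished/running sets."""
    states = {}
    for task, parents in deps.items():
        if task in done:
            states[task] = State.EXECUTED
        elif task in running:
            states[task] = State.EXECUTING
        elif all(p in done for p in parents):
            # Assumed rule: every input is available, so the task could be scheduled.
            states[task] = State.EXECUTABLE
        else:
            states[task] = State.NOT_YET_EXECUTABLE
    return states


# Hypothetical workflow: two analyses fan out from one fetch, then merge.
deps = {
    "fetch": [],
    "blast": ["fetch"],
    "blocks": ["fetch"],
    "merge": ["blast", "blocks"],
}
states = classify(deps, done={"fetch"}, running={"blast"})
for task, state in states.items():
    print(f"{task}: {state.value}")
```

Under this reading, "What I Did" corresponds to executed tasks, "What I Am Doing" to executing ones, and "What I Want to Do" to the remainder, although the slide itself does not spell out that mapping.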
Example: Biology
Public PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
Back Office Analysis on Grid: millions of BLAST, BLOCKS, etc., on OSG and TeraGrid
Natalia Maltsev et al., http://compbio.mcs.anl.gov/puma2
Genome Analysis & Database Update (GADU) on OSG
[Figure labels: April 24, 2006; 3,000 jobs; GADU]
Example: Cancer Bioinformatics Grid
[Diagram labels: <BPEL Workflow Doc>; <Workflow Inputs>; BPEL Engine; Data Service @ uchicago.edu; Analytic service @ duke.edu; Analytic service @ osu.edu; <Workflow Results>; "link" connections between components]
caBiG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.
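The general pattern of an engine that takes a workflow document and inputs, invokes linked services, and returns results can be sketched as follows. This is not BPEL and not the caBIG API: the three services are local stand-in functions, the linear chaining is an assumption (the slide does not show the exact topology), and all names and values are hypothetical.

```python
def data_service(inputs):
    # Stand-in for a remote data service: returns one record per requested id.
    return [{"id": i, "value": i * 2} for i in inputs["ids"]]


def analytic_service_a(records):
    # Stand-in for a first analytic service: aggregates the records.
    return sum(r["value"] for r in records)


def analytic_service_b(total):
    # Stand-in for a second analytic service: derives a result from the aggregate.
    return {"total": total, "flag": total > 10}


def run_workflow(steps, inputs):
    """Toy engine: pass each step's output to the next step (assumed linear chain)."""
    data = inputs
    for step in steps:
        data = step(data)
    return data


# The "workflow document" here is just an ordered list of steps.
workflow_doc = [data_service, analytic_service_a, analytic_service_b]
results = run_workflow(workflow_doc, {"ids": [1, 2, 3]})
print(results)
```

A real engine would call the services over the network and handle faults and branching; the sketch only shows the separation between the workflow description, its inputs, and the engine that executes it.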