This presentation discusses the integration of the Galaxy and Swift systems, combining their capabilities into a unified platform. We will explore different integration schemes tailored to user needs and application patterns. Key features of each system will be highlighted, showcasing their complementary roles in executing workflows and managing data. Through real-world use cases, such as climate model analysis and super-cooled glass simulations, we demonstrate how the integrated system enhances computational efficiency. A potential demo will illustrate these applications in action.
Extending the Galaxy portal with parallel and distributed execution capability Ketan Maheshwari, Alex Rodriguez, David Kelly, Ravi Madduri, Justin Wozniak, Michael Wilde, Ian Foster Argonne National Laboratory & University of Chicago
Overview
• Introduce the Galaxy and Swift systems
• Couple the Swift and Galaxy gateway frameworks
• Combine the features offered by Galaxy and Swift into an integrated platform
• Different integration schemes based on user requirements and application patterns
• Data management schemes
• Example use case
• A demo screencast (if time permits)
• Summary and future work
swift-lang.org
Overview of the Galaxy Workflow System*
[Figure: Galaxy web interface showing the tools panel, the workspace, and the monitor/history panel]
*Slide courtesy: Center for Genomic Regulation, Barcelona, Spain
Overview of Swift Parallel Scripting Framework
• Simulation of super-cooled glass materials
• Protein folding using homology-free approaches
• Climate model analysis and decision making in energy policy
• Simulation of RNA-protein interaction
• Multiscale subsurface flow modeling
• Modeling of power grid applications
All have published science results obtained using Swift.
[Figure: Protein loop modeling, T0623, 25 res., 8.2Å to 6.3Å (excluding tail); panels: Initial, Predicted, Native. Courtesy A. Adhikari]
Motivation: Swift and Galaxy are complementary in many ways
• Galaxy (galaxyproject.org) offers a simple, user-friendly, web-based interface for composing, executing, and monitoring workflows
• Galaxy results are sharable, reproducible, and reusable
• Galaxy is widely used and well supported by its user community, e.g. the Next Generation Sequencing (NGS) community
• Swift provides a sophisticated interface to parallel and distributed platforms
• Swift scripts are structured expressions of complex application flows that are readily executable on multiple, diverse, and independent remote resources
Swift-Galaxy Integration Overview
• Approaches enabling integration in different ways:
  • At tool level
  • At workflow level
  • At language/expression level
[Figure: user computer with the Galaxy web console, connected to a Galaxy server hosting Galaxy tools and Swift app libraries, which reach clouds, clusters, supercomputers, and grids]
Computational Infrastructure
• Galaxy offers limited support for distributed and parallel resources
  • Needs additional ad hoc configuration to interface with them
  • Constrained in some ways, e.g. needs a shared file system*
• Swift is robustly interfaced to a wider range of resource managers, with finer control over job submission parameters:
  • Supports PBS/Torque, SGE, SLURM, Condor
  • Supports bag-of-workstations setups: clouds, workstation clusters
  • Supports distributed file systems and multiple execution sites simultaneously
* To the best of our knowledge
Interface with heterogeneous parallel systems is a challenge

## SLURM
#!/bin/bash
#SBATCH -J ...
#SBATCH -oe ...
#SBATCH -p ...
#SBATCH -N ...
ibrun ./my_exec args

## CONDOR
Executable=e
Universe=std
Error=err.$
Input=in.$
Output=out.$
Log=foo.log
Queue

## TORQUE/PBS
#!/bin/bash
#PBS -q ccs_short
#PBS -N my_serial_job
#PBS -l walltime=01:00:00
#PBS -l nodes=1:noib:ppn=1
#PBS -m e
./a.out

## SGE
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
pwd
./my_exec args
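The submit scripts above ask for much the same things (a job name, a queue, a node count, a time limit) in mutually incompatible syntax. The Python sketch below is only an illustration of that point, not how Swift or Galaxy actually generate submissions: the `JobSpec` class and `render` function are hypothetical names introduced here, and Condor is left out because its submit files are not shell scripts.

```python
from dataclasses import dataclass


@dataclass
class JobSpec:
    """Scheduler-neutral job description (hypothetical, for illustration only)."""
    name: str
    queue: str
    nodes: int
    walltime: str  # HH:MM:SS
    command: str


def render(spec: JobSpec, scheduler: str) -> str:
    """Render one job description as a submit script for the given scheduler."""
    if scheduler == "slurm":
        header = [
            f"#SBATCH -J {spec.name}",
            f"#SBATCH -p {spec.queue}",
            f"#SBATCH -N {spec.nodes}",
            f"#SBATCH -t {spec.walltime}",
        ]
    elif scheduler == "pbs":
        header = [
            f"#PBS -N {spec.name}",
            f"#PBS -q {spec.queue}",
            f"#PBS -l nodes={spec.nodes}",
            f"#PBS -l walltime={spec.walltime}",
        ]
    elif scheduler == "sge":
        # Node counts in SGE go through site-specific parallel environments,
        # so this sketch deliberately does not emit one.
        header = [
            f"#$ -N {spec.name}",
            f"#$ -q {spec.queue}",
            f"#$ -l h_rt={spec.walltime}",
            "#$ -cwd",
        ]
    else:
        raise ValueError(f"unsupported scheduler: {scheduler}")
    return "\n".join(["#!/bin/bash", *header, spec.command]) + "\n"


if __name__ == "__main__":
    job = JobSpec("my_serial_job", "ccs_short", 1, "01:00:00", "./a.out")
    for target in ("slurm", "pbs", "sge"):
        print(f"## {target}\n{render(job, target)}")
```

Even in this small sketch each scheduler needs its own branch, which is the maintenance burden a common interface layer is meant to absorb.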
Scheme 1: Wrap Swift around Galaxy Tools
[Figure: swift-tool A, swift-tool B, ..., swift-tool N and other Galaxy tools, each with its own execution history]
Scheme 2: Interoperability between expressions
• Internally, both Swift and Galaxy codes are represented in XML dialects
• An automated transformation converts one form into the other
• Currently under development
[Figure: XML transformation between a Swift script and a Galaxy workflow]
Scheme 3: Harness Data Parallelism using foreach

foreach protein, idx in proteinList {
    runBlast(protein);
    tracef("The index is: %i\n", idx);
}

foreach idx in [begin:end:step] {
    runmyapp(idx);
}

[Figure: a swift-foreach wrapper splits in-data across parallel Galaxy-tool invocations and merges their results into out-data]
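The split, parallel foreach, and merge pattern in this scheme can be sketched outside Swift as well. The Python below is a minimal local analogue, assuming a thread pool stands in for remote execution; `run_tool` is a hypothetical placeholder for one Galaxy tool invocation, and none of these function names come from Swift or Galaxy.

```python
from concurrent.futures import ThreadPoolExecutor


def split(records, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(records)))
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def run_tool(chunk):
    """Hypothetical stand-in for one tool invocation on one chunk."""
    return [item.upper() for item in chunk]


def merge(results):
    """Concatenate per-chunk outputs in their original order."""
    return [item for part in results for item in part]


def parallel_foreach(records, n_chunks=4):
    """Apply run_tool to every chunk concurrently, then merge the outputs."""
    chunks = split(records, n_chunks)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # map() returns results in submission order, so the merge is deterministic.
        results = list(pool.map(run_tool, chunks))
    return merge(results)


if __name__ == "__main__":
    print(parallel_foreach(["p1", "p2", "p3", "p4", "p5"], n_chunks=2))
```

The iterations are independent of one another, which is what lets a foreach body be farmed out to separate workers without changing the merged result.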
Cloud Interfaces • Galaxy instances running on cloud nodes are already taking advantage of cloud-based resources • Swift’s coasters mechanism can farm resources and combine multiple cloud and non-cloud resources in a single application run.
Data Management
• Both Galaxy and Swift offer various data management capabilities
• Galaxy offers remote data uploading and viewing capabilities
• Swift allows disk-resident data to be operated upon as program variables
• Swift’s data providers are interfaced with various data management protocols and can manage data movement at runtime
Evaluation Application: Inference analysis for power prices
[Figure: workflow diagram. "Generate sample" tasks produce samples for a candidate solution; further "generate sample" tasks, grouped into batches of a given batch size, produce lower bounds and upper bounds, which feed a variance and mean step]
Swift Script for Inference Analysis

import "mappings";
import "apps";

type file;

// Sample sizes to evaluate
int nS[] = [10, 100, 1000, 10000, 100000];

foreach S, idxs in nS {
    // Candidate solution for this sample size
    sample0 = gensample(S, wind_data);
    obj[idxs] = ampl(sample0);
    // Batch sizes from 10 to 40 in steps of 10
    foreach B, idxb in [10:40:10] {
        foreach k in [0:B] {
            // Lower-bound run on a fresh sample
            sample1 = gensample(S, wind_data);
            obj_l[idxs][idxb][k] = ampl_L(sample1);
            // Upper-bound run, evaluated against the candidate solution
            sample2 = gensample(S, wind_data);
            obj_u[idxs][idxb][k] = ampl_U(sample2, obj[idxs]);
        }
    }
}
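To get a feel for how much independent work this loop nest exposes, the Python sketch below counts the app invocations the script implies. It rests on two assumptions that should be checked against the Swift documentation: that `[10:40:10]` reads as start:end:step, as in the `[begin:end:step]` form shown earlier, and that Swift ranges include both endpoints, so `[0:B]` runs B + 1 times. The `count_tasks` helper is hypothetical.

```python
def count_tasks(sample_sizes, batch_sizes, inclusive=True):
    """Count app invocations implied by the nested foreach loops.

    inclusive=True assumes the range [0:B] covers both 0 and B (B + 1 iterations).
    """
    counts = {"gensample": 0, "ampl": 0, "ampl_L": 0, "ampl_U": 0}
    for _size in sample_sizes:
        # One candidate solution per sample size: sample0 and ampl(sample0).
        counts["gensample"] += 1
        counts["ampl"] += 1
        for batch in batch_sizes:
            iterations = batch + 1 if inclusive else batch
            # Each inner iteration draws two samples and runs both bound apps.
            counts["gensample"] += 2 * iterations
            counts["ampl_L"] += iterations
            counts["ampl_U"] += iterations
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    sizes = [10, 100, 1000, 10000, 100000]
    batches = list(range(10, 41, 10))  # [10:40:10] read as 10, 20, 30, 40
    print(count_tasks(sizes, batches))
```

Under those assumptions the script issues 2090 app invocations in total, and only the `ampl_U` calls depend on an earlier result (the candidate solution for the same sample size), so most of them can run concurrently.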
Summary
• Swift-Galaxy integration improves science gateways:
  • User control
  • Structured distributed computing
  • Simple
  • Interactive
• Commonalities in the basic execution models of Galaxy and Swift open many avenues for integration
• Broadly, Swift acts as the backend manager while Galaxy serves as the frontend for operations
• An example of combining command-line and GUI-based frameworks
Future Work
• A generic approach for each of the integration schemes
• Wider application adaptation
• Finer and broader exposure of configuration options to users
• Interactive monitoring features
• Authentication features, Globus-based identity management
Acknowledgements
• This work was supported in part by the NIH through the NHLBI grant: The Cardiovascular Research Grid (R24HL085343) and by the U.S. Department of Energy under contract DE-AC02-06CH11357.
• We are grateful to Amazon, Inc., for an award of Amazon Web Services time that facilitated early experiments.
• We thank our colleagues in the Swift and Globus groups at the MCS Division, Argonne National Laboratory.
Thank you! Visit swift-lang.org for more information about the Swift parallel scripting framework.