The "EMBOSS over a Grid" project aims to run the EMBOSS bioinformatics toolkit in a distributed (grid) computing environment. Funded by EELA, it enables EMBOSS jobs to be submitted to the grid from the command line, their results to be retrieved, and complete analysis workflows to be executed. The slides compare several ways of making the application and its databases available on grid nodes, weighing transfer performance against database redundancy and update effort, and describe GFAL-based file access that supports files larger than 2 GB. Planned work includes a deployment in Mexico and an EMBOSS web portal.
EMBOSS over a Grid
Eduardo MURRIETA LEON, Romualdo ZAYAS-LAGUNAS, Pierre-Alain BRANGER, Jérôme VERLEYEN, Roberto RODRIGUEZ, César BONAVIDES, Alfredo HERNANDEZ
1st EELA Grid School, December 15, 2006
EMBOSS: The European Molecular Biology Open Software Suite
• From EMBNet (we are EMBNet Mexico)
• A set of programs for analyzing genetic sequences
• A toolkit for creating robust bioinformatics applications or workflows
• Database searching
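Plan A
• InputSandbox includes:
  - Binary program
  - DB (big files!)
• Bad performance
[Diagram of grid components: UI (User Interface), CE (Computing Element), WNs (Worker Nodes), BDII (Berkeley Database Information Index), RB (Resource Broker)]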
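As a rough illustration of Plan A, the sketch below builds the text of a JDL job description in which both the program binary and the database files travel in the InputSandbox. The attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are standard JDL; the helper names, the file names, and the arguments are hypothetical and not taken from the slides.

```python
def jdl_list(items):
    """Format a Python list as a JDL list of quoted strings."""
    return "{" + ", ".join(f'"{item}"' for item in items) + "}"


def build_plan_a_jdl(executable, arguments, database_files):
    """Return JDL text for a job that ships the binary AND the databases.

    Hypothetical helper: every file named here is uploaded with each job,
    which is why large databases make this plan slow.
    """
    input_sandbox = [executable] + list(database_files)
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(input_sandbox)};",
        f'OutputSandbox = {jdl_list(["std.out", "std.err"])};',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # File names and arguments below are illustrative assumptions.
    print(build_plan_a_jdl("seqret", "-sequence in.fasta -outseq out.fasta",
                           ["embl.dat"]))
```

Because the sandbox is transferred for every submission, the cost grows with the database size and the number of jobs, which matches the "bad performance" noted on the slide.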
Plan B
• DB & application on each WN
• Use of TAGs
• DB redundancy!
• Complex to update!
[Diagram: UI, CE, WNs, BDII, RB]
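With Plan B, a job has to be steered to sites that advertise the installed software. A minimal sketch, assuming the tags are published in the Glue schema attribute GlueHostApplicationSoftwareRunTimeEnvironment and matched with the JDL Member() function; the helper name and the tag names are hypothetical.

```python
def tag_requirements(tags):
    """Build a JDL Requirements line that demands every given software tag.

    Hypothetical helper; the tag names passed in are illustrative only.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    clauses = [
        f'Member("{tag}", other.GlueHostApplicationSoftwareRunTimeEnvironment)'
        for tag in tags
    ]
    return "Requirements = " + " && ".join(clauses) + ";"


if __name__ == "__main__":
    # Assumed tag names: one for the application, one for a database release.
    print(tag_requirements(["EMBOSS-APP", "EMBOSS-DB-EMBL"]))
```

Each database or application update then means reinstalling on the worker nodes and republishing the tag at every site, which is the update complexity the slide points out.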
Plan B.1
• DB & application shared by WNs
• Easy to update DB
• Each CE needs DB!
• NFS installed on WNs
[Diagram: UI, CE, WNs, BDII, RB, NFS]
Plan C
• Use of GFAL API:
  - Wrapping stream functions (working)
  - Now supports files larger than 2 GB when necessary
  - Direct replacement of system calls in the program's memory map (in progress)
[Diagram: UI, CE, WNs, GFAL, BDII, RB, SE (Storage Element), NFS]
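The idea of wrapping stream functions can be sketched as a single open routine that sends grid file names to a grid backend and everything else to the local file system. This is an illustration of the dispatch pattern only, not the authors' wrapper and not the GFAL API: the prefix list, the function name, and the grid_open callback are assumptions, and the backend is replaced by an in-memory stand-in in the usage example.

```python
import io

# Assumed naming schemes that mark a file as living on grid storage.
GRID_PREFIXES = ("lfn:", "guid:", "srm:", "sfn:")


def open_stream(name, mode="rb", grid_open=None):
    """Open `name` locally or through a grid backend, chosen by its prefix.

    `grid_open` is a hypothetical callback (name, mode) -> file-like object
    that would wrap the real remote-access library.
    """
    if name.startswith(GRID_PREFIXES):
        if grid_open is None:
            raise RuntimeError("no grid backend configured for " + name)
        return grid_open(name, mode)
    return open(name, mode)


if __name__ == "__main__":
    # Stand-in backend returning fixed bytes instead of contacting storage.
    def fake_backend(name, mode):
        return io.BytesIO(b">seq1\nACGT\n")

    with open_stream("lfn:/grid/demo/seq.fasta", grid_open=fake_backend) as handle:
        print(handle.read().decode())
```

The application code keeps calling one function, so the database files can stay on a storage element instead of being copied to each worker node.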
EMBOSS workflow on “Lost Island GRID”
• Use of scripting techniques (DAG-like jobs)
• Complex for the end user
[Diagram nodes: DISTANCE, CONSENSUS, EXTRACT SEQ, ALIGNMENT, PARSIMONY, M. LIKELIHOOD]
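A DAG-like job script has to submit each stage only after the stages it depends on have finished. The sketch below groups the workflow stages into submission waves with Python's standard graphlib module (Python 3.9 or later). The stage names come from the slide, but the dependencies between them are an assumption made for illustration, since the slide text does not state them.

```python
from graphlib import TopologicalSorter

# ASSUMED dependencies: each stage maps to the stages it must wait for.
WORKFLOW = {
    "EXTRACT SEQ": set(),
    "ALIGNMENT": {"EXTRACT SEQ"},
    "DISTANCE": {"ALIGNMENT"},
    "PARSIMONY": {"ALIGNMENT"},
    "M. LIKELIHOOD": {"ALIGNMENT"},
    "CONSENSUS": {"DISTANCE", "PARSIMONY", "M. LIKELIHOOD"},
}


def submission_waves(dependencies):
    """Return lists of stages; stages within one list can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(submission_waves(WORKFLOW), start=1):
        print(number, wave)
```

In a real script, each wave would be submitted to the grid and polled for completion before the next one starts; hiding that bookkeeping from the user is one way to reduce the complexity the slide mentions.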
Original objectives
• "Get EMBOSS running over a Grid"
  - Execute EMBOSS jobs on a grid from the command line
  - Retrieve job results
  - Execute a complete sequence-analysis workflow/pipeline
• Complementary functions
  - EMBOSSed database search
  - Wrapping applications for EMBOSS over a Grid
  - Web interface and project manager for EMBOSS
  - Have a BioGrid portal
New objectives
• Implementation on the “Lost land of Mexico”
• Creation of “EMBOSS over a Grid” RPMs
• EMBOSS portal using Genius + wEMBOSS
• Workflow controlled through R-GMA
Thank you very much!
• EELA organizing committee
• Tutors
• Support staff
Bye Itacuruça