European Desktop Grid Infrastructure = EDGI
Combining Desktop and Service Grids to Support e-Scientists to Run Simulations
G. Terstyanszky, T. Kukla, T. Kiss, S. Winter, J.: Centre for Parallel Computing, School of Electronics and Computer Science, University of Westminster, London, United Kingdom
J. Kovacs, Z. Farkas, P. Kacsuk: MTA-SZTAKI, Budapest, Hungary
2010-04-27
Docking and Molecular Dynamics Simulations
[Figure: a protein (receptor) with its binding pocket and a sugar (ligand)]
Docking and Molecular Dynamics Simulations
• In-vitro (or wet lab) research
  • Investigates components of an organism that have been isolated from their usual biological surroundings, which permits a more detailed and convenient analysis than is possible with whole organisms.
• In-silico simulation
  • Simulates components of an organism, for example the docking of ligands to proteins: the molecules are downloaded from public libraries, bound together, and the properties of the resulting compound molecules are analysed.
• Aims of in-silico docking simulation
  • Understanding how pathogens bind to cell surface proteins, which can lead to the design of carbohydrate-based drugs and of diagnostic and therapeutic agents
  • Highlighting potential novel inhibitors and drugs for in-vitro and on-chip testing
Docking and Molecular Dynamics Simulations
• Advantages of in-silico methods:
  • Reduced time and cost
    • In-vitro experiments are expensive
  • Better focusing of wet laboratory resources
    • Better planning of experiments by selecting the best molecules to investigate
  • Increased number of molecules screened
• Problems of in-silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for an average bio-scientist
    • Linux command line interfaces
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?
In-silico Simulation in Service Grids
[Workflow diagram, organised into Phases 1 to 4. Inputs: PDB file 1 (Receptor) and PDB file 2 (Ligand). Steps shown: Check (MolProbity), Energy Minimization (Gromacs), Perform docking (AutoDock), Validate (MolProbity), Molecular Dynamics (Gromacs)]
In-silico Simulation in Service Grids
• Phase 1: pre-processing of the protein
• Phase 2: pre-processing of the sugar
• Phase 3: docking
• Phase 4: molecular dynamics simulation
• Executed on 5 different sites of the UK NGS
• Parameter sweeps in phases 3 and 4
• MPI in phase 4
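A parameter sweep such as the one used in phases 3 and 4 amounts to running the same step once per point in a grid of parameter values. The following is a minimal sketch of how such a job list could be expanded. The parameter names (`random_seed`, `grid_spacing`, `temperature_k`), their values, and the pairing of every docking point with every molecular dynamics point are assumptions made for illustration; the slides do not say which parameters were swept or how the two sweeps were combined.

```python
from itertools import product

# Hypothetical sweep axes: the slides do not name the swept parameters.
DOCKING_SWEEP = {"random_seed": [1, 2, 3], "grid_spacing": [0.375, 0.5]}
MD_SWEEP = {"temperature_k": [300, 310]}


def expand_sweep(axes):
    """Return one dict per point in the Cartesian product of the sweep axes."""
    names = sorted(axes)
    value_lists = [axes[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def plan_jobs(docking_axes, md_axes):
    """Pair every phase 3 (docking) point with every phase 4 (MD) point.

    This full pairing is an assumption, not something the slides state.
    """
    jobs = []
    for dock_params in expand_sweep(docking_axes):
        for md_params in expand_sweep(md_axes):
            jobs.append({"phase3": dock_params, "phase4": md_params})
    return jobs


jobs = plan_jobs(DOCKING_SWEEP, MD_SWEEP)
print(len(jobs), "jobs planned")
```

Each entry in `jobs` is independent of the others, which is what makes a sweep like this suitable for distribution across several grid sites.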
EDGI Infrastructure
Usage Scenario in Desktop – Service Grids
[Architecture diagram. Actors: e-scientist, DG admin. Components: EDGI Portal, EDGI Application Repository, SG Broker, SG->DG Bridge; a Service Grid with Compute Elements (1) to (n); a Desktop Grid with a Desktop Grid Server and Worker Nodes (1) to (m). Labelled interactions: search, select & download application’s implementation; submit application’s implementation; query implementation; retrieve & deploy impl]
EDGI Application Repository: Actors, Entities and Operations
Repository Entities
• Application: represents an application whose implementations can be executed on the EDGI infrastructure. It describes the inputs and outputs and explains what the application does.
• Implementation: one implementation of an application. It contains metadata and references (e.g. URLs) to all the files and data necessary to run the application on a given platform.
• Platform: describes the desktop grid and/or service grid environment where the implementation can be executed.
• Configuration: contains the implementation files required to run the application.
Repository Actors and Operations
[Figure: repository actors and their operations, grouped into actors without registration and actors with registration]
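The four repository entities can be pictured as a small data model. The sketch below is illustrative only: the field names, the `"DG"`/`"SG"` labels, the decision to attach a Configuration to each Implementation, and the example URL are all assumptions, not the actual schema of the EDGI Application Repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Platform:
    """A desktop grid or service grid environment (labels are assumed)."""
    name: str
    grid_type: str  # "DG" or "SG"


@dataclass
class Configuration:
    """Implementation files required to run the application."""
    files: List[str]


@dataclass
class Implementation:
    """One platform-specific implementation of an application."""
    platform: Platform
    file_urls: List[str]            # references to the needed files and data
    configuration: Configuration    # attachment point is an assumption
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Application:
    """Describes what the application does, plus its inputs and outputs."""
    name: str
    description: str
    inputs: List[str]
    outputs: List[str]
    implementations: List[Implementation] = field(default_factory=list)

    def implementations_for(self, grid_type: str) -> List[Implementation]:
        """Return the implementations that target the given grid type."""
        return [i for i in self.implementations
                if i.platform.grid_type == grid_type]


# Hypothetical example entry; the URL is a placeholder.
dg = Platform(name="example-desktop-grid", grid_type="DG")
app = Application(
    name="AutoDock",
    description="Molecular docking",
    inputs=["receptor.pdb", "ligand.pdb"],
    outputs=["dlg files"],
)
app.implementations.append(
    Implementation(
        platform=dg,
        file_urls=["https://example.org/autodock-dg.zip"],
        configuration=Configuration(files=["autodock-dg.zip"]),
    )
)
print(len(app.implementations_for("DG")), "desktop grid implementation(s)")
```

Keeping Platform separate from Implementation mirrors the slide's distinction between what an application is and where a particular build of it can run.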
EDGI Application Repository: User Interface
• Main menu: select users & groups, applications (implementations), platforms, and validation pages
• Action menu: create/delete entities, upload/download applications & implementations, add/edit/remove metadata
• Search: users & groups, applications & implementations, platforms
EDGI Application Repository in the EDGI Infrastructure
[Architecture diagram. Actors: e-scientist, DG admin. Components: EDGI Portal, EDGI Application Repository, SG Broker, SG->DG Bridge; a Service Grid with Compute Elements (1) to (n); a Desktop Grid with a Desktop Grid Server and Worker Nodes (1) to (m). Labelled interactions: search, select & download application’s implementation; submit application’s implementation; query implementation; retrieve & deploy impl]
University of Westminster Local Desktop Grid
DG clients:
• New Cavendish St: 576 nodes
• Marylebone Campus: 559 nodes
• Regent Street: 395 nodes
• Wells Street: 31 nodes
• Little Titchfield St: 66 nodes
• Harrow Campus: 254 nodes
Lifecycle of a DG node:
• PCs are primarily used by students and staff
• If a PC is unused, it switches to Desktop Grid mode
• When the DG server has no more work, the PC shuts down (green solution)
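The node lifecycle above behaves like a small state machine. The following is a minimal sketch of it together with the total node count across the six sites. The state names, the `next_state` function, and the rule that a returning user takes a PC back from Desktop Grid mode are assumptions added for illustration; the slide only states the three lifecycle bullets.

```python
from enum import Enum

# Node counts per site, as listed on the slide.
SITE_NODES = {
    "New Cavendish St": 576,
    "Marylebone Campus": 559,
    "Regent Street": 395,
    "Wells Street": 31,
    "Little Titchfield St": 66,
    "Harrow Campus": 254,
}


class NodeState(Enum):
    IN_USE = "used by students/staff"
    DESKTOP_GRID = "desktop grid mode"
    SHUTDOWN = "shut down"


def next_state(state, user_active, work_available):
    """Advance one PC by one step of the (assumed) lifecycle."""
    if state is NodeState.SHUTDOWN:
        return NodeState.SHUTDOWN       # restarting is outside this sketch
    if user_active:
        return NodeState.IN_USE         # assumption: users always take priority
    if state is NodeState.IN_USE:
        return NodeState.DESKTOP_GRID   # unused PC switches to DG mode
    # Already in DG mode with no user: keep working or power off.
    return NodeState.DESKTOP_GRID if work_available else NodeState.SHUTDOWN


total_nodes = sum(SITE_NODES.values())
print(total_nodes, "nodes in total")
```

Summing the listed sites gives 1,881 nodes in the local desktop grid.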
In Silico Docking User Scenario
• Research objectives:
  • Constructing a library of tens of thousands of small molecule candidates available in databases (e.g. DrugBank) and preparing PDBQT files
  • Screening the library against known targets using AutoDock Vina
  • Making the small molecule library available to other researchers
  • Validating promising candidates in vitro
[Workflow diagram: the bio-scientist supplies a pdb file (receptor) and a pdb file (ligand); prepare_receptor4.py and prepare_ligand4.py produce pdbqt files; AUTOGRID uses the gpf file and produces map files; several AUTODOCK instances use the dpf file and produce dlg files; SCRIPT1 and SCRIPT2 reduce the dlg files to the best dlg files and a final pdb file]
In-Silico Docking Workflow
Inputs: receptor.pdb, ligand.pdb, gpf descriptor file, dpf descriptor file, number of work units, and the AutoGrid executables and scripts (uploaded by the developer; do not change them)
• The Generator job creates the specified number of AutoDock jobs.
• The AutoGrid job creates pdbqt files from the pdb files, runs the autogrid application and generates the map files. It zips them into an archive file, which is the input of all AutoDock jobs.
• The AutoDock jobs run on the Desktop Grid and produce dlg files as output.
• The Collector job collects the dlg files, takes the best results and concatenates them into a pdb file.
Outputs: dlg files, output pdb file
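The Generator, AutoDock and Collector jobs form a fan-out/fan-in pattern: one job creates many independent work units, workers process them in parallel, and one job gathers and ranks the results. The sketch below imitates that shape on a single machine. It does not call AutoDock: `autodock_stub` is a hypothetical stand-in that returns a pseudo-random number in place of a docking energy, a thread pool stands in for the Desktop Grid, and ranking by lowest energy is an assumed selection criterion.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def generator(num_work_units):
    """Create one job description per work unit (here an id and a seed)."""
    return [{"unit": i, "seed": i} for i in range(num_work_units)]


def autodock_stub(job):
    """Hypothetical stand-in for one AutoDock run.

    Returns a fake 'dlg' result; the energy is pseudo-random,
    not a real docking score.
    """
    rng = random.Random(job["seed"])
    return {"unit": job["unit"], "energy": rng.uniform(-12.0, 0.0)}


def collector(results, keep_best):
    """Keep the lowest-energy results (assumed meaning of 'best')."""
    return sorted(results, key=lambda r: r["energy"])[:keep_best]


def run_workflow(num_work_units, keep_best):
    """Fan out the work units, then fan the results back in."""
    jobs = generator(num_work_units)
    # The thread pool plays the role of the Desktop Grid workers.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(autodock_stub, jobs))
    return collector(results, keep_best)


best = run_workflow(num_work_units=20, keep_best=3)
for result in best:
    print(result["unit"], round(result["energy"], 3))
```

Because every work unit carries its own seed and shares no state with the others, the units can run in any order on any worker, which is the property that makes this workflow a good fit for a desktop grid.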
EDGI Docking Portal
• Free access to pre-deployed molecular docking “primitive” scenarios running on the EDGI infrastructure
  • Random blind docking and virtual screening
• The DG versions of the applications come from the EDGI Application Repository (AR)
• Docking workflows are executed on the EDGeS@home Desktop Grid
Conclusions
• Computer Scientists
  • They created the combined desktop grid and service grid infrastructure on which e-scientists can run their applications
  • The EDGI Application Repository and Portal are able to support application developers, e-scientists and application validators
• Bio Scientists
  • The EDGI infrastructure can potentially provide biologists with very large amounts of computational power
  • They can offer access to methodology (application porting) and tools (portal and repository)
  • They have a library of small molecules available for screening and access to chip-based technology