Wrapping analytical services for caBIG

Taverna-caGrid technical review meeting. Stian Soiland-Reyes, myGrid, University of Manchester, UK, 2009-01-23. http://www.mygrid.org.uk/dev/wiki/display/caGrid

Presentation Transcript


  1. Wrapping analytical services for caBIG • Taverna-caGrid technical review meeting • Stian Soiland-Reyes, myGrid, University of Manchester, UK • 2009-01-23 • http://www.mygrid.org.uk/dev/wiki/display/caGrid

  2. Agenda • Project overview • Primary goals • Service selection • Services identified • Architecture • Service outputs • UML model • Template workflow • Work so far • Implementation plan

  3. Project overview • Taverna caGrid cooperation • Taverna workbench enhancements for caGrid • Grid-enabling analytical services • caGrid security support for Taverna • This presentation deals with the analytical services

  4. Primary goals • Identify two publicly available analytical web services currently accessible through Taverna • caGrid-enable the services; semantically described using caBIG’s infrastructure • Demonstrate building of workflows combining the new services with existing caBIG services

  5. Service selection • Selected services in collaboration with the caGrid Workflow working group, led by Juli • Winners: • NCBI Blast hosted by EBI • InterProScan hosted by EBI

  6. Why these services? • Freely available • Highly reliable, hosted by EBI • Widely used by the scientific community • Can be combined with existing caBIG tools in biologically meaningful workflows • caBIO, GridPIR, etc.

  7. Services identified • NCBI Blast • A popular similarity search tool using local sequence alignment • Supports sequences of proteins, DNA, RNA • Searches sequences in a whole range of databases • SWISSPROT, UNIPROT, NCBI, EMBL, etc. • SOAP web service hosted by EMBL-EBI

  8. Services identified • InterProScan • Integrates various databases of protein domains and functional sites • Searches using protein signature recognition methods • SOAP web service hosted by EMBL-EBI

  9. Architecture

  10. Architecture as pseudo code

        class CaGridClient:
            def main():
                endpointReference = wrappedService.invoke(inputs)
                endpointReference.subscribe()

            def resourcePropertyChanged():
                outputs = endpointReference.getResourceProperty()
                print "Result", outputs

        class WrappedService:
            def invoke(inputs):
                convertedInputs = dataConverter.convertFromCaGrid(inputs)
                jobId = serviceInvoker.invoke(convertedInputs)
                endpointReference = new EndpointReference(jobId)
                return endpointReference

            def outputReturned(jobId, outputs):
                convertedOutputs = dataConverter.convertToCaGrid(outputs)
                endpointReference.setResourceProperty(convertedOutputs)

        class ServiceInvoker:
            def invoke(convertedInputs):
                jobId = originalService.invoke(convertedInputs)
                return jobId
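The pseudo code above describes an asynchronous wrapper: the client gets an endpoint reference back immediately, subscribes to it, and is notified once the wrapped job finishes. One way this could be sketched as runnable Python is shown below. Everything here is an assumption made for illustration: the in-memory `EndpointReference` merely stands in for a WS-Resource, `fake_original_service` stands in for the EBI SOAP service, and the two converter functions are hypothetical placeholders for the real caGrid data conversion.

```python
import threading
import uuid


def convert_from_cagrid(inputs):
    # Hypothetical converter: caGrid-style input -> original service input.
    return {"sequence": inputs["sequence"].strip().upper()}


def convert_to_cagrid(outputs):
    # Hypothetical converter: original service output -> caGrid-style output.
    return {"matches": sorted(outputs)}


def fake_original_service(converted_inputs):
    # Stand-in for the remote service; returns one made-up hit.
    return ["hit-%d" % len(converted_inputs["sequence"])]


class EndpointReference:
    """In-memory stand-in for a WS-Resource holding one result property."""

    def __init__(self, job_id):
        self.job_id = job_id
        self._value = None
        self._listeners = []
        self._lock = threading.Lock()

    def subscribe(self, listener):
        with self._lock:
            self._listeners.append(listener)
            already_set = self._value is not None
        if already_set:
            listener(self)  # result arrived before the subscription

    def set_resource_property(self, value):
        with self._lock:
            self._value = value
            listeners = list(self._listeners)
        for listener in listeners:
            listener(self)

    def get_resource_property(self):
        with self._lock:
            return self._value


class ServiceInvoker:
    """Runs the original service in a background thread and reports back."""

    def __init__(self, original_service, on_output):
        self._original_service = original_service
        self._on_output = on_output

    def invoke(self, converted_inputs):
        job_id = uuid.uuid4().hex

        def run():
            self._on_output(job_id, self._original_service(converted_inputs))

        threading.Thread(target=run, daemon=True).start()
        return job_id


class WrappedService:
    def __init__(self, original_service):
        self._invoker = ServiceInvoker(original_service, self.output_returned)
        self._references = {}
        self._lock = threading.Lock()

    def invoke(self, inputs):
        converted = convert_from_cagrid(inputs)
        # Hold the lock so output_returned cannot run before registration.
        with self._lock:
            job_id = self._invoker.invoke(converted)
            reference = EndpointReference(job_id)
            self._references[job_id] = reference
        return reference

    def output_returned(self, job_id, outputs):
        with self._lock:
            reference = self._references.pop(job_id)
        reference.set_resource_property(convert_to_cagrid(outputs))


if __name__ == "__main__":
    done = threading.Event()

    def resource_property_changed(reference):
        print("Result", reference.get_resource_property())
        done.set()

    service = WrappedService(fake_original_service)
    reference = service.invoke({"sequence": " mkv "})
    reference.subscribe(resource_property_changed)
    done.wait(timeout=5)
```

The lock in `WrappedService.invoke` is a design choice of this sketch, not something stated on the slide: it keeps a fast-finishing job from reporting its output before the endpoint reference has been registered.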

  11. Output InterProScan (Untranslated)

        <EBIInterProScanResults xmlns="http://www.ebi.ac.uk/schema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="http://www.ebi.ac.uk/schema/InterProScanResult.xsd">
          <Header>..</Header>
          <interpro_matches>
            <protein id="uniprot|P01174|WAP_RAT" length="137" crc64="1C2E8ADA9FD97949" >
              <interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core" type="Domain" parent_id="IPR015874">
                <child_list><rel_ref ipr_ref="IPR008198"/></child_list>
                <contains><rel_ref ipr_ref="IPR002098"/></contains>
                <classification id="GO:0030414" class_type="GO">
                  <category>Molecular Function</category>
                  <description>protease inhibitor activity</description>
                </classification>
                <match id="G3DSA:4.10.75.10" name="Whey_acidic_protein_4-diS_core" dbname="GENE3D">
                  <location start="77" end="128" score="9.899996308397199E-5" status="T" evidence="Gene3D" />
                </match>
                <match id="PF00095" name="WAP" dbname="PFAM">
                  <location start="30" end="72" score="6.30000254573025E-5" status="T" evidence="HMMPfam" />
                  <location start="79" end="126" score="1.59999889349247E-14" status="T" evidence="HMMPfam" />
                </match>
              </interpro>
              <interpro id="IPR008198" name="Proteinase inhibitor I17" type="Domain" parent_id="IPR008197"> ...</interpro>
            </protein>
          </interpro_matches>
        </EBIInterProScanResults>
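To illustrate what translating this untranslated output into typed objects might involve, here is a minimal sketch using only the Python standard library. The dataclass names and fields are assumptions chosen for the example and are not the project's UML model; the sample document is a trimmed copy of the XML on this slide.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Default namespace declared on the EBIInterProScanResults root element.
NS = {"ebi": "http://www.ebi.ac.uk/schema"}


@dataclass
class Location:
    start: int
    end: int
    score: float
    evidence: str


@dataclass
class Match:
    id: str
    name: str
    dbname: str
    locations: List[Location]


@dataclass
class InterProEntry:
    id: str
    name: str
    type: str
    matches: List[Match]


def parse_interproscan(xml_text):
    """Collect every <interpro> element with its matches and locations."""
    root = ET.fromstring(xml_text)
    entries = []
    for ipr in root.iterfind(".//ebi:interpro", NS):
        matches = []
        for m in ipr.iterfind("ebi:match", NS):
            locations = [
                Location(
                    start=int(loc.get("start")),
                    end=int(loc.get("end")),
                    score=float(loc.get("score")),
                    evidence=loc.get("evidence"),
                )
                for loc in m.iterfind("ebi:location", NS)
            ]
            matches.append(Match(m.get("id"), m.get("name"), m.get("dbname"), locations))
        entries.append(InterProEntry(ipr.get("id"), ipr.get("name"), ipr.get("type"), matches))
    return entries


# Trimmed copy of the slide's XML, used only to exercise the parser.
SAMPLE = """<EBIInterProScanResults xmlns="http://www.ebi.ac.uk/schema">
  <interpro_matches>
    <protein id="uniprot|P01174|WAP_RAT" length="137" crc64="1C2E8ADA9FD97949">
      <interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core"
                type="Domain" parent_id="IPR015874">
        <match id="PF00095" name="WAP" dbname="PFAM">
          <location start="30" end="72" score="6.30000254573025E-5"
                    status="T" evidence="HMMPfam"/>
          <location start="79" end="126" score="1.59999889349247E-14"
                    status="T" evidence="HMMPfam"/>
        </match>
      </interpro>
    </protein>
  </interpro_matches>
</EBIInterProScanResults>"""

if __name__ == "__main__":
    for entry in parse_interproscan(SAMPLE):
        for match in entry.matches:
            spans = [(loc.start, loc.end) for loc in match.locations]
            print(entry.id, match.dbname, match.id, spans)
```

Because the result document declares a default namespace, every element lookup has to be namespace-qualified; a bare `find("interpro")` would return nothing.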

  13. UML model: wrapped InterProScan

  14. UML model: wrapped NCBIBlast

  15. Template workflow • EBI_dbfetch_fetchBatch will be replaced with the caBIG service caBIO • This workflow uses both NCBIBlast and InterProScan, which will be replaced with the wrapped services • http://www.myexperiment.org/workflows/230

  16. Work so far • Identified services and example workflow • Described services (Deliverable 3.2) • Modelled service inputs and outputs in UML according to caGrid guidelines • Still a few tweaks needed for WS-Resource usage • Architecture and implementation plan for wrapping services (Deliverable 3.3) • JavaDoc needs updating for WS-Resource

  17. Implementation plan • Generate Common Data Elements for inputs and outputs and verify Silver compatibility • Generate semantically annotated XMIs • Submit Silver compatibility review package • Implement and deploy wrapped services • Using Introduce and possibly gRavi • Implement, test, deploy • We’ll start with this before submitting CDEs • Build caGrid-based workflow using services

  18. Any questions..?