
BioMOBY Services


Presentation Transcript


  1. BioMOBY Services Enrique de Andrés

  2. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements

  3. The problem…
    • Scientific work requires:
      • Data resources: genomic sequences, protein sets, expression data, …
      • Computational resources: similarity searches, alignments, domain prediction, functional classification, clustering, …
    • Often these resources exist and are available, but they are:
      • Hard to find.
      • Distributed all over the world.
      • Not in a common format.

  4. Result… painful research!

  5. Solution…
    • Web Services:
      • Provide data or computational resources over the WWW.
      • Can be accessed automatically (an application-centric web).
    • Additional advantages:
      • Work for everyone who has internet access (no firewall obstacles, …).
      • Independent of programming language.
      • Use broadly accepted protocols.

  6. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements

  7. BioMOBY
    • BioMOBY was initiated in 2001 as a collaboration between several model organism database providers.
    • A system for interoperability between biological data hosts and analytical services.
    • A simple, open-source platform for the discovery, integration, representation and retrieval of biological data.
    • Two branches:
      • MOBY-S: follows the Web Service paradigm.
      • S-MOBY: uses semantic web technology (not covered here).

  8. The MOBY-S plan
    • Create an ontology of bioinformatics data-types.
    • Define a serialization of this ontology (data syntax).
    • Create an open API over this ontology (so that independent service providers can build data-types).
    • Define Web Service inputs and outputs using that ontology.
    • Register services in an ontology-aware registry.
    • BioMOBY advantages:
      • Machines can find an appropriate service.
      • Machines can execute that service unattended.
      • The ontology is community-extensible.

  9. MOBY-S vs. general Web Services
    • The registry is MOBY Central.
    • Ontologies are used.
    • BioMOBY services operate on MOBY objects.
    • Namespaces are used.
    • BioMOBY has its own messaging structure for the registration, discovery and invocation of services.

  10. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
      • Object ontology
      • Service ontology
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements

  11. BioMOBY ontology
    • Ontology:
      • A formally defined system of things, and of the relations between them, for representing knowledge.
      • Usually an ontology builds a hierarchy of objects to describe the relations in a certain domain.
    • BioMOBY ontology:
      • Usage of namespaces.
      • Object (data) ontology: semantic/syntactic data-types.
      • Service ontology.

  12. Object ontology
    • Any identifiable piece of data is an "entity".
    • Identifiers for these entities fall under "namespaces":
      • NCBI has gi numbers (gi namespace).
      • GO terms have accession numbers (GO namespace).
    • A namespace indicates the semantic type of the data:
      • GO:0003476 is a Gene Ontology term.
      • gi|163483 is a GenBank record.
    • Namespace + ID precisely specifies a data "entity".
    • Identifiers are not opaque: they are semantically rich.

  13. Object ontology
    [Diagram: two nodes joined by an edge]
    • Data types are defined in an open, shared, GO-like ontology:
      • GO is used as a model because of its familiarity in the community.
      • Nodes define data classes.
      • Edges define the relationships between classes.
    • An edge defines one of three relationships:
      • ISA: inheritance relationship. All properties of the parent are present in the child.
      • HASA: container relationship of exactly 1.
      • HAS: container relationship of 1 or more.

  14. The simplest MOBY data-type: Object
      <Object namespace='NCBI_gi' id='111076'/>
    • The combination of a namespace and an identifier within that namespace uniquely identifies a data entity, not its location(s) or its representation.
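As a minimal sketch of the point above, the base Object can be built with Python's standard `xml.etree.ElementTree`. The helper names `moby_object` and `same_entity` are hypothetical and are not part of any BioMOBY library; only the element name and the two attributes come from the slide.

```python
import xml.etree.ElementTree as ET


def moby_object(namespace: str, ident: str) -> ET.Element:
    """Build a base MOBY Object: an empty element carrying namespace + id."""
    return ET.Element("Object", {"namespace": namespace, "id": ident})


def same_entity(a: ET.Element, b: ET.Element) -> bool:
    """Two objects denote the same entity when namespace and id both match."""
    return (a.get("namespace"), a.get("id")) == (b.get("namespace"), b.get("id"))


gi_record = moby_object("NCBI_gi", "111076")
serialized = ET.tostring(gi_record, encoding="unicode")
print(serialized)
```

Nothing in the element says where the record lives or how it is represented; the (namespace, id) pair is the whole identity.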

  15. Primitive data-types
    [Diagram: DateTime ISA Object; Float ISA Object; Integer ISA Object; String ISA Object]
      <Integer namespace='' id=''>38</Integer>

  16. Derived data-types
    [Diagram: Integer ISA Object; String ISA Object; VirtualSequence ISA Object; VirtualSequence HASA Integer]
      <VirtualSequence namespace='NCBI_gi' id='111076'>
        <Integer namespace='' id='' articleName="length">38</Integer>
      </VirtualSequence>

  17. Derived data-types
    [Diagram: as on the previous slide, plus GenericSequence ISA VirtualSequence; GenericSequence HASA String]
      <GenericSequence namespace='NCBI_gi' id='111076'>
        <Integer namespace='' id='' articleName="length">38</Integer>
        <String namespace='' id='' articleName="SequenceString">
          ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
        </String>
      </GenericSequence>

  18. Derived data-types
    [Diagram: as on the previous slide, plus DNASequence ISA GenericSequence]
      <DNASequence namespace='NCBI_gi' id='111076'>
        <Integer namespace='' id='' articleName="length">38</Integer>
        <String namespace='' id='' articleName="SequenceString">
          ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
        </String>
      </DNASequence>
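The ISA and HASA edges on the last few slides can be modelled as a small lookup table. The sketch below is only an illustration of how inheritance carries members down the hierarchy: the class layout is read off the slide diagrams, and the names `DataClass`, `ONTOLOGY`, `lineage` and `members` are hypothetical, not BioMOBY API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DataClass:
    name: str
    isa: Optional[str] = None  # parent class (ISA edge); None for the root
    # HASA edges as (articleName, class name); each is contained exactly once
    hasa: List[Tuple[str, str]] = field(default_factory=list)


# Assumed layout, taken from the diagrams on the slides above.
ONTOLOGY: Dict[str, DataClass] = {
    c.name: c
    for c in [
        DataClass("Object"),
        DataClass("Integer", isa="Object"),
        DataClass("String", isa="Object"),
        DataClass("VirtualSequence", isa="Object", hasa=[("length", "Integer")]),
        DataClass("GenericSequence", isa="VirtualSequence",
                  hasa=[("SequenceString", "String")]),
        DataClass("DNASequence", isa="GenericSequence"),
    ]
}


def lineage(name: str) -> List[str]:
    """Follow ISA edges from a class up to the root."""
    chain = []
    current: Optional[str] = name
    while current is not None:
        chain.append(current)
        current = ONTOLOGY[current].isa
    return chain


def members(name: str) -> List[Tuple[str, str]]:
    """Collect HASA members, inherited ones first (ISA passes them down)."""
    result: List[Tuple[str, str]] = []
    for cls in reversed(lineage(name)):
        result.extend(ONTOLOGY[cls].hasa)
    return result


print(lineage("DNASequence"))
print(members("DNASequence"))
```

DNASequence declares no members of its own, yet `members("DNASequence")` returns both `length` and `SequenceString`, which matches the XML on slide 18.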

  19. Legacy file formats
    • Classes that contain a String allow us to define ontological classes that represent legacy data-types.
      <NCBI_Blast_Report namespace='NCBI_gi' id='115325'>
        <String namespace='' id='' articleName='content'>
          TBLASTN 2.0.4 [Feb-24-1998]
          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
          Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
          (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search
          programs", Nucleic Acids Res. 25:3389-3402.
          Query= gi|1401126
          (504 letters)
          Database: Non-redundant GenBank+EMBL+DDBJ+PDB sequences
          336,723 sequences; 677,679,054 total letters
          Searching...done
                                                                       Score     E
          Sequences producing significant alignments:                  (bits)  Value
          gb|U49928|HSU49928 Homo sapiens TAK1 binding protein (TAB1) mRNA... 1009 0.0
          emb|Z36985|PTPP2CMR P.tetraurelia mRNA for protein phosphatase t... 58 4e-07
        </String>
      </NCBI_Blast_Report>

  20. Binaries: pictures, movies, …
    • We base64-encode binaries, and then define a hierarchy of data classes that contain a String.
    • base64_encoded_jpeg ISA text/base64 ISA text/plain HASA String
      <base64_encoded_jpeg namespace='TAIR_image' id='3343532'>
        <String namespace='' id='' articleName='content'>
          MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
          Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
          MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
          Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
          BAgTDFdlc3Rlcm4gQ2FwZTESMBAGA1UEBxMJQ2FwZSBUb3duMQ8wDQYDVQQKEwZUaGF3dGUx
          HTAbBgNVBAsTFENlcnRpZmljYXRlIFNlcnZpY2VzMSgwJgYDVQQDEx9QZXJzb25hbCBGcmVl
          bWFpbCBSU0EgMjAwMC44LjMwMB4XDTAyMDkxNTIxMDkwMVoXDTAzMDkxNTIxMDkwMVowQjEf
          MB0GA1UEAxMWVGhhd3RlIEZyZWVtYWlsIE1lbWJlcjEfMB0GCSqGSIb3DQEJARYQamprM0Bt
        </String>
      </base64_encoded_jpeg>
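A rough sketch of the encoding step described on this slide, using Python's standard `base64` and `xml.etree.ElementTree` modules. The functions `wrap_binary` and `unwrap_binary` are hypothetical helpers for illustration; the element and attribute names are the ones shown on the slide, and the payload is a placeholder byte string rather than a real image.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(data: bytes, namespace: str, ident: str) -> ET.Element:
    """Base64-encode raw bytes and store them in the 'content' String."""
    obj = ET.Element("base64_encoded_jpeg", {"namespace": namespace, "id": ident})
    content = ET.SubElement(
        obj, "String", {"namespace": "", "id": "", "articleName": "content"}
    )
    content.text = base64.b64encode(data).decode("ascii")
    return obj


def unwrap_binary(obj: ET.Element) -> bytes:
    """Find the 'content' String and decode it back to raw bytes."""
    for child in obj.findall("String"):
        if child.get("articleName") == "content":
            return base64.b64decode(child.text or "")
    raise ValueError("object has no String with articleName='content'")


payload = b"\xff\xd8\xff\xe0 placeholder bytes, not a real JPEG"
wrapped = wrap_binary(payload, "TAIR_image", "3343532")
restored = unwrap_binary(wrapped)
```

Because the binary travels as ordinary text inside a String, any client that understands String can at least carry the object, even if it cannot render the image.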

  21. Extending legacy data-types
    • With legacy data-types defined, we can extend them as we see fit:
      • annotated_jpeg ISA base64_encoded_jpeg
      • annotated_jpeg HASA 2D_Coordinate_set
      • annotated_jpeg HASA Description
      <annotated_jpeg namespace='TAIR_Image' id='3343532'>
        <2D_Coordinate_set namespace='' id='' articleName="pixelCoordinates">
          <Integer namespace='' id='' articleName="x_coordinate">3554</Integer>
          <Integer namespace='' id='' articleName="y_coordinate">663</Integer>
        </2D_Coordinate_set>
        <String namespace='' id='' articleName="Description">
          This is the phenotype of a ufo-1 mutant under long daylength, 16°C
        </String>
        <String namespace='' id='' articleName="content">
          MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
          Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
        </String>
      </annotated_jpeg>

  22. Additional information
    • Information blocks allow additional information to be included in objects:
      • Cross Reference Information Blocks (CRIB)
      • Provision Information Blocks (PIB)
      <annotated_jpeg namespace='TAIR_Image' id='3343532'>
        <CrossReference>
          <Object namespace="TAIR_Allele" id="ufo-1"/>
        </CrossReference>
        <2D_Coordinate_set namespace='' id='' articleName="pixelCoordinates">
          <Integer namespace='' id='' articleName="x_coordinate">3554</Integer>
          <Integer namespace='' id='' articleName="y_coordinate">663</Integer>
        </2D_Coordinate_set>
        <String namespace='' id='' articleName="Description">
          This is the phenotype of a ufo-1 mutant under long daylength, 16°C
        </String>
        <String namespace='' id='' articleName="content">
          MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
          Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
        </String>
      </annotated_jpeg>

  23. Cross Reference Information Blocks (CRIB)
      <CrossReference>
        ... one or more cross-references ...
      </CrossReference>
    • The content of a CRIB may include only two types of element:
      • A base MOBY Object ('Object' class): the cross-referenced piece of data.
          <Object namespace='' id=''/>
      • An Xref-type cross-reference object: names a service that could be executed in order to interpret the meaning of the piece of data.
          <Xref namespace='' id='' authURI='' serviceName='' evidenceCode='' xrefType=''>
            ... Description ...
          </Xref>

  24. Cross Reference Information Blocks (CRIB)
    • namespace and id: fulfil the same role as in the Object-style cross-reference.
    • authURI and serviceName: together they uniquely identify a particular MOBY service that the current service provider suggests you execute with this cross-reference (namespace/id) in order to interpret its meaning correctly.
    • xrefType:
      • Should take its value from the Cross-Reference-Type Ontology, which defines a variety of semantic relationships that may exist between cross-references and the objects that contain them. This ontology does not exist yet.
      • For now, xrefType values are free-form strings.
    • evidenceCode: indicates the 'quality' of the evidence that was used to make the cross-reference assertion. It is a term from the GO evidence codes list:
      • IC: Inferred by Curator
      • IDA: Inferred from Direct Assay
      • …

  25. Cross Reference Information Blocks (CRIB)
      <moby:CrossReference>
        <moby:Object moby:namespace="PMID" moby:id="12511062"/>
        <moby:Object moby:namespace="PMID" moby:id="12075666"/>
        <moby:Xref moby:namespace="EMBL"
                   moby:id="X112345"
                   authURI="www.illuminae.com"
                   serviceName="getEMBLRecord"
                   evidenceCode="IEA"
                   xrefType="transform"/>
      </moby:CrossReference>
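A client that receives such a block might read it as sketched below with Python's standard `xml.etree.ElementTree`. Two things are assumptions here: the `moby` prefix is bound to the `http://www.biomoby.org/moby` URI that appears in the message examples later in the deck (the slide itself omits the declaration), and the helper names `q` and `read_crib` are hypothetical.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"  # assumed binding for the moby: prefix

CRIB = """\
<moby:CrossReference xmlns:moby="http://www.biomoby.org/moby">
  <moby:Object moby:namespace="PMID" moby:id="12511062"/>
  <moby:Object moby:namespace="PMID" moby:id="12075666"/>
  <moby:Xref moby:namespace="EMBL" moby:id="X112345"
             authURI="www.illuminae.com" serviceName="getEMBLRecord"
             evidenceCode="IEA" xrefType="transform"/>
</moby:CrossReference>
"""


def q(name: str) -> str:
    """Qualify a local name with the MOBY namespace, ElementTree style."""
    return "{%s}%s" % (MOBY_NS, name)


def read_crib(xml_text: str):
    """Return (object cross-references, Xref cross-references)."""
    root = ET.fromstring(xml_text)
    objects = [
        (e.get(q("namespace")), e.get(q("id"))) for e in root.findall(q("Object"))
    ]
    xrefs = [
        {
            "namespace": e.get(q("namespace")),
            "id": e.get(q("id")),
            # These four attributes are unprefixed on the slide.
            "authURI": e.get("authURI"),
            "serviceName": e.get("serviceName"),
            "evidenceCode": e.get("evidenceCode"),
            "xrefType": e.get("xrefType"),
        }
        for e in root.findall(q("Xref"))
    ]
    return objects, xrefs


objects, xrefs = read_crib(CRIB)
```

The Object entries only point at related data, while each Xref entry additionally names the service (authURI plus serviceName) suggested for interpreting the reference.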

  26. Provision Information Blocks (PIB)
    • Contain metadata about the service that was invoked:
      • database version, software version, execution time
      • additional parameters used to invoke the service, ...
    • In the current MOBY API, the content of these elements is only loosely defined and is meant primarily to be human-readable.
      <ProvisionInformation>
        ... one or more of the provision elements (below) ...
      </ProvisionInformation>
    • Provision elements:
      <serviceSoftware software_name="" software_version="" software_comment=""/>
      <serviceDatabase database_name="" database_version="" database_comment=""/>
      <serviceComment>comment here</serviceComment>

  27. Service ontology
    • Simple ISA hierarchy.
    • Primitive types include the following (the list can be modified):
      • Analysis
      • Parsing
      • Registration
      • Retrieval
      • Resolution
      • Conversion
      • Rendering

  28. Service ontology
    [Diagram: example service hierarchy with the nodes Service, Parsing, Parse_NCBI_Blast, Analysis, Alignment, Blast, WU_Blast and NCBI_Blast]

  29. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements

  30. How BioMOBY works
    • Technologically, BioMOBY services are general Web Services.
    [Diagram: life cycle of a service]
      1) Service development
      2) Service publication
      3) Service discovery
      4) Service request
      5) Service response

  31. How BioMOBY works
    • BioMOBY defines a new layer on the protocol stack in order to work with its ontology.
    • BioMOBY has its own messaging structure for the registration, discovery and invocation of services.

  32. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements

  33. Client-Provider interaction
    [Diagram: protocol stack and message layout]
    • One message carries BioMOBY service requests 0, 1, …, N; each request holds primary articles (simples / collections) and secondary articles.
    • Service description: 1 input parameter containing the full XML BioMOBY input, and 1 output parameter containing the full XML BioMOBY output.
    • Labels shown in the stack: Biological Data, Service Description, XML Message, Moby Network, WSDL, SOAP, HTTP, TCP / IP.

  34. Client → Provider messages
    • Several service requests (0, 1, …, N) are packed into one invocation; each mobyData block is one request.
      <?xml version="1.0" encoding="UTF-8"?>
      <MOBY xmlns="http://www.biomoby.org/moby">
        <mobyContent>
          <mobyData queryID="0">
            <!-- Primary/Secondary articles -->
          </mobyData>
          <mobyData queryID="1">
            <!-- Primary/Secondary articles -->
          </mobyData>
          …
          <mobyData queryID="N">
            <!-- Primary/Secondary articles -->
          </mobyData>
        </mobyContent>
      </MOBY>

  35. Provider → Client messages
    • Several service responses (0, 1, …, N) are packed into one invocation response; each mobyData block is one response.
      <?xml version="1.0" encoding="UTF-8"?>
      <MOBY xmlns="http://www.biomoby.org/moby">
        <mobyContent>
          <mobyData queryID="0">
            <!-- Primary articles -->
          </mobyData>
          <mobyData queryID="1">
            <!-- Primary articles -->
          </mobyData>
          …
          <mobyData queryID="N">
            <!-- Primary articles -->
          </mobyData>
        </mobyContent>
      </MOBY>

  36. Elemental requests/responses
      <mobyData queryID="0">
        <Simple articleName="in_or_out_data_name_0">
          <!-- object from the ontology -->
        </Simple>
        …
        <Collection articleName="in_or_out_data_name_1">
          <Simple>
            <!-- object from the ontology -->
          </Simple>
          …
        </Collection>
        …
        <Parameter articleName="in_param_name_0">
          <Value>param_value</Value>
        </Parameter>
        …
      </mobyData>
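Putting slides 34 and 36 together, a request envelope could be assembled as in the sketch below, which uses Python's standard `xml.etree.ElementTree`. This is an illustration under stated assumptions, not a BioMOBY client library: `build_request`, `q` and the article name `sequence_id` are hypothetical, every element is placed in the `http://www.biomoby.org/moby` namespace, and Collections are left out for brevity.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"


def q(name: str) -> str:
    """Qualify a local name with the MOBY namespace, ElementTree style."""
    return "{%s}%s" % (MOBY_NS, name)


def build_request(jobs):
    """Build one MOBY message from a list of (articles, parameters) jobs.

    articles:   list of (articleName, object element) -> one Simple each
    parameters: dict of articleName -> value          -> one Parameter each
    Each job becomes a mobyData block whose queryID is its list position.
    """
    root = ET.Element(q("MOBY"))
    content = ET.SubElement(root, q("mobyContent"))
    for query_id, (articles, parameters) in enumerate(jobs):
        data = ET.SubElement(content, q("mobyData"), {"queryID": str(query_id)})
        for article_name, obj in articles:
            simple = ET.SubElement(data, q("Simple"), {"articleName": article_name})
            simple.append(obj)
        for name, value in parameters.items():
            param = ET.SubElement(data, q("Parameter"), {"articleName": name})
            ET.SubElement(param, q("Value")).text = str(value)
    return root


def gi_object(ident: str) -> ET.Element:
    """A base Object in the NCBI_gi namespace."""
    return ET.Element(q("Object"), {"namespace": "NCBI_gi", "id": ident})


request = build_request([
    ([("sequence_id", gi_object("111076"))], {"in_param_name_0": "param_value"}),
    ([("sequence_id", gi_object("163483"))], {}),
])
print(ET.tostring(request, encoding="unicode"))
```

Both lookups travel in a single invocation, and the provider is expected to answer each mobyData block under the same queryID.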

  37. Global service information
    • Global service information block: serviceNotes
      <?xml version="1.0" encoding="UTF-8"?>
      <MOBY xmlns="http://www.biomoby.org/moby">
        <mobyContent>
          <serviceNotes>
            <Notes>Free text Service Notes</Notes>
          </serviceNotes>
          …
        </mobyContent>
      </MOBY>

  38. Error handling
    • Extension of the global service information block (serviceNotes).
    • Attributes of mobyException:
      • severity:
        • error: fatal error in the service.
        • warning: the service detects an error or potential problem but continues.
        • information: non-erroneous informative message.
      • refQueryID (optional): refers to the queryID of the offending input mobyData.
      • refElement (optional): refers to the article of the offending input simple or collection.
      <serviceNotes>
        <mobyException severity="" refQueryID="" refElement="">
          <exceptionCode>code</exceptionCode>
          <exceptionMessage>message</exceptionMessage>
        </mobyException>
        <Notes>Free text Service Notes</Notes>
      </serviceNotes>

  39. Error handling: example response
      <?xml version="1.0" encoding="UTF-8"?>
      <MOBY xmlns="http://www.biomoby.org/moby">
        <mobyContent>
          <serviceNotes>
            <mobyException refElement="" refQueryID="1" severity="error">
              <exceptionCode>600</exceptionCode>
              <exceptionMessage>Unable to execute the service</exceptionMessage>
            </mobyException>
            <Notes>Free text Service Notes</Notes>
          </serviceNotes>
          <mobyData queryID="1"/>
        </mobyContent>
      </MOBY>
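On the client side, the exceptions in such a response could be pulled out as sketched below with Python's standard `xml.etree.ElementTree`. The function `read_exceptions` and the helper `q` are hypothetical names; the element and attribute names are the ones on the slide.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"

RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<MOBY xmlns="http://www.biomoby.org/moby">
  <mobyContent>
    <serviceNotes>
      <mobyException refElement="" refQueryID="1" severity="error">
        <exceptionCode>600</exceptionCode>
        <exceptionMessage>Unable to execute the service</exceptionMessage>
      </mobyException>
      <Notes>Free text Service Notes</Notes>
    </serviceNotes>
    <mobyData queryID="1"/>
  </mobyContent>
</MOBY>
"""


def q(name: str) -> str:
    """Qualify a local name with the MOBY namespace, ElementTree style."""
    return "{%s}%s" % (MOBY_NS, name)


def read_exceptions(xml_bytes: bytes):
    """Collect every mobyException in a response as a plain dict."""
    root = ET.fromstring(xml_bytes)
    found = []
    for exc in root.iter(q("mobyException")):
        found.append({
            "severity": exc.get("severity"),
            "refQueryID": exc.get("refQueryID"),
            "refElement": exc.get("refElement"),
            "code": exc.findtext(q("exceptionCode")),
            "message": exc.findtext(q("exceptionMessage")),
        })
    return found


errors = read_exceptions(RESPONSE)
# Queries that failed fatally, keyed by the queryID they refer to.
failed_queries = {e["refQueryID"] for e in errors if e["severity"] == "error"}
```

Matching refQueryID against the queryID of each mobyData block lets a client tell which of the batched requests failed while still using the results of the others.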

  40. Outline
    • The problem
    • The BioMOBY idea
    • BioMOBY ontologies
    • How BioMOBY works
    • Message exchanges
    • BioMOBY elements
      • MOBY-Central
      • Client side
      • Server side

  41. BioMOBY Elements

  42. The Registry: MOBY Central
    • The MOBY project provides MOBY Central as a Perl server.
    • It is a directory of services and data-types, and of how to locate them.
    [Figure: worldwide distribution of MOBY services]

  43. Client Side
    • There are different kinds of clients.
    • Some of them allow the creation of workflows.
    [Figure: client types, including programmatic libraries]

  44. Client Side: MOWServ
    • Web-browser-based client.
    • Services are discovered through the data-type ontology or the service-type ontology.
    • Service outputs can easily be connected to service inputs.
    • The interface helps with the construction of MOBY objects.

  45. Client Side: MOWServ
    [Screenshot: data-type and service ontologies]

  46. Client Side: MOWServ
    [Screenshot: steps for running a service]
      1) Ontology browsing & service selection
      2) Input submission
      3) Selection of the output name
      4) Service submission
      5) Check execution status
      6) Check results

  47. Client Side: MOWServ
    [Screenshot: result view]
    • Integrated HTML visualizer
    • Raw XML visualizer
    • Download of the MOBY object
    • List of the services available for this data-type object

  48. Client Side: Taverna
    • Java-based graphical integrated workbench.
    • It allows the construction of complex distributed workflows.
    • It can handle different kinds of services (MOBY and others).

  49. Client Side: Taverna
    [Screenshot: a Taverna workflow; processors = Web Services, shown with their inputs and outputs]

  50. Client Side: Dashboard
    [Screenshot: steps for running a service]
      1) Select the client execution tab
      2) Select the service to execute
      3) Fill in the input
      4) Execute the service
      5) Check the output
