1 / 28

Cancer Bioinformatics Infrastructure Objects

Cancer Bioinformatics Infrastructure Objects. An open-source architecture for Biomedical Informatics. Peter A. Covitz Himanso Sahni Scott Gustafson. NCI Center for Bioinformatics. Established in early 2001 by Kenneth Buetow from the NCI LPG

ramona
Télécharger la présentation

Cancer Bioinformatics Infrastructure Objects

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Cancer Bioinformatics Infrastructure Objects An open-source architecture for Biomedical Informatics Peter A. Covitz Himanso Sahni Scott Gustafson

  2. NCI Center for Bioinformatics • Established in early 2001 by Kenneth Buetow from the NCI LPG • Folded in several existing programs, launched new ones • Mission to bridge basic and clinical research via bio- medical- informatics

  3. NCICB-supported initiatives • Cancer Genome Anatomy Project • ESTs, SAGE, SNPs, Pathways • Director’s Challenge, Molecular Signatures of Cancer • Microarray • Mouse Models of Human Cancer Consortium • Pathology, microarray • Clinical Trials • Protocols, agents, targets, patient cases

  4. caCORE • Core Infrastructure for biomedical application development • NCI Enterprise Vocabulary Services Thesauri and Ontologies • caBIO Object-modeled middleware and APIs • Cancer Data Standards Repository (caDSR) Metadata

  5. caBIO Overview • caBIO modeled in UML and implemented in J2EE • Genomics, Genetics • Clinical Trials • Access to a variety of data sources hosted at NCICB

  6. Development Process: UML • Use Cases • Class Diagrams • Sequence Diagrams • Iterative Development

  7. Object Model

  8. Java Packages • gov.nih.nci.caBIO.bean • 45 domain objects • gov.nih.nci.caBIO.util.das • 21 domain objects • Used JAXB to convert DAS DTDs to objects • gov.nih.nci.caBIO.evs • 3 domain objects • gov.nih.nci.caBIO.servlet • gov.nih.nci.caBIO.util • gov.nih.nci.caBIO.webservices

  9. Architecture Clients Presentation Layer Data Sources Object Layer Web Server Servlet Container Databases JSPs HTML/HTTP Data Access Objects Servlets Object Managers Browsers SOAP Engine JDBC XML/ HTTP RMI Other Apps HTTP UI Bean Domain Objects SOAP XML Builder URLs Chromosomes Genes XSLT Engine FTP Tissues Clusters Agents RDF Flat Files Libraries Sequences DTDs XML Docs Diseases XSL Style Sheet Java Apps Other

  10. Data Sources - NCI • Cancer Genome Anatomy Project • EST and SAGE library and expression data • Sequence trace files and verified clones • SNPs • Clinical Trials • Drug Agents • Trials • NCI Enterprise Vocabulary Services • Microarray data via NCI Gene Expression Data Portal

  11. Data Sources - External • Unigene (NCBI) • LocusLink (NCBI) • Homologene (NCBI) • Gene Ontology • Human Genome via DAS (UCSC) • BioCarta Pathways • GIFs re-processed into SVGs to support interactivity

  12. caBIO Powers Applications http://cmap.nci.nih.gov

  13. caBIO APIs • Java • NCICB-hosted data sources via RMI • SOAP • SOAP client in any language/environment can send request to NCICB server for object data • SOAP-XML envelope and payload returned • HTTP-URL • Properly formed URLs in any web browser/client can retrieve XML-formatted object data directly

  14. Java API Domain objects have companion SearchCriteria objects Gene myGene = new Gene(); GeneSearchCriteria criteria = new GeneSearchCriteria(); criteria.setSymbol("pTEN"); SearchResult result = myGene.search(criteria); Gene[] genes = (Gene[]) result.getResultSet();

  15. Nested SearchCriteria • SearchCriteria from one object type can be fed as parameters into SearchCriteria of another type. • Complex queries without any SQL

  16. Complex Queries on Objects Find me the Pathways, with Genes that are expressed, in tissues with a particular Histopathology that includes a particular Organ and a particular Disease.

  17. Traverse Relationships in Model INPUT Disease Histopathology Genes Organ Pathways OUTPUT

  18. findPathway Input disease, organ; create SearchCriteria Objects: public Pathway[] findPathway(String disease, String organ) { DiseaseSearchCriteria diseaseCriteria = new DiseaseSearchCriteria(); OrganSearchCriteria organCriteria = new OrganSearchCriteria(); HistopathologySearchCriteria histoCriteria = new HistopathologySearchCriteria(); GeneSearchCriteria geneCriteria = new GeneSearchCriteria(); PathwaySearchCriteria pathCriteria = new PathwaySearchCriteria();

  19. findPathway Nest the SearchCriteria, then do the search: . diseaseCriteria.setName(disease); organCriteria.setName(organ); histoCriteria.putSearchCriteria(diseaseCriteria,CriteriaElement.AND); histoCriteria.putSearchCriteria(organCriteria, CriteriaElement.AND); geneCriteria.putSearchCriteria(histoCriteria, CriteriaElement.AND); pathCriteria.putSearchCriteria(geneCriteria, CriteriaElement.AND); Pathway myPathway = new Pathway(); myPathwayList = myPathway.searchPathways(pathCriteria); }

  20. Web Services: SOAP

  21. SOAP API Perl Example use SOAP::Lite; $s = SOAP::Lite ->uri(urn:nci-gene-service) ->proxy("http:caBIO.nci.nih.gov:5080:/soap/servlet/rpcrouter"); my %searchCriteria=(); $searchCriteria{symbol}=“pTEN”; $som=$s->getGenes(SOAP::Data->type(map =>\%searchCriteria)); $xmldoc = $som->result;

  22. SOAP output with xlinks <?xml version="1.0" encoding="UTF-8" ?> <nci-core> - <gov.nih.nci.caBIO.bean.Gene id="2221" xmlns:xlink="http://www.w3.org/1999/xlink/"> <name>PTEN</name> <title>phosphatase and tensin homolog (mutated in multiple advanced cancers 1)</title> <dbCrossRefs>{LOCUS_LINK=5728, OMIM=601728, UNIGENE=10712}</dbCrossRefs> <Pathwayxlink:href= "http://lpgprot101.nci.nih.gov:5080/CORE/GetXML?operation=Pathway&GeneId=2221" /> [Additional xlinks for ExpressionExperiment, Organ, Chromosome, GeneHomolog, Sequence, Gene Alias, Protein, SNP, and MapLocation] </gov.nih.nci.caBIO.bean.Gene> [2 Additional Genes with “PTEN” in their name] - <searchResult> <hasMore>false</hasMore> <startsAt>1</startsAt><endsAt>3</endsAt> </searchResult> </nci-core>

  23. SOAP with returnHeavyXML Data is now returned in full. Here is a Pathway object snippet: <gov.nih.nci.caBIO.bean.Pathway id="92"> <name>ptenPathway</name> <displayValue>PTEN Dependent Cell Cycle Arrest and Apoptosis</displayValue> <pathwayDiagram>ptenPathway.svg</pathwayDiagram> </gov.nih.nci.caBIO.bean.Pathway>

  24. HTTP API Direct access to XML-formatted data via URLs: http://lpgprot101.nci.nih.gov:5080/CORE/GetXML ?operation=Gene&Symbol=pTEN Method Parameter Value Search Parameter

  25. Status and Future • caBIO/caCORE 1.0, late August 2002 • Updated caBIO.JAR and examples • Technical Guide • Stable data servers • Next Release • More domain objects, standard data sources (especially proteins) • Adaptor to add local data sources • More Web Services protocols (WSDL?) • Whatever you guys contribute!

  26. caBIO Team NCI Center for Bioinformatics NCI Laboratory of Population Genetics Science Applications International Corporation (SAIC)

  27. NCI Kenneth Buetow Carl Schaefer Robert Clifford Mike Edmonson SAIC Himanso Sahni Scott Gustafson Sharon Settnek Mike Connolly Brian Levine Acknowledgements

  28. Info, and mailing lists • caCORE http://ncicb.nci.nih.gov/core • caBIO users list http://list.nih.gov/archives/cabio_users.html • caBIO developers list http://list.nih.gov/archives/cabio_developers.html

More Related