
BridgeDb

BridgeDb. Martijn van Iersel, BiGCaT Maastricht. The 7 Virtues of Bioinformatics: solve a problem, start small, modularity, design for code re-use, open source, attention to detail, eat your own dog-food. Solve a problem: what problem are you solving? Problem: identifier mapping.



Presentation Transcript


  1. BridgeDb • Martijn van Iersel, BiGCaT, Maastricht

  2. The 7 Virtues of Bioinformatics • Solve a problem • Start small • Modularity • Design for code re-use • Open Source • Attention to detail • Eat your own dog-food

  3. Solve a problem • What problem are you solving?

  4. Problem: Identifier Mapping Entrez Gene 3643 ? Agilent reporter A46_P45789

  5. Solution: Conversion tools

  6. Problem: Usability • Check for double IDs • Check for missing IDs • Only 1000 at once • Check alignment of Excel columns • Manual • Error-prone
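The manual checks listed on this slide are easy to automate. A minimal Python sketch, assuming the input is a plain list of identifier strings and that the conversion tool accepts at most 1000 identifiers per request, as the slide states. The function names `check_ids` and `batches` are hypothetical and not part of any conversion tool.

```python
from collections import Counter


def check_ids(ids):
    """Return (unique_ids, duplicates, missing_positions) for a list of raw IDs."""
    # Treat None and whitespace-only cells as missing.
    cleaned = [(i.strip() if i else "") for i in ids]
    missing = [pos for pos, i in enumerate(cleaned) if not i]
    counts = Counter(i for i in cleaned if i)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    unique = list(counts)  # Counter keeps first-seen order
    return unique, duplicates, missing


def batches(ids, size=1000):
    """Yield successive chunks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

Reporting duplicates and missing positions, rather than silently dropping them, keeps the result aligned with the spreadsheet rows the identifiers came from.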

  7. Solution: Built-in Mapping • Generic bioinformatics platforms should have identifier mapping built in ("batteries included"): BioConductor, PathVisio, Cytoscape, ...

  8. Solution: Built-in Mapping • Entrez Gene 3643 <-> Mapping service <-> Agilent reporter A46_P45789

  9. Problem: Which mapping service? • Synergizer • EnsMart • DAVID • CRONOS • AliasServer • MatchMiner • OntoTranslate

  10. Solution: Abstraction Layer

  11. Solution: Abstraction Layer • interface IDMapper • class IDMapperRdb: relational database • class IDMapperFile: tab-delimited text • class IDMapperBiomart: web service
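The idea of one interface with interchangeable back ends can be sketched in a few lines. The Python below only illustrates the pattern; it is not BridgeDb's Java API. The class names, the `map_id` signature, and the identifiers `E1`, `A1`, `A2` are placeholders invented for this example.

```python
import csv
import io
from abc import ABC, abstractmethod


class IDMapper(ABC):
    """Common interface: callers never need to know which back end is used."""

    @abstractmethod
    def map_id(self, source, identifier, target):
        """Return the set of `target` identifiers equivalent to (source, identifier)."""


class FileIDMapper(IDMapper):
    """Back end for tab-delimited text whose header row names the data sources."""

    def __init__(self, handle):
        self.rows = list(csv.DictReader(handle, delimiter="\t"))

    def map_id(self, source, identifier, target):
        return {r[target] for r in self.rows
                if r[source] == identifier and r[target]}


class DictIDMapper(IDMapper):
    """In-memory stand-in for a database or web-service back end."""

    def __init__(self, table):
        self.table = table  # {(source, identifier, target): {identifiers}}

    def map_id(self, source, identifier, target):
        return set(self.table.get((source, identifier, target), set()))


# Placeholder data, not real identifier mappings.
table = "Entrez\tAgilent\nE1\tA1\nE1\tA2\nE2\t\n"
file_mapper = FileIDMapper(io.StringIO(table))
dict_mapper = DictIDMapper({("Entrez", "E1", "Agilent"): {"A1", "A2"}})
```

Both mappers answer `map_id("Entrez", "E1", "Agilent")` in the same way, so a tool written against `IDMapper` can switch back ends without code changes.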

  12. Tools: Cytoscape plugins (CyThesaurus, NetworkMerge), WikiPathways, PathVisio • BridgeDb • Mapping services: local database, tab-delimited text files, Internet web services (BioMart, PICR, BridgeDb-REST) • BMC Bioinformatics. 2010 Jan 4;11(1):5

  13. BridgeDb interface • 1: Java interface • 2: REST interface

  14. API Overview • BridgeDb.connect(...) • IDMapper.mapID(...) • Xref.getUrl() • DataSource.getUrl()

  15. Easy & Flexible Code

  16. Easy & Flexible Code

  17. Easy & Flexible Code

  18. BridgeDb interface • 1: Java interface • 2: REST interface

  19. REST API • Request: http://webservice.bridgedb.org/Human/xrefs/L/1234 • Response (identifier, data source):
      ILMN_1713029      Illumina
      3255967           Affy
      NP_001025186      RefSeq
      IPI00005930       IPI
      GO:0042752        GeneOntology
      NM_033282         RefSeq
      3255968           Affy
      94233             Entrez Gene
      ENSG00000122375   Ensembl Human
      234226_at         Affy
      A6NEB4            Uniprot/TrEMBL
      0001780601        Illumina
      GO:0008020        GeneOntology
      606665            OMIM
      A_23_P24234       Agilent
      14449             HUGO
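A response like the one on this slide is simple to consume. The sketch below assumes each line holds an identifier and a data-source name separated by a tab, which matches the pairs shown here. Check the exact format against the live service. `parse_xrefs` is a hypothetical helper name, and `sample` copies four pairs from the slide.

```python
from collections import defaultdict


def parse_xrefs(text):
    """Group cross-references by data source: {source: [identifiers]}."""
    by_source = defaultdict(list)
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        identifier, source = line.split("\t", 1)
        by_source[source.strip()].append(identifier.strip())
    return dict(by_source)


# Four of the pairs shown on the slide, joined with tabs (assumed format).
sample = (
    "ILMN_1713029\tIllumina\n"
    "94233\tEntrez Gene\n"
    "3255967\tAffy\n"
    "234226_at\tAffy\n"
)
xrefs = parse_xrefs(sample)
```

Grouping by data source makes it straightforward to pick out, for example, all Affy probes that correspond to the queried gene.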

  20. REST API • Pattern: http://<Base URL>/<Species>/<function>[/<argument> ...] • Examples:
      http://webservice.bridgedb.org/Human/xrefs/L/1234
      http://webservice.bridgedb.org/Human/search/ENSG00000122375
      http://webservice.bridgedb.org/Human/attributeSet
      http://webservice.bridgedb.org/Human/properties
      http://webservice.bridgedb.org/Human/targetDataSources
      http://webservice.bridgedb.org/Human/attributes/L/3643
      http://localhost:8183/Human/xrefs/L/3643
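The URL pattern on this slide can be captured in one small function. This is a sketch: `bridgedb_url` is a hypothetical helper and not part of any BridgeDb client library. It only assembles the URL and does not contact the service.

```python
from urllib.parse import quote


def bridgedb_url(base, species, function, *args):
    """Build <base>/<species>/<function>[/<argument> ...] with escaped path segments."""
    parts = [species, function, *args]
    return base.rstrip("/") + "/" + "/".join(quote(str(p), safe="") for p in parts)


remote = bridgedb_url("http://webservice.bridgedb.org", "Human", "xrefs", "L", "1234")
local = bridgedb_url("http://localhost:8183/", "Human", "xrefs", "L", "3643")
```

Keeping the base URL as a parameter means the same code works against the public web service and a local instance, as in the last example on the slide.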

  21. R Example

  22. Types of Mapping Services

  23. Available Mapping Services

  24. Problem: Custom Microarrays ? Custom probe #QXZCY!34

  25. Solution: Stacking • EnsMart • Custom table

  26. Entrez • Custom microarray • Ensembl • Relation defined by mapping source A • Relation defined by mapping source B • Inferred, transitive relationship
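Stacking amounts to composing two mappings: if source A links a custom probe to Ensembl and source B links Ensembl to Entrez, the probe-to-Entrez relation can be inferred. A minimal Python sketch, assuming each mapping source is a dictionary from identifier to a set of identifiers. Which pair of databases each source covers is an assumption made for this example, and all identifiers are placeholders.

```python
def stack(map_a, map_b):
    """Compose two mappings: follow map_a, then map_b, and collect what is reachable."""
    result = {}
    for key, intermediates in map_a.items():
        targets = set()
        for mid in intermediates:
            targets |= map_b.get(mid, set())
        if targets:  # keep only keys with at least one inferred target
            result[key] = targets
    return result


# Placeholder identifiers, for illustration only.
custom_to_ensembl = {"probe_1": {"ENS_A"}, "probe_2": {"ENS_B", "ENS_C"}, "probe_3": {"ENS_X"}}
ensembl_to_entrez = {"ENS_A": {"100"}, "ENS_B": {"200"}, "ENS_C": {"200", "300"}}
custom_to_entrez = stack(custom_to_ensembl, ensembl_to_entrez)
```

Here `probe_3` drops out because its Ensembl identifier has no entry in the second source. The inferred relation is only as complete as the weaker of the two sources.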

  27. Comparison

  28. Comparison

  29. CyThesaurus

  30. MIRIAM Resources http://www.ebi.ac.uk/miriam/

  31. Solution: MIRIAM Resources • Regular expression for autodetection • Pattern for generating URLs • Link to documentation
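The first two elements, a regular expression per data source and a URL pattern, combine naturally in a small registry. The Python sketch below is illustrative only. The regular expressions are simplified guesses and the example.org URL templates are placeholders, so neither is taken from MIRIAM Resources.

```python
import re

# Hypothetical registry entries: simplified patterns and placeholder URLs.
REGISTRY = {
    "Entrez Gene": {"pattern": r"\d+", "url": "https://example.org/gene/{id}"},
    "Ensembl": {"pattern": r"ENSG\d{11}", "url": "https://example.org/ensembl/{id}"},
}


def detect(identifier):
    """Return every data source whose pattern matches the whole identifier."""
    return [name for name, entry in REGISTRY.items()
            if re.fullmatch(entry["pattern"], identifier)]


def url_for(source, identifier):
    """Fill the data source's URL template with the identifier."""
    return REGISTRY[source]["url"].format(id=identifier)
```

`detect` returns a list because autodetection can be ambiguous: a purely numeric identifier matches any data source whose pattern is digits only, so the user may still have to choose.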

  32. The 7 Virtues of Bioinformatics • Solve a problem • Start small • Eat your own dog-food • Attention to detail • Modularity • Design for code re-use • Open Source

  33. A Question to Linus Torvalds • Q: “Do you have any tips for people who want to undertake a large open source project?” • A: “Nobody should start to undertake a large project. You start with a small trivial project, and you should never expect it to get large. … If it doesn't solve some fairly immediate need, it's almost certainly over-designed. … You need to get something half-way useful first, and then others will say "hey, that almost works for me", and they'll get involved in the project.”

  34. Also from Linus Torvalds “I'm right and anyone who disagrees is stupid and ugly” “My name is Linus Torvalds and I am your god.”

  35. Code Re-Use • “Bad bioinformatician! No Twinkie for you!” • Reinventing the wheel is one of the 7 Deadly Sins of Bioinformatics

  36. Code Re-Use

  37. Code Re-Use • Q: How to design re-usable code? • A: Actually use it in more than one project from the start (BridgeDb is used in both Cytoscape and PathVisio)

  38. Modularity

  39. Modularity

  40. Modularity

  41. Open source • Public money -> Public code • Reproducibility • Academic ideal • Trust • Insurance against vendor lock-in

  42. Open source • Now where are all those free programmers?

  43. Open Source • Web site • Bug tracker • Version control • Mailing list

  44. http://www.helixsoft.nl/blog

  45. Eat your own dog food

  46. Eat your own dog food • Are you named “alkfdjlkdsf”? • Why not “Hélène O’Brian?” • …or “Bobby Tables”?
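Names like these are good test data because they break code that builds SQL by pasting strings together. A minimal Python sketch using the standard-library `sqlite3` module with parameterized queries. The table name, the ASCII-apostrophe variant of the name, and the injection string (in the style of the "Bobby Tables" comic) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

names = [
    "Hélène O’Brian",                     # accents and a typographic apostrophe
    "Helen O'Brian",                      # ASCII apostrophe, the classic quote-breaking case
    "Robert'); DROP TABLE students;--",   # injection attempt
]

# Placeholders (?) pass values separately from the SQL text, so quotes are never interpreted.
conn.executemany("INSERT INTO students (name) VALUES (?)", [(n,) for n in names])

stored = [row[0] for row in conn.execute("SELECT name FROM students ORDER BY rowid")]
```

With placeholders, all three names are stored verbatim and the table survives. Testing with realistic names shows quickly whether the same holds for your own code.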

  47. Eat your own dog food • Real data has missing values • Real data has commas instead of dots • Real data has duplicate identifiers • Real data starts with “ID” in the first cell* *Which Excel doesn’t like
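Most of these quirks can be handled defensively when loading data. The sketch below assumes two-column rows of (identifier, value) strings with a header row first, and treats a small hypothetical set of markers as missing values. It does not handle thousands separators. The function names are invented for this example.

```python
MISSING = {"", "NA", "NaN", "-"}  # assumed markers for missing values


def parse_number(cell):
    """Parse a numeric cell, accepting a decimal comma; return None if missing."""
    text = (cell or "").strip()
    if text in MISSING:
        return None
    return float(text.replace(",", "."))


def clean(rows):
    """Return ({identifier: value}, [duplicate identifiers]); the first row is the header."""
    header, *body = rows
    values, duplicates = {}, []
    for ident, raw in body:
        ident = ident.strip()
        if ident in values:
            duplicates.append(ident)  # keep the first occurrence, report the rest
            continue
        values[ident] = parse_number(raw)
    return values, duplicates


rows = [("ID", "value"), ("3643", "1,5"), ("1234", ""), ("3643", "2.0")]
values, duplicates = clean(rows)
```

Returning the duplicates instead of silently overwriting lets the caller decide whether to average, keep the first value, or ask the user.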

  48. User friendliness

  49. User friendliness

  50. Hallway usability testing • Grab a passer-by from the hallway and put them in front of your program • (We usually use students)
