
CASIMIR WP4 Data Representation



Presentation Transcript


  1. CASIMIR WP4 Data Representation
     John Hancock, Duncan Davidson
     CASIMIR Networking Meeting, Heathrow, July 2007

  2. Objectives
     • Assessment of technical aspects of database interoperability as a barrier to scientific and financial sustainability
     • Assessment of the variability of practice in the semantics of biological data representation, e.g. genotype, gene expression
     • Assessment of emerging standards and current practice for data representation, annotation and ontologies

  3. • 4.1 - D9 - Classified list of data representations in European mouse-centric and related databases
     • 4.4 - Network meeting 1 - June-Sep 07 - Bring together bioinformatics reps from (EU-funded) mouse projects to discuss data representation
     • 4.4 - Joint work package meeting to discuss results (4-5 Oct 07)
     • 4.5 - Sep-Dec 07 - Report of network meeting
     • 4.6 - Present conclusions at meetings

  4. Discussion Points
     • What do we understand by “data representation” - is it just CVs/Ontologies?
     • Interaction with other work packages
     • What kinds of data?
     • What ontologies? How many on the PRIME list do you use? Do you use others? Do you use OBO ontologies by default?
     • What processes are they involved in elsewhere to discuss/unify data representation?

  5. Future: Cross-Species Interactions
     • Mouse-Human must be a priority because of the disease angle
     • Mouse-Rat - already quite well integrated (?To what extent?) because of MGI-RGD-OBO interactions
     • Other important models
       • Chick (ChickEST (UK), ChickVD (CN), Ensembl, others?)
       • Xenopus
       • Zebrafish
       • Drosophila
       • C. elegans
       • Yeast, E. coli
     • In longer term get together with community reps to discuss similarities & differences

  6. Extant Resources
     • PRIME Expert Group Report and Outcomes
     • Euromouse
     • Interphenome discussion group & pilots
     • EUMORPHIA/EUMODIC bioinformaticians

  7. PRIME Expert Group
     • Draft lists of:
       • Databases
       • Ontologies

  8. Interphenome
     • Phenotype data:
       • Common data description
       • Common protocol description
       • Standard for data exchange

  9. Interphenome - Current Status
     • Ontologies
       • Investigate cross-mapping of current approaches and eventual possible convergence (?)
     • Protocols
       • Work on developing a format that can accommodate all information needed for a protocol
       • Encode this as an XML schema
       • PPML?
     • Data Exchange
       • Work on an XML schema that will allow structured exchange of phenotype data and metadata - started work on this in EUMODIC
     Publication in Mammalian Genome 18, 157-163 (March 2007): “Integration of Mouse Phenome Data Resources”, by The Mouse Phenotype Database Integration Consortium

  10. WP4 - 1st Actions
     • Update the PRIME list of European mouse projects
     • Also identify “mouse-related” projects
     • Identify contacts
     • To hold a meaningful dialogue, get as many as possible to a networking meeting

  11. Ontologies - So Far
     • We have a little list
     • Test how many of these are actually in use - Questionnaire
     • Check how up to date it is, and track developments (e.g. Relationships Ontology, potential Synapse Ontology)

  12. The CASIMIR Questionnaire
     • http://www.casimir.org.uk/questionnaire.php
     • 1a. Are you using a relational database, object database or flat files?
     • 1b. If relational, what is your chosen RDBMS (Relational Database Management System)?
     • 2a. Is your database providing external links to other on-line resources; possibly via URL/HTTP (if yes please name them)?
     • 2b. Supported/Installed Web Services (if yes please name them)? Do you plan to install or develop web services in the near future?

  13. The CASIMIR Questionnaire
     • 3a. Please list the sorts of data entities you store (e.g. protein sequence data, mouse strain information etc.)
     • 4a. Can you provide a brief explanatory description/schema of your data/data structure?
     • 4b. Are you willing to provide an entity-relationship diagram, and would you be willing to provide it under an open source license?

  14. The CASIMIR Questionnaire
     • 5a. Are you currently using or do you intend to use any ontologies or controlled vocabularies to describe your data?
     • 5b. Do you plan to expand your use of ontologies in future?
     • 5c. Do you use OBO ontologies?
     • 5d. Do you perceive the need for additional ontologies to serve your domain of knowledge?

  15. The CASIMIR Questionnaire
     • 6. Do you make use of Minimum Information standards (such as MIAME for microarray experiments) to describe any data? If so, which ones? If you do not make use of these standards, are you likely to do so in future?

  16. Minimum Standards
     • MIAME - Brazma et al. (2001) Nat. Genet. 29, 365-71

  17. The CASIMIR Questionnaire
     • 7. What do you perceive as the main limiting factor in data representation/interoperability etc. in European bioinformatics databases?
     • 8. Do you have any comments/thoughts on standards for data representation that need to be developed or that you might like discussed in CASIMIR?
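
     One of the stated uses of the questionnaire is to test how many of the listed ontologies are actually in use. A minimal sketch of how returned answers might be recorded and tallied is given below. Everything in it is an assumption made for illustration: the class name, the field names, the `tally_ontologies` helper and the sample values are hypothetical and are not part of the CASIMIR questionnaire or any CASIMIR software; only the question numbers in the comments come from the slides.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QuestionnaireResponse:
    """One database's answers. Field names are hypothetical; the comments
    map each field to the question number used on the slides."""
    database: str
    storage: str                                            # 1a: "relational", "object" or "flat files"
    rdbms: Optional[str] = None                             # 1b: only meaningful if storage is relational
    ontologies: List[str] = field(default_factory=list)     # 5a: ontologies / controlled vocabularies
    uses_obo: bool = False                                  # 5c
    mi_standards: List[str] = field(default_factory=list)   # 6: Minimum Information standards


def tally_ontologies(responses: List[QuestionnaireResponse]) -> Counter:
    """Count how many responding databases report using each ontology.

    Each database is counted at most once per ontology, even if the
    ontology was entered twice in its answer.
    """
    counts: Counter = Counter()
    for response in responses:
        counts.update(set(response.ontologies))
    return counts


# Invented placeholder answers, for illustration only.
responses = [
    QuestionnaireResponse("db-one", "relational", rdbms="some-rdbms",
                          ontologies=["OntologyA", "OntologyB"], uses_obo=True),
    QuestionnaireResponse("db-two", "flat files",
                          ontologies=["OntologyA", "OntologyA"]),
]

usage = tally_ontologies(responses)
print(usage.most_common())  # [('OntologyA', 2), ('OntologyB', 1)]
```

     Counting per database rather than per mention is a design choice in this sketch; it matches the question "how many of these are actually in use" more closely than a raw count of mentions would.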

  18. The CASIMIR Questionnaire
     Please fill it in as soon as humanly possible!
     We will be chasing around database coordinators over the next few months to make sure we have as much information as possible

  19. Agenda for Today
     • Reports from some databases:
       • MUGEN - Christina Chandras
       • EMMA - Glenn Proctor
       • EUMODIC - Niels Adams
       • EUCLIS - Eduardo Mendoza
     • Discussion, e.g.
       • Comments on the questionnaire/CASIMIR’s aims
       • How to get widest possible participation
       • What do people see as the main obstacles to the aim of integrating all this data?

  20. Mouse to Human
     [Diagram. Labels: Human, DISEASE, Phenotypic Attributes; Mouse, PHENOTYPING, Phenotypic Attributes, Phenotypic Measures]
