
MIAME, ArrayExpress and the data submission tool MIAMExpress

MIAME, ArrayExpress and the data submission tool MIAMExpress. Helen Parkinson, Microarray Informatics Team, European Bioinformatics Institute. Bio-ontologies workshop, 5 December 2001. Talk structure: MIAME; ontologies in a database context; data submission tool MIAMExpress.


Presentation Transcript


  1. MIAME, ArrayExpress and the data submission tool MIAMExpress • Helen Parkinson, Microarray Informatics Team, European Bioinformatics Institute • Bio-ontologies workshop, 5 December 2001

  2. Talk Structure • MIAME • Ontologies in a database context • Data submission tool: MIAMExpress

  3. Standards in a database context • Data input, avoiding free text • Data curation, ontology building • Data query (web interface) • Data exchange (via MAGE-ML) • Linking to external databases for sequence, samples, and cluster annotations

  4. General MIAME principles • Recorded info should be sufficient to interpret and replicate the experiment • Information should be structured so that querying and automated data analysis and mining are feasible

  5. MIAME – Minimum Information About a Microarray Experiment (www.mged.org) • 6 parts of a microarray experiment: Experiment, Sample, Hybridisation, Array, Data, Normalisation • External links: Source (e.g., Taxonomy), Gene (e.g., EMBL), Publication

  6. Use case scenarios • Return a summary of all experiments that use a specified type of biosource (primary source). • Group the experiments according to treatment. • Return a summary of all experiments examining the effects of a specified treatment. • Group the experiments according to biosource. • Return a summary of all experiments measuring the expression of a specified gene. • Indicate when experiments confirm results, provide new information, or conflict.
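
The use cases on this slide amount to "filter on one annotation field, then group by another". A minimal Python sketch of that idea follows. The records, field names and gene names are invented for illustration and do not come from ArrayExpress; the point is only that such queries become simple once annotations are structured values rather than free text.

```python
from collections import defaultdict

# Hypothetical, hand-made records; the field names are assumptions for this sketch.
EXPERIMENTS = [
    {"id": "E1", "biosource": "thymus", "treatment": "dexamethasone", "genes": {"geneA", "geneB"}},
    {"id": "E2", "biosource": "thymus", "treatment": "heat shock", "genes": {"geneA"}},
    {"id": "E3", "biosource": "liver", "treatment": "dexamethasone", "genes": {"geneC"}},
]


def summarise(experiments, filter_field, filter_value, group_field):
    """Keep experiments whose filter_field equals filter_value,
    then group their ids by the value of group_field."""
    groups = defaultdict(list)
    for exp in experiments:
        if exp[filter_field] == filter_value:
            groups[exp[group_field]].append(exp["id"])
    return dict(groups)


def experiments_measuring(experiments, gene):
    """Return the ids of experiments that measured the given gene."""
    return [exp["id"] for exp in experiments if gene in exp["genes"]]


# Experiments on one biosource, grouped by treatment.
by_treatment = summarise(EXPERIMENTS, "biosource", "thymus", "treatment")
# Experiments using one treatment, grouped by biosource.
by_biosource = summarise(EXPERIMENTS, "treatment", "dexamethasone", "biosource")
```

Exact string matching only works here because every record uses the same term for the same thing, which is the case the later slides make for controlled vocabularies.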

  7. Why do we need an ontology for the database? • To perform structured queries • To ensure data is described accurately and consistently • To avoid problems with free text searching • To avoid excessive curation workload in future
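
One way to picture "accurate and consistent" description is a check of submitted values against a controlled vocabulary. The sketch below is an assumption-laden illustration, not the MIAMExpress implementation: the vocabulary is a hand-written dictionary (the treatment types are the examples listed in the MIAME sample section; the values for sex other than "female" are made up), whereas a real system would draw its terms from curated ontologies.

```python
# Hypothetical controlled vocabularies, hard-coded for the sketch.
CONTROLLED_VOCABULARY = {
    "sex": {"female", "male", "unknown"},
    "treatment_type": {"small molecule", "heat shock", "cold shock", "food deprivation"},
}


def check_annotation(annotation):
    """Return (field, value) pairs whose value is not an allowed term.

    Fields without a controlled vocabulary are not checked.
    """
    problems = []
    for field, value in annotation.items():
        allowed = CONTROLLED_VOCABULARY.get(field)
        if allowed is not None and value not in allowed:
            problems.append((field, value))
    return problems


free_text = {"sex": "F", "treatment_type": "heat-shocked for a while"}
controlled = {"sex": "female", "treatment_type": "heat shock"}
```

Here `check_annotation(controlled)` reports nothing, while the free-text submission is flagged on both fields, which is the kind of problem a curator would otherwise have to fix by hand.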

  8. MIAME Section on Sample Source and Treatment
     • organism (NCBI taxonomy)
     • cell source - provider
     • cell type (if derived from primary source(s))
     • sex
     • age
     • growth conditions
     • development stage
     • organism part (tissue)
     • animal/plant strain or line
     • genetic variation (e.g., gene knockout, transgenic variation)
     • individual
     • individual genetic characteristics (e.g., disease alleles, polymorphisms)
     • disease state or normal
     • target cell type
     • cell line and source (if applicable)
     • in vivo treatments (organism or individual treatments)
     • in vitro treatments (cell culture conditions)
     • treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
     • compound
     • is additional clinical information available (link)
     • separation technique (e.g., none, trimming, microdissection, FACS)
     • laboratory protocol for sample treatment……

  9. What sort of annotation do we see? • Free text (free text is bad), complex sentence construction • No references, no definitions, synonyms • Incomplete annotation, e.g. “control” • Inconsistent use of terms, e.g. experiment, probe, target…… • Publication references to websites with supplementary PDFs

  10. Excerpts from a (good) Sample Description, courtesy of M. Hoffman, S. Schmidtke, Lion BioSciences
     • Organism: Mus musculus [NCBI taxonomy browser]
     • Cell source: in-house bred mice (contact: person@somewhere.ac.uk)
     • Sex: female [MGED]
     • Age: 3 - 4 weeks after birth [MGED]
     • Growth conditions: normal
        • controlled environment
        • 20 - 22 °C average temperature
        • housed in cages according to EU legislation
        • specified pathogen free conditions (SPF)
        • 14 hours light cycle
        • 10 hours dark cycle
     • [Developmental stage]: stage 28 (juvenile (young) mice) [GXD "Mouse Anatomical Dictionary"]
     • Organism part: thymus [GXD "Mouse Anatomical Dictionary"]
     • Strain or line: C57BL/6 [International Committee on Standardized Genetic Nomenclature for Mice]
     • Genetic Variation: Inbr (J) 150. Origin: substrains 6 and 10 were separated prior to 1937. This substrain is now probably the most widely used of all inbred strains. Substrain 6 and 10 differ at the H9, Igh2 and Lv loci. Maint. by J,N, Ola. [International Committee on Standardized Genetic Nomenclature for Mice]
     • Treatment: in vivo [MGED] [intraperitoneal] injection of [dexamethasone] into mice, 10 microgram per 25 g bodyweight of the mouse
     • Compound: drug [MGED] synthetic [glucocorticoid] [dexamethasone], dissolved in PBS
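
What makes this description "good" is that most values carry a bracketed reference to the vocabulary or authority they come from. A small Python sketch of that pattern is given below, using a few values from the slide. The `Term` class and the `unsourced` helper are hypothetical names introduced for the example; they are not part of MIAME, MAGE-OM or MIAMExpress.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """An annotation value plus the vocabulary or authority it was taken from."""
    value: str
    source: Optional[str] = None  # None means no reference was given


# A few fields from the sample description above.
sample = {
    "organism": Term("Mus musculus", "NCBI taxonomy"),
    "sex": Term("female", "MGED"),
    "organism part": Term("thymus", 'GXD "Mouse Anatomical Dictionary"'),
    "strain or line": Term(
        "C57BL/6",
        "International Committee on Standardized Genetic Nomenclature for Mice",
    ),
    "growth conditions": Term("normal"),  # no source is cited on the slide
}


def unsourced(description):
    """List the fields whose value has no reference to an external vocabulary."""
    return [name for name, term in description.items() if term.source is None]
```

Calling `unsourced(sample)` picks out "growth conditions", the one field in this subset that is still plain text and would need either a vocabulary or a curator's attention.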

  11. [Diagram] Submitter LIMS • User Login • Large Scale Submissions (MAGE-ML format) • Browse Arrays • Browse Protocols • Array Submission • Curation Database • Protocol Sub. • Experiment submission • MIAMExpress • ArrayExpress Database (MAGE-OM Model) • Browse Arrays • Browse Protocols • Query Interface for Public Data • External Applications • External Databases, EMBL, Ontology Resources… etc • Data File Export • Analysis Tools • Expression Profiler

  12. [Diagram] External Ontologies • MGED/ArrayExpress Ontology • Production • Curation Tool/Browser • Public Browser • MAGE-ML • Data checking ontologies • LIMS • MIAMExpress • LIMS

  13. Introduction to MIAMExpress, a tool for data submission • The submission tool is a simpler implementation of the ArrayExpress model in MySQL • Faster, easier to update, cheap • Short-term solution to the problem of data submission in a non-XML format • Must be granular enough to be useful • And not be too time-consuming to complete a submission

  14. MIAMExpress • Based on MIAME concepts and questionnaire • Experiment, Array, Protocol submissions • Controlled vocabularies (CV) wherever possible • Future versions: organism-specific pages and related linked ontologies • Allow user-driven ontology development • Will be developed according to user needs • Will also need to be an update tool

  15. [Diagram: submission workflow] Create account • Login • Pending/New Experiment (E1, E2, … En) • Sample1 … Samplen • Sample protocol • Extracts 1…n • Extraction protocol • Hyb protocol • Hybridisations • Array1 … Arrayn • Scanning protocol • Data1 … Datan • Image analysis protocol • Transformation protocol • Combined Experiment Data • Submit • Final free text comment
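
The workflow in this diagram suggests that a submission is only ready once each hybridisation is tied to an array and a data file. The Python sketch below illustrates such a completeness check. The class names, fields and messages are assumptions made for the example; the slides do not describe how MIAMExpress validates a submission.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hybridisation:
    extract: str
    array: Optional[str] = None      # e.g. "Array1"
    data_file: Optional[str] = None  # e.g. "Data1"


@dataclass
class Experiment:
    name: str
    hybridisations: List[Hybridisation] = field(default_factory=list)
    comment: str = ""                # the final free text comment


def missing_items(experiment):
    """Return human-readable reasons why the experiment cannot be submitted yet."""
    problems = []
    if not experiment.hybridisations:
        problems.append("no hybridisations")
    for number, hyb in enumerate(experiment.hybridisations, start=1):
        if hyb.array is None:
            problems.append(f"hybridisation {number}: no array")
        if hyb.data_file is None:
            problems.append(f"hybridisation {number}: no data")
    return problems


pending = Experiment(
    "E1",
    [
        Hybridisation("Extract 1", "Array1", "Data1"),
        Hybridisation("Extract 2", "Array2"),  # data not uploaded yet
    ],
)
```

An empty list from `missing_items` would mean the pending experiment can move on to the submit step; for `pending` above it reports the missing data file for the second hybridisation.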

  16. Design Considerations • Speed and ease of use, scalability • Need to browse existing protocols and array designs in ArrayExpress • Requirement for curator control over submissions • Submissions tracking • Future use as a LIMS • Flexibility

  17. Problems with tool design • Granularity • Including ontology information in a usable format • Length of submission time • Getting lost within the pages • Users don’t start to submit till they have a proof • Conforming to MAGE-OM

  18. Features of MIAMExpress • Creates a user login account instead of on-the-fly submissions so sessions can be saved • Allows existing protocols to be copied, saved and linked to more than one hyb/expt • Forms the basis of a LIMS using the ArrayExpress model • Will be available as a standalone tool for local installation • Is open source and free • Will be supported by curation staff and developers
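
Linking one protocol to more than one hybridisation is a many-to-many relation, which a relational database usually models with a link table. The sketch below shows the idea with Python's built-in `sqlite3` module as a stand-in for MySQL. The table and column names are invented for the example and should not be read as the actual MIAMExpress or ArrayExpress schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL instance mentioned in the talk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE protocol (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE hybridisation (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    -- Link table: one row per (hybridisation, protocol) pair.
    CREATE TABLE hybridisation_protocol (
        hybridisation_id INTEGER NOT NULL REFERENCES hybridisation(id),
        protocol_id INTEGER NOT NULL REFERENCES protocol(id),
        PRIMARY KEY (hybridisation_id, protocol_id)
    );
    """
)

# One protocol, stored once, reused by two hybridisations.
conn.execute("INSERT INTO protocol (id, name) VALUES (1, 'Hyb protocol')")
conn.executemany(
    "INSERT INTO hybridisation (id, label) VALUES (?, ?)",
    [(1, "Hyb 1"), (2, "Hyb 2")],
)
conn.executemany(
    "INSERT INTO hybridisation_protocol (hybridisation_id, protocol_id) VALUES (?, ?)",
    [(1, 1), (2, 1)],
)


def hybridisations_using(connection, protocol_id):
    """Return the labels of all hybridisations linked to the given protocol."""
    rows = connection.execute(
        "SELECT h.label FROM hybridisation AS h "
        "JOIN hybridisation_protocol AS hp ON hp.hybridisation_id = h.id "
        "WHERE hp.protocol_id = ? ORDER BY h.id",
        (protocol_id,),
    ).fetchall()
    return [label for (label,) in rows]
```

Because the protocol text lives in a single row, a correction made by the submitter or a curator is visible to every hybridisation that references it.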

  19. Expected Users • Users with limited local bioinformatics support • Users of bought-in arrays without a LIMS • Small-scale users with self-made arrays who will need to provide a description • Array submissions are expected from manufacturers (MAGE-ML format)

  20. MIAMExpress v2.0: KeyLargoExpress? • Dynamic • Species-specific • Browsable ontologies including MGED • QVS removed • Less free text, more controlled vocabularies • Pretty up the front end • Curation staff interface

  21. Acknowledgments • Microarray Informatics Team • Industry Support team, EBI • MGED • Chris Stoeckert, U. Penn. • Ontology builders everywhere • Liz Ford

  22. Demo Version of MIAMExpress • Coming soon to www.ebi.ac.uk/microarray • Beta tester recruitment
