Ontology Overview

Rita

Presentation Transcript


    1. Ontology Overview Microarray Database Systems

    3. Ontology

    4. Not Ontology: a collection of facts that arise from an actual, specific situation; a model for an application domain (which would be a theory); a database schema, which defines categories and their data types; a tree: instance of, part of, etc.

    5. Why Ontology: consolidation of the understanding of a given area through public discussion and review; knowledge sharing; automatic uniform queries and assertions

    6. Gene Ontology Consortium (www.geneontology.org): produces a dynamic controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing. A joint project of FlyBase, Mouse Genome Informatics (MGI), and the Saccharomyces Genome Database (SGD).

    7. GO Background: growing sequence data; growing functional analysis of genes

    8. Why GO: for genome research; permits powerful analysis methods if multiple databases use the same ontology to describe data, e.g. querying multiple organisms using shared biology; helps discover new gene functions for sequences; unification of biology; automatic annotation transformation

    9. The Topic: Genes. Genes are expressed in temporally and spatially characteristic patterns. Gene product: a protein or RNA (e.g. ribosomal RNA). Gene product groups: entities that function as complexes (e.g. the ribosome).

    10. The Topic: Genes (Attributes). Gene: may have more than one product, each of which may have a distinct function. Gene product: gene products are often located in specific cellular compartments; may be part of multi-component complexes; may have one or more biochemical, physiological, or structural functions; may include small molecules

    11. The Topic: Genes (Q & A). Where is a gene expressed? What is the (sub)cellular localization of a gene product? When is a gene expressed? What is the function of a gene product? What larger process is the function of a gene product a part of? By what processes are a gene's activities controlled? What larger complex is this function a component of? What genes in species A have the function of gene X in species B? etc.

    12. GO Objective: to provide controlled vocabularies for the description of gene products: molecular function, biological process, cellular component. Notes: these are independent attributes; each has a many-to-many relationship to gene products; gene product groups may include small molecules, which are not represented in GO.

    13. Molecular Function: a capability that a physical gene product (or group) carries as a potential; what a gene product can do, not where or when; a gene product is often named by its function; gene products have a many-to-many relationship with molecular functions. Examples: enzyme, transporter, ligand, motor protein; adenylate cyclase, Toll receptor ligand

    14. Molecular Function (2)

    15. Biological Process: a biological objective accomplished via one or more ordered assemblies of molecular functions; temporal; transformational; more than one step. Examples: cell growth and maintenance, signal transduction; pyrimidine metabolism, cAMP biosynthesis

    16. Biological Process (2)

    17. Cellular Component: a component of a cell; part of some larger object. Examples: anatomical structure: nucleus; gene product group: ribosome, proteasome

    18. Cellular Component (2)

    19. What GO is NOT: not a way to unify biological databases (knowledge changes and updates lag behind; individual curators evaluate data differently; many aspects of biology are not included, such as domain structure, 3D structure, evolution, expression, etc.); not a dictated standard; not a database of gene sequences; not a catalog of gene products

    20. Data Representation. Terms: directed acyclic graphs (DAGs); text format. Attributes: unique identifier: GO:nnnnnnn; %: is-a relationship; <: part-of relationship; synonym; unique identifier; database cross-reference
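The identifier pattern and the two edge types on this slide can be sketched in a few lines of Python. This is only an illustration: the term names (term0, term1, term2) are placeholders, and the dictionary layout is an assumption, not the format GO itself uses.

```python
import re

# Per the slide, identifiers look like GO:nnnnnnn (the letters GO, a colon, seven digits).
GO_ID_PATTERN = re.compile(r"GO:\d{7}")


def is_valid_go_id(text):
    """Return True if text is exactly 'GO:' followed by seven digits."""
    return GO_ID_PATTERN.fullmatch(text) is not None


# In a DAG a term may have several parents, so each child maps to a list of
# (relationship, parent) edges. The term names are placeholders.
parent_edges = {
    "term1": [("is_a", "term0"), ("part_of", "term2")],
    "term2": [("is_a", "term0")],
}
```

A tree could store a single parent per term; the list of edges is what allows multiple parentage.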

    21. GO Terms. File: GO.defs. Syntax: tag: text or value. All tags are mandatory with the exception of the "comment" tag. Tags: term: the term (cardinality 1); goid: the GO ID of the term (cardinality 1); definition: the definition of the term (cardinality 1); comment: a free-text comment to help GO annotators (cardinality 0 or 1); definition_reference: a reference for the definition (cardinality 1 or more)

    22. GO Terms (2)
        term: 1,3-beta-glucanosyltransferase
        goid: GO:0042124
        definition: Catalysis of the splitting and linkage of 1,3-beta-glucan molecules, resulting in 1,3-beta-glucan chain elongation.
        definition_reference: GO:jl
        definition_reference: PMID:10809732
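A minimal Python sketch of reading one such record and checking the cardinalities listed on the previous slide. It assumes each tag sits on its own line (the transcript has flattened the layout); the function names are made up for this example.

```python
from collections import defaultdict

# The record from the slide, assuming one "tag: value" pair per line.
RECORD = """\
term: 1,3-beta-glucanosyltransferase
goid: GO:0042124
definition: Catalysis of the splitting and linkage of 1,3-beta-glucan molecules, resulting in 1,3-beta-glucan chain elongation.
definition_reference: GO:jl
definition_reference: PMID:10809732
"""


def parse_defs_record(text):
    """Parse 'tag: value' lines into a dict mapping each tag to a list of values."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, because values such as GO:0042124 contain colons.
        tag, _, value = line.partition(":")
        fields[tag.strip()].append(value.strip())
    return dict(fields)


def check_cardinality(fields):
    """Return a list of cardinality problems; an empty list means the record passes."""
    problems = []
    for tag in ("term", "goid", "definition"):
        if len(fields.get(tag, [])) != 1:
            problems.append(tag + ": expected exactly 1")
    if len(fields.get("comment", [])) > 1:
        problems.append("comment: expected at most 1")
    if len(fields.get("definition_reference", [])) < 1:
        problems.append("definition_reference: expected at least 1")
    return problems


record = parse_defs_record(RECORD)
```

Storing every tag as a list keeps repeated tags such as definition_reference without special cases.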

    23. GO Terms (GO ID): each defined term has a unique ID; if the wording but not the meaning of a term is changed, the GO ID stays the same; if the meaning is changed, a new ID is added

    24. GO Terms (DB Ref). Database cross-references have the form database:ID. Function ontology: EC (Enzyme Commission), e.g. EC:3.5.1.6; TC (Transport Catalog), e.g. TC:2.A.29.10.1; UM-BBD_enzymeID, e.g. UM-BBD_enzymeID:e0310. Process ontology: UM-BBD_pathwayID, e.g. UM-BBD_pathwayID:dcb; MetaCyc, e.g. MetaCyc:2ASDEG-PWY. Component ontology: none.

    25. Syntax
        Parent-child relationship (by indentation):
            parent_term
             child_term
        Instance relationship:
            %term0
             %term1 % term2
        term1 is an instance of term0 and also an instance of term2. Example file: process.ontology

    26. Syntax (2)
        Part-of relationship:
            %term0
             %term1 < term2 < term3
        term1 is an instance of term0 and also a part of term2 and term3.
        Line syntax: < | % term [; db cross ref]* [; synonym:text]* [ < | % term]*
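The line syntax can be read with a small parser. The Python sketch below follows only the grammar shown on the slide and assumes that term names never contain the characters %, < or ; (a real file may need escaping rules that the slide does not cover). The function name and the returned dictionary layout are made up for illustration.

```python
import re

RELATIONS = {"%": "is_a", "<": "part_of"}


def parse_ontology_line(line):
    """Parse one line of the form: < | % term [; xref]* [; synonym:text]* [< | % term]*"""
    # Leading spaces encode the depth of the term under its parent line.
    depth = len(line) - len(line.lstrip(" "))
    # Splitting on the three marker characters yields ['', marker, text, marker, text, ...].
    tokens = re.split(r"([%<;])", line.strip())
    if len(tokens) < 3 or tokens[0] != "" or tokens[1] not in RELATIONS:
        raise ValueError("line must start with % or <: " + repr(line))
    parsed = {
        "depth": depth,
        "relation": RELATIONS[tokens[1]],  # relationship to the parent line
        "term": tokens[2].strip(),
        "xrefs": [],
        "synonyms": [],
        "other_parents": [],
    }
    for marker, text in zip(tokens[3::2], tokens[4::2]):
        text = text.strip()
        if marker == ";":
            if text.startswith("synonym:"):
                parsed["synonyms"].append(text[len("synonym:"):])
            else:
                parsed["xrefs"].append(text)
        else:
            # Extra % or < entries name further parents of the same term.
            parsed["other_parents"].append((RELATIONS[marker], text))
    return parsed


example = parse_ontology_line(" %term1 < term2 < term3")
```

For the slide's example, `example["relation"]` is `"is_a"` (the link to term0 on the line above), and `example["other_parents"]` lists the two part-of parents, term2 and term3.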

    27. XML Syntax
        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE go:go>
        <go:go xmlns:go="http://www.geneontology.org/xml-dtd/go.dtd#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <go:version timestamp="Wed May 9 23:55:02 2001" />
          <rdf:RDF>
            <go:term rdf:about="http://www.geneontology.org/go#GO:0016209">
              <go:accession>GO:0016209</go:accession>
              <go:name>antioxidant</go:name>
              <go:definition></go:definition>
              <go:isa rdf:resource="http://www.geneontology.org/go#GO:0003674" />
              <go:association>
                <go:evidence evidence_code="ISS">
                  <go:dbxref>
                    <go:database_symbol>fb</go:database_symbol>
                    <go:reference>fbrf0105495</go:reference>
                  </go:dbxref>
                </go:evidence>
                <go:gene_product>
                  <go:name>CG7217</go:name>
                  <go:dbxref>
                    <go:database_symbol>fb</go:database_symbol>
                    <go:reference>FBgn0038570</go:reference>
                  </go:dbxref>
                </go:gene_product>
              </go:association>
            </go:term>
          </rdf:RDF>
        </go:go>

    28. XML Syntax (2): RDF, not plain XML; a semantic network; XML id and idref impose restrictions (e.g. no multiple parentage); in RDF, a unique URL serves as the ID
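A hedged Python sketch of reading terms from this kind of RDF/XML with the standard library. The sample is a shortened, well-formed version of the slide's fragment, with the XML declaration and DOCTYPE left out for brevity; the function name and the returned dictionary layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the slide's XML.
NS = {
    "go": "http://www.geneontology.org/xml-dtd/go.dtd#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
# ElementTree exposes namespaced attributes as "{uri}localname".
RDF_RESOURCE = "{" + NS["rdf"] + "}resource"

SAMPLE = """\
<go:go xmlns:go="http://www.geneontology.org/xml-dtd/go.dtd#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:RDF>
    <go:term rdf:about="http://www.geneontology.org/go#GO:0016209">
      <go:accession>GO:0016209</go:accession>
      <go:name>antioxidant</go:name>
      <go:isa rdf:resource="http://www.geneontology.org/go#GO:0003674" />
    </go:term>
  </rdf:RDF>
</go:go>
"""


def read_terms(xml_text):
    """Return one dict per go:term with its accession, name, and is-a parent IDs."""
    root = ET.fromstring(xml_text)
    terms = []
    for term in root.iterfind(".//go:term", NS):
        # Each go:isa points at its parent by URL; the GO ID follows the '#'.
        parents = [
            isa.get(RDF_RESOURCE).rsplit("#", 1)[-1]
            for isa in term.findall("go:isa", NS)
        ]
        terms.append({
            "accession": term.findtext("go:accession", namespaces=NS),
            "name": term.findtext("go:name", namespaces=NS),
            "parents": parents,
        })
    return terms


terms = read_terms(SAMPLE)
```

Because parents are referenced by URL rather than by XML id/idref, a term can carry any number of go:isa elements, which is how multiple parentage is expressed.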

    29. Data Representation Term definitions Evidence code GO Term Bibliography

    30. The 3 Ontologies: Biological Process (process.ontology); Molecular Function (function.ontology); Cellular Component (component.ontology)

    31. Annotation Collaborating databases annotate their gene products (or genes) with GO terms, providing references and indicating what kind of evidence is available to support the annotations.
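As an illustration, an annotation can be modelled as a record that ties a gene product to a GO term together with its evidence code and reference. The Python sketch below uses the values from the XML example (CG7217, GO:0016209, ISS, fbrf0105495); the second record is invented to show the many-to-many relationship, and the class and field names are assumptions rather than any official format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    gene_product: str   # name of the annotated gene product
    go_id: str          # GO term the product is annotated with
    evidence_code: str  # kind of evidence supporting the annotation
    reference: str      # source of the annotation, written as DB:identifier


def index_annotations(annotations):
    """Build lookups in both directions: product -> GO IDs and GO ID -> products."""
    by_product = defaultdict(set)
    by_term = defaultdict(set)
    for annotation in annotations:
        by_product[annotation.gene_product].add(annotation.go_id)
        by_term[annotation.go_id].add(annotation.gene_product)
    return by_product, by_term


annotations = [
    # Values taken from the XML example in the slides.
    Annotation("CG7217", "GO:0016209", "ISS", "fb:fbrf0105495"),
    # Hypothetical record, invented for illustration only.
    Annotation("hypothetical_product", "GO:0016209", "ISS", "hypothetical:ref1"),
]
by_product, by_term = index_annotations(annotations)
```

Keeping the evidence code and reference on every record means each annotation stays traceable to the source that supports it.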

    32. Annotation (2)

    33. Annotation (3). Indices of other classification systems to GO: SWISS-PROT: spkw2go; Enzyme Commission: ec2go; EGAD: egad2go; GenProtEC: genprotec2go; TIGR role: tigr2go; InterPro: interpro2go; MIPS Funcat: mips2go

    34. Database Abbreviations. The annotation file syntax calls for an identifier from a foreign database to be prefixed by the abbreviation of that database. Syntax: DB:identifier (e.g. EC:1.8.1.4). Legal database abbreviations are listed in GO.xrf_abbs, for example:
        abbreviation: ENSEMBL
        database: Database of automatically annotated genomic data.
        object: Identifier.
        example: ENSEMBL:ENSP00000265949
        generic_url: http://www.ensembl.org/
        url_syntax:
        example_url:http://www.ensembl.org/perl/protview?peptide=ENS
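A short Python sketch of splitting a DB:identifier string and checking its prefix. The set of abbreviations below contains only those that appear in these slides; the authoritative list is the GO.xrf_abbs file, which this sketch does not read. The function names are made up for illustration.

```python
# Abbreviations mentioned in the slides; GO.xrf_abbs holds the real list.
KNOWN_ABBREVIATIONS = {
    "EC", "TC", "UM-BBD_enzymeID", "UM-BBD_pathwayID", "MetaCyc", "ENSEMBL",
}


def split_xref(xref):
    """Split 'DB:identifier' at the first colon and return (db, identifier)."""
    db, separator, identifier = xref.partition(":")
    if not separator or not db or not identifier:
        raise ValueError("not in DB:identifier form: " + repr(xref))
    return db, identifier


def has_known_database(xref):
    """Return True if the prefix of the cross-reference is a known abbreviation."""
    db, _ = split_xref(xref)
    return db in KNOWN_ABBREVIATIONS
```

Splitting at the first colon only matters for identifiers that themselves contain colons; the part after the prefix is returned untouched.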

    35. True Path Rule: the pathway from a child term all the way up to its top-level parent(s) must always be true.

    36. True Path Rule (2) chitin (???) metabolism chitin biosynthesis chitin catabolism (????) cuticle chitin metabolism cuticle chitin biosynthesis cuticle chitin catabolism cell wall chitin metabolism cell wall chitin biosynthesis cell wall chitin catabolism
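One way to make the rule concrete is to collect every ancestor of a term along every path: under the true path rule, a gene product annotated to the term must also fit each of those ancestors. The Python sketch below does this for a few of the chitin terms. The parent links are an assumption, because the transcript has lost the slide's indentation.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Assumed hierarchy (not recoverable from the flattened slide): a cuticle-specific
# term sits under both the cuticle branch and the general biosynthesis term.
parents = {
    "chitin biosynthesis": ["chitin metabolism"],
    "cuticle chitin metabolism": ["chitin metabolism"],
    "cuticle chitin biosynthesis": ["cuticle chitin metabolism", "chitin biosynthesis"],
}

implied = ancestors("cuticle chitin biosynthesis", parents)
```

If any term in `implied` would be false for a gene product annotated to "cuticle chitin biosynthesis", the hierarchy violates the rule and needs restructuring.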

    37. True Path Rule (3)

    38. Parent-child Relationships. A child is either an instance of or a part of its parent. Example: casein kinase II has two children: casein kinase II regulator (part of) and casein kinase II catalyst (part of)

    39. Logical Relationships: if A part_of B and C instance_of B, then A part_of C; if A instance_of B and B instance_of C, then A instance_of C; if A part_of B and B part_of C, then A part_of C; if A instance_of B and C part_of A, then not necessarily C part_of B
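The two chain rules (instance_of followed by instance_of, and part_of followed by part_of) amount to a transitive closure computed separately for each relationship. The Python sketch below covers only those two rules and deliberately leaves out the mixed rule; the function name and the pair-based representation are assumptions for this example.

```python
def transitive_closure(edges):
    """Given (child, parent) pairs of one relationship, add every implied pair."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a -> b and b -> d implies a -> d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


instance_of = transitive_closure({("A", "B"), ("B", "C")})
part_of = transitive_closure({("X", "Y"), ("Y", "Z")})

# The last rule on the slide: A instance_of B and C part_of A do not imply
# C part_of B, because the two relationships are closed separately.
mixed_part_of = transitive_closure({("C", "A")})
```

Here `instance_of` gains the pair ("A", "C") and `part_of` gains ("X", "Z"), while `mixed_part_of` stays as given, so ("C", "B") is not inferred.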

    40. Other Guidelines: avoid species-specific definitions; a term should be no more specific than any of its children; no mutant processes; avoid cellular components and gene products in the molecular function ontology; components that reside in multiple locations, e.g. not all chromosomes are inside the nucleus

    41. Other Guidelines (2)

    42. Other Guidelines (3)

    43. Application
