ChEBI: an EBI chemistry reference

darius

Presentation Transcript


  1. ChEBI: an EBI chemistry reference

  2. Overview • Introduction to ChEBI • Searching and browsing • Understanding the ontology • Downloads and programmatic access ChEBI – Chemical Entities of Biological Interest

  3. Introduction to ChEBI Block 1

  4. Small Molecules within Bioinformatics Genomes Literature Expressions Nucleotide sequences Protein sequences Protein domains, families Enzymes 3D structures Small molecules Pathways Systems

  5. Small Molecules within Bioinformatics Genomes Literature Expressions Nucleotide sequences Protein sequences Protein domains, families Enzymes 3D structures Small molecules (label repeated six times across the diagram) Pathways Systems

  6. Small molecules participate in all the processes of life

  7. γ-aminobutyric acid Signaling • GABA: chief inhibitory neurotransmitter in the mammalian central nervous system. • In humans, also regulates muscle tone. • synthesized by neurons • found mostly as a zwitterion, that is, with the carboxyl group deprotonated and the amino group protonated • conformational flexibility of GABA is important for its biological function, as it has been found to bind to different receptors with different conformations • GABA deficiency linked to • anxiety disorder, depression, alcoholism • multiple sclerosis, action tremors, tardive dyskinesia

  8. Adenosine 5'-triphosphate Metabolism Adenosine 5’-triphosphate (ATP): the "molecular unit of currency" of intracellular energy transfer. • generated in the cell by energy-consuming processes, broken down by energy-releasing processes • proteins that bind ATP do so in a characteristic protein fold known as the Rossmann fold, which is a general nucleotide-binding structural domain that can also bind the cofactor NAD

  9. Enzymes • Enzyme inhibitors are molecules that bind to enzymes and decrease their activity. • Many drugs are enzyme inhibitors. They are also used as herbicides and pesticides. • Enzyme activators bind to enzymes and increase their enzymatic activity. • Enzyme activators are often involved in the allosteric regulation of enzymes in the control of metabolism. • Example: clavulanic acid acts as a suicide inhibitor of bacterial β-lactamase enzymes

  10. Pathways http://www.genome.jp/kegg-bin/highlight_pathway?scale=1.0&map=map00231&keyword=tryptophan

  11. Systems biology • BioModels: quantitative models of biochemical and cellular systems • Tryptophan: D-enantiomer: sweet; L-enantiomer: bitter

  12. Drug design • Ligand-based: relies on knowledge of other molecules that bind to the biological target of interest. • Structure-based: relies on knowledge of the 3D structure of the biological target. • A lead has • evidence that modulation of the target will have therapeutic value: e.g. disease linkage studies showing associations between mutations in the biological target and certain disease states. • evidence that the target is druggable, i.e. capable of binding to a small molecule and that its activity can be modulated by the small molecule. • Target is cloned and expressed, then libraries of potential drug compounds are screened using screening assays

  13. Drug types 2003 - 2009 'Small molecules' in various shades of blue (http://chembl.blogspot.com/)

  14. Small molecule annotations • Often appear as free text in biological databases, in which they are not the core data • Are frequently referred to by common names which may be chemically ambiguous, e.g. adrenaline = (S)-adrenaline? (R)-adrenaline? • May be referred to by several different names, e.g. paracetamol, acetaminophen, 4-acetamidophenol, N-(4-hydroxyphenyl)acetamide, …

  15. Getting the chemistry right • Thalidomide: a non-barbiturate hypnotic • Thalidomide displays immunosuppressive and anti-angiogenic activity. It inhibits release of tumor necrosis factor-alpha from monocytes, and modulates other cytokine action. • Thalidomide is racemic: it contains both left- and right-handed isomers in equal amounts. One enantiomer is effective against morning sickness, and the other is teratogenic. • Enantiomers are interconverted in vivo. That is, if a human is given D-thalidomide or L-thalidomide, both isomers can be found in the serum. Hence, administering only one enantiomer does not prevent the teratogenic effect in humans. http://www.drugbank.ca/drugs/DB01041

  16. Small molecule data sources • http://pubchem.ncbi.nlm.nih.gov/ : deposition-driven publicly available compound repository, containing more than 25 million unique structures. • http://www.chemspider.com/ : automatic aggregation of publicly available chemistry data with crowdsourced annotation. • http://www.ebi.ac.uk/chembldb/ : small molecules and bioactivity. • http://www.ebi.ac.uk/chebi/ : manually annotated database and ontology.

  17. Chemicals - ChEBI • Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine • Ontology: metabolite; CNS stimulant; trimethylxanthines • Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19 • Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528 • Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES CN1C(=O)N(C)c2ncn(C)c2C1=O • Visualisation
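
The formula and mass shown for caffeine on this slide are linked by simple arithmetic. The following is a minimal sketch, not ChEBI's own method: the helper names `parse_formula` and `average_mass` and the small `ATOMIC_MASS` table (rounded standard atomic weights) are assumptions made for illustration, and the parser only handles flat formulae without brackets or charges.

```python
import re

# Rounded standard atomic weights; an illustrative subset, not a full periodic table.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C8H10N4O2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def average_mass(formula):
    """Sum the average atomic masses of all atoms in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())

caffeine_mass = round(average_mass("C8H10N4O2"), 2)
print(caffeine_mass)  # 194.19, matching the mass quoted on the slide
```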

  18. What is ChEBI? • Chemical Entities of Biological Interest • Freely available • Focused on ‘small’ chemical entities (no proteins or nucleic acids) • Illustrated dictionary of chemical nomenclature • High quality, manually annotated • Provides chemical ontology • Access ChEBI at http://www.ebi.ac.uk/chebi/

  19. ChEBI home page http://www.ebi.ac.uk/chebi

  20. How is ChEBI maintained? • Automatic loading of preliminary data • Automatic loading of 2 star annotated data • Manual annotation • User requests via Submission Tool • Public release: First Wednesday of every month.

  21. ChEBI entries contain • A unique, unambiguous, recommended ChEBI name and an associated stable unique identifier • An illustration where appropriate (compounds and groups, but generally not classes) • A definition where appropriate (mostly classes) • A collection of synonyms, including the IUPAC recommended name for the entity where appropriate • A collection of cross-references to other databases • Links to the ChEBI ontology

  22. ChEBI entry view

  23. Automatic Cross-references

  24. Chemical Structures • Chemical structure may be interactively explored using MarvinView applet • Available in formats • Image • Molfile • InChI and InChIKey • SMILES

  25. Molfile format

  26. Time for Exercises

  27. Searching and browsing ChEBI Block 2

  28. Simple text search Wildcard: * Enter any text
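
To illustrate what a `*` wildcard does in a text search, here is a small sketch using Python's standard `fnmatch` module on a hand-picked list of names taken from earlier slides. It is not how the ChEBI search is implemented; the function name `wildcard_search` and the `NAMES` list are assumptions made for this example. Note that `fnmatch` also gives special meaning to `?` and `[...]`, which the slide does not mention.

```python
from fnmatch import fnmatchcase

# A hand-picked list of names from the slides; a stand-in for a real name index.
NAMES = ["paracetamol", "acetaminophen", "4-acetamidophenol", "caffeine", "adrenaline"]

def wildcard_search(pattern, names):
    """Case-insensitive match where '*' stands for any run of characters."""
    pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]

print(wildcard_search("acet*", NAMES))   # names starting with "acet"
print(wildcard_search("*ine", NAMES))    # names ending with "ine"
print(wildcard_search("*acet*", NAMES))  # names containing "acet"
```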

  29. Simple Text Search

  30. Advanced Search

  31. Advanced text search Narrow to category AND, OR and BUT NOT
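
The three operators on this slide behave like set operations on result lists: AND is an intersection, OR a union, and BUT NOT a difference. The sketch below shows this on a toy list of entry names from earlier slides; the `hits` helper and `ENTRIES` list are assumptions for illustration, and a plain substring match stands in for the real text search.

```python
# Toy entry names taken from earlier slides; not a real ChEBI result set.
ENTRIES = ["gamma-aminobutyric acid", "clavulanic acid", "adenosine 5'-triphosphate"]

def hits(word, entries=ENTRIES):
    """Return the set of entries whose name contains the word (case-insensitive)."""
    return {entry for entry in entries if word.lower() in entry.lower()}

both = hits("acid") & hits("clavulanic")          # acid AND clavulanic
either = hits("clavulanic") | hits("adenosine")   # clavulanic OR adenosine
only_first = hits("acid") - hits("clavulanic")    # acid BUT NOT clavulanic

print(both)
print(either)
print(only_first)
```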

  32. Structure search Structure drawing tools Search options

  33. Search Results Download your search results Hover-over for zoomed in image Click to go to entry page

  34. Fingerprints • Chemical substructure searching is computationally expensive…

  35. Fingerprints [2] • … so heuristics must be used to decrease the number of search candidates • Example: C8H9NO2 cannot be a substructure of an entity which does not have at least 8 carbon atoms, 9 hydrogen atoms… • Fingerprints are a generalized, abstract encoding of structural features which can be used as an effective screening device
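
The atom-count heuristic on this slide can be sketched as a cheap pre-filter that runs before any expensive substructure matching. This is an illustrative sketch only: the function names `atom_counts` and `may_contain` are assumptions, the candidate list is hand-made (phenacetin, C10H13NO2, is added purely as an example and does not appear in the slides), and hydrogens are counted exactly as the slide describes.

```python
import re
from collections import Counter

def atom_counts(formula):
    """Count atoms per element in a flat formula such as 'C8H9NO2'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def may_contain(candidate_formula, query_formula):
    """Screen: True if the candidate has at least as many atoms of every element as the query."""
    candidate = atom_counts(candidate_formula)
    query = atom_counts(query_formula)
    return all(candidate[element] >= n for element, n in query.items())

query = "C8H9NO2"
candidates = {"caffeine": "C8H10N4O2", "water": "H2O", "phenacetin": "C10H13NO2"}
survivors = [name for name, formula in candidates.items() if may_contain(formula, query)]
print(survivors)  # ['caffeine', 'phenacetin']
```

Passing the screen does not mean the query really is a substructure: caffeine survives on atom counts alone, so the survivors still need the full substructure check. The screen only rules candidates out.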

  36. Fingerprints [3] • Encoding of structural patterns, e.g. water (HOH): 0-bond paths: H, O, H; 1-bond paths: HO, OH; 2-bond paths: HOH • Hashed to create bit strings, which are added together to give final fingerprint
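
The water example on this slide can be reproduced with a short path-enumeration sketch. The code below is an assumption-laden illustration rather than the fingerprinting scheme ChEBI actually uses: the function names, the 16-bit length, and the choice of `zlib.crc32` as the hash are all picked for the example, and the per-path bit strings are combined with a bitwise OR.

```python
import zlib

def enumerate_paths(atoms, bonds, max_bonds=2):
    """Return the atom-symbol strings along all simple paths of up to max_bonds bonds."""
    neighbours = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    found = set()

    def walk(path):
        found.add("".join(atoms[i] for i in path))
        if len(path) - 1 < max_bonds:
            for nxt in neighbours[path[-1]]:
                if nxt not in path:  # simple paths only: never revisit an atom
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return found

def fingerprint(paths, n_bits=16):
    """Hash each path to one bit position and OR the bits into a single integer."""
    bits = 0
    for path in paths:
        bits |= 1 << (zlib.crc32(path.encode()) % n_bits)
    return bits

# Water: atoms H, O, H with bonds H-O and O-H.
water_paths = enumerate_paths(["H", "O", "H"], [(0, 1), (1, 2)])
print(sorted(water_paths))  # ['H', 'HO', 'HOH', 'O', 'OH']
print(format(fingerprint(water_paths), "016b"))
```

Because several paths can hash to the same bit, a fingerprint can only show that a pattern is definitely absent, never that it is definitely present.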

  37. Types of structure search • Identity – based on InChI, e.g. InChI=1/H2O/h1H2 • Substructure – uses fingerprints to narrow search range, then performs full substructure search algorithm • Similarity – based on Tanimoto coefficient calculated between the fingerprints, e.g. for fingerprints a = 0010110010 and b = 1010110111: Tanimoto(a,b) = c / (a+b-c) = 4 / (4+7-4) = 0.57, where a and b in the formula are the numbers of bits set in each fingerprint and c is the number of bits set in both
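
The Tanimoto calculation on this slide can be checked in a few lines. This is a minimal sketch that treats the two fingerprints as strings of '0' and '1' characters; the function name `tanimoto` and the convention of returning 0.0 when neither fingerprint has any bit set are assumptions made for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = fp_a.count("1")  # bits set in the first fingerprint
    b = fp_b.count("1")  # bits set in the second fingerprint
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")  # bits set in both
    denominator = a + b - c
    return c / denominator if denominator else 0.0

score = tanimoto("0010110010", "1010110111")
print(round(score, 2))  # 0.57, i.e. 4 / (4 + 7 - 4)
```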

  38. Browse via Periodic Table Molecular entities / Elements

  39. Navigate via links in ontology Click to follow links

  40. Time for Exercises

  41. Understanding the ChEBI ontology Block 3

  42. Annotation of bioinformatics data • Essential for capturing understanding and knowledge associated with core data • Often captured in free text, which is easier to read and better for conveying understanding to a human audience, but… • Difficult for computers to parse • Quality varies from database to database • Terminology used varies from annotator to annotator • Towards annotation using standard vocabularies: ontologies within bioinformatics

  43. The ChEBI ontology Organised into three sub-ontologies, namely • Molecular structure ontology • Subatomic particle ontology • Role ontology (R)-adrenaline

  44. Molecular structure ontology

  45. Role ontology

  46. ChEBI ontology relationships • Generic ontology relationships • Chemistry-specific relationships
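
One way to picture typed ontology relationships is as labelled edges in a graph. The sketch below uses a hand-made toy fragment loosely based on the caffeine slide; the edge list, the intermediate class "methylxanthine", and the helper names are assumptions for illustration and are not copied from the ChEBI ontology. It uses `is_a` as a generic relationship and `has_role` to link a structure to a role.

```python
from collections import defaultdict

# Toy (subject, relationship, object) edges; an illustrative fragment only.
EDGES = [
    ("caffeine", "is_a", "trimethylxanthine"),
    ("trimethylxanthine", "is_a", "methylxanthine"),
    ("caffeine", "has_role", "CNS stimulant"),
    ("caffeine", "has_role", "metabolite"),
]

def build_index(edges):
    """Index edges as index[subject][relationship] -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in edges:
        index[subject][relation].add(obj)
    return index

def ancestors(index, term, relation="is_a"):
    """Follow one relationship type transitively and collect every term reached."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index[current][relation]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

index = build_index(EDGES)
print(sorted(ancestors(index, "caffeine")))      # classes reached via is_a
print(sorted(index["caffeine"]["has_role"]))     # roles attached directly
```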

  47. Viewing ChEBI ontology

  48. Viewing ChEBI ontology [2] Tree view

  49. Browsing ChEBI ontology (OLS) Browse the ontology Ontology Lookup Service (OLS): http://www.ebi.ac.uk/ontology-lookup/

  50. Ontology Lookup Service • Provides a centralised query interface for ontology and controlled vocabulary lookup • Can integrate any ontology available in OBO (Open Biomedical Ontologies) format • At last release, 58 ontologies integrated, including • GO • ChEBI • Molecular interaction (PSI MI) • Pathway ontology (PW) • Human disease (DOID) • and many more… • Provides a search and a browse facility, as well as displaying a graph of terms and relationships
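
To give a feel for the OBO format mentioned on this slide, here is a minimal sketch that reads `[Term]` stanzas from an in-memory sample. The sample text, the `EX:` identifiers, and the function name `parse_obo_terms` are made up for illustration; a real OBO file carries many more tags, and this sketch only keeps simple `tag: value` pairs and collects `is_a` parents.

```python
# A tiny hand-written OBO-style sample; identifiers are placeholders, not real ChEBI IDs.
SAMPLE_OBO = """\
[Term]
id: EX:0001
name: methylxanthine

[Term]
id: EX:0002
name: trimethylxanthine
is_a: EX:0001 ! methylxanthine

[Term]
id: EX:0003
name: caffeine
is_a: EX:0002 ! trimethylxanthine
"""

def parse_obo_terms(text):
    """Return {term_id: {"name": ..., "is_a": [...], ...}} for every [Term] stanza."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Start a new stanza; anything other than [Term] is skipped.
            current = {"is_a": []} if line == "[Term]" else None
            continue
        if not line or current is None or ": " not in line:
            continue
        tag, value = line.split(": ", 1)
        value = value.split(" ! ", 1)[0].strip()  # drop the trailing "! comment"
        if tag == "is_a":
            current["is_a"].append(value)
        else:
            current[tag] = value
            if tag == "id":
                terms[value] = current
    return terms

terms = parse_obo_terms(SAMPLE_OBO)
print(terms["EX:0003"]["name"], "is_a", terms["EX:0003"]["is_a"])
```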
