270 likes | 410 Vues
The Pathway Tools Schema provides a structured framework for understanding complex biological data within Pathway/Genome Databases (PGDBs). It outlines the organization of interconnected objects representing biological entities and details the frames, slots, and annotations that comprise a knowledge base. This guide emphasizes the significance of writing precise queries to extract specific information and the relationships between various biological components such as genes and enzymatic reactions. The insights gained from visualizations and analyses are foundational for research and bioinformatics applications.
E N D
Motivations for Understanding Schema • Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB • When writing complex queries to PGDBs, those queries must name classes and slots within the schema • A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity
Reference • Pathway Tools User’s Guide, Volume I • Appendix A: Guide to the Pathway Tools Schema
Web of Relationships for One Enzyme Succinate + FAD = fumarate + FADH2 Enzymatic-reaction Succinate dehydrogenase Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2 sdhC sdhD sdhA sdhB TCA Cycle
Frame Data Model • Frame Data Model -- organizational structure for a PGDB • Knowledge base (KB, Database, DB) • Frames • Slots • Facets • Annotations
Knowledge Base • Collection of frames and their associated slots, values, facets, and annotations • AKA: Database, PGDB • Can be stored within • An Oracle or MySQL DB • A disk file • Pathway Tools binary program
Frames • Entities with which facts are associated • Kinds of frames: • Classes: Genes, Pathways, Biosynthetic Pathways • Instances (objects): trpA, TCA cycle • Classes: • Superclass(es) • Subclass(es) • Instance(s) • A symbolic frame name (id, key) uniquely identifies each frame
Slots • Encode attributes/properties of a frame • Integer, real number, string • Represent relationships between frames • The value of a slot is the identifier of another frame • Every slot is described by a “slot frame” in a KB that defines meta information about that slot
Slot Links Succinate + FAD = fumarate + FADH2 Enzymatic-reaction Succinate dehydrogenase Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2 sdhC sdhD sdhA sdhB TCA Cycle in-pathway reaction catalyzes component-of product
Slots • Number of values • Single valued • Multivalued: sets, bags • Slot values • Any LISP object: Integer, real, string, symbol (frame name) • Slotunits define properties of slots: datatypes, classes, constraints • Two slots are inverses if they encode opposite relationships • Slot Product in class Genes • Slot Gene in class Polypeptides
Representation of Function EC# Keq Succinate + FAD = fumarate + FADH2 Cofactors Inhibitors Enzymatic-reaction Molecular wt pI Succinate dehydrogenase Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2 sdhC sdhD sdhA sdhB TCA Cycle Left-end-position
Monofunctional Monomer Pathway Reaction Enzymatic-reaction Monomer Gene
Bifunctional Monomer Pathway Reaction Reaction Enzymatic-reaction Enzymatic-reaction Monomer Gene
Monofunctional Multimer Pathway Reaction Enzymatic-reaction Multimer Monomer Monomer Monomer Monomer Gene Gene Gene Gene
Pathway and Substrates Reactant-1 Pathway left in-pathway Reactant-2 Reaction Reaction Reaction Reaction Product-1 right Product-2
Transcriptional Regulation trp Int005 apoTrpR Int001 TrpR*trp site001 pro001 Int003 RpoSig70 trpL trpLEDCBA trpE trpD trpC trpB trpA
Principle Classes • Class names are capitalized, plural, separated by dashes • Genetic-Elements, with subclasses: • Chromosomes • Plasmids • Genes • Transcription-Units • RNAs • rRNAs, snRNAs, tRNAs, Charged-tRNAs • Proteins, with subclasses: • Polypeptides • Protein-Complexes
Principle Classes • Reactions, with subclasses: • Transport-Reactions • Enzymatic-Reactions • Pathways • Compounds-And-Elements
Frame IDs of Instances • Instance frame ID conventions have evolved over time • Examples: • Pathways • TRPSYN-PWY, P23-PWY • Genes • AG10045 • Monomers • TRPA-MONOMER, AG10045-MONOMER
Slots in Multiple Classes • Common-Name • Synonyms • Names (computed as union of Common-Name, Synonyms) • Comment • Citations • DB-Links
Genes Slots • Component-Of (links to replicon, transcription unit) • Left-End-Position • Right-End-Position • Centisome-Position • Transcription-Direction • Product
Proteins Slots • Molecular-Weight-Seq • Molecular-Weight-Exp • pI • Locations • Modified-Form • Unmodified-Form • Component-Of
Polypeptides Slots • Gene
Protein-Complexes Slots • Components
Reactions Slots • EC-Number • Left, Right • Substrates (computed as union of Left, Right) • DeltaG0 • Keq • Spontaneous?
Enzymatic-Reactions Slots • Enzyme • Reaction • Activators • Inhibitors • Physiologically-Relevant • Cofactors • Prosthetic-Groups • Alternative-Substrates • Alternative-Cofactors
Pathways Slots • Reaction-List • Predecessors • Primaries