
UniProt: Universal Protein Resource

UniProt (Universal Protein Resource) is a central resource of protein sequence and function. It is run by an international consortium of PIR at GUMC, the European Bioinformatics Institute, and the Swiss Institute of Bioinformatics, and it unifies the PIR-PSD, Swiss-Prot, and TrEMBL protein sequence databases. http://www.uniprot.org



Presentation Transcript


  1. UniProt: Universal Protein Resource
     Central Resource of Protein Sequence and Function
     • International Consortium
       • PIR at GUMC
       • European Bioinformatics Institute
       • Swiss Institute of Bioinformatics
     • Unifies PIR-PSD, Swiss-Prot, TrEMBL Protein Sequence Databases
     http://www.uniprot.org

  2. UniProt Databases
     • UniParc: Comprehensive Sequence Archive with Sequence History
     • UniRef: Non-redundant Reference Databases for Sequence Search
     • UniProtKB: Knowledgebase with Full Classification and Functional Annotation

  3. UniProt Archive (UniParc)
     • An archive for tracking protein sequences
     • Comprehensive: All published protein sequences
     • Non-Redundant: Merge identical sequence strings
     • Traceable: Versioned, with ‘Active’ or ‘Obsolete’ status tag
     • Concise: No annotation of function, species, tissue, etc.
     • 5 million unique entries from 13 million source-database entries
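The archive behaviour listed on this slide (merge identical sequence strings, keep versioned links, tag them active or obsolete) can be illustrated with a minimal Python sketch. It is not UniParc's actual implementation: the class names, the MD5 checksum, and the accessions `X1` and `Y1` are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SourceRef:
    """One cross-reference from a source database to an archived sequence."""
    database: str
    accession: str
    version: int = 1
    active: bool = True  # False corresponds to the 'Obsolete' status tag


@dataclass
class ArchiveEntry:
    """One unique sequence string plus every source entry that carried it."""
    sequence: str
    refs: list = field(default_factory=list)


class SequenceArchive:
    """Toy archive (hypothetical): identical sequence strings share one entry."""

    def __init__(self):
        self.entries = {}  # checksum -> ArchiveEntry
        self.current = {}  # (database, accession) -> (checksum, SourceRef)

    @staticmethod
    def checksum(sequence):
        # Any stable digest works here; MD5 is an arbitrary choice for the sketch.
        return hashlib.md5(sequence.encode("ascii")).hexdigest()

    def load(self, database, accession, sequence):
        sequence = sequence.upper()
        key = self.checksum(sequence)
        previous = self.current.get((database, accession))
        version = 1
        if previous is not None:
            old_key, old_ref = previous
            if old_key == key:
                return key  # unchanged sequence, nothing new to record
            old_ref.active = False  # sequence changed: mark the old link obsolete
            version = old_ref.version + 1
        entry = self.entries.setdefault(key, ArchiveEntry(sequence))
        ref = SourceRef(database, accession, version)
        entry.refs.append(ref)
        self.current[(database, accession)] = (key, ref)
        return key


archive = SequenceArchive()
first = archive.load("Swiss-Prot", "X1", "MKTAYIAK")  # hypothetical accession
second = archive.load("TrEMBL", "Y1", "mktayiak")     # same string, other source
third = archive.load("TrEMBL", "Y1", "MKTAYIAKQ")     # Y1 now carries a new sequence
```

After these three loads the archive holds two unique sequences. The first entry keeps both cross-references, with the `Y1` link marked obsolete, and no functional annotation is stored anywhere.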

  4. UniProt Reference Clusters (UniRef)
     • Non-Redundant Reference Clusters for Sequence Searching
     • UniRef100 for Comprehensive Sequence Similarity Search
     • 100% sequence identity from all species, merging sub-fragments
     • Derived from UniProtKB – Splice variants as separate entries
     • Additional UniParc sources (e.g. Ensembl, IPI, EMBL_WGS)
     [Figure: sub-fragments and splice variants]
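As a rough illustration of the UniRef100 rule on this slide, the sketch below merges identical sequences and contiguous sub-fragments into one cluster and leaves a splice variant in a cluster of its own. The function name `cluster_100`, the record identifiers, and the toy sequences are assumptions; the slides do not describe the actual procedure.

```python
def cluster_100(records):
    """Group (identifier, sequence) pairs into 100%-identity clusters.

    A sequence joins an existing cluster when it is identical to, or a
    contiguous sub-fragment of, that cluster's representative sequence.
    """
    # Longest first, so a full-length sequence is seen before its fragments.
    ordered = sorted(records, key=lambda rec: len(rec[1]), reverse=True)
    clusters = []  # each cluster: {"representative": str, "members": [ids]}
    for identifier, sequence in ordered:
        for cluster in clusters:
            if sequence in cluster["representative"]:
                cluster["members"].append(identifier)
                break
        else:
            clusters.append({"representative": sequence, "members": [identifier]})
    return clusters


records = [
    ("full", "MKTAYIAKQRQISFVK"),
    ("copy", "MKTAYIAKQRQISFVK"),   # identical string from another source
    ("fragment", "AYIAKQRQ"),       # contiguous piece of the full sequence
    ("variant", "MKTAYIAKQISFVK"),  # internal residues missing: not a fragment
]
clusters = cluster_100(records)
```

Here `full`, `copy`, and `fragment` end up in one cluster, while `variant` stays separate because it is not a contiguous piece of the full-length sequence.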

  5. UniProt Reference Clusters (UniRef)
     • UniRef90/50 for Faster Searches using Reduced Data Sets
     • UniRef90: 90% sequence identity (35% reduction from UniRef100)
     • UniRef50: 50% sequence identity (65% reduction)
     • Representative Sequence for cluster
     [Figure: Database Size, Release 4.4 (03/29/05)]
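How a lower identity threshold shrinks the data set can be shown with a small greedy clustering sketch, in which each cluster is represented by the first (longest) sequence assigned to it. This is only an assumption about the general approach, since the slides do not name the clustering method, and the `identity` function is a crude ungapped stand-in for a real alignment-based identity.

```python
def identity(a, b):
    """Crude ungapped identity: matching positions / length of the longer sequence."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(sequences, threshold):
    """Put each sequence into the first cluster whose representative it matches."""
    clusters = []  # list of (representative, members)
    for seq in sorted(sequences, key=len, reverse=True):
        for representative, members in clusters:
            if identity(seq, representative) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))  # seq becomes a new representative
    return clusters


def reduction(reference_count, reduced_count):
    """Fractional size reduction relative to the reference set."""
    return 1 - reduced_count / reference_count


toy = ["AAAAAAAAAA", "AAAAAAAAAC", "AAAAACCCCC", "CCCCCCCCCC"]
at_100 = greedy_cluster(toy, 1.0)
at_90 = greedy_cluster(toy, 0.9)
at_50 = greedy_cluster(toy, 0.5)
```

On this toy input the 100% level keeps four clusters, the 90% level three, and the 50% level two, so searching only the representatives touches fewer sequences as the threshold drops. The toy reductions are unrelated to the 35% and 65% figures quoted on the slide.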

  6. UniProt Knowledgebase (UniProtKB)
     • Objective: Stable, Comprehensive, Fully Classified, Richly and Accurately Annotated
     • Describe in a single record all protein products derived from a certain gene in a given species
     • Information Content
       • Isoform Presentation: Alternatively Spliced Forms, Proteolytic Cleavage, and Post-Translational Modification (each with FTid)
       • Nomenclature: Gene/Protein Names (Nomenclature Committees)
       • Family Classification and Domain Identification: InterPro and PIRSF
       • Functional Annotation: Function, Functional Site, Developmental Stage, Catalytic Activity, Modification, Regulation, Induction, Pathway, Tissue Specificity, Subcellular Location, Disease, Process
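The "single record per gene per species" principle can be sketched as a grouping step. The classes, field names, and placeholder values below (`GENE_A`, `FT_0001`, and so on) are hypothetical and do not reflect the real UniProtKB schema or real identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinProduct:
    gene: str
    species: str
    kind: str        # e.g. "splice isoform", "cleaved chain", "modified form"
    feature_id: str  # placeholder standing in for the FTid tagging each product


@dataclass
class KnowledgebaseRecord:
    gene: str
    species: str
    products: list = field(default_factory=list)


def build_records(products):
    """Collect all products of one gene in one species into a single record."""
    records = {}
    for product in products:
        key = (product.gene, product.species)
        record = records.setdefault(key, KnowledgebaseRecord(*key))
        record.products.append(product)
    return records


products = [
    ProteinProduct("GENE_A", "human", "splice isoform", "FT_0001"),
    ProteinProduct("GENE_A", "human", "cleaved chain", "FT_0002"),
    ProteinProduct("GENE_A", "mouse", "splice isoform", "FT_0003"),
]
records = build_records(products)
```

The two human products of `GENE_A` land in one record, while the mouse product gets its own, because records are keyed by gene and species together.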

  7. UniProtKB Report (I)

  8. UniProtKB Report (II)
     http://www.pir.uniprot.org/cgi-bin/upEntry?id=PH4H_HUMAN
