
Protein Structure Database Introduction

Protein Structure Database Introduction. ModBase: Database of Comparative Protein Structure Models. Institute of Bioinformatics, g934251, 詹濠先.


Presentation Transcript


  1. Protein Structure Database Introduction • ModBase: Database of Comparative Protein Structure Models • Institute of Bioinformatics, g934251, 詹濠先

  2. General Information • MODBASE is a queryable database of annotated protein structure models. The models are derived by ModPipe, an automated modeling pipeline that relies on the programs PSI-BLAST and MODELLER. The database also includes the fold assignments and alignments on which the models were based. MODBASE contains theoretically calculated models, not experimentally determined structures, and these models may contain significant errors. Thus, special care is taken to assess the quality of the models. More information about MODBASE can be found in the HTML or PDF version of: R. Sánchez, U. Pieper, N. Mirkovic, P.I.W. de Bakker, E. Wittenstein & A. Sali. ModBase: a database of annotated comparative protein structure models. Nucl. Acids Res. 28, 250-253, 2000.

  3. MODBASE core • Models in MODBASE are calculated using MODPIPE, an entirely automated software pipeline for comparative modeling (16). MODPIPE can calculate comparative models for a large number of protein sequences, using many different template structures and sequence–structure alignments. MODPIPE relies on the various modules of MODELLER for its functionality and is streamlined for large-scale operation on a cluster of PCs using scripts written in Perl.

  4. Models in MODBASE are organized into data sets. The largest data set contains models of all sequences in the Swiss-Prot/TrEMBL database that are detectably related to at least one known structure in the PDB. • Currently, there are 1,262,629 models for domains in 659,495 of the 1,182,126 sequences in the Swiss-Prot/TrEMBL database, with an average length of 235 residues per model. Sequences per organism: Human: 32,985; Arabidopsis thaliana: 22,880; Drosophila melanogaster: 15,195; Escherichia coli: 9,691. • Because the sequence databases contain sequences from different strains and mutants, the number of unique sequences for a given organism exceeds the number of genes in its genome.
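
To put the counts on this slide in proportion, here is a small arithmetic sketch. It uses only the three totals quoted above; the variable names are chosen for illustration.

```python
# Totals quoted on the slide above.
total_models = 1_262_629        # models for domains
modeled_sequences = 659_495     # sequences with at least one model
all_sequences = 1_182_126       # sequences in Swiss-Prot/TrEMBL

# Fraction of database sequences that have at least one model.
coverage = modeled_sequences / all_sequences

# Average number of models per modeled sequence.
models_per_sequence = total_models / modeled_sequences

print(f"Coverage: {coverage:.1%}")                           # Coverage: 55.8%
print(f"Models per modeled sequence: {models_per_sequence:.2f}")  # 1.91
```

So a little over half of the sequences have at least one model, and each modeled sequence has about two models on average.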

  5. Specialities • Predicted interacting proteins: Residue contacts between two models are predicted when both modeled sequences match different parts of a single PDB file. The residue contacts in the hypothetical interface are scored by their propensities to span an interface. False positive ratio: 25%.
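
The slide does not say how the propensities are combined, so the following is only a sketch of one plausible approach: summing log-propensities over the residue-type pairs in a putative interface. The propensity table, its values, and the function name `interface_score` are all hypothetical and are not taken from ModBase.

```python
from math import log

# HYPOTHETICAL propensities (not ModBase data): a value above 1 means the
# residue-type pair spans interfaces more often than expected by chance.
# Keys are frozensets so that (A, B) and (B, A) map to the same entry.
PAIR_PROPENSITY = {
    frozenset(["LEU", "PHE"]): 1.8,
    frozenset(["ARG", "ASP"]): 1.5,
    frozenset(["LYS"]): 0.4,  # LYS-LYS pair
}

def interface_score(contacts, default=1.0):
    """Sum log-propensities over residue-type pairs (assumed scoring scheme).

    Pairs missing from the table get `default`, which contributes log(1) = 0.
    """
    return sum(log(PAIR_PROPENSITY.get(frozenset(pair), default))
               for pair in contacts)

# Toy interface: each tuple is (residue type in model A, residue type in model B).
contacts = [("LEU", "PHE"), ("ASP", "ARG"), ("LYS", "LYS"), ("GLY", "SER")]
score = interface_score(contacts)
print(f"Interface score: {score:.3f}")
```

A positive total suggests the contacts are, on balance, of types favored at interfaces. A cutoff on such a score would trade false positives against missed interactions.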

  6. Specialities • Predicted ligand binding sites: ModBase contains a list of the binding sites of known structure for ~50,000 ligands found in the PDB. Forty-four percent of the models in MODBASE have at least one predicted binding site for a small ligand. • Application of MODBASE to structural genomics: NYSGXRC structures, PSI-BLAST E-value

  7. Access and Interface • MODBASE is queryable at http://salilab.org/modbase by PDB codes, Swiss-Prot/TrEMBL and GenPept accession numbers, annotation keywords, model reliability, model size, target–template sequence identity, alignment significance, and sequence similarity to the modeled sequences as detected by BLAST. • The output of a search is displayed on pages with varying amounts of information about the modeled sequences, template structures, alignments and functional annotations. These pages also contain links to other sequence, structure and function annotation databases, such as PDB (4), GenBank (3), Swiss-Prot/TrEMBL (2), CATH (32), Pfam (33), ProDom (34), and UCSC Genome Browser (35). In addition, MODBASE models are directly accessible from the Swiss-Prot/TrEMBL sequence pages at http://www.expasy.org and from the UCSC Genome Browser at http://genome.ucsc.edu. • Screenshot steps: 1. Enter PDB, Swiss-Prot, GenPept… etc. here. 2. Press “Search”!
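
To make the search criteria above concrete, here is a minimal local sketch of filtering model records by template, size, sequence identity, and reliability. This is not the ModBase interface or API: the `ModelRecord` fields, the `search` function, and the toy records are all hypothetical, chosen only to mirror the criteria listed on the slide.

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    # Hypothetical fields mirroring the search criteria named on the slide.
    accession: str       # sequence accession (placeholder values below)
    template_pdb: str    # template structure code (placeholder values below)
    length: int          # model size in residues
    seq_identity: float  # target-template sequence identity, in percent
    reliable: bool       # whether the model passed a reliability check

def search(models, template_pdb=None, min_length=0,
           min_identity=0.0, reliable_only=False):
    """Return the records that satisfy every criterion that was given."""
    hits = []
    for m in models:
        if template_pdb is not None and m.template_pdb != template_pdb:
            continue
        if m.length < min_length or m.seq_identity < min_identity:
            continue
        if reliable_only and not m.reliable:
            continue
        hits.append(m)
    return hits

# Made-up toy records; the identifiers are placeholders, not real entries.
models = [
    ModelRecord("SEQ_A", "tmp1", 235, 42.0, True),
    ModelRecord("SEQ_B", "tmp1", 120, 18.5, False),
    ModelRecord("SEQ_C", "tmp2", 310, 65.0, True),
]

hits = search(models, min_identity=30.0, reliable_only=True)
print([m.accession for m in hits])  # ['SEQ_A', 'SEQ_C']
```

Combining criteria with a logical AND, as here, narrows the result set. Leaving an optional argument at its default leaves that criterion unconstrained.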

  8. Search • Select your purpose, e.g., “Find Ligand binding site”. • Select any of your interests.

  9. Homolog Structure Prediction of Ligand Binding Sites

  10. Reference • Pieper, U. et al. MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Res. 2004 January 1; 32(Database issue): D217–D222.
