Enhancing Homology Modeling for Structural Genomics

Topic 3: Structural Genomics and Models
Contributors: S.K. Burley, A. Fiser, A. Godzik, A. Joachimiak, J. Markley, G. Montelione, C. Orengo, A. Sali, and M. Sauder
Discussion Leader: Stephen K. Burley
Workshop on Biological Macromolecular Structure Models
RCSB PDB • Piscataway, NJ • November 19-20, 2005

Role of Comparative Protein Structure Modeling in Structural Genomics

Protein Structure Initiative 2: Need for Large-Scale Homology Modeling
• PSI-2 will yield 3,000-4,000 protein structures, most at coarse granularity
• Each structure will represent a large number of sequence homologues
• Homology modeling must provide “useful” models for distant (15-30% sequence identity) homologues, to support protein function assignment and evolutionary insights
• Models should guide functional characterization
• Models must be readily accessible
• Models must be subject to rigorous peer review

Issues Addressed in Contributed Slides
• Current limitations of homology modeling
• Role of homology modeling in target selection/execution
• Role of homology modeling in structure determination
• Homology modeling pipelines

Current Limitations of Homology Modeling
• Input from:
• Joachimiak (MCSG)
• Sali (NYSGXRC)

Issues with Homology Modeling for Structural Genomics
• Models for distant (15-30% sequence identity) homologues are of poor quality
• For very large families, only a small fraction of sequences (<10%) can be reliably modeled
• Modeling must guide target selection for fine coverage of protein families
• Domain parsing needs improvement
• We should be able to model multi-domain proteins from structures of individual domains
• We should be able to model side chains and important structural and functional features that are currently difficult to assign and predict correctly
• We need methods to predict unusual features and departures from the structure used as the modeling template
• Modeling of loops and high B-factor regions needs improvement

[Figure: Models Based on NYSGXRC Target Structures. Good models: E-value ≤ 1.0e-4 and GAScore ≥ 0.7. Labeled regions: good models <30% seq. id.; good models >30% seq. id.; scope for further improvement (significant E-value, bad model score). Only 363 bad models at ≥30% sequence identity.]
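The "good model" cutoffs quoted on this slide (E-value ≤ 1.0e-4, GAScore ≥ 0.7) and the 30% sequence-identity split can be illustrated with a minimal Python sketch. The `Model` record, its field names, and the label strings are hypothetical and chosen only for illustration; the slide does not say how the NYSGXRC pipeline stores or labels its models, and treating the two cutoffs as a joint requirement is an assumption based on the "significant e-value, bad model score" region of the figure.

```python
from dataclasses import dataclass

# Thresholds as quoted on the slide.
E_VALUE_MAX = 1.0e-4    # alignment E-value must be at or below this
GA_SCORE_MIN = 0.7      # model score (GAScore) must be at or above this
SEQ_ID_SPLIT = 30.0     # percent sequence identity separating the two tiers


@dataclass
class Model:
    """Hypothetical record for one comparative model (not a real pipeline type)."""
    name: str
    e_value: float
    ga_score: float
    seq_id: float  # percent sequence identity to the template


def is_good(model: Model) -> bool:
    """Assumption: a model is 'good' only if it passes both cutoffs."""
    return model.e_value <= E_VALUE_MAX and model.ga_score >= GA_SCORE_MIN


def bin_model(model: Model) -> str:
    """Assign a model to one of the regions named in the figure."""
    tier = ">=30% seq. id." if model.seq_id >= SEQ_ID_SPLIT else "<30% seq. id."
    if is_good(model):
        return f"good, {tier}"
    if model.e_value <= E_VALUE_MAX:
        # Significant E-value but a bad model score: the region the slide
        # marks as having scope for further improvement.
        return f"scope for improvement, {tier}"
    return f"not significant, {tier}"


if __name__ == "__main__":
    examples = [
        Model("close_homologue", 1e-40, 0.95, 52.0),
        Model("distant_homologue", 1e-6, 0.45, 22.0),
    ]
    for m in examples:
        print(m.name, "->", bin_model(m))
```

Binning models this way makes it straightforward to count how many fall into each region, for example how many models at or above 30% identity still fail the score cutoff.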

Questions for Homology Modeling Community
• Should models be stored in archives or calculated “on the fly”?
• Should models from pipeline approaches be centrally accessible?
• Should the output of pipeline approaches be made interoperable with the PDB?
• Should there be a publicly available model database for storage of modeling results to facilitate peer review?
• Should models currently on deposit in the PDB be moved elsewhere? If so, where?
