
Protein Structure Prediction: Methods and Evaluation

This chapter discusses various methods for predicting protein structure, including nucleotide alignment and artificial neural networks. It also explores the challenges of evaluating the accuracy and effectiveness of these methods.



Presentation Transcript


  1. Chapter 9 Structure Prediction

  2. Motivation • Given a protein, can you predict its molecular structure? • We want to avoid repeated X-ray crystallography, but we still want accuracy • You could use nucleotide alignment, but what do you do with the gapped regions? • More complex methods are justified only if they can be shown to perform better than simpler methods • Simpler methods are justified only if they perform better than basic sequence alignment

  3. First Step • Some structure comparison methods use secondary structures of the new sequence • Predict location of secondary structure elements along the protein’s backbone and the degree of residue burial • Supervised learning has been shown to perform well in this task

  4. Artificial Neural Network Predicts Structure at this point

  5. Danger • You may train the network on your training set, but it may not generalize to other data • Perhaps we should train several ANNs and then let them vote on the structure
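
The voting idea on this slide can be sketched as a per-residue majority vote. This is only an illustration: the function name `jury_vote`, the three-letter alphabet (H = helix, E = strand, C = coil), and the tie-breaking rule are assumptions, and real jury schemes may average network outputs rather than count labels.

```python
from collections import Counter


def jury_vote(predictions):
    """Combine per-residue predictions from several models by majority vote.

    predictions: list of equal-length strings, one per model, using the
    assumed labels H (helix), E (strand) and C (coil).
    Ties go to the label seen first in the column, i.e. the earliest model.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        # most_common keeps first-seen order among equal counts
        label, _ = Counter(column).most_common(1)[0]
        consensus.append(label)
    return "".join(consensus)


if __name__ == "__main__":
    print(jury_vote(["HHHC", "HHCC", "HECC"]))
```

Each string stands for the output of one independently trained network on the same sequence; disagreement in a column is resolved by the most frequent label.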

  6. PHD: Profile network from HeiDelberg • A family of sequences (a multiple alignment) is used as input instead of just the new sequence • On the first level, a window of length 13 around the residue is used • The window slides down the sequence, making a prediction for each residue • The input includes the frequency of amino acids occurring in each position in the multiple alignment (in the example, there are 5 sequences in the multiple alignment) • The second level takes these predictions from neural networks that are centered on neighboring residues • The third level performs a jury selection
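
The first-level input described on this slide can be sketched as follows: compute amino acid frequencies for each alignment column, then cut a window of 13 columns around every residue. The names `alignment_profile` and `window_inputs`, the zero padding at the sequence ends, and the choice to ignore gap characters are assumptions made for this sketch, not details taken from the slides.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def alignment_profile(alignment):
    """Per-column amino acid frequencies for equal-length aligned sequences.

    Characters outside AMINO_ACIDS (such as '-') are not counted, so a
    column containing gaps has frequencies that sum to less than 1.
    """
    n_seqs = len(alignment)
    return [
        [column.count(aa) / n_seqs for aa in AMINO_ACIDS]
        for column in zip(*alignment)
    ]


def window_inputs(profile, window=13):
    """Yield (position, features) for a window centered on each column.

    Window positions that fall off either end of the sequence are padded
    with all-zero frequency vectors (an assumption of this sketch).
    """
    half = window // 2
    pad = [0.0] * len(AMINO_ACIDS)
    for i in range(len(profile)):
        features = []
        for j in range(i - half, i + half + 1):
            features.extend(profile[j] if 0 <= j < len(profile) else pad)
        yield i, features


if __name__ == "__main__":
    family = ["ACD", "ACE", "GCD", "ACD", "ACD"]  # toy alignment, 5 sequences
    for position, features in window_inputs(alignment_profile(family)):
        print(position, len(features))
```

Each feature vector has 13 × 20 = 260 values and would be the input to one first-level prediction.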

  7. PHD (figure labels: "Predicts 4", "Predicts 5", "Predicts 6")

  8. Threading • Threading matches structure to sequence • True threading considers 3D spatial interactions

  9. 3D-1D Matching (Bowie et al.) • Convert the 3D structure into a string of environment states • Include α-helix, β-sheet or neither • Include buried or solvent accessible (6 levels) • Total of 3 × 6 = 18 distinct states • With P(a:j) = probability of finding amino acid a in environment j, and P(a) = probability of finding a anywhere
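
The scoring formula itself did not survive in the transcript. The form usually cited for this method is the log-odds score ln(P(a:j) / P(a)), where P(a:j) is the probability of amino acid a in environment j and P(a) is its overall probability. The sketch below assumes that form. The function names, the toy environment labels, and the default score of 0 for unseen pairs are hypothetical.

```python
import math
from collections import Counter


def information_scores(observations):
    """Log-odds score ln(P(a:j) / P(a)) for every observed (amino acid, environment) pair.

    observations: iterable of (amino_acid, environment) tuples collected
    from residues in known structures.
    """
    observations = list(observations)
    pair_counts = Counter(observations)
    aa_counts = Counter(a for a, _ in observations)
    env_counts = Counter(j for _, j in observations)
    total = len(observations)
    scores = {}
    for (a, j), n in pair_counts.items():
        p_a_in_j = n / env_counts[j]   # P(a:j): frequency of a within environment j
        p_a = aa_counts[a] / total     # P(a): frequency of a anywhere
        scores[(a, j)] = math.log(p_a_in_j / p_a)
    return scores


def profile_score(sequence, environments, scores, unseen=0.0):
    """Ungapped score of a sequence laid over an environment string."""
    return sum(scores.get((a, j), unseen) for a, j in zip(sequence, environments))


if __name__ == "__main__":
    toy = ([("V", "buried")] * 3 + [("D", "buried")]
           + [("V", "exposed")] + [("D", "exposed")] * 3)
    table = information_scores(toy)
    print(table[("V", "buried")], table[("D", "buried")])
```

A positive score means the amino acid is found in that environment more often than its overall frequency predicts; a negative score means less often.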

  10. 3D-1D • The information value score was calculated on a training set of multiple alignments, and the scores were used as a profile for each column • When applied to the globin family, the method clearly distinguished myoglobins from nonglobins but not from other globins

  11. Methods using 3D interactions • Residues that are far apart in the sequence may end up next to each other when the protein is folded • Define a measure of contact between residues (two atoms within 5 Å) and count the frequency of contact between all pairs in the PDB • Use this measure in alignment to evaluate cost, or to select the best alignment
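
A minimal sketch of the contact count described on this slide, assuming each residue is given as an amino acid letter plus a list of atom coordinates in ångströms. The input layout, the function names, and the `min_separation` parameter (which skips residues adjacent in the chain) are assumptions of this sketch. Reading coordinates from PDB files is left out.

```python
import math
from collections import Counter
from itertools import combinations


def in_contact(atoms_a, atoms_b, cutoff=5.0):
    """True if any atom of one residue is within `cutoff` of any atom of the other."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def count_contacts(residues, min_separation=2, cutoff=5.0):
    """Count contacts per unordered amino acid pair in one chain.

    residues: list of (amino_acid, [(x, y, z), ...]) in chain order.
    Pairs closer than `min_separation` positions in the sequence are skipped,
    since chain neighbors are always close in space.
    """
    counts = Counter()
    for (i, (aa_i, atoms_i)), (j, (aa_j, atoms_j)) in combinations(enumerate(residues), 2):
        if j - i < min_separation:
            continue
        if in_contact(atoms_i, atoms_j, cutoff):
            counts[tuple(sorted((aa_i, aa_j)))] += 1
    return counts


if __name__ == "__main__":
    chain = [
        ("A", [(0.0, 0.0, 0.0)]),
        ("G", [(3.8, 0.0, 0.0)]),
        ("V", [(4.0, 2.0, 0.0)]),
        ("D", [(20.0, 0.0, 0.0)]),
    ]
    print(count_contacts(chain))
```

Summing these counters over many chains gives the pair frequencies that the slide proposes to use when costing an alignment.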

  12. 3D interactions

  13. Potentials of mean force (POMF) • Since the notion of contact is somewhat arbitrary, a more general formulation can be tried • Derive an empirical function for the propensity of each of the 400 pairs of residues to be any given distance apart.
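
One way to sketch the empirical function is to histogram observed distances for each residue pair and compare each bin with the same bin over all pairs. The bin width, the distance limit, the choice of all pairs as the reference, and the function name are assumptions of this sketch. Keys are sorted, so the 400 ordered pairs fold into 210 unordered ones.

```python
from collections import defaultdict


def distance_propensities(pair_distances, bin_width=1.0, max_distance=15.0):
    """Relative propensity of each residue pair to sit in each distance bin.

    pair_distances: iterable of (aa1, aa2, distance_in_angstroms).
    Returns {(aa1, aa2): [ratio per bin]} where
    ratio = (fraction of this pair's distances in the bin)
            / (fraction of all distances in the bin).
    Bins with no observations at all get 0.0.
    """
    n_bins = int(max_distance / bin_width)
    pair_hist = defaultdict(lambda: [0] * n_bins)
    all_hist = [0] * n_bins
    for a, b, d in pair_distances:
        if not 0 <= d < max_distance:
            continue
        k = min(int(d / bin_width), n_bins - 1)  # guard against rounding at the edge
        pair_hist[tuple(sorted((a, b)))][k] += 1
        all_hist[k] += 1
    total = sum(all_hist)
    propensities = {}
    for pair, hist in pair_hist.items():
        n_pair = sum(hist)
        propensities[pair] = [
            (h / n_pair) / (all_hist[k] / total) if all_hist[k] else 0.0
            for k, h in enumerate(hist)
        ]
    return propensities


if __name__ == "__main__":
    toy = ([("A", "V", 5.2)] * 3 + [("A", "V", 9.5)]
           + [("D", "V", 5.5)] + [("D", "V", 9.1)] * 3)
    table = distance_propensities(toy)
    print(table[("A", "V")][5], table[("D", "V")][5])
```

A ratio above 1 in a bin corresponds to a peak at that distance and a ratio below 1 to a dip. Taking the negative logarithm of the ratio would turn it into an energy-like term, which is a common convention rather than something stated in the slides.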

  14. Multiple Sequence Threading • Multiple Sequence Alignment • Align the most similar sequences to create a consensus sequence • Align consensus sequences to create the overall alignment • Use the same strategy with structures • Assume that conserved hydrophobic positions should pack in the core • This appears to be work in progress (1997)

  15. Example • Alanine (A) and valine (V) are two small hydrophobic residues, both of which favor packing in the core of the protein • The POMF would have a peak around 5 Å • Aspartate (D) and valine do not often pack together • The POMF will have a dip around 5 Å • Figure: POMF(A,V) and POMF(D,V), each plotted as probability against distance, with 5 Å marked

  16. Sequence-Structure Alignment • For all known structures • Align the unknown sequence to that structure • Find the best alignment • Return the structure with the best global alignment • Unfortunately, we can't use dynamic programming once pairwise 3D interactions are scored (the problem is NP-complete) • Heuristics must be used to explore the space
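
The outer loop on this slide can be sketched as a search over a library of structures. To keep the example short, the alignment step here is an ungapped slide of the sequence along each structure, scored position by position. That is a deliberately simple stand-in heuristic, not the method from the slides, and it ignores pairwise 3D interactions. The function names, the library layout, and the `score` callable are hypothetical.

```python
def best_ungapped_fit(sequence, environments, score):
    """Slide the sequence along a structure without gaps.

    environments: one environment label per structure position.
    score: callable (amino_acid, environment) -> number.
    Returns (best_score, offset); offset is None if the sequence is longer
    than the structure.
    """
    best = (float("-inf"), None)
    for offset in range(len(environments) - len(sequence) + 1):
        total = sum(score(a, environments[offset + k]) for k, a in enumerate(sequence))
        if total > best[0]:
            best = (total, offset)
    return best


def recognize_fold(sequence, library, score):
    """Return (name, score, offset) for the best-scoring structure, or None.

    library: dict mapping a structure name to its environment labels.
    """
    results = []
    for name, environments in library.items():
        fit, offset = best_ungapped_fit(sequence, environments, score)
        if offset is not None:
            results.append((fit, name, offset))
    if not results:
        return None
    fit, name, offset = max(results)
    return name, fit, offset


if __name__ == "__main__":
    # Toy score: hydrophobic residues like buried ('b') positions, others exposed ('e').
    def toy_score(aa, env):
        return 1 if (aa in "AVLI") == (env == "b") else -1

    print(recognize_fold("VAD", {"fold1": "bbee", "fold2": "eebb"}, toy_score))
```

Replacing `best_ungapped_fit` with a gapped or interaction-aware search changes only the inner step; the loop over known structures stays the same.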

  17. Evaluating Methods • Is the complexity worth it? • This is difficult to judge without a benchmark • Few comparative studies have been performed • When they have been performed, authors of competing methods have complained that the wrong parameters were used • Critical Assessment of Structure Prediction (CASP, first held in 1994) releases the sequences of proteins whose structures have not yet been published • All participating methods submit their predictions • Predictions are analyzed based on fold recognition, modeling accuracy and alignment accuracy • No one method or approach is obviously superior
