
Protein Structure Prediction


benson

Presentation Transcript


  1. Protein Structure Prediction

  2. What is PSP?

  3. Primary sequence (1D), e.g. …ACLLYYTTCAT… → Tertiary structure (3D): all bond angles, dihedral angles and bond lengths between each amino acid residue in the protein

  4. “Solving” PSP

  5. View PSP as a search. Given the primary sequence of a protein whose 3D structure is unknown, consider PSP as performing a search through the configuration space of that protein, i.e. the space of its different configurations.
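To get a feel for how large that configuration space is, here is a rough counting sketch. It assumes each residue independently takes one of a few discrete backbone states; the function name and the choice of 3 states per residue are illustrative assumptions, not something stated on the slides.

```python
def count_configurations(n_residues: int, states_per_residue: int = 3) -> int:
    """Size of the search space if every residue independently takes one of
    `states_per_residue` discrete backbone states (an illustrative assumption)."""
    return states_per_residue ** n_residues


# The 11-residue fragment shown on slide 3.
fragment = "ACLLYYTTCAT"
print(len(fragment), count_configurations(len(fragment)))

# The count grows exponentially with chain length.
for n in (50, 100, 300):
    print(n, count_configurations(n))
```

Even with this coarse discretization the count is exponential in the chain length, which is why the later slides concentrate on search strategy.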

  6. Steps in solving PSP. Given the primary sequence, predict the final 3D structure (1D → 3D). This is a 2-step process (1D → 2D, 2D → 3D): 1st, find the configuration of the secondary structure (SS prediction); 2nd, find the configuration of the side-chains (side-chain conformation).

  7. Required “components” in solving PSP. All methods require the definition of a protein model: a simplified protein structure model and a potential energy function.

  8. Simplified structure model of protein. A simplified model is one in which the protein in question has simpler physical properties than an actual protein. This is needed because solving PSP for real proteins is too complex. Good simplified models give a good approximation of the actual shape of the protein. Determining a good model is a research area in itself.

  9. Example of Simplified Model of Protein. [Figure: “actual” model of the protein backbone (3 dihedral angles) beside a simplified model of the protein backbone (the bond angle is the only dihedral angle)]

  10. Potential Energy Function. In thermodynamics, a molecule is most stable when its free energy is at a minimum. How do we know when a predicted structure is the native shape of the protein? The native shape is at a free energy minimum. The potential energy function is a simplification of the actual forces acting on a real protein molecule, and its formulation is based on the given simplified structural model.

  11. Example of Potential Energy Function. Purpose: minimize $E_{total} = E_{HH} + E_{vdw}$, where $E_{HH}$ is the hydrophobic interaction and $E_{vdw}$ is the van der Waals interaction. $E_{vdw} = C_v \cdot f_{vdw}$, where $f_{vdw}$ is the van der Waals potential, summed over all atom pairs with $r_{ij} < 8$ Å (Å = angstrom = 1 ten-billionth of a meter); $r_{ij}$ = distance between atoms i and j; $R_i$ = van der Waals radius of atom i.
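The slide's exact expression for the van der Waals potential is not in the transcript, so the sketch below swaps in a 12-6 Lennard-Jones-style term with contact distance R_i + R_j purely as an assumption. The 8 Å cutoff and the form E_total = E_HH + C_v · f_vdw come from the slide; the function names, the default constants and the way E_HH is passed in are hypothetical.

```python
import math
from itertools import combinations

CUTOFF = 8.0  # angstroms; the slide sums only over pairs with r_ij < 8 A


def f_vdw(coords, radii, cutoff=CUTOFF):
    """Sum an ASSUMED 12-6 Lennard-Jones-style term over atom pairs in range."""
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        r_ij = math.dist(coords[i], coords[j])  # distance between atoms i and j
        if r_ij < cutoff:
            sigma = radii[i] + radii[j]  # assumed contact distance R_i + R_j
            ratio6 = (sigma / r_ij) ** 6
            total += ratio6 * ratio6 - 2.0 * ratio6  # minimum of -1 at r_ij == sigma
    return total


def e_total(coords, radii, e_hh=0.0, c_v=1.0):
    """E_total = E_HH + C_v * f_vdw; E_HH is supplied by the caller because
    its functional form is not given in the transcript."""
    return e_hh + c_v * f_vdw(coords, radii)


# Two atoms exactly at their assumed contact distance.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
radii = [1.0, 1.0]
print(e_total(coords, radii, e_hh=-3.0, c_v=2.0))
```

Any other pair potential could be dropped into `f_vdw` without changing the cutoff logic.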

  12. Different approaches to PSP: Ab Initio methods and Knowledge Based methods

  13. Ab Initio Methods

  14. What is Ab Initio? Ab Initio means from 1st principles. • Use thermodynamic laws to figure out the configuration of the fold of the given protein (the protein folding problem) • Global/semi-global minimization of the potential energy function • 1D → 2D = secondary structure problem • 2D → 3D = side-chain conformation

  15. Some Ab Initio Methods. Molecular Dynamics Simulation: using complex energy functions, simulate folding of the primary sequence until it reaches its native state (1D → 3D). Genetic Algorithm: used in refining a given potential function so that it can best predict the native state of a protein. Simulated Annealing. Branch and Bound Methods (usually used in side-chain conformation). Approximation algorithms.

  16. Knowledge Based Methods

  17. Knowledge Based Methods. Using knowledge of currently known protein folds, predict the shape of the target protein. The assumption is that the native fold of the target protein is similar to a currently known one, i.e. in the same family. These methods are unable to predict any novel folds, i.e. a new fold family.

  18. Some Knowledge-Based Methods: Comparative/Homologue Modeling, Threading, Docking

  19. Methodological Framework for solving PSP. [Flow diagram: Primary Protein Sequence → Ab Initio Methods, or Homologue Modeling / Threading using a knowledge base (e.g. PDB) → Predicted 3D Structure of Protein]

  20. Side-Chain Prediction. Find a conformation of all the side chains along the given main chain of a protein. Usually done as the 2nd step in predicting the 3D structure of a protein. Also useful in drug design, where drug structures have to be designed so that they are easily docked by the enzymes that break them down.

  21. Side-Chain Prediction. The main chain fold has been computed and is given as input; choose the positions of all side chains so as to minimize some potential energy function. Solved Ab Initio, the problem has been proven NP-Complete (by reducing Clique to it).

  22. Central Dogma. The more tightly packed side-chains are, the more stable they will be. Ponder & Richards have shown that there is a fixed set of rotations (rotamers) that side-chains can take. Most methods now make use of this library of rotamers (about 67 different rotations). The main concern is the search strategy used to find the best conformation.

  23. Methods in side-chain prediction: Simulated Annealing, A* algorithm, Monte Carlo Minimization, Molecular Dynamics Simulation, Dead End Elimination, Genetic Algorithm
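As one illustration of these search strategies, here is a minimal simulated annealing sketch over rotamer choices. The energy tables, the geometric cooling schedule, the step count and every name in the code are assumptions made for the example; a real rotamer library and energy function would replace the toy numbers.

```python
import math
import random


def total_energy(choice, self_e, pair_e):
    """Self energies plus pairwise energies for one rotamer per position."""
    energy = sum(self_e[i][r] for i, r in enumerate(choice))
    for (i, j), table in pair_e.items():
        energy += table[choice[i]][choice[j]]
    return energy


def anneal(self_e, pair_e, steps=2000, t_start=5.0, t_end=0.05, seed=0):
    """Metropolis-style annealing; returns the best assignment seen and its energy."""
    rng = random.Random(seed)
    current = [rng.randrange(len(rotamers)) for rotamers in self_e]
    current_e = total_energy(current, self_e, pair_e)
    best, best_e = list(current), current_e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(self_e))
        candidate = list(current)
        candidate[pos] = rng.randrange(len(self_e[pos]))
        delta = total_energy(candidate, self_e, pair_e) - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_e = candidate, current_e + delta
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, total_energy(best, self_e, pair_e)


# Toy tables (hypothetical): 3 positions, 2 rotamers each.
SELF_E = [[1.0, 0.5], [0.25, 0.75], [0.5, 0.0]]
PAIR_E = {
    (0, 1): [[0.0, 2.0], [1.5, -1.0]],
    (0, 2): [[0.5, 0.0], [0.0, 0.5]],
    (1, 2): [[1.0, -0.75], [0.25, 0.25]],
}
best, best_e = anneal(SELF_E, PAIR_E)
print(best, best_e)
```

Unlike Dead End Elimination, this kind of stochastic search gives no guarantee that the returned conformation is the global minimum.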

  24. Dead End Elimination. A deterministic method to determine the global minimum energy conformation (GMEC) of a set of side-chains. Continuously eliminate rotamers from consideration in the GMEC until only 1 rotamer is left at each side-chain position (thus giving the final conformation). DEE can be viewed as a mathematical criterion that a rotamer must fulfill in order not to be eliminated.

  25. Dead End Elimination. The potential function is described in terms of pairwise interactions of all rotamers at all positions. Therefore the energy function to be minimized can be formulated as $E_{total} = E_{template} + \sum_i E(i_r) + \sum_i \sum_{j<i} E(i_r, j_u)$, where $E_{template}$ is the energy of the given backbone fold, $\sum_i E(i_r)$ is the sum of the energies of rotamer r at side chain i, and the double sum is the sum of the pairwise interaction energies between rotamer r at position i and rotamer u at position j.
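The pairwise energy function described on this slide can be sketched as a small lookup-and-sum routine. The table layout (`self_e[i][r]` for the energy of rotamer r at position i, `pair_e[(i, j)][r][u]` for the pairwise term) and the toy numbers are assumptions chosen for the example.

```python
def total_energy(assignment, self_e, pair_e, e_template=0.0):
    """Backbone (template) energy + rotamer self energies + pairwise energies.

    assignment[i] is the rotamer index chosen at side-chain position i.
    """
    energy = e_template
    for i, r in enumerate(assignment):
        energy += self_e[i][r]
    # Each unordered pair of positions is stored once, so it is counted once.
    for (i, j), table in pair_e.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


# Toy tables (hypothetical): 2 positions, 2 rotamers each.
SELF_E = [[1.0, 0.5], [0.25, 0.75]]
PAIR_E = {(0, 1): [[0.0, 2.0], [1.5, -1.0]]}
print(total_energy([1, 1], SELF_E, PAIR_E, e_template=10.0))  # 10.25
```

Because the template energy is the same for every assignment, it does not affect which conformation is the minimum and is often left out of the search.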

  26. Dead End Elimination. Assuming p side-chains and n rotamers for each side-chain, finding the configuration that minimizes the energy function by exhaustive enumeration takes time $O(p \cdot n^p)$. It is not feasible to use the original formulation directly.
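A brute-force search makes the exponential cost concrete: it visits every one of the n^p assignments. This is a sketch on hypothetical toy tables, using the same assumed layout of self and pairwise energies as a list and a dictionary.

```python
from itertools import product


def total_energy(assignment, self_e, pair_e):
    """Rotamer self energies plus pairwise interaction energies."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pair_e.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def brute_force_gmec(self_e, pair_e):
    """Enumerate every assignment and keep the one with the lowest energy."""
    best, best_e = None, float("inf")
    for assignment in product(*(range(len(rotamers)) for rotamers in self_e)):
        energy = total_energy(assignment, self_e, pair_e)
        if energy < best_e:
            best, best_e = assignment, energy
    return best, best_e


# Toy tables (hypothetical): p = 3 positions, n = 2 rotamers, so 2**3 = 8 assignments.
SELF_E = [[1.0, 0.5], [0.25, 0.75], [0.5, 0.0]]
PAIR_E = {
    (0, 1): [[0.0, 2.0], [1.5, -1.0]],
    (0, 2): [[0.5, 0.0], [0.0, 0.5]],
    (1, 2): [[1.0, -0.75], [0.25, 0.25]],
}
print(brute_force_gmec(SELF_E, PAIR_E))  # ((0, 0, 1), 0.5)
```

With about 67 rotamers and even a few dozen positions, the same loop would never finish, which motivates the elimination criteria on the following slides.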

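  27. Original DEE. A rotamer $i_r$ can be eliminated from consideration if there is an alternative rotamer $i_t$ at the same position that satisfies $E(i_r) + \sum_{j \ne i} \min_u E(i_r, j_u) > E(i_t) + \sum_{j \ne i} \max_u E(i_t, j_u)$, where $E(i_r)$ is the energy resulting from using rotamer r at position i, $\min_u E(i_r, j_u)$ is the minimum pairwise interaction energy between $i_r$ and every other side-chain j, $E(i_t)$ is the energy resulting from using rotamer t at position i, and $\max_u E(i_t, j_u)$ is the maximum pairwise interaction energy between $i_t$ and every other side-chain j.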
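The criterion on this slide can be translated into code fairly directly. The sketch below is a naive version on hypothetical toy tables: it compares the best case for rotamer r with the worst case for rotamer t and repeats the sweep until nothing more can be removed. All function names and the `alive` bookkeeping are assumptions for the example, and no attempt is made to reach the best possible running time.

```python
def pair(pair_e, i, r, j, u):
    """Pairwise energy E(i_r, j_u), looking the table up in either key order."""
    if (i, j) in pair_e:
        return pair_e[(i, j)][r][u]
    return pair_e[(j, i)][u][r]


def original_dee_eliminates(self_e, pair_e, alive, i, r, t):
    """True if rotamer r at position i is a dead end relative to rotamer t."""
    others = [j for j in range(len(self_e)) if j != i]
    best_case_r = self_e[i][r] + sum(
        min(pair(pair_e, i, r, j, u) for u in alive[j]) for j in others
    )
    worst_case_t = self_e[i][t] + sum(
        max(pair(pair_e, i, t, j, u) for u in alive[j]) for j in others
    )
    return best_case_r > worst_case_t


def run_original_dee(self_e, pair_e):
    """Apply the criterion repeatedly until no further rotamer can be removed."""
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(self_e)):
            for r in sorted(alive[i]):
                rivals = [t for t in alive[i] if t != r]
                if any(
                    original_dee_eliminates(self_e, pair_e, alive, i, r, t)
                    for t in rivals
                ):
                    alive[i].discard(r)
                    changed = True
    return alive


# Toy tables (hypothetical): 2 positions, 2 rotamers each.
SELF_E = [[0.0, 5.0], [0.0, 1.0]]
PAIR_E = {(0, 1): [[1.0, 2.0], [0.5, 1.5]]}
print(run_original_dee(SELF_E, PAIR_E))  # [{0}, {0}]
```

On this toy input a single rotamer survives at each position, so the survivors are the GMEC; in general the criterion may stall with several rotamers left at some positions.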

  28. Original DEE. Given some relevant energy landscape, the previous inequality in fact says the following: rotamer r at side-chain i can be eliminated only if the maximum energy of a conformation using rotamer t at side-chain i is smaller than the minimum energy of a conformation using rotamer r.

  29. Original DEE. A simplistic implementation of Original DEE, simply translating the inequality into code, results in a time complexity of $O(n^2 \cdot p^2)$. There is, however, still a problem in the situation pictured on the slide.

  30. Simple Goldstein DEE. A rotamer $i_r$ can be eliminated from consideration if there is an alternative rotamer $i_t$ at the same position that satisfies $E(i_r) - E(i_t) + \sum_{j \ne i} \min_u [E(i_r, j_u) - E(i_t, j_u)] > 0$, where $E(i_r)$ is the energy resulting from using rotamer r at position i, $E(i_t)$ is the energy resulting from using rotamer t at position i, and the summed term is the difference in energy between the conformation using $i_r$ and the conformation using $i_t$ at the point of closest contact.
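A sketch of the Goldstein criterion on hypothetical toy tables follows. Instead of comparing a best case with a worst case, it takes the minimum over the neighbour's rotamers of the difference between the two energy profiles; the names and table layout are assumptions for the example.

```python
def pair(pair_e, i, r, j, u):
    """Pairwise energy E(i_r, j_u), looking the table up in either key order."""
    if (i, j) in pair_e:
        return pair_e[(i, j)][r][u]
    return pair_e[(j, i)][u][r]


def goldstein_eliminates(self_e, pair_e, i, r, t):
    """True if rotamer r at position i can be eliminated in favour of rotamer t."""
    gap = self_e[i][r] - self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        # Smallest energy difference between the r and t profiles at position j.
        gap += min(
            pair(pair_e, i, r, j, u) - pair(pair_e, i, t, j, u)
            for u in range(len(self_e[j]))
        )
    return gap > 0


# Toy tables (hypothetical): rotamer 1 at position 0 is always 2.0 worse than rotamer 0.
SELF_E = [[0.0, 1.0], [0.0, 0.0]]
PAIR_E = {(0, 1): [[0.0, 4.0], [1.0, 5.0]]}
print(goldstein_eliminates(SELF_E, PAIR_E, 0, 1, 0))  # True
print(goldstein_eliminates(SELF_E, PAIR_E, 0, 0, 1))  # False
```

For these numbers the original criterion would not fire: the best case for rotamer 1 is 1.0 + 1.0 = 2.0, which is not greater than the worst case for rotamer 0, 0.0 + 4.0 = 4.0, even though rotamer 0 is better in every conformation.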

  31. Simple Goldstein DEE. Given some relevant energy landscape, the previous inequality in fact says the following: rotamer r at side-chain i can be eliminated by both rotamer t1 and rotamer t2, since the difference is positive at the points of closest contact. This means that any given conformation using t1 or t2 will have a smaller overall energy than the same conformation using r.

  32. Simple Goldstein DEE. A simplistic implementation of Simple Goldstein DEE, simply translating the inequality into code, results in a time complexity of $O(n^3 \cdot p^2)$. There is, however, still a problem if the energy profiles of rotamer r and every other rotamer intersect. More powerful criteria then have to be used: General Goldstein DEE, Simple Split DEE and General Split DEE. The more powerful the criterion, the higher its time complexity.

  33. Conclusion. There is a myriad of methods that attempt to solve the protein structure prediction problem. Knowledge-based methods have gained an edge over Ab Initio methods. However, there has not been much improvement in the prediction power of modern heuristics since the 1st experiment by Anfinsen 3 decades ago. Either the problem is too hard, or more discovery awaits the adventurous researcher.
