
Protein Structure Prediction With Evolutionary Algorithms

This presentation explores protein structure prediction using genetic algorithms (GAs). The authors analyze three algorithm parameters that affect GA performance and behavior, with the goal of making suggestions for future algorithm design. The discussion covers protein folding, the HP model, encodings for internal coordinates, potential energy formulation, and constraint management, and closes with the key results.

spridgen

Presentation Transcript


  1. Protein Structure Prediction With Evolutionary Algorithms • Natalio Krasnogor, U of the West of England • William Hart, Sandia National Laboratories • Jim Smith, U of the West of England • David Pelta, Universidad de Granada • Presenter: Elena Zheleva

  2. Introduction • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • Genetic Algorithm (GA) Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  3. Problem Description • Computational Biology open problem: protein structure prediction • Genetic algorithms have been used in the research literature • Authors analyze 3 algorithm parameters that impact performance and behavior of GAs • Goal: make suggestions for future algorithm design

  4. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  5. Protein Folding • Proteins: driving force behind all of the biochemical reactions which make biology work • Protein is an amino acid chain! • Amino acid chain -> Structure of a protein • Structure of a protein -> Function of a protein

  6. Protein Folding • Protein Folding: connection between the genome (sequence) and what the proteins actually do (their function). • Currently, no reliable computational solution for protein folding (3D structure) problem. • Chemistry, Physics, Biology, CS

  7. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  8. HP Protein Folding Model • Amino acid chains (proteins) are represented as connected beads on a 2D or 3D lattice • HP: hydrophobic – hydrophilic property • Hydrophobic amino acids can form a hydrophobic core w/ energy potential

  9. HP Protein Folding Model • Model adds energy value e to each pair of hydrophobics that are adjacent on lattice AND not consecutive in the sequence • Goal of GA: find low energy configurations!
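The energy rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 2D square lattice, e = -1, and the helper names `fold_to_coords` and `hp_energy` are hypothetical and do not come from the presentation.

```python
# Illustrative sketch: the 2D square lattice, e = -1 and all names are assumptions.
MOVES_2D = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold_to_coords(moves):
    """Turn a string of absolute moves into lattice points, starting at the origin."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES_2D[m]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords

def hp_energy(sequence, coords, e=-1):
    """Add e for each H-H pair adjacent on the lattice but not consecutive in the chain."""
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # starting at i + 2 skips chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:  # unit Manhattan distance
                    energy += e
    return energy

# Four residues folded into a unit square: the first and last residue touch.
print(hp_energy("HPPH", fold_to_coords("RUL")))
```

With a negative e, lower totals mean more hydrophobic contacts, which is what the GA is searching for.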

  10. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  11. Encodings for Internal Coordinates • Proteins are represented using internal coordinates (vs. Cartesian) • Absolute vs. Relative encoding • Absolute Encoding: specifies an absolute direction for each move; on the cubic lattice: {U,D,L,R,F,B}^(n-1) • Relative Encoding: specifies each direction relative to the previous amino acid; on the cubic lattice: {U,D,L,R,F}^(n-1)
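To make the two encodings concrete, here is a minimal Python sketch that decodes each one into cubic-lattice coordinates. The axis assignment (F = +x, L = +y, U = +z), the initial heading and up vectors, and every function name are assumptions chosen for illustration; the slides do not specify them.

```python
# Illustrative sketch: axis conventions, initial frame and names are assumptions.
ABSOLUTE_3D = {"U": (0, 0, 1), "D": (0, 0, -1), "L": (0, 1, 0),
               "R": (0, -1, 0), "F": (1, 0, 0), "B": (-1, 0, 0)}

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def neg(a):
    return tuple(-x for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def decode_absolute(moves):
    """Each symbol names a fixed lattice direction."""
    coords = [(0, 0, 0)]
    for m in moves:
        coords.append(add(coords[-1], ABSOLUTE_3D[m]))
    return coords

def decode_relative(moves, heading=(1, 0, 0), up=(0, 0, 1)):
    """Each symbol turns a moving frame (heading, up) before stepping forward."""
    coords = [(0, 0, 0)]
    for m in moves:
        if m == "L":
            heading = cross(up, heading)      # yaw left, up unchanged
        elif m == "R":
            heading = cross(heading, up)      # yaw right, up unchanged
        elif m == "U":
            heading, up = up, neg(heading)    # pitch up
        elif m == "D":
            heading, up = neg(up), heading    # pitch down
        elif m != "F":
            raise ValueError(f"unknown relative move: {m!r}")
        coords.append(add(coords[-1], heading))
    return coords

print(decode_relative("LLL"))
```

Under these conventions a relative move can never step straight back onto the previous residue, which is consistent with the relative alphabet having five symbols rather than six.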

  12. Encodings for Internal Coordinates • Encoding impacts global search behavior of GA • Example: one-point mutations • Relative Encoding: FLLFRRLRLLR -> FLLFRFLRLLR (one symbol changes) • Absolute Encoding: RULLURURULU -> RULLUULULDL (the same structural change alters six symbols)
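The two example strings on this slide appear to describe the same 2D folds in both encodings. The sketch below converts a relative string to an absolute one and compares the results. It assumes the walk starts with heading R and that U is a quarter turn counter-clockwise from R; the function name is hypothetical.

```python
# Illustrative sketch: the initial heading "R" and the function name are assumptions.
ABS_DIRS = ["R", "U", "L", "D"]  # absolute directions in counter-clockwise order

def relative_to_absolute(rel, start="R"):
    """Convert a 2D relative string over {F, L, R} into absolute moves over {U, D, L, R}."""
    heading = ABS_DIRS.index(start)
    out = []
    for m in rel:
        if m == "L":
            heading = (heading + 1) % 4   # turn left
        elif m == "R":
            heading = (heading - 1) % 4   # turn right
        elif m != "F":
            raise ValueError(f"unknown relative move: {m!r}")
        out.append(ABS_DIRS[heading])
    return "".join(out)

before = relative_to_absolute("FLLFRRLRLLR")
after = relative_to_absolute("FLLFRFLRLLR")
changed = sum(a != b for a, b in zip(before, after))
print(before, after, changed)  # RULLURURULU RULLUULULDL 6
```

A single-symbol change in the relative string rotates the whole tail of the chain, so the equivalent absolute string differs in every position from the mutation point onward.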

  13. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  14. Potential Energy Formulation • Problem: conformations can have the same energy but different potential (picture) • Augment the energy function to allow a distance-dependent hydrophobic-hydrophobic potential (formula)

  15. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  16. Constraint Management • Methods for penalizing infeasible conformations • Method 1: Consider only feasible conformations • Weakness: shortest path from one feasible conformation to another may be very long • Method 2: Fixed Penalty Approach • Violations: • 2 amino acids lying on the same lattice point • Lattice point at which there are 2 or more amino acids • Penalty per violation = 2*number of hydrophobics + 2 (any infeasible conformation has positive energy)
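The fixed penalty approach can be sketched as follows. The slide lists two kinds of violation without saying exactly how they are counted, so this code takes one possible reading as an assumption: one violation per amino acid sharing a lattice point, plus one per lattice point holding two or more amino acids. The function names are hypothetical.

```python
from collections import Counter

# Illustrative sketch: the violation-counting rule and all names are assumptions.
def count_violations(coords):
    """Count amino acids on shared lattice points plus the number of shared points."""
    occupancy = Counter(coords)
    crowded = {point: k for point, k in occupancy.items() if k >= 2}
    residues_in_collision = sum(crowded.values())
    crowded_points = len(crowded)
    return residues_in_collision + crowded_points

def fixed_penalty(sequence, coords):
    """Penalty per violation = 2 * number of hydrophobics + 2, as stated on the slide."""
    per_violation = 2 * sequence.count("H") + 2
    return count_violations(coords) * per_violation

# A five-residue walk that returns to its starting point.
closed_loop = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(fixed_penalty("HPHPH", closed_loop))
```

The penalty would be added to the conformation's energy, so a self-avoiding conformation is left unchanged while any overlap raises the energy.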

  17. Outline • Problem Description • Biology Background • Protein Folding • HP Protein Folding Model • GA Design Factors • Encodings for Internal Coordinates • Potential Energy Formulation • Constraint Management • Methods and Results • Conclusion

  18. Methods and Results • 1-point and 2-point Mutation operators • 1-point, 2-point and Uniform Crossover operators • 5 polymer sequences (< 50 amino acids) • Each run of GA: 200 generations
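For readers unfamiliar with these operators, here is a minimal sketch of one-point mutation and the three crossover types, applied to encoding strings. It uses textbook definitions, which may differ in detail from the operators in the study; the function names and the use of `random.Random` for reproducibility are assumptions. Two-point mutation is not sketched because the slides do not define it.

```python
import random

# Illustrative sketch: textbook operator definitions; names are assumptions.
def one_point_mutation(chrom, alphabet, rng):
    """Replace one randomly chosen symbol with a different symbol from the alphabet."""
    i = rng.randrange(len(chrom))
    choices = [s for s in alphabet if s != chrom[i]]
    return chrom[:i] + rng.choice(choices) + chrom[i + 1:]

def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length parents at one cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point_crossover(a, b, rng):
    """Swap the segment between two cut points (parents need length >= 3)."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform_crossover(a, b, rng):
    """Choose each position from either parent with probability 1/2."""
    picks = [rng.random() < 0.5 for _ in a]
    child1 = "".join(x if p else y for x, y, p in zip(a, b, picks))
    child2 = "".join(y if p else x for x, y, p in zip(a, b, picks))
    return child1, child2

rng = random.Random(0)
parent_a, parent_b = "FLLFRRLRLLR", "FFFFFFFFFFF"
print(one_point_mutation(parent_a, "FLR", rng))
print(uniform_crossover(parent_a, parent_b, rng))
```

Because the operators act on the encoding string, the same operator has a different structural effect under relative and absolute encodings.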

  19. Methods and Results • Relative vs. Absolute Encoding (diagram: distribution of relative ranks on the 3 lattices)

  20. Methods and Results • Standard vs. Distant Energy • Does the modified energy potential improve the search capabilities of the GA? • No significant difference on test sequences • A guess: there might be on longer sequences

  21. Conclusion • GAs applied to the Protein Structure Prediction problem have 3 important factors to consider • Relative encoding is at least as good as absolute encoding, and in some cases much better • The modified energy potential did not improve the search capabilities of the GA on the test sequences • The proposed constraint/penalty method ensures feasibility of the optimal solution

