
Algorithms Exploiting the Chain Structure of Proteins



Presentation Transcript


  1. Algorithms Exploiting the Chain Structure of Proteins Itay Lotan Computer Science

  2. Proteins 101 Involved in all functions of our body: metabolism, motion, defense, etc. Michael Levitt

  3. Protein representation • Torsion angle model: • Cα model:

  4. Structure determination X-ray crystallography Bernhard Rupp

  5. Outline • Fast energy computation during Monte Carlo simulation • Model completion for protein X-ray crystallography • Large scale computation of similarity Exploit specific properties of proteins to perform the computation efficiently

  6. Outline • Fast energy computation during Monte Carlo simulation • Model completion for protein X-ray crystallography • Large scale computation of similarity Lotan, Schwarzer, Halperin* and Latombe. J. Comput. Biol. 2004 (to appear) *CS Department, Tel-Aviv University

  7. Monte Carlo simulation (MCS) • Estimate thermodynamic quantities • Search for low-energy conformations and the folded structure Popular method for sampling the conformation space of proteins:

  8. MCS: How it works • Propose random change in conformation • Compute energy E of new conformation • Accept with probability: Requires >>10^6 steps to sample adequately
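
The acceptance formula on this slide did not survive as text. A minimal sketch of the three steps, assuming the standard Metropolis criterion (accept with probability min(1, exp(-ΔE / kT))); the names `metropolis_accept`, `run_mcs`, `energy`, `propose` and `kT` are illustrative and not taken from the presentation:

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Assumed Metropolis rule: always accept downhill moves, and accept
    uphill moves with probability exp(-(e_new - e_old) / kT)."""
    delta = e_new - e_old
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / kT)


def run_mcs(energy, propose, x0, kT, steps, seed=0):
    """Run `steps` Monte Carlo steps from conformation x0.

    energy(x)       -> float, energy of conformation x (hypothetical callback)
    propose(x, rng) -> new conformation obtained by a random change
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    for _ in range(steps):
        x_new = propose(x, rng)      # 1. propose a random change
        e_new = energy(x_new)        # 2. compute the energy of the new conformation
        if metropolis_accept(e, e_new, kT, rng):  # 3. accept or reject
            x, e = x_new, e_new
    return x, e
```

The energy evaluation in step 2 runs once per step, which is why its cost dominates a simulation of millions of steps.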

  9. Energy function • Bonded terms: • Bond lengths: • Bond angles: • Dihedral angles: • Non-bonded terms: • Van der Waals: • Electrostatic: • Heuristic: Go models, HP models, etc.

  10. Pair-wise interactions • Cutoff distance (6 - 12Å) • Linear number of interactions contribute to energy (Halperin & Overmars ’98) Challenge: Find all interacting pairs without enumerating all pairs

  11. Related work Biology • Neighbor lists • Verlet ’67 • Brooks et al. ’83 • Grid • Quentrec & Brot ’73 • Hockney et al. ’74 • Van Gunsteren et al. ’84 • Neighbor lists + grid • Yip & Elber ’89 • Petrella ’02 Computer Science • Bounding volume hierarchies for collision detection • Gottschalk et al. ’96 • Larsen et al. ’00 • Guibas et al. ’02 • Space partition methods for collision detection • Faverjon ’84 • Halperin & Overmars ’98 • Collision detection for chains • Halperin et al. ’97 • Guibas et al. ’02

  12. Grid method • Linear complexity • Optimal in worst case d: cutoff distance
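
A minimal sketch of a grid (cell list) search of the kind the slide refers to, assuming cubic cells whose side equals the cutoff distance d, so that every interacting pair lies in the same or in adjacent cells. The function name `interacting_pairs` is hypothetical; this illustrates the general technique, not the implementation that was benchmarked:

```python
from collections import defaultdict
from itertools import product
import math


def interacting_pairs(points, cutoff):
    """Return all index pairs (i, j), i < j, at distance <= cutoff."""
    # Bin every point into a cubic cell of side `cutoff`.
    grid = defaultdict(list)
    for i, p in enumerate(points):
        cell = tuple(math.floor(c / cutoff) for c in p)
        grid[cell].append(i)

    pairs = []
    offsets = list(product((-1, 0, 1), repeat=3))  # the cell itself + 26 neighbors
    for cell, members in grid.items():
        for off in offsets:
            other = (cell[0] + off[0], cell[1] + off[1], cell[2] + off[2])
            # .get avoids inserting empty cells while iterating over the grid
            for i in members:
                for j in grid.get(other, ()):
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        pairs.append((i, j))
    return sorted(pairs)
```

The whole grid is rebuilt and rescanned on every call, so the cost stays linear in the number of atoms even when a step moves only a few of them.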

  13. Contributions • Efficient maintenance and self-collision detection for kinematic chains • Efficient computation of pair-wise interactions in MCS of proteins • Scheme for caching and reusing partial energy sums during MCS • MCS software* Much faster than existing algorithm (grid method) *Download at: http://robotics.stanford.edu/~itayl/mcs

  14. Properties of kinematic chains • Small changes → large effects

  15. Properties of kinematic chains • Small changes → large effects

  16. Properties of kinematic chains • Small changes → large effects • Local changes → global effects

  17. Properties of kinematic chains • Small changes → large effects • Local changes → global effects • Few DoF changes → long rigid sub-chains

  18. Properties of kinematic chains • Small changes → large effects • Local changes → global effects • Few DoF changes → long rigid sub-chains

  19. ChainTree: A tale of two hierarchies • Transform hierarchy: approximates kinematics of protein backbone at successive resolutions • Bounding volume hierarchy: approximates geometry of protein at successive resolutions

  20. Hierarchy of transforms

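  21. Hierarchy of transforms [Figure: chain with elements labeled A to I and a binary tree of transforms. Leaves: T_AB, T_BC, T_CD, T_DE, T_EF, T_FG, T_GH, T_HI; next level: T_AC, T_CE, T_EG, T_GI; then T_AE, T_EI; root: T_AI]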

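  22. Hierarchy of bounding volumes [Figure: binary tree of bounding volumes. Leaves: B_A, B_B, B_C, B_D, B_E, B_F, B_G, B_H; next level: B_AB, B_CD, B_EF, B_GH; then B_AD, B_EH; root: B_AH]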

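  23. The ChainTree [Figure: chain with elements labeled A to I and a binary tree whose nodes each pair a transform with a bounding volume. Root: (T_AI, B_AH); then (T_AE, B_AD), (T_EI, B_EH); then (T_AC, B_AB), (T_CE, B_CD), (T_EG, B_EF), (T_GI, B_GH); leaves: (T_AB, B_A), (T_BC, B_B), (T_CD, B_C), (T_DE, B_D), (T_EF, B_E), (T_FG, B_F), (T_GH, B_G), (T_HI, B_H)]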

  24. Updating the ChainTree [Figure: the ChainTree of slide 23, from root (T_AI, B_AH) down to leaves (T_AB, B_A) through (T_HI, B_H), over the chain with elements labeled A to I]
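
A minimal sketch of the transform half of such a hierarchy, under simplifying assumptions: the chain is planar, each leaf holds a 3x3 homogeneous transform between consecutive frames, and each internal node caches the product of its two children. The names `planar_transform`, `compose` and `TransformTree` are hypothetical, and the bounding volumes that the ChainTree stores next to each transform are omitted:

```python
import math

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def planar_transform(theta, length):
    """Rotate by theta, then advance `length` along the rotated x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, length * c), (s, c, length * s), (0.0, 0.0, 1.0))


def compose(a, b):
    """3x3 matrix product a * b (order matters: a is applied closer to the base)."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


class TransformTree:
    """Balanced binary tree; node k caches the product of the leaves below it."""

    def __init__(self, leaves):
        self.size = 1
        while self.size < len(leaves):
            self.size *= 2
        # Unused leaves on the right are padded with the identity.
        self.nodes = [IDENTITY] * (2 * self.size)
        for i, t in enumerate(leaves):
            self.nodes[self.size + i] = t
        for k in range(self.size - 1, 0, -1):
            self.nodes[k] = compose(self.nodes[2 * k], self.nodes[2 * k + 1])

    def update(self, i, t):
        """Replace leaf i and recompute only its ancestors.
        Returns the number of internal nodes that were recomputed."""
        k = self.size + i
        self.nodes[k] = t
        k //= 2
        touched = 0
        while k >= 1:
            self.nodes[k] = compose(self.nodes[2 * k], self.nodes[2 * k + 1])
            k //= 2
            touched += 1
        return touched

    def root(self):
        """Transform from the first frame of the chain to the last."""
        return self.nodes[1]
```

With the eight leaf transforms of the slide (T_AB through T_HI), changing one degree of freedom recomputes three internal nodes instead of all seven; every other cached product describes a sub-chain that stayed rigid and is reused as is.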

  25. Computing the energy [Figure: tree with nodes P; N, O; J, K, L, M; A to H] Recursively search ChainTree for interactions • Pruning rules: • Prune search when distance between bounding volumes is more than cutoff distance • Do not search inside rigid sub-chains
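
A minimal sketch of a recursive search that applies both pruning rules. It rests on assumptions not stated in the slides: bounding volumes are spheres, the chain is a list of atom positions, and joint j connects atoms j and j+1, so a pair (p, q) can only have changed if some changed joint j satisfies p <= j < q. The names `Node`, `moved`, `search` and `changed_interactions` are hypothetical, and the hierarchy is rebuilt from scratch here rather than updated incrementally:

```python
import math


class Node:
    """Bounding sphere over atoms lo .. hi-1 of the chain."""

    def __init__(self, points, lo, hi):
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self.left = Node(points, lo, mid)
            self.right = Node(points, mid, hi)
        pts = points[lo:hi]
        self.center = tuple(sum(c) / len(pts) for c in zip(*pts))
        self.radius = max(math.dist(self.center, p) for p in pts)


def moved(changed_joints, lo, hi):
    """True if a changed joint lies strictly inside atoms lo .. hi-1."""
    return any(lo <= j < hi - 1 for j in changed_joints)


def search(a, b, points, cutoff, changed_joints, out):
    """Collect changed interacting pairs between nodes a and b (a precedes b, or a is b)."""
    # Rule 2: the span from a.lo to b.hi-1 moved as one rigid body, so nothing inside changed.
    if not moved(changed_joints, a.lo, b.hi):
        return
    if a is b:
        if a.left is not None:
            search(a.left, a.left, points, cutoff, changed_joints, out)
            search(a.left, a.right, points, cutoff, changed_joints, out)
            search(a.right, a.right, points, cutoff, changed_joints, out)
        return
    # Rule 1: the bounding volumes are farther apart than the cutoff.
    if math.dist(a.center, b.center) - a.radius - b.radius > cutoff:
        return
    if a.left is None and b.left is None:
        if math.dist(points[a.lo], points[b.lo]) <= cutoff:
            out.append((a.lo, b.lo))
    elif b.left is None or (a.left is not None and a.radius >= b.radius):
        search(a.left, b, points, cutoff, changed_joints, out)
        search(a.right, b, points, cutoff, changed_joints, out)
    else:
        search(a, b.left, points, cutoff, changed_joints, out)
        search(a, b.right, points, cutoff, changed_joints, out)


def changed_interactions(points, cutoff, changed_joints):
    """Pairs within `cutoff` whose relative position was affected by the changed joints."""
    root = Node(points, 0, len(points))
    out = []
    search(root, root, points, cutoff, changed_joints, out)
    return sorted(out)
```

Pairs that lie entirely on one side of every changed joint are never visited, which is what allows their cached energy contributions to be reused.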

  26. Computing the energy [Figure: tree of partial energy sums under construction; entry P]

  27. Computing the energy [Figure: tree of partial energy sums under construction; entries P; N]

  28. Computing the energy [Figure: tree of partial energy sums under construction; entries P; N, O]

  29. Computing the energy [Figure: tree of partial energy sums under construction; entries P; N, N-O, O]

  30. Computing the energy [Figure: tree of partial energy sums under construction; entries P; N, N-O, O; J, J-K, K; leaf-level entries including C, D, A-C, A-D, B-C, B-D, C-D]

  31. Computing the energy [Figure: complete tree of partial energy sums; entries P; N, N-O, O; J, K, L, M and their pairs J-K, J-L, J-M, K-L, K-M, L-M; one leaf-level entry per leaf A to H and per pair of leaves]

  32. Computing the energy [Figure: the tree of partial energy sums of slide 31, with the partial sum E(O) marked]

  33. Computing the energy • Only changed interactions are found • Reuse unaffected partial sums • Better performance for • Longer proteins • Fewer simultaneous changes

  34. Computational complexity • Updating: • Searching: worst case bound Much faster in practice

  35. Test [Figure: results for 1-DoF changes and 5-DoF changes, each on proteins of 68, 144, 374, and 755 residues]

  36. Simulation of α-Synuclein • 140 res. protein implicated in Parkinson’s disease • Multi-canonical Replica-exchange MC regime • Over 1000 CPU days of simulation • Study conformations at room temp. • Joint work with Vijay Pande

  37. Outline • Fast energy computation during Monte Carlo simulation • Model completion for protein X-ray crystallography • Large scale computation of similarity Lotan, van den Bedem*, Deacon* and Latombe, WAFR 2004 van den Bedem*, Lotan, Latombe and Deacon*, submitted to Acta Cryst. D * Joint Center for Structural Genomics (JCSG) at SSRL

  38. Protein Structure Initiative • Reduce cost and time to determine protein structure 152K sequenced genes (30K/year) 25K determined structures (3.6K/year) • Develop software to automatically interpret the electron density map (EDM)

  39. EDM 3-D “image” of atomic structure • High value (electron density) at atom centers • Density falls off exponentially away from center
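
A toy sketch of such a map, assuming (this is not stated in the slides) that each atom contributes a Gaussian blob of a fixed width and that contributions add up; `density`, `atom_centers` and `width` are illustrative names:

```python
import math


def density(point, atom_centers, width=1.0):
    """Toy electron density at `point`: a sum of Gaussian blobs, one per atom.
    The value is highest at an atom center and decays quickly with distance."""
    return sum(
        math.exp(-((math.dist(point, c) / width) ** 2)) for c in atom_centers
    )
```

A fit score for a candidate fragment could then be built by comparing such a computed density against the measured map at grid points near the fragment.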

  40. Automated model building • ~90% built at high resolution (2Å) • ~66% built at medium to low resolution (2.5 – 2.8Å) • Gaps left at noisy areas in EDM (blurred density) Gaps need to be resolved manually

  41. The Fragment completion problem • Input • EDM • Partially resolved structure • 2 Anchor residues • Length of missing fragment • Output • A small number of candidate structures for missing fragment A robotics inverse kinematics (IK) problem

  42. Related work Biology/Crystallography • Exact IK solvers • Wedemeyer & Scheraga ’99 • Coutsias et al. ’04 • Optimization IK solvers • Fine et al. ’86 • Canutescu & Dunbrack Jr. ’03 • Ab-initio loop closure • Fiser et al. ’00 • Kolodny et al. ’03 • Database search loop closure • Jones & Thirup ’86 • van Vlijmen & Karplus ’97 • Semi-automatic tools • Jones & Kjeldgaard ’97 • Oldfield ’01 Computer Science • Exact IK solvers • Manocha & Canny ’94 • Manocha et al. ’95 • Optimization IK solvers • Wang & Chen ’91 • Redundant manipulators • Khatib ’87 • Burdick ’89 • Motion planning for closed loops • Han & Amato ’00 • Yakey et al. ’01 • Cortes et al. ’02, ’04

  43. Contributions • Sampling of gap-closing fragments biased by the EDM • Refinement of fit to density without breaking closure • Fully automatic fragment completion software for X-ray Crystallography Novel application of a combination of inverse kinematics techniques

  44. Two-stage IK method • Candidate generation: Optimize density fit while closing the gap • Refinement: Optimize closed fragments without breaking closure

  45. Stage 1: candidate generation • Generate random conformation • Close using Cyclic Coordinate Descent (CCD) (Wang & Chen ’91, Canutescu & Dunbrack Jr. ’03)

  46. Stage 1: candidate generation • Generate random conformation • Close using Cyclic Coordinate Descent (CCD) (Wang & Chen ’91, Canutescu & Dunbrack ’03)

  47. Stage 1: candidate generation • Generate random conformation • Close using Cyclic Coordinate Descent (CCD) (Wang & Chen ’91, Canutescu & Dunbrack ’03)

  48. Stage 1: candidate generation • Generate random conformation • Close using Cyclic Coordinate Descent (CCD) (Wang & Chen ’91, Canutescu & Dunbrack ’03)

  49. Stage 1: candidate generation • Generate random conformation • Close using Cyclic Coordinate Descent (CCD) (Wang & Chen ’91, Canutescu & Dunbrack ’03) CCD moves biased toward high-density
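
A minimal sketch of Cyclic Coordinate Descent on a simplified problem: a planar chain of revolute joints whose end point must reach a target. The protein version works on torsion angles in 3-D, closes onto an anchor residue, and biases moves toward high density; none of that is modeled here. The names `forward` and `ccd_close` are hypothetical:

```python
import math


def forward(angles, lengths, base=(0.0, 0.0)):
    """Positions of all joints of a planar chain, ending with the end point.
    angles[k] is the rotation at joint k relative to the previous link."""
    x, y = base
    heading = 0.0
    pts = [(x, y)]
    for a, length in zip(angles, lengths):
        heading += a
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        pts.append((x, y))
    return pts


def ccd_close(angles, lengths, target, tol=1e-6, max_sweeps=200):
    """Adjust one joint at a time, from the last joint to the first, so that
    the end point is rotated onto the line from that joint to the target."""
    angles = list(angles)
    for _ in range(max_sweeps):
        for k in reversed(range(len(angles))):
            pts = forward(angles, lengths)
            pivot, end = pts[k], pts[-1]
            to_end = math.atan2(end[1] - pivot[1], end[0] - pivot[0])
            to_target = math.atan2(target[1] - pivot[1], target[0] - pivot[0])
            angles[k] += to_target - to_end  # single-joint optimum for this pivot
        if math.dist(forward(angles, lengths)[-1], target) < tol:
            break
    return angles
```

Starting each run from a different random conformation, as the slide describes, yields a different closed candidate.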

  50. Stage 2: refinement • Target function T (goodness of fit to EDM) • Minimize T while retaining closure • Closed conformations lie on self-motion manifold of lower dimension 1-D manifold
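
One standard way to move along a self-motion manifold, taken from the redundant-manipulator literature and offered here only as an illustrative assumption (the slide does not give a formula), is to project the gradient of T onto the null space of the closure Jacobian. Here q is the vector of torsion angles, f(q) the position and orientation of the fragment end that must stay on the anchor, and α a small step size:

```latex
\Delta q \;=\; -\,\alpha\,\bigl(I - J^{+}J\bigr)\,\nabla_q T(q),
\qquad
J = \frac{\partial f}{\partial q},
\qquad
J^{+} = J^{\mathsf T}\bigl(J J^{\mathsf T}\bigr)^{-1}
```

Because J (I - J⁺J) = 0, such a step leaves f(q) unchanged to first order, so the fit to the density can improve without reopening the gap.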
