Alignment of Flexible Molecular Structures

Presentation Transcript

  1. Alignment of Flexible Molecular Structures

  2. Motivation • Proteins are flexible. One would like to align proteins modulo their flexibility. • Hinge and shear protein domain motions (Gerstein, Lesk, Chothia). • Conformational flexibility in drugs.

  3. Motivation

  4. Flexible protein alignment without prior hinge knowledge. The FlexProt algorithm: • automatically detects flexible regions • exploits amino acid sequence order

  5. Examples

  6. Experimental Results

  7. Task: find the largest flexible alignment by decomposing the two molecules into a minimal number of rigid fragment pairs that have similar 3-D structure.

  8. FlexProt Main Steps • Detection of Congruent Rigid Fragment Pairs • Joining Rigid Fragment Pairs • Rigid Structural Comparison • Clustering (removing ins/dels)

  9. Congruent Rigid Fragment Pair [Figure: structural similarity matrix]

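  10. Detection of Congruent Rigid Fragment Pairs • A fragment pair of length $l$ starting at position $k$ in one molecule and position $t$ in the other: $\mathrm{Frag}_{k,t}(l) = (v_k \dots v_i \dots v_{k+l-1},\ w_t \dots w_j \dots w_{t+l-1})$. • The pair is congruent if $\mathrm{RMSD}(\mathrm{Frag}_{k,t}(l)) < \varepsilon$. [Figure: a matched pair $(v_i, w_j)$ with its neighbours $(v_{i-1}, w_{j-1})$ and $(v_{i+1}, w_{j+1})$]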
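
The detection step can be sketched roughly as follows. This is not FlexProt's code: to stay within the standard library it swaps the superposition RMSD for a distance RMSD (dRMSD), which compares intra-fragment distances and, unlike a true superposition, cannot tell mirror images apart. The names `drmsd`, `congruent_fragment_pairs`, and `min_len` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def drmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Distance RMSD: root mean square difference of intra-fragment distances.
    Assumption: used here as a superposition-free stand-in for RMSD."""
    sq_sum, count = 0.0, 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            sq_sum += diff * diff
            count += 1
    return math.sqrt(sq_sum / count) if count else 0.0

def congruent_fragment_pairs(v: Sequence[Point], w: Sequence[Point],
                             eps: float, min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Return (k, t, l) triples: v[k:k+l] and w[t:t+l] score below eps,
    and l cannot be extended to the right without reaching eps."""
    pairs = []
    for k in range(len(v) - min_len + 1):
        for t in range(len(w) - min_len + 1):
            l = min_len
            if drmsd(v[k:k + l], w[t:t + l]) >= eps:
                continue  # even the shortest seed is not congruent
            # Grow the fragment pair along the diagonal while it stays congruent.
            while (k + l < len(v) and t + l < len(w)
                   and drmsd(v[k:k + l + 1], w[t:t + l + 1]) < eps):
                l += 1
            pairs.append((k, t, l))
    return pairs
```

The sketch reports one maximal extension per start position and makes no attempt to discard fragment pairs that are contained in longer ones.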

  12. How to Join Rigid Fragment Pairs?

  13. Graph Representation [Figure labels: graph node, graph edge]

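  14. Graph Representation • The fragments are in ascending order. • The gaps (ins/dels) are limited. • Some overlapping is allowed. • Weight of the edge from node a to node b: $W(a, b) = \mathrm{size}(b) - \mathrm{penalty}_{\mathrm{gaps}}(a, b) - \mathrm{penalty}_{\mathrm{overlap}}(a, b)$, where $\mathrm{size}(b)$ is the size of the rigid fragment pair at node b and the gaps are ins/dels.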
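
One possible reading of these edge rules, as a minimal sketch. The slide gives only the sign of each term, so the linear penalties, the limits `max_gap` and `max_overlap`, and the names `FragPair` and `edge_weight` are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

class FragPair(NamedTuple):
    k: int  # start index in the first molecule
    t: int  # start index in the second molecule
    l: int  # number of matched residues (size of the node)

def edge_weight(a: FragPair, b: FragPair,
                max_gap: int = 30, max_overlap: int = 2,
                gap_cost: float = 0.1, overlap_cost: float = 1.0) -> Optional[float]:
    """Weight of the edge a -> b, or None when the edge is not allowed.
    All thresholds and costs are hypothetical defaults."""
    # Ascending order: b must start after a in both molecules.
    if b.k <= a.k or b.t <= a.t:
        return None
    # Positive distance = unmatched residues (gap); negative = overlap.
    d1 = b.k - (a.k + a.l)
    d2 = b.t - (a.t + a.l)
    gap = max(d1, 0) + max(d2, 0)
    overlap = max(-d1, -d2, 0)
    if gap > max_gap or overlap > max_overlap:
        return None
    # + size of node b, - gap penalty, - overlap penalty
    return b.l - gap_cost * gap - overlap_cost * overlap
```

Returning `None` for forbidden edges keeps the "limited gaps" and "some overlapping" rules separate from the scoring itself.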

  15. Graph Representation • DAG (directed acyclic graph) [Figure: graph with edge weights W_k, W_m, W_n, W_t, W_i]

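  16. Graph Representation • The best chain of fragment pairs is found with “single-source shortest paths” on the DAG. • Running time: O(|E|+|V|). [Figure: graph with edge weights W_k, W_m, W_n, W_t, W_i]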
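
A minimal sketch of the linear-time path search, assuming the nodes are numbered in topological order (for example, fragment pairs sorted by start position) and that each edge weight already includes the size of its target node. It maximizes the score directly, which is the same as running DAG shortest paths on negated weights. The function name `best_path` and the input layout are hypothetical.

```python
from typing import Dict, List, Tuple

def best_path(node_size: List[float],
              edges: Dict[int, List[Tuple[int, float]]]) -> Tuple[float, List[int]]:
    """Highest-scoring path in a DAG whose nodes 0..n-1 are in topological order.
    edges[a] lists (b, weight) with a < b; a path may start at any node."""
    n = len(node_size)
    if n == 0:
        return 0.0, []
    score = list(node_size)   # starting a path at node i scores size(i)
    prev = [-1] * n
    for a in range(n):        # each node and edge is visited once: O(|V| + |E|)
        for b, w in edges.get(a, []):
            if score[a] + w > score[b]:
                score[b] = score[a] + w
                prev[b] = a
    # Walk back from the best end node to recover the chain.
    end = max(range(n), key=score.__getitem__)
    path, node = [], end
    while node != -1:
        path.append(node)
        node = prev[node]
    return score[end], path[::-1]
```

Because nodes are processed in topological order, `score[a]` is final by the time the outgoing edges of `a` are relaxed, so no priority queue is needed.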

  18. Clustering (removing ins/dels) • If joining two fragment pairs, with transformations T1 and T2, gives a small RMSD (T1 ≈ T2), put them into one cluster.
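
The clustering rule could look roughly like this. It is a sketch under two assumptions: a distance RMSD over the joined fragments stands in for the superposition RMSD (a low joint value means one rigid motion fits both pairs), and clusters are formed by simple single-linkage union-find. The function names and the `(k, t, l)` tuple layout are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def drmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Distance RMSD over all point pairs (assumed stand-in for RMSD)."""
    sq_sum, count = 0.0, 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            sq_sum += diff * diff
            count += 1
    return math.sqrt(sq_sum / count) if count else 0.0

def cluster_fragment_pairs(v: List[Point], w: List[Point],
                           pairs: List[Tuple[int, int, int]],
                           eps: float) -> List[List[int]]:
    """Group fragment pairs (k, t, l) whose joined residues still score below eps."""
    parent = list(range(len(pairs)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(len(pairs)), 2):
        (k1, t1, l1), (k2, t2, l2) = pairs[i], pairs[j]
        joined_v = v[k1:k1 + l1] + v[k2:k2 + l2]
        joined_w = w[t1:t1 + l1] + w[t2:t2 + l2]
        if drmsd(joined_v, joined_w) < eps:
            parent[find(i)] = find(j)      # same rigid motion: merge clusters
    clusters = {}
    for i in range(len(pairs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Two pairs separated only by an insertion or deletion end up in one cluster, while a pair on the far side of a hinge stays apart.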

  20. Rigid Structural Comparison

  21. Multiple Structural Alignment

  22. Multiple Structural Alignment Schemes • Linear progressive. Starts with one object and successively compares each of the other objects to the result so far. • Tree progressive. The alignment is created according to a similarity tree. The alignment direction is from the leaves to the tree root. • Gerstein and Levitt 1998. • Orengo and Taylor 1994 (SSAPm method). • Sali and Blundell 1990. • Russell and Barton 1992. • Ding et al. 1994.

  23. Multiple Structural Alignment Schemes • Pivot. Uses one object as the pivot and compares it to all other objects. The results are then analyzed to find the common similarities. • Leibowitz, Fligelman, Nussinov, and Wolfson 1999. Geometric Hashing technique. • Escalier, Pothier, Soldano, Viari 1998. Exploits all common substructures.

  24. Multiple Structural Alignment Schemes • Optimization Techniques. • Guda, Scheeff, Bourne, Shindyalov. Monte Carlo optimization.

  25. Previous Work – Multiple Structural Alignment • Disadvantages: • Most methods do not detect partial solutions. • The methods which detect partial solutions are not efficient for a large number of molecules.

  26. Partial Solutions • Detection of local similarities. • Detection of a subset of molecules that share some local structural pattern. [Figure: molecules containing substructures A and B; one arrangement of A and B is harder to detect than the other]

  27. Largest Common Point Set (LCP) • Given two point sets, detect the largest common subset [exact congruence or ε-congruence]. • Multiple-LCP is NP-hard even in one-dimensional space for the case of exact congruence (Akutsu 2000). • 3-D + ε-congruence is a more complex problem.
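
To make the definition concrete, here is a small sketch of the easy case only: two point sets on a line, exact congruence, translations only (reflections are ignored). It is an illustration, not the method from the slides, and the function name `lcp_1d` is hypothetical. The multiple-set problem is the NP-hard one.

```python
from collections import Counter
from typing import List, Tuple

def lcp_1d(a: List[int], b: List[int]) -> Tuple[int, List[int]]:
    """Return (shift, matched points of a) maximizing |{x in a : x + shift in b}|."""
    a_set, b_set = set(a), set(b)
    # Any useful shift maps some point of a onto some point of b,
    # so it is enough to count how often each difference y - x occurs.
    votes = Counter(y - x for x in a_set for y in b_set)
    if not votes:
        return 0, []
    # Most votes wins; ties go to the shift with the smallest absolute value.
    shift = max(votes, key=lambda s: (votes[s], -abs(s)))
    return shift, sorted(x for x in a_set if x + shift in b_set)
```

For two sets of sizes n and m this tries at most n·m candidate shifts, which is why the two-set exact case is cheap while the multiple-set case is not.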

  28. Solution Space • The number of solutions that meet the minimal criteria could be exponential. [Figure: three molecules with helices α-1 α-2 α-3, α-1 α-2, and α-1 α-2 α-3, giving 3·2·3 combinations; figure label: kM]

  29. Partial Multiple-LCP • Detect the t largest alignments between exactly k molecules. • We are interested in these solutions for each k, 2 ≤ k ≤ m.
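
The shape of the requested output can be sketched as below, assuming candidate alignments have already been found and each one is given as a collection of molecule identifiers plus a size. The function name `top_alignments` and the input layout are hypothetical.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

def top_alignments(candidates: Iterable[Tuple[Sequence[str], int]],
                   t: int) -> Dict[int, List[Tuple[int, Tuple[str, ...]]]]:
    """For each k >= 2, keep the t largest alignments between exactly k molecules.
    candidates: (molecule ids, alignment size) pairs."""
    by_k = defaultdict(list)
    for molecules, size in candidates:
        ids = tuple(sorted(set(molecules)))
        by_k[len(ids)].append((size, ids))
    # heapq.nlargest returns each group sorted from largest to smallest.
    return {k: heapq.nlargest(t, items)
            for k, items in sorted(by_k.items()) if k >= 2}
```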

  30. MultiProt /home/silly6/mol/demos/MultiProt/ • Non-predefined pattern detection. • Partial solutions. • Time efficient: • 5 proteins in 14 seconds • 20 proteins (~500 a.a.) in 10 minutes • 50 proteins (~200 a.a.) in 19 minutes • [Pentium II 500 MHz, 512 MB memory]

  31. [Figure: α-helices α-1, α-2, α-3 matched across several molecules]

  32. Algorithm Features • Assumption: any multiple alignment of proteins should align at least short contiguous fragments (minimum 3 points) of the input points. • Reduction of solution space: the aligned contiguous fragments are of maximal length. • All (almost, because of ε-congruence) possible solutions (transformations) are detected (optimal solutions are ‘hard’ to select).

  33. Multiple Alignment with Pivot • Input: pivot molecule $M_p$ (participates in all solutions); set of molecules $S' = S \setminus \{M_p\}$; error threshold ε. • Detect all possibly aligned fragments of maximal length between the input molecules (a chance to detect subtle similarities). • Select solutions that give high-scoring global structural similarity. • Iterate over all possible pivots, $M_p = M_1, \dots, M_m$.
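
The outer pivot loop can be sketched as follows. This is a heavy simplification and not MultiProt itself: it takes a pluggable pairwise aligner, keeps a single match per molecule pair, and only scores the core shared by all molecules. `pivot_multiple_alignment`, `align_pair`, and `toy_align_pair` are hypothetical names, and the toy aligner works on strings of labels rather than 3-D coordinates.

```python
from typing import Callable, Dict, Optional, Sequence, Set, Tuple

def pivot_multiple_alignment(
        molecules: Dict[str, Sequence],
        align_pair: Callable[[Sequence, Sequence], Set[int]]) -> Tuple[str, Set[int]]:
    """Try every molecule as the pivot; return (pivot name, pivot positions
    matched in all other molecules) for the pivot with the largest core."""
    best: Tuple[str, Set[int]] = ("", set())
    for pivot_name, pivot in molecules.items():
        core: Optional[Set[int]] = None
        for name, other in molecules.items():
            if name == pivot_name:
                continue
            matched = align_pair(pivot, other)   # pivot positions matched in `other`
            core = matched if core is None else core & matched
        core = core or set()
        if len(core) > len(best[1]):             # first pivot wins ties
            best = (pivot_name, core)
    return best

def toy_align_pair(pivot: Sequence, other: Sequence) -> Set[int]:
    """Toy stand-in for a structural aligner: a pivot position matches
    when its label also occurs in the other molecule."""
    return {i for i, label in enumerate(pivot) if label in other}
```

With `{"M1": "ABCDE", "M2": "XBCDY", "M3": "BCDEF"}` and the toy aligner, the shared core is the `B C D` stretch, reported in the coordinates of whichever pivot is tried first.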

  34. Bio-Core Detection • Geom. + Bio. constraints • Classification: • hydrophobic (Ala, Val, Ile, Leu, Met, Cys) • polar/charged (Ser, Thr, Pro, Asn, Gln, Lys, Arg, His, Asp, Glu) • aromatic (Phe, Tyr, Trp) • glycine (Gly) • Or any other scoring matrix!
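
The classification above maps directly onto a lookup table. A minimal sketch, assuming the biological constraint is simply "both residues fall in the same class"; the names `RESIDUE_CLASS` and `bio_compatible` are hypothetical.

```python
# Four residue classes exactly as listed on the slide (three-letter codes).
_CLASSES = {
    "hydrophobic": "Ala Val Ile Leu Met Cys",
    "polar/charged": "Ser Thr Pro Asn Gln Lys Arg His Asp Glu",
    "aromatic": "Phe Tyr Trp",
    "glycine": "Gly",
}

# Invert to residue -> class for constant-time lookup.
RESIDUE_CLASS = {res: cls for cls, names in _CLASSES.items() for res in names.split()}

def bio_compatible(res_a: str, res_b: str) -> bool:
    """Assumed constraint: two residues may be matched only within one class."""
    return RESIDUE_CLASS[res_a] == RESIDUE_CLASS[res_b]
```

Swapping in "any other scoring matrix", as the slide suggests, would mean replacing the equality test with a score lookup and a threshold.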

  35. Experimental Results

  36. Superhelix, 5 molecules.

  37. Concanavalin, 6 molecules.

  38. Partial Solution Detection • Proteins: 1adj, 1hc7, 1qf6, 1ati. • Task: detect domains A and B. [Figure: the four proteins with shared domains A and B; other regions labelled x, y, z]

  39. Domain A ranked first (142 matched atoms) • Domain B ranked eighth (85 matched atoms)

  40. 4 proteins aligned based on detected domain A

  41. Multiple Alignment of domain A

  42. Multiple Alignment of domain A (enlarged)

  43. 4 proteins aligned based on domain B

  44. Multiple Alignment of domain B

  45. Multiple Alignment of domain B (enlarged)

  46. Application to G proteins [Figure labels: A, B]

  47. Substrate assisted catalysis – application to G proteins. Mickey Kosloff and Zvi Selinger, TRENDS in Biochemical Sciences, Vol. 26, No. 3, March 2001, p. 161.

  48. Aspects of Structural Comparison • A large number of structures (hundreds): Molecular Dynamics. • Structural flexibility: proteins are not rigid structures. • Structure representation: • C-alpha atoms are suitable for comparisons of folds. • Detection of similar function requires a different representation. This brings another problem: side chain flexibility. • Sequence order in structural alignment. • Detection of active sites might require a different approach. Proteins with different folds might provide the same function. • Statistical significance. • Measure of geometrical similarity (RMSD, bottleneck, …), biological scoring function.
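
For reference, the two geometric measures named on the last slide are usually written as follows for a fixed matching. The notation is an assumption, not taken from the slides: $(a_i, b_i)$, $i = 1, \dots, n$, are the matched point pairs and $T$ is the rigid transformation applied to the first molecule.

```latex
\mathrm{RMSD}(T) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \lVert T(a_i) - b_i \rVert^{2}},
\qquad
d_{\mathrm{bottleneck}}(T) = \max_{1 \le i \le n} \lVert T(a_i) - b_i \rVert
```

RMSD averages the deviation over all matched pairs, while the bottleneck measure is set by the single worst pair, so it is the stricter of the two for the same threshold.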