
A computational study of protein folding pathways

A computational study of protein folding pathways. Reducing the computational complexity of the folding process using the building block folding model. Nurit Haspel, Chung-Jung Tsai, Haim Wolfson and Ruth Nussinov. The building blocks model (Chung Jung Tsai).


Presentation Transcript


  1. A computational study of protein folding pathways Reducing the computational complexity of the folding process using the building block folding model. Nurit Haspel, Chung-Jung Tsai, Haim Wolfson and Ruth Nussinov

  2. The building blocks model (Chung-Jung Tsai) • Protein folding is a hierarchical process. • A protein is constructed from hydrophobic folding units (HFUs). • HFU - the result of a combinatorial assembly of building blocks. • Building block - a contiguous, highly populated fragment. • The building block model makes it possible to illustrate the protein folding pathway.

  3. An outline of the building blocks algorithm • Scoring function - measures the relative stability of a candidate building block. • Three ingredients: • Compactness • Degree of isolation • Hydrophobicity • The result - an “anatomy tree” that illustrates the most probable folding route.

  4. The Scoring Function • Z - compactness • H - hydrophobicity • I - isolation

  5. Compactness, Hydrophobicity and Isolation definitions • Compactness - • Hydrophobicity - • Isolation -

  6. The Cutting Procedure • Locating a basket of candidate building blocks (relatively stable contiguous fragments): • Assign a stability score to all the candidate fragments. • Collect the local minima in the “fragment map” (best score in a given radius). • Recursively splitting the protein top-down: • Search the “basket” for a set of fragments that constitute the whole fragment, allowing a short overlap (7 residues) and a gap of up to 15 residues. • Minimum building block size - 15 residues. • No node can have only one child (except for the root). • Stop when the node cannot be split any further. • In this work, building blocks are used up to level 6.
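
The two stages of the cutting procedure can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the names `local_minima` and `covers` are hypothetical, a lower score is assumed to mean a more stable fragment, the "radius" is assumed to be measured on the (start, end) coordinates of the fragment map, and the same gap limit is assumed to apply at both ends of the parent fragment. The scoring function itself is not reproduced here.

```python
from typing import Dict, List, Tuple

Fragment = Tuple[int, int]  # (start, end), inclusive residue indices


def local_minima(scores: Dict[Fragment, float], radius: int) -> List[Fragment]:
    """Keep fragments whose score is the lowest within `radius` in (start, end) space.

    Assumption: lower score = more stable fragment.
    """
    basket = []
    for (s, e), score in scores.items():
        is_min = all(
            score <= other
            for (s2, e2), other in scores.items()
            if abs(s - s2) <= radius and abs(e - e2) <= radius
        )
        if is_min:
            basket.append((s, e))
    return sorted(basket)


def covers(parent: Fragment, parts: List[Fragment],
           max_overlap: int = 7, max_gap: int = 15, min_size: int = 15) -> bool:
    """Check that `parts` tile `parent` within the overlap, gap and size limits."""
    parts = sorted(parts)
    if len(parts) < 2:  # a node may not have only one child
        return False
    if any(s < parent[0] or e > parent[1] or e - s + 1 < min_size for s, e in parts):
        return False
    # Assumed: uncovered residues at either end of the parent count as gaps.
    if parts[0][0] - parent[0] > max_gap or parent[1] - parts[-1][1] > max_gap:
        return False
    for (_, e1), (s2, _) in zip(parts, parts[1:]):
        gap = s2 - e1 - 1  # a negative gap is an overlap
        if gap > max_gap or -gap > max_overlap:
            return False
    return True


# Toy fragment map with made-up scores.
toy_scores = {(0, 20): -1.0, (1, 21): -2.0, (2, 22): -1.5, (30, 50): -3.0}
basket = local_minima(toy_scores, radius=2)
print(basket, covers((0, 50), basket))
```

A recursive splitter would call `covers` on candidate subsets of the basket for each node and stop when no valid subset exists.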

  7. Example - Annexin III

  8. Example (cont.)

  9. Example (cont.)

  10. Usefulness of the anatomy tree • It is possible to see whether a protein folds through single or multiple route(s). • These routes can be observed by inspecting the fragment map (there can be more than one way to construct a tree). • Sequential versus non-sequential folding. • Sequential – contacts are made only between consecutive building blocks. • A binary anatomy tree indicates a sequential folder. • Fast versus slow folding. • Sequential folding proteins usually fold faster. • Climbing up the tree allows us to illustrate the folding process.

  11. Critical building blocks (Sandeep Kumar) • Some building blocks may be considered critical for correct folding. • A critical building block is in contact with other building blocks in the protein. • It is likely to be inserted between sequentially connected building blocks. • Without it, the other building blocks are likely to mis-associate. • The structure and sequence of a critical building block are more likely to be conserved.

  12. Critical building block algorithm • For each building block: • Compute its diff. contacting surface area. • Compute its critical building block index (CIndex). • Compute its Z-score.

  13. Critical building blocks (cont.) A building block is critical if: • It is found at most levels below the hydrophobic folding unit level. • It has a consistently high CIndex at different levels. • Its CIndex is significant by at least 2 standard deviations in at least one level of protein anatomy.
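
The third criterion (a CIndex at least 2 standard deviations above the mean in at least one level) can be illustrated with a short Python sketch. The CIndex formula is not given in the transcript, so the values below are made-up inputs; the function name `significant_blocks` is hypothetical, and the use of the population standard deviation over all blocks of a level is an assumption.

```python
from statistics import mean, pstdev
from typing import Dict, Set


def significant_blocks(cindex_by_level: Dict[int, Dict[str, float]],
                       threshold: float = 2.0) -> Set[str]:
    """Return blocks whose CIndex Z-score reaches `threshold` in at least one level.

    Assumption: the Z-score is (value - mean) / population std of that level.
    """
    hits: Set[str] = set()
    for cindex in cindex_by_level.values():
        values = list(cindex.values())
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:  # all blocks identical at this level: nothing stands out
            continue
        for block, value in cindex.items():
            if (value - mu) / sigma >= threshold:
                hits.add(block)
    return hits


# Made-up CIndex values for two anatomy levels.
toy = {
    3: {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0, "e": 1.0, "f": 1.0, "g": 10.0},
    4: {"a": 2.0, "b": 2.0},
}
critical_candidates = significant_blocks(toy)
print(critical_candidates)
```

The other two criteria (presence at most levels and a consistently high CIndex) would need to be checked separately before calling a block critical.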

  14. The goals of my research • Clustering the building blocks according to their 3-D structures, using a rigid matching algorithm. • Analyzing the building blocks: Sequence, stability distribution, size. • Analyzing the clusters: Size, stability score distribution, sequence conservation, criticalness conservation.

  15. The goals of my research (cont.) • Analyzing the critical building blocks: position within the protein, relative stability, sequence and structure conservation. • Developing an algorithm that assigns a set of building blocks to a protein sequence, using sequence similarity, relative stability and more information.

  16. Clustering the building blocks • Each cluster has representative members (one or more). • For each building block structure: • Go over the clusters. • Match with the cluster representative(s). • If it matches (1.5 Å RMSD, 70% size) - join the building block to the cluster. • If no match is found - open a new cluster with this building block as a representative. • Problem - O(n²) comparisons, where n is the number of clusters.
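
The clustering loop above can be sketched in Python as follows. This is an illustration only: `greedy_cluster` and the `matches` predicate are hypothetical names, each cluster is simplified to a single representative (its first member), and the toy predicate on plain numbers merely stands in for the rigid structural matching (1.5 Å RMSD, 70% size) described on the slide.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def greedy_cluster(items: List[T],
                   matches: Callable[[T, T], bool]) -> List[List[T]]:
    """Assign each item to the first cluster whose representative it matches.

    clusters[i][0] is the representative of cluster i (simplifying assumption:
    one representative per cluster).
    """
    clusters: List[List[T]] = []
    for item in items:
        for cluster in clusters:
            if matches(item, cluster[0]):
                cluster.append(item)
                break
        else:  # no representative matched: open a new cluster
            clusters.append([item])
    return clusters


# Toy stand-in for structural matching: two numbers "match" if within 1.0.
toy_items = [0.0, 0.4, 5.0, 5.3, 0.9, 10.0]
toy_clusters = greedy_cluster(toy_items, lambda a, b: abs(a - b) <= 1.0)
print(toy_clusters)
```

Because every new item is compared against every existing representative, the cost grows with the number of clusters, which is the problem the slide points out.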

  17. Clustering of the building blocks (figure: a building block compared against Cluster 1, Cluster 2, …, Cluster n)

  18. Making clustering more efficient • Dividing the building blocks into SCOP families (proteins from the same family usually produce the same building blocks). • Clustering each family and then merging all the clusters - reduces the number of clusters at each instance.

  19. Building block and cluster data

  20. Distribution of number of clusters

  21. An example of a cluster

  22. Sequence analysis of the clusters • Sequence clustering of each structural cluster (using BLAST). • Creating a non-redundant sequence dataset. • Goal - finding a connection between (short) sequences and structures.

  23. Statistical analysis of the clusters and of the critical building blocks • Stability score distribution among cluster members. • Criticalness score distribution among cluster members. • Position distribution of the critical building blocks. • Stability score as a function of criticalness score.

  24. An example of stability distribution

  25. Criticalness score distribution within a cluster

  26. An N-terminus critical building block example

  27. A C-terminus critical building block example

  28. A mid-sequence critical building block example

  29. Distribution of the position inside the protein - all-alpha, level 3

  30. Stability vs. Criticalness score example

  31. Stability score of critical and non-critical building blocks (histogram; legend: non-critical, critical)

  32. Final goal Given a sequence and using the information accumulated so far - is there a way of matching a set of building blocks to it?

  33. The building block assignment algorithm • Perform sequence alignment of the protein sequence against the building block sequence database. • Construct a directed, acyclic graph. • Each matching building block is a graph vertex and is assigned a score depending on the sequence alignment score, building block stability and other parameters. • Directed edges connecting the fragments that match to consecutive areas in the protein sequence, allowing short overlaps and small gaps. • Edge score – average score of connected vertices.

  34. The building block assignment algorithm (cont.) • Add fictitious “start” and “target” vertices. • Connect start to all starting vertices. • Connect all ending vertices to target. • Find the shortest path from start to target using a single-source shortest-path algorithm. • The path is an “optimal” building block assignment covering the protein sequence.
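
The graph construction and path search can be sketched in Python as follows. This is a minimal sketch under several assumptions that the slides do not confirm: vertex scores are treated as costs (lower is better), the fictitious start and target vertices have cost 0, the same overlap and gap limits as in the cutting procedure (7 and 15 residues) are reused, and a vertex counts as "starting" or "ending" if it lies within the gap limit of the sequence end. The name `assign_blocks` is hypothetical. Because the graph is acyclic, the single-source shortest path is computed by one pass over the vertices in topological order.

```python
from typing import List, Optional, Tuple

Match = Tuple[int, int, float]  # (start, end, cost); 0-based inclusive residues


def assign_blocks(matches: List[Match], seq_len: int,
                  max_overlap: int = 7, max_gap: int = 15) -> Optional[List[Match]]:
    """Cheapest chain of matches covering the sequence, or None if none exists."""
    order = sorted(matches)  # increasing start position is a topological order
    inf = float("inf")
    best = [inf] * len(order)   # cheapest cost from "start" to each vertex
    prev = [-1] * len(order)    # predecessor on that cheapest path
    for j, (s, e, c) in enumerate(order):
        if s <= max_gap:  # edge from the fictitious start vertex (cost 0)
            best[j] = c / 2
        for i in range(j):
            ps, pe, pc = order[i]
            gap = s - pe - 1  # a negative gap is an overlap
            if ps < s and pe < e and -max_overlap <= gap <= max_gap:
                cand = best[i] + (pc + c) / 2  # edge score = average of endpoints
                if cand < best[j]:
                    best[j], prev[j] = cand, i
    end, total = -1, inf
    for j, (_, e, c) in enumerate(order):
        # edge to the fictitious target vertex (cost 0)
        if seq_len - 1 - e <= max_gap and best[j] + c / 2 < total:
            end, total = j, best[j] + c / 2
    if end < 0:
        return None
    path = []
    while end >= 0:
        path.append(order[end])
        end = prev[end]
    return path[::-1]


# Made-up matches on a 90-residue sequence.
toy_matches = [(0, 29, 1.0), (25, 59, 2.0), (30, 59, 5.0), (60, 89, 1.0)]
assignment = assign_blocks(toy_matches, seq_len=90)
print(assignment)
```

In the toy input, two candidate blocks compete for the middle of the sequence and the cheaper one is chosen.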

  35. Illustration of the algorithm

  36. Example – ROP protein from E. coli (1rpo)

  37. Example – Myoglobin from sea hare (1mba)

  38. Suggestions for future work • Improving the algorithm and adding new parameters to it (secondary structure alignment, trying other building blocks from the same cluster as the matching building blocks etc.). • Combinatorial assembly – Yuval’s work. • Further cluster analysis – inquiring into sequence conservation • Conformation stability measurements (molecular dynamics…)

  39. Conclusions • Using the hierarchical folding model, it may be possible to reduce the folding complexity by assigning local substructures and then assembling them.
