Graphical Models for Protein Kinetics

Graphical Models for Protein Kinetics. Nina Singhal CS374 Presentation Nov. 1, 2005. Outline. Background material on proteins Why study protein kinetics Graphical models for kinetics Motion planning view (Apaydin et al, 2003) Molecular dynamics view (Singhal et al, 2004) Conclusions.


Presentation Transcript


  1. Graphical Models for Protein Kinetics Nina Singhal CS374 Presentation Nov. 1, 2005

  2. Outline • Background material on proteins • Why study protein kinetics • Graphical models for kinetics • Motion planning view (Apaydin et al, 2003) • Molecular dynamics view (Singhal et al, 2004) • Conclusions

  3. Background on Proteins [Figure panels: alpha helix; beta strand and sheet; beta barrel]

  4. Structure Prediction MTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE Given an amino acid sequence, what 3D structure will the protein form?

  5. Pathways and Kinetics How does a protein actually get from an unfolded configuration to a folded configuration?

  6. Folding Kinetics • Rate of folding • Uniqueness of pathway • Order of secondary structure formation • Secondary or tertiary structure

  7. Applications • Misfolded proteins and diseases • Alzheimer's • Cystic fibrosis • Mad cow disease • Intermediates may be important as drug targets • Protein design

  8. Representation of a Protein [Figure: backbone atoms N1, Cα, C, N2 and side chain R, with the dihedral angles phi, psi, and omega labeled] A protein with n amino acids can be represented using 2n phi-psi angles, each in the range [0, 2π)

  9. Graphical Models for Protein Kinetics • Protein conformations have different energies • Graphical models discretize the conformation space and connect nearby regions with edges

  10. Robotics Motion Planning [Figure: a robot with 2 degrees of freedom and its 2D configuration space, both axes running from 0 to 2π, with configurations c1 and c2 marked] Moving the robot arm from c1 to c2 is just finding a path in the configuration space from c1 to c2.

  11. Roadmap Method • Randomly sample points in configuration space. Keep feasible ones. • Connect these points to form a graph. • Process path queries using standard graph search techniques. [Figure: sampled roadmap in the 2D configuration space, axes 0 to 2π, with c1 and c2 marked]
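
The three roadmap steps above can be sketched in a few lines of Python. This is a minimal illustration under assumed details, not the planner from the talk: the disc-shaped obstacle, the connection radius, the sample count, and the names `is_feasible`, `build_roadmap`, and `find_path` are all hypothetical. For brevity the sketch ignores the wrap-around of the angles at 2π and does not check the straight segment between two nodes for collisions, which a real planner would do.

```python
import math
import random
from collections import deque

TWO_PI = 2 * math.pi

def is_feasible(q):
    """Hypothetical feasibility test: reject points inside a disc obstacle."""
    return math.dist(q, (math.pi, math.pi)) > 1.0

def build_roadmap(n_samples, radius, extra=(), seed=0):
    """Sample feasible configurations and connect pairs closer than `radius`."""
    rng = random.Random(seed)
    nodes = [q for q in extra if is_feasible(q)]  # query points go in first
    while len(nodes) < n_samples:
        q = (rng.uniform(0, TWO_PI), rng.uniform(0, TWO_PI))
        if is_feasible(q):  # keep feasible samples only
            nodes.append(q)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) < radius:
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def find_path(edges, start, goal):
    """Breadth-first search; returns a list of node indices, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in edges[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

c1, c2 = (0.5, 0.5), (5.5, 5.5)
nodes, edges = build_roadmap(300, 0.8, extra=(c1, c2))
path = find_path(edges, 0, 1)  # c1 is node 0, c2 is node 1
```

If the sampled graph happens to be disconnected, `find_path` returns `None`; sampling more points or enlarging the radius is the usual remedy.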

  12. Protein Folding as a Search Problem • Protein folding can be represented as a search through the protein’s configuration space • Replace the collision-free constraint with a preference for low-energy configurations • Instead of finding any path, we want to find all the energetically favorable paths

  13. Stochastic Roadmap Simulation (Apaydin et al., 2003) • Sample protein configurations at random • Add edges between nearby nodes • Take advantage of the many folding pathways contained within a roadmap • Efficiently calculate many properties of the entire landscape

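  14. Roadmap Construction • Nodes in the graph are sampled uniformly at random • Edges are added between nearest neighbors with probability: $P_{ij} = \frac{1}{N_i} \exp(-\Delta E_{ij} / k_B T)$ if $\Delta E_{ij} > 0$; $P_{ij} = \frac{1}{N_i}$ otherwise, where $N_i$ is the number of neighbors of node $i$ and $\Delta E_{ij} = E_j - E_i$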
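
A minimal sketch of how such edge probabilities might be assigned, assuming a Metropolis-style rule: an uphill move (energy difference greater than zero) is taken with probability exp(-ΔE/kT) divided by the number of neighbors, a downhill or flat move with probability one over the number of neighbors, and any leftover probability is kept as a self-transition so each row sums to one. The function name, the toy energies, and the chain-shaped neighbor lists are hypothetical.

```python
import math

def transition_matrix(energies, neighbors, kT=1.0):
    """Return P as a list of dicts: P[i][j] is the probability of moving i -> j.

    energies[i]  : energy of roadmap node i
    neighbors[i] : indices of nodes adjacent to i (assumed not to include i)
    """
    P = []
    for i, nbrs in enumerate(neighbors):
        row = {}
        for j in nbrs:
            dE = energies[j] - energies[i]
            # Uphill moves are damped by the Boltzmann factor; others are not.
            accept = math.exp(-dE / kT) if dE > 0 else 1.0
            row[j] = accept / len(nbrs)
        row[i] = 1.0 - sum(row.values())  # leftover probability: stay put
        P.append(row)
    return P

# Toy roadmap: a chain of five nodes with an energy barrier in the middle.
energies = [0.0, 1.0, 3.0, 1.0, 0.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
P = transition_matrix(energies, neighbors)
```

In this toy example, node 1 steps downhill to node 0 with probability 0.5, while the uphill step toward the barrier at node 2 is much less likely.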

  15. Roadmap as a Markov Chain • We can view the molecular motion as a random walk over the roadmap • The roadmap can be regarded as a discretely sampled version of a Monte Carlo simulation • In fact, in the limit, the probability distributions of the Monte Carlo simulation and the roadmap converge

  16. Transmission Coefficients • Measures “kinetic distance” • Probability that a conformation will fold before unfolding • Can calculate by starting many Monte Carlo simulations from the conformation • Very computationally expensive [Figure: a conformation between the unfolded state and the folded state]
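
To make the cost of the simulation-based estimate concrete, here is a hedged sketch that estimates the probability of folding before unfolding by running many random walks from one starting node. The five-node chain, the transition probabilities, and the name `estimate_pfold` are hypothetical toy choices; each additional starting conformation would need its own batch of walks.

```python
import random

def estimate_pfold(P, start, folded, unfolded, n_walks=5000, seed=0):
    """Fraction of random walks from `start` that hit `folded` before `unfolded`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_walks):
        state = start
        while state not in folded and state not in unfolded:
            targets = list(P[state].keys())
            weights = list(P[state].values())
            state = rng.choices(targets, weights=weights)[0]
        if state in folded:
            hits += 1
    return hits / n_walks

# Toy chain: node 0 is unfolded, node 4 is folded, interior steps are unbiased.
chain = [
    {0: 1.0},
    {0: 0.5, 2: 0.5},
    {1: 0.5, 3: 0.5},
    {2: 0.5, 4: 0.5},
    {4: 1.0},
]
pfold_mc = estimate_pfold(chain, start=2, folded={4}, unfolded={0})
```

By symmetry the exact value for the middle node is 0.5, and the estimate only approaches it as the number of walks grows.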

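  17. Algebraic Method for Calculating Transmission Coefficients [Figure: roadmap nodes vi and vj joined by an edge with transition probability Pij, lying between the folded set F and the unfolded set U]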

  18. Transmission Coefficients (cont) • System of linear equations • One equation and one unknown for each node • Can be solved iteratively • Low connectivity of the graph results in a sparse matrix
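
The linear system can be illustrated with a short sketch. It assumes the standard first-step relation: the transmission coefficient is 1 on folded nodes, 0 on unfolded nodes, and for every other node equals the probability-weighted average of its neighbors' values. The Gauss-Seidel style sweep, the sparse list-of-dicts storage, and the name `transmission_coefficients` are illustrative choices, not necessarily those of the original work.

```python
def transmission_coefficients(P, folded, unfolded, tol=1e-12, max_iter=100_000):
    """Solve tau_i = sum_j P[i][j] * tau_j with tau = 1 on folded, 0 on unfolded.

    P is a sparse row-stochastic matrix stored as a list of dicts.
    """
    n = len(P)
    tau = [1.0 if i in folded else 0.0 for i in range(n)]
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            if i in folded or i in unfolded:
                continue  # boundary values stay fixed
            new = sum(p * tau[j] for j, p in P[i].items())
            change = max(change, abs(new - tau[i]))
            tau[i] = new  # updated in place, so later rows see the new value
        if change < tol:
            break
    return tau

# Toy chain: node 0 is unfolded, node 4 is folded, interior steps are unbiased.
chain = [
    {0: 1.0},
    {0: 0.5, 2: 0.5},
    {1: 0.5, 3: 0.5},
    {2: 0.5, 4: 0.5},
    {4: 1.0},
]
tau = transmission_coefficients(chain, folded={4}, unfolded={0})
```

For this unbiased chain the solution rises linearly from 0 at the unfolded end to 1 at the folded end. Because each row touches only a node's neighbors, one sweep costs time proportional to the number of edges, which is where the sparsity of the roadmap pays off.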

  19. Results • Studied a synthetic landscape and a real protein, ROP • Protein was represented with 6 degrees of freedom, two vectors connected by a loop • Correlation of transmission coefficients calculated by roadmaps and Monte Carlo simulations

  20. Benefits and Drawbacks • Extremely efficient at calculating kinetic properties like transmission coefficients • Unclear whether low-dimension representation of protein is adequate • Monte Carlo simulations may not be accurate enough for protein kinetics

  21. Molecular dynamics • Simulate protein movement using Newton’s laws of motion [Figure: timescale axis in seconds, from 10^-15 (femto) through 10^-12 (pico), 10^-9 (nano), 10^-6 (micro), and 10^-3 (milli) to 10^0; events marked along the axis: bond vibration, isomerization, water dynamics, helix forms, fastest folders, typical folders, slow folders; annotations: MD step, long MD run, "where we need to be", "where we’d love to be"]

  22. Folding@Home: Worldwide desktop grid computing • ~150,000 CPUs over the world (CPU locations from IP address)

  23. Markovian Model Method (Singhal et al., JCP 2004) • Generate molecular dynamics trajectories from transition path sampling or independently • Cluster nearby points into macrostates to build a roadmap whose edges also include transition times • Calculate the mean first passage time (MFPT) and Pfold using linear algebra

  24. Step 1: sampling of paths • Pick a random point from current path • Shoot a path from this point • If path reaches initial or final state by some cutoff time, stop simulation and accept it • Define new current path

  25. Step 2: Generation of roadmap • Nodes are accepted points, edges connect successive nodes • Cluster nearby points to make roadmap more connected • Calculate edge weights by counting number of transitions between nodes and normalize
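
The edge-weight step can be sketched as follows, assuming each trajectory has already been reduced to a sequence of cluster labels. The labels `"U"`, `"I"`, `"F"`, the toy trajectories, and the function name are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_transition_probabilities(trajectories):
    """Count transitions between successive cluster labels and normalize each row."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }

# Two toy trajectories over the clusters U (unfolded), I (intermediate), F (folded).
trajectories = [["U", "I", "I", "F"], ["U", "U", "I", "F"]]
weights = estimate_transition_probabilities(trajectories)
```

Here cluster U is left three times, twice toward I and once back to itself, so the U to I edge gets weight 2/3. Clusters that are never left, such as F in this example, get no outgoing row.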

  26. Step 3 (opt): Re-weighting of edges • Can analyze roadmap at parameter values other than the simulated ones without need for additional simulations • For temperature, can re-weight edges by the relative probabilities at the two temperatures according to the dynamics • Renormalize edges so outgoing probability sums to one

  27. Calculating Pfolds and MFPT • Equation for each node is conditioned on which neighbor it transitions to • One equation and one unknown for each node • Can be solved iteratively
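
A hedged sketch of the MFPT calculation, assuming the usual conditioning on the first step: the MFPT is zero on folded nodes, and for any other node it is the probability-weighted sum, over neighbors, of the edge transition time plus the neighbor's own MFPT. The toy three-node roadmap, the unit edge times, and the name `mean_first_passage_times` are hypothetical.

```python
def mean_first_passage_times(P, T, folded, tol=1e-12, max_iter=100_000):
    """Solve m_i = sum_j P[i][j] * (T[i][j] + m_j), with m_i = 0 on folded nodes.

    P[i][j] : transition probability i -> j (list of dicts)
    T[i][j] : time taken by the transition i -> j (same sparsity as P)
    """
    n = len(P)
    m = [0.0] * n
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            if i in folded:
                continue  # already folded: zero time remaining
            new = sum(p * (T[i][j] + m[j]) for j, p in P[i].items())
            change = max(change, abs(new - m[i]))
            m[i] = new
        if change < tol:
            break
    return m

# Toy roadmap: node 0 always steps to node 1; node 1 steps to 0 or to the
# folded node 2 with equal probability; every transition takes one time unit.
P = [{1: 1.0}, {0: 0.5, 2: 0.5}, {2: 1.0}]
T = [{1: 1.0}, {0: 1.0, 2: 1.0}, {2: 1.0}]
mfpt = mean_first_passage_times(P, T, folded={2})
```

Solving the two equations by hand gives 4 time units from node 0 and 3 from node 1, which the iteration approaches. Pfold follows from the same kind of sweep with the time terms dropped and the boundary values set to 1 on folded and 0 on unfolded nodes.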

  28. Energy landscape and initial pathway • 2-D energy landscape • Initial and final regions defined by circles around the two minima • Initial paths generated by Monte Carlo or Langevin dynamics [Figure: energy landscape with the initial region I and the final region F marked]

  29. Results - Pfold • Compare Pfold values to those from many direct simulations • Correlation coefficients are 0.99 for both

  30. Results - MFPT • Compare MFPT at different temperatures to those from 10,000 direct simulations

  31. Results – Trp zipper β-hairpin • Analyzed existing simulation data of a small, 12-residue protein • 1750 trajectories, each 10 - 450 ns, resolution of 10 ns for non-folding and 250 ps for folding • Combine into roadmap • Depending on clustering cutoffs, MFPT = 2-9 μs • Agrees with experimental results of 2.47 ± 0.05 μs and previous analysis of simulation data of 4.5 μs

  32. Conclusions • Graphical methods produce a network of possible protein pathways • These networks can be efficiently analyzed to compute kinetic properties • Very fast method for looking at simple protein models or analyzing existing molecular dynamics data
