
Structural Bioinformatics Workshop



Presentation Transcript


  1. Structural Bioinformatics Workshop • Max Shatsky • Email: maxshats@post.tau.ac.il • Workshop home page: http://bioinfo3d.cs.tau.ac.il/Education/Workshop/

  2. Schedule • Introduction to protein structure. • Introduction to pattern matching. • Protein structure alignment (comparison). • Protein Docking • GAMB++ library.

  3. Grade Ingredients • Presentation and Design Review • Final Project • Software Engineering • Efficiency of Solution • Working Examples and Test Cases • Documentation • Knowledge of all project aspects

  4. Bioinformatics - Computational Genomics • DNA mapping. • Protein or DNA sequence comparisons. • Exploration of huge textual databases. • In essence, one-dimensional methods and intuition.

  5. Structural Bioinformatics - Structural Genomics • Elucidation of the 3D structures of biomolecules. • Analysis and comparison of biomolecular structures. • Prediction of biomolecular recognition. • Handles three-dimensional (3-D) structures. • Geometric Computing. (a methodology shared by Computational Geometry, Computer Vision, Computer Graphics, Pattern Recognition etc.)

  6. Protein Structural Comparison • Pseudoazurin - 1pmy • ApoAmicyanin - 1aaj

  7. Algorithmic Solution About 1 sec. Fischer, Nussinov, Wolfson ~ 1990.

  8. Introduction to Protein Structure

  9. The central dogma • DNA ---> mRNA ---> Protein • Alphabets: {A,C,G,T} ---> {A,C,G,U} ---> {A,C,D,..Y} • Base pairs: Guanine-Cytosine, Thymine-Adenine; in RNA, T -> U • 4-letter alphabets ---> 20-letter alphabet • Sequence of nucleic acids ---> sequence of amino acids

  10. When genes are expressed, the genetic information (base sequence) on DNA is first transcribed (copied) to a molecule of messenger RNA in a process similar to DNA replication. The mRNA molecules then leave the cell nucleus and enter the cytoplasm, where triplets of bases (codons) forming the genetic code specify the particular amino acids that make up an individual protein. This process, called translation, is accomplished by ribosomes (cellular components composed of proteins and another class of RNA) that read the genetic code from the mRNA, and by transfer RNAs (tRNAs) that transport amino acids to the ribosomes for attachment to the growing protein. (From www.ornl.gov/hgmis/publicat/primer/ )
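The two steps described above (transcription, then codon-by-codon translation) can be sketched in a few lines of Python. This is a minimal illustration, not part of the workshop material: the function names are hypothetical, the input is assumed to be the coding strand (so transcription reduces to replacing T with U), and the codon table is deliberately partial, covering only the codons used in the example.

```python
# Partial codon table (standard genetic code); "*" marks a stop codon.
# Only the codons needed for the example are listed; a real table has 64 entries.
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "UGG": "W", "UUU": "F", "GGC": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

def transcribe(dna_coding_strand):
    """Assumes the coding strand is given, so mRNA is the same sequence with T -> U."""
    return dna_coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA in triplets (codons) and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGGCTTGGTTTGGCTAA")
print(mrna)             # AUGGCUUGGUUUGGCUAA
print(translate(mrna))  # MAWFG
```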

  11. Amino acids and the peptide bond • Cα atoms • Cβ – first side-chain carbon (absent in glycine).

  12. Wire-frame or ribbons display

  13. Geometric Representation • 3-D curve $\{v_i\}$, $i = 1, \dots, n$

  14. Secondary structure

  15. β strands and sheets • Hydrogen bonds.

  16. The Holy Grail - Protein Folding • From Sequence to Structure. • Even relatively primitive computational folding models have been proved NP-hard, even in the 2-D case.

  17. Determination of protein structures • X-ray Crystallography • NMR (Nuclear Magnetic Resonance) • EM (Electron microscopy)

  18. An NMR result is an ensemble of models • Cystatin (1a67)

  19. The Protein Data Bank (PDB) • International repository of 3D molecular data. • Contains x-y-z coordinates of all atoms of the molecule and additional data. • http://pdb.tau.ac.il • http://www.rcsb.org/pdb/
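As a sketch of how the x-y-z coordinates in a PDB file become the 3-D curve of Cα atoms used in the geometric representation, the snippet below reads the fixed-width ATOM records of the legacy PDB text format. The function name is hypothetical and the three sample lines carry invented coordinates for illustration only; they are not taken from a real entry.

```python
def parse_ca_atoms(pdb_text):
    """Return the (x, y, z) coordinates of all C-alpha atoms, in file order.

    Legacy PDB ATOM records are fixed-width: the atom name occupies
    columns 13-16 and x, y, z occupy columns 31-38, 39-46 and 47-54.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords

# Invented sample records (illustrative values, not from a real structure).
SAMPLE = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614\n"
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842\n"
    "ATOM      9  CA  GLY A   2      23.001  26.100   4.500\n"
)

print(parse_ca_atoms(SAMPLE))
```

Real files also contain HETATM records, alternate locations and multiple models (for example NMR ensembles), which this sketch ignores.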

  20. Why bother with structures when we have sequences? • In evolutionarily related proteins, structure is much better preserved than sequence. • Structural motifs may predict similar biological function. • Getting insight into protein folding. • Recovering the limited (?) number of protein folds.

  21. Applications • Classification of protein databases by structure. • Search of partial and disconnected structural patterns in large databases. • Extracting structure information is difficult; we want to extract “new” folds.

  22. Applications (continued) • Speed up of drug discovery. • Detection of structural pharmacophores in an ensemble of drugs (similar substructures in drugs acting on a given receptor – pharmacophore). • Comparison and detection of drug receptor active sites (structurally similar receptor cavities could bind similar drugs).

  23. Object Recognition

  24. Model Database

  25. Scene

  26. Recognition Lamdan, Schwartz, Wolfson, “Geometric Hashing”,1988.

  27. Protein Alignment = Geometric Pattern Discovery

  28. Protein Alignment • The superimposition pattern is not known a priori: this is pattern discovery. • The matching recovered can be inexact. • We are not necessarily looking for the largest superimposition, since other matchings may have biological meaning.

  29. Geometric Task: Given two configurations of points in three-dimensional space, find those rotations and translations of one of the point sets which produce “large” superimpositions of corresponding 3-D points.

  30. Geometric Task (continued) • Aspects: • Object representation (points, vectors, segments) • Object resemblance (distance function) • Transformation (translations, rotations, scaling) -> Optimization technique

  31. Transformations • Translation • Translation and Rotation • Rigid Motion (Euclidean transformation) • Translation, Rotation + Scaling
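A small sketch of the difference between these transformation classes: a rigid motion (rotation plus translation) preserves every pairwise distance, while adding scaling does not. The helper names are hypothetical, and for brevity the rotation is restricted to the z axis; a general rotation would use a full 3x3 matrix.

```python
import math

def rotate_z(p, angle):
    """Rotate point p = (x, y, z) about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)

def translate_point(p, t):
    """Shift point p by the translation vector t."""
    return tuple(a + b for a, b in zip(p, t))

def rigid_motion(p, angle, t):
    """Rotation followed by translation: a Euclidean transformation."""
    return translate_point(rotate_z(p, angle), t)

def scale(p, factor):
    """Uniform scaling about the origin."""
    return tuple(factor * a for a in p)

p, q = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
angle, t = math.radians(40.0), (5.0, -1.0, 2.0)

before = math.dist(p, q)  # math.dist needs Python 3.8+
after_rigid = math.dist(rigid_motion(p, angle, t), rigid_motion(q, angle, t))
after_scale = math.dist(scale(p, 2.0), scale(q, 2.0))
print(before, after_rigid, after_scale)
```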

  32. Inexact Alignment. Simple case: two closely related proteins with the same number of amino acids. Question: how to measure alignment error?

  33. Superposition - best least squares (RMSD – Root Mean Square Deviation) • Given two sets of 3-D points $P=\{p_i\}$, $Q=\{q_i\}$, $i=1,\dots,n$: $\mathrm{rmsd}(P,Q) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \lVert p_i - q_i \rVert^2}$ • Find a 3-D rigid transformation $T^*$ such that: $\mathrm{rmsd}(T^*(P),Q) = \min_T \sqrt{\frac{1}{n}\sum_{i=1}^{n} \lVert T(p_i) - q_i \rVert^2}$ • A closed-form solution exists for this task. It can be computed in O(n) time.
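The slide does not say which closed-form solution is meant. One well-known option is the SVD-based Kabsch method, sketched below with NumPy as an assumption rather than as the workshop's own implementation; the function names are hypothetical. The centroids and the 3x3 covariance matrix take O(n) time, and the SVD of a 3x3 matrix is constant-time, which is consistent with the O(n) bound stated above.

```python
import numpy as np

def rmsd(P, Q):
    """Root mean square deviation between paired (n, 3) point arrays."""
    return float(np.sqrt(((P - Q) ** 2).sum() / len(P)))

def superpose(P, Q):
    """Kabsch method: return (R, t) minimising rmsd(P @ R.T + t, Q)."""
    p_center, q_center = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_center).T @ (Q - q_center)   # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_center - R @ p_center
    return R, t

# Build Q by applying a known rigid motion to P, then recover that motion.
theta = np.radians(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])
P = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 2.0, 0.0],
              [0.0, 2.0, 1.0], [3.0, 1.0, 2.0]])
Q = P @ R_true.T + t_true

R, t = superpose(P, Q)
print(rmsd(P, Q), rmsd(P @ R.T + t, Q))
```

For noisy or only approximately related point sets the residual RMSD after superposition will be greater than zero; that residual is the alignment error asked about on the previous slide.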

  34. Problem statement with RMSD metric: Given two configurations of points in three-dimensional space and a threshold ε, find the largest alignment (a set of matched elements together with a transformation) with RMSD less than ε. (The problem belongs to NP; is it NP-complete?)

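  35. Distance Functions • Two point sets: $A=\{a_i\}$, $i=1,\dots,n$, and $B=\{b_j\}$, $j=1,\dots,m$ • Pairwise correspondence: $(a_{k_1},b_{t_1}), (a_{k_2},b_{t_2}), \dots, (a_{k_N},b_{t_N})$ • (1) Exact matching: $\lVert a_{k_i} - b_{t_i} \rVert = 0$ for all $i$ • (2) RMSD (Root Mean Square Distance): $\sqrt{\frac{1}{N}\sum_{i=1}^{N} \lVert a_{k_i} - b_{t_i} \rVert^2} < \varepsilon$ • (3) Bottleneck: $\max_i \lVert a_{k_i} - b_{t_i} \rVert$ • Hausdorff distance: $h(A,B)=\max_{a\in A}\min_{b\in B}\lVert a-b\rVert$, $H(A,B)=\max(h(A,B), h(B,A))$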
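The distance functions listed above translate almost directly into code. The sketch below is a plain-Python illustration with hypothetical function names; points are tuples, and a correspondence is a list of (a, b) pairs. Note that the Hausdorff distance needs no correspondence at all, and that the directed version is not symmetric.

```python
import math

def rmsd(pairs):
    """Root mean square distance over a list of matched (a, b) point pairs."""
    return math.sqrt(sum(math.dist(a, b) ** 2 for a, b in pairs) / len(pairs))

def bottleneck(pairs):
    """Largest distance between any matched pair."""
    return max(math.dist(a, b) for a, b in pairs)

def directed_hausdorff(A, B):
    """h(A, B): the worst-case distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A)), which makes the measure symmetric."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
pairs = [(A[0], B[0]), (A[1], B[2])]   # an arbitrary correspondence

print(directed_hausdorff(A, B), directed_hausdorff(B, A), hausdorff(A, B))
print(bottleneck(pairs), rmsd(pairs))
```

In this example every point of A coincides with a point of B, so h(A, B) is 0, while the extra point of B lies 4 units from A, so h(B, A) and H(A, B) are both 4.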
