Eugene Krissinel SSM - MSDfold


Presentation Transcript


  1. Eugene Krissinel SSM - MSDfold

  2. MSDfold (SSM)

  3. Structure alignment
     Structure alignment may be defined as identification of residues occupying “equivalent” geometrical positions
     • Unlike in sequence alignment, residue type is neglected
     • Used for:
       • measuring the structural similarity
       • protein classification and functional analysis
       • database searches

  4. Method
     • Many methods are known:
       • Distance matrix alignment (DALI, Holm & Sander, EBI)
       • Vector alignment (VAST, Bryant et al., NCBI)
       • Depth-first recursive search on secondary structure elements, SSEs (DEJAVU, Madsen & Kleywegt, Uppsala)
       • Combinatorial extension (CE, Shindyalov & Bourne, SDSC)
       • Dynamic programming on Cα (Gerstein & Levitt)
       • Dynamic programming on SSEs (SSA, Singh & Brutlag, Stanford University)
       • many others
     • SSM employs a 2-step procedure similar to VAST:
       • Initial structure alignment and superposition using SSE graph matching
       • Cα-alignment

  5. Graph representation of SSEs
     [Figure labels: r1, r2, α1, α2, τ, L]
     E. M. Mitchell et al. (1990) J. Mol. Biol. 212:151
     SSE graphs differ from conventional chemical graphs only in that they are labelled by vectors of properties. In graph matching, the labels are compared with tolerances chosen empirically.
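The label comparison described on slide 5 can be sketched in Python as follows. This is a minimal illustration, not SSM's implementation: it assumes a vertex label of (SSE type, length) and an edge label of one distance and three angles, and the class names, field layout and tolerance values are all hypothetical. The slide only states that tolerances are chosen empirically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSEVertex:
    kind: str      # "H" for helix, "S" for strand (assumed encoding)
    length: float  # SSE length in angstroms


@dataclass(frozen=True)
class SSEEdge:
    distance: float  # separation of the two SSEs in angstroms
    alpha1: float    # angle between the edge and the first SSE, degrees
    alpha2: float    # angle between the edge and the second SSE, degrees
    torsion: float   # torsion angle between the two SSEs, degrees


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def vertices_match(u: SSEVertex, v: SSEVertex, rel_len_tol: float = 0.3) -> bool:
    """Same SSE type and lengths within a relative tolerance (hypothetical value)."""
    if u.kind != v.kind:
        return False
    return abs(u.length - v.length) <= rel_len_tol * max(u.length, v.length)


def edges_match(e: SSEEdge, f: SSEEdge,
                dist_tol: float = 3.0, angle_tol: float = 30.0) -> bool:
    """All edge-label components within absolute tolerances (hypothetical values)."""
    return (abs(e.distance - f.distance) <= dist_tol
            and angle_diff(e.alpha1, f.alpha1) <= angle_tol
            and angle_diff(e.alpha2, f.alpha2) <= angle_tol
            and angle_diff(e.torsion, f.torsion) <= angle_tol)
```

Two vertices, or two edges, are treated as compatible during graph matching only when every component of their labels agrees within its tolerance. Looser tolerances admit more distant structural similarities at the cost of a larger search.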

  6. SSE graph matching
     [Figure: SSE graphs of structures A and B, with vertices labelled H1 to H6 (helices) and S1 to S7 (strands)]
     Matching the SSE graphs yields a correspondence between secondary structure elements, that is, groups of residues. The correspondence may be used as an initial guess for structure superposition and alignment of individual residues.

  7. Cα-alignment
     [Figure: chain A and chain B, with matched helices and matched strands]
     • SSE-alignment is used as an initial guess for Cα-alignment
     • Cα-alignment is an iterative procedure based on the expansion of shortest contacts at best superposition of structures
     • Cα-alignment is a compromise between the alignment length Nalign and r.m.s.d. Longest contacts are unmapped in order to maximise the Q-score (formula image not preserved)
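The trade-off in the last point of slide 7 can be illustrated with a short Python sketch. It rests on two assumptions that are not in the transcript: the Q-score takes the form given in the published SSM description, Q = Nalign² / ((1 + (r.m.s.d./R0)²) · N1 · N2) with R0 = 3 Å, and the superposition stays fixed while contacts are dropped, whereas the slide describes an iterative procedure. The function names are hypothetical.

```python
import math

R0 = 3.0  # angstroms; assumed from the published SSM description, not from the slides


def q_score(n_align: int, rmsd: float, n1: int, n2: int, r0: float = R0) -> float:
    """Assumed Q-score form: grows with alignment length, shrinks with r.m.s.d."""
    return n_align ** 2 / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)


def trim_contacts(distances, n1: int, n2: int):
    """Keep the k shortest contacts that maximise Q at a fixed superposition.

    distances: Calpha-Calpha distances of the candidate residue pairs.
    Returns the kept distances and the corresponding Q-score.
    """
    ordered = sorted(distances)
    best_k, best_q = 0, 0.0
    sum_sq = 0.0
    for k, d in enumerate(ordered, start=1):
        sum_sq += d * d                 # running sum of squared distances
        rmsd = math.sqrt(sum_sq / k)    # r.m.s.d. over the k shortest contacts
        q = q_score(k, rmsd, n1, n2)
        if q > best_q:
            best_k, best_q = k, q
    return ordered[:best_k], best_q


# Two long contacts (9 and 12 angstroms) lower Q and are therefore unmapped.
kept, q = trim_contacts([0.5, 0.8, 1.0, 1.2, 9.0, 12.0], n1=10, n2=10)
```

Adding a contact always raises Nalign, but a long contact raises r.m.s.d. enough to lower Q, so the maximum of Q marks the point at which further contacts stop paying for themselves.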

  8. Statistical significance of alignment
     [Figure labels: x1, xi, xn]
     • Based on the same ideas as P-value estimations in VAST
     • Uses individual Q-scores of SSE deviations
     • P(S) is the probability of obtaining a score equal to S or higher when picking structures at random from the PDB
     • P(S) is calibrated on SCOP folds
     • P(S) is often expressed through a Z-score
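One common way to express a P-value through a Z-score is via the tail of a standard normal distribution. The sketch below assumes the one-sided Gaussian convention P = 1 − Φ(Z). The slide does not state which convention SSM uses, so this is an illustration rather than SSM's definition, and the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_from_p(p: float) -> float:
    """Z such that a standard normal variable exceeds Z with probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return _STD_NORMAL.inv_cdf(1.0 - p)


def p_from_z(z: float) -> float:
    """One-sided tail probability of a standard normal variable above z."""
    return 1.0 - _STD_NORMAL.cdf(z)
```

Under this convention, a smaller P(S) maps to a larger Z. Because the code evaluates `1.0 - p` in floating point, it loses precision for extremely small P-values. It is meant for illustration only.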

  9. Multiple structure alignment
     • More than 2 structures are aligned simultaneously
     • Multiple alignment is not equal to a set of pairwise alignments
     • Helps to identify common structure motifs for a whole family of structures

  10. Algorithm of multiple alignment
      [Figure: structures A, B and C; SSEs annotated with the number of successful pairwise alignments (3, 2, 3, 1, 3) and the label "Do not align"]
      • Based on the analysis of SSE correspondences in the course of all-to-all pairwise alignments
      • SSEs whose number of successful pairwise alignments is less than the maximum are gradually removed in repeated iterations
      • Multiple SSE alignment is followed by multiple Cα-alignment in 3D
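The removal step on slide 10 can be sketched in Python under strong simplifying assumptions: pairwise SSE correspondences are supplied as a fixed list instead of being recomputed by re-aligning after each removal, and "gradually" is interpreted as dropping the lowest-scoring SSEs first. The function name and data layout are hypothetical.

```python
def prune_sses(sses, matches):
    """Iteratively drop SSEs matched in fewer pairwise alignments than the best ones.

    sses:    iterable of (structure, sse_name) identifiers
    matches: iterable of pairs of such identifiers, one per SSE correspondence
             found in a pairwise alignment
    Returns the set of surviving SSE identifiers.
    """
    alive = set(sses)
    while alive:
        # Count, for every surviving SSE, its matches to other surviving SSEs.
        counts = {s: 0 for s in alive}
        for a, b in matches:
            if a in alive and b in alive:
                counts[a] += 1
                counts[b] += 1
        top, low = max(counts.values()), min(counts.values())
        if low == top:
            break  # every surviving SSE is matched equally often
        # Remove only the weakest SSEs, then recount in the next iteration.
        alive -= {s for s, c in counts.items() if c == low}
    return alive


# Structures A and C share a strand S2 that B lacks; S2 is removed.
sses = [("A", "H1"), ("A", "S1"), ("A", "S2"),
        ("B", "H1"), ("B", "S1"),
        ("C", "H1"), ("C", "S1"), ("C", "S2")]
matches = [(("A", "H1"), ("B", "H1")), (("A", "H1"), ("C", "H1")),
           (("B", "H1"), ("C", "H1")), (("A", "S1"), ("B", "S1")),
           (("A", "S1"), ("C", "S1")), (("B", "S1"), ("C", "S1")),
           (("A", "S2"), ("C", "S2"))]
common = prune_sses(sses, matches)
```

Recounting after every removal matters because dropping one SSE lowers the counts of its former partners, which may in turn fall below the maximum in a later iteration.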

  11. SSM output
      • Table of matched Secondary Structure Elements
      • Table of matched backbone Cα-atoms with distances between them at best structure superposition
      • Rotation-translation matrix of best structure superposition
      • r.m.s.d. of Cα-alignment
      • Length of Cα-alignment Nalign
      • Number of gaps in Cα-alignment
      • Quality score Q
      • Statistical significance scores P(S), Z
      • Sequence identity

  12. SSM server map http://www.ebi.ac.uk/msd-srv/ssm

  13. SSM submission form
      http://www.ebi.ac.uk/msd-srv/ssm
      1. Set Query:
         • PDB/SCOP entry
         • Coordinate file
         • List of pairs
      2. Set Target:
         • PDB/SCOP entry
         • Coordinate file
         • PDB/SCOP archive
         • SCOP subset
         • User’s archive
      3. Select chain(s) or a domain
      4. Set similarity level
      5. Set match options
      6. Submit

  14. SSM wait page

  15. List of matches Match details

  16. Match details View in Rasmol Download superposed

  17. SSE alignment Links to Web services related to the structure

  18. Cα alignment

  19. Measuring the structure similarity
      • r.m.s.d.: a good measure only if all residues are aligned
      • Nalign: not indicative on its own; depends on the alignment procedure and parameters
      • Q-score: an attempt to balance Nalign and r.m.s.d. (formula image not preserved)
      • P(S), Z: indicate only the statistical significance of matches and depend on the calibration database
      • ????????????????

  20. Q-score vs. r.m.s.d. and Nalign (1SAR:A)

  21. Scoring at low structural similarity - 1KNO:A vs SCOP 1.61

      Criterion        Match    Size (res)  Q-score  RMSD  Nalign
      Maximal Q-score  d1di2a_  69          0.213    2.43  67/184
      Lowest RMSD      d1emn_1  43          0.019    0.9   13/184
      Highest Nalign   d1e1xb_  449         0.02     5.82  89/184
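The three Q-scores on slide 21 can be checked with a few lines of Python. The sketch assumes the Q-score form from the published SSM description, Q = Nalign² / ((1 + (RMSD/R0)²) · N1 · N2) with R0 = 3 Å, which the transcript does not reproduce. It also assumes that 184, the denominator in the Nalign values, is the length of the query chain 1KNO:A, and that N2 is the size of each SCOP domain.

```python
R0 = 3.0            # angstroms; assumed from the published SSM description
QUERY_LENGTH = 184  # assumed length of 1KNO:A, read from the "x/184" Nalign values


def q_score(n_align: int, rmsd: float, n1: int, n2: int, r0: float = R0) -> float:
    """Assumed Q-score form balancing alignment length against r.m.s.d."""
    return n_align ** 2 / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)


# (match, Nalign, RMSD, domain size, Q-score reported on the slide)
slide_rows = [
    ("d1di2a_", 67, 2.43, 69, 0.213),
    ("d1emn_1", 13, 0.90, 43, 0.019),
    ("d1e1xb_", 89, 5.82, 449, 0.02),
]

for name, n_align, rmsd, size, reported in slide_rows:
    computed = q_score(n_align, rmsd, QUERY_LENGTH, size)
    print(f"{name}: computed Q = {computed:.4f}, slide Q = {reported}")
```

Under these assumptions the computed values agree with the slide to within about 0.001. The comparison also shows why neither the lowest RMSD (13 aligned residues) nor the highest Nalign (RMSD 5.82) gives the best match: Q favours the hit that aligns almost all of a 69-residue domain at moderate RMSD.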

  22. Performance data

  23. Acknowledgement This work has been supported by Collaborative Computational Project Number 4 (CCP4) of the UK Biotechnology and Biological Sciences Research Council.

  24. Number of SSM users
