
A New Approach for Alignment of Multiple Proteins

Adam Hebdon. Zhang, Xu, and Kahveci, Tamer, 2006, “A New Approach for Alignment of Multiple Proteins”, Pacific Symposium on Biocomputing, 11:339-350.


Presentation Transcript


  1. A New Approach for Alignment of Multiple Proteins
     • Adam Hebdon
     • Zhang, Xu, and Kahveci, Tamer, 2006, “A New Approach for Alignment of Multiple Proteins”, Pacific Symposium on Biocomputing, 11:339-350.

  2. Why Do We Need A New Approach?
     • Multiple sequence alignment is one of the most fundamental problems in bioinformatics
     • Used to make predictions, relate protein sequences, etc.
     • More difficult when sequences are dissimilar
     • Traditional alignment methods add sequences one by one
       - Progressive
       - Iterative
       - Anchor-based
       - Probabilistic
     • The number of sequences significantly affects the quality of the resulting alignment

  3. Why Do We Need A New Approach?
     • Progressive
       - Never more than two sequences simultaneously aligned
       - Allows alignments of any size
     • Iterative
       - Start with an initial alignment
       - Repeatedly refine the alignment through a series of iterations until no further improvement is made

  4. Why Do We Need A New Approach?
     • Anchor-Based
       - Use local motifs as anchors for alignment
       - Unaligned segments aligned using other methods
       - MAFFT, Align-m, L-align, Mavid, PRRP, etc.
     • Probabilistic
       - Analyze known multiple alignments & pre-compute substitution probabilities
       - Maximize alignment for given sequences

  5. Horizontal Sequence Alignment (HSA)
     • All proteins are considered at once and aligned simultaneously
     • Particularly accurate in the “twilight zone”, where other methods are not
     • Twilight zone = sequence identity below 25%
     • HSA performs similarly to traditional methods at high identity percentages
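The 25% threshold on the slide can be made concrete with a small sketch. It assumes one common definition of percent identity (matching residues divided by the number of columns where neither sequence has a gap); other definitions use a different denominator, and the slides do not say which one the authors used. The function names are illustrative only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free aligned columns in which both residues match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are ignored.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def in_twilight_zone(a: str, b: str, cutoff: float = 25.0) -> bool:
    """True when identity falls below the cutoff quoted on the slide."""
    return percent_identity(a, b) < cutoff
```

For example, two aligned sequences of length 8 that agree in a single column have 12.5% identity and would count as twilight-zone under this definition.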

  6. Horizontal Sequence Alignment (HSA)
     • Construct initial directed graph from AA sequence, secondary structure, etc.
     • Group vertices based on residue type
     • Insert gap vertices to align similar vertices topologically
     • Determine alignment from high-scoring cliques with a sliding window
     • Adjust gap vertices of the initial alignment

  7. Structure of Graph
     [Figure labels: sequences & associated secondary structure; vertices representing secondary structure & gaps; horizontal graph of the sequences, one color per protein]

  8. HSA Step 1:
     • A directed edge corresponds to consecutive AAs in each protein
     • An undirected edge connects vertices whose substitution score is sufficient
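A minimal sketch of the graph described in Step 1. The scoring function `toy_score`, the `threshold` value, and the restriction of undirected edges to residues from different proteins are all assumptions made for illustration; the slides do not specify the substitution matrix or the cutoff.

```python
from itertools import combinations


def toy_score(a: str, b: str) -> int:
    """Hypothetical substitution score: +2 for identical residues, -1 otherwise."""
    return 2 if a == b else -1


def build_graph(proteins, score=toy_score, threshold=1):
    """Vertices are (protein_index, position) pairs."""
    directed = []    # consecutive residues within one protein
    undirected = []  # sufficiently similar residues in different proteins
    for p, seq in enumerate(proteins):
        for i in range(len(seq) - 1):
            directed.append(((p, i), (p, i + 1)))
    for (p, s), (q, t) in combinations(enumerate(proteins), 2):
        for i, a in enumerate(s):
            for j, b in enumerate(t):
                if score(a, b) >= threshold:
                    undirected.append(((p, i), (q, j)))
    return directed, undirected
```

Calling `build_graph(["AC", "CA"])` yields one directed edge per protein and two undirected edges linking the matching A and C residues across the two proteins.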

  9. HSA Step 2:
     • Group fragments most likely to be aligned together
     • totalScore = typeScore - positionPenalty - lengthPenalty
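The scoring formula can be illustrated as follows. The slide gives only the formula, not how `typeScore`, `positionPenalty`, or `lengthPenalty` are computed, so this sketch takes them as precomputed inputs; the `Candidate` class and `best_candidate` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate fragment grouping with its precomputed score components."""
    name: str
    type_score: float
    position_penalty: float
    length_penalty: float

    @property
    def total_score(self) -> float:
        # totalScore = typeScore - positionPenalty - lengthPenalty
        return self.type_score - self.position_penalty - self.length_penalty


def best_candidate(candidates):
    """Pick the grouping with the highest total score."""
    return max(candidates, key=lambda c: c.total_score)
```

A grouping with a high type score can still lose to a lower-scoring one if its fragments sit far apart or differ greatly in length, which is the trade-off the two penalty terms express.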

  10. HSA Step 3:
     • Update the graph with gaps so that like vertices are close topologically

  11. HSA Step 4:
     • Place a window over all sequences, where w = number of vertices to cover
       - Ex: w = 3, window covers the first 3 vertices of each sequence
     • For each clique, align the letters of the clique and find the next best clique
       - Clique = complete sub-graph consisting of one vertex of each color (1 column of the multiple alignment)
     • Slide the window down & find the next clique that:
       - Doesn’t conflict with the previous clique
       - Has 1 letter next to a letter in the previous clique
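The window-and-clique search in Step 4 can be sketched as a greedy loop. This is a simplification under stated assumptions, not the authors' implementation: `pair_score` and `threshold` are hypothetical, cliques are scored by the sum of pairwise scores, the search is brute force over the window, and "no conflict" is enforced by starting the next window just after the previous clique in every sequence.

```python
from itertools import combinations, product


def pair_score(a: str, b: str) -> int:
    """Hypothetical substitution score: +2 for a match, -1 otherwise."""
    return 2 if a == b else -1


def best_clique(seqs, starts, w, prev=None, threshold=1):
    """Best-scoring clique (one position per sequence) in the window, or None."""
    windows = [range(s, min(s + w, len(seq))) for s, seq in zip(starts, seqs)]
    best, best_score = None, None
    for positions in product(*windows):
        letters = [seq[i] for seq, i in zip(seqs, positions)]
        scores = [pair_score(a, b) for a, b in combinations(letters, 2)]
        if any(sc < threshold for sc in scores):
            continue  # some pair lacks an edge: not a complete sub-graph
        if prev is not None and not any(i == j + 1 for i, j in zip(positions, prev)):
            continue  # no letter is adjacent to a letter of the previous clique
        total = sum(scores)
        if best_score is None or total > best_score:
            best, best_score = positions, total
    return best


def align_columns(seqs, w=3):
    """Greedily collect cliques; each clique is one column of the alignment."""
    starts = [0] * len(seqs)
    prev, columns = None, []
    while all(s < len(seq) for s, seq in zip(starts, seqs)):
        clique = best_clique(seqs, starts, w, prev)
        if clique is None:
            break
        columns.append(clique)
        prev = clique
        starts = [i + 1 for i in clique]  # later cliques cannot cross this one
    return columns
```

Each returned tuple gives, per sequence, the index of the residue placed in that alignment column. Residues skipped between consecutive cliques would be paired with gaps in the final alignment.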

  12. HSA Step 5:
     • Move gaps found inside a fragment of an alpha-helix or beta-sheet outside the fragment
     • The final alignment is obtained by mapping each vertex in the final graph back to its original residue
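A rough sketch of the gap adjustment for a single aligned row. It assumes a per-residue secondary-structure string using `H` for helix and `E` for strand, treats a maximal run of identical structured labels as one fragment, and pushes interior gaps to just after the fragment; none of these choices come from the slides. A full implementation would also have to keep the other rows of the alignment consistent.

```python
def move_gaps_out(row: str, ss: str, structured: str = "HE") -> str:
    """row: aligned sequence with '-' gaps; ss: one label per residue (no gaps)."""
    out = []
    pending = 0        # gaps held back because they may lie inside a fragment
    prev_label = None  # label of the most recent residue
    labels = iter(ss)
    for ch in row:
        if ch == "-":
            if prev_label is not None and prev_label in structured:
                pending += 1
            else:
                out.append("-")
            continue
        label = next(labels)
        if label != prev_label and pending:
            out.append("-" * pending)  # fragment ended: release the held gaps
            pending = 0
        out.append(ch)
        prev_label = label
    out.append("-" * pending)
    return "".join(out)
```

With `row = "AC--DEFG"` and `ss = "HHHHCC"`, the two gaps that split the helix are moved to its end, giving `"ACDE--FG"`; the row length is unchanged.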

  13. Observations of HSA
     • Identity = 0-20%
       - HSA method outperforms traditional alignment methods significantly
     • Identity = 20-40%
       - HSA method is comparable to traditional alignment methods
     • Identity = 40-100%
       - Little improvement can be made over traditional alignment
       - HSA performs slightly under traditional methods
