Optimal Sum of Pairs Multiple Sequence Alignment

Optimal Sum of Pairs Multiple Sequence Alignment. David Kelley. Dynamic Programming Extension. Standard pairwise sequence alignment methods can be extended to handle k strings. But… Runtime is O(2^k N^k), where k = number of sequences and N = average length of sequences. Space is O(N^k).


Presentation Transcript


  1. Optimal Sum of Pairs Multiple Sequence Alignment David Kelley

  2. Dynamic Programming Extension • Standard pairwise sequence alignment methods can be extended to handle k strings
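To make the extension concrete, here is a minimal sketch of the pairwise recurrence generalized to k strings. The scoring values (`MATCH`, `MISMATCH`, `GAP`) and the function names are assumptions chosen for illustration, not taken from the slides. Each cell has up to 2^k - 1 predecessors, one per non-empty choice of which sequences advance in a column. It is only practical for a few very short sequences.

```python
from functools import lru_cache
from itertools import combinations, product

# Assumed toy scoring scheme (not from the presentation).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of symbols in an alignment column."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_score(col):
    """Sum-of-pairs score of a single alignment column."""
    return sum(pair_score(a, b) for a, b in combinations(col, 2))


def msa_score(seqs):
    """Optimal sum-of-pairs score for k sequences by exhaustive DP."""
    k = len(seqs)
    # Every non-empty subset of sequences may advance: 2^k - 1 moves.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]

    @lru_cache(maxsize=None)
    def best(pos):
        if not any(pos):
            return 0  # origin cell: nothing aligned yet
        result = float("-inf")
        for m in moves:
            prev = tuple(p - d for p, d in zip(pos, m))
            if min(prev) < 0:
                continue  # move would step outside the lattice
            col = tuple(seqs[i][prev[i]] if m[i] else "-" for i in range(k))
            result = max(result, best(prev) + column_score(col))
        return result

    return best(tuple(len(s) for s in seqs))
```

For example, `msa_score(["ACG", "ACG", "ACG"])` evaluates every cell of a 4 x 4 x 4 lattice; the memoization table is what grows as N^k.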

  3. But… • Runtime is O(2^k N^k) • k = # of sequences • N = average length of sequences • Space is O(N^k) • Quickly becomes infeasible
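A quick back-of-the-envelope calculation shows how fast these bounds grow. The helper below is a hypothetical illustration that takes the runtime as (2^k - 1) · N^k cell updates and the space as N^k cells, ignoring constant factors and the extra boundary row per dimension.

```python
def full_dp_cost(k, n):
    """Return (cells, transitions) for an unpruned k-dimensional DP.

    cells       ~ N^k            (space)
    transitions ~ (2^k - 1) N^k  (time: predecessors examined per cell)
    """
    cells = n ** k
    transitions = (2 ** k - 1) * cells
    return cells, transitions


if __name__ == "__main__":
    for k in (2, 3, 6, 10):
        cells, transitions = full_dp_cost(k, 150)
        print(f"k={k:2d}  cells={cells:.3e}  transitions={transitions:.3e}")
```

With N = 150, going from 2 to 3 sequences multiplies the cell count by 150, and at k = 10 the table alone has on the order of 10^21 cells, far beyond any memory.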

  4. Enter Carrillo-Lipman • Lower bound the score of the optimal alignment • Estimate the best score attainable from a cell to the end • Use the sum of the optimal pairwise alignment scores from the cell to the end • If current score + estimate < lower bound • Ignore that path
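The pruning test can be sketched as follows, assuming a score-maximizing formulation with a linear gap penalty and gap-gap columns scoring zero. The scoring constants and all function names are assumptions made for illustration. For each pair of sequences a backward pairwise table gives the best score from any position to the end; summing those tables at a cell gives an upper bound on what any multiple alignment can still gain from that cell.

```python
from itertools import combinations

# Assumed toy scoring scheme (not from the presentation).
MATCH, MISMATCH, GAP = 1, -1, -2


def suffix_scores(a, b):
    """t[i][j] = best pairwise score for aligning a[i:] with b[j:]."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        t[i][m] = t[i + 1][m] + GAP  # rest of a against gaps
    for j in range(m - 1, -1, -1):
        t[n][j] = t[n][j + 1] + GAP  # rest of b against gaps
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            sub = MATCH if a[i] == b[j] else MISMATCH
            t[i][j] = max(t[i + 1][j + 1] + sub,
                          t[i + 1][j] + GAP,
                          t[i][j + 1] + GAP)
    return t


def build_tables(seqs):
    """One backward table per pair of sequences."""
    return {(i, j): suffix_scores(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}


def upper_bound_to_end(tables, pos):
    """Sum of optimal pairwise scores from cell `pos` to the end."""
    return sum(t[pos[i]][pos[j]] for (i, j), t in tables.items())


def can_prune(score_so_far, tables, pos, lower_bound):
    """True if no path through `pos` can reach the known lower bound."""
    return score_so_far + upper_bound_to_end(tables, pos) < lower_bound
```

The pairwise tables cost only O(k^2 N^2) to build, which is negligible next to the k-dimensional lattice they help to skip.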

  5. MSA • Implemented in the 1989 program MSA • Used a simple progressive alignment procedure to obtain a lower bound • “generally can align 6 to 8 sequences of length 200-300 residues”

  6. Gupta 1995 update • Re-implemented MSA more efficiently • Uses a star-tree heuristic for the lower bound • Ran on a Sun SPARCstation 10 with 128 MB of RAM • Runtimes varied (also depending on the similarity of the sequences) • 10 Globin B proteins of ~150 amino acids took 10 min

  7. Can we do better? • Better hardware • more RAM • multi-core processors • Better heuristics • MUSCLE and MAFFT are very fast and accurate • A higher (tighter) lower bound means more of the matrix can be ignored

  8. My Project • Implement concepts from Carrillo-Lipman • Use MUSCLE for the lower bound • Look for opportunities to parallelize • Using OpenMP • Run on modern hardware
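Any valid alignment of the input gives a lower bound on the optimal sum-of-pairs score, so the output of a heuristic aligner can be scored directly. The sketch below assumes the heuristic alignment is already available as equal-length gapped rows; it does not call MUSCLE itself. The scoring constants and function names are illustrative assumptions.

```python
from itertools import combinations

# Assumed toy scoring scheme (not from the presentation).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of symbols; gap-gap pairs contribute nothing."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(rows):
    """Sum-of-pairs score of an existing alignment given as gapped rows."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return sum(pair_score(a, b)
               for col in zip(*rows)
               for a, b in combinations(col, 2))


# Rows as they might come from a heuristic aligner (made-up example data).
heuristic_rows = ["AC-T", "ACGT", "A-GT"]
lower_bound = sum_of_pairs(heuristic_rows)
```

The better the heuristic alignment, the closer `lower_bound` sits to the optimum and the more cells the pruning test can discard.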

  9. Can optimal alignment be made practical? • How much better can we do than the previous attempts? • How will maximizing sum of pairs compare to more popular alignment programs? • Compare on the multiple sequence alignment benchmark database BAliBASE
