Multiple Alignments

Multiple Alignments. Rhys Price Jones and Anne Haake, Rochester Institute of Technology (rpjavp@rit.edu, arh@it.rit.edu). The computational aspects. Textbook Chapter 3, pp 69-80.


Presentation Transcript


  1. Multiple Alignments Rhys Price Jones and Anne Haake Rochester Institute of Technology rpjavp@rit.edu, arh@it.rit.edu

  2. Multiple Alignments • The computational aspects • Textbook Chapter 3, pp 69-80 • An excellent gentle online tutorial: http://www.techfak.uni-bielefeld.de/bcd/Curric/MulAli/mulali.html

  3. Similarity to Pairwise Alignment • Pairwise alignment can be achieved via a dynamic programming technique that fills a 2-dimensional matrix • An alignment of three sequences can be achieved by applying a dynamic programming technique and filling a 3-dimensional matrix • A Java visualization tool: http://bibiserv.techfak.uni-bielefeld.de/visualign/ • Aligning k sequences would require filling a k-dimensional matrix.
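
To make the 3-dimensional case concrete, here is a minimal Python sketch of a dynamic program that fills a 3-dimensional table for three sequences. It is an illustration under assumptions, not the method from the textbook or the linked tool: the function names are made up for this example, columns are scored by summing the three pairwise scores (match 2, mismatch -1, gap -2), and a gap paired with a gap is assumed to score 0.

```python
from itertools import product

MATCH, MISMATCH, GAP = 2, -1, -2  # illustrative pairwise scores


def pair_score(a, b):
    """Score one pair of symbols; gap-vs-gap is assumed to cost 0."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_score(col):
    """Sum the three pairwise scores of a 3-symbol column."""
    x, y, z = col
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)


def align3_score(s, t, u):
    """Best sum-of-pairs score over all alignments of s, t and u."""
    neg_inf = float("-inf")
    dp = [[[neg_inf] * (len(u) + 1) for _ in range(len(t) + 1)]
          for _ in range(len(s) + 1)]
    dp[0][0][0] = 0
    # The 7 possible moves: each sequence either consumes a symbol (1)
    # or receives a gap (0), excluding the all-gap column.
    moves = [m for m in product((0, 1), repeat=3) if any(m)]
    for i in range(len(s) + 1):
        for j in range(len(t) + 1):
            for k in range(len(u) + 1):
                if i == j == k == 0:
                    continue
                best = neg_inf
                for di, dj, dk in moves:
                    pi, pj, pk = i - di, j - dj, k - dk
                    if pi < 0 or pj < 0 or pk < 0:
                        continue
                    col = (s[pi] if di else "-",
                           t[pj] if dj else "-",
                           u[pk] if dk else "-")
                    best = max(best, dp[pi][pj][pk] + column_score(col))
                dp[i][j][k] = best
    return dp[len(s)][len(t)][len(u)]


print(align3_score("ACTG", "AGG", "CCTG"))
```

Each cell looks back at 7 neighbours (2^3 - 1), compared with 3 in the pairwise case; for k sequences that becomes 2^k - 1, on top of the growth in table size.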

  4. How do you score a multiple alignment? • Sequences ACTG, AGG, CCTG aligned as ACTG / A-GG / CCTG • Pairwise, say match 2, mismatch -1, gap -2 • column 1: AA AC AC, 2 + -1 + -1 = 0 • column 2: C- -C CC, -2 + -2 + 2 = -2 • column 3: TG GT TT, -1 + -1 + 2 = 0 • column 4: GG GG GG, 2 + 2 + 2 = 6 • overall score is then 0 + -2 + 0 + 6 = 4. • New feature: a column can pair two gaps, so a gap-gap score is needed!
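
The column-by-column arithmetic on this slide can be reproduced with a short Python sketch. The function names are made up for this example, and the gap-gap score of 0 is an assumption (the slide only notes that such a score is needed; the example alignment never uses it).

```python
from itertools import combinations


def pair_score(a, b, match=2, mismatch=-1, gap=-2, gap_gap=0):
    """Score one pair of symbols; gap_gap=0 is an assumed convention."""
    if a == "-" and b == "-":
        return gap_gap
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def column_scores(rows):
    """Sum-of-pairs score of every column of an alignment."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    return [sum(pair_score(a, b) for a, b in combinations(col, 2))
            for col in zip(*rows)]


alignment = ["ACTG", "A-GG", "CCTG"]
scores = column_scores(alignment)
print(scores)       # [0, -2, 0, 6]
print(sum(scores))  # 4
```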

  5. How do you score a multiple alignment? • Scoring along a tree: only the pairwise alignment scores of sequences that are neighbors in an (evolutionary) tree are summed to give the overall score.
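
A minimal Python sketch of scoring along a tree, under simplifying assumptions: the tree is given as a list of edges between row indices, every tree node is one of the aligned sequences (no inferred ancestors), and the pairwise scores are match 2, mismatch -1, gap -2, with gap-gap assumed to be 0. The edge list below is hypothetical and chosen only for illustration.

```python
MATCH, MISMATCH, GAP, GAP_GAP = 2, -1, -2, 0  # GAP_GAP = 0 is an assumption


def pair_score(a, b):
    """Score one pair of aligned symbols."""
    if a == "-" and b == "-":
        return GAP_GAP
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def row_pair_score(row_a, row_b):
    """Score the pairwise alignment induced by two rows."""
    return sum(pair_score(a, b) for a, b in zip(row_a, row_b))


def tree_score(rows, edges):
    """Sum pairwise scores only over rows that are neighbors in the tree."""
    return sum(row_pair_score(rows[i], rows[j]) for i, j in edges)


rows = ["ACTG", "A-GG", "CCTG"]
edges = [(0, 1), (1, 2)]  # hypothetical tree: row 0 - row 1 - row 2
print(tree_score(rows, edges))
```

With these edges the pair (row 0, row 2) is left out; adding it as a third edge gives back the sum over all pairs.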

  6. How do you score a multiple alignment? • Minimum Entropy • Basic idea is that the fewer bits necessary to specify a column, the better the score
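
One common way to read "bits necessary to specify a column" is the Shannon entropy of the symbol frequencies in that column. The Python sketch below follows that reading; it is an assumption rather than the slide's exact definition, it treats the gap character as just another symbol, and the function names are made up for this example. Lower totals mean more uniform columns.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the symbol frequencies in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def alignment_entropy(rows):
    """Sum of column entropies; lower is better under this score."""
    return sum(column_entropy(col) for col in zip(*rows))


print(column_entropy("GGG"))  # fully conserved column
print(column_entropy("AAC"))  # two symbols, unevenly split
print(alignment_entropy(["ACTG", "A-GG", "CCTG"]))
```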

  7. How do you score a multiple alignment? • In summary • There are many scoring functions used to evaluate combinations of residues either in single edit operations, or whole pairwise comparisons, or whole multiple columns. • Be sure you know what you’re doing. • We’ll look soon at options for CLUSTAL-W

  8. The Dynamic Programming Approach • Analysis: k sequences of length n require filling an n^k-element multidimensional array. • To compare 1000-nucleotide putative genes in 12 species, the array would have 1000^12 = 10^36 entries. • mega, giga, tera, peta, exa, zetta, yotta (10^24)... • http://www.sdsc.edu/GatherScatter/gsq394/gsq3_f1.html
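
The size estimate is n raised to the power k, which a few lines of Python can check for the slide's example of 12 sequences of length 1000. The helper name is made up for this example, and the count ignores the extra boundary row per dimension that a real table would carry.

```python
def dp_cells(n, k):
    """Approximate number of table entries for k sequences of length n."""
    return n ** k


cells = dp_cells(1000, 12)
YOTTA = 10 ** 24  # largest prefix in the slide's list

print(cells == 10 ** 36)         # True
print(cells // YOTTA == 10 ** 12)  # True: a trillion "yotta-entries"
```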

  9. Heuristic • Try to find a smallish area within that huge multidimensional array • The Carrillo-Lipman bound is discussed on page 19 of http://www.techfak.uni-bielefeld.de/bcd/Curric/MulAli/mulali.html • But even so...

  10. Practical Algorithms • Programs such as ClustalW or T-Coffee: • compute all pairwise alignment scores • from those, create a guide tree • successively align pairs of sequences and already-computed alignments until one large multiple alignment remains
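
The first two steps can be sketched in Python as follows. This is a simplified illustration, not how ClustalW or T-Coffee are implemented: pairwise scores come from a plain Needleman-Wunsch recurrence with match 2, mismatch -1, gap -2, and the guide tree is built by greedily joining the pair of clusters with the highest average pairwise score. Real programs use more refined distance measures and tree-building methods, and the final step (aligning alignments along the tree) is omitted here. All function names are made up for this example.

```python
from itertools import combinations


def nw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Optimal global pairwise alignment score (Needleman-Wunsch)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def guide_tree(seqs):
    """Greedy average-linkage join order, returned as nested index tuples."""
    score = {frozenset((i, j)): nw_score(seqs[i], seqs[j])
             for i, j in combinations(range(len(seqs)), 2)}

    def avg(c1, c2):
        # Average pairwise score between the members of two clusters.
        pairs = [(i, j) for i in c1[1] for j in c2[1]]
        return sum(score[frozenset(p)] for p in pairs) / len(pairs)

    # Each cluster is (subtree, member indices).
    clusters = [(i, (i,)) for i in range(len(seqs))]
    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2), key=lambda p: avg(*p))
        clusters = [c for c in clusters if c not in (x, y)]
        clusters.append(((x[0], y[0]), x[1] + y[1]))
    return clusters[0][0]


seqs = ["ACTG", "ACTG", "TTTT"]
print(guide_tree(seqs))  # the two identical sequences are joined first
```

The nested tuple gives the order in which a progressive aligner would merge sequences and partial alignments.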

  11. Pro: • fast, even with many long sequences • align subfamily motifs well

  12. Con: • early errors persist – they don’t go away • a possibly erroneous guide tree retains its perfidious influence in the alignment and may prejudice use of the alignment for phylogeny studies • it’s hard to know how good or bad the alignment is

  13. We will return • to these issues

  14. In the meantime • Let’s look at the help for ClustalX • http://www-igbmc.u-strasbg.fr/BioInfo/ClustalX/Top.html • In particular we’ll look at the options and why they exist
