
Pair HMM and the Stepping Stone algorithm

Pair HMM and the Stepping Stone algorithm. Mani Right Now. Our Pair HMM state diagram. Viterbi algorithm. Time complexity O(mns^2), space complexity O(mns), where m is the length of the genomic sequence, n is the length of the cDNA sequence, and s is the number of states (~13 now).

neena

Presentation Transcript


  1. Pair HMM and the Stepping Stone algorithm Mani Right Now

  2. Our Pair HMM state diagram

  3. Viterbi algorithm • Time complexity O(mns^2), space complexity O(mns) • m – length of genomic sequence • n – length of cDNA sequence • s – number of states (~13 now) • Suffice it to say that mns^2 can be a HUGE number!
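A minimal sketch of where the O(mns^2) time and O(mns) space come from. The slides do not list the ~13 states or their parameters, so this uses a hypothetical 3-state pair HMM (M, X, Y) with made-up probabilities; `pair_viterbi`, `MOVES`, `TRANS`, and `emit` are illustrative names, not part of the presented system.

```python
import math

NEG_INF = float("-inf")


def lg(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


# Hypothetical 3-state model (an assumption, not the ~13-state model on the slide):
# M consumes one symbol from each sequence, X only from x, Y only from y.
MOVES = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}
TRANS = {
    "M": {"M": lg(0.8), "X": lg(0.1), "Y": lg(0.1)},
    "X": {"M": lg(0.7), "X": lg(0.3), "Y": lg(0.0)},
    "Y": {"M": lg(0.7), "X": lg(0.0), "Y": lg(0.3)},
}


def emit(state, a, b):
    """Log emission probability; the values are illustrative only."""
    if state == "M":
        return lg(0.2) if a == b else lg(0.2 / 12)
    return lg(0.25)


def pair_viterbi(x, y):
    """Return (best log score, number of innermost-loop steps)."""
    m, n = len(x), len(y)
    states = list(MOVES)
    # One score per (i, j, state): O(m * n * s) space.
    V = [[dict.fromkeys(states, NEG_INF) for _ in range(n + 1)]
         for _ in range(m + 1)]
    V[0][0]["M"] = 0.0  # start in M before any symbol is consumed
    steps = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            for k in states:  # state entered at cell (i, j)
                di, dj = MOVES[k]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                best = NEG_INF
                for prev in states:  # predecessor state: the second factor of s
                    steps += 1
                    best = max(best, V[pi][pj][prev] + TRANS[prev][k])
                a = x[i - 1] if di else None
                b = y[j - 1] if dj else None
                V[i][j][k] = best + emit(k, a, b)
    return max(V[m][n].values()), steps


score, steps = pair_viterbi("ACGT", "ACGT")
print(score, steps)
```

The four nested loops (genomic position, cDNA position, current state, previous state) are what give the m·n·s·s operation count; this sketch keeps only scores, so a traceback would need an additional pointer per cell.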

  4. Viterbi Matrix Genomic sequence cDNA sequence

  5. Stepping stone – Seed Alignments Genomic sequence cDNA sequence

  6. Stepping Stone - Alignment Pins Genomic sequence cDNA sequence

  7. Viterbi Submatrices Savings: approx. 50% Genomic sequence cDNA sequence
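The effect of pins on the amount of Viterbi work can be illustrated with a small cell count. This is a sketch under an assumption the slides do not spell out: that Viterbi is run only on the rectangles between consecutive pins, with the matrix corners acting as implicit first and last pins. The function names and the example pin positions are hypothetical.

```python
def submatrix_cells(m, n, pins):
    """Count matrix cells covered when Viterbi runs only between consecutive pins.

    m, n : lengths of the genomic and cDNA sequences
    pins : list of (genomic_pos, cdna_pos) pairs the alignment is forced through
    """
    corners = [(0, 0)] + sorted(pins) + [(m, n)]
    total = 0
    for (i0, j0), (i1, j1) in zip(corners, corners[1:]):
        if i1 < i0 or j1 < j0:
            raise ValueError("pins must increase in both sequences")
        total += (i1 - i0) * (j1 - j0)
    return total


def savings(m, n, pins):
    """Fraction of the full m x n matrix that is skipped."""
    return 1 - submatrix_cells(m, n, pins) / (m * n)


print(savings(100, 100, [(50, 50)]))
print(savings(100, 100, [(25, 25), (50, 50), (75, 75)]))
```

Under this assumption the saving depends entirely on how many pins there are and where they fall, so the figure reported on the slide reflects the authors' own pin placement rather than anything this sketch reproduces.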

  8. MGC test set • 10634 optimal alignments • 18000+ stepping stone alignments* • Compared the 10634 • Only 15 were different (0.14%!) * Still running

  9. Diff1

     BC043644.12665.   474 TAGTAGAGGCGGGGTTTCTCCATGTTGGTCAGGCTGGTCTCGAAATCCCG 523
                           |||||||| | ||||||| ||||||||| ||||||||||| ||| ||| |
     BC043644         1556 TAGTAGAGACAGGGTTTCACCATGTTGGCCAGGCTGGTCTTGAACTCCTG 1605
     BC043644.12665.   524 ACCTCAGGTGATCTGCCCACCTCAGCCTCCCAAAGTGCTGGGATT 568
                           |||||||||||||  |||||||  || ||||| ||||||||||||
     BC043644         1606 ACCTCAGGTGATCCACCCACCTGGGCTTCCCATAGTGCTGGGATT 1650

     BC043644.12665.   474 TAGTAGAGGCGGGGTTTCTCCATGTTGGTCAGGCTGGTCTCGAAATCCCG 523
                           |||||||| | ||||||| ||||||||| ||||||||||| ||| ||| |
     BC043644         1556 TAGTAGAGACAGGGTTTCACCATGTTGGCCAGGCTGGTCTTGAACTCCTG 1605
     BC043644.12665.   524 ACCTCAGGTGATCTGCCCA------------------------------- 542
                           |||||||||||||  ||||
     BC043644         1606 ACCTCAGGTGATCCACCCACCTGGGCTTCCCATAGTGCTGGGATTCAATT 1655
     --//--
     BC043644.12665.   543 ----------------------------------------------CCTC 546
                                                                         ||||
     BC043644         3506 GTTGGCCAGGCTGGTCTCGAACTCCTGACATCAGGTGATCCACCTGCCTC 3555
     BC043644.12665.   547 AGCCTCCCAAAGTGCTGGGATTAGAGGCGTGAACCAC 583
                            |||||||||||||||||||||| |||||||| ||||
     BC043644         3556 GGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCAC 3592

  10. Diff2

     BC011727.13323. 18583 GGACTGATGAGGTCTTAACAAAAACCAGTGTGGCAAAAAAAAAAAAAAAA 18632
                           ||||||||||||||||||||||||||||||||||||||||||||||||||
     BC011727         1857 GGACTGATGAGGTCTTAACAAAAACCAGTGTGGCAAAAAAAAAAAAAAAA 1906
     BC011727.13323. 18633 AAAAAAAAAAAAA 18645
                           |||||||||||||
     BC011727         1907 AAAAAAAAAAAAA 1919

     Score = 39 (77.8 bits), Expect = 0., Sum P(8) = 0., Group = 1
     Identities = 39/39 (100%), Positives = 39/39 (100%), Strand = Plus / Plus
     Query:  1917 AAAAAAAAAAAAAAAAAAAAAAAAAAAAATCCTAAAAAC 1955
                  |||||||||||||||||||||||||||||||||||||||
     Sbjct: 18617 AAAAAAAAAAAAAAAAAAAAAAAAAAAAATCCTAAAAAC 18655

     BC011727.13323. 18583 GGACTGATGAGGTCTTAACAAAAACCAGTGTGGCAAAAAAAAAAAAAAAA 18632
                           ||||||||||||||||||||||||||||||||||||||||||||||||||
     BC011727         1857 GGACTGATGAGGTCTTAACAAAAACCAGTGTGGCAAAAAAAAAAAAAAAA 1906
     BC011727.13323. 18633 AAAAAAAAAAAAATCCTAAAAACAAACAAACAAAAAAAA 18671
                           |||||||||||||    ||||| ||| ||| ||||||||
     BC011727         1907 AAAAAAAAAAAAA----AAAAAAAAAAAAAAAAAAAAAA 1941

  11. Diff3

  12. Diff3 – cont’d

  13. Placeholder Genomic sequence cDNA sequence

  14. Now what? • Compare with EST_GENOME • Null model • Use pairHMM for the ENCODE gene prediction workshop • Double pins
