1 / 30

GigAssembler

GigAssembler. Genome Assembly: A big picture. http://www.nature.com/scitable/content/anatomy-of-whole-genome-assembly-20429. GigAssembler – Preprocessing. Decontaminating & Repeat Masking. Aligning of mRNAs, ESTs, BAC ends & paired reads against initial sequence contigs. psLayout → BLAT

tavia
Télécharger la présentation

GigAssembler

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. GigAssembler

  2. Genome Assembly: A big picture http://www.nature.com/scitable/content/anatomy-of-whole-genome-assembly-20429

  3. GigAssembler – Preprocessing • Decontaminating & Repeat Masking. • Aligning of mRNAs, ESTs, BAC ends & paired reads against initial sequence contigs. • psLayout → BLAT • Creating an input directory (folder) structure.

  4. http://www.triazzle.com; The image from http://www.dangilbert.com/port_fun.html Reference: Jones NC, Pevzner PA, Introduction to Bioinformatics Algorithms, MIT press

  5. RepBase + RepeatMasker

  6. GigAssembler: Build merged sequence contigs (“rafts”)

  7. Sequencing quality (Phred Score)

  8. Sequencing quality (Phred Score) Base-calling Error Probability http://en.wikipedia.org/wiki/Phred_quality_score

  9. GigAssembler: Build merged sequence contigs (“rafts”)

  10. GigAssembler: Build merged sequence contigs (“rafts”)

  11. GigAssembler: Build sequenced clone contigs (“barges”)

  12. GigAssembler: Build a “raft-ordering” graph

  13. GigAssembler: Build a “raft-ordering” graph • Add information from mRNAs, ESTs, paired plasmid reads, BAC end pairs: building a “bridge” • Different weight to different data type: (mRNA ~ highest) • Conflicts with the graph as constructed so far are rejected. • Build a sequence path through each raft. • Fill the gap with N. • 100: between rafts • 50,000: between bridged barges

  14. Bellman-Ford algorithm http://compprog.wordpress.com/2007/11/29/one-source-shortest-path-the-bellman-ford-algorithm/

  15. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C -2 +6 +8 -3 A +7 -4 +7 D E +2 +9

  16. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C -2 +6 +8 -3 A +7 -4 +7 D E +2 +9

  17. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C Inf. Inf. -2 +6 +8 -3 A +7 START -4 +7 D E +2 Inf. Inf. +9

  18. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C +6 (→ A) Inf. -2 +6 +8 -3 A 0 START +7 -4 +7 D E +2 +7 (→ A) Inf. +9

  19. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C +4 (→ D) +6 (→ A) -2 +6 +8 -3 A 0 START +7 -4 +7 D E +2 +7 (→ A) +2 (→ B) +9

  20. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C +4 (→ D) +2 (→ C) -2 +6 +8 -3 A 0 START +7 -4 +7 D E +2 +7 (→ A) +2 (→ B) +9

  21. Find the shortest path to all nodes. Take every edge and try to relax it (N – 1 times where N is the count of nodes) +5 B C +4 (→ D) +2 (→ C) -2 +6 +8 -3 A 0 START +7 -4 +7 D E +2 +7 (→ A) -2 (→ B) +9

  22. Answer: A-D-C-B-E +5 B C +4 (→ D) +2 (→ C) -2 +6 +8 -3 A 0 START +7 -4 +7 D E +2 +7 (→ A) -2 (→ B) +9

  23. Next-generation sequencing

  24. Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008 Illumina

  25. Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008 Illumina

  26. Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008 Roche/454

  27. Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008 SOLiD

  28. An example of single molecule DNA sequencing, from Helicos (approx. 1 billion reads / run) Pushkarev, D., N.F. Neff, and S.R. Quake. Nat Biotechnol (2009) 27, 847-50 Harris, T.D., et al. Science (2008) 320, 106-9

  29. Mapping program Trapnell C, Salzberg SL, Nat. Biotech., 2009

  30. Two strategies in mapping Trapnell C, Salzberg SL, Nat. Biotech., 2009

More Related