Genomics Quick Start

Authors: Mikhail Dvorkin, Vladislav Isenbaev, Eugene Kapun. Scientific advisors: Acad. Konstantin Skryabin (Bioengineering RAS) and Prof. Anatoly Shalyto (SPbSU ITMO). A collaboration with Bioengineering RAS, which conducts biological experiments and sets problems.

Presentation Transcript


  1. Genomics Quick Start • Mikhail Dvorkin, Vladislav Isenbaev, Eugene Kapun • Scientific advisors: Acad. Konstantin Skryabin (Bioengineering RAS), Prof. Anatoly Shalyto (SPbSU ITMO)

  2. Collaboration with Bioengineering RAS • Bioengineering RAS: conducts biological experiments; sets problems; provides biological data • SPbSU ITMO: develops algorithms and programs • Started at the end of 2009 • Why us? • SPbSU ITMO: Genomics Quick Start

  3. SPbSU ITMO at ACM ICPC • We train Zürich ETH • Maybe MIT? :-) • Eugene Kapun, Vladislav Isenbaev, Mikhail Dvorkin, Georgiy Korneev

  4. Mikhail Dvorkin

  5. Eugene Kapun, Vladislav Isenbaev

  7. Genome Team • Coach: Georgiy Korneev • Members: Mikhail Dvorkin, Vladislav Isenbaev, Eugene Kapun

  8. Problems Being Solved • De novo DNA assembly based on paired reads: generalized suffix tree traversal; reduction to single reads • DNA alignment with transfers

  9. DNA Assembly 1: Generalized suffix tree traversal

  10. Suffix Tree • Built from the reads • Arc weight: number and quality of reads • Possible extensions • Detection of erroneous nucleotides
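The weighted-arc idea on this slide can be illustrated with a small sketch. It is not the authors' implementation: it uses a plain, uncompressed suffix trie instead of a true generalized suffix tree, counts only the number of reads on each arc (read quality is ignored), and the names `build_suffix_trie` and `extensions` are hypothetical.

```python
def build_suffix_trie(reads, max_depth=None):
    """Build a nested-dict trie of all read suffixes.

    Each node stores "count": how many read suffixes pass through the arc
    leading into it. This stands in for the slide's arc weight (assumption:
    quality scores are not modelled here).
    """
    root = {"count": 0, "children": {}}
    for read in reads:
        for start in range(len(read)):
            end = len(read) if max_depth is None else start + max_depth
            node = root
            for base in read[start:end]:
                child = node["children"].setdefault(
                    base, {"count": 0, "children": {}}
                )
                child["count"] += 1
                node = child
    return root


def extensions(trie, prefix):
    """Return {base: weight} for the arcs leaving the node spelled by prefix."""
    node = trie
    for base in prefix:
        node = node["children"].get(base)
        if node is None:
            return {}
    return {base: child["count"] for base, child in node["children"].items()}


reads = ["ACGT", "CGTA", "CGTT"]
trie = build_suffix_trie(reads)
print(extensions(trie, "CG"))   # one heavily supported extension
print(extensions(trie, "CGT"))  # two competing extensions (a branch)
```

Under these assumptions, a lightly weighted arc next to a heavily weighted sibling is a natural candidate for an erroneous nucleotide, which is one way to read the last bullet.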

  11. Building up a Contig • Start with a high-quality read • Use paired reads to select a nucleotide: "backward" matches the past; "forward" matches the future • Build up to a branch
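A minimal sketch of the "build up to a branch" step follows. It is a simplification under stated assumptions: the seed is taken as given, candidates are found by scanning the reads directly rather than traversing a suffix tree, and the paired-read "backward"/"forward" selection from the slide is left out entirely. The function names and the parameter `k` (context length) are hypothetical.

```python
def successors(reads, context):
    """Count which bases follow `context` across all reads."""
    counts = {}
    for read in reads:
        pos = read.find(context)
        while pos != -1:
            nxt = pos + len(context)
            if nxt < len(read):
                counts[read[nxt]] = counts.get(read[nxt], 0) + 1
            pos = read.find(context, pos + 1)
    return counts


def extend_forward(seed, reads, k, max_len=10_000):
    """Extend `seed` one base at a time while the next base is unambiguous.

    Stops at a dead end (no candidate) or at a branch (several candidates).
    `max_len` guards against looping forever on circular sequences.
    """
    contig = seed
    while len(contig) < max_len:
        options = successors(reads, contig[-k:])
        if len(options) != 1:
            break
        contig += next(iter(options))
    return contig


reads = ["ACGTAC", "GTACGG", "ACGGTT"]
print(extend_forward("ACGT", reads, k=3))
```

In the deck's approach, paired reads would be consulted at the point where this sketch simply stops, to choose among the competing nucleotides.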

  12. Results • Caenorhabditis elegans • Escherichia coli K-12

  13. DNA Assembly 2: Reduction to single reads

  14. Concept • De Bruijn graph with all reads • Paired reads • Path in the graph • Low density: backtracking • Slow: meet-in-the-middle
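The concept can be sketched as follows, with several assumptions that are not stated on the slide: vertices are k-mers, an edge joins two k-mers that appear consecutively in some read, and the two mates of a paired read are represented by a start and a goal k-mer. The path search shown is plain bounded backtracking; the meet-in-the-middle speed-up named on the slide is not implemented. All function names are hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Return (coverage, edges) for a k-mer graph built from all reads."""
    coverage = defaultdict(int)
    edges = defaultdict(set)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            coverage[kmer] += 1
        for a, b in zip(kmers, kmers[1:]):
            edges[a].add(b)
    return dict(coverage), {v: sorted(n) for v, n in edges.items()}


def paths_between(edges, start, goal, max_steps):
    """Enumerate paths from start to goal using at most max_steps edges."""
    found = []

    def walk(path):
        if path[-1] == goal:
            found.append(list(path))
            return
        if len(path) - 1 == max_steps:
            return
        for nxt in edges.get(path[-1], ()):
            path.append(nxt)
            walk(path)
            path.pop()  # backtrack

    walk([start])
    return found


def spell(path):
    """Turn a vertex path into the sequence it spells."""
    return path[0] + "".join(v[-1] for v in path[1:])


coverage, edges = de_bruijn(["ACGT", "CGTA"], k=3)
for path in paths_between(edges, "ACG", "GTA", max_steps=5):
    print(spell(path))
```

If exactly one path of plausible length links the two mates, the pair can be replaced by the single long read that the path spells, which is one reading of "reduction to single reads".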

  15. Error detection • Poorly covered vertices: erroneous; delete them; repeat • Paths: single reads; use another tool
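One possible reading of the "poorly covered vertices" loop is sketched below. The assumptions are loud ones: vertices are k-mers, "poorly covered" means a count below a hypothetical threshold `min_cov`, and deleting a vertex is modelled by discarding every read that contains it before recounting. The slide does not specify any of these details.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count how many times each k-mer occurs across the reads."""
    return Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )


def prune_low_coverage(reads, k, min_cov, max_rounds=10):
    """Repeatedly drop reads that contain a k-mer seen fewer than min_cov times."""
    kept = list(reads)
    for _ in range(max_rounds):
        counts = kmer_counts(kept, k)
        good = [
            r for r in kept
            if all(counts[r[i:i + k]] >= min_cov for i in range(len(r) - k + 1))
        ]
        if len(good) == len(kept):
            break  # nothing changed, so another round would not help
        kept = good
    return kept


reads = ["ACGTAC"] * 3 + ["ACGAAC"]  # the last read has one substituted base
print(prune_low_coverage(reads, k=3, min_cov=2))
```

A threshold that is too high can cascade and remove correct reads as well, so in practice it would have to be chosen relative to the sequencing depth.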

  16. Results • 60% erroneous reads detected • < 0.1% errors left after one iteration • 99.5% DNA coverage

  17. DNA Alignment with transfers

  18. Concept • Parts: matched (small edit distance); unmatched • Swapping allowed • Penalties: number of parts; edit distance in matched parts; length of unmatched parts
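The three penalties can be combined into a single score, as in the sketch below. The linear form and the weights `part_cost`, `edit_cost`, and `gap_cost` are assumptions made for illustration; the slide lists the penalty terms but not how they are weighted. The function scores one given decomposition into parts and does not search for the best one.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def alignment_penalty(parts, part_cost=5.0, edit_cost=1.0, gap_cost=0.5):
    """Score a decomposition into parts.

    parts: list of (fragment_a, fragment_b) for matched parts, or
           (fragment, None) for unmatched parts. Because parts are scored
           independently of their order, swapped parts cost nothing extra.
    """
    total = part_cost * len(parts)
    for a, b in parts:
        if b is None:
            total += gap_cost * len(a)
        else:
            total += edit_cost * edit_distance(a, b)
    return total


print(alignment_penalty([("ACGT", "ACGA"), ("TTT", None)]))
```

The per-part cost discourages shredding the sequences into many tiny matches, while the other two terms trade off imperfect matches against leaving a region unmatched.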

  19. Implementation • First DNA: tear into small pieces; hash 'em and store 'em • Second DNA: tear into small pieces; look them up; build them up
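The hash-and-look-up scheme can be sketched like this. It assumes the "small pieces" are fixed-length k-mers, uses a Python dict as the hash table, and interprets "build them up" as merging consecutive hits on the same diagonal into longer exact-match blocks. The function names are hypothetical, and mismatches inside a block are not handled.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map every k-mer of seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def matched_blocks(first, second, k):
    """Return sorted (start_in_first, start_in_second, length) exact blocks."""
    index = index_kmers(first, k)
    by_diagonal = defaultdict(list)  # (i - j) -> hit positions j in second
    for j in range(len(second) - k + 1):
        for i in index.get(second[j:j + k], ()):
            by_diagonal[i - j].append(j)

    blocks = []
    for diag, starts in by_diagonal.items():
        run_start = prev = starts[0]
        for j in starts[1:]:
            if j != prev + 1:  # gap on this diagonal: close the current run
                blocks.append((run_start + diag, run_start, prev - run_start + k))
                run_start = j
            prev = j
        blocks.append((run_start + diag, run_start, prev - run_start + k))
    return sorted(blocks)


# The second sequence is the first with its two halves swapped.
print(matched_blocks("AAACCCGGGTTT", "GGGTTTAAACCC", k=3))
```

Because each block records its position in both sequences, blocks that appear in a different order in the second sequence are still found, which fits the "swapping allowed" idea from the alignment concept.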

  20. Results
