
Workshop on FCP Accelerated NGS

Join us for a workshop led by Srinivas Aluru from Iowa State University, focusing on the evolution of Next-Generation Sequencing (NGS) technologies and their impact on genomic research. Explore the transformation from basic sequencing methods to advanced big data analytics, addressing challenges in complex disease identification, biological threats, and plant genotype-phenotype studies. Learn about innovative approaches and the vision to empower researchers through high-performance computing and cost-effective solutions. Featured speakers include experts from Rutgers, Stanford, and Michigan.


Presentation Transcript


  1. Workshop on FCP Accelerated NGS • Srinivas Aluru, Iowa State University

  2. The Big Data Challenge • Then (2005): ABI 3700, 96 reads of ~800 bp, 76.8 × 10^3 bases, ~$1 per kilobase • Now: Illumina HiSeq 2500, 6 billion reads of 100 bp, 600 × 10^9 bases, ~$1 per 200 million bases
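The figures on slide 2 can be checked with a few lines of arithmetic. This is only a sketch that restates the slide's numbers; the variable names are illustrative and the derived ratios are not stated on the slide itself.

```python
# Figures taken from slide 2; names are illustrative only.
abi_reads, abi_read_len = 96, 800                      # ABI 3700 (2005)
hiseq_reads, hiseq_read_len = 6_000_000_000, 100       # Illumina HiSeq 2500

abi_bases = abi_reads * abi_read_len                   # 76.8 x 10^3 bases
hiseq_bases = hiseq_reads * hiseq_read_len             # 600 x 10^9 bases

# Bases obtained per dollar, from "~$1 per kilobase" and "~$1 per 200 million bases".
abi_bases_per_dollar = 1_000
hiseq_bases_per_dollar = 200_000_000

throughput_gain = hiseq_bases // abi_bases
cost_gain = hiseq_bases_per_dollar // abi_bases_per_dollar

print(f"bases per run: {abi_bases:,} -> {hiseq_bases:,} ({throughput_gain:,}x)")
print(f"bases per dollar: {cost_gain:,}x more")
```

Under these numbers, output per run grows by a factor of roughly 7.8 million while the cost per base falls by a factor of about 200,000.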

  3. Many NGS Technologies

  4. Why FCP? • 1 NGS experiment = ~100 GB of data • Sequencing center a decade ago → small-budget individual investigator today • Many FCP technologies are inexpensive and widely available

  5. Genomes Galore – Big Data Analytics for High Throughput DNA Sequencing • Driving Grand Challenges: identification of complex disease traits; detection of biological threats; microbial studies and human health; plant genotype to phenotype; ⁞ • Vision and Goals: empower community migration to HPC; preserve ability to create new solutions; target researchers and software developers • Research and Dissemination Approach • The Team: Srinivas Aluru (ISU), Jaroslaw Zola (Rutgers), Kunle Olukotun (Stanford), Wu Feng (Virginia Tech) • Domain Experts: Patrick Schnable (ISU), Charles Sing (U. of Michigan)

  6. NGS Application: Assembly • Reconstruct longer original sequences from the high-coverage sampling of short fragments produced by NGS • Figure labels: multiple copies of the same source sequence; randomly fragment the copies; unordered genome fragments

  7. NGS Application: Assembly • resequencing → genome mapping • de novo sequencing → genome assembly • gene expression analysis → transcriptome assembly • metagenomic sampling → metagenomic clustering and/or assembly

  8. Graph Abstractions for Assembly • Overlap graphs • node: an NGS read • edge: suffix-prefix alignment between a pair of reads • De Bruijn graphs • node: a k-mer from an NGS read • edge: length (k-1) suffix-prefix match between two k-mers
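The two graph abstractions on slide 8 can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: it assumes exact suffix-prefix matches in place of a real alignment for the overlap graph, and it assumes a De Bruijn edge joins two k-mers whose length (k-1) suffix and prefix agree. The function names and the `min_len` parameter are hypothetical.

```python
from collections import defaultdict

def overlap_length(a, b, min_len):
    """Longest suffix of a that exactly equals a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def build_overlap_graph(reads, min_len):
    """Nodes are read indices; edge (i, j) carries the exact overlap length."""
    edges = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = overlap_length(a, b, min_len)
                if n > 0:
                    edges[(i, j)] = n
    return edges

def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]

def build_de_bruijn(reads, k):
    """Nodes are distinct k-mers; u -> v when u[1:] == v[:-1]."""
    nodes = {km for read in reads for km in kmers(read, k)}
    by_prefix = defaultdict(set)
    for km in nodes:
        by_prefix[km[:-1]].add(km)          # index k-mers by their (k-1)-prefix
    return {km: set(by_prefix.get(km[1:], ())) for km in nodes}

overlaps = build_overlap_graph(["ACGTAC", "GTACGG"], min_len=3)
dbg = build_de_bruijn(["ACGT", "CGTA"], k=3)
```

The all-pairs loop in `build_overlap_graph` is quadratic in the number of reads and is only meant to make the definition concrete; the prefix index in `build_de_bruijn` avoids that cost for k-mers.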

  9. Graph Operations for Assembly • Graph construction from reads • Collapsing chains • Features in local neighborhood to identify errors • Path walking subject to distance constraints on pairs of edges • Operations on multiple assembly graphs, or multiple genomes in a combined graph
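Of the operations on slide 9, collapsing chains is the easiest to illustrate. The sketch below assumes a k-mer graph stored as a dict mapping each node to its set of successors, with every node present as a key; `collapse_chains` and `spell` are hypothetical names. A pair u → v is merged only when that edge is the sole edge out of u and the sole edge into v.

```python
def collapse_chains(graph):
    """Group nodes into maximal non-branching chains (lists of nodes in path order)."""
    preds = {node: set() for node in graph}
    for u, succs in graph.items():
        for v in succs:
            preds[v].add(u)

    def next_in_chain(u):
        # v continues the chain if u -> v is u's only out-edge and v's only in-edge.
        if len(graph[u]) == 1:
            (v,) = graph[u]
            if len(preds[v]) == 1 and v != u:
                return v
        return None

    def starts_chain(v):
        # v starts a chain unless its single predecessor would extend into it.
        if len(preds[v]) == 1:
            (u,) = preds[v]
            if len(graph[u]) == 1 and u != v:
                return False
        return True

    # First pass: true chain starts. Second pass: leftovers, which lie on isolated cycles.
    candidates = [n for n in sorted(graph) if starts_chain(n)] + sorted(graph)
    chains, seen = [], set()
    for node in candidates:
        if node in seen:
            continue
        chain = [node]
        seen.add(node)
        nxt = next_in_chain(node)
        while nxt is not None and nxt not in seen:
            chain.append(nxt)
            seen.add(nxt)
            nxt = next_in_chain(nxt)
        chains.append(chain)
    return chains

def spell(chain):
    """Merge consecutive k-mers that overlap by k-1 characters into one string."""
    return chain[0] + "".join(node[-1] for node in chain[1:])

example = {"ACG": {"CGT"}, "CGT": {"GTA", "GTC"}, "GTA": set(), "GTC": set()}
contigs = [spell(c) for c in collapse_chains(example)]
```

In the example, `ACG` and `CGT` merge into `ACGT`, while the branch at `CGT` keeps `GTA` and `GTC` as separate nodes.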

  10. NGS Error Correction • Hamming/edit distance graphs • node: a k-mer in an NGS read • edge: two k-mers within a small Hamming/edit distance • Graph operations needed • Concurrent access to many nodes for neighbor queries
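A Hamming distance graph as described on slide 10 can be sketched with hash-set neighbor queries. The example assumes a distance threshold of 1 and the DNA alphabet `ACGT`; both are illustrative choices, as are the function names. Edit distance and concurrent access are not covered here.

```python
ALPHABET = "ACGT"  # assumed alphabet; real reads may also contain N

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))

def substitutions(kmer):
    """Yield every string at Hamming distance exactly 1 from kmer."""
    for i, base in enumerate(kmer):
        for alt in ALPHABET:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]

def build_hamming_graph(kmer_set):
    """Map each k-mer to the observed k-mers that differ from it in one position."""
    return {
        km: {cand for cand in substitutions(km) if cand in kmer_set}
        for km in kmer_set
    }

observed = {"ACGT", "ACGA", "TTTT"}
hgraph = build_hamming_graph(observed)
```

Each k-mer triggers 3k set lookups, which is the kind of many-node neighbor query the slide lists as a required graph operation.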
