Explore GASSST, a high-performance alignment tool for next-gen sequencing data, providing flexibility without sacrificing speed. Learn about its innovative Gibbs sampling strategy and efficient filtering steps. Compare its accuracy against MAQ software and discover its superior mapping capabilities.
Presentation : Kevin Charles Paruchuri Padmavathi Department of Computer Science UTSA 11/1/2010
Introduction • GASSST: global alignment short sequence search tool • A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags.
Current Sequence Aligners • Next-generation sequencing machines are able to produce huge amounts of data • Common techniques often restrict indels in the alignment to improve speed • Flexible aligners are too slow for large-scale applications
GASSST • The goal of GASSST is thus two-fold: achieving high performance with no restrictions on the number of indels, with a design that is still effective on long reads. • The method is comparable to BLAST, with a new efficient filtering step that discards most alignments coming from the seed phase • A carefully designed series of filters of increasing complexity and efficiency quickly eliminates most candidate alignments • The algorithm manipulates a small pre-computed table of 64 KB, which easily fits into cache memory
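To make the seed-then-filter idea concrete, here is a minimal sketch. It is not GASSST's actual filter cascade: the function names are hypothetical, and the single filter shown (a base-composition bound) is an assumption chosen because it is cheap and easy to verify. Each edit operation changes the two sequences' base counts by at most 2 in total, so half the count difference is a lower bound on the edit distance, and any candidate whose bound already exceeds the error budget can be dropped before the costly extend step.

```python
from collections import Counter, defaultdict
from math import ceil


def build_seed_index(reference, k):
    """Map every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def composition_lower_bound(a, b):
    """Cheap lower bound on the edit distance between a and b.

    One substitution changes two base counts by 1 each, one indel changes
    a single count by 1, so edit distance >= ceil(L1 difference / 2).
    """
    ca, cb = Counter(a), Counter(b)
    l1 = sum(abs(ca[c] - cb[c]) for c in set(ca) | set(cb))
    return ceil(l1 / 2)


def candidate_positions(read, reference, index, k, max_errors):
    """Seed phase followed by one cheap filter (illustrative only).

    Simplification: the read is compared with a reference window of the
    same length, so the bound refers to that fixed window.
    """
    seen = set()
    kept = []
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], ()):
            start = hit - offset  # implied start of the read on the reference
            if start < 0 or start + len(read) > len(reference) or start in seen:
                continue
            seen.add(start)
            window = reference[start:start + len(read)]
            if composition_lower_bound(read, window) <= max_errors:
                kept.append(start)  # survives the filter, goes on to extend
    return sorted(kept)
```

In a real cascade, several such filters would be chained from cheapest to most expensive, so that only a small fraction of seed hits reaches the full alignment.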
The last step, extend, receives the alignments that passed the filter step. • It is computed using a traditional banded Needleman-Wunsch (NW) algorithm. Significant alignments are then printed with their full description. • Provides a lower bound only
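A banded Needleman-Wunsch computation can be sketched as follows. This is a generic textbook version with unit costs (edit distance), not GASSST's code; the function name and the `band` parameter are assumptions made for illustration. Only cells within `band` of the main diagonal are filled, which is enough whenever the true number of errors does not exceed `band`.

```python
def banded_nw_distance(a, b, band):
    """Global edit distance between a and b, restricted to a diagonal band.

    Returns None when no alignment fits inside the band. The result is the
    exact edit distance whenever that distance is <= band; otherwise it is
    only an upper bound.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None  # the end cell (n, m) lies outside the band
    inf = float("inf")
    # Row 0: aligning the empty prefix of a with b[:j] costs j insertions.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo = max(0, i - band)
        hi = min(m, i + band)
        if lo == 0:
            cur[0] = i  # i deletions from a
        for j in range(max(1, lo), hi + 1):
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[m] if prev[m] != inf else None
```

For clarity the sketch allocates full rows. A production version would store only the cells inside the band, bringing memory and time down to roughly the read length times the band width.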
A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags.
Gibbs Sampling for Ambiguous Seq • Maps ambiguous tags to individual genomic sites. • Two steps: mapping the ambiguous tags, and calculating the likelihood ratio (LR) for each site • For each map site, the number of co-located tags is counted. This count is used to calculate the likelihood ratio • A higher likelihood ratio means higher confidence; it increases non-linearly with tag counts • The LR is calculated from conditional probabilities • The two steps are circular (each depends on the result of the other), which led to the adoption of Gibbs sampling. • For some set of ambiguous tags (σ), it reaches relative entropy between Ps and Pn.
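The circular dependence between tag placement and site scores can be illustrated with a toy Gibbs sampler. This is a sketch under stated assumptions, not the published method: sites are integer positions on a single sequence, the sampling weight is simply the co-located tag count plus a pseudocount (a stand-in for the likelihood ratio), and the names `gibbs_map_tags`, `window`, `sweeps` and `pseudocount` are hypothetical.

```python
import random


def gibbs_map_tags(unique_sites, candidates, window=100, sweeps=200,
                   pseudocount=0.5, seed=0):
    """Assign each ambiguous tag to one of its candidate sites.

    unique_sites: positions of uniquely mapped tags.
    candidates:   dict mapping tag id -> list of candidate positions.
    The weight of a site is (co-located tag count + pseudocount), an
    assumed simplification of the likelihood ratio.
    """
    rng = random.Random(seed)
    # Start from a random placement, as a random-selection method would.
    assignment = {tag: rng.choice(sites) for tag, sites in candidates.items()}

    def colocated(site, skip_tag):
        # Count unique tags and currently placed ambiguous tags near the site.
        count = sum(1 for u in unique_sites if abs(u - site) <= window)
        count += sum(1 for t, s in assignment.items()
                     if t != skip_tag and abs(s - site) <= window)
        return count

    for _ in range(sweeps):
        for tag, sites in candidates.items():
            # Resample this tag conditioned on where all other tags sit now.
            weights = [colocated(s, tag) + pseudocount for s in sites]
            if sum(weights) == 0:
                continue  # no evidence at all: keep the current placement
            assignment[tag] = rng.choices(sites, weights=weights, k=1)[0]
    return assignment
```

Each sweep re-places one tag at a time given the current placement of all the others, so sites that accumulate tags become progressively more likely to attract further ambiguous tags.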
Comparison • Compared against the MAQ software's method, which randomly selects a site for each ambiguous tag. • Comparison on the eight sequence tag libraries (20 bp tags, 35 bp tags) shows that Gibbs sampling correctly maps 49% to 71% of ambiguous tags, versus 8% to 23% for the MAQ method.
Thank you for listening. Questions
Results We found that GASSST achieves high sensitivity in a wide range of configurations and faster overall execution time than other state-of-the-art aligners.