
SRA Transcript BLAST

SRA Transcript BLAST. Tom Madden, May 15, 2009. BLAST (Basic Local Alignment Search Tool) calculates similarity for biological sequences and produces local alignments: only a portion of each sequence must be aligned.


Presentation Transcript


  1. SRA Transcript BLAST • Tom Madden • May 15, 2009

  2. BLAST • Basic Local Alignment Search Tool • Calculates similarity for biological sequences. • Produces local alignments: only a portion of each sequence must be aligned. • Uses statistical theory to determine if a match might have occurred by chance.
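A minimal sketch of the two ideas on this slide, local alignment and chance statistics. It is not BLAST's implementation: BLAST uses a seeded heuristic, while the first function below is a plain Smith-Waterman dynamic program, swapped in here because it is the simplest exact form of local alignment. The scoring values and the function names are illustrative assumptions. The second function evaluates the Karlin-Altschul expectation E = K·m·n·e^(−λS); `lam` and `k` depend on the scoring system and must be supplied by the caller.

```python
import math


def local_alignment_score(a, b, match=1, mismatch=-3, gap=-5):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scores are illustrative values, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment start anywhere, which is
            # what makes it local: only part of each sequence is aligned.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def expected_chance_hits(score, query_len, db_len, lam, k):
    """Karlin-Altschul expectation E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * score)
```

The two sequences `"TTTACGTACGTTT"` and `"GGGACGTACGGGG"` share only the middle stretch `ACGTACG`, so the local score reflects that stretch alone and ignores the mismatched flanks. A larger score gives a smaller expected number of chance hits for the same query and database sizes.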

  3. BLAST databases used for searches

  4. Requirements for searching SRA sequences as a BLAST DB • Extract new or updated sequences. • Format into a BLAST database. • Provide disks for eight copies of the BLAST databases, each with 5 tera-bases (as of January). • Distribute databases to storage in Bethesda and Virginia. • Know how to quickly re-dump for policy changes or data corruption (e.g., if unclipped or differently clipped reads should be searched).
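The disk requirement above can be turned into a rough figure. This is a back-of-the-envelope sketch only: the slide gives the number of copies and the bases per copy, but the 2 bits per base used below is an assumption about packed nucleotide storage, and the estimate ignores headers, indices, and ambiguity data. The function name is hypothetical.

```python
def total_storage_bytes(copies, bases_per_copy, bits_per_base=2):
    """Approximate bytes for all copies, counting sequence data only.

    bits_per_base=2 is an assumed packing, not a figure from the slides.
    """
    total_bits = copies * bases_per_copy * bits_per_base
    return total_bits // 8


# Eight copies of 5 tera-bases each, as stated on the slide.
estimate = total_storage_bytes(copies=8, bases_per_copy=5 * 10**12)
```

Under that assumption the eight copies hold 40 tera-bases in total and need on the order of 10^13 bytes of sequence data.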

  5. Direct BLAST searches against the SRA archive. • Uses the SRA toolkit and the C++ BLAST API. • Smallest search unit is a “run”. • Multiple runs may be searched together. • Offers searches of 454 SRA transcripts (grouped by organism) on the NCBI web page. • Clipped application reads are searched.

  6. Clipped Application Read is Searched.
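As an illustration of what searching a clipped read means, the sketch below trims a read to the span between two clip points before it would be handed to a search. The function name and the 1-based, inclusive coordinate convention are assumptions for this example, not the SRA toolkit's API.

```python
def clip_read(bases, clip_left, clip_right):
    """Return the part of a read between 1-based, inclusive clip points.

    clip_left and clip_right are hypothetical parameter names.
    """
    if not (1 <= clip_left <= clip_right <= len(bases)):
        raise ValueError(
            "clip points must satisfy 1 <= left <= right <= read length"
        )
    return bases[clip_left - 1:clip_right]


# A made-up read: a 4-base key, 8 bases of insert, then low-quality tail.
searched = clip_read("TCAGACGTACGTNNNN", clip_left=5, clip_right=12)
```

Because clipping is applied when the read is fetched, a different clipping policy changes only the two coordinates, not any stored copy of the sequence.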

  7. Advantages • The search set offered no longer depends upon how fast BLAST databases can be produced and distributed. • Changes to the SRA archive are seen immediately (e.g., a change in the clipping algorithm).

  8. Three most popular organisms. • Human • Sus scrofa • Tachyglossus aculeatus • Note: counts cover searches after April 29, 2009 and include only those with an average of two or more searches per session.

  9. Future development • Allow users to build custom search sets. • Take mate-pair information into account. • Combine SRA searches with traditional BLAST database searches.

  10. Acknowledgements • Kurt Rodarmer • Eugene Yaschenko • Ty Roach • Martin Shumway • Christopher O’Sullivan • Vahram Avagyan • Christiam Camacho • Yan Raytselis • Irena Zaretskaya
