
Comprehensive SNV Analysis Pipeline Using GATK and ANNOVAR for Variant Discovery

This document outlines a streamlined analysis pipeline for Single Nucleotide Variant (SNV) detection and annotation using GATK and ANNOVAR. It covers pre-processing steps such as quality control with NGS QC Toolkit, mapping with Bowtie2, and variant calling with GATK and Samtools. The pipeline emphasizes filtering variants based on specific criteria, consolidation of calls, and correction of aligned files. Finally, it elaborates on using ANNOVAR for annotation and prioritization of genetic variations, along with practical guidance on analyzing results.

dinh

Presentation Transcript


  1. Illu_SNV_analysis_Pipeline Shiyi.Z

  2. Diagram of analysis process http://www.broadinstitute.org/gatk/guide/best-practices

  3. Data Pre-processing
     - NGSQCToolkit filters the raw data and generates a QC report.
     - Bowtie2 maps the filtered reads to the reference; samtools converts the alignments and marks duplicates.
     - GATK realigns reads around indels.
     - GATK and Samtools each call variants and indels: -o sample.gatk.raw1.vcf and -o sample.samtools.raw1.vcf
     - Consolidate the two call sets: -o sample.concordance.raw1.vcf
     - Filter the consolidated calls (QD < 20; ReadPosRankSum < -8; FS > 10; QUAL < $MEANQUAL): -o sample.concordance.flt1.vcf
     - Correct the aligned file based on the filtered variant report: -o sample.recal.bam
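The consolidation and filtering steps of this slide can be sketched in Python. This is only an illustration under stated assumptions, not the presenter's script: it assumes "consolidate" means keeping calls made by both GATK and Samtools, that each filter expression marks a variant as failing when it is true (the convention of GATK's VariantFiltration), that $MEANQUAL is the mean QUAL of the consolidated calls, and that a missing annotation does not trigger its filter. The record layout and all function names are hypothetical.

```python
from statistics import mean


def variant_key(record):
    """Identify a call by chromosome, position, reference and alternate allele."""
    return (record["chrom"], record["pos"], record["ref"], record["alt"])


def concordant_calls(gatk_calls, samtools_calls):
    """Keep the GATK records whose site and alleles Samtools also called."""
    samtools_keys = {variant_key(r) for r in samtools_calls}
    return [r for r in gatk_calls if variant_key(r) in samtools_keys]


def failed_filters(record, mean_qual):
    """Return the names of the slide's filter expressions this record matches."""
    info = record.get("info", {})
    failed = []
    # Assumption: an annotation that is absent does not trigger its filter.
    if info.get("QD") is not None and info["QD"] < 20:
        failed.append("QD")
    if info.get("ReadPosRankSum") is not None and info["ReadPosRankSum"] < -8:
        failed.append("ReadPosRankSum")
    if info.get("FS") is not None and info["FS"] > 10:
        failed.append("FS")
    if record["qual"] < mean_qual:
        failed.append("QUAL")
    return failed


def hard_filter(records):
    """Drop every record that matches at least one filter expression."""
    if not records:
        return []
    # Assumption: $MEANQUAL is the mean QUAL over the consolidated call set.
    mean_qual = mean(r["qual"] for r in records)
    return [r for r in records if not failed_filters(r, mean_qual)]


# Made-up example records, for illustration only.
gatk = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G", "qual": 900.0,
     "info": {"QD": 25.0, "ReadPosRankSum": 0.5, "FS": 1.2}},
    {"chrom": "chr1", "pos": 200, "ref": "C", "alt": "T", "qual": 800.0,
     "info": {"QD": 12.0, "FS": 3.0}},
    {"chrom": "chr2", "pos": 300, "ref": "G", "alt": "A", "qual": 100.0,
     "info": {"QD": 30.0, "FS": 0.0}},
    {"chrom": "chr3", "pos": 400, "ref": "T", "alt": "C", "qual": 950.0,
     "info": {"QD": 28.0, "FS": 2.0}},
]
samtools = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G"},
    {"chrom": "chr1", "pos": 200, "ref": "C", "alt": "T"},
    {"chrom": "chr2", "pos": 300, "ref": "G", "alt": "A"},
]

concordant = concordant_calls(gatk, samtools)
passing = hard_filter(concordant)
print([variant_key(r) for r in passing])
```

In this made-up example the chr3 call is dropped because only GATK reported it, the chr1:200 call fails on QD, and the chr2:300 call fails because its QUAL is below the mean of the consolidated set.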

  4. Variant Discovery
     - Patient: patient.concordance.filter1.vcf
     - Father: father.concordance.filter1.vcf
     - Mother: mother.concordance.filter1.vcf
     - Control: control.concordance.filter1.vcf
     - Based on each sample's previous *.filter1.vcf variant file, correct the aligned file and generate the sample.recal.bam file.
     - Call variants again with GATK and Samtools to generate the sample.final.vcf files.

  5. Preliminary Analysis
     - Submit the VCF file to the wANNOVAR website (http://wannovar.usc.edu/).
     - Do annotation using ANNOVAR.
     - Variation prioritization:
       - Prioritization by ANNOVAR: annotate_variation.pl -filter --dbtype generic --genericdbfile hg18_avsift.txt --score_threshold 0.05 ex1.human humandb/
       - Using Excel: Open the file in Excel 2007 (select "tab-delimited" when opening the file). Click the "DATA" tab in the menu bar, then click the big "Filter" button. Then click any one of the headings, such as 1000G_CEU or SIFT, and filter out variants by ticking the check boxes. For the SIFT score, make sure to use "less than 0.05 OR equal to (blank)" so that variants without a SIFT score do not get filtered out. It should be straightforward, but it may need a little practice for users not familiar with Excel.
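For readers who prefer a script to Excel, the SIFT rule described on this slide ("less than 0.05 OR equal to (blank)") might be sketched as follows. This is a minimal example, assuming the annotated file is tab-delimited with a header row, that the score column is headed SIFT as on the slide, and that a missing score is an empty cell. The other column names and the function names are hypothetical.

```python
import csv
import io


def keep_by_sift(row, column="SIFT", threshold=0.05):
    """Keep a row if its SIFT score is below the threshold or is blank."""
    value = (row.get(column) or "").strip()
    if value == "":
        # Variants without a SIFT score must not be filtered out.
        return True
    return float(value) < threshold


def filter_table(text, column="SIFT", threshold=0.05):
    """Parse tab-delimited text and return the rows that pass the SIFT rule."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if keep_by_sift(row, column, threshold)]


# Made-up example table; only the SIFT heading comes from the slide.
example = (
    "Chr\tStart\tRef\tAlt\tSIFT\n"
    "chr1\t100\tA\tG\t0.01\n"
    "chr1\t200\tC\tT\t0.20\n"
    "chr2\t300\tG\tA\t\n"
    "chr2\t400\tT\tC\t0.05\n"
)

kept = filter_table(example)
for row in kept:
    print(row["Chr"], row["Start"], row["SIFT"] or "(blank)")
```

The row with a score of exactly 0.05 is dropped because the rule is a strict "less than", while the row with a blank score is kept.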

  6. ANNOVAR analysis pipeline
     - First: remove variations detected in the Control.
     - Second: genetic mode analysis.
     - Third: filter by these parameters: SIFT less than 0.05; PolyPhen2_HDIV greater than 0.909; PolyPhen2_HVAR greater than 0.909.
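The three stages of this slide can be sketched as set operations in Python. The slide does not say which genetic mode is analyzed, so this sketch assumes a de novo model (variant present in the patient, absent from both parents) purely for illustration. It also assumes that all three score thresholds must hold together and that a missing score does not exclude a variant, extending the earlier SIFT tip to the PolyPhen2 scores. The function names and data layout are hypothetical.

```python
def passes_score_filter(scores):
    """Apply the slide's thresholds; a missing score does not exclude a variant."""
    sift = scores.get("SIFT")
    hdiv = scores.get("PolyPhen2_HDIV")
    hvar = scores.get("PolyPhen2_HVAR")
    return (
        (sift is None or sift < 0.05)
        and (hdiv is None or hdiv > 0.909)
        and (hvar is None or hvar > 0.909)
    )


def prioritize(patient, father, mother, control, scores):
    """Run the three stages on sets of (chrom, pos, ref, alt) keys."""
    # First: remove variations detected in the control.
    not_in_control = patient - control
    # Second: genetic mode analysis. Assumption: de novo model only.
    de_novo = not_in_control - father - mother
    # Third: filter by the SIFT and PolyPhen2 thresholds.
    return sorted(v for v in de_novo if passes_score_filter(scores.get(v, {})))


# Made-up example variants, for illustration only.
A = ("chr1", 100, "A", "G")
B = ("chr1", 200, "C", "T")
C = ("chr1", 300, "G", "A")
D = ("chr2", 400, "T", "C")
E = ("chr2", 500, "C", "T")

scores = {
    C: {"SIFT": 0.01, "PolyPhen2_HDIV": 0.99, "PolyPhen2_HVAR": 0.95},
    D: {"SIFT": 0.30, "PolyPhen2_HDIV": 0.99, "PolyPhen2_HVAR": 0.95},
    E: {"PolyPhen2_HDIV": 0.95, "PolyPhen2_HVAR": 0.92},
}

candidates = prioritize(
    patient={A, B, C, D, E},
    father={B},
    mother=set(),
    control={A},
    scores=scores,
)
print(candidates)
```

Other inheritance models, such as recessive or compound heterozygous, would need genotype information per sample rather than simple presence or absence, so they are left out of this sketch.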
