Pindel user manual

Pindel user manual by Kai Ye (k.ye@lumc.nl). Covers preparation of Pindel input from alignment BAM files generated by BWA (bam2pindel.pl with Adaptor.pm) or by other aligners (sam2pindel.cpp), read filtering (FilterPindelReads.cpp), running Pindel, and the Pindel input and output formats.


Presentation Transcript


  1. Pindel user manual. Kai Ye (k.ye@lumc.nl)

  2. Preparation of Pindel input
     • Alignment BAM file generated by BWA: (1) bam2pindel.pl with Adaptor.pm
     • Alignment BAM file generated by other aligners: (2) sam2pindel.cpp
     • Either route yields Pindel input with a sample tag
     • (3) FilterPindelReads.cpp yields filtered Pindel input with a sample tag
     • Merge Pindel input files for paired or population sequence data

  3. (1) bam2pindel.pl
     • Written by Keiran Raine at the Sanger Institute (kr2@sanger.ac.uk)
     • Designed for BWA-based BAM/SAM Illumina data
     • You must prepare a name-sorted BAM file
     • Set the BAM_2_PINDEL_ADAPT environment variable:
         setenv BAM_2_PINDEL_ADAPT <path to Adaptor.pm>
     • Arguments:
         -i|input:    Input BAM file (req)
         -o|output:   Output ready for Pindel
         -s|sample:   Sample or label (sampA,sampB...) (req)
         -pi|insert:  Required if the BAM file does not have a PI tag in the header RG record
         -r|restrict: Restrict to chromosome xx
     • Example:
         ./bam2pindel_bwa.pl -i NameSorted.bam -o output_prefix -s tumour -om -pi 300
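The call above could be assembled from Python along the following lines. This is only a sketch: the helper names `bam2pindel_command` and `bam2pindel_env` and the Adaptor.pm path are hypothetical, and only options listed on the slide are used. It builds the argument list and environment without running anything.

```python
import os
import shlex


def bam2pindel_command(bam, prefix, sample, insert=None, chrom=None,
                       script="./bam2pindel_bwa.pl"):
    """Return the argument list for bam2pindel (hypothetical helper)."""
    cmd = [script, "-i", bam, "-o", prefix, "-s", sample]
    if insert is not None:
        # Needed when the RG header record of the BAM file has no PI tag.
        cmd += ["-pi", str(insert)]
    if chrom is not None:
        # Restrict the conversion to a single chromosome.
        cmd += ["-r", chrom]
    return cmd


def bam2pindel_env(adaptor_path):
    """Copy the current environment and point BAM_2_PINDEL_ADAPT at Adaptor.pm."""
    env = dict(os.environ)
    env["BAM_2_PINDEL_ADAPT"] = adaptor_path
    return env


cmd = bam2pindel_command("NameSorted.bam", "output_prefix", "tumour", insert=300)
env = bam2pindel_env("/path/to/Adaptor.pm")  # placeholder path, an assumption
print(shlex.join(cmd))
```

The pair can then be handed to `subprocess.run(cmd, env=env, check=True)` once the script and a name-sorted BAM file are in place.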

  4. (2) sam2pindel.cpp
     • Written by Kai Ye at Leiden University Medical Center (k.ye@lumc.nl)
     • Designed for all BAM/SAM Illumina data
     • You must first compile the C++ source code:
         g++ sam2pindel.cpp -o sam2pindel -O3
     • sam2pindel requires 5 arguments:
         1. Input SAM file
         2. Output for Pindel
         3. Insert size
         4. Tag
         5. Number of extra lines (not starting with @) at the beginning of the file
     • If you start with a standard SAM file (Input.sam with insert size 300):
         ./sam2pindel Input.sam Output4Pindel.txt 300 tumour 0
     • If you start with a BAM file:
         ./samtools view Input.bam | ./sam2pindel - Output4Pindel.txt 300 tumour 0
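The BAM variant of the command is a two-stage pipe, which could be reproduced from Python roughly as follows. The function names `sam2pindel_args` and `convert_bam` are illustrative, not part of Pindel, and the sketch assumes the `./samtools` and `./sam2pindel` binaries sit in the working directory as in the examples above.

```python
import subprocess


def sam2pindel_args(sam, out, insert_size, tag, extra_lines=0,
                    exe="./sam2pindel"):
    """Five positional arguments in the order listed above; '-' means stdin."""
    return [exe, sam, out, str(insert_size), tag, str(extra_lines)]


def convert_bam(bam, out, insert_size, tag, samtools="./samtools"):
    """Stream `samtools view` output into sam2pindel, as in the BAM example."""
    view = subprocess.Popen([samtools, "view", bam], stdout=subprocess.PIPE)
    conv = subprocess.run(sam2pindel_args("-", out, insert_size, tag),
                          stdin=view.stdout)
    view.stdout.close()
    view.wait()
    return conv.returncode


# Only the argument list is built here; nothing is executed.
print(sam2pindel_args("Input.sam", "Output4Pindel.txt", 300, "tumour"))
```

Calling `convert_bam("Input.bam", "Output4Pindel.txt", 300, "tumour")` would run the pipe; it is left uncalled here because it needs both binaries and a real BAM file.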

  5. Running Pindel
     • Arguments:
         1. Input: the reference genome sequences in FASTA format
         2. Input: the unmapped reads in a modified FASTQ format
         3. Output folder
         4. Which chr/fragment
         5. BreakDancer result. Format per line: ChrA LocA stringA ChrB LocB stringB others
            If you don't have a BreakDancer result, provide an empty file here.
     • Example:
         ./pindel hg19.fa pindel_input_chr1.txt Output_Folder chr1 empty
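When no BreakDancer result exists, the empty placeholder file can be created on the fly. The sketch below is one possible way to do that; `pindel_args` is a hypothetical helper, and placing the placeholder inside the output folder is an assumption made here for convenience (the example above simply uses a file named `empty` in the working directory).

```python
from pathlib import Path


def pindel_args(reference, reads, out_dir, chrom, breakdancer=None,
                exe="./pindel"):
    """Return the five positional Pindel arguments in the order listed above.

    If no BreakDancer result is given, an empty file is created and passed
    in its place.
    """
    if breakdancer is None:
        breakdancer = Path(out_dir) / "empty"
        breakdancer.parent.mkdir(parents=True, exist_ok=True)
        breakdancer.touch()  # zero-byte placeholder
    return [exe, reference, reads, str(out_dir), chrom, str(breakdancer)]
```

For instance, `pindel_args("hg19.fa", "pindel_input_chr1.txt", "Output_Folder", "chr1")` yields a list that can be passed to `subprocess.run`.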

  6. Input format of Pindel
         @9113
         TGGGGACCGGTGGAATGCTTCCACTGGCTGGGGGGC
         + chr2 41149518 50 Tumor
     • Third line: strand, chr, 3' coordinate and mapping quality of the mapped read (the anchor), followed by the sample tag
     (figure labels: ref, Anchor)
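A record in this format can be read with a few lines of Python. The sketch assumes each record is exactly three lines and that the third line carries five whitespace-separated fields, as in the example on this slide; the `PindelRead` type and `parse_record` function are illustrative names, not part of Pindel.

```python
from typing import NamedTuple


class PindelRead(NamedTuple):
    name: str
    sequence: str
    anchor_strand: str
    anchor_chrom: str
    anchor_pos: int    # 3' coordinate of the mapped (anchor) read
    anchor_mapq: int   # mapping quality of the anchor read
    sample: str


def parse_record(lines):
    """Parse one three-line Pindel input record into a PindelRead."""
    name_line, sequence, anchor_line = (line.strip() for line in lines)
    if not name_line.startswith("@"):
        raise ValueError("record must start with '@<read name>'")
    strand, chrom, pos, mapq, sample = anchor_line.split()
    return PindelRead(name_line[1:], sequence, strand, chrom,
                      int(pos), int(mapq), sample)


record = parse_record([
    "@9113",
    "TGGGGACCGGTGGAATGCTTCCACTGGCTGGGGGGC",
    "+ chr2 41149518 50 Tumor",
])
print(record.anchor_chrom, record.anchor_pos, record.sample)
```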

  7. Output format: deletions
     D 321 ChrID 0 56173880 56174202 Supports: 15 70 130.916
     TAAGAATGAGTTGGCAAATAAAGAGTTTGGTGAGTTTATAGAAATATAGGggccg<311>ataggACAAGGTACAAGGAATGGCTGAAGGAGAGAGGTTG
                                    GAGTTTATAGAAATATAGG               ACAAGGTACAAGGAATG    + 56173670 normal
                                  GTGAGTTTATAGAAATATAGG               ACAAGGTACAAGGAA      + 56173677 normal
                                    GAGTTTATAGAAATATAGG               ACAAGGTACAAGGAATG    + 56173681 normal
                                TGGTGAGTTTATAGAAATATAGG               ACAAGGTACAAGG        + 56173687 normal
                                    GAGTTTATAGAAATATAGG               ACAAGGTACAAGGAATG    - 56173690 normal
                                  GTGAGTTTATAGAAATATAGG               ACAAGGTACAAGGAA      - 56173695 normal
                            AGTTTGGTGAGTTTATAGAAATATAGG               ACAAGGTACAAGGA       - 56173697 normal
                                  GTGAGTTTATAGAAATATAGG               ACAAGGTACAAGGAA      + 56173700 tumor
                                     AGTTTATAGAAATATAGG               ACAAGGTACAAGGAATGG   + 56173710 tumor
                              TTTGGTGAGTTTATAGAAATATAGG               ACAAGGTACAA          + 56174339 tumor
                                   TGAGTTTATAGAAATATAGG               ACAAGGTACAAGGAATG    + 56174356 tumor
                                   TGAGTTTATAGAAATATAGG               ACAAGGTACAAGGAAT     - 56174357 tumor
                                      GTTTATAGAAATATAGG               ACAAGGTACAAGGAATGGC  - 56174358 tumor
                                     AGTTTATAGAAATATAGG               ACAAGGTACAAGGAATGG   - 56174365 tumor
                                    GAGTTTATAGAAATATAGG               ACAAGGTACAAGGAATG    - 56174373 tumor
     • Size range: 1 base to 1 million bases
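In the deletion headers shown in this manual, the reported size equals the number of reference bases strictly between the two breakpoint coordinates. The check below is a small worked example of that arithmetic; the relationship is an observation drawn from the three example headers (slides 7, 8 and 12), not a statement from the Pindel source, and `deletion_span` is a name introduced here.

```python
def deletion_span(start_bp, end_bp):
    """Number of reference bases strictly between the two breakpoints."""
    return end_bp - start_bp - 1


# (reported size, first breakpoint, second breakpoint) copied from the slides
examples = {
    "slide 7": (321, 56173880, 56174202),
    "slide 8": (10, 32913041, 32913052),
    "slide 12": (4, 156978978, 156978983),
}

for label, (size, start, end) in examples.items():
    print(label, size, deletion_span(start, end))
```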

  8. Allow mismatches to accommodate sequence errors and SNPs
     D 10 ChrID 13 BP 32913041 32913052
     AAATCAACTAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCAaagaacctacTCTATTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAGT
                                          GATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
                              CAACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGA
                                       CGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
                                       CGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
                                            TGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAG
                                        GTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGAC
             TAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAA
                   CCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAA
                             ACAACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGG
                                   CGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTT
                                 CCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATC
                               AACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
                                         TGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACA
                  ACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAA
                                          GATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
                               AACCCGAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
                                    GAACGTGATGAAAAGATCA          TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTT
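The mismatch tolerated in this example can be located by comparing the reference bases to the right of the deletion with the same stretch of a supporting read. The snippet below is a simple position-by-position comparison for illustration only; it is not Pindel's matching algorithm, and `mismatches` is a name introduced here.

```python
def mismatches(reference, read):
    """Return (offset, ref_base, read_base) for each differing position,
    compared case-insensitively over the overlapping length."""
    pairs = zip(reference.upper(), read.upper())
    return [(i, r, q) for i, (r, q) in enumerate(pairs) if r != q]


ref_flank = "TCTATTGGGTTTTCATACAG"   # reference, right of the deleted bases
read_flank = "TCTGTTGGGTTTTCATACAG"  # same stretch in the supporting reads
print(mismatches(ref_flank, read_flank))  # [(3, 'A', 'G')]
```

Every supporting read on the slide carries the same A to G difference at this offset, which is what one would expect from a SNP rather than a random sequencing error.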

  9. Inversions (figure labels: sample, ref)

  10. Large insertions

  11. Non-template sequence in deletions, inversions and tandem duplications (figure labels: ref, sample)

  12. Non-template sequence: deletion of 4 bases with 2 bases inserted
     D 4 I 2 ChrID 3 BP 156978978 156978983 Supports 12 + 0 - 12 S1 13 SUM_MS 627 NumSupSamples 1 HCC1599a 12
     CATGGCTGACTTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCACGTTGATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTTTATTGTC
               TTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGCCTTGGGCAACTGCCAAA  GATGCACT
                                 ATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCAT
                                             CTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCT
                             AGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCT
                                                   TTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCT
                                                    TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
                                       TTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCT
                                                                  CTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
                         CTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTC
                   AAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAG
                                                                  CTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
                                                    TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA  GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
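The effect of a "D 4 I 2" event on a read can be checked by removing four reference bases and splicing in the two non-template bases. In the sketch below, the placement of the deleted bases (CGTT, directly after ...CTGCCA) and the inserted bases (AA) is inferred by comparing the supporting reads with the reference line on this slide; `apply_indel` is a hypothetical helper, not Pindel code.

```python
def apply_indel(reference, left_len, deleted, inserted):
    """Keep `left_len` bases, drop the next `deleted` bases,
    and splice `inserted` in their place."""
    left = reference[:left_len]
    right = reference[left_len + deleted:]
    return left + inserted + right


# Reference excerpt around the event, copied from the slide
ref = "CTTGGGCAACTGCCACGTTGATGCACTGG"
read = apply_indel(ref, left_len=15, deleted=4, inserted="AA")
print(read)  # CTTGGGCAACTGCCAAAGATGCACTGG
```

The result matches the start of the supporting reads that begin with CTTGGGCAACTGCCAAA followed by GATGCACTGG, and it is two bases shorter than the reference excerpt (4 deleted, 2 inserted).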