This tutorial focuses on high-throughput sequencing (HTS) tools and analysis methodologies, emphasizing visualization with IGV and data management via the Galaxy platform. Learn how to tune your pipelines by exploring base qualities, comparing samples, and interpreting mapping statistics. It covers false positives in indel detection, structural variations, and variant calling, and gives practical guidance on adjusting mapping parameters and filtration steps for genomic data, including gene expression studies.
Tutorial 6: High-Throughput Sequencing
HTS tools and analysis
• Visualization – IGV
• Analysis platform – Galaxy
• Tuning up the pipelines
Same mapping statistics – different meaning
What might cause this low percentage of mapping?
• The sample contains a high percentage of contamination
• The sample is very different from the reference genome
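As a rough illustration of where a mapping percentage comes from, the sketch below counts mapped versus unmapped primary records in SAM text using the FLAG column (bit 0x4 marks an unmapped read; bits 0x100 and 0x800 mark secondary and supplementary alignments). The function name `mapping_percentage` and the toy records are hypothetical, made up for this example; note that the number alone cannot tell contamination apart from a divergent sample.

```python
UNMAPPED = 0x4         # SAM FLAG bit: read is unmapped
SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


def mapping_percentage(sam_lines):
    """Percentage of primary SAM records that are mapped (hypothetical helper)."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, by its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


# Toy records invented for illustration only.
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTGGGGCC\tIIIIIIIIII",
    "r3\t16\tchr1\t250\t60\t10M\t*\t0\t0\tGGCATTCGAG\tIIIIIIIIII",
    "r3\t256\tchr2\t900\t0\t10M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tCCCCAAAATT\tIIIIIIIIII",
]

rate = mapping_percentage(toy_sam)
print(f"{rate:.1f}% of reads mapped")
```

Two samples can report the same value here for different reasons, so the unmapped reads themselves usually need a closer look.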
Structural Variations
Large deletion in the sample compared to the reference genome
How can mapping parameters affect the results?
• 5 mismatches per read
• 1 mismatch per read
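To make the effect of the mismatch limit concrete, here is a minimal sketch of a naive ungapped mapper that accepts a placement only if its Hamming distance to the reference is within `max_mismatches`. It is a toy, not how real aligners work; the names `map_read` and `mutate` and the sequences are invented for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, max_mismatches):
    """Return (position, mismatches) for every ungapped placement within the limit."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(read, reference[pos:pos + len(read)])
        if d <= max_mismatches:
            hits.append((pos, d))
    return hits


def mutate(seq, positions):
    """Substitute the base at each given position with a different base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Toy reference and a read taken from position 10 with 3 substitutions.
reference = "ACGTTGCAAGGCTTACCGATAGGCATTCGAGGTCAATGCC"
read = mutate(reference[10:30], [2, 9, 16])

strict_hits = map_read(read, reference, max_mismatches=1)
lenient_hits = map_read(read, reference, max_mismatches=5)
print("1 mismatch allowed: ", strict_hits)
print("5 mismatches allowed:", lenient_hits)
```

With a limit of 1 the true placement at position 10 is rejected, while a limit of 5 accepts it; a looser limit also raises the chance of accepting wrong placements.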
One pipeline for all projects?
False positives vs. true negatives
3-base insertion
How can you tune your analysis?
Try different programs.
Mapping:
• Change mapping parameters
• Use non-unique mappings
• Don’t filter duplicates
Variants:
• Change variant filtration
• Change variant merging – penetrance, different heredity, low coverage in one individual…
• Look for bigger variants: big insertions/deletions, inversions, copy number variations, etc.
Gene expression:
• Change the test threshold
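As one example of the "change variant filtration" option, the sketch below applies two different quality and depth cutoffs to the same calls and compares what survives. The `Variant` class, the `filter_variants` helper, the thresholds, and the calls themselves are all hypothetical values chosen for illustration, not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float  # call quality score
    depth: int   # read depth at the site


def filter_variants(variants, min_qual, min_depth):
    """Keep variants that meet both the quality and the depth cutoff."""
    return [v for v in variants if v.qual >= min_qual and v.depth >= min_depth]


# Toy calls invented for illustration only.
calls = [
    Variant("chr1", 1050, "A", "G", 98.0, 40),
    Variant("chr1", 2210, "C", "CTGA", 31.0, 12),  # 3-base insertion, modest support
    Variant("chr2", 777, "T", "C", 12.0, 4),
]

strict = filter_variants(calls, min_qual=50, min_depth=20)
lenient = filter_variants(calls, min_qual=30, min_depth=10)

print("strict: ", [(v.chrom, v.pos) for v in strict])
print("lenient:", [(v.chrom, v.pos) for v in lenient])
```

The strict setting drops the insertion; the lenient one keeps it. Whether that call is a true variant or a false positive still has to be checked, for instance by viewing the reads in IGV.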