NEXT GENERATION SEQUENCING TECHNIQUES IN MEDICINE. Dr. R. Piazza. R. Piazza – NGS Sequencing 30/10/13. 16th-17th century: human anatomy. 19th century: microbiology. 20th century: biochemistry and molecular biology. 2008-2013: genetic revolution.
SANGER SEQUENCING + DNA
NEXT GENERATION SEQUENCING Flowcell
NEXT GENERATION SEQUENCING Figure labels: DNA library; genomic DNA
NEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCING The 4 nucleotides, labeled with fluorochromes and blocked at the 3' end, are added simultaneously. Figure labels: sequencing primer; labeled and blocked nucleotides
NEXT GENERATION SEQUENCING
NEXT GENERATION SEQUENCING IMAGE ACQUISITION; REMOVAL OF THE FLUOROPHORE; REMOVAL OF THE 3' BLOCK
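The cycle described on these slides (add the four labeled, 3'-blocked nucleotides together, acquire the image, remove the fluorophore, remove the 3' block) can be illustrated with a toy model. This is only a sketch: the function name and the single-cluster, error-free behaviour are assumptions made for illustration, not a model of any real instrument.

```python
# Toy model of sequencing by synthesis with reversible terminators.
# Assumption: one cluster, perfect chemistry, no sequencing errors.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the bases read from one cluster after `cycles` cycles."""
    read = []
    for position in range(min(cycles, len(template))):
        # 1. All four labeled, 3'-blocked nucleotides are offered together.
        #    Only the one complementary to the template base is incorporated,
        #    and the 3' block prevents a second incorporation in this cycle.
        incorporated = COMPLEMENT[template[position]]
        # 2. Image acquisition: the fluorophore colour identifies the base.
        read.append(incorporated)
        # 3. The fluorophore and the 3' block are then removed (nothing to
        #    model here), so the next cycle can add exactly one more base.
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_synthesis("ACGTTGCA", cycles=5))
```

The point of the sketch is that the read length equals the number of cycles: one base per cluster per cycle.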
HIGH-THROUGHPUT SEQUENCING
Throughput per run: T = (clusters per tile) × (tiles per lane) × (lanes) × (read length) × 2 (paired-end reads, 76 bp + 76 bp). T = 300000 × 120 × 8 × 76 × 2 ≈ 4.4 × 10^10 bases (~45 gigabases)! Human genome = 3 gigabases. One analysis requires ~6000 gigabytes of data storage!
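The throughput arithmetic can be checked with a few lines of Python. This is a sketch under one assumption: the value 120 on the slide is read as the number of tiles per lane, which is the reading that reproduces the ~45 gigabase figure. The function and parameter names are illustrative.

```python
def run_throughput(clusters_per_tile: int, tiles_per_lane: int, lanes: int,
                   read_length: int, reads_per_cluster: int = 2) -> int:
    """Total bases per run; reads_per_cluster=2 models paired-end reads."""
    return (clusters_per_tile * tiles_per_lane * lanes
            * read_length * reads_per_cluster)


if __name__ == "__main__":
    # Values from the slide; 120 tiles per lane is an assumed interpretation.
    bases = run_throughput(300_000, 120, 8, 76)
    human_genome = 3_000_000_000  # 3 gigabases, as stated on the slide
    print(f"{bases:,} bases = {bases / 1e9:.1f} gigabases")
    print(f"about {bases / human_genome:.1f} human genome equivalents per run")
```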
SANGER SEQ vs. NGS: THROUGHPUT, PER-BASE COST
Allele #1: CAGCGACAGCAGCATTGGGAC
Allele #2: CAGCGACAGCGGCATTGGGAC
NGS Read #1: CAGCGACAGCGGCATTGGGAC
NGS Read #2: CAGCGACAGCAGCATTGGGAC
NGS Read #3: CAGCGACAGCAGCATTGGGAC
NGS Read #4: CAGCGACAGCAGCATTGGGAC
NGS Read #5: CAGCGACAGCGGCATTGGGAC
Coverage = 5
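The five reads on this slide can be piled up column by column to see how coverage and allele counts arise. The sketch below assumes the reads are already aligned and of equal length; the `min_fraction` threshold and the function names are illustrative choices, not part of the original slides.

```python
from collections import Counter

# The five NGS reads shown on the slide (already aligned, same length).
READS = [
    "CAGCGACAGCGGCATTGGGAC",  # Read #1
    "CAGCGACAGCAGCATTGGGAC",  # Read #2
    "CAGCGACAGCAGCATTGGGAC",  # Read #3
    "CAGCGACAGCAGCATTGGGAC",  # Read #4
    "CAGCGACAGCGGCATTGGGAC",  # Read #5
]


def pileup(reads):
    """Count the bases observed in each alignment column."""
    return [Counter(column) for column in zip(*reads)]


def heterozygous_sites(reads, min_fraction=0.2):
    """Return {position: allele counts} where at least two alleles are seen.

    min_fraction is an assumed cutoff: an allele must be carried by at
    least this fraction of the reads to be reported.
    """
    sites = {}
    for position, counts in enumerate(pileup(reads)):
        coverage = sum(counts.values())
        alleles = {base: n for base, n in counts.items()
                   if n / coverage >= min_fraction}
        if len(alleles) > 1:
            sites[position] = alleles
    return sites


if __name__ == "__main__":
    print("coverage:", len(READS))
    print(heterozygous_sites(READS))
```

With a coverage of 5, the A and G alleles are sampled 3 and 2 times at the only variable position, which is why deeper coverage is needed to tell a true heterozygous site from noise.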
HIGH-THROUGHPUT SEQUENCING: APPLICATIONS. DNA: GENOMIC DNA SEQUENCING; RESEQUENCING; DE NOVO SEQUENCING; WHOLE-EXOME SEQUENCING; ChIP-Seq; DEEP SEQUENCING; METHYL-SEQ. RNA: mRNA SEQUENCING; TRANSCRIPTOME SEQUENCING (RNA-SEQ); TAG SEQUENCING (DITAG); MICRO-RNA STUDIES
WHOLE-GENOME, WHOLE-EXOME AND ULTRADEEP SEQUENCING (figure axis: COVERAGE)
ULTRADEEP SEQUENCING: WHEN? Figure labels: M, M; ABL kinase domain
WHOLE-EXOME SEQUENCING
ALIGNMENT DONE: WHAT'S NEXT? VARIANT CALLING: MUTATION, SEQUENCING ERROR OR SNP (SINGLE NUCLEOTIDE POLYMORPHISM)? Figure: read pileups of the CONTROL SAMPLE and the CASE SAMPLE over the reference ..ACTGAATTGCTGATTGTCAAGTCTGCTAGCG.., with the VARIANT position marked. VarScan 2 (http://massgenomics.org/varscan) Koboldt DC et al., Genome Res. 2012 Mar;22(3):568-76
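The question on this slide (mutation, sequencing error or SNP?) is usually answered by comparing read counts at the same position in the case and control samples. The sketch below uses a one-sided Fisher's exact test on the read counts; it is an illustration only, not VarScan 2's code, and the thresholds `alpha` and `min_vaf` as well as the function names are assumptions.

```python
from math import comb


def fisher_one_sided(case_var, case_ref, ctrl_var, ctrl_ref):
    """P(at least case_var variant reads in the case), margins fixed."""
    n_case = case_var + case_ref
    n_var = case_var + ctrl_var
    total = n_case + ctrl_var + ctrl_ref
    upper = min(n_case, n_var)
    # Hypergeometric tail: math.comb returns 0 for impossible tables.
    tail = sum(comb(n_var, k) * comb(total - n_var, n_case - k)
               for k in range(case_var, upper + 1))
    return tail / comb(total, n_case)


def classify(case_var, case_ref, ctrl_var, ctrl_ref,
             alpha=0.05, min_vaf=0.10):
    """Label one position; alpha and min_vaf are assumed cutoffs."""
    case_vaf = case_var / (case_var + case_ref)
    ctrl_vaf = ctrl_var / (ctrl_var + ctrl_ref)
    if case_vaf < min_vaf:
        return "likely sequencing error"
    if ctrl_vaf >= min_vaf:
        return "germline variant (SNP)"
    if fisher_one_sided(case_var, case_ref, ctrl_var, ctrl_ref) <= alpha:
        return "somatic mutation"
    return "uncertain"


if __name__ == "__main__":
    print(classify(15, 15, 0, 30))   # variant only in the case sample
    print(classify(14, 16, 15, 15))  # variant in both samples
    print(classify(1, 29, 0, 30))    # a single variant read
```

A variant present in both samples is treated as inherited, while one that is well supported in the case and absent from the control is a candidate somatic mutation.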
WHOLE-EXOME SEQUENCING GOES DIGITAL (figure panels: CONTROL, CASE)
LOSS OF HETEROZYGOSITY – ALLELIC IMBALANCE (figure: reads carrying the A and T alleles at a heterozygous position, CONTROL vs. CASE)
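Allelic imbalance can be quantified from read counts at positions that are heterozygous in the control. The following is a minimal sketch, assuming each sample is summarised as a pair of read counts for the two alleles; the heterozygosity window and the `min_shift` cutoff are illustrative values, not taken from the slides.

```python
def allele_fraction(first_allele_reads, second_allele_reads):
    """Fraction of reads that carry the second allele."""
    return second_allele_reads / (first_allele_reads + second_allele_reads)


def allelic_imbalance(control, case, het_window=(0.3, 0.7), min_shift=0.25):
    """Compare a (first, second) read-count pair between control and case.

    Returns None when the control is not heterozygous (the position is not
    informative), otherwise True/False for imbalance in the case sample.
    het_window and min_shift are assumed thresholds.
    """
    control_fraction = allele_fraction(*control)
    case_fraction = allele_fraction(*case)
    if not het_window[0] <= control_fraction <= het_window[1]:
        return None
    return abs(case_fraction - control_fraction) >= min_shift


if __name__ == "__main__":
    print(allelic_imbalance(control=(10, 10), case=(18, 2)))
    print(allelic_imbalance(control=(10, 10), case=(11, 9)))
    print(allelic_imbalance(control=(20, 0), case=(20, 0)))
```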
COMPARATIVE WHOLE-EXOME SEQUENCING GOES DIGITAL: CEQer EXONIC QUANTIFICATION ANALYZER Piazza R. et al., PLoS One. 2013 Oct 4;8(10):e74825
Statistical module. Wilcoxon signed-rank test, test statistic W. As sample size increases (Nr > 10), the Z-score converges to a Gaussian distribution! The error function of the normal distribution of W is estimated using the Abramowitz and Stegun approximation, equation 7.1.26.
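The statistical module can be sketched as follows: a Wilcoxon signed-rank statistic W, its normal approximation, and the Abramowitz and Stegun formula 7.1.26 for the error function. This is an illustrative reimplementation, not CEQer's source code; the function names are assumptions, and the variance used here ignores the tie correction.

```python
import math


def erf_as(x):
    """Abramowitz and Stegun 7.1.26 (absolute error below 1.5e-7)."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
             - 0.284496736) * t + 0.254829592) * t
    return sign * (1.0 - poly * math.exp(-x * x))


def signed_rank_test(differences):
    """Return (W, z, two-sided p) using the normal approximation."""
    ordered = sorted((d for d in differences if d != 0), key=abs)
    n = len(ordered)
    if n == 0:
        raise ValueError("all differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    # W is the sum of the signed ranks; under H0 its mean is 0.
    w = sum(r if d > 0 else -r for r, d in zip(ranks, ordered))
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
    z = w / sigma
    # Two-sided p = 2 * (1 - Phi(|z|)) = 1 - erf(|z| / sqrt(2)).
    p = 1.0 - erf_as(abs(z) / math.sqrt(2))
    return w, z, p


if __name__ == "__main__":
    print(signed_rank_test([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]))
```

The normal approximation is only appropriate for larger samples, which matches the Nr > 10 condition on the slide.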
CML-BC PATIENT: CML001BC, Chr9, CDKN2A (p16). Figure tracks: HET POSITION IN CONTROL; Log2 Ratio; EXON
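The Log2 Ratio track compares exon coverage between case and control. A minimal sketch of such a ratio is shown below; normalising each exon by the total reads of its sample and adding a pseudocount are assumptions made here for illustration, not a description of how CEQer computes the value.

```python
from math import log2


def exon_log2_ratio(case_reads, control_reads, case_total, control_total,
                    pseudocount=0.5):
    """log2 of library-size-normalised exon coverage, case over control.

    pseudocount (assumed value) avoids division by zero for exons with
    no reads in one of the samples.
    """
    case_norm = (case_reads + pseudocount) / case_total
    control_norm = (control_reads + pseudocount) / control_total
    return log2(case_norm / control_norm)


if __name__ == "__main__":
    # An exon with half the normalised coverage in the case sample gives a
    # ratio close to -1, as expected for loss of one of two copies.
    print(round(exon_log2_ratio(500, 1000, 40_000_000, 40_000_000), 3))
    # A homozygous deletion gives a strongly negative value.
    print(round(exon_log2_ratio(0, 1000, 40_000_000, 40_000_000), 3))
```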
CML-BC PATIENT: CML004BC, Chr17, p53. http://www.ngsbicocca.org/html/ceqer.html
ANALYSIS OF ONCOGENIC FUSION PRODUCTS
ANALYSIS OF ONCOGENIC FUSION PRODUCTS: FRAGMENTATION?
mRNA-seq – DRIVER FUSION TRANSCRIPTS IDENTIFICATION. Junction reads; bridge reads (paired-end, 76 bp + 76 bp). Piazza R. et al., Nucleic Acids Res. 2012 Sep;40(16):e123
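The two kinds of supporting evidence named on this slide can be told apart from where each mate of a pair maps. The sketch below assumes a simplified, hypothetical representation in which each mate is a tuple of the genes it aligns to (two genes when a single read is split across the fusion point); it is not the data model of the published tool.

```python
from collections import Counter


def classify_pair(mate1_genes, mate2_genes):
    """Classify one read pair as 'junction', 'bridge' or 'normal'.

    Each argument is a tuple of gene names hit by that mate, in read order
    (hypothetical representation used only for this example).
    """
    # Junction read: a single read covers the fusion point, so it hits
    # two different genes.
    for genes in (mate1_genes, mate2_genes):
        if len(set(genes)) > 1:
            return "junction"
    # Bridge read: each mate maps entirely within a different gene.
    if mate1_genes and mate2_genes and mate1_genes[0] != mate2_genes[0]:
        return "bridge"
    return "normal"


def fusion_support(pairs):
    """Count junction and bridge pairs in a list of (mate1, mate2)."""
    return Counter(classify_pair(m1, m2) for m1, m2 in pairs)


if __name__ == "__main__":
    pairs = [
        (("BCR",), ("ABL1",)),        # mates on either side of the fusion
        (("BCR", "ABL1"), ("ABL1",)),  # first mate spans the junction
        (("BCR",), ("BCR",)),         # ordinary pair
    ]
    print(fusion_support(pairs))
```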
Flowchart labels: EXOME DATASET (SAM, BAM); EXOME BUILDER (CCDS / REFFLAT); ABNORMAL PAIRS SCANNER; ABNORMAL PAIRS (ABL ex2, BCR ex14); HALF-MAPPED PAIRS; PUTATIVE TRANSLOCATIONS SET (PTS); PREFILTERING ALGORITHM: Genome Mapping Quality, Read Quality, Homology Filter, Threshold Filter, N Filter; FILTERED PTS; FILTERED HALF-MAPPED PAIRS; ALIGNMENT TO HUMAN GENOME ???
Flowchart labels: FILTERED PTS; JUNCTION FINDER (BCR Ex12, Ex13, Ex14; ABL Ex2, Ex3, Ex4; candidate junctions 1-4); JUNCTIONS LIST; FILTERED HALF-MAPPED PAIRS; ALIGNMENT ALGORITHM (JUNCTION READ across Ex14 / Ex2); JUNCTION ???
Flowchart labels: JUNCTION READ; FRAME ALGORITHM; DIRECTION ALGORITHM (5' BCR, 3' ABL); RECIPROCAL TRANSLOCATION ALGORITHM (5' ABL, BCR 3')
BCR-ABL1 p210 e13a2 t(9;22); BCR-ABL1 p210 e14a2 t(9;22); BCR-ABL1 p190 t(9;22); AML1-ETO t(8;21); CBFB-MYH11 inv(16); CEP110-FGFR1 t(8;9); EWSR1-ERG t(21;22); MLL-MLLT1 t(11;19); MLL-MLLT3 t(9;11); MLLT10-PICALM t(10;11); NCOA4-RET inv(10); NPM-ALK t(2;5)
RNA-SEQ GOES DIGITAL. RPKM = READS PER KILOBASE PER MILLION MAPPED READS. TPM = TRANSCRIPTS PER MILLION. Figure labels: READ; EXON; LOW EXPRESSION; HIGH EXPRESSION. TOPHAT (http://tophat.cbcb.umd.edu/) CUFFLINKS (http://cufflinks.cbcb.umd.edu/) Trapnell C, et al. Nat. Biotechnol. 2010;28:511–515.
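The two expression units defined on this slide can be computed from per-gene read counts and transcript lengths. The sketch below uses made-up gene names and counts, and by default takes the sum of the listed counts as the total number of mapped reads, which is an assumption of the example.

```python
def rpkm(read_counts, lengths_bp, total_mapped_reads=None):
    """Reads per kilobase of transcript per million mapped reads."""
    total = total_mapped_reads or sum(read_counts.values())
    return {gene: count * 1e9 / (lengths_bp[gene] * total)
            for gene, count in read_counts.items()}


def tpm(read_counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to 1e6."""
    rates = {gene: count / lengths_bp[gene]
             for gene, count in read_counts.items()}
    rate_sum = sum(rates.values())
    return {gene: rate / rate_sum * 1e6 for gene, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical genes, counts and lengths, for illustration only.
    counts = {"geneA": 100, "geneB": 300, "geneC": 600}
    lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
    print(rpkm(counts, lengths))
    print(tpm(counts, lengths))
```

In this example geneA and geneB receive the same value in both units even though geneB has three times as many reads, because it is also three times as long. TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM.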
HIGH-THROUGHPUT SEQUENCING: APPLICATIONS. DNA: GENOMIC DNA SEQUENCING; RESEQUENCING; DE NOVO SEQUENCING; WHOLE-EXOME SEQUENCING; ChIP-Seq; DEEP SEQUENCING; METHYL-SEQ. RNA: mRNA SEQUENCING; TRANSCRIPTOME SEQUENCING (RNA-SEQ); TAG SEQUENCING (DITAG); MICRO-RNA STUDIES
METHYL-SEQ
NEXT GENERATION SEQUENCING. STANDARDIZED FILE FORMATS ARE NOW AVAILABLE FOR SEQUENCES AND ALIGNMENTS. A LARGE NUMBER OF TOOLS HAVE BEEN DEVELOPED TO ANALYSE NGS DATA. THE LARGE MAJORITY OF THEM ARE COMPLETELY FREE. MANY TOOLS ARE OPEN-SOURCE. IS THIS THE PERFECT NGS WORLD?
NEXT GENERATION SEQUENCING. INSTALLATION IS CHALLENGING (DEPENDENCY HELL!). THE SAME NGS DATA MUST OFTEN BE INTERROGATED MULTIPLE TIMES. THE LARGE MAJORITY OF NGS TOOLS ARE FAR FROM USER-FRIENDLY. MANY TOOLS RUN ONLY UNDER LINUX.
EXPERIENCE OF THE MONZA HAEMATOLOGY UNIT IN NGS: ATYPICAL MYELOID LEUKEMIA. Atypical chronic myeloid leukemia (aCML) is a clonal disorder belonging to the group of myelodysplastic/myeloproliferative syndromes (MDS/MPN). aCML is characterized by clinical and laboratory findings similar to those of classic CML; however, the absence of the Philadelphia chromosome and of the BCR/ABL fusion product suggests a different pathogenetic mechanism. The prognosis of aCML is poor, with a median survival of 37 months. The molecular cause of aCML is currently unknown. To identify recurrent molecular lesions in aCML, we performed exome sequencing on 8 aCML samples (genomic DNA from leukemic cells + germline DNA).
On average, 8 billion bases sequenced per exome. Mean exome coverage: 80x. 84 somatic exonic variants, 63 of them non-synonymous.
TARGETED RESEQUENCING
Figure labels: Germline; Somatic variants; SGS; aCML; SETBP1
SETBP1 WT vs. SETBP1 mutated (figure panels: MUT vs. WT; p = 0.01; p = 0.008)
Figure labels: Proteases; MYC; pY307; SET; Beta-Catenin; SETBP1; AKT; PP2A
PEST DOMAIN: SESHSEETIPSDSGIGTDNNSTSDQAEKSSE (phosphorylated residues marked p). Beta-TRCP (F-box protein)
Figure labels: Ub; G870S; Beta-TRCP; Proteases; Proteasome; pY307; SET; MYC; SETBP1; Beta-Catenin; PP2A; AKT
Sequence: SESHSEETIPSDSGIGTDNNSTSDQAEKSSE
Pept. WT: Biotin-SHSEETIPSD(pS)GIG(pT)DNNSTS
Pept. G870S: Biotin-SHSEETIPSD(pS)SIG(pT)DNNSTS