
Genetic variation in the genome


nitza

Presentation Transcript


  1. Genetic variation in the genome (background sequence, repeated four times: A G A G T T C T G C T C G A G G G T T A T G C G C G)

  2. International HapMap Project (http://www.hapmap.org)

  3. International HapMap Project (http://www.hapmap.org)

  4. International HapMap Project (http://www.hapmap.org). Biomedical applications:
     • Providing genotype data for different ethnic groups
     • Selecting tagSNPs for association studies, with potential for whole-genome association studies
     • Assessing statistical significance and interpreting results
     • Studying the less common alleles
     • Studying structural variation
     • Pharmacogenomics

  5. Genetic variation databases

  6. Association studies: phenotypic effect of SNPs. Human genetic and phenotypic diversity database, used to estimate the phenotypic effect of each SNP:

                               Genotype              Phenotype
                               SNP1   SNP2   SNP3    Trait i   Disease 1   ...
     Sequence, individual 1    G/T    A/A    G/C     x1        Healthy
     Sequence, individual 2    A/C    C/C    T/T     x2        Cervical cancer
     ...                       ...    ...    ...

  7. Biobanks: large-scale cohort studies • USA • deCODE (Iceland) • Estonia • Germany • Canada • Japan • China

  8. Association Studies

  9. Association Studies • Study design • Statistical analyses

  10. 1st phase: Design Study designs

  11. 1st phase: Design Study designs

  12. 2nd phase: Statistical analysis Statistical analysis methods

  13. 2nd phase: statistical analyses in association studies. Steps:
      • Data validation
      • Genetic description
        • Unidimensional (SNP by SNP)
        • Multidimensional
      • Test for genotype-phenotype association
        • SNP by SNP
        • Multi-SNP / haplotype / tagSNP
      • Power assessment
      • Predictive model

  14. Statistical analyses in association studies. Step: data validation (error sources: sampling, genotyping)
      • Checking against SNPref
      • Hardy-Weinberg proportions (separately for controls and cases)
      • Consistency among samples
      • Stratification (genetic markers)

  15. Hardy-Weinberg test (genotype frequencies, SNP rs1137933)
      • Diallelic SNP: alleles A and a, with relative frequencies $p$ and $q$
      • Genotypic Hardy-Weinberg proportions: AA, Aa and aa occur with frequencies $p^2$, $2pq$ and $q^2$
      • Three test statistics: (i) the Pearson ($\chi^2$) test statistic; (ii) the likelihood-ratio test statistic (G test); (iii) an exact test

  16. Example of Hardy-Weinberg test: genotypes of SNP rs1137933, controls. Allele frequencies: $p = f(C) = f(CC) + f(CT)/2$, $q = 1 - p$.

      Genotype               SS       SF       FF       Total
      Number, observed       76       38       15       129 = N
      Frequency, expected    p^2      2pq      q^2      1.00
      Number, expected       p^2 N    2pq N    q^2 N    N
      Number, expected       70.0     50.1     8.9      129

      Pearson ($\chi^2$) test statistic: $X^2 = \sum_i (O_i - E_i)^2 / E_i$
      Likelihood-ratio (G) test statistic: $G = 2 \sum_i O_i \ln(O_i / E_i)$
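The Hardy-Weinberg calculation in this example can be sketched in a few lines of Python. This is a minimal sketch, not the software used for the slides: the function name `hardy_weinberg_test` is hypothetical, and the p-value uses the chi-square distribution with 1 degree of freedom (three genotype classes, minus one, minus one estimated allele frequency). The observed counts are the slide's 38, 76 and 15, with 38 taken as the heterozygote count. That assignment is an inference: it is the one that reproduces the slide's expected counts (70.0, 50.1 and about 9) up to rounding.

```python
import math

def hardy_weinberg_test(n_hom1, n_het, n_hom2):
    """Test one diallelic SNP for Hardy-Weinberg proportions.

    Returns (p, expected_counts, pearson_x2, g_statistic, p_value_x2).
    """
    n = n_hom1 + n_het + n_hom2
    # Allele frequency: each homozygote carries two copies, each heterozygote one.
    p = (2 * n_hom1 + n_het) / (2 * n)
    q = 1 - p
    observed = [n_hom1, n_het, n_hom2]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    # Pearson statistic: sum of (O - E)^2 / E over the three genotype classes.
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Likelihood-ratio statistic: 2 * sum of O * ln(O / E); empty classes add 0.
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    # Chi-square survival function for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(x2 / 2))
    return p, expected, x2, g, p_value

# Counts from the example (homozygote, heterozygote, homozygote).
p, expected, x2, g, p_value = hardy_weinberg_test(76, 38, 15)
print(round(p, 3), [round(e, 1) for e in expected], round(x2, 2), round(g, 2))
```

The third option mentioned in the slides, an exact test, is not covered here; it is generally preferred when an expected count is small.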

  17. Example of Hardy-Weinberg Test SNP rs1137933

  18. Genetic description, SNP by SNP: genotype frequencies and allele frequencies for SNP rs1137933

  19. Genetic description: multi-SNP. Haplotype inference. The unphased genotype a/t g/c is compatible with two possible haplotype pairs:
      • a) a g and t c
      • b) a c and t g
      Haplotype 1  acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
      Haplotype 2  acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
      Haplotype 3  acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
      Haplotype 4  acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
      Haplotype 5  acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
      Haplotype 6  acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
      Haplotype 7  acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
      Haplotype 8  acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
      Haplotype 9  acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
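The ambiguity in this slide (one unphased genotype, several compatible haplotype pairs) can be illustrated with a short enumeration. This is only a sketch of the combinatorial step: the function name `possible_haplotype_pairs` is hypothetical, and real haplotype inference additionally estimates which pair is most likely from population frequencies, which is not attempted here.

```python
from itertools import product

def possible_haplotype_pairs(genotype):
    """List the haplotype pairs compatible with an unphased genotype.

    genotype: one (allele, allele) tuple per SNP, e.g. [("a", "t"), ("g", "c")].
    Returns a set of unordered pairs, each stored as a sorted tuple.
    """
    pairs = set()
    # At every site, choose which of the two alleles sits on the first haplotype;
    # the other allele then sits on the second haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(site[c] for site, c in zip(genotype, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return pairs

# The genotype from the slide: a/t at the first SNP, g/c at the second.
print(sorted(possible_haplotype_pairs([("a", "t"), ("g", "c")])))
```

With k heterozygous sites the number of compatible pairs grows as 2^(k-1), which is why haplotype frequencies are estimated statistically rather than enumerated for long regions.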

  20. Haplotype frequency estimates

  21. Genetic description: multi-SNP. Linkage disequilibrium measure (Lewontin's D')

              B1                  B2                  Total
      A1      p11 = p1 q1 + D     p12 = p1 q2 - D     p1
      A2      p21 = p2 q1 - D     p22 = p2 q2 + D     p2
      Total   q1                  q2                  1

      $D' = D / D_{\max}$
      $r = D / \sqrt{p_1 p_2 q_1 q_2}$
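The quantities in this table can be computed directly from one haplotype frequency and the two allele frequencies. The sketch below is written under assumptions, not taken from the slides: the function name `linkage_disequilibrium` and the input values are hypothetical, $D_{\max}$ follows the usual Lewontin convention (which the slide does not spell out), and $r$ is computed in its standard form as $D$ divided by $\sqrt{p_1 p_2 q_1 q_2}$.

```python
import math

def linkage_disequilibrium(p11, p1, q1):
    """Return (D, D', r) for two diallelic loci.

    p11: frequency of the A1-B1 haplotype
    p1:  frequency of allele A1;  q1: frequency of allele B1
    Both loci must be polymorphic (0 < p1 < 1 and 0 < q1 < 1).
    """
    p2, q2 = 1 - p1, 1 - q1
    d = p11 - p1 * q1  # from the table: p11 = p1*q1 + D
    # Largest |D| the allele frequencies allow, depending on the sign of D.
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    d_prime = d / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p1 * p2 * q1 * q2)
    return d, d_prime, r

# Hypothetical frequencies, chosen only for illustration.
d, d_prime, r = linkage_disequilibrium(p11=0.5, p1=0.6, q1=0.7)
print(round(d, 3), round(d_prime, 3), round(r, 3))
```

With this convention D' keeps the sign of D; many programs report its absolute value instead.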

  22. Linkage Disequilibrium representation Recombination Hotspot Linkage blocks Associated Sites TagSNPs

  23. Statistical analyses in association studies. Steps:
      • Data validation
      • Genetic description
        • Unidimensional (SNP by SNP)
        • Multidimensional
      • Test for genotype-phenotype association
        • SNP by SNP
        • Multi-SNP / haplotype / tagSNP
      • Power assessment
      • Predictive model

  24. Case-control study. Genotype-phenotype association -> guilty by association.

      SNP           Type             Cases            Controls
      SNP1 (G/C)    Neutral SNP      40% G, 60% C     40% G, 60% C
      SNP2 (A/T)    Mendelian SNP    100% A, 0% T     0% A, 100% T
      SNP3 (T/G)    QTL SNP          80% T, 20% G     60% T, 40% G
      ...
      SNPn

  25. Test for association (SNP by SNP): chi-square independence test, SNP rs1137933
      • Genotypic: chi-square (2 df) = 9.71**, p = 0.00779; G (likelihood ratio) (2 df) = 9.67**, p = 0.00795
      • Allelic: chi-square (1 df) = 0.07, p = 0.79134; G (likelihood ratio) (1 df) = 0.07, p = 0.79134
      • Odds ratio (OR) = 1.05; risk ratio (RR) = 1.02
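The p-values on this slide follow from the chi-square distribution with 1 or 2 degrees of freedom, and both cases have closed forms available in Python's standard library. The sketch below only converts a reported statistic into a p-value; the helper name `chi2_p_value` is hypothetical, and the genotype counts behind the statistics are not in the transcript, so the statistics themselves are not recomputed.

```python
import math

def chi2_p_value(statistic, df):
    """Upper-tail p-value of a chi-square statistic for df = 1 or df = 2."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # A chi-square variable with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # With 2 df the distribution is exponential with mean 2.
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

# Statistics as reported on the slide.
print(chi2_p_value(9.71, 2))  # genotypic test, 2 df
print(chi2_p_value(0.07, 1))  # allelic test, 1 df
```

For other degrees of freedom a statistics library is the practical choice; the two closed forms above are enough for the genotypic (2 df) and allelic (1 df) tests shown here.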

  26. Odds and odds ratio. The odds of an event is the ratio $p / (1 - p)$, where $p$ is the probability of the event. For example, if a disease has a 1 in 5 probability of occurring for a given genotype (0.2, or 20%), then the odds are 0.2 / (1 − 0.2) = 0.2 / 0.8 = 0.25. • The odds ratio is defined as the ratio of the odds of an event occurring in one group to the odds of it occurring in another group. These groups might be case and control groups, or any other dichotomous classification. So if the probabilities of the event in the two groups are $p$ (first group) and $q$ (second group), then the odds ratio is $\mathrm{OR} = \frac{p / (1 - p)}{q / (1 - q)}$.

  27. Odds ratio. The ratio a/c is the odds of exposure observed in the case group; the ratio b/d is the odds of exposure in the control group, so OR = (a/c) / (b/d). OR = 2.2 -> 2.2:1: the odds of the effect (disease) are 2.2 times higher when the other variable (SNP allele) is present than when it is absent.

  28. Relative risk (RR, risk ratio): RR = incidence rate in the exposed / incidence rate in the unexposed

  29. Odds ratio = $\frac{210/100}{250/300} = 2.52$; relative risk = $\frac{210/460}{100/400} = 1.83$
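The two ratios in this example can be reproduced from the underlying 2x2 table. In the sketch below the cell labels follow the a/c and b/d convention of the odds-ratio slide (a and b are exposed cases and exposed controls, c and d are unexposed cases and unexposed controls). The function names are hypothetical, and reading the slide's numbers as a = 210, b = 250, c = 100, d = 300 is an assumption, though it is the only assignment consistent with both fractions shown.

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure in cases (a/c) divided by odds of exposure in controls (b/d)."""
    return (a / c) / (b / d)

def relative_risk(a, b, c, d):
    """Incidence among the exposed, a/(a+b), divided by incidence among the unexposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

# Counts read off the slide's fractions: 210/100, 250/300, 210/460, 100/400.
a, b, c, d = 210, 250, 100, 300
print(round(odds_ratio(a, b, c, d), 2))     # 2.52
print(round(relative_risk(a, b, c, d), 2))  # 1.83
```

The odds ratio exceeds the relative risk here because the outcome is common in both groups; the two measures only approximate each other when the outcome is rare.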

  30. Controlling for other independent variables: genotypic test for SNP rs1137933, stratified by sex (♀ ♂)
      • Chi-square (2 df) = 7.59*, p = 0.02248; G (likelihood ratio) (2 df) = 7.5*, p = 0.02352
      • Chi-square (2 df) = 1.95, p = 0.37719; G (likelihood ratio) (2 df) = 1.98, p = 0.37158

  31. Test for association (multi-SNP): test for association between haplotype and response (disease), or between tagSNP and response

  32. Logistic regression: a statistical regression model for binary dependent variables. It can be regarded as a generalized linear model that uses the logit function as its link function and whose errors follow a binomial distribution.
      • For observations i = 1, ..., n, the logarithm of the odds (the probability divided by one minus the probability) of the outcome is modeled as a linear function of the explanatory variables X1 to Xk: $\operatorname{logit}(p_i) = \ln\frac{p_i}{1 - p_i} = \alpha + \beta_1 x_{1,i} + \dots + \beta_k x_{k,i}$
      • Each estimated parameter β is interpreted through its multiplicative effect on the odds. For a dichotomous explanatory variable such as sex, $e^{\beta}$ (the antilog of β) is the estimated odds ratio of having the outcome when comparing males and females.
      • The parameters α, β1, ..., βk are usually estimated by maximum likelihood.

  33. Logistic regression is a predictive tool. If the logit coefficient β1 = 2.303, the corresponding odds ratio (the exponential function, $e^{\beta_1}$) is 10: when the independent variable increases by one unit, the odds that the dependent variable equals 1 increase by a factor of 10, with the other variables held constant.
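The statement that a coefficient of 2.303 corresponds to a tenfold change in odds can be checked numerically. The sketch below assumes a one-predictor model; the intercept `alpha` is a hypothetical value chosen only for illustration, and the helper names `inv_logit` and `odds` are not from the slides. Only `beta1 = 2.303` comes from the text.

```python
import math

def inv_logit(z):
    """Map a log-odds value z back to a probability."""
    return 1 / (1 + math.exp(-z))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1 - p)

beta1 = 2.303  # coefficient from the slide
alpha = -1.0   # hypothetical intercept, for illustration only

# Predicted probabilities when the predictor is 0 and when it is 1.
p_at_0 = inv_logit(alpha + beta1 * 0)
p_at_1 = inv_logit(alpha + beta1 * 1)

# The ratio of the two odds equals exp(beta1), whatever the intercept is.
print(round(math.exp(beta1), 2))
print(round(odds(p_at_1) / odds(p_at_0), 2))
```

The factor of 10 applies to the odds, not to the probability: with this intercept the probability rises from about 0.27 to about 0.79, which is roughly a threefold increase.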

  34. Links
      • http://bioinfo.iconcologia.net/SNPstats (web tool for association studies)
      • http://www.mep.ki.se/genestat/tl/genass_ldmap (tutorial for association studies)
      • http://linkage.rockefeller.edu/soft (software for genetic analysis)
      • http://www.broad.mit.edu/personal/jcbarret/haploview (Haploview)
      • http://www.genome.gov/26525384 (catalog of published GWA studies)
      • http://geneticassociationdb.nih.gov (database of human disease association studies)

  35. Association studies: web resource http://bioinfo.iconcologia.net/index.php?module=Snpstats

  36. Genetic association -> guilty by association.

      SNP           Patients         Controls
      SNP1 (G/C)    40% G, 60% C     40% G, 60% C
      SNP2 (A/T)    100% A, 0% T     0% A, 100% T
      SNP3 (T/G)    80% T, 20% G     60% T, 40% G
      ...
      SNPn

  37. Today we can carry out association analyses of thousands of SNPs, which can reveal the genetic basis of diseases.

  38. Translation of genetic-phenotypic information into clinical practice. D.R. Bentley. 2004. Nature 429: 440-445

  39. Translation of genetic-phenotypic information into clinical practice

  40. Translation of genetic-phenotypic information into clinical practice

  41. Translation of genetic-phenotypic information into clinical practice
