SNP Discovery and Analysis: Application to Association Studies

Presentation Transcript

  1. SNP Discovery and Analysis: Application to Association Studies
     Mark J. Rieder, PhD; Dana Crawford, PhD; Deborah Nickerson, PhD
     SeattleSNPs PGA, July 19-20, 2005

  2. Practical Aspects of SNP Association Studies
     • SNP Discovery: Where do I find SNPs to use in my association studies? (e.g. databases, direct resequencing)
     • SNP Selection: How do I choose SNPs that are informative? (i.e. assessing SNP correlation - linkage disequilibrium)
     • SNP Associations: What analyses can I perform after genotyping these SNPs? (e.g. single SNP data, haplotype data)
     • SNP Replication/Function: How is function predicted or assessed? (e.g. nonsynonymous SNPs, conserved non-coding regions (CNS), transcription factor binding sites, gene expression)

  3. SeattleSNPs Program for Genomic Applications: Overview
     Aim 1: To establish a variation discovery resource capable of comprehensive resequencing of candidate genes related to HLBS.
     Biological Focus: Inflammation
     Genes and Pathways: Coagulation, Complement, Cytokines, Interacting Partners

  4. SeattleSNPs: SNPs in Candidate Genes
     Average gene size: 26.5 kb
     Comparing 2 haploid genomes: 1 SNP in 1,200 bp
     ~130 SNPs per gene (1 per 200 bp): 15,000,000 SNPs
     ~44 SNPs per gene with MAF > 0.05 (1 per 600 bp): 6,000,000 SNPs
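
The per-gene counts on this slide follow from dividing the average gene size by the SNP spacing. The sketch below only reproduces that arithmetic; the function name `expected_snps` is an illustrative assumption, not something from the presentation.

```python
def expected_snps(region_bp: float, bp_per_snp: float) -> float:
    """Expected number of SNPs in a region with one SNP every `bp_per_snp` bases."""
    if bp_per_snp <= 0:
        raise ValueError("bp_per_snp must be positive")
    return region_bp / bp_per_snp


AVERAGE_GENE_BP = 26_500  # average gene size quoted on the slide (26.5 kb)

# All SNPs: roughly 1 per 200 bp. Common SNPs (MAF > 0.05): roughly 1 per 600 bp.
all_snps = expected_snps(AVERAGE_GENE_BP, 200)
common_snps = expected_snps(AVERAGE_GENE_BP, 600)

print(f"all SNPs per average gene:    about {all_snps:.0f}")
print(f"common SNPs per average gene: about {common_snps:.0f}")
```

Both results land close to the slide's rounded figures of ~130 and ~44 SNPs per gene.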

  5. SeattleSNPs PGA: Candidate Gene SNP Resource
     • 4.9 Mb in 47 individuals = 230 Mb total sequence
     • Define sequence diversity - catalogue all SNPs
     • Select “optimal” tagSNP sets
     • Determine haplotype structure
     • Provide necessary baseline data for association studies

  6. Warfarin Pharmacogenetics
     • Background: warfarin characteristics; pharmacokinetics/pharmacodynamics; discovery of VKORC1
     • VKORC1 - SNP Discovery
     • VKORC1 - SNP Selection (tagSNPs)
     • VKORC1 - SNP Testing
     • SNP/Haplotype Inference
     • Haplotype Inference, Testing
     • VKORC1 - SNP Replication/Function

  7. Pharmacogenomics as a Model for Association Studies
     • Clear genotype-phenotype link: intervention, variable response (pharmacokinetics - 5x variation)
     • Quantitative intervention and response: drug dose, response time, metabolism rate, etc.
     • Target/metabolism of drug generally known: gene target that can be tested directly with response
     • Reduce variability and identify outliers
     • Prospective testing
     • Personalized medicine

  8. Warfarin Background
     • Commonly prescribed oral anti-coagulant
     • In 2003, 21.2 million prescriptions were written for warfarin (Coumadin)
     • Prescribed following MI, atrial fibrillation, stroke, venous thrombosis, prosthetic heart valve replacement, and following major surgery
     • Difficult to determine effective dosage
     • Narrow therapeutic range
       - Monitoring of prothrombin time (INR): 2.0 - 3.0
     • Large inter-individual variation

  9. [Figure: warfarin dose distribution; n = 186 European-American patients; average 5.2 mg/d; 30x dose variability]
     Sources of variability: Patient/Clinical/Environmental Factors; Pharmacokinetic/Pharmacodynamic - Genetic

  10. Warfarin inhibits the vitamin K cycle
      [Diagram labels: Warfarin; CYP2C9 (Inactivation, Pharmacokinetic); Epoxide Reductase; γ-Carboxylase (GGCX); Vitamin K-dependent clotting factors (FII, FVII, FIX, FX, Protein C/S/Z)]

  11. Warfarin Metabolism (Pharmacokinetics)
      • Major pathway for termination of pharmacologic effect is through metabolism of S-warfarin in the liver by CYP2C9
      • CYP2C9 SNPs alter warfarin metabolism:
        CYP2C9*1 (WT) - normal
        CYP2C9*2 (Arg144Cys) - low/intermediate
        CYP2C9*3 (Ile359Leu) - low
      • CYP2C9 alleles occur at a significant minor allele frequency:
        European: *2 - 10.7%, *3 - 8.5%
        Asian: *2 - 0%, *3 - 1-2%
        African-American: *2 - 2.9%, *3 - 0.8%

  12. Effect of CYP2C9 Genotype on Anticoagulation-Related Outcomes (Higashi et al., JAMA 2002)
      Time to stable anticoagulation: CYP2C9-WT ~90 days; CYP2C9-Variant ~180 days. *2 or *3 carriers take longer to reach stable anticoagulation.
      [Figure: warfarin maintenance dose (mg warfarin/day) by genotype group; N = 127, 28, 4, 18, 3, 5]
      - Variant alleles have significant clinical impact
      - Still large variability in warfarin dose (15-fold) in *1/*1 “controls”?

  13. Analysis of Independent Predictors of Warfarin Dose (adapted from Gage et al., Thromb Haemost, 2004)

      Variable                                 Change in Warfarin Dose   P value
      Target INR, per 0.5 increase             21%                       <0.0005
      BMI, per SD                              14%                       <0.0001
      Ethnicity (African-American, [Asian])    13%, [10-15%]             0.003
      Age, per decade                          13%                       <0.0001
      Gender, Female                           12%                       <0.0001
      Drugs (Amiodarone)                       24%                       0.007
      CYP2C9*2, per allele                     19%                       <0.0001
      CYP2C9*3, per allele                     30%                       <0.0001

      ~30% of the variability in warfarin dose is explained by these factors.
      What other candidate genes are influencing warfarin dosing?

  14. Warfarin acts as a vitamin K antagonist
      [Diagram labels: Warfarin; Epoxide Reductase; γ-Carboxylase (GGCX); Pharmacodynamic; CYP2C9 (Inactivation); Vitamin K-dependent clotting factors (FII, FVII, FIX, FX, Protein C/S/Z)]

  15. New Target Protein for Warfarin: Epoxide Reductase (VKORC1)
      Rost et al. & Li et al., Nature (2004)
      [Diagram labels: Epoxide Reductase (VKORC1), 5 kb, chr 16; γ-Carboxylase (GGCX); Clotting Factors (FII, FVII, FIX, FX, Protein C/S/Z)]

  16. Warfarin Resistance: VKORC1 Polymorphisms (Rost et al., Nature, 2004)
      • Rare non-synonymous mutations in VKORC1 are causative for warfarin resistance (15-35 mg/d)
      • No non-synonymous mutations found in ‘control’ chromosomes (n = ~400)

  17. Inter-Individual Variability in Warfarin Dose: Genetic Liabilities
      [Figure: frequency versus warfarin maintenance dose (mg/day), axis marked at 0.5, 5, and 15]
      SENSITIVITY: CYP2C9 coding SNPs - *3/*3
      RESISTANCE: VKORC1 nonsynonymous coding SNPs
      Common VKORC1 non-coding SNPs?

  18. SNP Discovery: Resequencing VKORC1
      • PCR amplicons --> Resequencing of the complete genomic region
      • 5 kb upstream and each of the 3 exons and intronic segments; ~11 kb
      • SeattleSNPs PGA - pga.gs.washington.edu (24 African-Am./23 Europeans)
      • Warfarin-treated clinical patients (UWMC): 186 European
      • Other populations: 96 European, 96 African-Am., 120 Asian

  19. SNP Discovery: Resequencing Results
      Summary of PGA samples (European, n = 23)
        Total: 13 SNPs identified; 10 common/3 rare (<5% MAF)
      Clinical samples (European patients, n = 186)
        Total: 28 SNPs identified; 10 common/18 rare (<5% MAF)
        15 - intronic/regulatory
        7 - promoter SNPs
        2 - 3’ UTR SNPs
        3 - synonymous SNPs
        1 - nonsynonymous: single heterozygous individual, highest warfarin dose = 15.5 mg/d
      How does the comprehensive SNP discovery compare to what was known for this gene?

  20. SNP Discovery: dbSNP database
      dbSNP - NCBI SNP database

  21. SNP Discovery: dbSNP database (VKORC1)
      • SeattleSNPs resequencing: 28 SNPs --> 15 SNPs in the gene region
      • 10 dbSNPs
        - 8/10 confirmations
        - 3 with frequency/genotype data
      • 7 new dbSNP entries generated by SeattleSNPs resequencing
      • 8 dbSNPs/15 SNPs (~50%)

  22. SNP Discovery: dbSNP database (Nickerson and Kruglyak, Nature Genetics, 2001)
      Mar 2005: 5.0 million validated SNPs (1/600 bp)
      5.0/10.0 = 50% of all common SNPs (validated)!

  23. SNP discovery is dependent on your sample population size
      [Figure: fraction of SNPs discovered (0.0 to 1.0) versus minor allele frequency (MAF, 0.0 to 0.5), with curves for 2, 8, 16, 24, 48, and 96 chromosomes]
      Example with 2 chromosomes:
      GTTACGCCAATACAGGATCCAGGAGATTACC
      GTTACGCCAATACAGCATCCAGGAGATTACC
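
The dependence of SNP discovery on sample size can be illustrated with a simple sampling argument. This is a sketch under stated assumptions, not necessarily how the slide's curves were computed: chromosomes are drawn independently, every sampled variant is detected, and a SNP counts as discovered when both alleles appear in the sample. Under those assumptions the probability is 1 - (1 - p)^n - p^n for minor allele frequency p and n chromosomes. The function name is illustrative.

```python
def discovery_probability(maf: float, n_chromosomes: int) -> float:
    """Probability that both alleles of a SNP appear among n sampled chromosomes.

    Assumes independent sampling of chromosomes and perfect variant detection.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must be between 0 and 0.5")
    if n_chromosomes < 1:
        raise ValueError("n_chromosomes must be at least 1")
    major = 1.0 - maf
    # The SNP is missed when every sampled chromosome carries the same allele.
    return 1.0 - major ** n_chromosomes - maf ** n_chromosomes


if __name__ == "__main__":
    mafs = (0.01, 0.05, 0.10, 0.25, 0.50)
    print("chromosomes " + " ".join(f"MAF={m:<5}" for m in mafs))
    for n in (2, 8, 16, 24, 48, 96):
        row = " ".join(f"{discovery_probability(m, n):<9.3f}" for m in mafs)
        print(f"{n:<11} {row}")
```

Under this model, rare SNPs are mostly missed in small samples, while common SNPs are found with high probability once a few dozen chromosomes are sequenced.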

  24. SNP Discovery: dbSNP database
      [Figure: minor allele frequency (MAF) distributions for SeattleSNPs (segments labeled 25% and 75%) and for dbSNP (Perlegen/HapMap) (labeled 50%)]
      Rarer and population-specific SNPs are found by resequencing

  25. dbSNP: Increasing numbers of SNPs now have genotype data
      [Figure labels: HapMap Phase II; Perlegen data]

  26. Current State of dbSNP Many SNPs left to validate and characterize.

  27. Development of a genome-wide SNP map: How many SNPs? (Nickerson and Kruglyak, Nature Genetics, 2001)
      ~10 million common SNPs (>1-5% MAF): 1/300 bp
      Mar 2005: 5.0 million validated SNPs (1/600 bp)
      5.0/10.0 = 50% of all common SNPs validated!
      Coming soon: 5.0 million validated SNPs with genotypes!

  28. SNP Discovery: dbSNP database
      dbSNP issues:
      • Not a comprehensive catalog (50% of SNPs)
      • Is the data confirmed? (50% are validated)
      • Information about allele frequency/population (50%)
      • No information about SNP correlations (linkage disequilibrium); genotyping efficiency

  29. SNP Selection: Using Linkage Disequilibrium
      [Figure: frequency versus warfarin dose (mg/d)]
      • Common SNPs
      • VKORC1 - 28 total - 10 SNPs > 10% MAF
      • Evaluate linkage disequilibrium (non-random association of genotype data)
      Does common variation in VKORC1 have a role in determining warfarin dose?

  30. SNP Selection: Using Linkage Disequilibrium
      Maternal chromosome: Site 1 = C, Site 2 = A
      Paternal chromosome: Site 1 = T, Site 2 = G
      Allele frequencies: Site 1: C 50%, T 50%; Site 2: A 50%, G 50%

      Possible 2-site comb.   Expected Freq.       Observed Freq.
      C A                     0.5 x 0.5 = 0.25     0.50 *
      C G                     0.5 x 0.5 = 0.25     0.01
      T A                     0.5 x 0.5 = 0.25     0.01
      T G                     0.5 x 0.5 = 0.25     0.48 *

      * Sites correlated
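
The comparison of expected and observed two-site frequencies on this slide can be turned into the standard pairwise LD statistics D, D', and r². The sketch below uses the textbook definitions and the observed frequencies from the slide; the function name and argument names are illustrative assumptions. Allele frequencies are derived from the observed haplotype frequencies, so they come out at 0.51/0.49 rather than the rounded 50% shown on the slide.

```python
from typing import Tuple


def pairwise_ld(f_ca: float, f_cg: float, f_ta: float, f_tg: float) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic sites from the four haplotype frequencies."""
    total = f_ca + f_cg + f_ta + f_tg
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")

    p_c = f_ca + f_cg  # frequency of allele C at site 1
    p_a = f_ca + f_ta  # frequency of allele A at site 2

    # D: observed minus expected frequency of the C-A combination.
    d = f_ca - p_c * p_a

    # D' rescales D by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_c * (1 - p_a), (1 - p_c) * p_a)
    else:
        d_max = min(p_c * p_a, (1 - p_c) * (1 - p_a))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two sites.
    denom = p_c * (1 - p_c) * p_a * (1 - p_a)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    d, d_prime, r2 = pairwise_ld(f_ca=0.50, f_cg=0.01, f_ta=0.01, f_tg=0.48)
    print(f"D = {d:.4f}, D' = {d_prime:.4f}, r^2 = {r2:.4f}")
```

With the slide's observed frequencies, r² comes out above 0.9, which is the sense in which the two sites are correlated; with all four combinations at 0.25, D and r² are zero.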

  31. SNP Selection: Using Linkage Disequilibrium
      • SNP discovery data (i.e. population of samples with genotypes)
      • Find all correlated SNPs to minimize the total number of SNPs
      • Maintains genetic information (correlations) for that locus
      LD_Select - SNP tagging/binning algorithm - based on LD (r²), not haplotypes
      Carlson et al., AJHG (2004)
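
The idea of binning correlated SNPs by r² can be sketched as a greedy procedure. This is a simplified illustration in the spirit of r²-based tagging, not the LD_Select implementation itself. The r² threshold of 0.8, the SNP names `s1` to `s5`, and the r² values are all hypothetical, and pairs missing from the table are treated as uncorrelated.

```python
from typing import Dict, FrozenSet, List


def greedy_ld_bins(
    snps: List[str],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.8,  # assumed cutoff; the slide does not state one
) -> List[Dict[str, List[str]]]:
    """Group SNPs into bins; any SNP listed under 'tags' can represent its bin."""

    def linked(a: str, b: str) -> bool:
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # Pick the SNP that exceeds the threshold with the most remaining SNPs.
        best = max(remaining, key=lambda s: sum(linked(s, t) for t in remaining))
        members = [t for t in remaining if linked(best, t)]
        # A tag must exceed the threshold with every other member of the bin.
        tags = [s for s in members if all(linked(s, t) for t in members)]
        bins.append({"members": members, "tags": tags})
        remaining = [t for t in remaining if t not in members]
    return bins


if __name__ == "__main__":
    # Hypothetical pairwise r^2 values for five made-up SNPs.
    toy_r2 = {
        frozenset(("s1", "s2")): 0.95,
        frozenset(("s1", "s3")): 0.90,
        frozenset(("s2", "s3")): 0.60,
        frozenset(("s4", "s5")): 0.82,
    }
    for i, b in enumerate(greedy_ld_bins(["s1", "s2", "s3", "s4", "s5"], toy_r2), 1):
        print(f"Bin {i}: members={b['members']} tags={b['tags']}")
```

Genotyping one tag per bin then captures the remaining SNPs in that bin through correlation, which is how the number of SNPs to genotype is reduced.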

  32. SNP Selection: VG/LD_Select on the Web
      pga.gs.washington.edu/VG2

  33. SNP Selection: tagSNP Data

  34. SNP Selection: VKORC1 tagSNPs

  35. SNP Testing: VKORC1 tagSNPs
      Five bins to test:
      Bin 1 (SNPs 381, 3673, 6484, 6853, 7566): p < 0.001
      Bin 2 (SNPs 2653, 6009): p < 0.02
      Bin 3 (SNP 861): p < 0.01
      Bin 4 (SNP 5808): p < 0.001
      Bin 5 (SNP 9041): p < 0.001
      [Figure: e.g. Bin 1 - SNP 381, by genotype C/C, C/T, T/T]
      SNP x SNP interactions - haplotype analysis?
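
The slide does not say which statistical test produced the per-bin p-values. As one possible approach, a single-SNP test can regress dose on the number of copies of one allele and assess the slope with a permutation test. The sketch below is pure standard-library Python; the function names are illustrative, and the genotype and dose values are made-up toy numbers, not study data.

```python
import random
from statistics import mean
from typing import List, Sequence


def additive_slope(allele_counts: Sequence[int], doses: Sequence[float]) -> float:
    """Least-squares slope of dose on allele count (0, 1, or 2 copies)."""
    mx, my = mean(allele_counts), mean(doses)
    sxx = sum((x - mx) ** 2 for x in allele_counts)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(allele_counts, doses))
    return sxy / sxx


def permutation_p_value(
    allele_counts: Sequence[int],
    doses: Sequence[float],
    n_perm: int = 2000,
    seed: int = 0,
) -> float:
    """Two-sided permutation p-value for the additive slope."""
    rng = random.Random(seed)
    observed = abs(additive_slope(allele_counts, doses))
    shuffled: List[float] = list(doses)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-dose relationship
        if abs(additive_slope(allele_counts, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy data: six individuals per genotype class.
    counts = [0] * 6 + [1] * 6 + [2] * 6
    doses = [6.5, 7.0, 5.8, 6.2, 7.4, 6.0,
             5.0, 4.6, 5.3, 4.9, 5.5, 4.4,
             3.0, 2.6, 3.4, 2.9, 3.1, 2.5]
    print("slope (mg/day per allele):", round(additive_slope(counts, doses), 2))
    print("permutation p-value:", permutation_p_value(counts, doses))
```

Testing one tagSNP per bin this way treats each bin independently; it says nothing about SNP x SNP interactions, which is where the haplotype analysis raised on the slide comes in.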

  36. VKORC1 Summary: SNP Discovery/SNP Selection
      • VKORC1: candidate gene for warfarin dose response
      • SNP discovery performed using PCR/resequencing to catalog common SNPs
        - 28 SNPs found
        - 10 common SNPs
      • SNP discovery using dbSNP
        - 8/10 dbSNPs confirmed
        - 7 new SNPs added
      • SNP selection using linkage disequilibrium
        - 10 common SNPs (> 10% MAF)
        - 5 informative SNPs for genotyping

  37. Haplotypes in Genetic Association Studies
      Two main approaches with haplotypes:
      [Diagram labels: Haplotypes; Pick tagSNPs; Genotype samples; Pick tagSNPs; Infer haplotypes; Test for association]

  38. Haplotypes in Genetic Association Studies
      • How can you get haplotypes?
      • What information do you get from haplotypes?
      • How do you use haplotypes to find tagSNPs?
      • How do you use haplotypes to test for associations?

  39. Haplotypes – The Definition “…a unique combination of genetic markers present in a chromosome.” pg 57 in Hartl & Clark, 1997

  40. Constructing Haplotypes
      • Collect pedigrees
      • Somatic cell hybrids (rodent x human hybrid)
      • Allele-specific PCR
      [Figure: worked genotype examples at SNP 1 (C/T) and SNP 2 (A/G)]

  41. Constructing Haplotypes
      Examples of haplotype inference software:
      EM Algorithm
      Haploview: http://www.broad.mit.edu/mpg/haploview/index.php
      Arlequin: http://lgb.unige.ch/arlequin/
      PHASE v2.1: http://www.stat.washington.edu/stephens/software.html
      HAPLOTYPER: http://www.people.fas.harvard.edu/~junliu/Haplo/docMain.htm
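
To make the EM approach to haplotype inference concrete, here is a textbook-style sketch for the smallest case of two biallelic SNPs. It is not the code used by any of the programs listed above. It assumes random mating, so the two phase configurations of a double heterozygote are weighted by products of the current haplotype frequency estimates. Genotypes are coded as copies (0, 1, or 2) of an arbitrarily chosen allele at each site, and the toy genotypes are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Hap = Tuple[int, int]  # (allele at SNP 1, allele at SNP 2), each coded 0 or 1


def em_two_snp_haplotypes(
    genotypes: Iterable[Tuple[int, int]],
    n_iter: int = 100,
    tol: float = 1e-10,
) -> Dict[Hap, float]:
    """Estimate the four haplotype frequencies from unphased two-SNP genotypes."""
    genotypes = list(genotypes)
    if not genotypes:
        raise ValueError("need at least one genotype")
    n_chrom = 2 * len(genotypes)

    fixed: Counter = Counter()  # haplotypes whose phase is unambiguous
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
            continue
        # At least one site is homozygous, so both haplotypes are determined.
        a1 = (0, 1) if g1 == 1 else (g1 // 2, g1 // 2)
        a2 = (0, 1) if g2 == 1 else (g2 // 2, g2 // 2)
        fixed[(a1[0], a2[0])] += 1
        fixed[(a1[1], a2[1])] += 1

    freqs: Dict[Hap, float] = {h: 0.25 for h in [(0, 0), (0, 1), (1, 0), (1, 1)]}
    for _ in range(n_iter):
        # E-step: expected share of double heterozygotes carrying 00/11.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w_cis = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: re-estimate frequencies from expected haplotype counts.
        counts = {h: float(fixed[h]) for h in freqs}
        counts[(0, 0)] += n_double_het * w_cis
        counts[(1, 1)] += n_double_het * w_cis
        counts[(0, 1)] += n_double_het * (1.0 - w_cis)
        counts[(1, 0)] += n_double_het * (1.0 - w_cis)
        new_freqs = {h: c / n_chrom for h, c in counts.items()}

        delta = max(abs(new_freqs[h] - freqs[h]) for h in freqs)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


if __name__ == "__main__":
    # Hypothetical sample: 4 double homozygotes of each kind plus 2 double heterozygotes.
    toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
    for hap, f in sorted(em_two_snp_haplotypes(toy).items()):
        print(hap, round(f, 4))
```

In the toy sample the unambiguous individuals carry only the 0-0 and 1-1 haplotypes, so the EM iterations resolve the double heterozygotes almost entirely into that same pair.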

  42. Haplotypes in SeattleSNPs
      • >200 genes re-sequenced in inflammation response
      • 2 populations: European- and African-Americans
      • PHASE v2.0 results posted on website
      • Interactive tool (VH1) to visualize and sort haplotypes
      http://pga.gs.washington.edu

  43. Haplotypes in SeattleSNPs

  44. Haplotypes in SeattleSNPs

  45. Haplotypes in SeattleSNPs

  46. Haplotypes in SeattleSNPs

  47. Haplotypes in SeattleSNPs

  48. Haplotypes in SeattleSNPs

  49. Haplotypes in SeattleSNPs

  50. Haplotypes in SeattleSNPs