Regulatory variation and eQTLs Chris Cotsapas cotsapas@broadinstitute.org
Regulatory variation
• What do trait-associated variants do?
• Genetic changes to:
  • Coding sequence **
  • Gene expression levels
  • Splice isomer levels
  • Methylation patterns
  • Chromatin accessibility
  • Transcription factor binding kinetics
  • Cell signaling
  • Protein-protein interactions
[Slide label: Regulatory]
Basic Concepts
History, eQTL, mQTL, others
Within a population
• Damerval et al 1994
  • 42/72 protein levels differ in maize
  • 2D electrophoresis, eyeball spot quantitation
• Problems:
  • genome coverage
  • quantitation
  • post-translational modifications
• Solution: use expression levels instead!
Whole-genome eQTL analysis is an independent GWAS for expression of each gene
[Figure: one genome-wide association scan per gene: gene 1, gene 2, gene 3, gene 4, gene 5, ..., gene N]
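The "one GWAS per gene" framing can be sketched as a nested loop: every gene's expression is tested against every SNP. This is a minimal sketch, assuming additive genotype dosages (0/1/2) and plain least-squares regression; the function names and toy data are hypothetical, and a real analysis would add covariates, normalization and significance testing.

```python
from statistics import mean

def slope_and_r(dosages, levels):
    """Least-squares slope and Pearson r of expression on genotype dosage."""
    mx, my = mean(dosages), mean(levels)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in levels)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, levels))
    if sxx == 0 or syy == 0:  # monomorphic SNP or constant expression
        return 0.0, 0.0
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

def eqtl_scan(genotypes, expression):
    """Run one association scan per gene: gene -> snp -> (slope, r)."""
    return {
        gene: {snp: slope_and_r(d, levels) for snp, d in genotypes.items()}
        for gene, levels in expression.items()
    }

# Hypothetical toy data: 6 individuals, 2 SNPs, 2 genes.
genotypes = {"snp1": [0, 1, 2, 0, 1, 2], "snp2": [0, 0, 1, 1, 2, 2]}
expression = {
    "geneA": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "geneB": [2.0, 2.1, 3.0, 2.9, 4.1, 4.0],
}
scan = eqtl_scan(genotypes, expression)
```

With N genes and M SNPs this performs N × M tests, which is why the multiple-testing burden grows so quickly.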
Genetics of gene expression (eQTL)
• cis-eQTL
  • The position of the eQTL maps near the physical position of the gene.
  • Promoter polymorphism?
  • Insertion/Deletion?
  • Methylation, chromatin conformation?
• trans-eQTL
  • The position of the eQTL does not map near the physical position of the gene.
  • Regulator?
  • Direct or indirect?
Modified from Cheung and Spielman 2009 Nat Gen
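The cis/trans distinction comes down to a distance rule. Below is a minimal sketch, assuming that "near" means within a fixed window of the gene's transcription start site on the same chromosome; the 1 Mb window, the use of the TSS as the gene position, and the function name are illustrative assumptions, not values from the slides.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_tss, window=1_000_000):
    """Label an eQTL 'cis' if the SNP is on the gene's chromosome and within
    `window` bp of its TSS, otherwise 'trans'.

    The 1 Mb default is an assumed convention, not taken from the slides.
    """
    if snp_chrom == gene_chrom and abs(snp_pos - gene_tss) <= window:
        return "cis"
    return "trans"

# Hypothetical coordinates for illustration only.
near = classify_eqtl("1", 1_500_000, "1", 1_000_000)
far_same_chrom = classify_eqtl("1", 50_000_000, "1", 1_000_000)
other_chrom = classify_eqtl("2", 1_000_000, "1", 1_000_000)
```

Under this rule a distant SNP on the same chromosome counts as trans, so the choice of window directly changes how many associations are called cis.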
eQTL: the array era
Yeast, mouse, maize, human
Yeast
• Brem et al Science 2002
• Linkage in 40 offspring of lab x wild strain cross
• 1528/6215 genes differentially expressed (DE) between parents
  • 570 map in cross
  • multiple QTLs
  • 32% of 570 have cis linkage
• 262 not DE in parents also map
trans hotspots
Brem et al Science 2002
Mammals I
• F2 mice on atherogenic diet
• Expression arrays; whole-genome (WG) linkage
Schadt et al Nature 2003
Mammals II 10% !! Chesler et al Nat Genet 2005
Mammals III
• No major trans loci in humans
  • Cheung et al Nature 2003
  • Monks et al AJHG 2004
  • Stranger et al PLoS Genet 2005, Science 2007
Open question
Where are the trans eQTLs?
Issues with trans mapping
• Power
  • Genome-wide significance is 5e-8
  • Multiple testing on ~20K genes
  • Sample sizes clearly inadequate
• Data structure
  • Bias corrections deflate variance
  • Non-normal distributions
• Sample sizes
  • Far too small
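The power bullets combine into simple arithmetic. As a rough sketch, assuming a Bonferroni-style correction (the slides do not name a specific correction) applied across ~20K genes on top of the single-trait genome-wide threshold:

```python
def per_test_threshold(alpha_single_trait=5e-8, n_genes=20_000):
    """Bonferroni-style threshold when each gene is its own genome-wide scan.

    Both defaults come from the slide; the division itself is an assumed
    (conservative) way of accounting for testing every gene.
    """
    return alpha_single_trait / n_genes

threshold = per_test_threshold()  # 5e-8 / 20,000 = 2.5e-12
```

A per-test threshold of this order is far stricter than a single-trait GWAS requires, which is one way to see why cohorts of a few hundred individuals are underpowered for trans mapping.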
But…
• Assume that trans eQTLs affect many genes…
• …and you can use cross-trait methods!
Cross-phenotype meta-analysis
$S_{\mathrm{CPMA}} \sim \dfrac{L(\mathrm{data} \mid \lambda \neq 1)}{L(\mathrm{data} \mid \lambda = 1)}$
Cotsapas et al, PLoS Genetics
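One way to read the likelihood ratio is sketched below. It assumes, and this is not stated on the slide, that λ is the rate of an exponential distribution fitted to −ln(p) for one SNP across many traits, so that λ = 1 corresponds to uniformly distributed null p-values. The function name is hypothetical and this should not be taken as the published implementation.

```python
import math

def cpma_like_statistic(p_values):
    """Likelihood-ratio statistic for exponential rate lambda != 1 vs lambda = 1.

    Assumes every p-value lies strictly between 0 and 1 and that at least
    one is below 1, so the sum of -ln(p) is positive.
    """
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    lam_hat = n / total                               # MLE of the rate
    loglik_alt = n * math.log(lam_hat) - lam_hat * total
    loglik_null = -total                              # rate fixed at 1
    return 2.0 * (loglik_alt - loglik_null)

# Hypothetical inputs: one SNP tested against three traits.
null_like = cpma_like_statistic([math.exp(-1)] * 3)   # mean of -ln(p) is 1
enriched = cpma_like_statistic([math.exp(-5)] * 3)    # consistently small p
```

The statistic is near zero when the p-values look uniform and grows when a SNP shows modest association with many traits at once, which is the situation the "affects many genes" assumption describes.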
Open research questions
• Do trans effects exist?
  • Yes: heritability estimates suggest so.
• Can we detect them?
  • Larger cohorts?
    • Most eQTL studies ~50-500 individuals
    • See later, GTEx Project
  • Better methods?
    • Collapsing data?
    • PCA, summary statistics, modeling?
First, let's define the question
• Can we use genetic perturbations as a way to understand how genes are regulated?
  • In what groups, in which tissues?
  • To what stimuli/signaling events?
• Do cis eQTLs perturb promoter elements?
• Do trans eQTLs perturb TFs? Signaling cascades?
Significant associations are symmetrically distributed around TSS
Most significant SNP per gene; 0.001 permutation threshold
Stranger et al., PLoS Gen 2012
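A permutation threshold on the most significant SNP per gene can be sketched as follows. This is a generic max-statistic permutation test, offered as an assumption about the general approach rather than the cited study's exact procedure; it uses Pearson correlation for brevity, and the function names and toy data are hypothetical.

```python
import random
from statistics import mean

def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(sxy) / (sxx * syy) ** 0.5

def permutation_p(cis_genotypes, levels, n_perm=999, seed=0):
    """Empirical gene-level p-value for the best SNP, by shuffling expression."""
    rng = random.Random(seed)
    observed = max(abs_corr(g, levels) for g in cis_genotypes)
    shuffled = list(levels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-expression link
        if max(abs_corr(g, shuffled) for g in cis_genotypes) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 12 individuals, 2 cis SNPs, one gene.
cis_genotypes = [
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
]
levels = [1.0, 1.1, 0.9, 1.2, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
p_gene = permutation_p(cis_genotypes, levels)
```

Taking the maximum over SNPs inside each permutation accounts for having picked the best SNP per gene. With 999 permutations the smallest attainable p-value is 0.001, so a gene can only reach a 0.001 threshold if no permutation matches the observed statistic.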
69-80% of cis associations are cell type-specific
[Figure: cell type-specific and cell type-shared gene associations (0.001 permutation threshold), by number of cell types with gene association]
• cis association sharing increases slightly when significance thresholds are relaxed
• Cell type specificity verified experimentally for subset of eQTLs
Dimas et al Science 2009. Slide courtesy Antigone Dimas.
Open research questions
• Do cis eQTLs perturb functional elements?
  • Given each is independent, how can we know?
• Do tissue-specific effects correlate with the expression of a gene across tissues? Or a regulator?
  • Perhaps a gene is expressed, but in response to different regulators across tissues?
• If we ever find trans eQTLs…
  • Common regulators of coregulated genes?
  • Tissue specificity?
  • Mechanisms?
Application to GWAS
Candidate genes, perturbations underlying organismal phenotypes
eQTLs as intermediate traits Schadt et al Nat Genet 2005
Importance of cataloguing regulatory variation in multiple cell types
Exploring eQTLs in the relevant cell type is important for disease association studies
[Figure panels: relevant cell type for disease; cell type not relevant for disease]
Slide courtesy Antigone Dimas. Modified from Nica and Dermitzakis Hum Mol Genet 2008
Barrett et al 2008; de Jager et al 2007
Franke et al 2010; Anderson et al 2011
Shared association in 8 HapMap populations APOH: apolipoprotein H Stranger et al., PLoS Gen 2012
Number of genes with cis-eQTL associations 8 extended HapMap populations SRC: permutation threshold Stranger et al., PLoS Gen 2012
Direction of allelic effect: same SNP-gene combination across populations
[Figure: log2 expression in Population 1 and Population 2; panels: AGREEMENT, OPPOSITE]
Stranger et al., PLoS Gen 2012
Population differences could have non-genetic basis
• Differences due to environment? (Idaghdour et al. 2008)
• Differences in cell line preparation? (Stranger et al. 2007)
• Differences due to batch effects? (Akey et al. 2007)
(Reviewed in Gilad et al. 2008)
Slide courtesy Alkes Price
Gene expression experiment
Does gene expression in 60 CEU + 60 YRI vary with ancestry?
Does gene expression in 89 AA vary with % Eur ancestry?
60 CEU + 60 YRI from HapMap, 89 AA from Coriell HD100AA
Gene expression measurements at 4,197 genes obtained using Affymetrix Focus array
Slide courtesy Alkes Price
Gene expression differences in African Americans validate CEU-YRI differences
12% ± 3% in cis
c = 0.43 (± 0.02) (P-value < 10^-25)
Slide courtesy Alkes Price
Emerging efforts
RNAseq, GTEx
RNAseq questions
• Standard eQTLs
  • Montgomery et al, Pickrell et al Nature 2010
• Isoform eQTLs
  • Depth of sequence!
  • Long genes are preferentially sequenced
  • Abundant genes/isoforms ditto
  • Power!?
• Mapping biases due to SNPs
Strategies for transcript assembly Garber et al. Nat Methods 8:469 (2011)