6.047/6.878/HST.507 Computational Biology: Genomes, Networks, Evolution. Lecture 15 Regulatory variation and eQTLs. Chris Cotsapas cotsapas@broadinstitute.org. Module 4: Population / Evolution / Phylogeny. L15/16: Association mapping for disease and molecular traits
6.047/6.878/HST.507 Computational Biology: Genomes, Networks, Evolution • Lecture 15: Regulatory variation and eQTLs • Chris Cotsapas cotsapas@broadinstitute.org
Module 4: Population / Evolution / Phylogeny • L15/16: Association mapping for disease and molecular traits • Statistical genetics: disease mapping in populations (Mark Daly) • Quantitative traits and molecular variation: eQTLs, cQTLs • L17/18: Phylogenetics / Phylogenomics • Phylogenetics: evolutionary models, tree building, phylo inference • Phylogenomics: gene/species trees, coalescent models, populations • L19/20: Human history, missing heritability • Measuring natural selection in human populations • The missing heritability in genome-wide associations • And done! Last pset Nov 11 (no lab), in-class quiz on Nov 20 • No lab 4! Then entire focus shifts to projects, Thanksgiving, Frontiers
Today: Regulatory variation and eQTLs • Quantitative Trait Loci (QTLs), Regulatory Variation • Molecular phenotypes as QTs: expression, chromatin… • Discretization: a GWAS for each gene. Cis-/Trans-eQTLs • Underlying regulatory variation: eQTLs, GWAS, cis-eQTL • Finding trans-eQTLs (distal from gene that varies) • Challenges: Power, structure, sample size • Cross-phenotype analysis: trans QTLs affect many genes • Identifying underlying regulatory mechanisms • Cis-eQTLs: TSS-distance, cell type specificity • eQTLs vs. GWAS: Expression as intermediate trait • Population differences, emerging efforts • Shared associations, SNP-gene pairs, allelic direction • Confound: environment, preparation, batch, ancestry
Quantitative traits • Weight, height • Anything measurable • Today: gene expression • QTLs (QT loci) • The loci that control quantitative traits
Regulatory variation • What do trait-associated variants do? • Genetic changes to: • Coding sequence ** • Regulatory: • Gene expression levels • Splice isoform levels • Methylation patterns • Chromatin accessibility • Transcription factor binding kinetics • Cell signaling • Protein-protein interactions
Basic Concepts: history, eQTL, mQTL, others
Within a population • Damerval et al 1994 • 42/72 protein levels differ in maize • 2D electrophoresis, eyeball spot quantitation • Problems: • genome coverage • quantitation • post-translational modifications • Solution: use expression levels instead!
Usual mapping tools available • Discretization approach
Whole-genome eQTL analysis is an independent GWAS for expression of each gene [Figure: one genome-wide association scan per gene, gene 1 through gene N]
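A minimal sketch of the "one GWAS per gene" idea, assuming additive genotype coding (0/1/2 copies of an allele) and a simple least-squares fit. The function names `linear_fit` and `eqtl_scan` and the toy data are hypothetical illustrations, not taken from the lecture; real analyses add covariates, normalization, and proper significance testing.

```python
import random

def linear_fit(x, y):
    """Least-squares slope and Pearson r for y ~ x (pure Python)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # monomorphic SNP or constant expression
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

def eqtl_scan(genotypes, expression):
    """Treat each gene's expression as its own quantitative trait.

    genotypes:  {snp_id: [0/1/2 dosage per individual]}
    expression: {gene_id: [expression level per individual]}
    Returns {gene_id: (best_snp, slope, r)}.
    """
    results = {}
    for gene, levels in expression.items():      # one "GWAS" per gene
        best = None
        for snp, dosages in genotypes.items():   # test every SNP
            slope, r = linear_fit(dosages, levels)
            if best is None or abs(r) > abs(best[2]):
                best = (snp, slope, r)
        results[gene] = best
    return results

# Toy data (hypothetical): geneA is driven by snp2, geneB is pure noise.
rng = random.Random(0)
n = 60
genotypes = {f"snp{i}": [rng.choice([0, 1, 2]) for _ in range(n)]
             for i in range(5)}
expression = {
    "geneA": [2.0 * g + rng.gauss(0, 0.5) for g in genotypes["snp2"]],
    "geneB": [rng.gauss(0, 1) for _ in range(n)],
}
scan = eqtl_scan(genotypes, expression)
print(scan["geneA"][0])  # the SNP most strongly associated with geneA
```

The nested loop makes the cost explicit: the number of tests is (number of genes) × (number of SNPs), which is what drives the multiple-testing burden.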
Genetics of gene expression (eQTL) • cis-eQTL • The position of the eQTL maps near the physical position of the gene. • Promoter polymorphism? • Insertion/Deletion? • Methylation, chromatin conformation? • trans-eQTL • The position of the eQTL does not map near the physical position of the gene. • Regulator? • Direct or indirect? Modified from Cheung and Spielman 2009 Nat Gen
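The cis/trans distinction above can be sketched as a distance rule. The slide only says "near"; the 1 Mb window used here is an assumption (a commonly used convention, but studies differ), and `classify_eqtl` is a hypothetical helper name.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_tss, window=1_000_000):
    """Label a SNP-gene pair as 'cis' or 'trans'.

    'cis' means same chromosome and within `window` bp of the gene's
    transcription start site (TSS). The 1 Mb default is an assumption.
    """
    if snp_chrom == gene_chrom and abs(snp_pos - gene_tss) <= window:
        return "cis"
    return "trans"

print(classify_eqtl("chr1", 1_500_000, "chr1", 1_000_000))   # cis
print(classify_eqtl("chr1", 50_000_000, "chr1", 1_000_000))  # trans
print(classify_eqtl("chr2", 1_000_000, "chr1", 1_000_000))   # trans
```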
eQTL, the array era: yeast, mouse, maize, human
Yeast • Brem et al Science 2002 • Linkage in 40 offspring of lab x wild strain cross • 1528/6215 DE between parents • 570 map in cross • multiple QTLs • 32% of 570 have cis linkage • 262 not DE in parents also map
trans hotspots Brem et al Science 2002
Mammals I • F2 mice on atherogenic diet • Expression arrays; WG linkage Schadt et al Nature 2003
Mammals II 10% !! Chesler et al Nat Genet 2005
Mammals III • No major trans loci in humans • Cheung et al Nature 2003 • Monks et al AJHG 2004 • Stranger et al PLoS Genet 2005, Science 2007
Open question: Where are the trans eQTLs?
Issues with trans mapping • Power • Genome-wide significance is 5e-8 • Multiple testing on ~20K genes • Sample sizes clearly inadequate • Data structure • Bias corrections deflate variance • Non-normal distributions • Sample sizes • Far too small
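A back-of-the-envelope sketch of the power problem above, assuming a plain Bonferroni correction across genes (an illustrative assumption; the slide does not say which correction is intended). The 5e-8 threshold is conventionally read as 0.05 spread over roughly one million independent common variants.

```python
import math

gwas_alpha = 5e-8      # genome-wide significance for ONE trait (from the slide)
n_genes = 20_000       # ~20K expression traits (from the slide)

# Each gene is its own GWAS, so a Bonferroni-style correction across genes
# divides the single-trait threshold again.
per_test_alpha = gwas_alpha / n_genes

# Implied number of independent SNP tests behind the 5e-8 convention.
implied_snp_tests = 0.05 / gwas_alpha

print(f"per-test threshold: {per_test_alpha:.1e}")            # 2.5e-12
print(f"total tests: {implied_snp_tests * n_genes:.1e}")      # 2.0e+10
```

With tens to hundreds of individuals, effects would need to be very large to reach a threshold this strict, which is one reading of "sample sizes clearly inadequate".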
But… • Assume that trans eQTLs affect many genes… • …and you can use cross-trait methods!
Cross-phenotype meta-analysis • $S_{\mathrm{CPMA}} \sim \dfrac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$ • Cotsapas et al, PLoS Genetics
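One way to read the likelihood ratio above, as a hedged sketch rather than the published implementation: for a single SNP, collect its association p-values across many expression traits. If the SNP affects none of them, the p-values are uniform, so -ln(p) follows an exponential distribution with rate λ = 1. The sketch assumes this exponential model, fits λ by maximum likelihood, and compares it with λ = 1 through a likelihood-ratio statistic referred to a chi-square distribution with 1 degree of freedom. The function name `cpma_statistic` and the toy inputs are hypothetical; check the details against Cotsapas et al before relying on them.

```python
import math

def cpma_statistic(p_values):
    """Likelihood-ratio test of exponential rate lambda = 1 for -ln(p).

    Returns (lambda_hat, statistic, p_value). Assumes independent traits,
    which real expression data generally violate.
    """
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    lam_hat = n / total                      # MLE of the exponential rate

    def loglik(lam):
        return n * math.log(lam) - lam * total

    stat = 2.0 * (loglik(lam_hat) - loglik(1.0))
    stat = max(stat, 0.0)                    # guard against rounding below 0
    # Survival function of chi-square with 1 df: erfc(sqrt(stat / 2))
    return lam_hat, stat, math.erfc(math.sqrt(stat / 2.0))

# Toy inputs (hypothetical): evenly spaced "null" p-values, and the same
# values squared to mimic a SNP with small p-values across many traits.
n = 1000
null_p = [(i + 0.5) / n for i in range(n)]
shifted_p = [p * p for p in null_p]

lam_null, s_null, pval_null = cpma_statistic(null_p)
lam_alt, s_alt, pval_alt = cpma_statistic(shifted_p)
print(round(lam_null, 2), round(lam_alt, 2))
```

The design point is that no single trait needs to reach genome-wide significance: the test pools weak evidence across traits, which fits the assumption that trans eQTLs affect many genes.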
Open research questions • Do trans effects exist? • Yes – heritability estimates suggest so. • Can we detect them? • Larger cohorts? • Most eQTL studies ~50-500 individuals • See later, GTEx Project • Better methods? • Collapsing data? • PCA, summary statistics, modeling?
First, let’s define the question • Can we use genetic perturbations as a way to understand how genes are regulated? • In what groups, in which tissues? • To what stimuli/signaling events? • Do cis eQTLs perturb promoter elements? • Do trans eQTLs perturb TFs? Signaling cascades?
Significant associations are symmetrically distributed around TSS Most significant SNP per gene 0.001 permutation threshold Stranger et al., PLoS Gen 2012
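A sketch of one common way to obtain a per-gene permutation threshold, assuming the procedure is: shuffle expression values across individuals, record the best association among the gene's SNPs each time, and ask how often the shuffled best beats the observed best. Whether this matches the cited study's exact procedure is an assumption; `permutation_p` and the toy data are hypothetical.

```python
import random

def pearson_r(x, y):
    """Pearson correlation in pure Python (0.0 if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def permutation_p(genotypes, levels, n_perm=999, seed=0):
    """Empirical p-value for the most significant SNP of one gene."""
    rng = random.Random(seed)

    def best(lv):
        return max(abs(pearson_r(d, lv)) for d in genotypes.values())

    observed = best(levels)
    shuffled = list(levels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the genotype-expression link
        if best(shuffled) >= observed:
            hits += 1
    # +1 in numerator and denominator keeps the estimate above zero
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): one gene with a strong cis effect, one without.
rng = random.Random(1)
n = 60
cis_snps = {f"snp{i}": [rng.choice([0, 1, 2]) for _ in range(n)]
            for i in range(5)}
strong = [2.0 * g + rng.gauss(0, 0.5) for g in cis_snps["snp0"]]
noise = [rng.gauss(0, 1) for _ in range(n)]

p_strong = permutation_p(cis_snps, strong)
p_noise = permutation_p(cis_snps, noise)
print(p_strong, p_noise)
```

With 999 permutations the smallest attainable p-value is 1/1000, so a 0.001 threshold needs at least that many shuffles. Taking the maximum over SNPs inside each permutation accounts for testing many SNPs per gene.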
69-80% of cis associations are cell type-specific [Figure: cell type-specific and cell type-shared gene associations at the 0.001 permutation threshold, by number of cell types with gene association] • cis association sharing increases slightly when significance thresholds are relaxed • Cell type specificity verified experimentally for subset of eQTLs Dimas et al Science 2009 (slide courtesy Antigone Dimas)
Open research questions • Do cis eQTLs perturb functional elements? • Given each is independent, how can we know? • Do tissue-specific effects correlate with the expression of a gene across tissues? Or a regulator? • Perhaps a gene is expressed, but in response to different regulators across tissues? • If we ever find trans eQTLs… • Common regulators of coregulated genes? • Tissue specificity? • Mechanisms?
Application to GWAS: candidate genes, perturbations underlying organismal phenotypes
eQTLs as intermediate traits Schadt et al Nat Genet 2005
Exploring eQTLs in the relevant cell type is important for disease association studies [Figure: relevant cell type for disease vs. cell type not relevant for disease] • Importance of cataloguing regulatory variation in multiple cell types • Slide courtesy Antigone Dimas • Modified from Nica and Dermitzakis Hum Mol Genet 2008
Barrett et al 2008 • de Jager et al 2007
Franke et al 2010 • Anderson et al 2011
Shared association in 8 HapMap populations APOH: apolipoprotein H Stranger et al., PLoS Gen 2012
Number of genes with cis-eQTL associations 8 extended HapMap populations SRC: permutation threshold Stranger et al., PLoS Gen 2012
Direction of allelic effect: same SNP-gene combination across populations [Figure: log2 expression by genotype in Population 1 vs. Population 2; panels AGREEMENT and OPPOSITE] Stranger et al., PLoS Gen 2012
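A small sketch of how agreement in allelic direction could be checked for one SNP-gene pair: fit the slope of log2 expression on allele dosage in each population and compare signs. It assumes dosage counts the same allele in both populations, otherwise the signs are not comparable. `slope`, `allelic_direction`, and the values are hypothetical.

```python
def slope(dosages, levels):
    """Least-squares slope of expression on allele dosage (0/1/2)."""
    n = len(dosages)
    mx, my = sum(dosages) / n, sum(levels) / n
    sxx = sum((d - mx) ** 2 for d in dosages)
    sxy = sum((d - mx) * (v - my) for d, v in zip(dosages, levels))
    return 0.0 if sxx == 0 else sxy / sxx

def allelic_direction(pop1, pop2):
    """Compare effect direction for one SNP-gene pair in two populations.

    Each argument is (dosages, log2_expression), with dosage counted
    for the SAME allele in both populations.
    """
    b1, b2 = slope(*pop1), slope(*pop2)
    if b1 == 0 or b2 == 0:
        return "undetermined"
    return "agreement" if (b1 > 0) == (b2 > 0) else "opposite"

# Toy values (hypothetical), six individuals per population.
dosage = [0, 0, 1, 1, 2, 2]
pop_a = (dosage, [1.0, 1.2, 2.1, 1.9, 3.0, 3.1])   # expression rises
pop_b = (dosage, [0.9, 1.1, 1.8, 2.2, 2.9, 3.3])   # expression rises
pop_c = (dosage, [3.2, 3.0, 2.0, 2.1, 1.1, 0.9])   # expression falls

print(allelic_direction(pop_a, pop_b))  # agreement
print(allelic_direction(pop_a, pop_c))  # opposite
```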