
Epigenetic Analysis

BIOS 691-803, Statistics for Systems Biology, Spring 2008. Kinds of questions: Where are the epigenetic modifications? How do they co-vary? How do epigenetic changes affect expression of genes?


Presentation Transcript


  1. Epigenetic Analysis • BIOS 691-803 Statistics for Systems Biology • Spring 2008

  2. Kinds of Questions • Where are the epigenetic modifications? • How do they co-vary? • How do epigenetic changes affect expression of genes?

  3. Covariation of Epigenetic Measures • Motivating questions • How are epigenetic modifications related? • What are the major determinants of epigenetic state? • Statistical techniques • Covariance calculation • Principal component analysis • Linear models

  4. Location and Covariance • Question: do epigenetic modifiers act on specific targets or do they act on whole regions of DNA? • Direct experimental evidence contradictory • Statistics may help: • Covariation patterns may be evidence

  5. Calcitonin A gene • Two CpG clusters plus 3 odd CpGs • High correlation within clusters • Figure: CalcA in NCI-60

  6. CDH1 in NCI 60

  7. Covariation in Methylation of 7 Genes • Individual genes have multiple CpG sites • Most variation: overall methylation • Figure: correlation map of 108 CpG sites in 6 genes across 5 ECOG pilot samples (red = 1, white = 0, blue < 0) Epigenomic Analysis

  8. Methylation and Expression • Single gene (E-cadherin) results suggest overall methylation correlated with expression

  9. Methylation and Expression • HELP assay gives genome-wide sampling of methylation sites at 15K genes • If we select genes with S/N > 2 in both measures, the correlations with the associated genes are bimodal

  10. What Causes Methylation? • NCI-60 derived from various tissues • Tissue characteristic profile + specific history of cells • Fit linear model to each methylation site • 9 tissues for 60 observations • 51 error df • Overall 41% of variance attributable to tissue • What causes the remainder of methylation differences?
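
The per-site linear model on this slide can be sketched as a one-way model with tissue as the only factor, so the fitted value for each cell line is its tissue mean. The data below are simulated and the function name `tissue_variance_fraction` is hypothetical; only the layout (60 samples, 9 tissues, 51 error df) is taken from the slide.

```python
import numpy as np

def tissue_variance_fraction(meth, tissue):
    """Fit a one-way model (tissue as the only factor) to every site.

    meth   : (n_samples, n_sites) methylation values
    tissue : length n_samples array of tissue labels
    Returns the pooled fraction of variance explained by tissue,
    the residual matrix, and the error degrees of freedom.
    """
    meth = np.asarray(meth, dtype=float)
    tissue = np.asarray(tissue)
    labels = np.unique(tissue)
    fitted = np.empty_like(meth)
    for label in labels:
        rows = tissue == label
        # In a one-way model the fitted value is the group mean.
        fitted[rows] = meth[rows].mean(axis=0)
    residuals = meth - fitted
    ss_total = ((meth - meth.mean(axis=0)) ** 2).sum()
    ss_resid = (residuals ** 2).sum()
    error_df = meth.shape[0] - labels.size
    return 1.0 - ss_resid / ss_total, residuals, error_df

# Simulated stand-in for the NCI-60 data: 60 cell lines, 9 tissues, 200 sites.
rng = np.random.default_rng(0)
tissue = np.repeat(np.arange(9), [7, 7, 7, 7, 7, 7, 6, 6, 6])
tissue_effect = rng.normal(0.0, 1.0, size=(9, 200))
meth = tissue_effect[tissue] + rng.normal(0.0, 1.0, size=(60, 200))

fraction, residuals, error_df = tissue_variance_fraction(meth, tissue)
print(f"error df = {error_df}, variance attributable to tissue = {fraction:.2f}")
```

The residual matrix returned here is the natural input for asking what drives the remaining variation.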

  11. PCA for Cell-specific Factors • Residual variance has one strong PC • Remainder are ‘noise’ • 1st PC is almost constant • Reflects overall level of methylation • Is this an artifact or is it real? • Significantly correlated with expression of DNMT1 & DNMT3A
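
A minimal sketch of the residual PCA described on this slide, using a singular value decomposition of the centered residual matrix. The residuals and the DNMT1 expression values are simulated here (a single per-sample methylation level plus noise), so the numbers illustrate the pattern only; `residual_pca` is a hypothetical helper name.

```python
import numpy as np

def residual_pca(residuals):
    """PCA via SVD. Rows are samples, columns are methylation sites."""
    centered = residuals - residuals.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / (s ** 2).sum()   # variance fraction per component
    scores = u * s                        # sample coordinates on each PC
    return explained, vt, scores

# Simulated residuals: one per-sample "overall methylation" level shared
# by all sites, plus independent noise at each site.
rng = np.random.default_rng(1)
level = rng.normal(0.0, 1.0, size=60)
residuals = level[:, None] + rng.normal(0.0, 0.3, size=(60, 50))

explained, loadings, scores = residual_pca(residuals)

# A nearly constant first loading vector means the first PC tracks the
# overall methylation level of each sample.
first = loadings[0]
print("variance explained by PC1:", round(float(explained[0]), 2))
print("PC1 loadings share one sign:", bool(np.all(first > 0) or np.all(first < 0)))

# Hypothetical DNMT1 expression that partly follows the methylation level.
dnmt1 = level + rng.normal(0.0, 0.5, size=60)
r = np.corrcoef(scores[:, 0], dnmt1)[0, 1]
print("|correlation| of PC1 score with DNMT1:", round(abs(float(r)), 2))
```

The sign of a principal component is arbitrary, which is why the check uses the absolute correlation.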

  12. Relations Between Epigenetic Measures - III Stem Cells & Cancer

  13. Issue: Cancer Stem Cells? • Hypothesis: cancers arise from stem cells rather than differentiated epithelial cells • How would you tell the difference between partially differentiated stem cells and de-differentiated epithelial cells? • Proposal: compare characteristic epigenetic modifications of stem cells with cancers • Epigenetic modifications are distinct • PRC2 (stem cells) vs methylation (cancer)

  14. Statistical Methodology • Test of association: 2 x 2 table • Fisher exact test, p ~ 10^-5

  15. Statistical Methodology • Test of association: 2 x 2 table • Fisher exact test, p ~ 10^-5 • Alternatives • t-test (predictor: PRC2) • Linear model (predictor: methylation, T – N)
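
For readers unfamiliar with the test, a two-sided Fisher exact test on a 2 x 2 table can be sketched with the standard library alone. The counts in the second table are invented for illustration (rows: PRC2 target or not; columns: methylated in tumor or not) and are not the data behind the p-value quoted on the slide.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    With the margins held fixed, the top-left count follows a
    hypergeometric distribution; the p-value sums the probabilities of
    all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )

# Sanity check on a small textbook-style table.
p_small = fisher_exact_two_sided(3, 1, 1, 3)

# Hypothetical counts: 100 PRC2 targets, 30 of them methylated;
# 900 non-targets, 40 of them methylated.
p_prc2 = fisher_exact_two_sided(30, 70, 40, 860)
print(f"small table p = {p_small:.4f}")
print(f"hypothetical PRC2 table p = {p_prc2:.2e}")
```

The exact test needs no large-sample approximation, which matters when one cell of the table is small.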

  16. PRC2 – Methylation Association

  17. Are CIMPs Stem Cell Clones? • Distinctive PRC2 sites appear preferentially methylated in CIMP tumors

  18. Correlations between epigenetic and expression measures – I Copy Number and Expression

  19. Copy Number and Expression • Large sections of DNA containing many genes are often copied or deleted • We think most control elements are copied or deleted also • If more (or fewer) copies of a gene then ceteris paribus there should be more (fewer) copies of RNA

  20. Integrative Studies of CGH & Gene Expression • Expect to see strong correlation between copy number and expression in data • Previous studies report weak effects • Average correlations range from 0.04 to 0.27 • NCI-60 study: average correlation 0.16

  21. Why Not? • H1: there really isn’t much effect – biology • Somehow the cells are compensating • In any case there shouldn’t be any effect on non-expressed genes • H2: we may not be able to measure the effect that is there – technical error • Probes may be insensitive/cross-hybridizing • Signal/noise too low even when probes are sensitive

  22. Eliminating Uninformative Genes • Genes which are silenced will not show effect of copy number variation • Mean signal is a rough proxy for silencing • Keep only genes with mean signal above 6.3 • Only genes with significant copy number variation (above measurement noise) will show effect • Select genes with SD of copy number > 0.5
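
The two filters can be sketched as below. The thresholds 6.3 and 0.5 come from the slide; the sketch assumes that silenced genes are the ones with low mean signal, so it keeps genes whose mean expression exceeds the cutoff. Gene names, values, and the function name `informative_genes` are hypothetical.

```python
from statistics import mean, stdev

def informative_genes(expr, copy_number, min_mean_expr=6.3, min_cn_sd=0.5):
    """Return genes that pass both filters.

    expr, copy_number : dicts mapping gene name -> list of per-sample values
    A gene is kept when its mean expression signal exceeds min_mean_expr
    (assumed not silenced) and the SD of its copy number exceeds min_cn_sd
    (variation assumed to be above measurement noise).
    """
    kept = []
    for gene, values in expr.items():
        expressed = mean(values) > min_mean_expr
        varies = stdev(copy_number[gene]) > min_cn_sd
        if expressed and varies:
            kept.append(gene)
    return kept

# Invented values for three hypothetical genes across four samples.
expr = {
    "GENE_A": [8.1, 7.9, 8.4, 7.6],   # expressed
    "GENE_B": [5.0, 5.2, 4.9, 5.1],   # low signal, treated as silenced
    "GENE_C": [9.0, 9.2, 8.8, 9.1],   # expressed
}
copy_number = {
    "GENE_A": [-0.8, 0.1, 0.9, 0.0],    # varies across samples
    "GENE_B": [-0.9, 0.2, 1.0, 0.1],    # varies, but the gene is silenced
    "GENE_C": [0.0, 0.1, -0.1, 0.05],   # essentially constant
}

selected = informative_genes(expr, copy_number)
print(selected)
```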

  23. Correlations of Selected Measures Black: All correlations Red: Reliably measured correlations

  24. Estimating True Correlations • If measurement noise of SD ~ 0.3 degrades expression measures, then the correlations of the measures will mostly be closer to 0 than the true correlations of the variables • Given a correlation and measured standard deviations, what are the most likely true standard deviations and true correlation?

  25. MLE of Noisy Correlations • Noise can be estimated from replicates • If N large can estimate • SD of originals can be estimated by ML • Given s and e, the MLE of correlation can be inferred • For NCI-60, median MLE correlation ~ 0.65
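
The transcript does not preserve the estimator itself. As a rough stand-in, the sketch below applies the classical correction for attenuation rather than the slide's maximum-likelihood procedure. It assumes independent additive noise, takes s to be the measured SD of a variable and e its noise SD, and all numbers in the example are invented.

```python
from math import sqrt

def true_sd(s, e):
    """SD of the underlying variable, given measured SD s and noise SD e.

    Under independent additive noise, s**2 = true_sd**2 + e**2.
    """
    if s <= e:
        raise ValueError("measured SD must exceed the noise SD")
    return sqrt(s * s - e * e)

def disattenuated_correlation(r_measured, s_x, e_x, s_y, e_y):
    """Classical correction for attenuation.

    Independent noise leaves the covariance unchanged but inflates both
    SDs, so the measured correlation is shrunk by the factor
    (t_x * t_y) / (s_x * s_y), where t denotes a true SD.
    """
    t_x, t_y = true_sd(s_x, e_x), true_sd(s_y, e_y)
    r = r_measured * (s_x * s_y) / (t_x * t_y)
    return max(-1.0, min(1.0, r))  # a correlation cannot leave [-1, 1]

# Invented example: both measures have SD 0.5, of which 0.3 is noise.
r_true = disattenuated_correlation(0.32, s_x=0.5, e_x=0.3, s_y=0.5, e_y=0.3)
print(round(r_true, 3))
```

When the noise SD approaches the measured SD the correction factor grows without bound, which is one reason to restrict the analysis to reliably measured genes first.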

  26. Correlations between epigenetic and expression measures – II Chromatin and Expression

  27. Do Epigenetic Marks Regulate Transcription? • Several studies finding only weak evidence by correlation analysis • Same technical issue: S/N ratio • Questions • Does methylation shut down most genes? • Which histone marks indicate active transcription?

  28. Methylation and Expression • HELP assay gives genome-wide sampling of methylation sites at 15K genes • Select genes with S/N > 2 in both measures • Correlations with gene expression values are bimodal

  29. Interpretation of Meth-Expr Corrs • MLE of negative mode ~ -0.8 • ~ 2/3 of genes under that hump • Unclear whether positive hump is real or an artifact of small sample size • Possible explanations: • True induction by methylation • Methylation of insulator • Irrelevant CpG site

  30. Acetylation and Expression • Histones often acetylated during expression • Histone 3 lysine 9 (H3K9) acetylation measured • Measures corrupted by noise • Blue: S/N > 2.5 • Red: S/N > 2 • Black: S/N > 1.5

  31. Biological Prediction • H3K9 acetylation → gene expression • Is this real? • Experimental test: find genes with high acetylation variance and little expression variance by microarray • Results (7 genes) • Confirm hypothesis • Implies: • Expression arrays are not sensitive
