A New Method to Study the Change of miRNA-mRNA Interactions Due to Environmental Exposures

A New Method to Study the Change of miRNA-mRNA Interactions Due to Environmental Exposures. Pei Wang Dept. of Genetics and Genomics Sciences 11/5/2018 BIRS- Oaxaca. Outline. The Motivation Application --- an environmental chemical study iJRF Joint Random Forest

Presentation Transcript


  1. A New Method to Study the Change of miRNA-mRNA Interactions Due to Environmental Exposures Pei Wang Dept. of Genetics and Genomics Sciences 11/5/2018 BIRS-Oaxaca

  2. Outline • The motivating application: an environmental chemical study • iJRF: Joint Random Forest; integration of data from existing databases; iJRF • Study the change of miRNA-mRNA interactions due to environmental exposures

  3. An Environmental Chemical Study --- Background • Exposure to environmental chemicals during early development may increase the risk of developing breast cancer later in life. • Three chemicals: diethyl phthalate (DEP), methyl paraben (MPB) and triclosan (TCS), all present in personal care products such as toothpaste and cosmetics. • Public health concerns about these chemicals are growing, given their widespread presence in the US population (Nhanes et al, 2009). • Studying the impact of environmental exposures on the transcriptome can help • to identify chemical-induced mechanisms • to quantify more accurately the risk of developing cancer.

  4. An Environmental Chemical Study (PI: J. Chen) • Groups with microRNA and mRNA profiling: Control (n=20), DEP (n=20), MPB (n=20), TCS (n=15). • Sprague-Dawley (SD) rats were exposed to three environmental chemicals during development (Gopalakrishnan et al., 2017). • Mammary tissues of these rats were then examined using both messenger RNA (mRNA) and microRNA (miRNA) profiling. • Number of mRNAs: N = 7,546; number of miRNAs: M = 272. • Number of samples ≤ 20 for each subgroup.

  5. An Environmental Chemical Study --- Goal • There is increasing evidence for the involvement of miRNAs in mammary gland development (Lee et al., 2013, 2015). • Little is known about the effect of environmental exposure on miRNAs. • Goal: construct microRNA-mRNA interaction networks to study the biological and genomic mechanisms mediating the effects of chemical exposures.

  6. Challenges and Solutions • Challenge: extremely small sample sizes (“large p, small n” paradigm). • Solutions: borrow information across different data sets to better estimate the common structure across networks; integrate information from existing databases when learning co-expression networks. • iJRF: simultaneously estimate miRNA-mRNA interactions for different exposure data sets and integrate information from existing databases. [Diagram: joint learning across the Control, DEP exposure, MPB exposure and TCS exposure data sets, with information from an existing database.]

  7. Existing Algorithms • Joint estimation of multiple networks: Guo et al. (2011) and Danaher et al. (2014) proposed likelihood-based methods for joint estimation of multiple related Gaussian graphical models. Limitation: the performance of these methods depends heavily on the multivariate Gaussian assumption. • Integration of data from existing databases: Bayesian networks (Bernard and Hartemink, 2005; Werhli and Husmeier, 2007; Zhu et al., 2003, 2008); sparse structural equation models (Cai et al, 2012; Logsdon and Mezey, 2010); consensus techniques (Shojaie et al., 2014; Yip et al., 2010). Limitation of many existing algorithms: linearity and normality assumptions.

  8. Our recent work • iRafNet: Integrative Random Forest (Petralia et al, 2015, Bioinformatics) • Introduces a sampling scheme within random forest to integrate existing information when inferring biological networks • Outperformed the original random forest algorithm GENIE3 (Huynh-Thu et al, 2010) for network construction on DREAM challenge data. • JRF: Joint Random Forest (Petralia et al, 2016, Journal of Proteome Research) • Efficiently estimates multiple related networks simultaneously (borrows information across different data types) • Non-parametric ensemble algorithm that performs well with small sample sizes • Shown to outperform Joint Graphical Lasso (Danaher et al, 2014) when estimating multiple related networks

  9. iJRF: JRF + iRafNet • Simultaneously estimate miRNA-mRNA interactions for different exposure data sets and integrate information from existing databases.

  10. iJRF

  11. Integrative Joint Random Forest (iJRF) 1. iRafNet step • For each exposure data set g (Control, DEP, MPB, TCS), model mRNA k as a function of the miRNAs via random forest. • For the tree ensembles corresponding to different treatments, the same set of predictors (miRNAs) is proposed for the splitting rule of each node. • Predictors are selected by prioritizing miRNAs whose sequences are similar to that of the target mRNA: each miRNA is sampled with a probability determined by a score derived from TargetScan (Agarwal et al., 2015). • For each interaction j→k, TargetScan provides a context score. Context scores are non-positive, with more negative values corresponding to more favorable sites. • The derived scores take values in the interval [1,2]; therefore the probability of sampling the miRNA with the most similar sequence to that of the target mRNA is twice the probability of sampling the least similar miRNA (context score equal to zero).
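The score transformation itself is not legible in the slide text. As a minimal sketch, assume a linear rescaling of the non-positive context scores to weights in [1, 2] (zero maps to 1, the most negative score maps to 2), with miRNAs then proposed with probability proportional to their weight. The function names and example scores below are hypothetical and are not taken from the iJRF package.

```python
import random

def context_to_weight(scores):
    """Map non-positive context scores to weights in [1, 2].

    Assumption: linear rescaling by the most negative score.
    """
    most_negative = min(scores)
    if most_negative >= 0:
        return [1.0 for _ in scores]          # no informative site: uniform weights
    return [1.0 + s / most_negative for s in scores]

def sampling_probabilities(weights):
    """Normalize weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def propose_predictors(names, weights, m, rng):
    """Draw m distinct predictors; each draw is proportional to the remaining weights."""
    pool = list(zip(names, weights))
    chosen = []
    for _ in range(min(m, len(pool))):
        idx = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(idx)[0])
    return chosen

# Hypothetical context scores for three miRNAs targeting one mRNA.
context_scores = [0.0, -0.2, -0.4]
weights = context_to_weight(context_scores)
probs = sampling_probabilities(weights)
proposed = propose_predictors(["miR_a", "miR_b", "miR_c"], weights, 2, random.Random(0))
```

Under this assumed rescaling, the miRNA with the most favorable site is twice as likely to be proposed on a single draw as a miRNA with a context score of zero, which matches the property stated on the slide.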

  12. Integrative Joint Random Forest (iJRF) 2. JRF step • For each exposure data set g (Control, DEP, MPB, TCS), model mRNA k as a function of the miRNAs via random forest. • Among the selected predictors, the optimal splitting variable of a node is the predictor maximizing the weighted sum of decreases in node impurity across the different trees, i.e., $j^{*} = \arg\max_{j} \sum_{g} w_{g}\, \Delta_{j}^{(g)}$, with $\Delta_{j}^{(g)}$ being the decrease in node impurity observed in the gth tree after splitting the node based on the jth predictor and $w_{g}$ the weight given to the gth data set. • Key idea: borrow information across different data sets by forcing the class-specific tree ensembles to use the same genes for the splitting rules.
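The joint splitting rule can be sketched as follows. This is an illustration only: it uses variance as the impurity measure and equal weights across data sets, both of which are assumptions, and the helper names and toy data are hypothetical.

```python
def variance(values):
    """Population variance, used here as the node impurity."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def best_impurity_decrease(x, y):
    """Largest decrease in impurity over all split thresholds on predictor x."""
    n = len(y)
    parent = variance(y)
    best = 0.0
    for threshold in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        child = (len(left) * variance(left) + len(right) * variance(right)) / n
        best = max(best, parent - child)
    return best

def joint_split_variable(datasets, candidates, weights=None):
    """Pick the candidate with the largest weighted sum of impurity decreases.

    datasets: list of (X, y) pairs, where X maps predictor name -> list of values.
    """
    if weights is None:
        weights = [1.0] * len(datasets)   # assumption: equal weights
    totals = {
        j: sum(w * best_impurity_decrease(X[j], y)
               for w, (X, y) in zip(weights, datasets))
        for j in candidates
    }
    return max(totals, key=totals.get), totals

# Toy data: miR_b is the best splitter in the first data set alone,
# but miR_a wins once both data sets are considered jointly.
set_1 = ({"miR_a": [1, 3, 2, 4], "miR_b": [1, 2, 3, 4]}, [0, 0, 1, 1])
set_2 = ({"miR_a": [1, 2, 3, 4], "miR_b": [1, 2, 1, 2]}, [0, 0, 1, 1])
chosen, totals = joint_split_variable([set_1, set_2], ["miR_a", "miR_b"])
```

In the toy data, a forest fitted to the first data set alone would split on miR_b, while the joint criterion selects miR_a because it reduces impurity in both data sets.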

  13. Integrative Joint Random Forest (iJRF) 3. Rank interactions • For each exposure data set g (Control, DEP, MPB, TCS), model mRNA k as a function of the miRNAs via random forest. • Rank the interactions. • Derive a threshold using permutation techniques and an FDR cut-off.
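One way to turn permuted importance scores into a threshold is sketched below. The FDR estimator used here (average number of null scores above the threshold divided by the number of observed scores above it) is an assumption, since the slide does not spell out the procedure; the function name and numbers are hypothetical.

```python
def permutation_threshold(observed, null_runs, fdr_cutoff):
    """Smallest importance threshold whose estimated FDR is <= fdr_cutoff.

    observed:  importance scores computed on the real data.
    null_runs: list of score lists, one per permutation of the response.
    Estimated FDR(t) = (mean count of null scores >= t) / (count of observed >= t).
    Returns None if no threshold meets the cut-off.
    """
    n_perm = len(null_runs)
    for t in sorted(set(observed)):
        n_obs = sum(1 for s in observed if s >= t)
        n_null = sum(1 for run in null_runs for s in run if s >= t) / n_perm
        if n_null / n_obs <= fdr_cutoff:
            return t
    return None

# Hypothetical importance scores for five candidate interactions.
observed = [0.9, 0.8, 0.1, 0.05, 0.02]
null_runs = [
    [0.12, 0.03, 0.01, 0.04, 0.02],
    [0.06, 0.02, 0.03, 0.01, 0.05],
]
threshold = permutation_threshold(observed, null_runs, fdr_cutoff=0.25)
selected = [s for s in observed if s >= threshold]
```

With these toy numbers the three highest-scoring interactions are retained. A real analysis would use many more permutations and a far stricter cut-off, such as the 0.001 reported later in the talk.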

  14. iJRF: JRF + iRafNet • Simultaneously estimate miRNA-mRNA interactions for different exposure data sets and integrate information from existing databases. • Leverage sequence database information to give more advantage to miRNA-mRNA pairs with good sequence similarities. • Borrow information across different data sets by forcing the class-specific tree ensembles to use the same genes for the splitting rules. • Able to detect interactions common across different classes with better power, and to detect interactions specific to individual classes with better FDR. • No tuning parameter in the model to control similarities between different classes.

  15. Study the change of miRNA-mRNA Interactions due to environmental exposures

  16. Data and conventional analyses • 7,546 messenger RNAs and 272 miRNAs. • Sample size: 20 each for the control, DEP and MPB treatment groups, and 15 for TCS. • Univariate test: did not reveal any significant change in miRNA levels between control and chemical-exposed samples. • Correlation test to detect miRNA-mRNA interactions: using FDR = 0.001, the correlation test detects very few edges and fails to reveal any informative hub structure or pathway-enriched network module.
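The slide does not say how the FDR was controlled in the correlation test. As an illustration, assuming the Benjamini-Hochberg step-up rule is applied to the p-values of the pairwise correlation tests, edge selection could look like this (the function name and p-values are hypothetical):

```python
def benjamini_hochberg(p_values, fdr):
    """Indices of the hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    last = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up bound.
        if p_values[i] <= rank * fdr / m:
            last = rank
    return sorted(order[:last])

# Hypothetical p-values for four miRNA-mRNA correlation tests.
p_values = [0.0001, 0.04, 0.0003, 0.5]
edges = benjamini_hochberg(p_values, fdr=0.001)
```

With roughly 7,546 × 272 candidate pairs and at most 20 samples per group, very few correlations can pass such a bound, which is consistent with the sparse result reported on the slide.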

  17. iJRF results • iJRF parameters: • Number of trees: 1,000. • sqrt(272) miRNAs (predictors) were sampled at each node (Breiman, 2001). • The four networks were estimated using iJRF, and mRNA-miRNA interactions were derived using permutation techniques with an FDR cut-off of 0.001. • Results: • All three chemicals result in loss of interactions compared to control. • DEP was the chemical exposure resulting in the most dramatic loss of interactions compared to control.

  18. Top miRNAs in Control-Net, responsible for more than 85% of edges • More than 65% of Control-specific interactions were not present in any of the chemical networks.

  19. Top miRNAs in Control-Net, responsible for more than 85% of edges • Remarkable loss in connectivity (> 90%) compared to Control.
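The connectivity loss of a hub miRNA can be computed from the edge lists of two networks. The sketch below assumes each network is a collection of (miRNA, mRNA) pairs; the function name and gene labels are hypothetical and the numbers are not the study's data.

```python
def connectivity_loss(control_edges, exposed_edges, mirna):
    """Fraction of a miRNA's control-network edges that are absent from the exposed network."""
    control = {edge for edge in control_edges if edge[0] == mirna}
    if not control:
        return 0.0
    lost = control - set(exposed_edges)
    return len(lost) / len(control)

# Toy networks: the hub keeps only one of its ten control edges after exposure.
control_net = [("miR-375", f"gene{i}") for i in range(10)]
dep_net = [("miR-375", "gene0"), ("miR-200a", "gene3")]
loss = connectivity_loss(control_net, dep_net, "miR-375")
```

In this toy example the hub loses 90% of its control edges.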

  20. miR-200a and miR-375 • Consider only mRNAs connected to miR-200a and miR-375 in control but not in DEP • Derive enriched categories for this set
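The slide does not name the enrichment test. A common choice, assumed here, is the one-sided hypergeometric test, which asks how likely it is to see at least the observed overlap between the selected mRNAs and an annotation category by chance. The function name and counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(universe, category, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `category` belong to the annotation category."""
    upper = min(category, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(category, k) * comb(universe - category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy example: 20 genes in total, 5 in the category, 4 selected, 3 of them in the category.
p_value = hypergeom_enrichment_p(universe=20, category=5, selected=4, overlap=3)
```

In practice the p-values from all tested categories would also need a multiple-testing correction.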

  21. miR-200a and miR-375 • Various studies have shown that chemical exposure can alter mammary gland development (Schwarzman et al., 2015; Manservisi et al., 2015; Mandrup et al., 2015). • As shown, enriched pathways include genes such as ERBB2 (HER2), FOXA1 and SFRP1, which play a crucial role in breast cancer. • Hypothesis: DEP exposure might alter the regulatory mechanism of miR-375 and miR-200a and affect mammary gland development.

  22. Validation • Hypothesis: DEP exposure might alter the regulatory mechanism of miR-375 and miR-200a and affect mammary gland development (DEP exposure → miRNA → mRNA). • Validation experiments and results: • Cell-line experiments to test the effect of DEP exposure on miR-375 and miR-200a. • In vitro experiments on the human breast cancer cell line MCF-7. • Expression levels of both miR-375 and miR-200a are significantly different between control and DEP-exposed cells.

  23. Summary • We observe loss of connectivity in microRNA-mRNA networks due to exposure to DEP/MPB/TCS. • miR-200a and miR-375 lost more than 90% of their connectivity in the microRNA-mRNA network from samples exposed to DEP compared to that of the control samples. • mRNAs connected to miR-375 and miR-200a were enriched for “Mammary Gland Development” and “Gland Morphogenesis”, indicating their potential involvement in mammary gland development mechanisms. • We demonstrated that DEP exposure affects miR-375 and miR-200a in human breast cancer cell lines. • Future research is necessary to validate the effect of chemical exposure on miRNA-mRNA interactions.

  24. Summary • iJRF can be used in different applications. A few examples: • eQTL analysis might be performed for different tissues simultaneously while integrating information from different databases; • co-expression networks might be estimated for proteomics and gene expression data simultaneously while integrating information from protein-protein interaction databases. • iJRF R package on CRAN: https://cran.r-project.org/web/packages/iJRF/index.html References • F. Petralia, V. N. Aushev, K. Gopalakrishnan, M. Kappil, N. W. Khin, J. Chen, S. L. Teitelbaum, P. Wang, “iJRF to Study the Effect of Environmental Exposures on miRNA-mRNA interactions in Mammary Transcript”, Bioinformatics 2017, 33(14), 199-207. • F. Petralia, W. Song, Z. Tu, P. Wang, “New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer”, Journal of Proteome Research 2016 Mar 4;15(3):743-54, ACS Editors’ Choice Article Selection. • F. Petralia, P. Wang, J. Yang, Z. Tu, “A General Framework of Integrative Random Forest for Gene Regulatory Network Inference”, Bioinformatics 2015 Jun 15;31(12):i197-205.

  25. Acknowledgement • Dr. Francesca Petralia • Dr. Jia Chen and Dr. Susan Teitelbaum’s team • Aushev V, • Gopalakrishnan K, • Kappil M, • Khin NW, • Chen J, • Teitelbaum S • Dr. Zhidong Tu • NIH Grants.
