This presentation covers statistical methods for analyzing dynamic data trajectories in metabolomics and toxicogenomics. It treats univariate time series analysis with multiple testing correction, multivariate approaches (Principal Component Analysis, ANOVA, and ANOVA Simultaneous Component Analysis), the use of prior knowledge to select metabolites, and validation with permutation tests. The running example is NMR spectroscopy of urine from rats treated with bromobenzene, sampled at 6, 24 and 48 hours.
Dealing with dynamic data trajectories
Huub Hoefsloot
Biosystems Data Analysis group, University of Amsterdam
www.bdagroup.nl
Outline
• Univariate approach: measuring a single metabolite
• Multivariate example: Principal Component Analysis, ANOVA, Simultaneous Component Analysis
• Using knowledge: looking only at interesting metabolites
• Validation: permutations
Univariate time series analysis • Do two groups of graphs differ?
Modeling the time behavior
• Model: Y = at
• Regression coefficients of the two groups compared with a T-test
• P = 1.8605e-004
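A minimal sketch of this idea, assuming each subject's trajectory is fitted separately with Y = at (no intercept) and the fitted slopes of the two groups are then compared with a two-sample t-test. The time points, group sizes and simulated data are invented for illustration and are not the data behind the P-value on the slide; `slope_through_origin` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats


def slope_through_origin(t, y):
    """Least-squares estimate of a in the model y = a * t (no intercept)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(t @ y / (t @ t))


rng = np.random.default_rng(0)
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical time points

# Simulated trajectories, one row per subject (illustration only).
group1 = 1.0 * t + rng.normal(scale=0.5, size=(8, t.size))
group2 = 1.5 * t + rng.normal(scale=0.5, size=(8, t.size))

# One regression coefficient per subject.
a1 = np.array([slope_through_origin(t, y) for y in group1])
a2 = np.array([slope_through_origin(t, y) for y in group2])

# Two-sample t-test on the regression coefficients.
t_stat, p_value = stats.ttest_ind(a1, a2)
print(f"mean slope group 1: {a1.mean():.3f}, group 2: {a2.mean():.3f}, P = {p_value:.3g}")
```

Summarizing each trajectory by one coefficient turns the question "do two groups of graphs differ?" into an ordinary two-group comparison.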
Other systems • Fitted values in differential equations • Parameters in state space models • Clinically derived parameters • Anything goes
Multiple testing
• Use a multiple testing correction
• Univariate: P = 1.8605e-004
• Bonferroni correction, 1000 variables: P = 1.8605e-001
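The arithmetic on this slide can be checked with a short sketch. `bonferroni` is a hypothetical helper written for this example; it multiplies each P-value by the number of tests and caps the result at 1.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni adjustment: multiply each P-value by the number of tests, cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# The univariate P-value from the slide, corrected for 1000 variables.
p_adjusted = bonferroni([1.8605e-4], n_tests=1000)[0]
print(f"{p_adjusted:.4e}")  # 1.8605e-01
```

A result that looks convincing for a single metabolite is no longer significant at the usual 0.05 level once 1000 variables are tested.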
Metabolomics example: setup
• Rats are given bromobenzene (BB), which affects the liver
• NMR spectroscopy of urine
• Visual inspection of the livers
• Time: 6, 24 and 48 hours
• Groups: 3 doses of BB, vehicle group, control group
[Figure: NMR spectra of rat urine at 6, 24 and 48 hours; horizontal axis: chemical shift (ppm), from 10 to 0]
Jansen et al; Bioinformatics 21 (2005) 3043-3048
Toxicogenomics example: resulting data
• Metabolomics NMR data matrix: 45 x 330
• 45 = 5 (treatments) x 3 (time points) x 3 (rats)
• highly structured data
• few samples, many measurements
Toxicogenomics example: PCA
[Figure: layout of the data matrix with J columns; rows are grouped by dose d and time t, with rats 1 to 3 (i = 1, ..., 3) within each time and dose combination]
Toxicogenomics example: PCA
[Figure: PCA score plot; PC 1 (44.6 % of variation explained) against PC 2 (26.3 % of variation explained); legend: control, vehicle, low, medium and high dose, each at 6, 24 and 48 hours]
Toxicogenomics example: PCA (dose)
[Figure: the same PCA score plot, PC 1 (44.6108 % of variation explained) against PC 2 (26.3657 % of variation explained), annotated with a question mark]
Toxicogenomics example: PCA (time)
[Figure: the same PCA score plot, PC 1 (44.6108 % of variation explained) against PC 2 (26.3657 % of variation explained), with two annotated groups: "6 hrs" and "Low+med 24+48"]
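For readers who want to reproduce this kind of score plot, here is a minimal PCA sketch via the singular value decomposition of the column-centered data. The random matrix is only a stand-in with the 45 x 330 shape taken from the data slide (an assumption about the real layout), and `pca_scores` is a hypothetical helper name.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA of column-centered X via SVD; returns scores and % of variation explained."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = 100.0 * s**2 / np.sum(s**2)
    return scores, explained[:n_components]


rng = np.random.default_rng(1)
X = rng.normal(size=(45, 330))  # random stand-in for the NMR data matrix

scores, explained = pca_scores(X)
print(scores.shape, np.round(explained, 1))
```

Plotting the two score columns against each other, colored by dose or by time, gives plots of the type shown on these slides. PCA itself does not use the design, which is why the dose and time structure is hard to read from the plot.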
Toxicogenomics example: ANOVA model
• Indices: i = animal, j = chemical shift, t = time, d = dose group
• ANOVA model: [Equation]
• Under usual constraints: [Equation]
• Collect terms in matrices (ITD x J)
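The equations on this slide were images and are not part of the transcript. A plausible reconstruction, assuming the standard ASCA formulation for a balanced design and the five matrices named on the "in matrices" slide, is given here. The Greek symbols and the index order are notation chosen for this sketch, not taken from the slide.

```latex
% ANOVA model for chemical shift j, time t, dose group d, animal i (assumed form)
x_{tdij} = \mu_j + \alpha_{tj} + \beta_{dj} + (\alpha\beta)_{tdj} + (\alpha\beta\gamma)_{tdij}

% Usual sum-to-zero constraints
\sum_{t} \alpha_{tj} = 0, \qquad
\sum_{d} \beta_{dj} = 0, \qquad
\sum_{t} (\alpha\beta)_{tdj} = \sum_{d} (\alpha\beta)_{tdj} = 0, \qquad
\sum_{i} (\alpha\beta\gamma)_{tdij} = 0

% Collected in matrices of size (ITD \times J)
\mathbf{X} = \mathbf{X}_{M} + \mathbf{X}_{A} + \mathbf{X}_{B} + \mathbf{X}_{AB} + \mathbf{X}_{ABC}
```

Here the mean term corresponds to M, time to A, dose to B, their interaction to AB, and the animal-specific deviations to ABC.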
Other ANOVAs
• Expressed with respect to the control: subtract the mean of the control everywhere
• Paired design: subtract the mean of every person from the measurements on that person
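Both variants amount to a different centering of the data before the analysis. A minimal sketch, with an invented 6 x 4 data matrix and hypothetical `is_control` and `subject` label vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 4))  # 6 samples x 4 variables, invented for illustration
is_control = np.array([True, True, False, False, False, False])
subject = np.array([0, 1, 2, 0, 1, 2])  # hypothetical subject label per sample

# Expressed with respect to the control: subtract the control mean everywhere.
X_vs_control = X - X[is_control].mean(axis=0)

# Paired design: subtract each subject's own mean from that subject's samples.
X_paired = X.copy()
for s in np.unique(subject):
    rows = subject == s
    X_paired[rows] -= X[rows].mean(axis=0)

print(np.round(X_vs_control[is_control].mean(axis=0), 12))
```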
Toxicogenomics example: in matrices
• A = time, B = dose, C = individual
• Collect terms in matrices (ITD x J): [Equation]
• Column spaces of X_M, X_A, X_B, X_AB and X_ABC are mutually orthogonal
• Consequences: 1. [Equation] 2. Easy algorithms
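The orthogonality claim can be checked numerically. The sketch below assumes a balanced design with the sizes from the data slide (3 time points, 5 treatments, 3 rats) and uses random numbers in place of the real spectra. It builds the five effect matrices from plain averages, which is one reading of "easy algorithms", and then verifies that all cross-products vanish. That the sums of squares add up follows from this orthogonality; whether that is the slide's first consequence is an assumption.

```python
import numpy as np

T, D, I, J = 3, 5, 3, 330  # time points, dose groups, rats per cell, variables
rng = np.random.default_rng(3)
X = rng.normal(size=(T, D, I, J))  # random stand-in for the real data

mean = X.mean(axis=(0, 1, 2), keepdims=True)    # overall mean per variable
time_mean = X.mean(axis=(1, 2), keepdims=True)  # mean per time point
dose_mean = X.mean(axis=(0, 2), keepdims=True)  # mean per dose group
cell_mean = X.mean(axis=2, keepdims=True)       # mean per (time, dose) cell

parts = {
    "M": np.broadcast_to(mean, X.shape),
    "A": np.broadcast_to(time_mean - mean, X.shape),
    "B": np.broadcast_to(dose_mean - mean, X.shape),
    "AB": np.broadcast_to(cell_mean - time_mean - dose_mean + mean, X.shape),
    "ABC": X - cell_mean,
}
# Unfold every term to an (I*T*D) x J matrix.
mats = {name: part.reshape(-1, J) for name, part in parts.items()}
X2 = X.reshape(-1, J)

# Largest absolute cross-product between any two effect matrices.
names = list(mats)
max_cross = max(
    np.abs(mats[a].T @ mats[b]).max()
    for k, a in enumerate(names)
    for b in names[k + 1:]
)

# Sums of squares per term, and shares of the variation around the mean.
ss = {name: float(np.sum(m**2)) for name, m in mats.items()}
centered_total = sum(v for name, v in ss.items() if name != "M")
percent = {name: 100.0 * v / centered_total for name, v in ss.items() if name != "M"}
print(f"largest cross-product: {max_cross:.2e}")
print({name: round(p, 1) for name, p in percent.items()})
```

Percentages of this kind are what a statement like "Data = 100% = ..." on a results slide summarizes; the numbers printed here come from random data and have no relation to the values in the presentation.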
Toxicogenomics example: ASCA
[Figure: the data split into three parts: time, treatment & interaction, and individual]
Toxicogenomics example: results
• Data = 100% = 25% + 57% + 18%
Toxicogenomics example: results
[Figure: time trajectories (71%); scores against time (6, 24 and 48 hours), one line per "dose" level: control, vehicle, low, medium, high]
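Score trajectories of this kind can be sketched as a PCA of the treatment-plus-interaction part of the data. The code assumes a balanced design with the sizes from the data slide and random stand-in data, and it takes the treatment-plus-interaction effect as the cell means minus the time means (X_B + X_AB). It illustrates the ASCA idea and is not the authors' implementation.

```python
import numpy as np

T, D, I, J = 3, 5, 3, 330  # time points, dose groups, rats per cell, variables
rng = np.random.default_rng(4)
X = rng.normal(size=(T, D, I, J))  # random stand-in for the real data

time_mean = X.mean(axis=(1, 2), keepdims=True)  # mean per time point
cell_mean = X.mean(axis=2, keepdims=True)       # mean per (time, dose) cell

# Treatment plus interaction effect, unfolded to (I*T*D) x J.
X_treat = np.broadcast_to(cell_mean - time_mean, X.shape).reshape(-1, J)

# Component analysis of the effect matrix via SVD.
U, s, Vt = np.linalg.svd(X_treat, full_matrices=False)
all_scores = (U[:, 0] * s[0]).reshape(T, D, I)
trajectories = all_scores[:, :, 0]  # rats within a cell share the same score
explained = 100.0 * s[0] ** 2 / np.sum(s**2)
loadings = Vt[0]  # one weight per variable (chemical shift)

print(trajectories.shape, round(float(explained), 1))
```

Each column of `trajectories` is one dose level followed over the three time points, which is the quantity plotted against time on this slide. The entries of `loadings` with large absolute value point to the variables that drive the trajectory.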
Toxicogenomics example: results
• "Dose" level
• Metabolites: Acetic acid, TMAO, …
Only PCA?
• PLS
• PLS-DA
• N-way PLS
• N-way PLS-DA
• PARAFAC
Anything you would do on X is possible
Using Knowledge
• Considering only metabolites that follow a predefined model
• Looking for profiles similar to that of the product
Peters et al. Trend analysis of time-series data: A novel method for untargeted metabolite discovery, Analytica Chimica Acta 663 (2010) 98–104
Rubingh et al. Analyzing Longitudinal Microbial Metabolomics Data, Journal of Proteome Research 8 (2009) 4319–4327
Validation
• I like permutation tests
Vis et al. Statistical validation of megavariate effects in ASCA, BMC Bioinformatics 8 (2007) 322
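A minimal sketch of a permutation test for a multivariate group effect, assuming a single factor and using the between-group sum of squares over all variables as the test statistic. The data, group sizes and planted effect are invented, and `effect_ss` and `permutation_p_value` are hypothetical helper names. A structured design with time and dose needs a permutation scheme that respects that structure, which this sketch does not attempt.

```python
import numpy as np


def effect_ss(X, labels):
    """Sum of squares of the group means around the overall mean, over all variables."""
    grand = X.mean(axis=0)
    ss = 0.0
    for g in np.unique(labels):
        rows = labels == g
        ss += rows.sum() * np.sum((X[rows].mean(axis=0) - grand) ** 2)
    return float(ss)


def permutation_p_value(X, labels, n_perm=999, seed=0):
    """Share of label permutations with an effect at least as large as observed."""
    rng = np.random.default_rng(seed)
    observed = effect_ss(X, labels)
    count = sum(
        effect_ss(X, rng.permutation(labels)) >= observed for _ in range(n_perm)
    )
    return (count + 1) / (n_perm + 1)


rng = np.random.default_rng(5)
labels = np.repeat(np.arange(3), 6)  # 3 hypothetical groups of 6 samples
X = rng.normal(size=(18, 50))        # invented data, 50 variables
X[labels == 2, :10] += 2.0           # planted effect in group 2

p = permutation_p_value(X, labels)
print(f"permutation P-value: {p:.3f}")
```

Because the reference distribution is built from the data themselves, no distributional assumption is needed, which is attractive when there are few samples and many variables.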