Enhancing Multivariate Data Analysis Methods for Height and Structural Variance Studies
This project by Roger Woods, M.D., explores multivariate data reduction and model selection techniques and their adaptation to non-Euclidean manifolds. It examines the heritability of height and of various anatomical dimensions, drawing on statistical tools such as multivariate regression, PCA, and ICA. The methods are meant to adapt to applications spanning fetal alcohol studies, microcephaly, and schizophrenia, with an emphasis on efficient model selection and user-friendly interfaces. The findings aim to connect global genetic influences on variance with local environmental sources, improving our understanding of complex traits.
Presentation Transcript
Project 3: Data Interpretation • Roger Woods, M.D.
New Aims • Tools for multivariate data reduction and model selection • Adaptation of multivariate methods to non-Euclidean manifolds • Interoperability, intuitiveness, efficiency and user friendliness of statistical models
Height: A “Simple” Problem • Heritability: 80% (age and sex adjusted)
Multivariate Height • Skull height • 7 Cervical vertebrae • 12 Thoracic vertebrae • 5 Lumbar vertebrae • Pelvic height • Upper leg length • Lower leg length • Infinite variations
Brain Studies: The Height Problem in 2D, 3D or More • Heritabilities up to 90%
Global Influences • Microcephaly
Regional Influences • Cerebellar Hypoplasia • Callosal Agenesis
Multivariate Data Reduction and Model Selection • Multivariate regression • Partial least squares and variants • ICA/PCA and variants • Model averaging • Linear discriminant analysis
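As an illustration of the simplest data-reduction method on this list, the following is a minimal PCA sketch using only the Python standard library. It estimates the first principal component of a small multivariate data set by power iteration on the sample covariance matrix. The function names (`covariance_matrix`, `first_principal_component`) and the fixed iteration count are assumptions made for this example and do not come from the project's software.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of observations given as equal-length rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(rows, iterations=200):
    """Return (unit direction, fraction of total variance explained).

    Power iteration: repeatedly multiply a start vector by the covariance
    matrix and renormalize. The start vector must not be orthogonal to the
    leading eigenvector; a uniform start is an assumption that can fail for
    some data sets.
    """
    cov = covariance_matrix(rows)
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance along v: nothing left to iterate on
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance along the recovered direction.
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance
```

In the multivariate height setting, each row could hold one subject's segment measurements (skull height, vertebral heights, leg lengths), and the first component would describe the dominant pattern of joint variation across segments.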
Mixed ICA/PCA • Models Subgaussian, Supergaussian and Gaussian Sources • Model Selection Using Leave-One-Out Cross Validation (Akaike Information Criterion is not valid) • Implemented as C code with interfaces in Matlab and in R • User-specified Known Sources • Parallelization High Priority
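The leave-one-out idea on this slide can be sketched in a much simpler setting than the full mixed ICA/PCA model. The example below is a hedged illustration, not the project's C implementation: it chooses between a Gaussian and a super-Gaussian (Laplace) density for a single one-dimensional source by summing held-out log-likelihoods. The function names, the choice of Laplace as the super-Gaussian candidate, and the omission of a sub-Gaussian candidate are all assumptions made to keep the sketch short.

```python
import math
import statistics


def gaussian_logpdf(x, train):
    """Log-density of x under a Gaussian fitted to train by maximum likelihood."""
    mu = sum(train) / len(train)
    var = sum((t - mu) ** 2 for t in train) / len(train)
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def laplace_logpdf(x, train):
    """Log-density of x under a Laplace fitted to train by maximum likelihood."""
    med = statistics.median(train)
    scale = sum(abs(t - med) for t in train) / len(train)
    return -math.log(2.0 * scale) - abs(x - med) / scale


def loo_score(data, logpdf):
    """Sum of held-out log-likelihoods: refit on n-1 points, score the one left out."""
    total = 0.0
    for i, x in enumerate(data):
        train = data[:i] + data[i + 1:]
        total += logpdf(x, train)
    return total


def select_source_model(data):
    """Return (name of best candidate, dict of leave-one-out scores)."""
    candidates = {"gaussian": gaussian_logpdf, "laplace": laplace_logpdf}
    scores = {name: loo_score(data, f) for name, f in candidates.items()}
    return max(scores, key=scores.get), scores
```

Because every point is scored by a model that never saw it, the comparison does not rely on a parameter-count penalty of the kind used by the Akaike Information Criterion. The cost is one refit per observation, which is one reason parallelization matters for larger models.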
Adaptation of Multivariate Methods to non-Euclidean Manifolds • Model Selection on Shape Manifolds • All multivariate methods in Aim 1 • Diffusion Imaging • Selecting correct model versus model best supported by the data
DTI Maximum Likelihood Analyses • Gaussian model recovers orientation of simulated primary diffusion axis better than Rician model, even when data are generated as Rician • The Cholesky-based approach for assuring positive-definite tensors often fails to converge to the true minimum
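For readers unfamiliar with the Cholesky-based approach named above, here is a minimal sketch of the parameterization itself. The diffusion tensor is written as D = L Lᵀ with L lower triangular, so any six real parameters yield a symmetric positive-semidefinite tensor (positive definite when the diagonal entries of L are nonzero). The predicted signal uses the standard single-tensor model S = S0 · exp(−b · gᵀ D g). The function names and parameter ordering are assumptions for this example, and the sketch covers only the parameterization, not the optimizer whose convergence the slide discusses.

```python
import math


def tensor_from_cholesky(params):
    """Build a 3x3 tensor D = L * L^T from six lower-triangular entries.

    params is (l11, l21, l22, l31, l32, l33); this ordering is an assumption.
    """
    l11, l21, l22, l31, l32, l33 = params
    lower = [
        [l11, 0.0, 0.0],
        [l21, l22, 0.0],
        [l31, l32, l33],
    ]
    return [
        [sum(lower[i][k] * lower[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def quadratic_form(tensor, g):
    """Apparent diffusivity g^T D g along gradient direction g."""
    return sum(g[i] * tensor[i][j] * g[j] for i in range(3) for j in range(3))


def predicted_signal(s0, b_value, g, tensor):
    """Noise-free single-tensor signal S = S0 * exp(-b * g^T D g)."""
    return s0 * math.exp(-b_value * quadratic_form(tensor, g))
```

An unconstrained optimizer can then search over the six parameters directly, since every candidate maps to a physically admissible tensor. Note that the mapping is not one-to-one: flipping the sign of a column of L leaves D unchanged.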
Interoperability, Intuitiveness, Efficiency and User Friendliness • Support volumetric data, tracts, vectors, tensors • GUIs for model specification • Multicore processor support • Spatial provenance • Touchstone pipelines