Identifying Robust Activation in fMRI

Identifying Robust Activation in fMRI • Thomas Nichols, Ph.D. • Assistant Professor, Department of Biostatistics, University of Michigan • http://www.sph.umich.edu/~nichols • FBIRN, March 13, 2006

Are Robust Activations a Problem? • Robust activation • Proposed definition: An effect that is detected regardless of the specific model or methods used • Shouldn’t we be worried about non-robust activations?

Robustness Overview • 1 voxel | Univariate • Validity • Sensitivity • Images | Mass-univariate • Validity for some multiple Type I error metric • Sensitivity, depending on metric

Univariate Robustness & Test Validity • Parametric, Two-sample t-test • Famously robust • False positive rate close to the nominal α even... • Under non-normality, heterogeneous variance • Most robust with balanced data • Can have problems with outliers • False positive rate may be < α • Impact for imaging • Simple block designs probably very safe

Univariate Robustness & Test Validity • Non-Parametric tests • “Exact” by construction • False positive rate precisely α • NB: Due to discreteness, your α may not be available • Not a generic modeling framework • No “permutation GLM” • Autocorrelation challenging • Impact for imaging • Within subject, must account for autocorrelation • Between subject, simple models easy
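The discreteness caveat can be made concrete with a small sketch. It assumes a one-sample sign-flipping test, where each of the n subjects' signs can be flipped, giving 2^n equally likely relabelings; the function name `attainable_level` is hypothetical and only for illustration.

```python
import math
from fractions import Fraction

def attainable_level(n_subjects, alpha):
    """Smallest possible P-value and largest attainable test level <= alpha
    for a one-sample sign-flipping test (assumed setting, 2**n relabelings)."""
    n_perm = 2 ** n_subjects
    smallest_p = Fraction(1, n_perm)
    # P-values are multiples of 1/n_perm, so the achievable level is rounded down.
    level = Fraction(math.floor(alpha * n_perm), n_perm)
    return smallest_p, level

for n in (5, 12):
    smallest_p, level = attainable_level(n, 0.05)
    print(n, float(smallest_p), float(level))
```

With 5 subjects the only attainable level at or below 0.05 is 1/32, so a nominal 0.05 test is really a 0.03125 test; with 12 subjects the grid is fine enough that the gap is negligible.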

Univariate Robustness & Test Power • Parametric, Two-sample t-test • Reduced sensitivity • From outliers or, with un-balanced data, non-normality or heteroscedasticity • Impact for imaging • Safe, but possibly conservative approach • Not getting the most out of the data

Univariate Robustness & Test Power • Non-Parametric tests • Sensitivity varies with test! • Just because all tests are “Exact” doesn’t mean all have same sensitivity to Ha • When Normality true, or almost, t-test is optimal • Indicates permutation t-test is good • When data very non-normal, other tests better • E.g. median • “Robust” methods – Iteratively Re-weighted Least Squares (Wager, NI, 2005)

Univariate Robustness & Test Power • Non-Parametric tests • In-flight Monte Carlo Simulation • One-sample test on differences, 12 Subjects • 11 Ss have effect size 1 • 1 S has effect size -2 • Compare power of two permutation tests • Median & t-test • Conclusion • Both tests “exact”, but Median more sensitive in the presence of outliers • [Figure: Normal data, 1,000 realizations]

Univariate Robustness & Test Power • Implications for Imaging • Non-normality (group heterogeneity) can reduce sensitivity • Alternate test statistics can out-perform standard methods

Mass-Univariate Inference • Interesting Result? • t = 5.446 • P-value 4.3×10⁻⁵! • Look at the data • Contrast unremarkable • Standard deviation low • White matter! • Must account for multiple tests! • [Figure: FIAC group data, 15 subjects, block design data, Different Speaker & Sentence Effect]

Mass-Univariate Robustness & Test Validity • 100,000 tests, 0-100,000 false positives! • No unique measure of false positives • Just two: • Familywise Error Rate (FWER) • Chance of existence of one or more false positives • False Discovery Rate (FDR) • Expected fraction of false positives (among all detections)
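The two error measures can be written compactly. The notation is an assumption introduced here, not taken from the slides: V is the number of false positives and R the total number of detections, with the ratio taken as zero when nothing is detected.

```latex
\[
\mathrm{FWER} = P(V \ge 1),
\qquad
\mathrm{FDR} = E\!\left[\frac{V}{\max(R,\,1)}\right]
\]
```

When every null hypothesis is true, all detections are false, so V = R and the two measures coincide.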

Mass-Univariate Robustness & Test Validity: FWER methods • Parametric, Random Field Theory • Provides thresholds that control FWER • Assumes data is a smooth random field • Very flexible framework • Closed form results for t/Z/F... • Can be conservative • Low DF • Low smoothness

Mass-Univariate Robustness & Test Validity: FWER methods • Non-Parametric • Use permutation to find null max distribution • No smoothness assumptions • “Exact” control of FWER • Not very flexible • But can get a lot of mileage out of 1-, 2-sample t, and correlation
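The null-max idea can be sketched for the one-sample case. This is a minimal illustration, assuming a subjects-by-voxels array of contrast values, random sign-flipping as the relabeling scheme, and a one-sided test; the function names are hypothetical and this is not the implementation used for the results in the talk.

```python
import numpy as np

def one_sample_t(data):
    """Voxelwise one-sample t statistics for a (subjects, voxels) array."""
    n = data.shape[0]
    return data.mean(axis=0) / (data.std(axis=0, ddof=1) / np.sqrt(n))

def fwer_corrected_pvalues(data, n_perm=1000, seed=0):
    """FWER-corrected P-values from the permutation distribution of the max t."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    t_obs = one_sample_t(data)
    max_null = np.empty(n_perm)
    max_null[0] = t_obs.max()  # the identity relabeling counts as one permutation
    for i in range(1, n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n, 1))  # flip whole subject images
        max_null[i] = one_sample_t(signs * data).max()
    # Corrected P-value: share of relabelings whose image-wide max reaches t_obs.
    p_fwer = (max_null[None, :] >= t_obs[:, None]).mean(axis=1)
    return t_obs, p_fwer

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    data = rng.standard_normal((15, 500))  # 15 subjects, 500 null voxels
    data[:, 0] += 5.0                      # one voxel with a large true effect
    t_obs, p_fwer = fwer_corrected_pvalues(data, n_perm=500)
    print("voxels with corrected P <= 0.05:", np.flatnonzero(p_fwer <= 0.05))
```

Because each relabeling flips a whole subject image at once, the spatial dependence between voxels is preserved, which is why no smoothness assumption is needed.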

Mass-Univariate Robustness & Test Power: FWER methods • Parametric, Random Field Theory • Can be conservative when... • Low DF • Low smoothness • Nonparametric Permutation • More powerful when RFT has problems

FWER Thresholds: RFT vs. Perm • RF & Perm adapt to smoothness • Perm & Truth close • Bonferroni close to truth for low smoothness • [Figure panels: 9 df; 19 df]

Real Data – Threshold: RFT vs. Bonf. vs. Perm.

Real Data – Number of voxels found: RFT vs. Bonf. vs. Perm.

Mass-Univariate Inference • FWER-Corrected P-value: 0.9878 • FDR-Corrected P-value: 0.1122 • Interpretation • This result is totally consistent with the null hypothesis when searching 26,000 voxels • [Figure: FIAC group data, 15 subjects, block design data, Different Speaker & Sentence Effect]
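A back-of-envelope check shows why the voxelwise P-value stops being impressive at this search volume. The sketch uses only the two numbers quoted on the slides (26,000 voxels and the uncorrected 4.3×10⁻⁵). The Bonferroni correction here is a stand-in for illustration; the slides do not say which method produced the 0.9878 figure.

```python
n_voxels = 26_000        # search volume quoted on the slide
p_uncorrected = 4.3e-5   # voxelwise P-value quoted for t = 5.446

# Expected number of voxels at least this extreme if every voxel is null.
expected_null_hits = n_voxels * p_uncorrected

# Bonferroni-corrected P-value (illustrative only), capped at 1.
p_bonferroni = min(1.0, n_voxels * p_uncorrected)

print(f"expected null hits: {expected_null_hits:.3f}")
print(f"Bonferroni P-value: {p_bonferroni:.3f}")
```

On average about one voxel in a completely null image would look this extreme, so a single such voxel carries little evidence on its own.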

Robustness Conclusions • Separately consider validity and sensitivity • Validity • Most methods fairly robust • Event-related fMRI probably least robust • Sensitivity • Standard univariate methods suffer under non-normality, heterogeneity • RFT FWER thresholds can lack sensitivity under low DF, low smoothness • Nonparametric methods, while not fully general, provide good power under problematic settings

Permutation for fMRI: BOLD vs. ASL • Temporal Autocorrelation • BOLD fMRI has it • Makes permutation test difficult • Differenced ASL data • Differenced ASL data is white (Aguirre et al.) • Permutation test now easy • Though Aguirre found that regressing out movement parameters was necessary to get nominal FPRs

BOLD vs. ASL: My stance: Don’t Difference! • Model the control/label effect • Differenced data has length n/2 • Only using ½ the data is suboptimal • Gauss-Markov Theorem • Optimally precise estimates come from full, whitened model • Advantages • Uses standard BOLD fMRI modeling tools • Reference • Mumford, Hernandez & Nichols, Estimation Efficiency and Statistical Power in Arterial Spin Labeling FMRI. Provisionally accepted, NeuroImage.

ASL w/out Differencing: Full Data Design Matrix Columns • Model all n observations • Predictors • Baseline BOLD • Baseline perfusion • BOLD Δ • Perfusion Δ
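One way to lay out such a four-column design is sketched below. The coding is an assumption rather than the matrix from the talk: control and label scans alternate and are coded +0.5 and -0.5, the task is an unconvolved off/on boxcar, and the function name `asl_design` is hypothetical.

```python
import numpy as np

def asl_design(n_scans, block_len):
    """Full-data ASL design with columns: baseline BOLD, baseline perfusion,
    BOLD change, perfusion change (illustrative coding, no HRF convolution)."""
    t = np.arange(n_scans)
    label = np.where(t % 2 == 0, 0.5, -0.5)      # control = +0.5, label = -0.5
    task = ((t // block_len) % 2).astype(float)  # off/on boxcar
    return np.column_stack([
        np.ones(n_scans),  # baseline BOLD: constant signal level
        label,             # baseline perfusion: control/label oscillation
        task,              # BOLD change with task
        task * label,      # perfusion change: task-modulated oscillation
    ])

if __name__ == "__main__":
    X = asl_design(n_scans=80, block_len=10)
    print(X.shape, np.linalg.matrix_rank(X))
```

All n scans enter the model, and the perfusion effects appear as ordinary regressors, so standard GLM tools apply unchanged.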

ASL w/out Differencing: Difference in Power Relative to Modeling Full Data and Autocorrelation • Two key aspects • Model all data • Account for autocorrelation • Theoretical Result • Better power!
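The Gauss-Markov argument can be checked numerically on a toy design. The sketch compares the variance of the perfusion-change estimate under full-data GLS with that of OLS on pairwise-differenced data. It assumes AR(1) noise with a known correlation `rho`, alternating control/label scans coded ±0.5, and an unconvolved boxcar with an even block length; none of these settings come from the talk.

```python
import numpy as np

def perfusion_change_variances(n_scans=80, block_len=10, rho=0.3):
    """Variance of the perfusion-change estimate: full-data GLS vs. OLS on
    control-minus-label differences, under AR(1) noise (unit variance)."""
    t = np.arange(n_scans)
    label = np.where(t % 2 == 0, 0.5, -0.5)
    task = ((t // block_len) % 2).astype(float)
    X = np.column_stack([np.ones(n_scans), label, task, task * label])
    V = rho ** np.abs(t[:, None] - t[None, :])  # AR(1) correlation matrix

    # Full model, whitened: covariance of the GLS estimator.
    cov_gls = np.linalg.inv(X.T @ np.linalg.inv(V) @ X)

    # Differencing matrix: each row takes control minus the following label scan.
    rows = np.arange(n_scans // 2)
    D = np.zeros((n_scans // 2, n_scans))
    D[rows, 2 * rows] = 1.0
    D[rows, 2 * rows + 1] = -1.0
    Xd = D @ X[:, [1, 3]]  # BOLD columns difference to zero for even block_len
    A = np.linalg.inv(Xd.T @ Xd) @ Xd.T
    cov_diff = A @ (D @ V @ D.T) @ A.T  # OLS on differences, true covariance

    return cov_gls[3, 3], cov_diff[1, 1]

if __name__ == "__main__":
    var_gls, var_diff = perfusion_change_variances()
    print(f"GLS variance: {var_gls:.4f}  differenced OLS variance: {var_diff:.4f}")
```

The differenced estimator is itself linear and unbiased in the full model, so the GLS variance can never exceed it; the size of the gap depends on the assumed autocorrelation.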

ASL w/out Differencing: Difference in Z score, Full Model GLS vs. Difference Data OLS • Two key aspects • Model all data • Account for autocorrelation • Real Data Result • Bigger Z’s! (on average)

ASL Conclusion • Intrasubject Inferences with ASL • Differenced ASL data is white only when the noise is 1/f • Worry about validity of intrasubject permutation test • Group Inferences with ASL • Data then looks just like BOLD fMRI • Permutation test easy again
