Effect Estimation & Testing

Effect Estimation & Testing. Thomas Nichols, Ph.D., Assistant Professor, Department of Biostatistics. http://www.sph.umich.edu/~nichols. fMRI Course, OHBM 2004

Outline • Data Modeling • General Linear Model • GLM Issues • Statistical Inference • Statistic Images & Hypothesis Testing • Multiple Testing Problem

Basic fMRI Example • Data at one voxel • Rest vs. passive word listening • Is there an effect?

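A Linear Model • Intensity = $\beta_1 x_1 + \beta_2 x_2 + \varepsilon$ • “Linear” in parameters $\beta_1$ & $\beta_2$ • $\varepsilon$: error • [Figure: intensity over time shown as the sum of $\beta_1 x_1$, $\beta_2 x_2$, and error]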

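Linear model, in image form… [Figure: data = $\beta_1 x_1$ + $\beta_2 x_2$ + error, each term shown as an image]

… in image matrix form… [Figure: data = design matrix × parameters + error]

General Linear Model • … in matrix form: $Y = X\beta + \varepsilon$ • $Y$: $N \times 1$, $X$: $N \times p$, $\beta$: $p \times 1$, $\varepsilon$: $N \times 1$ • N: Number of scans, p: Number of regressors • Really general • Correlation • ANOVA • ANCOVA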
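As a rough illustration of the matrix form above, the sketch below fits $Y = X\beta + \varepsilon$ by ordinary least squares on simulated single-voxel data. The block design, the "true" parameter values, the noise level, and all variable names are assumptions made for this example and are not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 84                                   # number of scans (assumed)
# x1: boxcar, 6 scans rest then 6 scans listening, repeated (assumed design)
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], N // 12)
x2 = np.ones(N)                          # x2: constant term (baseline intensity)
X = np.column_stack([x1, x2])            # N x p design matrix, p = 2

beta_true = np.array([2.0, 100.0])       # assumed parameters for the simulation
Y = X @ beta_true + rng.normal(0.0, 1.0, N)   # simulated voxel time series

# Ordinary least squares estimate of beta
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Residual variance estimate with N - p degrees of freedom
residuals = Y - X @ beta_hat
sigma2_hat = residuals @ residuals / (N - X.shape[1])

print(beta_hat, sigma2_hat)
```

Ordinary least squares is only appropriate here because the simulated errors are independent; the autocorrelation issue listed on the next slide is ignored in this sketch.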

Linear Model Issues • Signal Predictors • Block • Event-related • Nuisance Predictors • Drift • Motion parameters • Autocorrelation • Random effects

Temporal Autocorrelation In Brief

Random Effects Models • GLM has only one source of randomness • Residual error • But people are another source of error • Everyone activates somewhat differently…

Fixed vs. Random Effects • Fixed Effects • Intra-subject variation suggests all these subjects different from zero • Random Effects • Intersubject variation suggests population not very different from zero • [Figure: distribution of each subject’s estimated effect (Subj. 1 to Subj. 6) relative to 0, and the distribution of the population effect]

Random Effects for fMRI • Summary Statistic Approach • Easy • Create contrast images for each subject • Analyze contrast images with one-sample t • Limited • Only allows one scan per subject • Assumes balanced designs and homogeneous measurement error • Full Mixed Effects Analysis • Harder • Requires iterative fitting • REML to estimate inter- and intra-subject variance • SPM2 & FSL3 implement this differently • Very flexible
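The summary statistic approach can be sketched at a single voxel as follows: each subject contributes one contrast value, and a one-sample t test across subjects uses the between-subject spread as its error term. The six contrast values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical first-level contrast values, one per subject, at one voxel
con = np.array([1.2, 0.4, 0.9, -0.1, 1.5, 0.7])

n = con.size
mean = con.mean()
sd = con.std(ddof=1)                  # between-subject standard deviation
t_rfx = mean / (sd / np.sqrt(n))      # one-sample t with n - 1 degrees of freedom

print(t_rfx, n - 1)
```

Because only the between-subject variability enters the denominator, this matches the random effects picture: a large statistic requires the effect to be consistent across subjects, not just precisely measured within each one.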

Random Effects for fMRI: Random vs. Fixed • Fixed isn’t “wrong”, just usually isn’t of interest • If it is sufficient to say “I can see this effect in this cohort”, then fixed effects are OK • If you need to say “If I were to sample a new cohort from the population I would get the same result”, then random effects are needed

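Building Statistic Images • Contrast • A linear combination of parameters, e.g. $c' = (1\ 0\ 0\ 0\ 0\ 0\ 0\ 0)$ • Truth: $c'\beta$ • Estimate: $c'\hat\beta$ • $T = \dfrac{\text{contrast of estimated parameters}}{\sqrt{\text{variance estimate}}} = \dfrac{c'\hat\beta}{\sqrt{\hat\sigma^2\, c'(X'X)^{-1} c}}$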
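A minimal sketch of the statistic above, assuming an ordinary least squares fit with independent errors. The function name `contrast_t`, the two-regressor design, and the simulated data are hypothetical; the slide's contrast has eight entries, while this toy design has two.

```python
import numpy as np

def contrast_t(X, Y, c):
    """t statistic for the contrast c'beta in the GLM Y = X beta + error."""
    N, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y                 # OLS parameter estimates
    resid = Y - X @ beta_hat
    sigma2_hat = resid @ resid / (N - p)         # residual variance estimate
    return (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ XtX_inv @ c))

# Toy data: boxcar regressor plus constant (assumed design and effect sizes)
rng = np.random.default_rng(1)
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([x1, np.ones(84)])
Y = X @ np.array([2.0, 100.0]) + rng.normal(0.0, 1.0, 84)

c = np.array([1.0, 0.0])                         # picks out the first parameter
T = contrast_t(X, Y, c)
print(T)
```

With an intercept-only design and c = (1), the same function reduces to the ordinary one-sample t statistic, which is a convenient sanity check.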

Hypothesis Testing • Assume Null Hypothesis of no signal • Given that there is no signal, how likely is our measured T? • P-value measures this • Probability of obtaining T as large or larger • $\alpha$ level • Acceptable false positive rate • [Figure: null distribution of T with the P-value marked]

Hypothesis Testing in fMRI • Massively Univariate Modeling • Fit model at each voxel • Create statistic images of effect • Which of 100,000 voxels are significant? • $\alpha = 0.05$ $\Rightarrow$ 5,000 false positives! • [Figure: statistic image thresholded at t > 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
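The arithmetic behind the 5,000 figure can be checked with a small simulation. This sketch assumes every voxel is null and that the statistic image consists of independent standard normal values (Z rather than t), which is a simplification made only for the example.

```python
import random
from statistics import NormalDist

random.seed(0)
n_voxels = 100_000
alpha = 0.05

# Expected number of false positives when every voxel is null
expected_fp = n_voxels * alpha

# One-sided threshold with per-voxel false positive rate alpha
u = NormalDist().inv_cdf(1 - alpha)

# Simulate a pure-noise statistic image and count suprathreshold voxels
z = [random.gauss(0.0, 1.0) for _ in range(n_voxels)]
observed_fp = sum(v > u for v in z)

print(expected_fp, observed_fp)
```

The simulated count should land close to the expected value, which is the multiple testing problem in miniature: an uncorrected threshold lets through thousands of voxels even when there is no signal anywhere.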

MCP Solutions: Measuring False Positives • Familywise Error Rate (FWER) • Familywise Error • Existence of one or more false positives • FWER is probability of familywise error • False Discovery Rate (FDR) • R voxels declared active, V falsely so • Observed false discovery rate: V/R • FDR = E(V/R)

FWER MCP Solutions • Bonferroni • Maximum Distribution Methods • Random Field Theory • Permutation

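FWER MCP Solutions: Controlling FWER w/ Max • FWER & distribution of maximum • FWER = P(FWE) = P(One or more voxels $\ge u$ | Ho) = P(Max voxel $\ge u$ | Ho) • 100(1-$\alpha$)%ile of max distribution controls FWER • FWER = P(Max voxel $\ge u_\alpha$ | Ho) $\le \alpha$ • [Figure: null distribution of the maximum with upper tail area $\alpha$ above $u_\alpha$]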
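The link between the maximum distribution and FWER can be illustrated by Monte Carlo. This sketch assumes 100 independent standard normal voxels per null image, which ignores spatial smoothness, and all names are hypothetical.

```python
import random

random.seed(0)
n_voxels, n_sims, alpha = 100, 2000, 0.05

def simulate_max():
    """Maximum of one simulated null image (independent N(0, 1) voxels)."""
    return max(random.gauss(0.0, 1.0) for _ in range(n_voxels))

# Monte Carlo estimate of the null distribution of the maximum statistic
max_null = sorted(simulate_max() for _ in range(n_sims))

# 100(1 - alpha)th percentile: at most alpha * n_sims simulated maxima exceed it
n_allowed = round(alpha * n_sims)
u_alpha = max_null[-n_allowed - 1]

# Check on fresh null images: fraction with any voxel above u_alpha
fwer = sum(simulate_max() > u_alpha for _ in range(n_sims)) / n_sims

print(u_alpha, fwer)
```

The estimated FWER on fresh images should sit near alpha, and the threshold is far above the single-voxel 5% threshold, reflecting the price of searching over many voxels.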

FWER MCP Solutions: Random Field Theory • Euler Characteristic $\chi_u$ • Topological Measure • #blobs - #holes • At high thresholds, just counts blobs • FWER = P(Max voxel $\ge u$ | Ho) = P(One or more blobs | Ho) $\approx$ P($\chi_u \ge 1$ | Ho) $\approx$ E($\chi_u$ | Ho) • [Figure: random field, threshold, and suprathreshold sets]

Random Field Intuition • Corrected P-value for voxel value t • Pc = P(max T > t) $\approx$ E($\chi_t$) $\approx$ $\lambda(\Omega)\,|\Lambda|^{1/2}\, t^2 \exp(-t^2/2)$ • Statistic value t increases • Pc decreases (of course!) • Search volume increases • Pc increases (more severe MCP) • Smoothness increases ($|\Lambda|^{1/2}$ smaller) • Pc decreases (less severe MCP)

Random Field Theory: Strengths & Weaknesses • Closed form results for E($\chi_u$) • Z, t, F, Chi-Squared Continuous RFs • Results depend only on volume & smoothness • Smoothness assumed known • Sufficient smoothness required • Results are for continuous random fields • Multivariate normality • Several layers of approximations • [Figure: lattice image data approximated by a continuous random field]

FWER MCP Solutions • Bonferroni • Maximum Distribution Methods • Random Field Theory • Permutation

Nonparametric Permutation Test • Parametric methods • Assume distribution of statistic under null hypothesis • Nonparametric methods • Use data to find distribution of statistic under null hypothesis • Any statistic! • [Figure: parametric null distribution and nonparametric null distribution, each marked at 5%]

Controlling FWER: Permutation Test • Parametric methods • Assume distribution of max statistic under null hypothesis • Nonparametric methods • Use data to find distribution of max statistic under null hypothesis • Any max statistic! • [Figure: parametric null max distribution and nonparametric null max distribution, each marked at 5%]

Permutation Test: Strengths • Requires only assumption of exchangeability • Under Ho, distribution unperturbed by permutation • Subjects are exchangeable • Under Ho, each subject’s A/B labels can be flipped • fMRI scans not exchangeable under Ho • Due to temporal autocorrelation • Need to de-correlate, then permute (Brammer, Bullmore et al, 1997)
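For the subject-level case, flipping a subject's A/B labels is equivalent to flipping the sign of that subject's A-B difference image. The sketch below enumerates all sign flips for a tiny hypothetical data set (6 subjects, 3 voxels) and builds the permutation distribution of the maximum t. The data and function names are made up for illustration.

```python
import itertools
import math

def one_sample_t(values):
    """One-sample t statistic for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)

def max_t(images):
    """Maximum over voxels of the across-subject one-sample t."""
    n_vox = len(images[0])
    return max(one_sample_t([img[v] for img in images]) for v in range(n_vox))

# Hypothetical A-B difference images: 6 subjects x 3 voxels
diff = [
    [1.2, 0.1, -0.3],
    [0.4, -0.2, 0.5],
    [0.9, 0.3, 0.1],
    [0.6, -0.4, -0.2],
    [1.5, 0.2, 0.4],
    [0.7, -0.1, -0.6],
]

observed = max_t(diff)

# Each relabeling flips the sign of whole subject images (all voxels together)
null_max = []
for signs in itertools.product([1, -1], repeat=len(diff)):
    flipped = [[s * v for v in img] for s, img in zip(signs, diff)]
    null_max.append(max_t(flipped))

# FWER-corrected p-value: share of relabelings with max t >= observed
p_corrected = sum(m >= observed for m in null_max) / len(null_max)
print(observed, p_corrected)
```

Flipping all voxels of a subject together preserves the spatial dependence between voxels, which is why the maximum statistic can be used without any smoothness assumption. With many subjects the full enumeration becomes large, and a random subset of relabelings is the usual substitute.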

Permutation Test: Limitations • Computational Intensity • Analysis repeated for each relabeling • Not so bad on modern hardware • No analysis discussed below took more than 3 hours • Implementation Generality • Each experimental design type needs unique code to generate permutations • Not so bad for population inference with t-tests

Measuring False Positives • Familywise Error Rate (FWER) • Familywise Error • Existence of one or more false positives • FWER is probability of familywise error • False Discovery Rate (FDR) • R voxels declared active, V falsely so • Observed false discovery rate: V/R • FDR = E(V/R)

False Discovery Rate Illustration • [Figure panels: Noise, Signal, Signal+Noise]

Control of Per Comparison Rate at 10% • Percentage of Null Pixels that are False Positives • Control of Familywise Error Rate at 10% • Occurrence of Familywise Error (marked FWE) • Control of False Discovery Rate at 10% • Percentage of Activated Pixels that are False Positives • [Figure: image panels for each scheme; percentages printed on the panels, in extraction order: 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%, 6.7%, 10.5%, 12.2%, 8.7%, 10.4%, 14.9%, 9.3%, 16.2%, 13.8%, 14.0%]

p(i) i/V q Controlling FDR:Benjamini & Hochberg • Select desired limit q on E(FDR) • Order p-values, p(1)p(2) ...  p(V) • Let r be largest i such that • Reject all hypotheses corresponding top(1), ... , p(r). 1 p(i) p-value i/V q 0 0 1 i/V

Example – Working Memory • fMRI Study of Working Memory • 12 subjects, block design • Marshuetz et al (2000) • Item Recognition • Active: View five letters, 2s pause, view probe letter, respond • Baseline: View XXXXX, 2s pause, view Y or N, respond • Second Level RFX • Difference image, A-B, constructed for each subject • One sample t test • [Figure: Active trial: UBKDA ... D ... yes; Baseline trial: XXXXX ... N ... no]

Example – Working Memory: RFT Result • Threshold • S = 110,776 • 2 × 2 × 2 voxels • 5.1 × 5.8 × 6.9 mm FWHM • u = 9.870 • Result • 5 voxels above the threshold • [Figure: $-\log_{10}$ p-value image]

Example – Working Memory: Non-Parametric Result • Threshold • u = 7.67 • Result • 58 voxels above the threshold • [Figures: permutation distribution of maximum t; $-\log_{10}$ p-value image]

Example – Working Memory: FDR Result • FDR Threshold • u = 3.83 • Result • 3,073 voxels above threshold

Conclusions • Must account for multiple comparisons • FWER • Random Field Theory • Simple to apply, but heavy on assumptions • Nonparametric • Exact, but requires more computation • FDR • More lenient measure of false positives – more powerful • Sociological calibration still underway (5%? 1%? 0.1%?)

Thanks • Slide help • Stefan Kiebel, Rik Henson, JB Poline, Andrew Holmes
