Factor Analysis of MRI-Derived Tongue Shapes

Mark Hasegawa-Johnson, ECE Department and Beckman Institute, University of Illinois at Urbana-Champaign

Background • The vowel sounds of English are classified in two dimensions: “high/low” and “front/back.” • High: i (front), u (back) • Mid: e (front), o (back) • Low: ae (front), a (back)

Background • The tongue is composed of about 9 muscles (4 intrinsic, 5 extrinsic): • Superior Longitudinalis • Palatoglossus • Styloglossus • Verticalis • Superior Pharyngeal Constrictor • Transversus • Genioglossus • Inferior Longitudinalis • Hyoglossus

Theories of Motor Control • Theory 1: Direct Control • Theory 2: Hierarchical Control

Factor Analysis of X-Ray Images (Harshman, Ladefoged, & Goldstein, 1977) • Finding: Two factors account for 92% of variance.

Factor loadings seem to represent distinctive features: • v1 = [a front] • v2 = [b high]

Can Three-Dimensional Tongue Shape be Explained Using Shape Factors? • Hypothesis 1: 3D tongue shape during speech = weighted sum of 2-3 factors. • Hypothesis 2: Shape of the factors t1(i), t2(i) is speaker-dependent. (??)
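Hypothesis 1 can be read as a small linear model. The sketch below is illustrative only: the grid size, the random placeholder factor shapes, the inclusion of a mean shape, and the names `mean_shape`, `factors`, and `synthesize_shape` are all assumptions, not values or methods taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points = 25 * 40   # assumed: one surface value per grid point
n_factors = 2        # Hypothesis 1 allows 2-3 factors

# Placeholder arrays standing in for a measured mean shape and factor shapes.
mean_shape = rng.normal(size=n_points)
factors = rng.normal(size=(n_factors, n_points))   # rows play the role of t1(i), t2(i)

def synthesize_shape(weights):
    """Return the mean shape plus a weighted sum of the factor shapes."""
    weights = np.asarray(weights, dtype=float)
    return mean_shape + weights @ factors

# One hypothetical vowel, described by just two numbers.
shape = synthesize_shape([0.5, -1.2])
```

Under this reading, each vowel is summarized by its weight vector, and Hypothesis 2 asks whether the rows of `factors` must be estimated separately for each speaker.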

Why is 3D Different from 2D? • Linear Source-Filter Theory: • Vowel quality is determined by cross-sectional areas • Area is correlated with midsagittal width

Do Shape Factors Exist in 3D? • If inter-speaker shape similarity is governed by desire for acoustic similarity, and... • If acoustic similarity depends on cross-sectional area, not cross-sectional shape... • Then Variation in 3D Shape May Not Have a Shape Factor Basis

Factor Analysis of MRI-Derived Tongue Shapes: Methodology 1. Recruit Subjects 2. Collect MRI Images 3. Segment the Images 4. Interpolate ROI to Create 3D Tongue Shapes for Each Vowel 5. Speaker-Dependent Factor Analysis 6. Speaker-Independent Factor Analysis

Subject Recruitment: • Ten subjects recruited; five successfully imaged (3 male, 2 female). • Subjects were college undergrads and grads with no metal fillings and no claustrophobia. • Subjects were trained to sustain vowel sounds with little variation. • Human subjects approval: both UCLA and Cedars-Sinai Medical Center.

MRI Image Collection • GE Signa 1.5T • T1-weighted • 3mm slices • 24 cm FOV • 256 x 256 pixels • Coronal, Axial • 11-18 sounds per subject • Breath-hold in vowel position for 25 seconds
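The imaging parameters imply an in-plane resolution that is useful when reading the calibration errors reported later. The arithmetic below is a sketch that assumes a square field of view and treats the slice thickness as the voxel depth; the variable names are illustrative.

```python
fov_mm = 240.0             # 24 cm field of view
matrix_size = 256          # 256 x 256 pixels
slice_thickness_mm = 3.0   # 3 mm slices

# Assumption: square FOV, so pixels are square with this edge length.
pixel_size_mm = fov_mm / matrix_size
voxel_volume_mm3 = pixel_size_mm ** 2 * slice_thickness_mm

print(f"in-plane pixel size: {pixel_size_mm:.4f} mm")
print(f"voxel volume: {voxel_volume_mm3:.3f} mm^3")
```

Each pixel is therefore a little under 1 mm on a side, roughly three times finer than the spacing between slices.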

Image Viewing and Segmentation: the CTMRedit GUI and toolbox • Display series of CT or MR image slices • Segment ROI manually or automatically • Interpolate and reconstruct ROI in 3D space

Calibration: Segmentation of Phantom (J. Cha) • Test tubes of 3 sizes • Absolute error of radius estimated from manual segmentation: • typical case: 0.1mm • worst case: 0.4mm

Calibration: Articulatory Speech Synthesis (J. Cha) • /a,i,u/ synthesized using Maeda articulatory synthesizer • F1-F4 errors: • worst case: +/- 30% • mean error: +2.8% • std dev: 19.5%

Reconstruction of ROI • Interpolate between image slices to create 3D object.
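One simple way to interpolate between slices is to blend corresponding contour points linearly. The sketch below is not the CTMRedit implementation; it assumes the two contours have already been resampled to the same number of corresponding points, and the function name `interpolate_slices` and the circular test contours are hypothetical.

```python
import numpy as np

def interpolate_slices(contour_a, contour_b, z_a, z_b, z):
    """Linearly blend two corresponding contours to estimate the contour at height z."""
    contour_a = np.asarray(contour_a, dtype=float)
    contour_b = np.asarray(contour_b, dtype=float)
    if contour_a.shape != contour_b.shape:
        raise ValueError("contours must have matching point counts")
    t = (z - z_a) / (z_b - z_a)   # 0 at slice a, 1 at slice b
    return (1.0 - t) * contour_a + t * contour_b

# Two hypothetical circular contours on slices 3 mm apart.
theta = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
unit_circle = np.column_stack([np.cos(theta), np.sin(theta)])
lower = 10.0 * unit_circle
upper = 14.0 * unit_circle

# Estimated contour halfway between the two slices.
middle = interpolate_slices(lower, upper, z_a=0.0, z_b=3.0, z=1.5)
```

Establishing the point correspondence between slices is the hard part in practice and is not addressed here.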

Tongue Shape During /ae/

Speaker Normalization: VT Length, Inter-Molar Width (S. Pizza)

Speaker-Dependent Factor Analysis • 12 tongue shapes from one speaker: • Each tongue shape modeled as a 25 point x 40 point rubber sheet. • Principal Components Analysis: • 11 Non-Zero Factors (12 vowels - 1 mean vector = 11 degrees of freedom). • 2 Factors: 78% of variance • 3 Factors: 88% of variance
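The degrees-of-freedom count can be checked numerically. The sketch below runs a principal components analysis via the singular value decomposition on placeholder data: random numbers stand in for the 12 measured 25 x 40 surfaces, so the variance fractions it prints will not match the 78% and 88% reported on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vowels, n_rows, n_cols = 12, 25, 40

# Placeholder data: each row is one flattened tongue surface.
shapes = rng.normal(size=(n_vowels, n_rows * n_cols))

# Removing the mean vector costs one degree of freedom.
centered = shapes - shapes.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)

# Variance carried by each principal component.
variance = singular_values ** 2
cumulative = np.cumsum(variance) / variance.sum()

# Count components that are non-zero up to numerical precision.
nonzero = int(np.sum(singular_values > 1e-10 * singular_values[0]))

print("non-zero factors:", nonzero)
print("variance explained by 2 factors:", cumulative[1])
print("variance explained by 3 factors:", cumulative[2])
```

With 12 observations and the mean removed, at most 11 components can be non-zero regardless of how many grid points describe each shape.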

“Excuses:” Why Didn’t it Work? • Tongue Length changes from /ao/ to /iy/. • Human Transcriber Error? • Interpolation to Form 3D Image Causes Error • Spline & Sinc interpolation: very large errors • Linear interpolation: smaller errors, but still too large.

New Approaches: Avoid Interpolation • General Method: Avoid interpolation by modeling the measured data directly. • J. Huang: Control factor shape using an a priori probability distribution. • Y. Zheng: Limit factors to the set of polynomial surfaces.

Polynomial Smoothing (Y. Zheng) • Polynomial Surface Modeling • Tongue shape = polynomial surface • 4D surface model enforces smoothness constraints. • Hybrid Polynomial/Factor model • Midsagittal tongue shape is as predicted by Harshman et al. • 3D shape = (midsagittal shape) x (polynomial)
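Fitting a polynomial surface directly to scattered measurements avoids resampling onto a regular grid. The sketch below is a generic least-squares fit, not Y. Zheng's model: the degree, the synthetic noise-free data, and the function names `design_matrix`, `fit_polynomial_surface`, and `evaluate_surface` are assumptions made for illustration.

```python
import numpy as np

def design_matrix(x, y, degree):
    """Columns are the monomials x**i * y**j with i + j <= degree."""
    return np.column_stack([x**i * y**j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])

def fit_polynomial_surface(x, y, z, degree=2):
    """Least-squares coefficients of a polynomial surface z = f(x, y)."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, y, degree), z, rcond=None)
    return coeffs

def evaluate_surface(coeffs, x, y, degree=2):
    """Evaluate the fitted surface at the given (x, y) locations."""
    return design_matrix(x, y, degree) @ coeffs

# Synthetic scattered samples of a known quadratic surface.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = rng.uniform(-1.0, 1.0, 200)
z = 1.0 + 0.5 * x - 2.0 * x * y + 0.3 * y**2

coeffs = fit_polynomial_surface(x, y, z)
residual = np.max(np.abs(evaluate_surface(coeffs, x, y) - z))
```

Because the surface is a low-order polynomial, smoothness is built into the model rather than imposed afterward.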

Conclusions • X-ray analysis suggests hierarchical motor control, but... • “Hierarchical control” might reflect structure of the acoustic space. • MRI analysis does not find hierarchical control (yet), but... • Negative finding might be result of methodological weakness.

