Characterisation of individuals’ formant dynamics using polynomial equations. IAFPA 2006. Kirsty McDougall, Department of Linguistics, University of Cambridge, kem37@cam.ac.uk
Speaker characteristics and static features of speech • Most previous research has focussed on static features (instantaneous, average) • Straightforward to measure • Natural progression from other research areas: delineation of different languages and language varieties
Speaker characteristics and static features of speech • Reflect certain anatomical dimensions of a speaker, e.g. formant frequencies ~ length and configuration of the vocal tract (VT) • Instantaneous and average measures demonstrate speaker differences, but are unable to distinguish all members of a population • So look to dynamic (time-varying) features
Dynamic features of speech • More information than static features • Reflect movement of a person’s speech organs as well as their dimensions • People move in individual ways for skilled motor activities: walking, running, … and speech
Dynamic features of speech • Can view speech as achievement of a series of linguistic ‘targets’ • Speakers are likely to exhibit similar properties at ‘targets’ (e.g. segment midpoints), but move between these in individual ways • So examine formant frequency dynamics
Formant dynamics [Figure: two panels, frequency (Hz) against time (s), marked in 10% steps: /aɪ/ in ‘bike’ uttered by two male speakers of Australian English]
Research Questions • How do speakers’ formant dynamics reflect individual differences in the production of the sequence /aɪk/? • How can this dynamic information be captured to characterise individual speakers?
Target words: bike /baɪk/, hike /haɪk/, like /laɪk/, mike /maɪk/, spike /spaɪk/ (all containing /aɪk/)
Data set • e.g. “I don’t want the scooter, I want the bike now.” “Later won’t do, I want the bike now.” • 5 repetitions x 5 words (bike, hike, like, mike, spike) x 2 stress levels (nuclear, non-nuclear) x 2 speaking rates (normal, fast) = 100 tokens per subject
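The token count follows directly from crossing the four design factors. As a minimal sketch (the dictionary keys and the helper name `tokens_in_condition` are illustrative assumptions, not taken from the slides), the design can be enumerated like this:

```python
from itertools import product

# Design factors as listed on the slide.
WORDS = ["bike", "hike", "like", "mike", "spike"]
STRESS_LEVELS = ["nuclear", "non-nuclear"]
SPEAKING_RATES = ["normal", "fast"]
REPETITIONS = range(1, 6)  # 5 repetitions of each word in each condition

# One record per token produced by a single subject.
tokens = [
    {"word": word, "stress": stress, "rate": rate, "repetition": rep}
    for word, stress, rate, rep in product(
        WORDS, STRESS_LEVELS, SPEAKING_RATES, REPETITIONS
    )
]


def tokens_in_condition(all_tokens, rate, stress):
    """Return the tokens belonging to one rate/stress condition."""
    return [
        t for t in all_tokens if t["rate"] == rate and t["stress"] == stress
    ]


print(len(tokens))
print(len(tokens_in_condition(tokens, "fast", "nuclear")))
```

Each rate/stress condition therefore holds 5 words x 5 repetitions = 25 tokens per subject, which matches the "25 tokens of /aɪk/" quoted for Speaker E in the discriminant analysis slides.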
Subjects • 5 adult male native speakers of Australian English (A, B, C, D, E) • aged 22-28 • Brisbane/Gold Coast, Queensland
Speaker A “bike” (normal-nuclear) [Figure: F1, F2 and F3 measured at 10% steps (10% to 90%) between markers 1 and 2]
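The slides do not say how the values at each 10% step were extracted. One plausible way, sketched here under the assumption that a formant track is available as (time, frequency) pairs with strictly increasing times, is linear interpolation at 10%, 20%, …, 90% of the interval between the two markers. The function name and the example track are hypothetical.

```python
def sample_at_percent_steps(times, freqs, start, end, steps=range(10, 100, 10)):
    """Linearly interpolate a formant track at percentages of [start, end].

    times must be strictly increasing and must cover the whole interval.
    Returns one frequency per percentage step.
    """
    samples = []
    for pct in steps:
        t = start + (end - start) * pct / 100.0
        # Find the pair of neighbouring track points that brackets t.
        for i in range(len(times) - 1):
            t0, t1 = times[i], times[i + 1]
            if t0 <= t <= t1:
                weight = (t - t0) / (t1 - t0)
                samples.append(freqs[i] + weight * (freqs[i + 1] - freqs[i]))
                break
        else:
            raise ValueError(f"time {t} lies outside the formant track")
    return samples


# Made-up three-point track, for illustration only (seconds, Hz).
track_times = [0.0, 0.1, 0.2]
track_freqs = [500.0, 700.0, 600.0]
contour = sample_at_percent_steps(track_times, track_freqs, 0.0, 0.2)
print(contour)
```

Sampling at proportions of the duration rather than at fixed time offsets makes tokens of different lengths comparable point by point.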
F1, normal-nuclear [Figure: frequency (Hz) against +10% step of /aɪ/]
F2, normal-nuclear [Figure: frequency (Hz) against +10% step of /aɪ/]
F3, normal-nuclear [Figure: frequency (Hz) against +10% step of /aɪ/]
Discriminant Analysis Multivariate technique used to determine whether a set of predictors (formant frequency measurements) can be combined to predict group (speaker) membership (ref. Tabachnick and Fidell 1996)
Discriminant Analysis [Figure: scatterplot of Function 2 against Function 1 for the fast-nuclear condition, speakers A to E] • Each datapoint represents 1 token • Each speaker’s tokens are represented with a different colour, e.g. Speaker E’s 25 tokens of /aɪk/
Discriminant Analysis • DA constructs discriminant functions which maximise differences between speakers (each function is a linear combination of the formant frequency predictors)
Discriminant Analysis • Assess how well the predictors distinguish speakers by extent of clustering of tokens + classification percentage: 95% for fast-nuclear
Discriminant Analysis 95% 89% 95% 88%
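To make the idea of a classification percentage concrete, here is a minimal sketch of linear discriminant classification with two predictors. It is not the analysis behind the slides: the data are synthetic, the speaker centres are invented, equal priors are assumed, tokens are classified with functions built from those same tokens, and the code scores tokens directly rather than constructing the canonical functions plotted in the scatterplots.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups, means):
    """Pooled within-group covariance matrix of the predictors."""
    dim = len(next(iter(means.values())))
    cov = [[0.0] * dim for _ in range(dim)]
    for label, rows in groups.items():
        for row in rows:
            dev = [x - m for x, m in zip(row, means[label])]
            for i in range(dim):
                for j in range(dim):
                    cov[i][j] += dev[i] * dev[j]
    dof = sum(len(rows) for rows in groups.values()) - len(groups)
    return [[value / dof for value in row] for row in cov]


def invert_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def classify(x, means, inv_cov):
    """Assign x to the group with the largest linear discriminant score."""
    def score(m):
        # Equal priors assumed: x' S^-1 m - 0.5 m' S^-1 m
        sm = [sum(inv_cov[i][j] * m[j] for j in range(len(m)))
              for i in range(len(m))]
        return (sum(xi * si for xi, si in zip(x, sm))
                - 0.5 * sum(mi * si for mi, si in zip(m, sm)))
    return max(means, key=lambda label: score(means[label]))


def classification_rate(groups):
    """Percentage of tokens assigned to their own speaker."""
    means = {label: mean_vector(rows) for label, rows in groups.items()}
    inv_cov = invert_2x2(pooled_covariance(groups, means))
    total = correct = 0
    for label, rows in groups.items():
        for row in rows:
            total += 1
            correct += classify(row, means, inv_cov) == label
    return 100.0 * correct / total


# Synthetic data: invented speaker centres (Hz), 25 tokens per speaker.
random.seed(0)
CENTRES = {"A": (600, 1100), "B": (650, 1250), "C": (700, 1000),
           "D": (550, 1350), "E": (750, 1200)}
groups = {
    label: [[random.gauss(f1, 20.0), random.gauss(f2, 20.0)] for _ in range(25)]
    for label, (f1, f2) in CENTRES.items()
}
rate = classification_rate(groups)
print(f"{rate:.0f}% of tokens classified correctly")
```

With real data the predictor set would be much larger than two values per token, and a statistics package would normally be used instead of hand-written matrix code.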
Discussion • DA scatterplots and classification rates promising • However, not very efficient – method essentially based on a series of instantaneous measurements, probably containing dependent information • Recall: individuals’ F1 contours of /aɪk/…
F1, normal-nuclear [Figure: frequency (Hz) against +10% step of /aɪ/]
A new approach… • Differences in location in frequency range • Differences in curvature: location of turning points, convex/concave, steep/shallow • Need to capture the most defining aspects of the contours efficiently • So use linear regression to parameterise the curves with polynomial equations
Linear regression • Technique for determining the equation of a line or curve which approximates the relationship between a set of (x, y) points [Figure: scatter of (x, y) points with a fitted straight line] • $y = a_0 + a_1 x$, where $a_0$ is the y-intercept and $a_1$ is the gradient
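For the straight-line case the least-squares estimates have a closed form: the gradient is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch (the function name `fit_line` and the sample points are illustrative):

```python
def fit_line(xs, ys):
    """Least-squares estimates (a0, a1) for y = a0 + a1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    a1 = sxy / sxx              # gradient
    a0 = y_mean - a1 * x_mean   # y-intercept
    return a0, a1


# Points lying exactly on y = 1 + 2x, so the fit recovers the line.
a0, a1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```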
Linear regression • Can also be used for curvilinear relationships [Figure: scatter of (x, y) points with a fitted curve] • Quadratic: $y = a_0 + a_1 x + a_2 x^2$, where $a_0$ is the y-intercept and $a_1$, $a_2$ determine the shape and direction of the curve
Polynomial Equations [Figure: example curve for each type, y against x] • Cubic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$ • Quartic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$ • Quintic: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5$
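A polynomial of any of these degrees can be fitted by ordinary least squares, because the model is still linear in the coefficients. The sketch below solves the normal equations in plain Python and reports a goodness-of-fit value R computed as sqrt(1 - SS_res / SS_tot). Both the solver and that definition of R are assumptions made for illustration, and the contour values are invented.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def polyfit(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ..., a_degree]."""
    size = degree + 1
    # Normal equations: (X'X) a = X'y for the design matrix X[i][k] = x_i ** k.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(size)]
           for i in range(size)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(xtx, xty)


def polyval(coeffs, x):
    """Evaluate a0 + a1 x + a2 x^2 + ... at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


def fit_r(xs, ys, coeffs):
    """R = sqrt(1 - SS_res / SS_tot) for the fitted polynomial."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - polyval(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Invented nine-point contour (Hz), for illustration only.
steps = [1, 2, 3, 4, 5, 6, 7, 8, 9]
contour = [470.0, 520.0, 590.0, 650.0, 690.0, 700.0, 670.0, 600.0, 500.0]
quadratic = polyfit(steps, contour, 2)
cubic = polyfit(steps, contour, 3)
print(fit_r(steps, contour, quadratic), fit_r(steps, contour, cubic))
```

Because the quadratic is a special case of the cubic, the cubic fit can never have a lower R on the same points. For high degrees or widely spread x values, a QR-based routine from a numerical library is more stable than the normal equations.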
/aɪk/ data • Fit F1, F2, F3 contours with polynomial equations • Test the reliability of the polynomial coefficients in distinguishing speakers • Quadratic: $y = a_0 + a_1 t + a_2 t^2$ • Cubic: $y = a_0 + a_1 t + a_2 t^2 + a_3 t^3$
“bike”, Speaker A (normal-nuclear token 1), F1 contour [Figure: actual data points with fitted curves; y = frequency (Hz), t = normalised time] • Quadratic fit: $y = 420.68 + 79.26t - 5.92t^2$, R = 0.879 • Cubic fit: $y = 478.85 - 46.07t + 35.62t^2 - 3.46t^3$, R = 0.978
“bike”, Speaker A (normal-nuclear token 1), F2 contour [Figure: actual data points with fitted curves; y = frequency (Hz), t = normalised time] • Quadratic fit: $y = 876.01 - 53.24t + 22.46t^2$, R = 0.985 • Cubic fit: $y = 825.49 + 55.64t - 13.63t^2 + 3.01t^3$, R = 0.991
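The reported equations can be evaluated directly from their coefficients. The sketch below stores Speaker A's coefficients as quoted on the slides and evaluates them with Horner's rule. The scale of normalised time t is not stated in the slides, so any t passed in beyond 0 is an assumption; the helper names are illustrative.

```python
def horner(coeffs, t):
    """Evaluate a0 + a1 t + a2 t^2 + ... with coefficients listed from a0 up."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * t + a
    return result


def quadratic_turning_point(coeffs):
    """Value of t at which a quadratic a0 + a1 t + a2 t^2 turns."""
    _, a1, a2 = coeffs
    return -a1 / (2 * a2)


# Coefficients quoted for Speaker A, "bike", normal-nuclear token 1.
F1_QUADRATIC = [420.68, 79.26, -5.92]
F1_CUBIC = [478.85, -46.07, 35.62, -3.46]
F2_QUADRATIC = [876.01, -53.24, 22.46]
F2_CUBIC = [825.49, 55.64, -13.63, 3.01]

# At t = 0 each polynomial reduces to its y-intercept a0.
print(horner(F1_QUADRATIC, 0.0), horner(F1_CUBIC, 0.0))
# The F1 quadratic has a negative t^2 coefficient, so this is a maximum.
print(quadratic_turning_point(F1_QUADRATIC))
```

The sign of the squared term shows the direction of curvature: negative for the F1 quadratic (the contour rises then falls) and positive for the F2 quadratic (it falls then rises).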
DA on polynomial coefficients • Quadratic: 3 formants x 3 coefficients = 9 predictors • Cubic: 3 formants x 4 coefficients = 12 predictors • Cubic + duration of /aɪ/: 12 + 1 = 13 predictors
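The predictor counts come from concatenating each formant's coefficients into one vector per token. A minimal sketch follows; the function name is hypothetical, the F1 and F2 values are Speaker A's cubic coefficients from the slides, and the F3 coefficients and the duration are placeholders invented purely to show the shape of the vector.

```python
def predictor_vector(coeffs_by_formant, duration=None):
    """Concatenate F1, F2 and F3 coefficients, optionally adding duration."""
    vector = []
    for formant in ("F1", "F2", "F3"):
        vector.extend(coeffs_by_formant[formant])
    if duration is not None:
        vector.append(duration)
    return vector


cubic_coeffs = {
    "F1": [478.85, -46.07, 35.62, -3.46],   # from the slides
    "F2": [825.49, 55.64, -13.63, 3.01],    # from the slides
    "F3": [2400.0, 10.0, -1.0, 0.1],        # placeholder, not from the slides
}
# Dropping the cubic term only illustrates the vector length; a real
# quadratic fit would have its own, different coefficients.
quadratic_shape = {name: values[:3] for name, values in cubic_coeffs.items()}

print(len(predictor_vector(quadratic_shape)))             # quadratic set
print(len(predictor_vector(cubic_coeffs)))                # cubic set
print(len(predictor_vector(cubic_coeffs, duration=0.15)))  # cubic + duration
```

Each token contributes one such vector to the discriminant analysis, with the speaker label as the group to be predicted.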
Comparison of Classification Rates [Figure: % correct classification for each predictor set; no. of predictors: (9), (12), (13), (20)]