This work explores the challenges of automatic fluency assessment, identifying fluency as a subjective quantity that is difficult to measure. Building upon existing research, we propose that low-level acoustic measurements, particularly vocalic nuclei and silent pauses, serve as effective quantifiers of fluency. Using data from classroom recordings of second-language Chinese speech, we correlate acoustic features with human ratings, finding that Phonation Time Ratio is most positively correlated with Speech Flow. Our findings suggest pathways for future research including the detection of filled pauses and exploring additional quantifiers.
Towards Automatic Fluency Assessment Suma Bhat
The Problem • Fluency – subjective quantity • Design of right quantifiers critical • Measurement of the quantifiers required • Problem: Automatic Assessment
Problem is Hard • Fluency • Subjective quantity • Not readily measurable • More than just the opposite of disfluency • Key: design of good quantifiers
Previous Work • Main work by Cucchiarini et al. • Quantitative assessment of fluency in speech possible • Measure quantifiers of syllable rate and frequency of pauses • Use of ASR
Our Work • Previous work used language-specific training data to build ASR • Our focus: low-level acoustic measurements • Our thesis: Low-level acoustic variables are good quantifiers of fluency
Experiments • Start with speech signal • View acoustic data at a coarse level • Syllables well represented by corresponding vocalic nuclei • Vocalic nuclei and silent pauses can be detected automatically • Goal: correlate with human judgment
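The stated goal, correlating an acoustic measurement with human judgment, can be sketched as follows. This is a minimal illustration rather than the procedure used in the talk: the slides do not say which correlation coefficient was used, so Pearson's r is an assumption, and `pearson_r`, `toy_ptr`, and `toy_ratings` are hypothetical names with made-up values.

```python
import numpy as np


def pearson_r(feature_values, human_ratings):
    """Pearson correlation between one acoustic feature and one rating scale.

    Assumption: Pearson's r is used; the slides do not name the coefficient.
    """
    x = np.asarray(feature_values, dtype=float)
    y = np.asarray(human_ratings, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; [0, 1] is r(x, y).
    return float(np.corrcoef(x, y)[0, 1])


# Made-up illustrative values, NOT data from the study:
# one feature value and one human rating per speech snippet.
toy_ptr = [0.55, 0.62, 0.70, 0.81, 0.90]
toy_ratings = [2, 3, 3, 4, 5]
r = pearson_r(toy_ptr, toy_ratings)
print(f"r = {r:.2f}")
```

Running the same function once per feature and rating scale gives a table of coefficients from which the most positively and most negatively correlated features can be read off.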
Data • Data from rated assessment • Classroom recording of second-language Chinese speech • 20 s snippets rated for Speech Flow, Phonological Control, Lexical Accuracy, Disfluency, Delivery Skills, Fluency • Fluency most correlated with Speech Flow and Disfluency
Assessment of Fluency • dur1 = duration of speech without silent pauses • dur2 = total duration of speech • [Figure: Speech Flow and Disfluency]
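The two durations defined above can be turned into temporal measures with a few lines of code. This is a sketch under stated assumptions: Phonation Time Ratio is taken to be dur1 / dur2 (the usual definition; the slide gives only the two durations), and silent pauses are counted as the gaps between consecutive speech intervals, normalized per second. The function name `temporal_measures` and its input format are hypothetical.

```python
def temporal_measures(speech_intervals, total_duration_s):
    """Compute simple temporal fluency measures.

    speech_intervals: list of (start_s, end_s) tuples, one per stretch of
        speech with the silent pauses already removed (hypothetical format).
    total_duration_s: dur2, the total duration of the snippet in seconds.
    """
    # dur1: time spent speaking, excluding silent pauses.
    dur1 = sum(end - start for start, end in speech_intervals)
    dur2 = total_duration_s
    # Assumption: a silent pause is the gap between two speech intervals;
    # leading and trailing silence are not counted as pauses.
    n_pauses = max(len(speech_intervals) - 1, 0)
    return {
        "phonation_time_ratio": dur1 / dur2,
        "silent_pauses_per_second": n_pauses / dur2,
    }


# Example: two 0.5 s stretches of speech in a 2 s snippet.
measures = temporal_measures([(0.0, 0.5), (1.0, 1.5)], total_duration_s=2.0)
print(measures)
```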
Acoustic Measurements • Downsample to 16 kHz • Use intensity information • Segment utterance into regions of speech and silence (gives silent-pause-related information) • Detect vocalic regions in the speech segments (gives syllable-related information)
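The speech/silence segmentation step above can be sketched with frame-level intensity and a simple threshold. This is an illustration only, not the detector used in the talk: the frame length, the 25 dB threshold below the loudest frame, and the 0.2 s minimum pause are all assumed values, and `segment_speech` is a hypothetical name. Detection of vocalic nuclei is not covered here.

```python
import numpy as np


def segment_speech(signal, sample_rate=16000, frame_ms=10,
                   threshold_db=25.0, min_pause_s=0.2):
    """Return a list of (start_s, end_s) speech intervals.

    All parameter defaults are assumptions chosen for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return []
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    # Frame intensity in dB; the small constant avoids log10(0).
    intensity = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    # A frame is speech if it lies within threshold_db of the loudest frame.
    is_speech = intensity > intensity.max() - threshold_db

    # Collect runs of consecutive speech frames as [start_s, end_s].
    intervals = []
    start = None
    for i, flag in enumerate(is_speech):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            intervals.append([start * frame_len / sample_rate,
                              i * frame_len / sample_rate])
            start = None
    if start is not None:
        intervals.append([start * frame_len / sample_rate,
                          n_frames * frame_len / sample_rate])

    # Gaps shorter than min_pause_s are not treated as silent pauses.
    merged = []
    for seg in intervals:
        if merged and seg[0] - merged[-1][1] < min_pause_s:
            merged[-1][1] = seg[1]
        else:
            merged.append(seg)
    return [tuple(seg) for seg in merged]


# Synthetic check: 0.5 s tone, 0.5 s silence, 0.5 s tone at 16 kHz.
t = np.arange(8000) / 16000
tone = np.sin(2 * np.pi * 200 * t)
demo = np.concatenate([tone, np.zeros(8000), tone])
segments = segment_speech(demo)
print(segments)
```

The gaps between the returned intervals are the silent pauses, so their count and total duration follow directly from this output.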
Conclusion • Key Acoustic Features • Phonation Time Ratio most positively correlated with Speech Flow • Frequency of silent pauses most negatively correlated with Speech Flow • Summary: Quantifiers obtained by low-level acoustic measurements useful for non-rated assessment
Looking Ahead • Measurements on more rated speech • Detection of filled pauses • Look for additional quantifiers (poor pronunciation, vocabulary richness) • Automatic classification of fluency