Speech Recognition Technology Applications

Denise Bilyeu, M.S., CCC-SLP • Scottish Rite Computer Supported Literacy Program • Munroe-Meyer Institute • Omaha, NE

Speech Recognition • Utilizes hardware and software to transcribe spoken words into orthographic text • Allows users hands-free operation of computer systems

Applications for Persons with Disabilities • Academic opportunities • Vocational opportunities • Access to WWW

Implementation Issues • System Training Requirements • Dictation in Written Form • Absence of Graphical Representation • Functional Grade Level • Dictation Environment • Higher Order Organizational Skills/Strategies

System Training Requirements • Samples of training protocol text (500 words) were taken from each of the following programs • Dragon Naturally Speaking Standard* • Dragon Naturally Speaking Teen* • IBM Via Voice Gold • L & H Voice XPress *two samples were analyzed and averaged

Samples were analyzed using Readability Stack (Tice, B. 1990) • Flesch Index • Dale Index • Dale-Chall Formula • Fry Readability Graph

Flesch Index • RE = 206.835 - (1.015 x words/sentence) - (84.6 x syllables/word) • Rates text on a 100-point scale • High scores indicate easier reading levels • Reading Ease is based upon • Mean Sentence Length • Syllables per 100 words
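The Reading Ease formula on this slide can be sketched in a few lines of Python. This is only an illustration, not the Readability Stack software used in the study: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic (an assumption) that will miscount some English words.

```python
import re

def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """RE = 206.835 - (1.015 x words/sentence) - (84.6 x syllables/word)."""
    # Sentences are approximated by splitting on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Short sentences built from one-syllable words score highest, which matches the slide's point that high scores indicate easier reading levels.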

Dale Index • DI = 11.534 - (0.053 x RE) • Based on the Flesch Index Reading Ease Score
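As a small worked example, the conversion on this slide is a straight-line rescaling of the Reading Ease score. The function name below is hypothetical; the constants are the ones given on the slide.

```python
def dale_index(reading_ease):
    """DI = 11.534 - (0.053 x RE), where RE is the Flesch Reading Ease score."""
    return 11.534 - 0.053 * reading_ease

# A higher Reading Ease score (easier text) yields a lower Dale Index.
easy_text_index = dale_index(100.0)
hard_text_index = dale_index(0.0)
```

Because the coefficient on RE is negative, the index falls as text gets easier, so it moves in the same direction as a grade level.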

Dale-Chall Formula • Reading Grade Score (RGS) = 0.1579 x DS (Dale Score) + 0.0496 x SL (Sentence Length) + 3.6365 • Dale Score = % of words not on Dale list of 3000 • Sentence Length = average # of words per sentence
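The Dale-Chall computation above might be sketched as follows. The tiny word set is a hypothetical stand-in for the Dale list of 3000 familiar words, which is not reproduced here, and the function name is an assumption for illustration.

```python
import re

# Hypothetical stand-in for the Dale list of 3000 familiar words.
FAMILIAR_WORDS = {"the", "cat", "sat", "on", "mat", "a", "dog", "ran"}

def dale_chall_grade(text, familiar_words=FAMILIAR_WORDS):
    """RGS = 0.1579 x DS + 0.0496 x SL + 3.6365, as given on the slide."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # Dale Score: percentage of words NOT on the familiar-word list.
    dale_score = 100.0 * sum(w not in familiar_words for w in words) / len(words)
    # Sentence Length: average number of words per sentence.
    sentence_length = len(words) / len(sentences)
    return 0.1579 * dale_score + 0.0496 * sentence_length + 3.6365
```

With a real familiar-word list passed as `familiar_words`, a single unfamiliar word in a six-word sentence raises the score by more than two grade levels, which shows how heavily the formula weights vocabulary.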

Fry Readability Graph • Yields Readability Grade Score (RGS) based upon: • Syllables per 100 words • Sentences per 100 words • Average the RGS for 3+ random passages for reliable score
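The Fry graph itself is a printed chart, so the grade lookup is not reproduced here. The sketch below only covers the arithmetic around it: scaling raw counts to the two per-100-word axes, and averaging grade scores that have already been read off the graph. Both function names are hypothetical.

```python
from statistics import mean

def fry_coordinates(syllables, sentences, words=100):
    """Scale raw counts to (syllables per 100 words, sentences per 100 words)."""
    if words <= 0:
        raise ValueError("words must be positive")
    scale = 100.0 / words
    return syllables * scale, sentences * scale

def average_grade_score(passage_grades):
    """Average the RGS values read from the graph; the slide calls for 3+ passages."""
    if len(passage_grades) < 3:
        raise ValueError("use at least three passages for a reliable score")
    return mean(passage_grades)
```

For example, `fry_coordinates(300, 10, words=200)` gives the same graph position as a 100-word passage with 150 syllables and 5 sentences.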

Results charts (figures not reproduced) • Dale-Chall Analysis • Fry Readability • Flesch Index • Dale Index

Conclusions • 4th grade minimum literacy level required to train voice recognition programs (most programs need 6th to 8th grade reading levels) • Respiratory support sufficient to produce sentences of M = 10.44 words • No statistically significant differences in training protocols

Dictation in Written Form • Dictation vs. conversational speech • Children produce 86% more words in slow dictation than in writing and 163% more words in normal dictation than in writing (Bereiter & Scardamalia) • Process is vastly different • Dictation skills must be taught

Absence of Graphical Representation • Difficulty with dictation is often attributed to absence of graphical representation; may cause problems in text development and revision (Wetzel) • Speech Recognition has graphical representation, but often with a delay that interrupts the dictation process

Functional Grade Level • Classroom placement and curriculum demands contribute to written text needs • Written text requirements may not be extensive enough to warrant a Speech Recognition system • Consider cognitive and/or language skills

Dictation Environment • Voice recognition requires an environment relatively free of auditory stimuli • Ambient noise will affect the system’s ability to function well • Dictating may be disruptive to others • Removal from the environment may solve dictation problems, but may result in educational or vocational disruptions

Higher Order Organizational Skills / Strategies • Persons must have cognitive abilities to dictate and often need strategies to help with the process • Pre-Writing Strategies • Writing instruction • Planning • Outlining/Mapping • Inspiration

Evaluation • Intelligibility • Sentence Intelligibility Test (Yorkston, Beukelman & Tice, 1991) • Utilizes ten unrelated sentences • Transcribed by unfamiliar listeners • Variables elicited • Intelligibility (% of intelligible speech to unfamiliar listener without context) • Rate of speech • Grade/Literacy Level • Fluency of Dictation

Attention to task • Writing/Dictating environment
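The two speech variables listed under Evaluation, intelligibility and rate of speech, reduce to simple ratios once a listener's transcription has been scored. The sketch below assumes scoring by words correctly transcribed and timing in seconds; it is an illustration, not the published Sentence Intelligibility Test software, and the function names are hypothetical.

```python
def intelligibility_percent(words_correct, words_total):
    """Percent of spoken words an unfamiliar listener transcribed correctly."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_correct / words_total

def words_per_minute(words_total, duration_seconds):
    """Speaking rate over the timed sample."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return 60.0 * words_total / duration_seconds
```

For instance, 99 of 110 words transcribed correctly from a 60-second sample gives 90% intelligibility at 110 words per minute.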

Trial with voice recognition system • Set up microphone/sound system to see if voice is perceived • Run system training session if user is capable • Dictate known passages that require little cognitive demand (e.g., the Pledge of Allegiance) • Dictate text that requires cognitive demand (e.g., a short expository passage)

Alternate means for training systems • Utilize another person with similar voice characteristics • Transcribe training protocols and allow user to learn and practice dictating • Transcribe training protocols and dictate to tape for user to listen to while dictating

Case Studies

Janae • 9 years old • Athetoid Cerebral Palsy • Sentence Intelligibility Test Score - 10% • Current System • Discover Board, Mouse Key • Reason for Referral • Mousing slow and fatiguing

Evaluation Tool • Dragon Dictate v. 3.0 • Evaluation Results • With no training, could utilize Mouse Grid with 80% accuracy • After a one-hour session, could utilize Mouse Grid with 95% accuracy • With extensive training, could dictate small amounts of text

Voice Recognition Status • Utilizing Dragon Dictate Mouse Grid on trial basis • Training on selected, commonly used words in progress to determine efficiency and fatigue effect of dictating text

James • 14 years old • Learning Disabled, reading and writing skills 4 years below grade level • Sentence Intelligibility Test score = 100% • Reason for Referral • Slow input method • Input impeded cognitive writing process • Inability to monitor written work

Evaluation Tool • Dragon Naturally Speaking Standard • Dragon Naturally Speaking Teen • Evaluation Results • Training materials printed and practiced before actual program training • Training required 2 weeks, 3 sessions/week • Needed alternate text program to review text • Worked on phrasing, assisted punctuation

Voice Recognition Status • Uses voice recognition at home for homework and correspondence • Does not use voice recognition at school

Brett • 18 years old • Quadriplegia, ventilator dependent • Sentence Intelligibility Test score = 100% • Current system • EZKeys for Windows with Morse Code input via pneumatic switch • Reason for referral • slow input method

Evaluation Tool • Kurzweil • Evaluation Results • Ventilator had to be physically blocked at Brett’s neck and in the back of the wheelchair • Training on segmentation of words and phrases was necessary • Training required one month

Voice Recognition Status • Able to use voice recognition at home for homework, correspondence and the internet • Unable to use voice recognition at school because of ambient noise and disruption that dictation causes

Katie • 16 years old • Traumatic brain injury • Sentence Intelligibility Test score = 89% • Current system • regular keyboard with track ball • Reason for referral • slow input method • fine motor movement fatiguing

Evaluation Tool • Dragon Naturally Speaking Standard • Evaluation Results • Unable to train system during evaluation, likely because of nasal emission on specific sounds affecting the intelligibility of surrounding sounds • Palatal lift was fitted subsequent to evaluation, but further voice recognition evaluation was not done

Voice Recognition Status • Unable to utilize voice recognition at time of evaluation • Further evaluation was not done as fine motor abilities were improving and alternate strategies (word prediction, abbreviation-expansion) were effective

John • 53 years old • Friedreich’s Ataxia • Sentence Intelligibility Test score = 53% • Current system • EZKeys for Windows scanning via pneumatic switch • Reason for referral • slow input method • alternate access for versatility and fatigue

Evaluation Tool • Dragon Naturally Speaking Standard • IBM Via Voice Gold • Evaluation Results • Unable to train system after extensive trial period (4 weeks, daily) • System would not “perceive” John’s voice

Voice Recognition Status • Unable to utilize voice recognition • Trial with Dragon Dictate scheduled

Clinical Implications • Decreased intelligibility results in decreased success with voice recognition • Intelligibility may NOT predict success with voice recognition • Rate of speech may affect success with voice recognition

Future Directions • New voice recognition programs require minimal training • New programs that do not learn as they are used are in development • New programs that utilize a standard set of distinct “sounds” are in development
