
Speech Recognition Technology Applications

Speech Recognition Technology Applications. Denise Bilyeu, M.S. CCC-SLP, Scottish Rite Computer Supported Literacy Program, Munroe-Meyer Institute, Omaha, NE. Speech recognition utilizes hardware and software to transcribe spoken words into orthographic text.


Presentation Transcript


  1. Speech Recognition Technology Applications Denise Bilyeu, M.S. CCC-SLP Scottish Rite Computer Supported Literacy Program Munroe-Meyer Institute Omaha, NE

  2. Speech Recognition • Utilizes hardware and software to transcribe spoken words into orthographic text • Allows users hands-free operation of computer systems

  3. Applications for Persons with Disabilities • Academic opportunities • Vocational opportunities • Access to WWW

  4. Implementation Issues • System Training Requirements • Dictation in Written Form • Absence of Graphical Representation • Functional Grade Level • Dictation Environment • Higher Order Organizational Skills/Strategies

  5. System Training Requirements • Samples of training protocol text (500 words) were taken from each of the following programs • Dragon Naturally Speaking Standard* • Dragon Naturally Speaking Teen* • IBM Via Voice Gold • L & H Voice XPress *two samples were analyzed and averaged

  6. Samples were analyzed using Readability Stack (Tice, B. 1990) • Flesch Index • Dale Index • Dale-Chall Formula • Fry Readability Graph

  7. Flesch Index • RE = 206.835 - (1.015 x words/sentence) - (84.6 x syllables/word) • Rates text on a 100-point scale • High scores indicate easier reading levels • Reading Ease based upon • Mean Sentence Length • Syllables per 100 words
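The Reading Ease formula on this slide can be sketched in Python. This is a minimal illustration, not the Readability Stack implementation: the function name and the sample counts are hypothetical, and syllable counting is assumed to have been done already.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch Reading Ease (RE) from raw counts, using the slide's formula."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical 500-word training passage with 50 sentences and 700 syllables.
score = flesch_reading_ease(500, 50, 700)
print(round(score, 1))
```

Longer sentences and longer words both lower the score, which matches the slide's note that high scores indicate easier text.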

  8. Dale Index • DI = 11.534 - (.053 x RE) • Based on the Flesch Index Reading Ease Score
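As a small sketch of the conversion on this slide, the Dale Index can be computed directly from a Reading Ease score. The function name and the sample RE value are hypothetical; only the two constants come from the slide.

```python
def dale_index(reading_ease: float) -> float:
    """Dale Index (DI) from a Flesch Reading Ease score, per the slide's formula."""
    return 11.534 - 0.053 * reading_ease


# Hypothetical Reading Ease score of 70.
print(round(dale_index(70.0), 3))
```

Because the coefficient on RE is negative, easier text (higher RE) yields a lower index.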

  9. Dale-Chall Formula • Reading Grade Score (RGS) = .1579 x DS (Dale Score) + .0496 x SL (Sentence Length) + 3.6365 • Dale Score = % of words not on Dale list of 3000 • Sentence Length = average # of words per sentence
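The Dale-Chall calculation can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the words are assumed to be tokenized already, and the tiny `familiar` set is only a stand-in for the real Dale list of 3000 words, which is not reproduced here. The published procedure also has word-form rules that this sketch ignores.

```python
def dale_chall_rgs(words, sentence_count, familiar_words):
    """Reading Grade Score from the slide's formula.

    words: list of tokens from the passage (punctuation already removed)
    sentence_count: number of sentences in the passage
    familiar_words: set of lowercase words treated as "on the Dale list"
    """
    if not words or sentence_count <= 0:
        raise ValueError("need at least one word and one sentence")
    unfamiliar = sum(1 for w in words if w.lower() not in familiar_words)
    dale_score = 100 * unfamiliar / len(words)      # % of words not on the list
    sentence_length = len(words) / sentence_count   # average words per sentence
    return 0.1579 * dale_score + 0.0496 * sentence_length + 3.6365


# Hypothetical stand-in list and passage: 10 words, 2 sentences, 1 unfamiliar word.
familiar = {"the", "dog", "ran", "to", "house", "ate"}
passage = ["The", "dog", "ran", "to", "the", "house", "The", "dog", "ate", "kibble"]
print(round(dale_chall_rgs(passage, 2, familiar), 4))
```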

  10. Fry Readability Graph • Yields Readability Grade Score (RGS) based upon: • Syllables per 100 words • Sentences per 100 words • Average the RGS for 3+ random passages for reliable score
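The two inputs to the Fry graph, and the averaging step across passages, can be sketched as below. The grade lookup itself requires the printed graph and is not reproduced; the function names and sample numbers are hypothetical.

```python
from statistics import mean


def fry_coordinates(word_count, syllable_count, sentence_count):
    """Return (syllables per 100 words, sentences per 100 words) for one passage."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    scale = 100 / word_count
    return syllable_count * scale, sentence_count * scale


def average_grade(grade_scores):
    """Average grade scores read off the graph; the slide calls for 3+ passages."""
    if len(grade_scores) < 3:
        raise ValueError("use at least three passages for a reliable score")
    return mean(grade_scores)


# Hypothetical 200-word passage with 280 syllables and 16 sentences.
print(fry_coordinates(200, 280, 16))
# Hypothetical grade scores read from the graph for three passages.
print(average_grade([5, 6, 7]))
```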

  11. Dale-Chall Analysis

  12. Fry Readability

  13. Flesch Index

  14. Dale Index

  15. Conclusions • 4th grade minimum literacy level required to train voice recognition programs (most programs need 6th to 8th grade reading levels) • Respiratory support sufficient to produce sentences of M = 10.44 words • No statistically significant differences in training protocols

  16. Dictation in Written Form • Dictation vs. Conversational speech • Children produce 86% more words in slow dictation than in writing and 163% more words in normal dictation than in writing (Bereiter & Scardamalia) • Process is vastly different • Dictation skills must be taught

  17. Absence of Graphical Representation • Difficulty with dictation is often attributed to absence of graphical representation; may cause problems in text development and revision (Wetzel) • Speech Recognition has graphical representation, but often with a delay that interrupts the dictation process

  18. Functional Grade Level • Classroom placement and curriculum demands contribute to written text needs • Written text requirements may not be extensive enough to warrant a Speech Recognition system • Consider cognitive and/or language skills

  19. Dictating Environment • Voice recognition requires an environment relatively free of auditory stimuli • Ambient noise will affect the system’s ability to function well • Dictating may be disruptive to others • Removal from the environment may solve dictation problems, but result in educational or vocational disruptions

  20. Higher Order Organizational Skills / Strategies • Persons must have cognitive abilities to dictate and often need strategies to help with the process • Pre-Writing Strategies • Writing instruction • Planning • Outlining/Mapping • Inspiration

  21. Evaluation • Intelligibility • Sentence Intelligibility Test (Yorkston, Beukelman & Tice, 1991) • Utilizes ten unrelated sentences • Transcribed by unfamiliar listeners • Variables elicited • Intelligibility (% of intelligible speech to unfamiliar listener without context) • Rate of speech • Grade/Literacy Level • Fluency of Dictation
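The two speech measures named on this slide reduce to simple ratios. The sketch below is a simplified illustration, not the published scoring procedure of the Sentence Intelligibility Test; the function names and sample counts are hypothetical.

```python
def intelligibility_percent(words_understood, words_spoken):
    """Percent of spoken words correctly transcribed by an unfamiliar listener."""
    if words_spoken <= 0:
        raise ValueError("words_spoken must be positive")
    return 100 * words_understood / words_spoken


def words_per_minute(words_spoken, duration_seconds):
    """Rate of speech in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return 60 * words_spoken / duration_seconds


# Hypothetical sample: 110 words spoken in 60 seconds, 99 transcribed correctly.
print(intelligibility_percent(99, 110))
print(words_per_minute(110, 60))
```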

  22. Attention to task • Writing/Dictating environment

  23. Trial with voice recognition system • set up microphone/sound system to see if voice is perceived • run system training session if user is capable • dictate known passages that require little cognitive demand (e.g., the Pledge of Allegiance) • dictate text that requires cognitive demand (e.g., a short expository passage)

  24. Alternate means for training systems • Utilize another person with similar voice characteristics • Transcribe training protocols and allow user to learn and practice dictating • Transcribe training protocols and dictate to tape for user to listen to while dictating

  25. Case Studies

  26. Janae • 9 years old • Athetoid Cerebral Palsy • Sentence Intelligibility Test Score - 10% • Current System • Discover Board, Mouse Key • Reason for Referral • Mousing slow and fatiguing

  27. Evaluation Tool • Dragon Dictate v. 3.0 • Evaluation Results • With no training, could utilize Mouse Grid with 80% accuracy; after a one-hour session, could utilize Mouse Grid with 95% accuracy • With extensive training could dictate small amounts of text

  28. Voice Recognition Status • Utilizing Dragon Dictate Mouse Grid on trial basis • Training on selected, commonly used words in progress to determine efficiency and fatigue effect of dictating text

  29. James • 14 years old • Learning Disabled, reading and writing skills 4 years below grade level • Sentence Intelligibility Test score = 100% • Reason for Referral • Slow input method • Input impeded cognitive writing process • Inability to monitor written work

  30. Evaluation Tool • Dragon Naturally Speaking Standard • Dragon Naturally Speaking Teen • Evaluation Results • Training materials printed and practiced before actual program training • Training required 2 weeks, 3 sessions/week • Needed alternate text program to review text • Worked on phrasing, assisted punctuation

  31. Voice Recognition Status • Uses voice recognition at home for homework and correspondence • Does not use voice recognition at school

  32. Brett • 18 years old • Quadriplegia, ventilator dependent • Sentence Intelligibility Test score = 100% • Current system • EZKeys for Windows with Morse Code input via pneumatic switch • Reason for referral • slow input method

  33. Evaluation Tool • Kurzweil • Evaluation Results • Ventilator had to be physically blocked at Brett’s neck and in the back of the wheelchair • Training on segmentation of words and phrases was necessary • Training required one month

  34. Voice Recognition Status • Able to use voice recognition at home for homework, correspondence and the internet • Unable to use voice recognition at school because of ambient noise and disruption that dictation causes

  35. Katie • 16 years old • Traumatic brain injury • Sentence Intelligibility Test score = 89% • Current system • regular keyboard with track ball • Reason for referral • slow input method • fine motor movement fatiguing

  36. Evaluation tool • Dragon Naturally Speaking Standard • Evaluation results • Unable to train system during evaluation, likely because of nasal emission on specific sounds, affecting the intelligibility of surrounding sounds • Palatal lift was fitted subsequent to evaluation, but further voice recognition evaluation was not done

  37. Voice Recognition Status • Unable to utilize voice recognition at time of evaluation • Further evaluation was not done as fine motor abilities were improving and alternate strategies (word prediction, abbreviation-expansion) were effective

  38. John • 53 years old • Friedreich’s Ataxia • Sentence Intelligibility Test score = 53% • Current system • EZKeys for Windows scanning via pneumatic switch • Reason for referral • slow input method • alternate access for versatility and fatigue

  39. Evaluation Tool • Dragon Naturally Speaking Standard • IBM Via Voice Gold • Evaluation Results • Unable to train system after extensive trial period (4 weeks, daily) • System would not “perceive” John’s voice

  40. Voice Recognition Status • Unable to utilize voice recognition • Trial with Dragon Dictate scheduled

  41. Clinical Implications • Decreased intelligibility results in decreased success with voice recognition • Intelligibility may NOT predict success with voice recognition • Rate of speech may affect success with voice recognition

  42. Future Directions • New voice recognition programs require minimal training • New programs that do not learn as they are used are in development • New programs that utilize a standard set of distinct “sounds” are in development
