
Facilitating Use of Speech Recognition Software

Facilitating Use of Speech Recognition Software. Summarized By: Vivianne Cardenas, EME 2040, Fall 2008. Abstract: This study examined three interventions (physiological, behavioural, and pragmatic) designed to facilitate use of speech recognition software.


Presentation Transcript


  1. Facilitating Use of Speech Recognition Software Summarized By: Vivianne Cardenas EME 2040-Fall 2008

  2. Abstract • This study examined three interventions: 1. physiological, 2. behavioural, 3. pragmatic. • These were designed to facilitate use of speech recognition software. • There were 15 adult participants with dysarthria associated with a variety of aetiological conditions. • The conditions included cerebral palsy, Parkinson’s disease, and motor neuron disease. • Participants demonstrated systematic improvement in their dictation rates (treatment order did not matter).

  3. Introduction • Voice or speech recognition software provides access for people with physical disabilities. • 13 out of 20 people with dysarthria associated with cerebral palsy achieved accuracy rates of 80% to 100% during the training tutorial of a speech recognition system. • This took no more than four 1-hour sessions. • Factors were identified that distinguished successful from unsuccessful users. • These parameters reflected the ability to co-ordinate respiratory, phonatory, and articulatory activity.

  4. Introduction Continuation • Measures of vocabulary size, nonverbal problem-solving skills, and reading competency did not predict success in using the software. • Tasks emphasizing laryngeal and respiratory co-ordination were more difficult for people whose dysarthria fit within the mixed categories of the Frenchay Dysarthria Assessment. • Overall, this subgroup was unsuccessful with the software. • Additional practice improved participants' dictation rates. • For dysarthric speakers, voice recognition success was characterized by steep increases in correct recognition.

  5. Introduction Continuation • More gradual increases were observed over subsequent sessions. • Improvement after the second session was attributed to changes in the speaker's performance rather than to a software training effect. • Non-disabled participants showed no improvement in voice recognition success after the second dictation. • Most participants in the study showed improvement, but their dictation rate stayed slow compared to normal users. • Initial training time with speech recognition software was greater for speakers with cerebral palsy (3.5 hours of training). • Normal speakers require 1 hour of training.

  6. Introduction Continuation • Research suggests that a reasonable proportion of speakers with dysarthria can use the software successfully. • Efficiency of use is limited. • Ongoing improvement is associated with performance changes within individual users. • Treatment can effect optimal use. • Behavioural Treatment: focuses on modifying a behaviour that an assessment shows to be inadequate. • Physiological Treatment: focuses on improving the range, strength, and speed of musculature that has been identified as impaired during assessment.

  7. Introduction Continuation • Physiological Treatment (continued): the targeted musculature is critical for specific movements in speech production. • Pragmatic Treatment: involves manipulating factors that may influence speech intelligibility but do not directly control it. • Aims of the Project: 2 main aims. • First: to test the proposition that people with upper motor neuron type dysarthria are less likely to be successful. • Second: to determine what type of speech practice constitutes the most efficient method of becoming skilled at using the software.

  8. Method • Participants • 16 Australians (12 men and 4 women) participated. • All but one spoke an Australian dialect of English; the exception had a Sri Lankan background. • English was the main and preferred language of all participants. • Age range: 18 to 81 years, with an average age of 53 years. • One individual withdrew because of progression of his medical condition. • 5 had completed tertiary education, 6 secondary, and 1 primary; 4 did not disclose their education level.

  9. Method Continuation • Table I. Participant characteristics.

      ID    Sex  Age  MD   HL      PPVT  Ravens  WI   FDA-TS  FDA-I  DP    Train
      1     M    29   CP   normal  98    104     98   238     8.0    UMN   2
      2     M    48   CP   mild    106   73      113  173     5.7          3
      3     M    77   P    severe  92    70      95   198     6.7          3
      4     M    64   P    mod     117   74      129  243     9.0    LMN   2
      5     M    59   P    severe  123   87      117  228     8.7          2
      6     M    26   CP   normal  —     68      —    195     8.7    UMN   4
      7     M    66   P    mild    99    68      123  110     2.3    EP    4

  10. Method Continuation • Table I (continued).

      ID    Sex  Age  MD   HL      PPVT  Ravens  WI   FDA-TS  FDA-I  DP    Train
      8     M    81   P    mod     102   71      128  197     9.0    LMN   3
      9     M    46   MND  mod     93    104     122  140     1.3    ULMN  4
      10    M    68   P    severe  99    70      103  218     8.7          2
      11    M    18   CP   normal  109   92      89   211     8.0          3
      12    F    70   P    mod     113   85      114  231     8.7    EP    2
      13    F    71   MND  mod     102   83      123  154     3.3    ULMN  3
      14    F    53   F    mod     —     68      —    185     6.3          3
      15    F    31   CP   mod     61    67      75   216     8.7    UMN   4
      Mean       53                101   80      109  196     6.8          2.9
      SD         20                14    13      16   38      2.5          0.8

  11. Method Continuation • MD = medical diagnosis (CP = cerebral palsy, P = Parkinson’s disease, MND = motor neuron disease, F = Fybromyosis). HL = hearing loss. WI = Word Identification subtest from the Woodcock. FDA-TS = Frenchay Dysarthria Assessment total score. FDA-I = mean of Frenchay Dysarthria Assessment intelligibility ratings. PPVT, Ravens and WI test scores are standard scores (M = 100, SD = 15). DP = dysarthria profile based on the FDA (UMN = upper motor neuron lesion, LMN = lower motor neuron lesion, ULMN = mixed upper and lower motor neuron lesion, EP = extrapyramidal lesion; where the column is blank the profile was unclear). Train = number of 1-hour sessions required to reach success in training PowerSecretary speech recognition software.
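As a quick cross-check of Table I, the sketch below tallies the dysarthria profile (DP) column and averages the Train column. It is only an illustration, not part of the study: the variable names are made up for this example, the values are transcribed by hand from the table, and an empty string stands for a blank (unclear) profile.

```python
from collections import Counter
from statistics import mean

# (DP, Train) for participants 1..15, transcribed from Table I.
# "" marks a blank DP cell, i.e. an unclear profile (assumed encoding).
participants = [
    ("UMN", 2), ("", 3), ("", 3), ("LMN", 2), ("", 2),
    ("UMN", 4), ("EP", 4), ("LMN", 3), ("ULMN", 4), ("", 2),
    ("", 3), ("EP", 2), ("ULMN", 3), ("", 3), ("UMN", 4),
]

# Count how many participants fall into each profile.
profile_counts = Counter(dp or "unclear" for dp, _ in participants)

# Mean number of 1-hour training sessions, rounded to one decimal place.
mean_train = round(mean(train for _, train in participants), 1)

print(dict(profile_counts))
print(mean_train)
```

The tallies can be compared against the profile counts listed in the Results slides and against the Mean row for Train in the table.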

  12. Method Continuation • Instrumentation • Speech recognition software used: PowerSecretary Power Edition, installed on a Power Macintosh 7500/100 computer. • It uses discrete (word-by-word) input and has a large vocabulary. • Behavioural Sessions • An Apple condenser microphone was connected to a Power Macintosh 7500/100 computer. • The computer was used to record the speech samples and to provide visual feedback during behavioural treatment sessions.

  13. Method Continuation • Physiological Sessions • PowerLab hardware and software, installed on a Macintosh G3 computer, were used for visual biofeedback during the treatment. • General Procedure • There were 4 main stages to the project; to progress through the stages, participants attended two 1-hour sessions each week. • Sessions were carried out by 4 final-year speech pathology students under the supervision of an experienced speech pathologist.

  14. Method Continuation • Initial Screening and Clinical Assessment • First stage: involved screening, during which standardized tests were given and a voice and speech assessment was completed. • All speech samples were recorded using a Sony ECM-44B Electret Condenser lapel microphone attached to a cassette recorder. • The microphone was clipped to each participant's shirt 15 cm below the mouth. • Assessments were carried out in a sound-attenuated audiology booth to reduce the background noise level.

  15. Method Continuation • PowerSecretary Software Training • Second stage: involved initial PowerSecretary software training. • Participants pronounced a large set of key training words and phrases for speaker adaptation purposes. • Therapy/dictation sessions • Third stage: involved combined speech therapy and dictation sessions for the successful participants. • During each session participants received 30 minutes of treatment followed by 30 minutes of dictation.

  16. Method Continuation • 15 sessions overall (divided into 3 blocks of 5 sessions). • The dictation session was 8 minutes for the following tasks: letter application, letter to editor, veterinarian memo, and doctor memo. • Post-therapy clinical assessment • Final stage: after all sessions were completed, participants underwent a clinical assessment of speech and voice using the same tasks as stage 1. • Treatment Descriptions • Behavioural Treatment: aimed to facilitate production of utterances with pauses between single words.

  17. Method Continuation • Behavioural treatment started with the repetition of one- to four-syllable words, followed by compound word and phrase repetition tasks. • Physiological Treatment: techniques were designed to encourage the participant to be relaxed and maintain good posture when sitting at a computer. • They also aimed to increase breath control and volume of inspiration. • PowerLab was used to display recordings of pulse rate, chest expansion, and vowel prolongation. • Pragmatic Treatment

  18. Results • Initial Training • All 15 participants completed the initial training phase of the project. • Participants achieved 100% in the training vocabulary. • 3 participants showed signs of having an upper motor neuron lesion. • 2 were consistent with a lower motor neuron lesion. • 2 showed signs of a mixed upper and lower motor neuron lesion. • 2 were consistent with an extrapyramidal lesion.

  19. Results Continuation • It was difficult to match the remaining 6 participants to a specific diagnostic profile. • Effect of therapy on dictation skills • Refer to Tables I and II. • Post-treatment measures • A comparison between the initial and final assessments of speech production and voice was made to see whether participants had modified their speech and voicing abilities. • Results: significant increase in vowel prolongation following the treatment programmes.

  20. Results Continuation • Significant increase in the percentage of voicing measure.

  21. Discussion • All participants were successful to some degree in using the PowerSecretary software. • A relationship was observed between severity of dysarthria and efficiency in training the software. • Physiological Treatment: focused on improving breath support and elongating phonation using objective feedback from the PowerLab. • Pragmatic Treatment: yielded a better dictation response compared with the behavioural programme. • Behavioural Treatment: involved spectrographic feedback of speech productions.

  22. Citation • Kathryn Hird and Neville W. Hennessey. 2007. Facilitating use of speech recognition software for people with disabilities: A comparison of three treatments. 211-226.

  23. What Questions Do You Have?
