
The Malevolent Hal: An Exploration of Speech Recognition and Language Processing

Join Dr. Paul De Palma as he delves into the complexities of speech recognition, language processing, and the challenges faced in creating an accurate and efficient system. Discover the limitations of current models and the potential for future advancements. This enlightening presentation will leave you questioning the true nature of language and the possibilities that lie ahead.


Presentation Transcript


  1. Open The Pod Bay Doors, Hal (it’s gonna be a long ride) Paul De Palma, Ph.D., Department of Computer Science, School of Engineering and Applied Science, Gonzaga University

  2. The Malevolent Hal

  3. We’re Not Quite There Yet (and Lucky for Us) But what is an error?

  4. The Model Do You Believe This?

  5. Why ASR is Hard

  6. A Tale of Aspiration • [t] Tunafish • Word initial • Vocal cords don’t vibrate. Produces a puff of air • [t] Starfish • [t] preceded by [s] • Vocal cords vibrate. No air puff • [k]: vocal cords don’t vibrate • [g]: Vocal cords don’t vibrate when preceded by an [s] • Leads to the mishearing of the Jimi Hendrix song: • ‘Scuse me, while I kiss the sky • ‘Scuse me, while I kiss this guy

  7. What to DO? • Language is not a system of rules • [t] makes a certain sound • “to whom” is correct. “to who” is incorrect • Language is a collection of probabilities • Then speech recognition is a conditional probability Read: “The hypothesized word sequence is that sequence in the target language with the greatest probability given a sequence of acoustic observations.” On the Right Track But Still Too Hard
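The equation that the "Read:" sentence paraphrases did not survive in this transcript. A reconstruction from that sentence alone, assuming the symbols $W$ for a candidate word sequence, $\mathcal{L}$ for the target language, and $O$ for the sequence of acoustic observations (the slide's own notation may differ):

```latex
\hat{W} = \operatorname*{arg\,max}_{W \in \mathcal{L}} P(W \mid O)
```

Here $\hat{W}$ is the hypothesized word sequence. The difficulty the slide points to is that $P(W \mid O)$ cannot be estimated directly for every possible utterance.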

  8. Parson Bayes to the Rescue Author of: Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures (1731)

  9. Bayes’ Rule Lets us transform: To: In Fact:
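The three formulas on this slide are missing from the transcript. The standard Bayes' rule step it appears to describe can be sketched as follows, assuming $W$ is a candidate word sequence and $O$ the acoustic observations (symbol names are assumptions, not taken from the slide):

```latex
\hat{W} = \operatorname*{arg\,max}_{W} P(W \mid O)
\quad\longrightarrow\quad
\hat{W} = \operatorname*{arg\,max}_{W} \frac{P(O \mid W)\,P(W)}{P(O)}
        = \operatorname*{arg\,max}_{W} P(O \mid W)\,P(W)
```

The last equality holds because $P(O)$ is the same for every candidate $W$, so it cannot change which candidate wins. What remains is an acoustic term $P(O \mid W)$ and a language term $P(W)$, each of which can be estimated separately.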

  10. LVCSR
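The diagram for this slide is not in the transcript. As a rough illustration of how a recognizer can combine the two terms from Bayes' rule, here is a minimal sketch that scores two candidate transcriptions of the Hendrix lyric. All probabilities are invented for the example, and `acoustic`, `language`, and `decode` are hypothetical names, not part of any real system.

```python
import math

# Invented log-probabilities, for illustration only.
# acoustic[w] stands in for log P(O | W): how well W matches the audio.
acoustic = {
    "kiss the sky": math.log(0.40),
    "kiss this guy": math.log(0.45),
}
# language[w] stands in for log P(W): how plausible W is before hearing anything.
language = {
    "kiss the sky": math.log(0.020),
    "kiss this guy": math.log(0.015),
}


def decode(candidates):
    """Return the candidate W that maximizes log P(O|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic[w] + language[w])


candidates = list(acoustic)
best = decode(candidates)
acoustic_only = max(candidates, key=lambda w: acoustic[w])
print(best)
print(acoustic_only)
```

With these made-up numbers the acoustic score alone slightly prefers the misheard lyric, and the language term tips the decision back. A real large-vocabulary system searches over vastly more candidates than two.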

  11. My Own Work • Is Speech Just Transcribed Writing? • What Does Speech Look Like?

  12. Here’s What A Real LVCSR Does “We have never seen anything like this in our history. Even the British colonial rule, they stopped chasing people around when they ran into a monastery.” (Spoken by a “33 year-old business woman” to a NY Times reporter in 2007) Do People Really Talk Like This?

  13. In Fact We don’t First segment from the Buckeye Corpus yes <VOCNOISE> i uh <SIL> um <SIL> uh <VOCNOISE> lordy <VOCNOISE> um <VOCNOISE> grew up on the westside i went to <EXCLUDE-name> my husband went to <EXCLUDE-name> um <SIL> proximity wise is probably within a mile of each other we were kind of high school sweethearts and <VOCNOISE> the whole bit <SIL> um <VOCNOISE> his dad still lives in grove city my mom lives still <SIL> at our old family house there on the westside <VOCNOISE> and we moved <SIL> um <SIL> also on the westside probably couple miles from my mom.
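To get a feel for how much of such a segment is not "words" in the written sense, one can count the annotation markers and fillers. This is a minimal sketch: the angle-bracket pattern is inferred from the excerpt above, and the filler set `{"uh", "um"}` and the function name `summarize` are assumptions made for illustration.

```python
import re
from collections import Counter

TAG = re.compile(r"<[^>]+>")   # annotation markers such as <SIL> or <VOCNOISE>
FILLERS = {"uh", "um"}         # assumed filler inventory, for illustration


def summarize(segment):
    """Return (marker counts, number of fillers, text with both removed)."""
    tags = Counter(TAG.findall(segment))
    words = TAG.sub(" ", segment).split()
    fillers = sum(1 for w in words if w in FILLERS)
    fluent = " ".join(w for w in words if w not in FILLERS)
    return tags, fillers, fluent


sample = "yes <VOCNOISE> i uh <SIL> um <SIL> uh <VOCNOISE> lordy"
tags, fillers, text = summarize(sample)
print(tags, fillers, text)
```

On the opening of the segment, four markers and three fillers surround only three ordinary words.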

  14. Why the Two Questions? Because of the conventional formulation of the ASR Problem: The speech recognition problem is the transformation of an acoustic signal to words Implicit bias toward writing

  15. What About the Only Real LVCSR? • Human performance does not include transcription • Neither sanitized like the NY Times • Nor raw like the Buckeye Corpus

  16. The Syllable-Concept Hypothesis We could get better results if we: • First, map an acoustic signal to a syllable string • Then, map the syllable string to a concept string Where • A syllable is just a principled division of a word • A concept is an equivalence class of words and phrases that seem to mean the same thing • GO : fly, flying, going to fly, flew, go to, travelling to, book a ticket to …
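The second mapping can be sketched as a lookup from phrases to concept labels. The GO entries below are the ones listed on the slide; everything else is an assumption: the longest-match strategy, the names `CONCEPTS` and `to_concepts`, and, for brevity, the use of word strings where the hypothesis calls for syllable strings.

```python
# Concept table: the GO members come from the slide; the structure is hypothetical.
CONCEPTS = {
    "GO": ["fly", "flying", "going to fly", "flew", "go to",
           "travelling to", "book a ticket to"],
}

# Flatten to (phrase tokens, concept) pairs, longest phrases first,
# so "going to fly" is tried before "fly".
PHRASES = sorted(
    ((phrase.split(), concept)
     for concept, phrases in CONCEPTS.items()
     for phrase in phrases),
    key=lambda pair: -len(pair[0]),
)


def to_concepts(utterance):
    """Replace every known phrase with its concept label; keep other words."""
    words = utterance.lower().split()
    out, i = [], 0
    while i < len(words):
        for phrase, concept in PHRASES:
            if words[i:i + len(phrase)] == phrase:
                out.append(concept)
                i += len(phrase)
                break
        else:  # no phrase starts at position i
            out.append(words[i])
            i += 1
    return out


print(to_concepts("i want to book a ticket to spokane"))
print(to_concepts("i am going to fly to denver"))
```

Different surface forms collapse to the same label, which is the sense in which a concept is an equivalence class.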

  17. What We’re Building

  18. The (almost) last word “But it must be recognized that the notion ‘probability of a sentence’ is an entirely useless one under any known interpretation of the term.”

  19. Still, the Greatest Irony • We use the 19th-century QWERTY keyboard to access a 21st-century device • We won’t get HAL anytime soon • We might not even want HAL But we can do better than a keyboard

  20. Thanks To • My Two Collaborators: • Charles Wooters, Ph.D. • Senior Researcher • International Computer Science Institute • Berkeley, CA • George Luger, Ph.D. • Chair, Department of Computer Science • University of New Mexico • Albuquerque, NM • My audience for putting up with my rambling
