
Advanced NLP: Speech Research and Technologies

Julia Hirschberg, CS 6998


Presentation Transcript


  1. Advanced NLP: Speech Research and Technologies Julia Hirschberg CS 6998

  2. Spoken Natural Language Processing
     • NLP/Computational Linguistics has historically been text-oriented
     • Speech research has been the domain of EE and Linguistics
     • 1980s: DARPA efforts to bring the two communities together
     • Today: applications motivate collaboration
     • Automatic Speech Recognition (ASR)
     • Text/Concept-to-Speech (TTS/CTS)
     • Spoken Dialogue Systems (SDS), Speech-to-Speech Translation, Speech Search/Data Mining

  3. Studying Speech is Different
     • Understanding input and generating output are more complicated
     • ASR errors and lack of formatting cues
     • TTS/CTS naturalness issues
     • But there is also more information to take advantage of
     • Pitch variation, loudness, rate, voice quality
     • Filled pauses, self-repairs

  4. Acoustic/Prosodic Cues Can Convey…
     • Topic Structure
     • Information Status: What's shared knowledge? What's important?
     • Speaker State/Emotion
     • Speech Acts
     • Syntactic Structure
     • Semantic Information

  5. Labeled Waveform and F0 Contour

  6. Current Approaches
     • Corpus-based studies
     • Hand-labeled/automatically-labeled data
     • Tools:
     • Analysis (pitch tracks, spectrograms, …)
     • ASR toolkits
     • TTS systems
     • Machine learning
     • Laboratory studies
     • Evaluation
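As an illustration of what an analysis tool does when it produces a pitch track, here is a minimal sketch of F0 estimation for a single frame using the autocorrelation method. It is not taken from the course materials or from any tool named in these slides; the function name `estimate_f0`, the 75-400 Hz search range, and the 0.3 voicing threshold are all assumptions chosen for the example. Real pitch trackers add windowing, candidate tracking across frames, and octave-error handling.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Estimate F0 in Hz for one frame by picking the autocorrelation peak.

    Returns None when the frame is silent or the peak is too weak to call
    the frame voiced. All default values are illustrative assumptions.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]           # remove DC offset
    energy = sum(s * s for s in x)          # autocorrelation at lag 0
    if energy == 0.0:
        return None

    # Only search lags that correspond to plausible pitch periods.
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_r = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Normalized autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r

    if best_lag is None or best_r < voicing_threshold:
        return None
    return sample_rate / best_lag

# Usage: a synthetic 200 Hz sine, 40 ms long, sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(320)]
f0 = estimate_f0(frame, sample_rate)
print(f0)
```

Running the estimator on successive overlapping frames of a recording and plotting the results over time gives a rough F0 contour of the kind shown on the previous slide.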

  7. CS 6998
     • Requirements:
     • Class Participation:
     • Questions for class discussion
     • Helping lead a class
     • Lab exercises
     • Project
     • Literature review
     • Data collection and/or analysis from a corpus

  8. Project options (continued)
     • Building a system or system component (e.g., a preprocessor to assign intonation in a generation system)
     • Conducting an experiment: perception or production
     • Examples:
     • How do people convey contrast?
     • Given/new information?
     • What tells people that they can ‘take the next turn’?

  9. Example questions (continued)
     • What is the relationship between syntactic structure and intonation?
     • How do people convey anger? Uncertainty? Other emotions?
     • How can you tell if people are deceiving you?
     • How might we recognize disfluencies?

  10. Next Week
     • Read Hirschberg 2003 and the ToBI conventions
     • Make sure you have access to the supplementary readings if you need them
     • Bring 3 discussion questions to class
     • Check your access on the CS servers to the corpora and /proj/nlp/tools/mathTools/
     • Xwaves (Solaris and Linux): esps531.sol, esps531.linux (also downloadable from KTH)
     • WaveSurfer (Windows, Linux, Mac): available from KTH

  11. Projects
     • Start thinking about what area you want to work in for your project and what type of project you’d like to do
