
Human Language Technology

Human Language Technology. Gary Geunbae Lee, Intelligent Software Lab, POSTECH. Contents: What is HLT? (definition, history, applications); HLT workshop case study (ACL/SIGIR/HLT conferences); towards technology synergy (21C Frontier project); goals of HLT.

Presentation Transcript


  1. Human Language Technology. Gary Geunbae Lee, Intelligent Software Lab, POSTECH

  2. Contents
     • What is HLT? Definition, history, applications
     • HLT workshop case study: ACL/SIGIR/HLT conferences
     • Towards technology synergy: 21C Frontier project
     Gary G. Lee, Postech

  3. Goals of HLT
     Computers would be a lot more useful if they could handle our email, do our library research, talk to us …
     But they are fazed by natural human language.
     How can we give computers the ability to handle human language? (Or help them learn it as kids do?)

  4. A few applications of HLT
     • Spelling correction, grammar checking …
     • Better search engines
     • Information extraction, gisting
     • Psychotherapy; Harlequin romances; etc.
     • New interfaces:
       - Speech recognition (and text-to-speech)
       - Dialogue systems (USS Enterprise onboard computer)
     • Machine translation; speech translation (the Babel tower??)
     • Trans-lingual summarization, detection, extraction …

  5. Levels of Language
     • Phonetics/phonology/morphology: What words (or subwords) are we dealing with?
     • Syntax: What phrases are we dealing with? Which words modify one another?
     • Semantics: What’s the literal meaning?
     • Pragmatics: What should you conclude from the fact that I said something? How should you react?

  6. What’s hard: ambiguities, ambiguities, ambiguities at every level
     “John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.” [from J. Eisner]
     • donut: To get a donut (doughnut; spare tire) for his car?
     • donut store: A store where donuts shop? Or one that is run by donuts? Or looks like a big donut? Or is made of donut?
     • from work: Well, actually, he stopped there from hunger and exhaustion, not just from work.
     • every few hours: That’s how often he thought it? Or that’s for coffee?
     • it: The particular coffee that was good every few hours? The donut store? The situation?
     • too expensive: Too expensive for what? What are we supposed to conclude about what John did?

  7. Ubiquitous computing
     • Ubiquitous computing
     • Pervasive computing
     • Third paradigm computing
     • Calm technology
     • Computing everywhere
     • Invisible computing
     • Irobot style interface: human language + hologram??

  8. Intelligent service robot [figure: remote speech input affected by reverberation, robot noise, and environmental noise]

  9. Telematics: eye busy and hand busy [figure labels: GPS; voice portal for email, VAD and Internet contents; telematics device; PDA; car controller; CDMA; information center]

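  10. Smart Home
      • Woman: What is that popular drama with Lee Young-ae these days?
      • DTV: It is Dae Jang Geum (대장금), currently airing on MBC.
      • Woman: Where is the rerun of Dae Jang Gun on? [the speaker says 대장군 instead of 대장금]
      • DTV: It is not on right now; it is scheduled to air on channel 36 at 2 p.m.
      • Woman: Then record it for me.
      • DTV: Yes, understood.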

  11. ASR community (speech/signal processing) [from McTear]
      • Research projects
        - Communicator, TRIPS, COMIC, DIPPER, TRINDI, GALAXY, SMARTKOM, Verbmobil, DUMAS, FASiL, EARS
      • Platforms for development and research
        - CSLU, JASPIS, TRINDIKIT, SUEDE, WITAS, SpeechBuilder, …
      • Conferences and workshops
        - ICSLP, Eurospeech, ICASSP, SigDial, …
      • Journals
        - Computer Speech and Language, Speech Communication, IJHCS, IEEE Trans. SAP (Speech and Audio Processing)

  12. NLP community (AI-NLP, Ling-CL)
      • Research projects
        - SAM/PAM, Penn Treebank, TAG, GATE, MUC, TIPSTER, TDT, TIDES, etc.
      • Platforms for development and research
        - Alembic, Alvey, GATE, LingPipe, Collins parser, Jasen, postag/K, … (see NLP software registry)
      • Conferences and workshops
        - ACL, EACL, ANLP, COLING, IJCNLP, EMNLP, …
      • Journals
        - Computational Linguistics, Natural Language Engineering, ACM TALIP, IJCPOL, Computers and the Humanities, …

  13. IR community (Library science)
      • Research projects
        - SMART, Digital Libraries, TREC, NTCIR, etc.
      • Platforms for development and research
        - SMART, MG, Lemur, Z-PRIZE, etc.
      • Conferences and workshops
        - ACM SIGIR, AIRS, ACM CIKM, JCDL, ASIST, …
      • Journals
        - IPM, JASIST, Information Systems, …

  14. Long History of Funding
      • Long research history since the 1960s
      • Significant research results thanks to constant funding (e.g. DARPA’s 20 years of funding): ready for practical solutions
      • Five main desiderata for practical applications?
        - Integration at the proper level of analysis/understanding
        - Combination of appropriate modalities
        - Based on real examples (corpora)
        - Towards multi-lingual applications
        - Thorough evaluation (usability)

  15. Contents
      • What is HLT? Definition, history
      • HLT workshop case study: DARPA HLT examples
      • Towards technology synergy: 21C Frontier project

  16. History notes: common technologies
      • HMM, SVM, CRF, MEMM, DBN for POS tagging, ASR, parsing, prosody modeling, statistical MT, etc.
      • TF-IDF, n-grams, discounting/smoothing for sentence weighting, retrieval models, summarization, question answering, etc.
      • Bayesian networks, causal networks, graphical models for information retrieval, topic detection, dialog, task planning, etc.
      • Trie indexing, tree indexing, caching for ASR pronunciation modeling, morphological lexicon, IR indexing, TTS G2P, etc.
      • Common themes → statistical language modeling, machine learning, and empirical evaluation (glass box/black box)
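
To make the n-gram and discounting/smoothing entry on this slide concrete, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing. The slide names no particular smoothing method, so add-one is an assumption chosen for brevity; the toy corpus and the names `train_bigram` and `bigram_prob` are illustrative only.

```python
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)  # vocabulary includes the boundary markers
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


# Toy corpus (hypothetical data, not from the slides)
corpus = [["record", "the", "drama"], ["record", "the", "news"]]
unigrams, bigrams = train_bigram(corpus)

p_seen = bigram_prob("the", "drama", unigrams, bigrams)     # bigram seen in training
p_unseen = bigram_prob("the", "record", unigrams, bigrams)  # bigram never seen
```

The unseen bigram still gets a non-zero probability, which is the point of smoothing: a recognizer or retrieval model built on these estimates does not rule out word sequences merely because they were absent from the training corpus.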

  17. Written language vs. spoken language?
      • Commonalities
        - Human languages
      • Differences
        - Punctuation vs. prosodic cues
        - Disfluencies vs. linguistic competency
        - Recognition errors
          I canned meat at eleven ten then ok
          I can’t / meet at eleven ten then? / ok

  18. IT839: new technology for economic growth needs synergy?
      • 8 new services
        - WiBro
        - DMB
        - Home Network
        - Telematics
        - RFID applications
        - W-CDMA
        - Terrestrial DTV
        - Internet telephony (VoIP)
      • 3 new infrastructures
        - BcN (Broadband Convergence Network)
        - u-sensor network
        - IPv6
      • 9 new growth technologies
        - New mobile communication
        - Digital TV broadcasting
        - Home Network
        - IT SoC
        - Next-generation PC
        - Embedded SW
        - Digital contents (DC)
        - Telematics
        - Intelligent service robots

  19. NLP/IR/speech merge: ACL-05 conference
      • The Association for Computational Linguistics invites the submission of papers for its 43rd Annual Meeting hosted jointly with the North American Chapter of the ACL. Papers are invited on substantial, original, and unpublished research on all aspects of computational linguistics, including, but not limited to: pragmatics, discourse, semantics, syntax, grammars and the lexicon; phonetics, phonology and morphology; lexical semantics and ontologies; word segmentation, tagging and chunking; parsing, generation and summarization; language modeling, spoken language recognition and understanding; linguistic, psychological and mathematical models of language; language-oriented information retrieval, question answering, and information extraction; machine learning for natural language; corpus-based modeling of language, discourse and dialogue; multi-lingual processing, machine translation and translation aids; multi-modal and natural language interfaces and dialogue systems; applications, tools and resources; and evaluation of systems.

  20. NLP/IR/speech merge: ACM SIGIR-05 conference
      • SIGIR 2005 welcomes contributions related to any aspect of IR, but the major areas of interest are listed below. For each general area, two or more area coordinators will guide the reviewing process.
      • Formal Models, Language Models, Fusion/Combination
      • Text Representation and Indexing, XML and Metadata
      • Performance, Compression, Scalability, Architectures, Mobile Applications
      • Web IR, Intranet/Enterprise Search, Citation and Link Analysis, Digital Libraries, Distributed IR
      • Cross-language Retrieval, Multilingual Retrieval, Machine Translation for IR
      • Video and Image Access, Audio and Speech Retrieval, Music Retrieval
      • Text Data Mining and Machine Learning for IR
      • Text Categorization, Clustering
      • Topic Detection and Tracking, Content-Based Filtering, Collaborative Filtering, Agents
      • Summarization, Question Answering, Natural Language Processing for IR, Information Extraction, Lexical Acquisition
      • Interactive IR, User Interfaces, Visualization, User Studies, User Models
      • Specialized Applications of IR, including Genomic IR, IR in Software Engineering, and IR for Chemical Structures
      • Evaluation, Building Test Collections, Experimental Design and Metrics

  21. Speech/NLP/IR merge: recent HLT conference series
      • HLT/NAACL 2003: Edmonton, Canada
      • HLT/NAACL 2004: Boston, USA
      • HLT/EMNLP 2005: Vancouver, Canada
      • The joint conference provides a unified forum for researchers across a spectrum of disciplines to present recent, high-quality, cutting-edge work, to exchange ideas, and to explore emerging new research directions. The conference especially encourages submissions that discuss synergistic combinations of language technologies (e.g., Speech with Information Retrieval, Machine Translation with Speech, Question Answering with Natural Language Processing, etc.). Particular consideration will be given to papers addressing novel learning tasks and evaluation metrics in speech, natural language processing and information retrieval, including e.g.:

  22. HLT/EMNLP 2005 CFP
      • learning tasks insufficiently addressed in the past, e.g. collaborative learning, learning in the presence of background knowledge, or finding anomalies in data;
      • limits of standard evaluation methods on new tasks;
      • novel performance measures incorporating user preferences, competence, or relevance to a given problem;
      • learning and optimization algorithms addressing the above, e.g. novel statistical methods or cognitively inspired solutions.
      • We are interested in papers from academia, government, and industry on all areas of traditional interest to the HLT and SIGDAT communities, as well as aligned fields, including but not limited to:
      • Speech processing, including:
        - Speech recognition
        - Speech generation
        - Speech summarization
        - Rich transcription: annotation of speech signals with metalinguistic information, such as speaker identity, attitude, emotion, etc.
        - Speech-based human-computer interfaces
      • Text summarization
      • Question answering
      • Paraphrasing
      • Computational analysis of phonology, morphology, prosody, syntax, semantics, pragmatics, discourse, style
      • Statistical techniques for language processing, including:
        - Corpus-based language modeling
        - Lexical and knowledge acquisition

  23. HLT/EMNLP 2005 CFP (continued)
      • Language generation and text planning
      • Sentence parsing and discourse analysis
      • Multilingual processing, including:
        - Machine translation of speech and text
        - Cross-language information retrieval
        - Multi-lingual speech recognition and language identification
      • Evaluation, including:
        - Glass-box evaluation of HLT systems and system components
        - Black-box evaluation of HLT systems in application settings
      • Development of language resources, including:
        - Lexicons and ontologies
        - Treebanks, proposition banks, and frame banks
      • Understanding of human communication, including:
        - Natural language interfaces
        - Dialogue structure and dialogue systems
        - Message and narrative understanding systems
        - Information extraction from multiple media
      • Information retrieval, including:
        - Formal models, clustering and classification
        - Web mining for IR
        - Natural language processing for IR
        - Spoken IR
        - Metadata annotation and XML IR

  24. Contents
      • What is HLT? Definition, history
      • HLT workshop case study: DARPA HLT examples
      • Towards technology synergy: 21C Frontier project

  25. Technology cross-over: some examples
      • Spoken language understanding needs information extraction technology
      • Language modeling (adaptation) for ASR needs information retrieval to expand the corpus for a specific domain
      • Statistical MT needs fast Viterbi decoding
      • ASR, SMT, and speech error correction use exactly the same HMM modeling process
      • Language modeling for ASR needs parsing/structural analysis
      • And more and more…
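
The shared HMM machinery mentioned on this slide can be illustrated with a plain Viterbi decoder. This is a minimal textbook sketch, not the "fast" variant the slide alludes to and not the project's implementation; the two-state tagging model and all of its probabilities are made-up values for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations (log-space)."""
    def lg(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + lg(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Hypothetical toy model: N = noun, V = verb
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {
    "N": {"robots": 0.5, "record": 0.1, "dramas": 0.4},
    "V": {"robots": 0.05, "record": 0.9, "dramas": 0.05},
}

tags = viterbi(["robots", "record", "dramas"], states, start_p, trans_p, emit_p)
print(tags)
```

Only the meaning of the states and observations changes between applications: tags and words for POS tagging, phone states and acoustic frames for ASR, or intended words and recognized words for error correction. The decoding loop stays the same, which is the cross-over the slide points to.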

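  26. Some scenarios from the 21C Frontier project
      • Stage 3:
        Senior: Kkoedol, when is the rerun of Lovers in Paris (파리의 연인) on?
        Robot: The rerun of Lovers in Paris is on the SBS Drama channel on Monday morning at 10.
        Senior: Set it up to record. Is the rerun at the same time next week too?
        Robot: Yes, I will schedule the recording. There is no rerun next week because of the Olympics broadcast.
        Senior: Oh, Deacon Kang Young-soon said she would come this Sunday. What time is she coming? (domain switching)
        Robot: Your appointment with Deacon Kang Young-soon on Monday is at 1 p.m.
        Senior: All right. Remind me again one hour before on that day. And bring me something to drink from the refrigerator. (domain switching)
        Robot: Yes, understood. What would you like to drink? (mixed mode conversation)
        Senior: Cold water would be good.
        Robot: Yes, I will bring you a glass of cold water.
        Senior: (pointing to the desk in the corner of the living room) And bring me the yellow book on top of that. (multi-modal gesture)
        Robot: Yes, I will bring you the yellow book on the desk.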

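  27. Conversational SDS: integrated approach
      • Senior: “그 그래? 어 대장금 시작하믄 알려줘” (“Th- that so? Uh, let me know when Dae Jang Geum starts”)
      • Speech recognition that reflects dialogue phenomena
        - Filler: 어/ (“uh”)
        - Repetition/repair: 그/ 그래?
        - Pronunciation variation: 시작하믄 (standard form: 시작하면)
      • Speech recognition result (CSR itself): “그래? 대장균삭히면 알려줘” (a misrecognition, roughly “let me know when the E. coli ferments”)
      • Recognition error correction using domain knowledge (TV guide) and syntactic/semantic/contextual knowledge (post error correction): “그래? 대장금 시작하면 알려줘”
      • Spoken dialogue management using the dialogue model and domain knowledge (dialog understanding)
      • Robot: Yes. When Dae Jang Geum starts, I will turn on the TV and let you know.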
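
A minimal sketch of the post error correction idea on this slide, assuming the domain knowledge is just a list of programme titles taken from a TV guide and that similarity is measured character by character with Python's standard `difflib`. The slide does not say how the actual system matches hypotheses against domain knowledge; the function name `correct_with_lexicon`, the lexicon contents, and the 0.6 cutoff are all illustrative assumptions, and only the title part of the misrecognized utterance is handled.

```python
import difflib


def correct_with_lexicon(token, lexicon, cutoff=0.6):
    """Return the closest lexicon entry if it is similar enough, else the token unchanged."""
    matches = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# Hypothetical TV-guide lexicon of programme titles
tv_guide_titles = ["대장금", "파리의 연인"]

# "대장균" is the misrecognized title from the slide; it differs from "대장금" by one syllable
fixed = correct_with_lexicon("대장균", tv_guide_titles)

# A word with no close match in the lexicon is left as it is
untouched = correct_with_lexicon("뉴스", tv_guide_titles)
```

A real system would also need the syntactic, semantic, and contextual knowledge the slide lists, for example to decide that a programme title is expected at this position at all; string similarity against a lexicon covers only the domain-knowledge part.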

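  28. HLT: speech/language synergy [diagram labels]
      • Speech recognition system: HTK-based, constrained but practical speech recognition
      • Speech post-processing: speech error correction
      • Spoken language understanding: understanding that is robust to speech errors
      • Dialogue system: dialogue modeling, and correction of speech recognition through dialogue
      • Question answering system: database query and result response system
      • Stage labels: speech processing (Recognition & Correction); language understanding (Understanding); dialogue and question answering; future integrated information DB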

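  29. From signal to dialog/knowledge
      • Improving speech preprocessing
        - Sound source separation and sound source classification
        - Compensation for the remote-microphone environment
        - Compensation for robot noise and background noise
      • Conversational speech interface [figure: “When is Dae Jang Geum on?” / “It is at 9:55.”]
      • Improving the conversational speech recognizer
        - Speech recognition that reflects the conversational phenomena of elderly speakers
        - Correction of speech recognition errors using contextual information
        - Improving the perceived recognition rate using the dialogue model
      • Example application scenario for the Silver Mate conversational speech interface
        Senior: Jjanggu, when is Janggeum on?
        Robot: The drama Dae Jang Geum is on Monday and Tuesday nights at 9:55.
        Senior: Wait, what day is it today?
        Robot: Today is Monday.
        Senior: Really? Let me know when Dae Jang Geum starts.
        Robot: Yes, when the drama starts I will turn on the TV and let you know.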
