Noriko Kando National Institute of Informatics. Roadmap for Language Resources in the viewpoint from Information Access Technology Evaluation. Presented at: Roadmap for Language Resources and Evaluation In a Multilingual Environment: Organised by COCOSDA and WRITE Genoa, 28 May 2006
Noriko Kando, National Institute of Informatics. Roadmap for Language Resources in the viewpoint from Information Access Technology Evaluation. Presented at: Roadmap for Language Resources and Evaluation In a Multilingual Environment: Organised by COCOSDA and WRITE, Genoa, 28 May 2006. In conjunction with LREC 2006.
Issues • LR and Information Access • Multi-linguality across Culture • Emerging Areas: Genres, Opinion, Subjectivity, Community-based Ontology
LR and Information Access • Information Access (IA) technologies (information retrieval, question answering, summarization, text mining, etc.) need better LR: coverage, richness, quality. • Evaluation • Development. Ex. The AQUAINT (Advanced QA) Program has supported resources (WordNet, Gazetteer, etc.), component modules, and QA systems.
LR and Information Access – cont’d • Extrinsic (task-based) LR evaluation • So far, LR evaluation has emphasized intrinsic evaluation, e.g. accuracy, consistency, standards, etc. • Extrinsic LR evaluation: how much does an LR improve the effectiveness of IA technologies? • A good way to convey the social importance of LR: easy for non-experts and sponsors to understand.
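The extrinsic evaluation idea above can be sketched as a comparison of retrieval effectiveness with and without a language resource. This is a minimal illustration, not a method given in the slides: the choice of mean average precision as the effectiveness measure, the function names, and the toy runs and judgments are all assumptions.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant doc ids."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of average precision over all judged queries."""
    return sum(average_precision(runs[q], qrels[q]) for q in qrels) / len(qrels)


def extrinsic_gain(baseline: Dict[str, List[str]],
                   with_lr: Dict[str, List[str]],
                   qrels: Dict[str, Set[str]]) -> float:
    """Task-based score of an LR: change in MAP when the IA system uses it."""
    return (mean_average_precision(with_lr, qrels)
            - mean_average_precision(baseline, qrels))


# Hypothetical toy data: relevance judgments and two runs of the same system,
# one without and one with the language resource under evaluation.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
baseline_run = {"q1": ["d2", "d1", "d4", "d3"], "q2": ["d5", "d2"]}
lr_run = {"q1": ["d1", "d3", "d2", "d4"], "q2": ["d2", "d5"]}

gain = extrinsic_gain(baseline_run, lr_run, qrels)
print(f"MAP gain attributable to the LR: {gain:.2f}")
```

A single number of this kind is the sort of result that non-experts and sponsors can read directly, which is the point the slide makes about extrinsic evaluation.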
LR and Information Access – cont’d • LR can be enriched or created through usage in IA systems. Ex. search engine query logs, users’ relevance judgments, click-through data, etc.
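As one possible reading of the point above, click-through records from a search log can be aggregated into approximate relevance judgments. The sketch below assumes a hypothetical log format of (query, document id, clicked) tuples and arbitrary thresholds; neither comes from the slides.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

LogEntry = Tuple[str, str, bool]  # (query, doc_id, was_clicked): assumed format


def clickthrough_judgments(log: Iterable[LogEntry],
                           min_impressions: int = 2,
                           min_ctr: float = 0.5) -> Dict[str, Set[str]]:
    """Derive pseudo relevance judgments from click-through rates.

    A document counts as relevant to a query when it was shown at least
    `min_impressions` times and clicked in at least `min_ctr` of them.
    Both thresholds are illustrative assumptions.
    """
    shown: Counter = Counter()
    clicked: Counter = Counter()
    for query, doc_id, was_clicked in log:
        shown[(query, doc_id)] += 1
        if was_clicked:
            clicked[(query, doc_id)] += 1

    judgments: Dict[str, Set[str]] = {}
    for (query, doc_id), impressions in shown.items():
        ctr = clicked[(query, doc_id)] / impressions
        if impressions >= min_impressions and ctr >= min_ctr:
            judgments.setdefault(query, set()).add(doc_id)
    return judgments


# Hypothetical log: d1 is clicked consistently, d2 never, d3 is seen only once.
sample_log = [
    ("jaguar", "d1", True),
    ("jaguar", "d1", True),
    ("jaguar", "d2", False),
    ("jaguar", "d2", False),
    ("jaguar", "d3", True),
]
derived = clickthrough_judgments(sample_log)
print(derived)
```

Requiring a minimum number of impressions keeps a single stray click from being counted as a judgment, at the cost of ignoring rarely shown documents.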
Multi-linguality • Axes to characterize CLIA systems • Languages • Type of media • Tasks and users • Success criteria or relevance judgments • Document genres • Layers of CLIR technologies • Information access process [Kando 2002; Gey, Kando & Peters 2002]
Layers of Cross-Lingual IA Technologies • Pragmatic layer: cultural & social aspects • Semantic layer: concept mapping • Syntactic layer • Lexical layer: language identification, indexing • Symbol layer: character codes • Physical layer: network [Kando 1999; 2002; Gey, Kando & Peters 2002]
Multi-linguality in the Pragmatic Layer • The pragmatic layer of CLIA technologies includes issues related to text structure and intra- and inter-text relationships • Identifying differences in viewpoint across languages or cultures is also critical.
Emerging Areas • Especially in conjunction with the Web: • Heterogeneous document genres • Subjectivity, opinion, emotion, etc. • Community-based or domain-specific ontology • Multi-faceted ontology • Interactivity • Multi-modal
Summary • LR and Information Access • Multi-linguality across Culture • Emerging Areas: Genres, Opinion, Subjectivity, Community-based Ontology
Thanks Merci Danke schön Grazie Gracias Ta! Tack Köszönöm Kiitos Terima Kasih Khap Khun Asante Tak 謝謝 ありがとう