This document provides an overview of the European Commission's initiatives in language technology and machine translation through DG INFSO. It covers funding programs from FP3 to FP7, highlighting successful projects, emerging trends, and the evolving policy landscape surrounding language technology. The emphasis is on fostering innovation in multilingual online content management and developing methodologies for adaptive, self-learning machine translation systems. The role of data-driven approaches and community collaboration in shaping the future of language technology is also addressed.
Language Technology in European Funding Programmes. Kimmo Rossi, DG Information Society and Media, Unit E1 – Language Technology, Machine Translation. INFSO-E1 12/09/2008
DG INFSO E in Luxembourg
Introducing DG INFSO • DG Information Society and Media • 10 directorates • not only a (research) funding agency: also policy making and market regulation • strategic policy framework: i2010 • most sizeable implementation task: ICT programme of FP7 (9.1 B€)
Directorate E – Digital Content & Cognitive Systems • located in Luxembourg • 7 units, 130 people • Main themes: • Digital libraries • Public sector information • Language technology • Cognitive systems, robotics • Safer Internet • content production, (re-)use
Unit E1 – Language Technology, Machine Translation • Until 30/6/2008: Interaction & Interfaces • New name implies more emphasis on (novel) language technology • Changed political, technological and societal context • EU of 23 official languages • Breakthrough of data-driven MT • Community-based approach to content production and use (Web 2.0)
Brief history (1993-2003) • MLIS-FP3-FP4-LE-ESPRIT • Dedicated Language Technology projects (Machine Translation, translation tools, terminology, standardisation, language resources (LRs)) • some important seed technology (e.g. Translator’s Workbench, EBMT) • About 100 small projects (less than 1 MEUR) • FP5 (5th framework programme) • Applications of Language Technology (cars, mobile services, information retrieval, knowledge management) • 90+ projects, 130+ MEUR funding • Mid-sized projects: 0.7 – 2.5 MEUR
From history to present • FP6 (6th framework programme) • 28 projects (20 with LT), 120 MEUR • Larger projects (up to 15 MEUR) • Non-linguistic interaction: about 12 projects: TnD, TAI-CHI, ARTTS, SATIN, … • “Pure” language technology: TC-STAR, TALK, EUROMATRIX, SMART, LUNA, … • Multimodal interaction: CHIL, AMI, AMIDA, … • FP7 (7th framework programme) • Challenge 2: cognitive systems, robotics, interaction • Objective 2.2: language-based interaction
Current projects • EuroMatrix (www.euromatrix.net) • Introducing linguistics into SMT • Factored SMT (treelet alignment, morphology) • Improvement by various means (combining engines etc.) • Inventory of MT systems and language resources for SMT • SMART (www.smart-project.eu) • Optimising SMT (algorithms, parameters, combining SMT engines) • Machine learning techniques (e.g. kernel methods) • towards adaptive MT • ITALK (italkproject.org) • iCub robot learns language by doing • integrates action, social and linguistic skills, cognitive development • language learning from scratch, for simple tasks • ROSSI (www.rossiproject.net) • how language development is linked to physical experience • sensori-motor grounding of human conceptualisation and language use • neurological basis underlying the use of verbs and nouns • grounding of object affordances
Current projects • ALEAR (www.alear.eu) • language evolution in robot populations • baseline: language games (Luc Steels), recruitment theory • self-organisation of conceptual frameworks and communication systems • POETICON (www.poeticon.eu) • linking sensori-motor representations with linguistic ones • extending Lexicon into Praxicon (grammar and lexicon of action) • EMIME (www.emime.org) • unified modelling of speech recognition and synthesis • personalised speech synthesis (“your voice speaking Chinese”) • reversibility of statistical modelling techniques • FlareNet (http://www.ilc.cnr.it/flarenet/) • thematic network on Language Resources • what are Language Resources? • promotes interaction of LR stakeholders • inventory and roadmap for action • open for new members – join now!
Trends • New requirements – new approaches • From Web 1.X to Web 2.0 – we are all content producers • From static and uni-directional to dynamic, volatile, interactive, collaborative • Translations are needed “on the fly” • Are language technologies up to the task? • What happens to online content? • Disappearing document • What is on the Internet? Who knows? Google? • Europa web site: 6 million “documents” • Disappearing distinction between content and service • How to manage (automatically?) the multilingual online “content”? • From service to self-service • Travelling, banking, house-buying … • Need for language-literate systems • Multilingualism on the rise • In the EU (from 4 to 23 languages) – and the global dimension • Online content becomes more multilingual • English gains ground – but mother tongues remain
Challenges • Machine translation – new paradigms • adaptive, self-learning MT systems • MT that learns from its mistakes • learning through understanding and vice versa • Is there any future for RBMT? • Bringing scientific communities together • learning a common language among researchers • detecting common interest & mutual benefit • scientific communities can learn from each other • Language resources • Exploit the hidden treasures (e.g. public sector resources) • Improve usability of existing resources • Identify and address gaps in coverage • Reusable, standardised, automated collection (e.g. from Web) • Towards automation: harvesting LRs from the Web
Funding opportunities ahead? • Language-based interaction: priorities will be defined in next FP7 Work programme (covering call 4): http://cordis.europa.eu/fp7/dc • Online multilingualism: creation of a single European information space. Watch out for future actions in the ICT-PSP programme: http://ec.europa.eu/information_society/activities/ict_psp
Thank you for your attention. Contact: Kimmo.Rossi@ec.europa.eu