June 10th, 2003

IST Proposal MobiNews Meeting - June 10th, 2003 “Automatic and Personalised Compilation of Broadcast News with Audio Playback on Mobile Devices”. François CAPMAN, PhD Research Engineer, Technologies Radio & Signal Unit francois.capman@fr.thalesgroup.com Tel : +33 (0) 1 46 13 29 63


Presentation Transcript


  1. IST Proposal
     MobiNews Meeting - June 10th, 2003
     “Automatic and Personalised Compilation of Broadcast News with Audio Playback on Mobile Devices”
     François CAPMAN, PhD Research Engineer, Technologies Radio & Signal Unit
     francois.capman@fr.thalesgroup.com
     Tel : +33 (0) 1 46 13 29 63
     Fax : +33 (0) 1 46 13 25 55

  2. MobiNews Workshop Agenda
     • 10h00 - 10h15 Agenda, objectives of the meeting
     • 10h15 - 10h30 Presentation of MobiNews IST proposal, current status
     • 10h30 - 11h30 Presentation of each organisation 1 (5mn/10mn)
     • 11h30 - 11h45 Break
     • 11h45 - 12h15 Presentation of each organisation 2 (5mn/10mn)
     • 12h15 - 12h45 Definition of contributions and overall structure of the project
     • 12h45 - 13h45 Lunch
     • 13h45 - 15h15 Detailed structure of the project, description of work-packages
     • 15h15 - 15h45 Other topics (additional partners, ...)
     • 15h45 - 16h00 Further steps, planning for the proposal
     • 16h00 - 16h30 Discussion - Conclusion

  3. IST Objectives (2nd Call)
     • Call 2: publication 17/6 2003, closing 15/10 2003; indicative budget of around 525 MEuros (80 % pre-distributed).
     • Objectives covered in Call 2:
       – Advanced displays
       – Optical, opto-electronic, & photonic functional components
       – Open development platforms for software and services
       – Cognitive systems
       – Embedded systems
       – Applications and services for the mobile user and worker (60 MEuros)
       – Cross-media content for leisure and entertainment (55 MEuros)
       – GRID-based Systems for solving complex problems
       – Improving Risk management
       – eInclusion
     • Specific Targeted Research Project (STREP): 2.5 / 3.0 MEuros (Funding)

  4. IST Objectives (2nd Call)
     2.3.2.7 Cross-media content for leisure and entertainment
     Objective: To improve the full digital content chain, covering creation, acquisition, management and production, through effective multimedia technologies enabling multi-channel, cross-platform access to media, entertainment and leisure content in the form of film, music, games, news and alike. It will accelerate take up in B2B, B2C and C2C, currently hampered by insufficient productivity, convergence and high cost.
     Focus is on:
     – Developing technologies supporting the creation of new, compelling forms of content for interactive, creative or artistic consumption. Research should aim at advancing imaging technologies and audio-visual representation, multi-dimensional immersive environments and experience portals, as well as virtual, augmented and mixed reality technologies featuring higher levels of quality and accuracy. Device adaptivity and contextualisation, personalisation and (emotive) feedback, and ability to capture real-time, multimodal and multisensorial input will be embedded as needed.
     – Developing integrated content programming environments allowing to retrieve content from different sources, types and locations, and to store, compress and categorise it, with a view to realising programming appropriate to a particular audience and delivery channel, including interactive TV, e-cinema, radio, online games and music.

  5. IST Objectives (2nd Call)
     2.3.2.6 Applications and Services for the Mobile User and worker
     Objective: To foster the emergence of rich landscape of innovative applications and services for the mobile user and worker and to support the use and development of new work methods and collaborative work environments. These should be based on interoperable mobile, wireless technologies and the convergence of fixed and mobile communication infrastructures. Such applications and services will enable new business models, new ways of working, improved customer relations and government services in any context. The target applications and services will be capable of being seamlessly accessed and provided anywhere, anytime and in any context.
     Focus is on:
     – The integration of technologies into a wide range of innovative mobile and multimodal applications and services including workplace designs that enhance creativity and productivity. (Intelligent, adaptive and self-configuring services that deploy wearable interfaces and enable automatic context-sensitivity, user profiling and personalisation in a trusted and secure environment as well as multi-lingual and multi-cultural presentation, and multiple modes of interaction)
     – Addressing the major hurdles for the deployment of applications and services for the mobile user.

  6. MobiNews Proposal
     Targeted Application
     • Automatic compilation of broadcast news (audio, text) with audio playback on mobile devices (2.5G, 3G).
     • Access to personally selected text and audio news from a service/source provider using the Multimedia Messaging Service (MMS) transmission protocol.
     Expected Features
     • Fast and reliable access to a synthetic newscast on a regular basis (daily, weekly, …) or upon request.
     • Access to various identified sources within the same compilation, using the scheduled programme.
     • Automatic server-based generation of the synthetic newscast, with low-cost MMS WAP 2.0 transmission towards mobile devices.
     • User-defined profile for automatic download.
     • Enhanced Man Machine Interface (MMI) for query submission, key-word-based search, ...

  7. MobiNews Proposal
     Technical Objectives
     • Audio data and Text data Structuring:
       – automatic / semiautomatic segmentation (speaker tracking, scheduled programme, …)
       – classification, discrimination (speech, music, jingles, …)
       – transcription and information retrieval (word-spotting, key-words, …)
       – automatic summarisation
     • Very Low Bit Rate (VLBR) Wide-Band speech compression (with optional scalable audio stage).
     • Text-To-Speech (TTS) synthesis for audio display of the transmitted text component (optional voice conversion, style / prosody mimicking).
     • Software optimisation (complexity and memory) of VLBR decoder and TTS modules for embedded solutions on mobile devices (downloadable as plug-ins).
     • Enhanced interface for mobile products (Natural Language Processing (NLP), …)
     • Demonstrator with MMS link between a PC-based server and a handheld mobile terminal.

  8. MobiNews Proposal

  9. MobiNews Proposal

  10. VLBR compression for MobiNews
     • Targeted duration: 10 to 15 minutes in one single MMS
     → VLBR between 800 and 1200 bits/sec
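As a rough check of the figures on slide 10, the sketch below computes the audio payload of one newscast for the stated durations and bit rates. It assumes a constant bit rate and ignores MMS headers and the text component; the function name `payload_bytes` is illustrative and not part of the proposal.

```python
def payload_bytes(duration_min: float, bitrate_bps: float) -> float:
    """Audio payload in bytes for a constant-bit-rate stream.

    Assumption: constant bit rate, no container or MMS overhead.
    """
    seconds = duration_min * 60
    return seconds * bitrate_bps / 8  # 8 bits per byte


if __name__ == "__main__":
    # Lower and upper corners of the range quoted on the slide.
    smallest = payload_bytes(10, 800)    # 10 minutes at 800 bits/sec
    largest = payload_bytes(15, 1200)    # 15 minutes at 1200 bits/sec
    print(f"{smallest / 1000:.0f} kB to {largest / 1000:.0f} kB")
```

Under these assumptions the audio part of a single MMS falls between 60 kB (10 minutes at 800 bits/sec) and 135 kB (15 minutes at 1200 bits/sec), with 1 kB taken as 1000 bytes.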

  11. MobiNews Work Packages
     Definition of Work Packages
     • WP 1 Project management
     • WP 2 Analysis of the needs, analysis of the market, dissemination
     • WP 3 Broadcast radio news databases (specifications, collect, recordings)
     • WP 4 Audio and text data structuring
     • WP 5 Very-Low Bit Rate (VLBR) compression for synthetic newscast
     • WP 6 Text-To-Speech (TTS) synthesis for mobile devices
     • WP 7 MMS-based demonstrator (Server and mobile applications, MMI, …)
     • WP 8 Evaluation methodology, field trials, analysis

  12. MobiNews Consortium
     • Thales Communications (France)
     • L.I.A. (France)
     • E.N.S.T. (France)
     • E.S.I.E.E. (France)
     • Elan Speech (France)
     • Brno University of Technology (Czech Republic)
     • Multitel (Belgium)
     • INESC-ID (Portugal)
     • PT Inovação, Voice services and platforms Dept (Portugal)
     • Radio France Multimedia (France)
     • Belga Press Agency (Belgium)
     • Portuguese Radio/TV (Portugal) ???

  13. Presentation of organisations
     General Presentation and Potential Contributions to MobiNews
     • 1 - Gwenaël Guilmin (Thales Communications)
     • 2 - Bertrand Ravera : RNRT project proposal Mobi-Info
     • 2 - Corinne Fredouille (L.I.A.)
     • 3 - Maurice Charbit (E.N.S.T.)
     • 4 - Geneviève Baudoin (E.S.I.E.E.)
     • 5 - Jacques Toën (ELAN SPEECH)
     • 6 - Petr Motlicek (BRNO University of Technology)
     • 7 - Stéphane Deketelaere (MULTITEL)
     • 8 - Isabel Trancoso (INESC-ID)
     • 9 - Nuno Beires (PT INOVACAO)
     • 10 - Caroline Roy (RADIO France MULTIMEDIA)

  14. Contributions
     • Thales Communications:
       – Speech segmentation / classification
       – Very-Low Bit Rate speech compression using parametric approaches
       – optimisation of VLBR for a mobile plug-in
     • E.N.S.T.:
       – voice conversion using improved HNM synthesis
       – joint-optimisation of speech units for coding and synthesis
     • E.S.I.E.E.:
       – Very Low Bit Rate speech compression using recognition/synthesis
       – Very Low Bit Rate speech compression using parametric approaches
       – voice conversion
       – joint-optimisation of speech units for coding and synthesis
     • BRNO University of Technology:
       – Very Low Bit Rate speech compression using recognition/synthesis

  15. Contributions
     • ELAN SPEECH:
       – distributed architecture (mobile/server) for speech synthesis
       – optimisation for a mobile plug-in
       – voice personalization, voice conversion
     • INESC-ID, and L.I.A.:
       – audio data structuring
     • MULTITEL:
       – Man-Machine Interface, Natural Language Processing
     • PT INOVACAO:
       – MMS synthetic newscast packaging
       – MMS-based demonstrator
     • Radio France Multimedia, and Belga Press Agency (+ Portuguese TV/radio):
       – specifications
       – news content provider
       – evaluation

  16. MobiNews: WORKPLAN
     WP 2: Analysis of the market, … needs, dissemination:
     • WP2.1: Analysis of the market: existing services
     • WP2.2: Analysis of the needs: limitations of the existing services
     • WP2.3: Dissemination: valorisation of the outcome of the project, standardisation, ...

  17. MobiNews: WORKPLAN
     WP 3: Broadcast radio news databases
     • WP3.1: Audio databases (collect, recordings, annotation, meta-data, …)
     • WP3.2: Text databases (collect, annotation, meta-data, …)
     • WP3.3: Service specifications (features, user acceptance, …)

  18. MobiNews: WORKPLAN
     WP 4: Audio and Text data Structuring
     • WP4.1: Low-level segmentation
       – speech/non speech discrimination (silence, noise, pause, speech, music, jingle, …)
       – speaker characterisation (identification, tracking, segmentation, clustering, …)
     • WP4.2: High-level segmentation
       – speech-to-text transcription
       – story segmentation, topic detection, tracking and classification
     • WP4.3: Customisation
       – text summarisation, audio summarisation
       – constrained summarisation (profile-driven, queries-driven, duration, multi-sources, …)
       – meta-data information
       – evaluation methodology (reference human-built summaries, quiz scores, …)
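To make the idea of profile-driven, duration-constrained compilation in WP4.3 concrete, here is a minimal sketch. It is not the method of the proposal, which does not specify one: it assumes that stories have already been segmented and tagged with keywords, and it uses a simple greedy heuristic. The names `Story`, `score` and `compile_newscast` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    """Hypothetical record for one segmented news story."""
    title: str
    keywords: frozenset
    duration_s: float


def score(story: Story, profile: set) -> int:
    """Relevance = number of story keywords found in the user profile."""
    return len(story.keywords & profile)


def compile_newscast(stories, profile, max_duration_s):
    """Greedy selection: most relevant stories first, within the time budget."""
    # Rank by relevance (descending), then by duration (ascending).
    ranked = sorted(stories, key=lambda s: (-score(s, profile), s.duration_s))
    selected, remaining = [], max_duration_s
    for story in ranked:
        if score(story, profile) == 0:
            continue  # no keyword in common with the profile
        if story.duration_s <= remaining:
            selected.append(story)
            remaining -= story.duration_s
    return selected


if __name__ == "__main__":
    stories = [
        Story("A", frozenset({"sport", "football"}), 120),
        Story("B", frozenset({"politics"}), 300),
        Story("C", frozenset({"sport"}), 200),
        Story("D", frozenset({"weather"}), 60),
    ]
    chosen = compile_newscast(stories, {"sport", "football", "politics"}, 400)
    print([s.title for s in chosen])
```

A greedy pass is only one possible design; it is easy to follow but can leave part of the time budget unused, which an exact knapsack-style selection would avoid.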

  19. MobiNews: WORKPLAN
     WP 5: VLBR Speech / Audio compression
     • WP5.1: Segmental-based parametric compression of synthetic newscast
       – audio stream analysis and segmentation
       – optimised compression of structured messages
       – scalable solutions (bit-rate and bandwidth)
     • WP5.2: Compression based on natural speech units indexing
       – optimised HNM-based speech synthesis
       – speaker-independent mode (speaker adaptation, voice conversion)
       – joint-optimisation of units for both synthesis and coding
       – compression of synthesis units for memory storage optimisation

  20. MobiNews: WORKPLAN
     WP 6: Text-To-Speech synthesis for mobile devices
     • WP6.1: Voice conversion / customisation
     • WP6.2: Optimisation for mobile terminals
       – complexity reduction
       – memory storage
       – distributed software architecture

  21. MobiNews: WORKPLAN
     WP 7: User-centred design of the MMI (Man Machine Interface)
     • WP7.1: Server-based application
       – optimised entries for the definition of user profile, user queries, ...
     • WP7.2: Mobile embedded application
       – design of an efficient mobile interface with emphasis on the ease-of-use and the acceptability (= usability)

  22. MobiNews: WORKPLAN
     WP 8: MMS-based demonstrator
     • WP8.1: Server-based applications
       – module for data structuring
       – module for audio compression
       – MMS packaging
     • WP8.2: Mobile devices embedded applications
       – MMS de-packaging
       – optimised plug-in for text-to-speech synthesis
       – optimised plug-in for audio decompression

  23. MobiNews: WORKPLAN
     WP 9: Evaluation methodology, Field trials, Analysis
     • WP9.1: Evaluation methodologies
       – audio quality for speech synthesis and compression
       – evaluation of synthetic newscast (summarisation)
       – evaluation of MMI (queries, profile, …)
     • WP9.2: Field trials and analysis
       – quiz score methods
       – … ?

  24. Administrative Issues
     The project proposal will include:
     • A1 form: proposal acronym, proposal number, proposal title, estimated duration (30 months ?), key word codes, abstract (co-ordinator)
     • A2 form: participant submission form (for each participant)
     • A3 form: financial information (co-ordinator)
     • B part: non-anonymous description of scientific/technological objectives
