ViSiCAST 2002 Technical Audit

ViSiCAST 2002 Technical Audit. 4 October 2002, Brussels. Michele Wakefield - Project Manager, ITC. The ViSiCAST Project: Virtual Signing, Capture, Animation, Storage and Transmission. Aims of ViSiCAST Project.

Presentation Transcript


  1. ViSiCAST 2002 Technical Audit • 4 October 2002, Brussels • Michele Wakefield - Project Manager, ITC

  2. The ViSiCAST Project • Virtual Signing, Capture, Animation, Storage and Transmission

  3. Aims of ViSiCAST Project “…support improved access by deaf citizens to information and services in sign language” • by successfully developing signing systems for • broadcast, WWW & ‘over the counter’ type applications • user friendly methods to capture & generate signs • machine readable system to describe gestures • ... preferred medium is sign language

  4. ViSiCAST Consortium • Instituut voor Doven • Hamburg University • University of East Anglia • Independent Television Commission • Televirtual • The Post Office • Institut National des Télécommunications • Royal National Institute for Deaf People • Institut für Rundfunktechnik

  5. Project Dimensions • Duration • Start : January 2000 • Finish : December 2002 • 36 months • Total Costs • 3770kECU total • 2876kECU funding from EC

  6. ViSiCAST Project Highlights • Signing transmissions demonstrated at IBC 2002 • MPEG-4 compliant INT-IRT demonstrator to deliver an open signing service for broadcast DTV • BBC demonstrator to deliver closed DTV signing service • Translate simple sentences in real time to sign animation • WWW Weather-forecaster launched in the Netherlands • Interactive sign language learning tool • 2nd trial of TESSA system now nationwide and RNID re-promoting ViSiCAST • after success of pilot at Science Museum, London • encouraging national media coverage

  7. ViSiCAST Project Structure (diagram) • Streams: Technology, User Application, Exploitation & Dissemination • Work areas: Animation, Linguistics, Internet, Community, Broadcast, Evaluation, Exploitation

  8. Presentations by Core Streams • Technology: Animation & Linguistics • WP4 Animation • WP5 Linguistics • User: Applications • WP2 Sign Tutor • WP1 Broadcast • WP2 WWW • WP3 Face to Face • WP6 Usability • Exploitation & Dissemination

  9. Presentation by Streams - Animation • WP 4 Animation • Increased realism in sign generation • Enhanced signing experience • WP5 Sign Language Linguistics • Use of natural sign language • Synthesis of sign language gestures

  10. Animation Work: Objectives • WP4: • Develop Hi-Resolution Avatars + related capture, and animation • To enable and support application development in WPs 1-2-3 using WP4 (& WP5) Product • To further develop, compare and integrate both proprietary and standard solutions, where appropriate, in networked environments

  11. Technology: WP4, Animation • At start of Year: • Visia 2 • Running in Mask 1 • Using Motion Capture Data only • Reasonable animation, expression etc.

  12. Technology: WP4, Animation • Visia 2 in MPEG-4 • Mesh partitioned into anatomical segments • MPEG-4 compliant authoring tool • Animation editing tool • Server-client tool for TX of animation parameters • MPEG-4 SNHC player < 25 fps • Embedded within an MPEG-4 set-top box

  13. Technology: WP4, Animation • Visia 3 • Updated Virtual Human • Higher resolution & polygon count, more realistic photographic textures • Improved articulation • Mesh distortion applied to garments • Facial expression via skeleton manipulation & morphs • Speech Enabled

  14. Technology: WP4, Animation • Visia 3 • New host software - Mask TNG • Writing new Active X Controls • Superior functionality, lighting and Camera FX, image quality, frame rate, flexibility etc. • >75 FPS

  15. Technology: WP4, Animation • Visia 3 • Running in Mask TNG Graph

  16. Technology: WP4, Animation • Facial Morphs • Created in Maya, exported to Mask TNG • Based on Sign Language expressions (BSL Dictionary) • Inter-operable • Variable weighting (<100%+) • May be used with Mo-Cap data or for synthetic sign

  17. Technology: WP4, Animation • Facial Animation - Experimental Work • Tracking of Active Shape Models • Tracking of Active Appearance Models

  18. Technology: WP4, Animation • Facial Animation - Experimental Work • Vision-based motion capture of facial expressions using MPEG-4 compliant templates.

  19. WP4: Synthetic Animation - Introduction • Task: • Make avatar do signing synthetically • as specified by ViSiCAST’s Signing Gesture Markup Language - SiGML • Motive: • Synthetic animation is more flexible than animation via motion-capture - “just write some more SiGML” • Support Natural-Language-to-Animation strategy of WP4-5 • In broadcasting applications: put synthetic player on receiver and transmit SiGML - very low bandwidth

  20. WP4: Synthetic Animation - Context • Televirtual Avatar is a deformable textured Mesh • Mesh shape and position are determined by configuration of underlying Skeleton • skeleton configuration: a.k.a. “Bone-Set” • To animate avatar: need to generate stream of Bone-Sets - one per frame of animation • i.e. BAF data stream - BAF = “Bones Animation Format” • Data intensive: 4Kb per bone-set
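To put the "data intensive" remark in numbers, the short sketch below estimates the raw bit rate of a BAF stream. It assumes that "4Kb" means 4 kilobytes per bone-set and that the avatar plays at 25 frames per second; neither figure is confirmed by the slides, so treat the result as an order-of-magnitude illustration only.

```python
def baf_stream_rate_kbit_per_s(bytes_per_bone_set: int, frames_per_second: int) -> float:
    """Raw bit rate of a stream carrying one bone-set per animation frame."""
    bits_per_second = bytes_per_bone_set * 8 * frames_per_second
    return bits_per_second / 1000.0


# Assumptions (not from the slides): 4 Kb = 4 * 1024 bytes, playback at 25 fps.
ASSUMED_BONE_SET_BYTES = 4 * 1024
ASSUMED_FPS = 25

rate = baf_stream_rate_kbit_per_s(ASSUMED_BONE_SET_BYTES, ASSUMED_FPS)
print(f"{rate:.1f} kbit/s")
```

Under these assumptions the uncompressed stream needs several hundred kbit/s, which is consistent with the motive given on the previous slide for transmitting SiGML and synthesising the bone-sets on the receiver.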

  21. WP4: Synthetic Animation - Technical Approach • SiGML specifies gestures through: • Postures: • hand shape • hand orientation - palm and extended finger direction • position of hand(s) in signing space • Motions - straight-line, circular, zig-zag etc. • Synthetic Animation Engine: • specifies hand bone configuration for given posture • configures arm/shoulder bones using Inverse Kinematics • implements transition from one posture to next using non-linear interpolation - often via control system modelling
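The transition step can be illustrated with a minimal sketch of non-linear interpolation between two postures. This is not the project's engine: the posture representation (a mapping from joint name to a single angle), the joint names and the smoothstep easing curve are all assumptions chosen for brevity, and the control-system modelling mentioned on the slide is not attempted here.

```python
from typing import Dict, List

# Hypothetical posture representation: joint name -> angle in degrees.
Posture = Dict[str, float]


def smoothstep(t: float) -> float:
    """Ease-in/ease-out weight: 0 at t=0, 1 at t=1, zero slope at both ends."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def interpolate_postures(start: Posture, end: Posture, n_frames: int) -> List[Posture]:
    """Return n_frames postures moving from start to end with non-linear timing."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for k in range(n_frames):
        w = smoothstep(k / (n_frames - 1))
        frames.append({joint: start[joint] + w * (end[joint] - start[joint])
                       for joint in start})
    return frames


# Illustrative values only.
rest = {"elbow": 10.0, "wrist": 0.0}
sign = {"elbow": 90.0, "wrist": 30.0}
frames = interpolate_postures(rest, sign, 5)
for frame in frames:
    print(frame)
```

The easing curve makes the hand accelerate away from the first posture and decelerate into the next, instead of moving at the constant speed that plain linear interpolation would give. A real skeleton would interpolate full joint rotations, not single angles.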

  22. WP4: Synthetic Animation - Progress (i) • Initial Prototype (D4-2) delivered 2001-12 • Supported most of manual SiGML • Implemented in Perl (interpreted scripting language) • BAF/VRML output to file - and then to avatar • Relatively slow - often < 15 fps • Perl module packaged as ActiveX control • relatively unwieldy architecture • Enhancements for 2002-02 (M5-11) • BAF data stream cached in memory - fed directly to avatar • Front-end (for WP5): HamNoSys input server, with built-in HamNoSys-to-SiGML translation

  23. WP4: Synthetic Animation - Progress (ii) • HamNoSys-to-Signing (Fast) 2002-06 • Synthetic Animation Engine re-implemented in C++ • 50 times faster - generates approx. 1000 fps, supporting real-time streamed input (e.g. Broadcast, WWW) • More flexible framework - basis for improved authenticity • Modular system architecture - supports flexible application development, scripting in WWW pages, etc. • Upgrade to Mask2 2002-09 • Interface to new primitive Mask2 ActiveX control • allows better control of animation frame scheduling • BAF replaced by VBM (ViSiCAST Bones and Morphs) - provides framework for support of non-manual SiGML

  24. Presentation by Streams - Linguistics • WP 4 Animation • Increased realism in sign generation • Enhanced signing experience • WP5 Sign Language Linguistics • Use of natural sign language • Synthesis of sign language gestures

  25. WP 5: Language Technology • Goal within the project: • To provide semi-automatic translation from English into BSL, DGS, NGT • Can also be used to assist the user in monolingual language input • No writing system for sign languages established

  26. Presentation by Streams • Animation and Linguistics • User Applications • Exploitation and Dissemination

  27. Presentation by Streams - Sign Tutor • WP2 Sign Tutor • WP1 Television • Closed signing for Broadcast DTT • WP2 Internet • Information and Education for Sign Language Learners • WP3 Face to Face • High Street Post Office Counter Services • WP6 Comparison of virtual signing • with video-recorded Human Signing

  28. Presentation by Streams - Television • WP2 Sign Tutor • WP1 Television • Closed signing for Broadcast DTT • Enhanced signing experience • Regulation and Standards • WP2 Internet • Information and Education for Sign Language Learners • WP3 Face to Face • WP6 Comparison of virtual signing

  29. Virtual Humans on TV: The Advantages • Low transmission rate < 25 kbit/s • Compatibility with signing on other media and sign languages • Precise, sharp representation of signer • Open display options • Compliance with international standards: MPEG, DVB • Future-proof: • cost saving allows vast no. of signed programmes • unified framework from video-based to VH signing

  30. Broadcast VH Signing: Achievements • Integrated TX system for broadcast to STBs • Implementing virtual human s/w in STB • MPEG-2 delivery layer for maximum compliance: • with existing hardware • with MPEG & DVB standards • with proprietary formats • MPEG-4 Audio-Video codec and player • MPEG-4 compliant virtual human • MPEG-4 SNHC virtual human codec and player • MPEG-4 based closed signing service demonstrated at IBC 2002

  31. Broadcast VH Signing: Functional architecture (diagram) • Encoder side: MPEG-2 AV encoder, MPEG-4 video encoder, MPEG-4 SNHC encoder, BAF encoder, Packet, MUX • Delivery: MPEG-2 TS • Decoder side: deMUX, dePacket, MPEG-2 AV decoder, MPEG-4 video decoder, MPEG-4 SNHC decoder, BAF decoder, Compositor, MPEG-4 multimedia player, Proprietary Multimedia player • Layers: Compositor, Encoder/Decoder, System, Delivery • Legend: normative, proprietary

  32. Broadcast VH Signing: System layer implementation (diagram) • Components: UDP/TCP packetiser, IRT-DSP MPEG encoder, IP filter, DVB receiver card, RF modulator • Delivery: MPEG-2 TS • Layers: Compositor, Encoder/Decoder, System, Delivery

  33. Broadcast VH Signing: Perspectives • Advanced TX system for broadcast to MHP compliant STBs • Open, MPEG & DVB compliant architecture • Improved synchronisation layer • Integrating a compositing layer • Implementing an enriched MPEG-4 multimedia authoring tool • Integrating SiGML stream

  34. Demonstration

  35. Presentation by Streams - WWW - Web pages with signing, Field trials • WP2 Sign Tutor • WP1 Television • Closed signing for Broadcast DTT • WP2 Internet • Information and Education for sign language learners • Web-pages with signing • WP3 Face to Face • High Street Post Office Counter Services • WP6 Comparison of virtual signing

  36. Weather Forecast Application (diagram) • Content provider: forecast creation tool, ‘play list’ • Internet • User: web-browser + plug-in, avatar, weather signs • Labels: 1st DEMO, 2nd DEMO

  37. Demo

  38. The field trials with Deaf users • Hosting at site of Dutch Deaf organisation Dovenschap: www.dovenschap.org • Running from end-June until end-October • Deaf users can join the field trial by filling in a form on the website • CD-rom with necessary software sent to users

  39. Field Trial Promoted • 70 e-mails to webmasters of Deaf clubs, Deaf schools, Deaf organisations and private sites of Deaf persons • promotion on Teletext (T.V.) • on informative websites for Deaf people • visit at meeting of national Deaf organisation with 12 member organisations • article in magazine for sign language interpreters • 30 CD-roms sent to Deaf clubs and schools

  40. Trial Feedback • Helpdesk, contacted by e-mail • Discussion page on website • Evaluation form: software and installation, included with receiving software • Evaluation form: avatar and sign language, will be sent end of October 2002

  41. Present Situation • Field trial still running • News slowly spreading • Positive reactions • Results at the end of November

  42. Presentation by Streams – Face to Face • WP2 Sign Tutor • WP1 Television • Closed signing for Broadcast DTT • WP2 Internet • WP3 Face to Face • High Street Post Office Counter Services • Close involvement with RNID • WP6 Comparison of virtual signing • with video-recorded Human Signing

  43. WP3 Overview • Evaluation – October 2001 • New TESSA system – Mar 2002 • Post Office Trial – May 2002 – Present • Sign Recognition – April 2002 – Present

  44. Evaluation – October 2001 • Evaluation conducted at PO concept store using TESSA V3. • 10 Deaf People and 5 Counter Clerks participated over 10 days. • Mirror of previous evaluation + Some comparative tests of virtual signing with a video recorded human signer (full details in WP6 presentation)

  45. Evaluation – Observations • Clerks complained about the speed of transactions • Caused by : • Toggle switch for recogniser • Mis-recognitions caused by large vocabulary • Poor mapping from recognised speech to phrases • Cumbersome graphical interface

  46. TESSA V4 – Recognition System • ‘Bag of words’ language model. • Only words relevant to post office phrases recognised • Many fewer insertion errors • More resilient to external noise • Example vocabulary: Hello, Goodbye, Going, Where, First, Second, Class, …

  47. TESSA V4 – Phrase Mapping • Phrase mapping system derived from work on Automatic Call Routing • Represent each of the signed phrases and the test phrase as vectors in a co-occurrence matrix

               Phrase 1   Phrase 2   Phrase 3   ...   Phrase N
      A           0          2          0       ...      1
      About       0          0          0       ...      0
      Access      1          0          0       ...      0
      Account     0          1          1       ...      0
      ...        ...        ...        ...      ...     ...
      You         0          0          0       ...      0
      you’ve      0          0          0       ...      1
      Your        1          0          0       ...      0

  48. TESSA V4 – Phrase Mapping • Weight the entry W(i,j) such that : • Calculate distance between vectors representing each canonical phrase and input phrase. • More details in: • S. Cox, “Speech and Language Processing for a Constrained Speech Translation System”, in Proc. Int. Conf. on Spoken Language Processing, October 2002 • M. Lincoln and S. Cox, “A Comparison of Language Processing Techniques for a Constrained Speech Translation System” (submitted to ICASSP 2003)
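The vector-based mapping described on the two phrase-mapping slides can be sketched as follows. This is only an illustration under stated assumptions: the weighting W(i,j) is not given in the transcript, so raw word counts are used in its place; cosine distance is assumed as the distance measure; and the canonical phrases and the function names are invented for the example, not taken from TESSA.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lower-case whitespace tokenisation (a simplification)."""
    return text.lower().split()


def build_vocabulary(phrases: List[str]) -> List[str]:
    """Rows of the matrix: every word that occurs in a canonical phrase."""
    return sorted({word for phrase in phrases for word in tokenize(phrase)})


def phrase_vector(text: str, vocab: List[str]) -> List[int]:
    """One matrix column: word counts over the vocabulary (unknown words ignored)."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]


def cosine_distance(a: List[int], b: List[int]) -> float:
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def map_to_phrase(input_text: str, canonical: List[str]) -> str:
    """Return the canonical phrase whose vector is closest to the input's."""
    vocab = build_vocabulary(canonical)
    query = phrase_vector(input_text, vocab)
    return min(canonical,
               key=lambda p: cosine_distance(query, phrase_vector(p, vocab)))


# Invented example phrases, for illustration only.
CANONICAL = [
    "what is your account number",
    "do you want first class or second class",
    "where is it going",
]
print(map_to_phrase("second class please", CANONICAL))
```

Because the input is compared as a whole vector, a clerk does not have to say a phrase word for word: any wording that shares enough vocabulary with one canonical phrase is mapped to it.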

  49. TESSA V4 - Mapping Evaluation • Subset of 155 phrases. • 5 Talkers, each asked to • write down another way of expressing the phrase • record speaker saying this phrase • Recognise speech (NB No Adaptation) • 75.1% Correct ; 49.8% Accurate • Test phrase mapping on both text and recognised speech
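The gap between the "Correct" and "Accurate" figures is what one would expect if they follow the scoring convention common in speech recognition, where accuracy additionally penalises inserted words. The slide does not define the two terms, so the definitions below are an assumption, and the counts are invented purely to show the arithmetic.

```python
from typing import Tuple


def word_scores(n_ref: int, deletions: int, substitutions: int,
                insertions: int) -> Tuple[float, float]:
    """Return (percent correct, percent accuracy) for n_ref reference words.

    Assumed conventional definitions:
      correct  = (N - D - S) / N
      accuracy = (N - D - S - I) / N
    """
    hits = n_ref - deletions - substitutions
    correct = 100.0 * hits / n_ref
    accuracy = 100.0 * (hits - insertions) / n_ref
    return correct, accuracy


# Invented counts, not project data.
correct, accuracy = word_scores(n_ref=100, deletions=5, substitutions=20, insertions=25)
print(f"{correct:.1f}% correct, {accuracy:.1f}% accurate")
```

Under this reading, a large difference between the two scores points to many inserted words in the recogniser output.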

  50. TESSA V4 – Mapping Evaluation
