
The Long March (長征) to 3D Video

Leonardo Chiariglione. Speech at 3D Systems and Applications, Seoul – 2014/05/28.


Presentation Transcript


  1. The Long March (長征) to 3D Video • Leonardo Chiariglione • Speech at 3D Systems and Applications, Seoul – 2014/05/28

  2. It has already been not a short march • Analogue: • Printing • Photography • Telegraphy • Telephony • Audio recording • Radio • Television • Video recording • Digital: • Video conference • Video telephony • Video interactive • Television • (3D TV)

  3. The dimensions of future media • Time/space resolution • Screen content • Colour • Brightness • Scalability • 3D Video • 3D Audio • Metadata • File format • Sensors/actuators • Human interaction • Fusion of real & virtual • Detection/analysis • Linking • Energy saving • User profile

  4. There has been progress in resolution • QSIF • SIF • Standard Definition (interlace) • High Definition (Interlaced/progressive) • 4k (Progressive) • 8k (Progressive)
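
To put the resolution ladder in numbers, the following is a minimal sketch that counts pixels per frame for each format. The slide names the formats only; the frame sizes below are assumptions (common 525-line variants for QSIF, SIF and SD, and the usual UHD sizes for 4k and 8k).

```python
# Assumed frame sizes (width, height); not taken from the slides.
FORMATS = {
    "QSIF": (176, 120),
    "SIF": (352, 240),
    "SD": (720, 480),
    "HD": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def pixels_per_frame(name):
    """Number of pixels in one frame of the named format."""
    width, height = FORMATS[name]
    return width * height


def growth_over(base, name):
    """How many times more pixels `name` has than `base`."""
    return pixels_per_frame(name) / pixels_per_frame(base)


for name in FORMATS:
    print(f"{name:>4}: {pixels_per_frame(name):>10,} pixels, "
          f"{growth_over('QSIF', name):7.1f}x QSIF")
```

Under these assumed sizes, each step from HD to 4k and from 4k to 8k multiplies the pixel count by four.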

  5. The cost of being digital • Video • Audio

  6. Compression is making progress affordable

  7. Are there limits to compression? • Input bandwidth to humans • Eyes: 2 channels of 430–790 THz • Ears: 2 channels of 20 Hz – 20 kHz • A nerve fiber connecting senses to the brain can transmit a new impulse every ~6 ms = 167 spikes/s (1 bit ~16 spikes) • Eye • 1.2 M fibers transmit 10 bit/s each • An eye sends ~12 Mbit/s to the brain • Ear • 30 k fibers in the cochlear nerve • An ear sends ~300 kbit/s to the brain
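
The arithmetic behind these figures can be retraced with a short sketch. All inputs come from the slide (one impulse every ~6 ms, ~16 spikes per bit, 1.2 M optic fibers, 30 k cochlear fibers); the variable and function names are illustrative only.

```python
SPIKE_INTERVAL_S = 0.006   # a new impulse every ~6 ms (from the slide)
SPIKES_PER_BIT = 16        # 1 bit ~ 16 spikes (from the slide)

spikes_per_second = 1 / SPIKE_INTERVAL_S                 # about 167 spikes/s
bits_per_fiber = spikes_per_second / SPIKES_PER_BIT      # about 10 bit/s


def sense_bitrate(fibers):
    """Aggregate bit/s carried by a nerve with the given number of fibers."""
    return fibers * bits_per_fiber


eye_bitrate = sense_bitrate(1_200_000)   # optic nerve, one eye
ear_bitrate = sense_bitrate(30_000)      # cochlear nerve, one ear

print(f"per fiber: {bits_per_fiber:.1f} bit/s")
print(f"eye: {eye_bitrate / 1e6:.1f} Mbit/s")
print(f"ear: {ear_bitrate / 1e3:.0f} kbit/s")
```

Without intermediate rounding the sketch gives slightly higher values than the slide, which rounds the per-fiber rate down to 10 bit/s before multiplying.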

  8. Sensors-to-brain bitrates • Eye: 430–790 THz, 1.2M nerve fibers, ~12 Mbit/s • Ear: 0.020–20 kHz, 30k nerve fibers, ~0.3 Mbit/s

  9. High Dynamic Range and Wider Color Gamut • Higher Dynamic Range and Wider Color Gamut can give users a better sense of “being there”, with a viewing experience closer to real life experience • Light bulb > 10,000 nits • Surface lit in the sunlight > 100,000 nits • Night sky < 0.005 nits • Question: if dynamic ranges and volumes of the color gamut increase significantly, are existing MPEG video coding standards able to efficiently support future needs?
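
To give a feel for the span between these luminance levels, the sketch below converts them into a contrast ratio and into photographic stops (doublings of luminance). The stop measure is not used in the slides and is introduced here as an assumption; the slide gives bounds ("greater than", "less than"), so the results are lower bounds.

```python
import math

# Luminance bounds quoted on the slide, in nits (cd/m^2).
LUMINANCE_NITS = {
    "night sky": 0.005,
    "light bulb": 10_000,
    "sunlit surface": 100_000,
}


def contrast_ratio(dark, bright):
    """Ratio between two luminance levels."""
    return bright / dark


def stops_between(dark, bright):
    """Number of luminance doublings separating the two levels."""
    return math.log2(contrast_ratio(dark, bright))


dark = LUMINANCE_NITS["night sky"]
bright = LUMINANCE_NITS["sunlit surface"]
print(f"contrast ratio: {contrast_ratio(dark, bright):,.0f}:1")
print(f"stops: {stops_between(dark, bright):.1f}")
```

With the slide's figures, the range from night sky to a sunlit surface spans at least 24 doublings of luminance.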

  10. Wider Color Gamut • ITU-R BT.709 • ITU-R BT.2020

  11. Dynamic Range: Examples • Bright areas can have > 10,000 cd/m² luminance • Dark areas can have < 0.01 cd/m² luminance

  12. Screen Content applications • Wireless display • Companion screen • Control rooms with high resolution display wall • Digital operating room (DiOR) • Virtual desktop infrastructure (VDI) • Screen/desktop sharing and collaboration • Cloud computing and gaming • Factory automation display • Supervisory control and data acquisition (SCADA) display • Automotive/navigation display • PC over IP (PCoIP) • Ultra-thin client • Remote sensing

  13. Use case #1: Hi-res display wall

  14. Use case #2: collaboration

  15. Use case #3: DiOR

  16. Where we are • January 2014: Joint Call for Proposals for Coding of Screen Content • April 2014: Proposals evaluation • Conclusion: evidence that significantly improved coding efficiency can be obtained by exploiting screen content characteristics with novel dedicated coding tools • April 2014: Standardization plan and tentative timeline • First Test Model: Apr. 2014 • PDAM: Oct. 2014 • DAM: Feb. 2015 • FDAM: Oct. 2015

  17. Test sequence #1 (text and graphics with motion)

  18. Test sequence #2 (text and graphics with motion)

  19. Test sequence #3 (mixed content)

  20. Test sequence #4 (animation)

  21. MPEG standards for coding multiple cameras • A long history, starting from MPEG-2 (mid 1990s) • MPEG standards (existing and under development) • Multiview coding – can only display views captured at the source • Depth-based coding – can also display a limited number of additional views • Camera arrangement: cameras are assumed to be linearly arranged

  22. Free viewpoint television (FTV)/1 • Free viewpoint television (FTV): a hypothetical 3D transmission system that enables a viewer to select arbitrary viewpoints, inside and outside a scene • FTV requires many technologies, not just from MPEG • A 3D video format supporting the generation of views not already included in the bitstream generated by the encoder would be a major enabler for FTV • Purpose of MPEG FTV exploration: to develop the know-how to enable MPEG to develop the said 3D video format

  23. Free viewpoint television (FTV)/2 • Areas considered in the MPEG FTV exploration • Compare and evaluate the depth quality attainable for general camera arrangements • Evaluate view synthesis algorithms and improve their performance • Investigate the coding efficiency of the most promising coding technologies currently available • Investigate the influence of mis-registration on view synthesis performance • Investigate the representation capability of BIFS to clarify the elements that need to be standardized

  24. FTV Seminar: A Viewing Revolution in the Making • Date: 8 July 2014, 14:00-18:00 • Venue: Main Hall B, Sapporo Convention Center, Sapporo, Japan • Exhibition of FTV demos: Room 101, 10:00-17:00, July 1 to 4

  25. 3D Audio – NHK Loudspeaker Array Frame

  26. Parallel worlds • For centuries humans have been building two different types of worlds • Physical • Informational: Knowledge, Books, Films, Music

  27. Immersion • A definition of immersion: a state in which the connections of a human with • the Physical world are severed • the Informational world are activated

  28. How far is immersion progressing? Fairly… …or too far?

  29. Can we reconnect the two worlds? • Smartphones • Enable universal access to the Informational world while sensing also the Physical world • Enhance history and meaning of the real world with powerful digital elements • Let’s create two-way bridges • Extend reality to virtual • Add reality to virtual • Physical & Informational

  30. Functions of an Augmented Reality browser • Retrieve scenario from the internet • Start video acquisition and track objects • Recognise objects and recover camera pose • Get streamed 3D graphics and compose new scenes • Get input from various sensors • Access interaction possibilities and objects from a remote server • Adapt to offer optimal AR experience

  31. The AR technology chain (diagram components) • Local Real World Environment • Local Sensors & Actuators • Remote Real World Environment • Remote Sensors & Actuators • Media Servers • Service Servers • Authoring Tools • ARAF Browser • User • MPEG ARAF: Augmented Reality Application Format

  32. Augmented Reality Application Format (ARAF) • A set of MPEG-4 scene graph nodes • Audio, image, video, graphics, programming, communication, user interactivity, animation • Map, MapMarker, Overlay, ReferenceSignal, ReferenceSignalLocation, CameraCalibration, AugmentedRegion • Connection to sensors defined in MPEG-V • Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude, Local camera(s) • Compressed media • Image, (3D) sound, (3D) video, 2D/3D graphics

  33. The whole used to be the message • Classic books: the value is in the content as a whole

  34. Today the link adds value to the message • On-line knowledge: the value is in the link

  35. The video used to be the message • Classic video content: the value is in the content as a whole

  36. Next the link will add value to the video message • New video content: the value is in the link • From EU FP7 BRIDGET project

  37. An unequal fight • Many new services – all more demanding in bandwidth • Compression improves, but cannot cope with all the demands just by itself • UHD is 4 times the uncompressed bitrate of HD, but HEVC “only” compresses twice as well as AVC • And we have HDR, WCG, SCC, FTV… • At prime time 30% of USA internet is taken by Netflix traffic • We need more tools to solve the problem
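
The UHD versus HEVC point can be made concrete with a small calculation. The two ratios (4x raw bitrate, 2x compression gain) are from the slide; the 8 Mbit/s AVC HD stream is a hypothetical figure chosen only for illustration.

```python
RAW_GROWTH_UHD_OVER_HD = 4   # from the slide: UHD is 4x the uncompressed bitrate of HD
HEVC_GAIN_OVER_AVC = 2       # from the slide: HEVC compresses twice as well as AVC


def compressed_growth(raw_growth, codec_gain):
    """Factor by which the compressed bitrate still grows after the codec upgrade."""
    return raw_growth / codec_gain


def uhd_hevc_bitrate(hd_avc_bitrate_mbps):
    """Bitrate of a UHD HEVC stream matching a given HD AVC stream, in Mbit/s."""
    growth = compressed_growth(RAW_GROWTH_UHD_OVER_HD, HEVC_GAIN_OVER_AVC)
    return hd_avc_bitrate_mbps * growth


HD_AVC_MBPS = 8.0  # hypothetical HD stream bitrate, not from the slides
print(f"net growth: {compressed_growth(4, 2):.1f}x")
print(f"UHD/HEVC stream: {uhd_hevc_bitrate(HD_AVC_MBPS):.1f} Mbit/s")
```

Even with the newer codec, the compressed stream doubles in size, which is the gap the slide says compression alone cannot close.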

  38. The mobile industry perspective • 10 x more spectrum × 10 x better spectrum utilisation × 10 x more base stations = 1000 x more capacity

  39. Making the network smarter • Video has the lion’s share of internet traffic – more so as we add more dimensions to the user experience • We need to cope with (human-vehicle) mobility • More and more of human life happens on the move • We need new, smarter approaches instead of just throwing more network capacity at the problem, beyond: • Digital video recording (on premises or networked) • Peer-to-Peer (P2P) Overlays • Content Distribution Networks (CDNs)

  40. Video and Information Centric Networking • Migration path from today’s IP infrastructure to pub/sub support for ICN • Same content available at different network locations • Client - content - network mobility under energy consumption constraints • Diagram labels: IP Network, Information Centric Network • From FP7/NICT EU-JAPAN GreenICN project

  41. Green MPEG (diagram components) • Media Pre-processor • Media Encoder • Media Decoder • Presentation Subsystem • Green Metadata Generator • Green Metadata Extractor • Power control • Power optimization module • Signals: Media, Encoded Media, Green Metadata, Green Feedback

  42. http://mpeg.chiariglione.org/
