This research explores the challenges of streaming high-quality video and audio in immersive environments, along with possible solutions, and aims to create a testbed for immersive applications. The focus is on distributed immersive performances, such as remote musical duets. The research investigates low-latency transmission, synchronization, and recording of media streams across networks. The goal is to enable immersive experiences with high-fidelity audio and high-definition video.
More Pixels and Samples: High Resolution Media Streaming. Roger Zimmermann, Data Management Research Laboratory, University of Southern California, Los Angeles, CA 90089. http://dmrl.usc.edu
Outline • Motivation • Background • Remote Media Immersion • Distributed Immersive Performance • High-performance Data Recording Architecture • Demonstration • Conclusions
Motivation • The charter of the Integrated Media Systems Center (IMSC) is “Immersipresence” • Immerse real (e.g. people) and virtual elements into a common space • Becomes much more interesting in a distributed environment • Many sub-problems: tracking, gesture recognition, data management, … • Video and audio are an important component
What is the problem? • Live streaming is either • Low to medium quality, or • Very expensive, i.e., there are only a few people to call … • Other obstacles • Complicated (not like the telephone) • Often requires room engineering • Network bandwidth is not available • Some of the technical constraints can and will be solved
Ex.: Network Infrastructure • UTOPIA (Utah Telecommunications Open Infrastructure Agency): public works project to provide fiber to the home (FTTH). • SuperNet, Alberta, Canada. Public project to provide a high speed Internet infrastructure. • NSF sponsored workshop, Oct. 23-24, 2003, Chicago, Illinois. The importance of “broaderband” networks is recognized.
Research Timeline • 2002: Jun 2-3, Unveiling of RMI Demonstration; Oct 29, Internet2 Meeting: RMI Demonstration; Dec 28, DIP Experiment 1: Distributed Duet • 2003: Jan 18, Recording from Stream; Jan 19, DIP Experiment 2: Remote Master Class; Jun 2-3, DIP Experiment 3: Duet with Audience • 2004: Jan 29, APAN Meeting: HYDRA Experiment
What is the RMI? “The goal of the Remote Media Immersion system is to build a testbed for the creation of immersive applications.” Immersive application aspects: • Multi-modal environment (aural, visual, haptic, …) • Shared space with virtual and real elements • High fidelity • Geographically distributed • Interactive
RMI Challenges • Immersive, high-quality video acquisition and rendering • High Definition video 1080i and 720p (40 Mb/s) • Immersive, high-quality audio acquisition and rendering • 10.2 channels of uncompressed audio (12 Mb/s) • Storage and transmission of media streams across networks • Synchronization between streams (A/V, A/A, V/V)!
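The audio figure above can be sanity-checked with the usual PCM bandwidth formula (channels × sample rate × bits per sample). The slides do not state the sample rate or bit depth, so the 48 kHz / 24-bit values below are assumptions for illustration only; they land near the roughly 1 Mb/s per channel implied by the slide. A minimal sketch:

```python
def pcm_bitrate_mbps(channels: int, sample_rate_hz: int, bits_per_sample: int) -> float:
    """Raw (uncompressed) PCM bandwidth in Mb/s, ignoring packet overhead."""
    return channels * sample_rate_hz * bits_per_sample / 1e6


# A 10.2 layout is treated here as 12 discrete channels (10 main + 2 LFE).
# ASSUMPTION: 48 kHz sampling and 24-bit samples; the talk does not give these.
per_channel = pcm_bitrate_mbps(1, 48_000, 24)    # about 1.15 Mb/s
ten_point_two = pcm_bitrate_mbps(12, 48_000, 24)  # about 13.8 Mb/s
```

With other plausible settings (for example 16-bit samples) the total drops below the slide's 12 Mb/s, so the quoted number is best read as a round figure.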
RMI Experimental Setup (diagram labels: ISI East, IMSC) • Synchronized immersive audio and HDTV streamed playback from Yima server over Internet2 • 16 channels of immersive audio, uncompressed at 16 Mb/s • 1920x1080i HDTV content, MPEG-2 compressed at 40 Mb/s • Control of end-to-end process: capturing, network interface, transmission, rendering
Internet2 Fall '02 Member Meeting • Video: HDTV 1280x720p • Audio: 10.2 channel immersive sound system • New World Symphony, Miami, FL
Distributed Immersive Performance • Outgrowth of Remote Media Immersion (RMI) • Create seamless immersive environment for distributed musicians, conductor (active) and audience (passive) • Compelling relevance for any human interaction scenario: education, journalism, communications • Scenario: • Orchestra not available in town • Famous soloist cannot fit travel into schedule • Multiple soloists in different places
Challenge: network latency (figure labels: 60 ms, 20 ms, 40 ms, 30 ms, 10 ms, 30 ms)
Key observations: • Network latency maps to audio delay on stage • Video delay is zero • Challenge: • Synchronization • Transmitting low latency video of conductor to players and audience • Maintaining constant delay between players • Figure labels: Player 1, Conductor, Player 2; distances and delays 15 m: 45 ms, 15 m: 45 ms, 10 m: 30 ms
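The distance/delay pairs on this slide follow from the speed of sound: roughly 3 ms per meter. The sketch below converts in both directions, so a network latency can be read as an equivalent separation on a physical stage. The 343 m/s constant is an assumption (dry air at about 20 °C), not a value given in the talk.

```python
# ASSUMPTION: speed of sound in dry air at about 20 degrees C.
SPEED_OF_SOUND_M_PER_S = 343.0


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m through air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def equivalent_distance_m(latency_ms: float) -> float:
    """Stage separation whose acoustic delay equals the given latency."""
    return latency_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S
```

For example, `acoustic_delay_ms(15)` is a little under 44 ms, which the slide rounds to 45 ms, and a 30 ms one-way network latency corresponds to players standing about 10 m apart.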
Barriers and Requirements 1. Real-time continuous media (CM) stream transmission (network protocol) with low latency 2. Precise timing: GPS clock, synchronization 3. Data loss management: error concealment, FEC, retransmission, multi-path streaming 4. Many-to-many transmission capability 5. Low latency, high-quality real-time video and audio acquisition and rendering 6. Real-time CM stream recording 7. User experiments, requirements, specifications, performance evaluation
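Item 3 lists forward error correction (FEC) as one loss-management option. The talk does not say which FEC scheme was considered, so the following is only a generic illustration: a single XOR parity packet per group, which lets the receiver rebuild any one lost packet without waiting for a retransmission. The function names and the equal-length packet requirement are assumptions of this sketch.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover_missing(received: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single lost packet (marked None) from the rest plus parity."""
    if sum(p is None for p in received) != 1:
        raise ValueError("XOR parity can repair exactly one loss per group")
    present = [p for p in received if p is not None]
    return reduce(xor_bytes, present, parity)
```

The trade-off is the one the slides keep returning to: parity adds bandwidth (one extra packet per group) but avoids the round-trip delay of a retransmission.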
Distributed Immersive Performance v.1.0 - The Duet • Experiments and Objectives • Experimental testbed and demonstration system • Demonstrate and document a distributed musical performance with two musicians (a duet) • Two-way interactive video and 10.2 channel immersive audio capability • Explore other applications involving passive and active participants, such as two-site interactive meetings • Evaluate technical barriers and psychophysical effects of latency and fidelity on music and other forms of human interaction between two interconnected sites • Dennis Thurmond - USC Thornton School of Music • Elaine Chew - USC Industrial and Systems Engineering
Distributed Immersive Performance v.1.0 - The Duet • Diagram labels: Linux PC (x2), DV FireWire Camera (x3), 100BaseT campus net, 100BaseT IMSC net, 350 meters, Ramo Hall of Music (RHM 106), Powell Hall (PHE 106) • Video: NTSC resolution, 31 Mb/s DV, software decode, one-way latency: 110 ms due to DV camera compression + < 5 ms network • Audio: uncompressed, 16 or more channels at 1 Mb/s each, one-way latency: < 10 ms due to audio processing + < 5 ms network
HYDRA Streaming Architecture • Most previous work in streaming media has focused on the retrieval and playback functionality. • More and more devices directly output digital media streams: • E.g., camcorders (FireWire, USB, SDI), microphones (Bluetooth), mobile handsets (3G) • Need for a backend data stream recording/playback system (“Super TiVo”) • HYDRA (High-performance Data Recording Architecture) [ICEIS 2003]
Challenges • Variable bit rate media streams • Multi-zoned disks • Different read and write transfer rates
Live Streaming • Latency is a crucial limiting factor: • Only about 20-40 ms of latency goes unnoticed (for universal interactive applications) • Tradeoff: Latency versus bandwidth • Compression reduces bandwidth • But: high compression increases latency (e.g., interframe MPEG compression) • Approach: • Perform experiments within this design space • e.g. DV: NTSC resolution, 31 Mb/s, SW/HW codecs • e.g. uncompressed audio and video
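To make the 20-40 ms threshold concrete, the sketch below sums per-stage one-way latencies and compares them against a budget. The stage values are the upper bounds quoted on the earlier Duet slide (110 ms DV camera compression, < 10 ms audio processing, < 5 ms network); the dictionary keys and the choice of 40 ms as the budget are assumptions made for illustration.

```python
# Upper end of the 20-40 ms range quoted on the slide, used as the budget.
NOTICEABLE_LATENCY_MS = 40.0


def one_way_latency_ms(stages: dict) -> float:
    """Total one-way latency as the sum of the per-stage delays."""
    return sum(stages.values())


def within_budget(stages: dict, budget_ms: float = NOTICEABLE_LATENCY_MS) -> bool:
    """True if the summed latency stays at or below the budget."""
    return one_way_latency_ms(stages) <= budget_ms


# Upper bounds taken from the Duet slide; the stage names are illustrative.
DV_VIDEO = {"dv_camera_compression": 110.0, "network": 5.0}
UNCOMPRESSED_AUDIO = {"audio_processing": 10.0, "network": 5.0}
```

Under these numbers the uncompressed audio path (at most 15 ms) fits the budget, while the DV video path (about 115 ms) does not, which is the latency-versus-bandwidth tension the slide describes.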
Architecture: HYDRA HD Live Streaming • Acquisition and rendering PCs are both Linux based (RH 9 includes kernel support for FireWire). • MPEG transport stream extraction. • Data transport via UDP packets with single retransmissions • Diagram labels: JVC HD10U, FireWire, MPEG TS Extractor, RTP/UDP/IP, MPEG-2 Decoder, HD-SDI, VGA Display
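The slide does not describe how the MPEG TS extractor works internally. As background, an MPEG-2 transport stream is a sequence of fixed 188-byte packets, each starting with the sync byte 0x47 and carrying a 13-bit packet identifier (PID). The sketch below parses that fixed 4-byte header; the `TsHeader` type and function names are hypothetical and not taken from HYDRA.

```python
from typing import Iterator, NamedTuple

TS_PACKET_SIZE = 188  # fixed MPEG-2 transport stream packet length in bytes
SYNC_BYTE = 0x47      # first byte of every transport stream packet


class TsHeader(NamedTuple):
    pid: int                  # 13-bit packet identifier
    payload_unit_start: bool  # set when a new PES packet or section begins
    continuity_counter: int   # 4-bit per-PID counter, useful for loss detection


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the fixed 4-byte header of one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    return TsHeader(pid, bool(packet[1] & 0x40), packet[3] & 0x0F)


def iter_ts_packets(buf: bytes) -> Iterator[bytes]:
    """Yield the complete 188-byte packets contained in an aligned buffer."""
    for off in range(0, len(buf) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        yield buf[off:off + TS_PACKET_SIZE]
```

A receiver could watch `continuity_counter` per PID to notice gaps and decide when to request the single retransmission mentioned on the slide; that policy is a guess at one possible use, not a description of the actual system.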
Rendering • Solution 1: Software based rendering • Use X11 hw acceleration: XvMC (libmpeg2) • Motion compensation and iDCT with GPU • Our hw: NVIDIA FX 5200 ($100) • Performance: ~ 90 fps @ 1280x720 with 3 GHz P4
Rendering • Issues with software rendering • Precise timing: 29.97 fps • Decoding time for I, P, and B frames varies • Buffering of decoded frames necessary to achieve precise timing • Transport stream splitter and audio decoding • Video card refresh rate (timing) is independent of MPEG timing, but • Non-standard display modes are possible: 720p on Linux (16x9) • Decoding latency
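The "precise timing" point can be illustrated with exact arithmetic. The nominal 29.97 fps rate is exactly 30000/1001 frames per second, and MPEG presentation timestamps count ticks of a 90 kHz clock, so each frame lasts exactly 3003 ticks. The sketch below computes display deadlines for buffered frames without floating-point drift; the function names are illustrative and this is not the renderer described in the talk.

```python
from fractions import Fraction

FRAME_RATE = Fraction(30000, 1001)  # exact value of the nominal 29.97 fps
PTS_CLOCK_HZ = 90_000               # MPEG presentation timestamp clock


def presentation_time_s(frame_index: int) -> Fraction:
    """Exact display deadline of a frame, in seconds from the first frame."""
    return Fraction(frame_index) / FRAME_RATE


def presentation_pts(frame_index: int) -> int:
    """The same deadline expressed in 90 kHz ticks (3003 ticks per frame)."""
    return int(presentation_time_s(frame_index) * PTS_CLOCK_HZ)
```

Because I, P, and B frames take different amounts of time to decode, a renderer would decode ahead into a buffer and release each frame when the clock reaches its deadline; the buffer depth adds directly to the decoding latency listed on the slide.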
Rendering • Solution 2: Hardware based rendering • E.g.: CineCast HD board from Vela Research • Digital HD-SDI and analog RGB/YPbPr outputs • Great and stable picture (but $$$) • Genlock input for synchronization
Rendering • Issues with hardware rendering • Linux drivers hard to come by • CineCast HD board uses SCSI interface • Wrote our own SCSI extensions to the Linux SCSI Generic driver (/dev/sg0) • Decoding latency: requires 8 x 64 kB to start decoding • Consumer HD card: Telemann HiPix ($400) • But: No Linux drivers (no Windows filters?) • New Vela card: CineCast HD LE
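The 8 x 64 kB startup requirement translates into a delay that depends on the stream bitrate: the decoder cannot begin until that much data has arrived. The sketch below estimates it, assuming 1 kB = 1024 bytes and a constant arrival rate; both are assumptions, as is applying the 40 Mb/s figure quoted earlier in the talk to this board.

```python
def decoder_startup_delay_ms(bitrate_mbps: float,
                             blocks: int = 8,
                             block_kib: int = 64) -> float:
    """Time to receive blocks x block_kib KiB at a constant bitrate, in ms.

    ASSUMPTIONS: 1 kB is taken as 1024 bytes, and data arrives at exactly
    the nominal stream rate with no bursts or stalls.
    """
    bits_needed = blocks * block_kib * 1024 * 8
    return bits_needed / (bitrate_mbps * 1e6) * 1000.0
```

At 40 Mb/s this comes to roughly 105 ms before the first frame can be decoded, and the delay grows as the bitrate drops, which illustrates why a fixed-size startup buffer is a concern for low-latency live streaming.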
Distributed Immersive Performance v.2.0 - Extended Architecture • Conflicting requirements: Low latency and low bandwidth (i.e., use of compression) • Solution - two-tier architecture: • Between performers • Low latency stereo audio streaming • Low latency video streaming • Between performers and audience • High definition video streaming • Multichannel audio streaming (10.2 channel) • Recording of all streams synchronously for archival purposes and later playback.
Diagram labels: Performer 1, Performer 2, Audience, Playback and Recording; streams: stereo audio, multichannel audio, low latency / low resolution video, high latency / high resolution video
Thank You! Questions? • More info at: • Data Management Research Lab • http://dmrl.usc.edu • Integrated Media Systems Center • http://imsc.usc.edu • Acknowledgments: • Kun Fu, Beomjoo Seo, Shihua Liu, Dwipal A. Desai, Didi Shu-Yuen Yao, Mehrdad Jahangiri, Farnoush Banaei-Kashani, Rishi Sinha, Hong Zhu, Nitin Nahata, Sahitya Gupta, Vasan N. Sundar,