Data Formats and Codecs

  1. INF SERV – Media Storage and Distribution Systems: Data Formats and Codecs 1/9 – 2003

  2. Why codecs and formats? • Codecs (coders/decoders) • Determine how information is represented • Important for servers and distribution systems • Required sending speed • Amount of loss allowed • Buffers required • … • Formats • Determine how data is stored • Important for servers and distribution systems • Where is the data? • Where is the data about the data?

  3. Media data

  4. Media data • Medium: "thing in the middle"; here: a means to distribute and present information • Media affect human-computer interaction • The mantra of multimedia users • Speaking is faster than writing • Listening is easier than reading • Showing is easier than describing

  5. Dependence of Media • Time-independent media (discrete media) • Text • Graphics • Time-dependent media (continuous media) • Audio • Video • Animation • Interdependent media (multimedia) • "Continuous" refers to the user's impression of the data, not necessarily to its representation • Combined video and audio is multimedia; the relations between them must be specified

  6. Properties of a Multimedia System • Flexibility • Provide mechanisms to handle all kinds of media, in particular, discrete and continuous media • A VCR and a desktop publishing system for text and graphics are not multimedia systems • An editor with voice annotation is a multimedia system • Integration • Independent media storage • Computer-controlled media combination • Definition • A multimedia system is characterized by the integrated computer-controlled handling of independent discrete and continuous media

  7. Multimedia: Not Your Ordinary Data • Multimedia is different from traditional digital data: • High data volume • Continuous streaming • Several related streams • Quality of service

  8. High Data Volume • Throughput: • Higher volume than for traditional data • Longer transactions than for traditional data • Requires • Performance and bandwidth • Resource management techniques • Compression • Typical values • Uncompressed video: 140 – 216 Mbit/s • Uncompressed audio (CD): 1.4 Mbit/s • Uncompressed speech: 64 Kbit/s • Compressed audio & video (VoD): down to 1.2 – 4 Mbit/s • Compressed audio & video (Conf.): down to 128 Kbit/s • Compressed speech: down to 6.2 Kbit/s

  9. Coding for distribution

  10. Compression - Necessity • E.g., video sequence • 25 images/sec. • PAL standard • 3 byte/pixel • YUV (luminance + 2 chrominance values) • RGB (red-green-blue values) • Image resolution 640 * 480 pixel • Data rate = 640 * 480 * 3 Byte * 25/s = 23040000 byte/s ~ 22 MByte/s • Approx. 1/100 stream over ADSL • Approx. 1/16 stream over Ethernet • Approx. 1/2 stream over Fast Ethernet • Compression is necessary
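
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the nominal link rates (2, 10 and 100 Mbit/s) are assumptions that do not appear on the slide, so the resulting fractions only roughly match the slide's "approx." figures.

```python
# Uncompressed video data rate, using the numbers from the slide.
WIDTH, HEIGHT = 640, 480      # pixels
BYTES_PER_PIXEL = 3           # YUV or RGB, one byte per component
FRAMES_PER_SECOND = 25        # PAL

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND
mbit_per_second = bytes_per_second * 8 / 1_000_000

# Assumed nominal link rates in Mbit/s (not stated on the slide).
ASSUMED_LINK_RATES = {"ADSL": 2, "Ethernet": 10, "Fast Ethernet": 100}

def fraction_of_stream(link_mbit):
    """Share of one uncompressed stream that fits on a link of the given rate."""
    return link_mbit / mbit_per_second

if __name__ == "__main__":
    print(bytes_per_second, "byte/s =", mbit_per_second, "Mbit/s")
    for name, rate in ASSUMED_LINK_RATES.items():
        print(f"{name}: about 1/{1 / fraction_of_stream(rate):.0f} of a stream")
```

None of the assumed links carries even one full uncompressed stream, which is the slide's point.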

  11. Compression – General Requirements • Dependence on application type: • Dialogue mode • Retrieval mode

  12. Compression – Mode Dependent Requirements • Dialogue and retrieval mode requirements: • Synchronization of audio, video, and other media • Dialogue mode requirements: • End-to-end delay < 150ms • Compression and decompression in real-time • Symmetric • Retrieval mode requirements: • Fast forward and backward data retrieval • Random access within 1/2 s • Asymmetric • We look mainly at retrieval mode!

  13. Compression Categories

  14. Basic Encoding Steps

  15. Run-Length Coding • Assumption • Long sequences of identical symbols • Example
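
The example for this slide did not survive in the transcript. As a stand-in, here is a minimal run-length coder in Python; the function names and the input string are illustrative only, not taken from the slides.

```python
def rle_encode(symbols):
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([s, 1])       # start a new run
    return [(s, n) for s, n in runs]

def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the original sequence."""
    out = []
    for s, n in runs:
        out.extend([s] * n)
    return out

if __name__ == "__main__":
    data = "AAAABBBCCD"
    encoded = rle_encode(data)
    print(encoded)
    print("".join(rle_decode(encoded)) == data)
```

The scheme only pays off when the assumption on the slide holds: with few or short runs, the (symbol, length) pairs take more space than the input.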

  16. Bit-Plane Coding • Assumption • Even longer sequences of identical bits • Example
        10,0,6,0,0,3,0,2,2,0,0,2,0,0,1,0, … ,0,0   (absolute)
         0,x,1,x,x,1,x,0,0,x,x,1,x,x,0,x, … ,x,x   (sign bits)
         1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0   (MSB)     (0,1)
         0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0   (MSB-1)   (2,1)
         1,0,1,0,0,1,0,1,1,0,0,1,0,0,0,0, … ,0,0   (MSB-2)   (0,0)(1,0)(2,0)(1,0)(0,0)(2,1)
         0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0, … ,0,0   (MSB-3)   (5,0)(8,1)
      • Up to 20% savings over run-length coding can be achieved
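
The pairs in this example are consistent with reading each one as (number of zeros before the next 1, flag marking the last 1 of the plane). The Python sketch below follows that reading, which is an interpretation and is not spelled out on the slide. The signs of the input values are likewise assumed from the sign-bit row, taking 1 to mean negative.

```python
def bit_planes(values, num_bits):
    """Split the magnitudes into bit planes, most significant plane first."""
    return [[(abs(v) >> b) & 1 for v in values]
            for b in range(num_bits - 1, -1, -1)]

def run_eop_pairs(plane):
    """Code one plane as (run, eop) pairs.

    run = number of zeros preceding each 1,
    eop = 1 for the last 1 in the plane, else 0.
    """
    ones = [i for i, bit in enumerate(plane) if bit]
    pairs, prev = [], -1
    for k, pos in enumerate(ones):
        pairs.append((pos - prev - 1, 1 if k == len(ones) - 1 else 0))
        prev = pos
    return pairs

if __name__ == "__main__":
    # First 16 values of the slide's example; the elided tail is all zeros.
    values = [10, 0, -6, 0, 0, -3, 0, 2, 2, 0, 0, -2, 0, 0, 1, 0]
    for plane in bit_planes(values, 4):
        print(plane, run_eop_pairs(plane))
```

Trailing zeros after the last 1 of a plane cost nothing in this representation, which is where the saving over plain run-length coding comes from.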

  17. Huffman Coding • Assumption • Some symbols occur more often than others • E.g., character frequencies of the English language • Fundamental principle • Frequently occurring symbols are coded with shorter bit strings

  18. Huffman Coding • Example • Characters to be encoded: • A, B, C, D, E • Probability to occur: • p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15

  19. Huffman • Table and example of application to data stream
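
The table for this slide is missing from the transcript. The Python sketch below builds a Huffman code for the probabilities given on slide 18. The exact bit strings depend on how ties are broken, so they may differ from the original table, but the code lengths and the average length do not.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bit string} by repeatedly merging the two least likely nodes."""
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

def encode(message, codes):
    """Concatenate the code words of all symbols in the message."""
    return "".join(codes[s] for s in message)

if __name__ == "__main__":
    probs = {"A": 0.3, "B": 0.3, "C": 0.1, "D": 0.15, "E": 0.15}
    codes = huffman_code(probs)
    average = sum(probs[s] * len(codes[s]) for s in probs)
    print(codes)
    print("average bits per symbol:", average)
    print(encode("ABACADE", codes))
```

For these probabilities the average comes to 2.25 bits per symbol, against 3 bits for a fixed-length code over five symbols.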

  20. JPEG • "JPEG": Joint Photographic Experts Group • International Standard: • For digital compression and coding of continuous-tone still images: • Gray-scale • Color • Since 1992 • Joint effort of: • ISO/IEC JTC1/SC2/WG10 • Commission Q.16 of CCITT SGVIII • A compression ratio of about 10:1 yields reasonable results

  21. JPEG • Very general compression scheme • Independence of • Image resolution • Image and pixel aspect ratio • Color representation • Image complexity and statistical characteristics • Well-defined interchange format of encoded data • Implementation in • Software only • Software and hardware • “Motion JPEG” for video compression • Sequence of JPEG-encoded images

  22. JPEG • Sequence of compression steps • Different resolutions possible • Lossy or lossless mode • Lossless compression factor ~1.6:1 • Symmetrical codec

  23. JPEG – Baseline Mode: Quantization • Use of quantization tables for the DCT coefficients • Map an interval of real numbers to one integer number • Allows a different granularity for each coefficient
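
A minimal Python sketch of table-based quantization: each coefficient is divided by its own table entry and rounded, so every real number in an interval of that width maps to the same integer. The coefficient values and step sizes below are illustrative assumptions, not taken from the slides.

```python
def quantize(coefficients, table):
    """Map each real coefficient to an integer level using its own step size."""
    return [round(c / q) for c, q in zip(coefficients, table)]

def dequantize(levels, table):
    """Reconstruct approximate coefficients; the rounding loss is not recoverable."""
    return [level * q for level, q in zip(levels, table)]

if __name__ == "__main__":
    # Hypothetical DCT coefficients and step sizes (coarser for higher frequencies).
    coefficients = [-415.4, -30.2, 61.2, 27.0, 5.6, -2.1, 0.4, 1.2]
    table = [16, 11, 10, 16, 24, 40, 51, 61]
    levels = quantize(coefficients, table)
    print(levels)                     # small high-frequency values become 0
    print(dequantize(levels, table))
```

Larger step sizes for the high-frequency coefficients turn most of them into zeros, which the later run-length and entropy coding steps exploit.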

  24. JPEG – 4 Modes of Compression

  25. Motion JPEG • Use series of JPEG frames to encode video • Pro • Lossless mode – editing advantage • Frame-accurate seeking – editing advantage • Arbitrary frame rates – playback advantage • Arbitrary frame skipping – playback advantage • Scaling through progressive mode – distribution advantage • Min transmission delay = 1/framerate – conferencing advantage • Supported by popular frame grabbers • Contra • Series of JPEG-compressed images • No standard, no specification • Worse, several competing quasi-standards • No relation to audio • No inter-frame compression

  26. H.261 (px64) • International Standard • Video codec for video conferences at p x 64 kbit/s (ISDN) • Real-time encoding/decoding, max. signal delay of 150 ms • Constant data rate • Intraframe coding • DCT as in JPEG baseline mode • Interframe coding, motion estimation • Search for a similar macroblock in the previous image and compare • Position of this macroblock defines the motion vector • Difference between the similar macroblocks is coded
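
The motion-estimation step can be sketched as an exhaustive block search in Python. The cost measure (sum of absolute differences), the tiny 2x2 block and the search range are assumptions chosen for brevity. The standard does not prescribe a search strategy, and real macroblocks are 16x16.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in prev."""
    return sum(abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def find_motion_vector(prev, cur, bx, by, size, search):
    """Try every displacement within +/- search and return (dx, dy, cost) of the best match."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width and
                      0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]

if __name__ == "__main__":
    # Synthetic 6x6 frame; the current frame is the previous one shifted right by one pixel.
    prev = [[10 * y + x * x for x in range(6)] for y in range(6)]
    cur = [[prev[y][max(x - 1, 0)] for x in range(6)] for y in range(6)]
    print(find_motion_vector(prev, cur, bx=2, by=2, size=2, search=2))
```

A cost of zero means the block is fully described by its motion vector; otherwise the remaining difference between the two blocks is what gets DCT coded.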

  27. MPEG (Moving Picture Experts Group) • International Standard • Compression of audio and video for playback (1.5 Mbit/s) • Real-time decoding • Sequence of I-, P-, and B-frames • Random access • at I-frames • at P-frames: i.e. decode previous I-frame first • at B-frames: i.e. decode I- and P-frames first

  28. MPEG-2 • From MPEG-1 to MPEG-2 • Improvement in quality • From VCR to TV to HDTV • No CD-ROM based constraints • Higher data rates • MPEG-1: about 1.5 MBit/s • MPEG-2: 2-100 MBit/s • Evolution • 1994: International Standard • Also later known as H.262 • Prominent role for digital TV in DVB (digital video broadcasting) and DVD (digital video disk) • Commercial MPEG-2 realizations available

  29. MPEG-2 • Beyond MPEG-1: • Higher quality encoding • Higher data rates • Interleaved modes • Use cases • Broadcast quality production • DVB-T: Terrestrial • DVB-S: Satellite • DVB-C: Cable • Program Stream • for post-processing, storage, and DVD distribution • Transport Stream • for broadcasting, error resilience • Scaling: • Signal-to-noise ratio (SNR) scaling - progressive compression, error-correcting codes • Spatial scaling - several pixel resolutions • Temporal scaling - frame dropping

  30. MPEG-4 • MPEG-4 (ISO 14496) originally • Targeted at systems with very scarce resources • To support applications like • Mobile communication • Videophone and E-mail • Max. data rates and dimensions (roughly) • Between 4800 and 64000 bits/s • 176 columns x 144 lines x 10 frames/s • Further demand • To provide enhanced functionality to allow for analysis and manipulation of image contents

  31. MPEG-4 • Hence: find standardized ways to • Represent units of aural, visual or audiovisual content • "audio/visual objects" or AVOs • object coding independent of other objects, surroundings and background • natural and synthetic objects • Compose these objects together • i.e. creation of compound objects that form audiovisual scenes • Multiplex and synchronize the data associated with AVOs • for transportation over network channels providing a QoS (Quality-of-Service) • Interact with the audiovisual scene generated at the decoder's site

  32. MPEG-4: Scope • Definition of • „System Decoder Model“ • specification for decoder implementations • Description language • binary syntax of an AV object’s bitstream representation • scene description information • Corresponding concepts, tools and algorithms, especially for • content-based compression of simple and compound audiovisual objects • manipulation of objects • transmission of objects • random access to objects • animation • scaling • error robustness

  33. MPEG-4: Scope • Targeted bit rates for video and audio: • VLBV core • „Very Low Bit-rate Video“ • 5 - 64 Kbit/s • image sequences with CIF resolution and up to 15 frames/s • Higher-quality video • 64 Kbit/s - 4 Mbit/s • quality like digital TV • Natural audio coding • 2 - 64 Kbit/s

  34. MPEG-4: Video and Image Encoding • Encoding / decoding of • Rectangular images and video • coding similar to MPEG-1/2 • motion prediction • texture coding • Images and video of arbitrary shape • as done in conventional approach • 8x8 DCT or shape-adaptive DCT • plus coding of shape and transparency information • Encoder • Must generate timing information • speed of the encoder clock = time base • desired decoding times and/or expiration times • by using time stamps attached to the stream • Can specify the minimum buffer resources needed for decoding

  35. MPEG-4: Composition of Scenes • Scene description includes: • Tree to define hierarchical relationships between objects • Objects’ positions in space and time • by converting the objects’ local coordinate system into a global coordinate system • Attribute value selection • e.g. pitch of sound, color, texture, animation parameters • Description based on some VRML concepts • VRML = „Virtual Reality Modeling Language“ • Interaction with scenes • e.g. change viewing point, drag object, start/stop streams, select language

  36. MPEG-4: Example of a Composition

  37. MPEG-4: Synthetic Objects • Visual objects: • Virtual parts of scenes • e.g. virtual background • Animation • e.g. animated faces • Audio objects: • „Text-to-speech“ • speech generation from given text and prosodic parameters • face animation control • „Score driven synthesis“ • music generation from a score • more general than MIDI • Special effects

  38. MPEG-4: Error Handling • Mobile communication: • Low bit-rate (< 64 Kbps) • Error-prone • MPEG-4 concepts for error handling: • Resynchronization • enables receiver to „tune in“ again • based on markers within bitstream • Data recovery • enables receiver to reconstruct lost data • encode data in an error-resilient manner • Error concealment • enables receiver to bridge gaps in data • e.g. by repeating parts of old frames

  39. Network-aware coding

  40. Network-aware coding • Adapt to reality of the Internet • Content • Is created once, off-line • Is sent many times, under different circumstances • No guarantees concerning • Throughput • Jitter • Packet loss • Sending rate • Must adhere to rules • Often: don’t send more than TCP would • Can’t send at the best available encoding rate

  41. Approaches • Simulcast • Scalable coding • SNR Scalability • Temporal Scalability • Spatial Scalability • Fine Grained Scalability • Multiple Description Coding

  42. Simulcast • Choose a set of sending rates • During content creation • Encode content in best possible quality below that sending rate • During transmission • Choose version with the best admissible quality • [Figure: quality vs. sending rate; 3 simulcast rates and a single rate codec compared with the best possible quality at each possible sending rate]
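
Version selection at transmission time reduces to picking the highest pre-encoded rate that does not exceed the rate currently allowed. A minimal Python sketch follows; the three rates are hypothetical values, not from the slides.

```python
def choose_version(encoded_rates_kbit, available_kbit):
    """Return the highest pre-encoded rate that fits, or None if none fits."""
    admissible = [r for r in encoded_rates_kbit if r <= available_kbit]
    return max(admissible) if admissible else None

if __name__ == "__main__":
    rates = [128, 512, 1500]          # hypothetical simulcast versions in kbit/s
    for available in (100, 700, 2000):
        print(available, "->", choose_version(rates, available))
```

The gap between the allowed rate and the chosen version (700 versus 512 kbit/s here) is bandwidth that simulcast cannot use, which motivates the scalable approaches on the following slides.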

  43. Scalable coding • Typically used as layered coding • A base layer • Provides basic quality • Must always be transferred • One or more enhancement layers • Improve quality • Transferred if possible • [Figure: quality vs. sending rate; base layer and enhancement layer compared with the best possible quality at each possible sending rate]

  44. Temporal Scalability • Frames can be dropped • In a controlled manner • Frame dropping does not violate dependencies • Low gain example: B-frame dropping in MPEG-1
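
B-frame dropping is safe in MPEG-1 because no other frame is predicted from a B-frame. A small Python sketch, using a hypothetical frame pattern:

```python
def drop_b_frames(frame_types):
    """Remove B-frames; I- and P-frames keep all the references they need."""
    return [f for f in frame_types if f != "B"]

if __name__ == "__main__":
    gop = list("IBBPBBPBB")           # hypothetical frame pattern
    kept = drop_b_frames(gop)
    print("".join(kept), f"({len(kept)}/{len(gop)} of the frame rate)")
```

The gain is low because B-frames are the smallest frames: dropping two thirds of the frames in this pattern removes much less than two thirds of the data.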

  45. SNR Scalability • SNR – signal-to-noise ratio • Idea • Base layer • Is regularly DCT encoded • A lot of data is removed using quantization • Enhancement layer is regularly DCT encoded • Run Inverse DCT on quantized base layer • Subtract from original • DCT encode the result • If enhancement layer arrives at client • Add base and enhancement layer before running Inverse DCT
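
The steps on this slide can be sketched in Python on a one-dimensional block of 8 samples. The orthonormal 8-point DCT, the uniform quantizer step sizes (16 and 2) and the sample values are all assumptions for illustration. Real codecs work on 8x8 blocks with quantization tables.

```python
import math

N = 8

def scale(k):
    """Normalization factor that makes the DCT orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(x):
    return [scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)) for k in range(N)]

def idct(X):
    return [sum(scale(k) * X[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]

def quantize(coeffs, step):
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [level * step for level in levels]

def encode_layers(samples, base_step, enh_step):
    """Base layer: coarse DCT coding. Enhancement layer: DCT coding of what the base lost."""
    base = quantize(dct(samples), base_step)
    base_pixels = idct(dequantize(base, base_step))         # inverse DCT on quantized base
    residual = [s - b for s, b in zip(samples, base_pixels)]  # subtract from original
    enh = quantize(dct(residual), enh_step)                 # DCT encode the result
    return base, enh

def decode(base, enh, base_step, enh_step):
    """Add the layers in the coefficient domain, then run a single inverse DCT."""
    coeffs = dequantize(base, base_step)
    if enh is not None:
        coeffs = [c + e for c, e in zip(coeffs, dequantize(enh, enh_step))]
    return idct(coeffs)

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

if __name__ == "__main__":
    samples = [52, 55, 61, 66, 70, 61, 64, 73]
    base, enh = encode_layers(samples, 16, 2)
    print("base only:", squared_error(samples, decode(base, None, 16, 2)))
    print("base + enhancement:", squared_error(samples, decode(base, enh, 16, 2)))
```

Because the DCT is linear, the two layers can be added as coefficients before one inverse transform, as the last bullet on the slide says.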

  46. Spatial Scalability • Idea • Base layer • Downsample the original image (code only 1 pixel instead of 4) • Send like a lower resolution version • Enhancement layer • Subtract base layer pixels from all pixels • Send like a normal resolution version • If enhancement layer arrives at client • Decode both layers • Add layers • [Figure: example pixel values 73, 72, 61, 75, 83 and differences -1, 2, -12, 10; base layer: less data to code; enhancement layer: better compression due to low values]
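
A Python sketch of the two layers for 2x2 blocks. Downsampling by the rounded block mean is an assumption. It is consistent with the numbers left over from the slide's figure if they are read as four pixels 72, 75, 61, 83 with base pixel 73, but the arrangement of those pixels within the block is a guess.

```python
def split_layers(image):
    """Return (base, enhancement) for an image with even width and height.

    base: one pixel per 2x2 block (rounded mean, an assumed downsampling filter).
    enhancement: every original pixel minus the base pixel of its block.
    """
    base = []
    enhancement = [row[:] for row in image]
    for y in range(0, len(image), 2):
        base_row = []
        for x in range(0, len(image[0]), 2):
            block_sum = (image[y][x] + image[y][x + 1] +
                         image[y + 1][x] + image[y + 1][x + 1])
            b = round(block_sum / 4)
            base_row.append(b)
            for dy in (0, 1):
                for dx in (0, 1):
                    enhancement[y + dy][x + dx] = image[y + dy][x + dx] - b
        base.append(base_row)
    return base, enhancement

def merge_layers(base, enhancement):
    """Client side: upsample the base layer and add the enhancement layer."""
    return [[enhancement[y][x] + base[y // 2][x // 2]
             for x in range(len(enhancement[0]))]
            for y in range(len(enhancement))]

if __name__ == "__main__":
    image = [[72, 75],
             [61, 83]]
    base, enhancement = split_layers(image)
    print(base, enhancement)
    print(merge_layers(base, enhancement) == image)
```

The enhancement values are small and centered on zero, which is why they compress better than the original pixels.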

  47. Fine Grained Scalability • Idea • Cut off compressed tail bits of samples • Base layer • As in SNR coding • Enhancement layer • Use bit-plane coding for the enhancement layer instead of run-level coding • Cut tail bits off until data rate is reached
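
Cutting tail bits amounts to keeping only the most significant bit planes of each enhancement value. The Python sketch below shows what a receiver reconstructs after such a cut. The sample values reuse the start of the bit-plane example from slide 16, with signs assumed.

```python
def truncate_to_planes(values, num_bits, planes_kept):
    """Keep the planes_kept most significant bit planes of each magnitude; zero the rest."""
    dropped = num_bits - planes_kept
    out = []
    for v in values:
        magnitude = (abs(v) >> dropped) << dropped
        out.append(-magnitude if v < 0 else magnitude)
    return out

if __name__ == "__main__":
    values = [10, 0, -6, 0, 0, -3, 0, 2]
    for kept in range(1, 5):
        print(kept, "plane(s):", truncate_to_planes(values, 4, kept))
```

Each extra plane that arrives refines every value, so the stream can be cut at any point to match the sending rate.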

  48. Fine Grained Scalability • [Figure: quality vs. sending rate; goal of FGS: best possible quality at possible sending rate] • (0,1) (2,1) (0,0)(1,0)(2,0)(1,0)(0,0)(2,1) (5,0)(8,1)

  49. Multiple Description Coding • Idea • Encode data in two streams • Each stream has acceptable quality • Both streams combined have good quality • The redundancy between both streams is low • Problem • The same relevant information must exist in both streams • Old problem: started for audio coding in telephony • Currently a hot topic

  50. Multimedia File Formats
