
Basics of Compression


Presentation Transcript


  1. Basics of Compression • Goals: • to understand how image/audio/video signals are compressed to save storage and increase transmission efficiency • to understand in detail common formats like GIF, JPEG and MPEG

  2. What does compression do? • Reduces signal size by taking advantage of correlation • Spatial • Temporal • Spectral

  3. Compression Issues • Issues for lossless compression • Coding efficiency: compression ratio • Coder complexity: memory needs, power needs, operations per second • Coding delay

  4. Compression Issues • Additional issues for lossy compression • Signal quality, measured by: • Bit error probability • Signal-to-noise ratio • Mean opinion score

  5. Compression Method Selection • Constrained output signal quality? – TV • Retransmission allowed? – interactive sessions • Degradation type at decoder acceptable? • Multiple decoding levels? – browsing and retrieving • Encoding and decoding in tandem? – video editing • Single-sample-based or block-based? • Fixed coding delay, variable efficiency? • Fixed output data size, variable rate? • Fixed coding efficiency, variable signal quality? – CBR • Fixed quality, variable efficiency? – VBR

  6. Fundamental Ideas • Run-length encoding • Average information entropy • For a source S generating symbols s1 through sN with probabilities p1 … pN • Self entropy: I(si) = log2(1/pi) • Entropy of source S: H(S) = Σi pi log2(1/pi) • Average number of bits needed to encode S is at least H(S) (Shannon) • Differential encoding • Encodes differences between samples to skew the probability distribution of symbols toward small values
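A minimal sketch of these entropy formulas in Python (the four-symbol source and its probabilities are hypothetical, not from the slides):

```python
import math

def self_information(p):
    """Self entropy of a symbol with probability p, in bits: I = log2(1/p)."""
    return -math.log2(p)

def entropy(probs):
    """Average information H(S) = sum of p_i * log2(1/p_i) over all symbols."""
    return sum(p * self_information(p) for p in probs if p > 0)

# Hypothetical 4-symbol source
probs = [0.5, 0.25, 0.125, 0.125]
print(entropy(probs))  # 1.75 bits/symbol: no lossless code can average fewer
```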

  7. Huffman Encoding • Let an alphabet have N symbols s1 … sN • Let pi be the probability of occurrence of si • Order the symbols by their probabilities: p1 ≥ p2 ≥ … ≥ pN • Replace symbols sN-1 and sN by a new symbol HN-1 whose probability is pN-1 + pN • Repeat until only one symbol remains • This generates a binary tree
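A sketch of this merge procedure in Python using a min-heap. The probabilities below are hypothetical, chosen so the merges follow the order in the example on the next slide:

```python
import heapq

def huffman_code(freqs):
    """Build a Huffman code from {symbol: probability} by repeatedly
    merging the two least probable nodes."""
    # Heap entries are (probability, tiebreaker, tree); a tree is either
    # a symbol or a (left, right) pair for a merged node.
    heap = [(p, i, s) for i, (s, p) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)          # two smallest probabilities
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, count, (t1, t2)))  # merged symbol H
        count += 1
    codes = {}
    def walk(tree, prefix):                      # traverse: 0 = left, 1 = right
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"          # degenerate one-symbol source
    walk(heap[0][2], "")
    return codes

probs = {"K": 0.05, "W": 0.05, "?": 0.10, "U": 0.12,
         "R": 0.21, "L": 0.21, "E": 0.26}
print(huffman_code(probs))  # R, L, E get 2 bits; U gets 3; ? gets 4; K, W get 5
```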

  8. Huffman Encoding Example • Symbols are merged pairwise, lowest probabilities first: • K+W • {K,W}+? • {K,W,?}+U • {R,L} • {K,W,?,U}+E • {{K,W,?,U,E},{R,L}} • Codewords are then generated by traversing the tree, appending one bit per branch

  9. Properties of Huffman Codes • Fixed-length inputs become variable-length outputs • Average codeword length: L̄ = Σi pi li • Upper bound on average length: H(S) + pmax, where pmax is the largest symbol probability • Code construction complexity: O(N log N) • Prefix-condition code: no codeword is a prefix of another • Makes the code uniquely decodable

  10. Huffman Decoding • Look-up table-based decoding • Creating the look-up table: • Let L be the longest codeword length • Let ci be the codeword for symbol si • If ci has li bits, form L-bit addresses whose first li bits are ci and whose remaining bits take every combination of 0s and 1s • This makes 2^(L−li) addresses for each symbol • At each entry, record the pair (si, li) • Decoding: • Read L bits into a buffer • Look up the symbol sk and its length lk • Discard lk bits, refill the L-bit buffer, and repeat
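A sketch of this look-up-table decoder, reusing the codeword table produced by the Huffman sketch above (the function names are mine, not from the slides):

```python
def build_decode_table(codes):
    """Expand each codeword ci of length li into every L-bit address that
    begins with ci, storing (symbol, li) at each of the 2^(L-li) entries."""
    L = max(len(c) for c in codes.values())
    table = {}
    for sym, c in codes.items():
        n = L - len(c)
        for pad in range(2 ** n):
            suffix = format(pad, "b").zfill(n) if n else ""
            table[c + suffix] = (sym, len(c))
    return table, L

def decode(bits, table, L):
    """Read L bits, look up (symbol, length), discard `length` bits, refill."""
    out, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + L].ljust(L, "0")  # pad the final, partial buffer
        sym, l = table[window]
        out.append(sym)
        pos += l
    return "".join(out)

codes = {"R": "00", "L": "01", "E": "10", "U": "110",
         "?": "1110", "K": "11110", "W": "11111"}
table, L = build_decode_table(codes)
print(decode("000110", table, L))  # RLE
```

Trading 2^L table entries for a single lookup per symbol is the classic speed-for-memory bargain of this decoder, which is why the next two slides care about bounding L.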

  11. Constraining Code Length • The problem: no codeword may exceed L bits • Algorithm • Let threshold T = 2^(−L) • Partition S into subsets S1 and S2: • S1 = {si | pi > T} … t−1 symbols • S2 = {si | pi ≤ T} … N−t+1 symbols • Create a special symbol Q whose frequency is q = Σ si∈S2 pi • Create a codeword cq for Q and a normal Huffman code for every symbol in S1 • Symbols in S1 are sent as their regular codewords • Symbols in S2 are sent as cq followed by a fixed-length code identifying the symbol within S2 (see the sketch below)
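A sketch of this escape-symbol partition, reusing `huffman_code` from the earlier sketch. The fixed-length index emitted after cq is my reading of the final step; the slides do not spell out the S2 code:

```python
def constrain_length(probs, L):
    """Symbols with pi > T = 2^-L get ordinary Huffman codes; the rest
    share one escape codeword cq for Q plus a fixed-length index."""
    T = 2 ** -L
    s1 = {s: p for s, p in probs.items() if p > T}
    s2 = sorted(s for s, p in probs.items() if p <= T)
    q = sum(probs[s] for s in s2)            # Q's frequency: sum over S2
    codes = huffman_code({**s1, "Q": q})     # huffman_code from the sketch above
    escape = codes.pop("Q")
    k = max(1, (len(s2) - 1).bit_length())   # bits needed to index S2
    for i, s in enumerate(s2):
        codes[s] = escape + format(i, "b").zfill(k)
    return codes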

  12. Constraining Code Length • Another algorithm • Set the threshold T as before • For every symbol si with pi < T, set pi = T • Design the codebook on the adjusted probabilities • If the resulting codebook violates the ordering of code length by symbol frequency, reorder the codes • How does this differ from the previous algorithm? • What is its complexity?

  13. Golomb-Rice Compression • Take a source whose symbols occur with a geometric probability distribution: P(n) = (1 − p0) p0^n, n ≥ 0, 0 < p0 < 1 • Here P(n) is the probability of a run length of n for any symbol • Take a positive integer m and determine q, r such that n = mq + r • The Gm code for n: q 0s followed by a 1 (q in unary), then bin(r) • Optimal when m = ⌈log(1 + p0) / log(1/p0)⌉ • bin(r) has ⌈log2 m⌉ bits if r ≥ 2^⌈log2 m⌉ − m, and ⌊log2 m⌋ bits otherwise (truncated binary)

  14. Golomb-Rice Compression • Now if m = 2^k: • Get q by a k-bit right shift of n, and r from the last k bits of n • Example: if m = 4 (k = 2) and n = 22, the code is 00000110 • Decoding a G-R code: • Count the number of 0s before the first 1; this gives q. Skip the 1; the next k bits give r
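A sketch of the m = 2^k case in Python, reproducing the slide's example (assumes k ≥ 1):

```python
def gr_encode(n, k):
    """Golomb-Rice code with m = 2^k: q in unary, then the low k bits of n."""
    q, r = n >> k, n & ((1 << k) - 1)        # k-bit right shift / last k bits
    return "0" * q + "1" + format(r, "b").zfill(k)

def gr_decode(bits, k):
    """Count the 0s before the first 1 (that is q), skip the 1, read k bits of r."""
    q = bits.index("1")
    r = int(bits[q + 1:q + 1 + k], 2)
    return (q << k) + r

print(gr_encode(22, 2))          # 00000110, the slide's m = 4 example
print(gr_decode("00000110", 2))  # 22
```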

  15. The Lossless JPEG Standard • Predict the value of pixel X from its already-coded neighbors and encode the prediction residual r = y − X • The prediction y is one of 0, a, b, c, a+b−c, a+(b−c)/2, b+(a−c)/2, (a+b)/2, where a, b, and c are the left, above, and above-left neighbors of X • The scan header records the choice of predictor • The residual r is computed modulo 2^16 • The number of bits needed for r is called its category; the binary value of r is its magnitude • The category is Huffman encoded, and the magnitude bits follow uncoded
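A sketch of the predictor table and residual computation, following the slide's r = y − X convention; the function names and sample pixel values are mine:

```python
def predict(a, b, c, mode):
    """Lossless JPEG predictors: a = left, b = above, c = above-left
    neighbor of the current pixel (mode 0 means no prediction)."""
    return [0, a, b, c,
            a + b - c,
            a + (b - c) // 2,   # floor division stands in for the
            b + (a - c) // 2,   # standard's arithmetic right shift
            (a + b) // 2][mode]

def encode_pixel(x, a, b, c, mode):
    """Residual r = y - x, reduced modulo 2^16 as on the slide; the bit
    length of r is the Huffman-coded category, r itself the magnitude."""
    r = (predict(a, b, c, mode) - x) % (1 << 16)
    return r.bit_length(), r    # (category, magnitude)

print(encode_pixel(100, 98, 101, 99, mode=4))  # y = 98+101-99 = 100, so r = 0
```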
