Image Compression - JPEG
Video Compression • MPEG • Audio compression • Lossy / perceptually lossless / lossless • 3 layers • Models based on speech generation (throat), or ear characteristics • Image compression • JPEG based • Images take much more memory than voice • An image is worth a thousand words • Which thousand words? • Video – next week, can we extrapolate?
Image Compression Basics • Model driven • Reduce data redundancy • Neighboring values on a line scan in an image • DPCM, predictive coding • Human perception properties • The human visual system {eye/brain} is more sensitive to some information than to other information {low frequencies vs high frequencies}, but be careful: edges are often critical • Enhancement approaches
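The DPCM bullet above can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previous sample on the scan line) and no quantization of the residuals, so it is lossless; the function names and the sample scan line are made up for illustration.

```python
def dpcm_encode(row):
    """Replace each sample by its difference from the previous sample."""
    prev = 0  # assumed initial prediction
    residuals = []
    for value in row:
        residuals.append(value - prev)
        prev = value
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by accumulating the differences."""
    prev = 0
    row = []
    for r in residuals:
        prev += r
        row.append(prev)
    return row


# Hypothetical scan line: neighboring gray levels are close to each other.
scan_line = [100, 101, 103, 103, 104, 110, 111]
residuals = dpcm_encode(scan_line)
print(residuals)  # small values after the first sample
```

After the first sample the residuals are small numbers clustered around zero, which is what makes them cheaper to entropy-code than the raw gray levels.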
Entropy • Entropy – measurement of the uncertainty of the input. The higher the uncertainty, the higher the entropy. • Which has higher entropy: noise or a 300 Hz sine wave? • Computation is histogram based • p(i) = probability of occurrence of gray level i in the image • E = -Σ_i p(i) log2 p(i) • Identifies the minimum average number of bits per pixel required to represent the image
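A minimal sketch of the histogram-based entropy computation, assuming the logarithm on the slide is base 2 so that the result is in bits per pixel. The function name and the three tiny test "images" (flat lists of gray levels) are illustrative only.

```python
import math
from collections import Counter


def entropy_bits(pixels):
    """First-order entropy in bits per pixel, from the gray-level histogram."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat = [128] * 16                  # one gray level: no uncertainty
two_level = [0] * 8 + [255] * 8    # two equally likely levels
four_level = [0, 64, 128, 192] * 4 # four equally likely levels

for name, img in [("flat", flat), ("two_level", two_level), ("four_level", four_level)]:
    print(name, entropy_bits(img))
```

The more evenly the gray levels are spread across the histogram, the higher the entropy, and the more bits per pixel a lossless coder needs on average.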
Compression Issues • Progressive display • Display partially decompressed images • User begins to see parts of the image, does not have to wait for complete decompression • Hierarchical encoding • Encode images at multiple resolution levels. • Display images at lower resolution level and then incrementally improve the quality • Asymmetry • Time for encoding • Time for decoding
Types of compression • Lossless • Huffman, LZW, Run length, DPCM? • Typical compression: 3:1 • Lossy • Predictive • Frequency based: transform, subbands • Spatial based: filtering, non-linear quantization, vector quantization • Hybrid
JPEG is based on • Huffman coding • Optimal entropy encoding • Run length encoding • Used in G3 fax • Discrete Cosine Transform • Frequency based • Apply perception rules in the frequency domain • The fidelity and level of compression can be controlled – 15:1 or even better
Huffman encoding • Assign fewer bits to symbols {pixel values} that occur more frequently • Number of bits per symbol is non-uniform • The code book has to be made available to the decoder, which increases the file size • Results in an optimal symbol-by-symbol prefix code • Number of bits required is close to the entropy
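The idea can be sketched with a small Huffman code-length builder using Python's `heapq`. This is an illustrative sketch, not the table-based procedure a JPEG encoder follows; the function name and the sample data are assumptions.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in subtree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper, i.e. gains one bit.
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical pixel values with probabilities 1/2, 1/4, 1/8, 1/8.
pixels = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(pixels)
average_bits = sum(lengths[p] for p in pixels) / len(pixels)
print(lengths, average_bits)
```

For this particular distribution the probabilities are powers of 1/2, so the average code length equals the entropy exactly; in general it lies within one bit of it.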
Run length Encoding • Run length, size, amplitude • RL: 4 bits • Size: 4 bits • Amplitude: 10 bits • Maximum compression if the run lengths are long • G3 used for fax • Usually use Huffman to encode the parameters
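A rough sketch of the (run length, size, amplitude) idea, assuming the run counts the zeros that precede a non-zero value and the size is the number of bits needed for the amplitude's magnitude. The end-of-block and 16-zero markers follow the JPEG convention, but the function itself is illustrative and does not produce a JPEG bitstream.

```python
def run_length_encode(coeffs):
    """Encode integers as (zero_run, size, amplitude) triples."""
    triples = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
            continue
        while run > 15:
            # The run field has 4 bits, so (15, 0, 0) stands for 16 zeros.
            triples.append((15, 0, 0))
            run -= 16
        triples.append((run, abs(c).bit_length(), c))
        run = 0
    if run > 0:
        triples.append((0, 0, 0))  # end-of-block: only zeros remain
    return triples


# Hypothetical coefficient sequence with long zero runs.
coeffs = [5, 0, 0, -3, 0, 0, 0, 1, 0, 0, 0, 0]
triples = run_length_encode(coeffs)
print(triples)
```

Twelve values collapse into four triples here; the longer the zero runs, the larger the saving.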
Discrete Cosine Transform • Real-valued cousin of the Fourier transform • Complexity • N*N • Fast DCT – similar to FFT • To reduce cost • Divide image into 8 x 8 blocks • Compute DCT of blocks • Reduce the size of the object to be compressed
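A direct (slow) sketch of the 2-D DCT of one 8 x 8 block, written from the standard DCT-II definition with orthonormal scaling. It is meant to show what is computed, not how a fast DCT computes it; the function name and the flat test block are assumptions.

```python
import math

N = 8  # block size used by JPEG


def dct_2d(block):
    """Direct 2-D DCT-II of an N x N block (no fast algorithm)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# Hypothetical flat block: every pixel has the same (level-shifted) value.
flat_block = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))  # all the energy lands in the DC term
```

A block with no spatial variation has a single non-zero coefficient, the DC term at position (0, 0); smooth image blocks behave similarly, with most energy in the low frequencies.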
Quantization • The eye is more sensitive to the lower frequencies. • Divide each frequency component by a constant • Divide higher frequency components by a larger value • Truncate to an integer; this reduces the number of non-zero values • Up to four quantization tables can be used in JPEG
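A small sketch of the divide-and-discard step. The quantization table here is a made-up one whose step size simply grows with frequency (it is not one of the example tables from the JPEG standard), the block is 4 x 4 to keep the example short, and the result is rounded to the nearest integer rather than strictly truncated.

```python
def make_table(n, base, slope):
    """Hypothetical table: step size grows with the frequency index u + v."""
    return [[base + slope * (u + v) for v in range(n)] for u in range(n)]


def quantize(coeffs, table):
    """Divide each coefficient by its step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back; the discarded precision is lost."""
    return [[l * q for l, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Hypothetical DCT coefficients, largest at low frequencies (top-left).
coeffs = [[80, 31, 10, 4],
          [25, 12, 5, 2],
          [9, 6, 3, 1],
          [4, 2, 1, 1]]
table = make_table(4, base=8, slope=4)
levels = quantize(coeffs, table)
nonzero = sum(1 for row in levels for v in row if v != 0)
print(levels, nonzero)
```

Most high-frequency entries become zero, which is what makes the later run length step effective. This is also the lossy step: `dequantize` cannot recover the original values.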
Color • RGB planes • Transform RGB into YUV • Y – luminance • U,V – chrominance • The eye resolves chrominance at a lower spatial resolution than luminance • U,V are down sampled to take advantage of this
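A sketch of the color step, assuming the YCbCr variant of YUV used by JFIF files (8-bit values, chrominance offset by 128) and 2 x 2 averaging as the down-sampling method. Both choices, like the function names, are assumptions for illustration; the slide does not specify them.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to Y (luminance) and Cb, Cr (chrominance)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Halve a plane in both directions by averaging 2 x 2 neighborhoods.

    Assumes the plane has an even number of rows and columns.
    """
    h, w = len(plane), len(plane[0])
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


gray = rgb_to_ycbcr(128, 128, 128)  # a gray pixel carries no color information

# Hypothetical 2 x 4 chrominance plane.
cb_plane = [[100, 102, 90, 94],
            [104, 106, 98, 94]]
small = subsample_2x2(cb_plane)
print(gray, small)
```

Down-sampling both chrominance planes this way leaves half of the original sample count (one full-resolution plane plus two quarter-resolution planes) before any other compression is applied.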
Overview of JPEG • RGB → YUV • Down sample UV • Original data is 8 bits per pixel, all non-negative [0,255]. Shift to [-128, 127]. • Divide image into 8x8 blocks • DCT on each block • Use quantization table to quantize values in each block {Reducing high freq content} • Use zig-zag scanning to order values in each block • Organize data into bands {DC, low f, mid f, high f} • Run length encoding • Huffman encoding
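The zig-zag scanning step in the list above can be sketched as follows. The ordering walks the anti-diagonals of the block, alternating direction, so that low-frequency values come first and the high-frequency zeros end up together at the end. The function names and the 3 x 3 demo block are illustrative; JPEG uses 8 x 8 blocks.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag order."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Odd diagonals run top-right to bottom-left (row increasing),
        # even diagonals run bottom-left to top-right (column increasing).
        return (diagonal, r if diagonal % 2 else c)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order(8)
print(order[:6])

# Hypothetical 3 x 3 block numbered in zig-zag order, to check the scan.
demo = [[1, 2, 6],
        [3, 5, 7],
        [4, 8, 9]]
print(zigzag_scan(demo))
```

After quantization, the scanned sequence typically ends in a long run of zeros, which the run length step then collapses.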
Reference • G. K. Wallace, "The JPEG Still Picture Compression Standard", Communications of the ACM, vol. 34, no. 4, pp. 30-44, April 1991