## 4C8 Dr. David Corrigan


**4C8, Dr. David Corrigan**
JPEG and the DCT

**2D DCT**
• DCT basis functions.
• Each band is the same size and there are 64 bands in total, so the overall entropy per coefficient is the mean of the 64 band entropies.

**Slow DCT**
• Sledgehammer implementation of the 8-point DCT.
• Each row multiply requires 8 multiplications and 7 additions.
• So all 8 rows require 64 multiplications and 56 additions.
• The full 2D transform applies the 1D transform to the 8 rows and then the 8 columns of the block (16 transforms), so it takes 1024 mults and 896 adds per 8x8 block.

**Fast DCT**
• Exploit symmetry.
• So split matrix T into two parts, change y...
• First split: 8 adds/subtractions to calculate the RHS vectors, then 2 x 16 multiplications and 2 x 12 additions for the two 4 x 4 matrix multiplications. That is 32 adds and 32 mults, compared with 56 adds and 64 mults before.
• One of the 4 x 4 sub-matrices can be simplified using symmetry again.
• Second split: the 16 mults and 12 adds for that 4 x 4 matrix multiplication can be replaced with 4 adds/subs to calculate the RHS vectors, then 2 x 4 mults and 2 x 2 adds for the two 2 x 2 matrix multiplications, i.e. 8 mults and 8 adds. Adding the 8 adds/subs from the first split and the 16 mults and 12 adds of the remaining 4 x 4 multiplication gives 28 adds and 24 mults in total.
• Third step: one of the 2 x 2 operations can be rewritten so that its 4 mults and 2 adds are replaced with 2 mults and 2 adds. The totals are then:
  • 2 adds and 2 mults for the rewritten 2 x 2 operation
  • 2 adds and 4 mults for the non-symmetric 2 x 2 matrix multiplication
  • 4 adds for the RHS vectors of the second split
  • 12 adds and 16 mults for the non-symmetric 4 x 4 matrix multiplication
  • 8 adds for the RHS vectors of the first split
• That is 28 adds and 22 mults in total.

**JPEG and Colour Images**
• JPEG uses the YCbCr colour space.
• The chrominance channels are usually downsampled.
• There are 3 commonly used modes:
  • 4:4:4 (no chrominance subsampling)
  • 4:2:2 (every 2nd column in the chrominance channels is dropped)
  • 4:2:0 (every 2nd column and row is dropped)
• The DCT is applied separately on each channel.

**Subjectively Weighted Quantisation**
• In JPEG it is standard to apply different thresholds to different bands.
• These values are obtained by perceptual tests.
• A user is asked to view an image of a particular size at a specified distance from the screen. The distance is usually expressed as a proportion of the screen height.
• The user is presented with an image and is asked to increase the gain of a given band until he/she just notices a difference in the image.
• Typically a flat grey image is used, to avoid masking effects caused by edges and texture.
• The resulting set of thresholds forms the quantisation matrix.
• Lower frequency bands are assigned lower step sizes.
• There is a slight drop-off in step size from the DC coefficient to the low-frequency coefficients.
• The step sizes for the chrominance channels increase faster than for luminance.

**Comparing Different Quantisations**
• Figures: the uncompressed image alongside JPEG-coded versions at different step sizes.
• Qstep = Qlum: PSNR = 32.9 dB
• Qstep = 2 x Qlum: PSNR = 30.6 dB
• Qstep = 15 (fixed step size): PSNR = 37.6 dB
• Qstep = 30 (fixed step size): PSNR = 33.4 dB
• PSNR indicates better quality for Qstep = 30 (33.4 dB) than for Qstep = Qlum (32.9 dB), but this is clearly not true from a subjective analysis.
• Using the subjectively weighted quantisation instead of a fixed quantisation step size achieves higher levels of compression for equivalent levels of quality.

**Challenges of JPEG Coding**
• Minimise average codeword length
  • RLC to encode the zeros.
  • We must take advantage of spatial and inter-band correlations.
  • We need to consider how we order the data.
• Minimise the coding overhead
  • Minimise the size of the Huffman code table.
  • We need to reduce the number of symbols we encode.
  • This can affect optimality.
• Correct for synchronisation errors

**JPEG Coding**
• The most obvious way might seem to be to code each band separately, i.e. Huffman with RLC as we suggested for the Haar transform.
• We could get close to the entropy.
• This is not the way it is coded, because:
  • It would require 64 different codes, with a high cost in computation and storage of codebooks.
  • It ignores the fact that the zero coefficients occur at the same positions in multiple bands.
• Instead we code each block separately.
• A block contains 64 coefficients, one from each band.
• Each block contains 1 DC coefficient (from the top-left band) and 63 AC coefficients.
• Two codebooks are used in total for all the blocks, one for the DC coefficients and the other for the AC coefficients.
• At the end of each block we insert an End Of Block (EOB) symbol in the data stream.

**Data Ordering**
• Each block is an 8x8 grid of coefficients.
• A zig-zag scan converts them into a 1D stream.
• As most non-zero values occur in the top-left corner, a zig-zag scan maximises the lengths of the zero runs and so improves the efficiency of RLC.

**Zig-Zag Scan Example**
• Figure: typical DCT block coefficients.
• Non-zero values are at the top-left corner of the block.
• The zig-zag scan concentrates the non-zero coefficients at the start of the stream:
  -13, -3, 6, 0, 0, 2, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 36 more zeros, the end

**Coding the DC Coefficients**
• Differential coding: the value coded is actually the difference between the DC coefficients of the current and previous blocks.
• There is potentially a large number of levels to encode.
• Up to 4096, depending on the quantisation step size.
• We break down the symbol value into a size/index pair.
• So if the DC value is -13, the size is 4 and the index is 0010.
• In JPEG only the size is encoded using Huffman.
• The index is uncoded; efficiency is not dramatically affected.
• Only 12 codes are required in the Huffman table.
• Table size is 16 + 12 = 28 bytes.
• The size is the number of bits needed to represent the magnitude of the value, $\text{size} = \lfloor \log_2 |\text{value}| \rfloor + 1$.
• If $\text{value} > 0$, $\text{index} = \text{value}$ expressed as a binary number.
• If $\text{value} < 0$, $\text{index} = \text{value} + 2^{\text{size}} - 1$ expressed as a binary number.
• So if $\text{value} = -27$ then $\text{size} = 5$ and $\text{index} = -27 + 31 = 4$, which is 00100.
• If value = 32 then size = 6 and index = 100000.
• The number of bits in the index value is always equal to the value of size.

**Coding the AC Coefficients**
• Figure: typical DCT block coefficients, with the stream annotated as follows.
  (4) 0010, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end
• The leading (4) 0010 is the size/index pair for the DC coefficient.
• The length of a run and the value of the coefficient after it are strongly correlated.
• The block usually ends with a long run of zeros.
• Run/size correlations: high coefficients follow short runs and low coefficients follow long runs.
• Final run of zeros: these don't need to be coded. Just tell the decoder that there are no more non-zero coefficients and move on to the next block.

**Symbols**
• Run/coefficient symbols: e.g. 0, 0, 9 is a run of 2 zeros followed by a 9.
• However, we represent 9 using the size/index format from the DC coefficients. 9 has a size of 4 and an index of 1001.
• So we code the run/size pair (2,4), and the index 1001 is appended to the stream.
• Run/size symbols: all possible combinations of runs from 0 to 15 and sizes from 1 to 10, giving 160 symbols in total.
• Huffman codes are used for each symbol.
• Index values are not coded further.

**Special Symbols**
• ZRL
  • Used to represent a run of 16 zeros.
  • Used when the run of zeros is greater than 15.
  • E.g. 17 zeros followed by 14 is coded as (ZRL) (1,4) 1110.
• EOB
  • Inserted when a block ends with a run of zeros.
• In total there are 160 run/size symbols and 2 special symbols, so there are 162 symbols to encode and the code table is 16 + 162 = 178 bytes.

**Coding Example**
• Block (typical DCT block coefficients, zig-zag order): -13, -3, 6, 2 zeros, 2, 3 zeros, -1, 17 zeros, 1, 36 more zeros, the end
• The DC coefficient is -13. The size is 4 and the index is 0010.
  Current stream state: (4) 0010
• The first AC value is -3. That is a run of 0 zeros followed by -3. -3 has size 2 and index 00, so the run/size pair is (0,2).
  Current stream state: (4) 0010 (0,2) 00
• The next AC value is 6. That is a run of 0 zeros followed by 6. 6 has size 3 and index 110, so the run/size pair is (0,3).
  Current stream state: (4) 0010 (0,2) 00 (0,3) 110
• The next AC value to encode is a run of 2 zeros followed by an AC coefficient of 2. 2 has size 2 and index 10, so the run/size pair is (2,2).
  Current stream state: (4) 0010 (0,2) 00 (0,3) 110 (2,2) 10
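
The symbol rules used in the coding example can be sketched in Python. This is only an illustrative sketch under stated assumptions: the block is already in zig-zag order, its first entry is already the DC difference, and the Huffman coding of the size and run/size symbols is left out. The names `size_index` and `encode_block` are hypothetical and do not come from the slides.

```python
def size_index(value):
    """Return (size, index bits) for a coefficient (hypothetical helper)."""
    if value == 0:
        return 0, ""
    # size = number of bits needed for the magnitude of the value
    size = abs(value).bit_length()
    # negative values are offset by 2**size - 1 so the index fits in `size` bits
    index = value if value > 0 else value + (1 << size) - 1
    return size, format(index, "0{}b".format(size))


def encode_block(coeffs):
    """Turn 64 zig-zag ordered coefficients into (symbol, index bits) pairs.

    Assumes coeffs[0] is already the DC difference. The symbols would be
    Huffman coded in a real encoder; that step is omitted here.
    """
    symbols = [size_index(coeffs[0])]  # DC: (size, index bits)
    # position of the last non-zero AC coefficient (0 if there is none)
    last = max((i for i in range(1, 64) if coeffs[i] != 0), default=0)
    run = 0
    for value in coeffs[1:last + 1]:
        if value == 0:
            run += 1
            continue
        while run > 15:  # runs longer than 15 need one ZRL per 16 zeros
            symbols.append(("ZRL", ""))
            run -= 16
        size, index = size_index(value)
        symbols.append(((run, size), index))
        run = 0
    if last < 63:  # the block ends with a run of zeros
        symbols.append(("EOB", ""))
    return symbols


# The block from the coding example, written out in full.
block = [-13, -3, 6, 0, 0, 2, 0, 0, 0, -1] + [0] * 17 + [1] + [0] * 36
for symbol, bits in encode_block(block):
    print(symbol, bits)
```

Run on the example block, the sketch reproduces the stream states listed in the coding example and then carries the same rules through the remaining coefficients: (3,1) 0 for the -1, a ZRL followed by (1,1) 1 for the 17 zeros and the 1, and a final EOB.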