Introduction to JPEG, MPEG 1/2, and H.261/H.263

Chuan-Yu Cho

Outline
• Video/Image Compression
• Still Image Compression
• JPEG / JPEG 2000 (“Joint Photographic Experts Group”)
• Video Compression
• H.261, H.263, H.263+, MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21

Still Image Coding JPEG, JPEG2000

Image/Video Redundancy
• Spatial redundancy (figure: neighbouring pixel values 253 and 255; regions labelled A and B)

Transform coding
• Encoder: Image block -> T -> Transform coefficients -> Q -> Zigzag scan (2D->1D) -> Entropy coding -> Bitstream
• Decoder: Bitstream -> Entropy decoding -> Inverse zigzag scan (1D->2D) -> Q-1 -> Reconstructed transform coefficients -> T-1 -> Reconstructed image block

Block-Based Coding
• Why divide into blocks?
• Image -> Blocks
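The Image -> Blocks step can be sketched as follows. This is only an illustration: it assumes a grayscale image stored as a list of rows whose width and height are multiples of the block size, and `split_into_blocks` is a hypothetical helper name, not something from the slides.

```python
def split_into_blocks(image, n=8):
    """Split a 2-D list (height x width) into n x n blocks, in raster order."""
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            # Take n rows starting at `top`, and n columns starting at `left`.
            blocks.append([row[left:left + n] for row in image[top:top + n]])
    return blocks


# A 16 x 16 test image yields four 8 x 8 blocks.
image = [[16 * y + x for x in range(16)] for y in range(16)]
blocks = split_into_blocks(image)
```

Each block is then transformed, quantized and entropy coded independently of the others.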

Example of JPEG Coding (Encoder)
• Pipeline: Transform coding (DCT) -> Quantization -> Zigzag scan (2D->1D) -> Entropy coding (bit stream)
• Quantization: each DCT coefficient is divided by the matching quantization table entry and rounded, e.g. -415/16 = -25.94, which rounds to -26
• Zigzag scan output (2D->1D):
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB
• Entropy coding (number->binary):
1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001 001 100101 11100110 110110 0110 11110100 000 1010

Example of JPEG Coding (Decoder)
• Inverse entropy coding (binary->number):
1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001 001 100101 11100110 110110 0110 11110100 000 1010
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB
• Inverse zigzag scan (1D->2D):
 -26   -3   -6    2    2    0    0    0
   1   -2   -4    0    0    0    0    0
  -3    1    5   -1   -1    0    0    0
  -4    1    2   -1    0    0    0    0
   1    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
• Inverse quantization (each level multiplied by its quantization table entry):
-416  -33  -60   32   48    0    0    0
  12  -24  -56    0    0    0    0    0
 -42   13   80  -24  -40    0    0    0
 -56   17   44  -29    0    0    0    0
  18    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
• Inverse transform coding (inverse DCT), reconstructed image block:
  58   64   67   64   59   62   70   78
  56   55   67   89   98   88   74   69
  60   50   70  119  141  116   80   64
  69   51   71  128  149  115   77   68
  74   53   64  105  115   84   65   72
  76   57   56   74   75   57   57   74
  83   69   59   60   61   61   67   83
  93   81   67   62   69   80   84   84
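The inverse zigzag scan and inverse quantization steps of this decoder example can be checked with a short sketch. The function names are illustrative; the quantization table is the one shown in the encoder example. The fifth value of the sequence is taken as -2, which is what the dequantized value -24 at row 2, column 2 (table entry 12) implies.

```python
Q_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Decoded values up to (not including) EOB.
SEQUENCE = [-26, -3, 1, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5,
            0, 2, 0, 0, -1, 2, 0, 0, 0, 0, 0, -1, -1]


def zigzag_order(n=8):
    """(row, col) pairs in zigzag order: anti-diagonals, alternating direction."""
    cells = ((i, j) for i in range(n) for j in range(n))
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def inverse_zigzag(values, n=8):
    """Place a 1-D sequence back into an n x n block; positions after EOB stay 0."""
    block = [[0] * n for _ in range(n)]
    for (i, j), value in zip(zigzag_order(n), values):
        block[i][j] = value
    return block


def dequantize(levels, table):
    """Multiply each quantized level by its quantization table entry."""
    return [[v * q for v, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


levels = inverse_zigzag(SEQUENCE)
dequantized = dequantize(levels, Q_TABLE)
```

The resulting `dequantized` block is the input to the inverse DCT.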

Example of JPEG Coding (Encoder): DCT
• Image block:
  52   55   61   66   70   61   64   73
  63   59   66   90  109   85   69   72
  62   59   68  113  144  104   66   73
  63   58   71  122  154  106   70   69
  67   61   68  104  126   88   68   70
  79   65   60   70   77   68   58   75
  85   71   64   59   55   61   65   83
  87   79   69   68   65   76   78   94
• DCT coefficients:
-415  -29  -62   25   55  -20   -1    3
   7  -21  -62    9   11   -7   -6    6
 -46    8   77  -25  -30   10    7   -5
 -50   13   35  -15   -9    6    0    3
  11   -8  -13   -2   -1    1   -4    1
 -10    1    3   -3   -1    0    2   -1
  -4   -1    2   -1    2   -3    1   -2
  -1   -1   -1   -2   -1   -1    0   -1
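For readers who want to experiment with the transform step, here is a minimal pure-Python sketch of an orthonormal 8 × 8 DCT and its inverse. The helper names are illustrative, and the sketch assumes the usual JPEG convention of subtracting 128 from 8-bit samples before the transform. It is not expected to reproduce the slide's coefficients digit for digit, since those depend on the exact rounding used to produce them.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II basis: row u holds the u-th cosine basis vector."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """2-D DCT computed as C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT computed as C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


# A flat block of value 100, level-shifted by 128, has energy only in the DC term.
flat = [[100 - 128] * 8 for _ in range(8)]
coeffs = dct2(flat)
```

With this normalisation the DC coefficient equals the sum of the level-shifted samples divided by 8, which is why smooth blocks concentrate their energy in the top-left corner.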

Example of JPEG Coding (Encoder): Quantization
• DCT coefficients:
-415  -29  -62   25   55  -20   -1    3
   7  -21  -62    9   11   -7   -6    6
 -46    8   77  -25  -30   10    7   -5
 -50   13   35  -15   -9    6    0    3
  11   -8  -13   -2   -1    1   -4    1
 -10    1    3   -3   -1    0    2   -1
  -4   -1    2   -1    2   -3    1   -2
  -1   -1   -1   -2   -1   -1    0   -1
• Quantization table:
  16   11   10   16   24   40   51   61
  12   12   14   19   26   58   60   55
  14   13   16   24   40   57   69   56
  14   17   22   29   51   87   80   62
  18   22   37   56   68  109  103   77
  24   35   55   64   81  104  113   92
  49   64   78   87  103  121  120  101
  72   92   95   98  112  100  103   99
• Each coefficient is divided by the table entry at the same position and rounded, e.g. -415/16 = -25.94, which rounds to -26
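The quantization step can be reproduced with a few lines of Python. The coefficient and table values are copied from the slide; `quantize` is an illustrative name, and plain rounding to the nearest integer is assumed.

```python
DCT_COEFFS = [
    [-415, -29, -62, 25, 55, -20, -1, 3],
    [7, -21, -62, 9, 11, -7, -6, 6],
    [-46, 8, 77, -25, -30, 10, 7, -5],
    [-50, 13, 35, -15, -9, 6, 0, 3],
    [11, -8, -13, -2, -1, 1, -4, 1],
    [-10, 1, 3, -3, -1, 0, 2, -1],
    [-4, -1, 2, -1, 2, -3, 1, -2],
    [-1, -1, -1, -2, -1, -1, 0, -1],
]

Q_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer.

    Python's round() sends exact halves to the nearest even integer, so
    -20/40 = -0.5 becomes 0, which matches the slide.
    """
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


quantized = quantize(DCT_COEFFS, Q_TABLE)
```

Because the table entries grow toward the bottom-right corner, most high-frequency coefficients become zero, which is what makes the later zigzag scan and entropy coding effective.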

Example of JPEG Coding (Encoder): Quantization result
• DCT coefficients:
-415  -29  -62   25   55  -20   -1    3
   7  -21  -62    9   11   -7   -6    6
 -46    8   77  -25  -30   10    7   -5
 -50   13   35  -15   -9    6    0    3
  11   -8  -13   -2   -1    1   -4    1
 -10    1    3   -3   -1    0    2   -1
  -4   -1    2   -1    2   -3    1   -2
  -1   -1   -1   -2   -1   -1    0   -1
• Quantized coefficients:
 -26   -3   -6    2    2    0    0    0
   1   -2   -4    0    0    0    0    0
  -3    1    5   -1   -1    0    0    0
  -4    1    2   -1    0    0    0    0
   1    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0

Example of JPEG Coding (Encoder): Zigzag scan (2D->1D)
• Pipeline: Transform coding (DCT) -> Quantization -> Zigzag scan -> Entropy coding (bit stream)
• Quantized coefficients:
 -26   -3   -6    2    2    0    0    0
   1   -2   -4    0    0    0    0    0
  -3    1    5   -1   -1    0    0    0
  -4    1    2   -1    0    0    0    0
   1    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
• Zigzag sequence:
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB
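The 2D->1D step can be sketched as below. The function names are illustrative. The scan walks the anti-diagonals of the block in alternating directions and drops the trailing run of zeros, which the bitstream signals with EOB. The scan yields -2 as the fifth value, because the quantized matrix holds -2 at row 2, column 2.

```python
QUANTIZED = [
    [-26, -3, -6, 2, 2, 0, 0, 0],
    [1, -2, -4, 0, 0, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag_order(n=8):
    """(row, col) pairs in zigzag order.

    Cells are grouped by anti-diagonal (row + col). Odd diagonals are walked
    with the row index increasing, even diagonals with it decreasing.
    """
    cells = ((i, j) for i in range(n) for j in range(n))
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def zigzag_scan(block):
    """Read a block in zigzag order and drop the trailing zeros (coded as EOB)."""
    values = [block[i][j] for i, j in zigzag_order(len(block))]
    while values and values[-1] == 0:
        values.pop()
    return values


sequence = zigzag_scan(QUANTIZED)
```

Ordering the coefficients from low to high frequency groups the zeros at the end of the sequence, so a single EOB symbol replaces all of them.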

Video Coding MPEG I/II, H.261/H.263

Main Ideas of Still Image Coding (Intra Coding)
• Block-based coding
• Transform coding (DCT)
• Quantization
• Zigzag scan
• DPCM (Differential PCM)
• Entropy coding (variable-length coding): Huffman coding, run-length coding, arithmetic coding

Main Ideas of Video Coding (Inter Coding)
• Intra coding: block-based coding, transform coding, quantization, zigzag scan, DPCM, entropy coding
• Inter coding: motion estimation/compensation, with intra coding applied to the residual

Image/Video Redundancy
• Spatial redundancy: neighbouring pixels have similar values (253 and 255 in the figure)
• Temporal redundancy: block A in frame N-1 is used to code block B in frame N

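Video Compression
• Encoder for still image: Image block -> T -> Transform coefficients -> Q -> Zigzag scan (2D->1D) -> Entropy coding -> Bitstream
• Encoder for video sequence: the same chain is applied to the difference between the image block and its motion-compensated (MC) prediction; the prediction is formed from image blocks reconstructed through Q-1 (reconstructed transform coefficients) and T-1 (reconstructed image block)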

Results of DCT Coding (JPEG)
• PSNR: Peak Signal-to-Noise Ratio
• MSE: Mean Square Error
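The two quality measures named here can be computed as in the following sketch. It assumes 8-bit images (peak value 255) stored as lists of rows; the function names are illustrative.

```python
import math


def mse(original, decoded):
    """Mean square error between two equally sized images (lists of rows)."""
    squared = [(a - b) ** 2
               for orow, drow in zip(original, decoded)
               for a, b in zip(orow, drow)]
    return sum(squared) / len(squared)


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / error)


# Every sample differs by exactly 1, so the MSE is 1.
original = [[52, 55], [61, 66]]
decoded = [[53, 54], [60, 67]]
quality_db = psnr(original, decoded)
```

A higher PSNR means the decoded image is closer to the original; the value is unbounded when the two images are identical.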

Temporal Redundancy (figure: Frame #1 and Frame #2)

Residual Image = Frame #2 - Frame #1

Results of Motion Compensation Coding
• Figure panels: DCT coding, residual image, coded image
• PSNR = 43.35 dB
• Bit rate = 21957 bits/frame
• Compression ratio = (256 × 256 × 8) / 21957 = 23.9
• PSNR = 22.68 dB, MSE = 6.50, MAE = 25
• Bits for motion vectors = 1002 bits

ITU-T Recommendation H.261 (previously “CCITT Recommendation”)
Video Codec for Audiovisual Services at p×64 kbit/s
Geneva, 1990; revised at Helsinki, 1993

H.261 vs. p×64
• Recommendation H.261 describes the video coding and decoding methods for the moving picture component of audiovisual services (videophone, videoconference, etc.) at rates of p×64 kbit/s, where p is in the range 1 to 30.
• Hence the name p×64 (“p times sixty-four”) coder.

H.261 vs. MPEG
• The H.261 specification has already been implemented by several manufacturers.
• Its target is telecommunications at rates as low as 64 kbit/s.
• MPEG is defined for higher bit rates, 0.9 Mbit/s to 1.5 Mbit/s, and consequently for higher quality.

H.261
• Video codec for audiovisual services
• ISDN videophone and video conferencing
• Low bit rates, low delay
• 1984: at m×384 kbit/s (m = 1, …, 5)
• 1988-90: at p×64 kbit/s (p = 1, …, 30)

H.261 Coder (block diagram; labelled elements: Video in, DCT, Q, Inverse DCT, Loop Filter, Motion Compensation)

Motion Estimation
• For each 16×16 superblock (SB), ME searches for the best match in the reference frame and returns a motion vector MV = (X, Y).
• Both X and Y are integers with magnitude not exceeding 15.
• Only the difference (residual) between the SB and its best match is DCT encoded.
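A full-search block matcher along these lines might look like the sketch below. The matching criterion (sum of absolute differences) and the exhaustive search are assumptions for illustration; H.261 does not mandate how the encoder finds its vectors. Frames are lists of rows, and `motion_search` is a hypothetical name.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between an n x n block of the current
    frame at (cx, cy) and an n x n block of the reference frame at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def motion_search(cur, ref, cx, cy, n=16, search=15):
    """Exhaustively test every integer displacement within +/- search.

    Returns ((dx, dy), cost), where (dx, dy) points from the block in the
    current frame to its best match in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2]), best[0]
```

The residual that goes on to the DCT is the block in the current frame minus the matched block in the reference frame; when the match is perfect, the residual is all zeros and costs very few bits.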

Motion Estimation (figure): block positions (22,20) and (32,16) in the referenced and current frames; the motion vector is their difference, (22-32, 20-16) = (-10, 4)

Coding of Motion Vectors
• Differential coding
• VLC for the MV difference (MVD)

MVD        Code
…          …
-7 & 25    0000 0111
-6 & 26    0000 1001
-5 & 27    0000 1011
-4 & 28    0000 111
-3 & 29    0001 1
-2 & 30    0011
-1         011
0          1
1          010
2 & -30    0010
3 & -29    0001 0
4 & -28    0000 110
5 & -27    0000 1010
6 & -26    0000 1000
7 & -25    0000 0110
…          …

• Example:
• MV: 14 -13 12 …
• MVD: -1 -27 25 …
• Bitstream: 0110000101000000111 …
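The example bitstream can be reproduced with a small lookup. Only the table rows shown on the slide are included, so the sketch handles differences whose representative lies between -7 and 7; `wrap_mvd` and `encode_mvds` are illustrative names. Each code word serves two differences that are 32 apart (for instance 5 and -27), which is why a wrap step is needed before the lookup.

```python
# Subset of the MVD table shown on the slide, keyed by the small representative.
MVD_VLC = {
    -7: "00000111", -6: "00001001", -5: "00001011", -4: "0000111",
    -3: "00011", -2: "0011", -1: "011", 0: "1", 1: "010", 2: "0010",
    3: "00010", 4: "0000110", 5: "00001010", 6: "00001000", 7: "00000110",
}


def wrap_mvd(diff):
    """Map a difference in -30..30 onto its representative in -16..15."""
    return (diff + 16) % 32 - 16


def encode_mvds(diffs):
    """Concatenate the code words of a list of motion vector differences."""
    return "".join(MVD_VLC[wrap_mvd(d)] for d in diffs)


# Differences between consecutive vectors 14, -13, 12 are -27 and 25; the
# leading -1 is the first difference given on the slide.
bits = encode_mvds([-1, -27, 25])
```

The decoder can tell the two candidates apart because only one of them gives a vector inside the permitted ±15 range when added to the previous vector.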

Motion Compensation (MC) & Motion Estimation (ME)
• MC is optional for each MB (signalled per MB by MTYPE).
• Only one MV for each MB.
• ME compares a 16×16 superblock of the luminance (Y) component against a small search area of the previously transmitted image.
• Both horizontal and vertical components of the motion vectors are integers with magnitude not exceeding 15.
• The MV is used for all 4 Y blocks. The MV for both Cb and Cr is derived by halving the component values of the MB MV.
• [Not specified in H.261] The displacement with the smallest absolute superblock difference, determined by the sum of the absolute values of the pel-to-pel differences throughout the block, is taken as the MV for that MB.
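Deriving the chrominance vector from the macroblock vector can be sketched as follows. The slide only says the components are halved; truncating odd halves toward zero is an assumption made here, and `chroma_mv` is an illustrative name.

```python
def chroma_mv(mb_mv):
    """Halve each component of the macroblock MV for the Cb and Cr blocks.

    int() truncates toward zero, so -15 becomes -7 and 7 becomes 3
    (assumed rounding rule).
    """
    x, y = mb_mv
    return int(x / 2), int(y / 2)


cb_cr_vector = chroma_mv((-15, 7))
```

Halving makes sense because Cb and Cr are sampled at half the luminance resolution in each direction, so the same physical displacement covers half as many chrominance pixels.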

Quantization
• The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients.
• Within an MB, the same quantizer is used for all coefficients except the INTRA dc one.
• The equations for the quantizer can be written in terms of the MB quantization factor Q, sometimes termed MQUANT:
• C(u,v) = F(u,v) / 2Q if Q is odd
• C(u,v) = (F(u,v) ± 1) / 2Q if Q is even (+ for F > 0, - for F < 0)
• Quantization for the INTRA dc term:
• C = (F + 4) / 8, with inverse F = 8C
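The quantizer equations can be tried out with the sketch below. Two things are assumptions rather than statements from the slide: the divisions are taken to truncate toward zero, and the `reconstruct` function uses the reconstruction rule as it is usually quoted from the H.261 Recommendation (Q·(2·level ± 1), adjusted by one toward zero when Q is even). All function names are illustrative.

```python
def quantize(f, q):
    """Forward quantizer for non-INTRA-dc coefficients, q in 1..31."""
    if q % 2 == 0 and f != 0:
        f = f + 1 if f > 0 else f - 1   # the +/-1 term applies only for even Q
    magnitude = abs(f) // (2 * q)       # truncate toward zero
    return magnitude if f >= 0 else -magnitude


def reconstruct(level, q):
    """Inverse quantizer (assumed form of the H.261 reconstruction rule)."""
    if level == 0:
        return 0
    rec = q * (2 * level + 1) if level > 0 else q * (2 * level - 1)
    if q % 2 == 0:
        rec += -1 if level > 0 else 1
    return rec


def quantize_intra_dc(f):
    """INTRA dc term: C = (F + 4) / 8 with integer division."""
    return (f + 4) // 8


def reconstruct_intra_dc(c):
    """Inverse for the INTRA dc term: F = 8C."""
    return 8 * c


level_odd = quantize(37, 5)    # 37 / 10, truncated
level_even = quantize(37, 4)   # (37 + 1) / 8, truncated
```

Under these assumptions every level survives a reconstruct-then-quantize round trip, which is the consistency one would expect between the forward and inverse formulas.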

Loop Filter (FIL)
• The filter is separable into one-dimensional horizontal and vertical functions.
• Each function is non-recursive with coefficients of ¼, ½, ¼, except at block edges.
• At block edges the coefficients are 0, 1, 0.
• The filter is switched on/off for all 6 blocks in an MB according to MTYPE.
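A sketch of the separable filter on one 8 × 8 block follows. It keeps the result as fractions and leaves out any rounding of the output to integers; the function names are illustrative.

```python
def filter_1d(samples):
    """Taps (1/4, 1/2, 1/4) inside the line; taps (0, 1, 0) at both ends,
    which leaves the two edge samples unchanged."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1]) / 4
    return out


def loop_filter(block):
    """Apply the 1-D filter to every row, then to every column."""
    rows = [filter_1d(row) for row in block]        # horizontal pass
    cols = [filter_1d(col) for col in zip(*rows)]   # vertical pass on columns
    return [list(row) for row in zip(*cols)]        # transpose back to rows


# A single bright pixel spreads into a 3 x 3 neighbourhood.
impulse = [[0] * 8 for _ in range(8)]
impulse[3][3] = 16
smoothed = loop_filter(impulse)
```

Because the edge taps are 0, 1, 0, the filter never reads samples outside the block, so each block can be filtered on its own.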

H.261 Decoder (block diagram; labelled elements: Intra and Inter paths, Inverse DCT, Loop Filter, Motion Compensation)

Decoder
• Source format: pictures are coded as luminance and two colour difference components (Y, Cb, and Cr).
• CIF (Common Intermediate Format): Y 352 × 288; Cb, Cr 176 × 144

Decoder
• QCIF (Quarter-CIF): Y 176 × 144; Cb, Cr 88 × 72
• CIF for NTSC (National Television System Committee) input (MPEG SIF 525): Y 352 × 240; Cb, Cr 176 × 120
• All codecs must be able to operate using QCIF. Some codecs can also operate with CIF.

H.261 Video Formats (figure legend: Y pixel; Cb, Cr pixel; block boundary)

Arrangement of H.261 (figure): a CIF picture of 352 × 288 and a QCIF picture of 176 pixels width, each divided into GOBs of 176 × 48

Arrangement of data structures in H.261 (figure): QCIF picture, 176 × 144; GOB (Group of Blocks), 176 × 48; MB (Macroblock), 16 × 16, built from 8 × 8 blocks
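The dimensions in these figures fix how many GOBs and macroblocks a picture contains. The arithmetic can be sketched as follows, using the luminance sizes given for CIF and QCIF; the names are illustrative.

```python
MB_SIZE = 16                       # a macroblock covers 16 x 16 luminance pixels
GOB_WIDTH, GOB_HEIGHT = 176, 48    # a GOB covers 176 x 48 luminance pixels
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def layout(name):
    """Count GOBs and macroblocks for a picture format."""
    width, height = FORMATS[name]
    gobs = (width // GOB_WIDTH) * (height // GOB_HEIGHT)
    mbs_per_gob = (GOB_WIDTH // MB_SIZE) * (GOB_HEIGHT // MB_SIZE)
    return {"gobs": gobs, "mbs_per_gob": mbs_per_gob, "mbs": gobs * mbs_per_gob}


cif = layout("CIF")
qcif = layout("QCIF")
```

Each GOB holds 11 × 3 macroblocks regardless of the picture format; only the number of GOBs changes between CIF and QCIF.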

Positioning of luminance and chrominance samples (figure legend: Y pixel; Cb, Cr pixel; block boundary)

Data Structure of Compressed Bitstream in H.261
• Layers: Picture Layer, GOB Layer, MB Layer, Block Layer
• Figure legend: Fixed Length Code, Variable Length Code

Structure of picture layer
Fields: PSC, TR, PTYPE, PEI, … PSPARE, PEI, … GOB data
• Picture start code (PSC) (20 bits): 0000 0000 0000 0001 0000
• Temporal reference (TR) (5 bits): formed by incrementing its value in the previously transmitted picture header by one plus the number of non-transmitted pictures since the last transmitted one. (Only the five LSBs are used.)

Structure of picture layer
Fields: PSC, TR, PTYPE, PEI, … PSPARE, PEI, … GOB data
• Type information (PTYPE) (6 bits), where Bit 1 is the MSB:
• Bit 1: Split screen indicator
• Bit 2: Document camera indicator, “0” off, “1” on
• Bit 3: Freeze picture release, “0” off, “1” on
• Bit 4: Source format, “0” QCIF, “1” CIF
• Bit 5: Optional still image mode HI_RES, “0” on, “1” off
• Bit 6: Spare
• Extra insertion information (PEI) (1 bit): “1” signals the presence of the following optional data field.
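The fixed-length fields of the picture layer can be read with a sketch like the one below. It works on a string of "0"/"1" characters for clarity, stops after the first PEI bit, and uses illustrative function and key names; a real decoder would operate on packed bytes.

```python
PSC = "00000000000000010000"   # 20-bit picture start code


def next_tr(previous_tr, skipped_pictures):
    """TR of the next transmitted picture; only the five LSBs are kept."""
    return (previous_tr + 1 + skipped_pictures) % 32


def parse_picture_header(bits):
    """Parse PSC, TR, PTYPE and the first PEI bit from a bit string."""
    if not bits.startswith(PSC):
        raise ValueError("missing picture start code")
    pos = len(PSC)
    tr = int(bits[pos:pos + 5], 2)
    pos += 5
    ptype = bits[pos:pos + 6]      # ptype[0] is Bit 1, the MSB
    pos += 6
    return {
        "tr": tr,
        "split_screen": ptype[0] == "1",
        "document_camera": ptype[1] == "1",
        "freeze_picture_release": ptype[2] == "1",
        "source_format": "CIF" if ptype[3] == "1" else "QCIF",
        "hi_res": ptype[4] == "0",  # inverted sense: "0" means on
        "pei": bits[pos] == "1",
    }


header = parse_picture_header(PSC + "00111" + "000110" + "0")
```

Note that HI_RES is the one flag whose sense is inverted, so a parser that treats every "1" as "on" would misread it.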

GOB Layer
Fields: GBSC, GN, GQUANT, GEI, … GSPARE, GEI, … MB data
• Group of blocks start code (GBSC) (16 bits): 0000 0000 0000 0001 (if followed by “0000”, it is treated as a PSC)
• Group number (GN) (4 bits): GN indicates the position of the group of blocks. 13, 14 and 15 are reserved for future use. 0 (0000) is used in the PSC.

GOB Layer
Fields: GBSC, GN, GQUANT, GEI, … GSPARE, GEI, … MB data
• Quantizer information (GQUANT) (5 bits): the quantizer to be used in the GOB until overridden by any subsequent MQUANT.
• Extra insertion information (GEI) (1 bit): “1” signals the presence of the following optional data field.
• Spare information (GSPARE) (0/8/16… bits): if GEI = “1”, the following 8 bits of data are GSPARE.
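The GOB header, including the repeated GEI/GSPARE pairs, can be parsed as in this sketch. It works on a string of "0"/"1" characters, assumes that each GSPARE byte is followed by another GEI bit, and uses illustrative names.

```python
GBSC = "0000000000000001"   # 16-bit group of blocks start code


def parse_gob_header(bits):
    """Parse GBSC, GN, GQUANT and any GSPARE bytes; return (fields, position)."""
    if not bits.startswith(GBSC):
        raise ValueError("missing GOB start code")
    pos = len(GBSC)
    gn = int(bits[pos:pos + 4], 2)
    pos += 4
    if gn == 0:
        raise ValueError("GN = 0: this is a picture start code, not a GOB header")
    gquant = int(bits[pos:pos + 5], 2)
    pos += 5
    gspare = []
    while bits[pos] == "1":                  # GEI = 1: 8 bits of GSPARE follow
        gspare.append(bits[pos + 1:pos + 9])
        pos += 9                             # skip this GEI and its GSPARE byte
    pos += 1                                 # consume the terminating GEI = 0
    return {"gn": gn, "gquant": gquant, "gspare": gspare}, pos


fields, mb_data_start = parse_gob_header(
    GBSC + "0011" + "01000" + "1" + "10101010" + "0" + "111")
```

The returned position is where the macroblock data begins, so a caller can continue parsing the MB layer from there.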

MB Layer
Fields: MBA, MTYPE, MQUANT, MVD, CBP, Block data
• Macroblock address (MBA) (variable length: TABLE 1): MBA indicates the position of an MB within a GOB. It is the difference between the absolute addresses of the MB and the last transmitted MB.
• Type information (MTYPE) (variable length: TABLE 2)
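The differential addressing can be illustrated with a short sketch. It assumes absolute addresses numbered from 1 within the GOB and treats the first transmitted MB as a difference from address 0, so that its MBA equals its absolute address; `mba_values` is an illustrative name.

```python
def mba_values(addresses):
    """Convert ascending absolute MB addresses of the transmitted MBs in one
    GOB into the MBA differences that would be variable-length coded."""
    previous = 0            # assumed starting point before the first MB
    values = []
    for address in addresses:
        values.append(address - previous)
        previous = address
    return values


# MBs 3 and 4 and MBs 6 to 32 are skipped, so the gaps show up as larger MBAs.
mbas = mba_values([1, 2, 5, 33])
```

Skipped macroblocks cost no bits of their own; they are implied by an MBA greater than 1.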

MB Layer
Fields: MBA, MTYPE, MQUANT, MVD, CBP, Block data
• Quantizer (MQUANT) (5 bits): MQUANT is present only if so indicated by MTYPE (1, 3, 6, 9).
