Binary Image Compression

Binary Image Compression • The art of modeling image source • Image pixels are NOT independent events • Run length coding of binary and graphic images • Applications: BMP, TIF/TIFF • Lempel-Ziv coding* • How does the idea of building a dictionary can achieve data compression? EE465: Introduction to Digital Image Processing

From Theory to Practice • So far, we have discussed the problem of data compression under the assumption that the source distribution is known (i.e., the probabilities of all possible events are given). • In practice, the source distribution (probability models) is unknown and can only be approximately obtained by relative frequency counting method due to the inherent dependency among source data. EE465: Introduction to Digital Image Processing

An Image Example Binary image sized 100100 (approximately 1000 dark pixels) EE465: Introduction to Digital Image Processing

A Bad Model • Each pixel in the image is observed as an independent event • All pixels can be characterized by a single discrete random variable – binary Bernoulli source • We can estimate the probabilities by relative frequency counting EE465: Introduction to Digital Image Processing

Synthesized Image by the Bad Model Binary Bernoulli distribution (P(X=black)=0.1) EE465: Introduction to Digital Image Processing

Why does It Fail? • Roughly speaking • Pixels in an image are not independent events (source is not memoryless but spatially correlated) • Pixels in an image do not observe the same probability model (source is not stationary but spatially varying) • Fundamentally speaking • Pixels are the projection of meaningful objects in the real world (e.g., characters, lady, flowers, cameraman, etc.) • Our sensory processing system has learned to understand meaningful patterns in sensory input through evolution EE465: Introduction to Digital Image Processing

Similar Examples in 1D • Scenario I: think of a paragraph of English texts. Each alphabet is NOT independent due to semantic structures. For example, the probability of seeing a “u” is typically small; however, if a “q” appears, the probability of the next alphabet being “u” is large. • Scenario II: think of a piece of human speech. It consists of silent and voiced segments. The silent segment can be modeled by white Gaussian noise; while the voiced segment can not (e.g., pitches) EE465: Introduction to Digital Image Processing

Data Compression in Practice Y entropy coding binary bit stream source modeling discrete source X P(Y) probability estimation Probabilities can be estimated by counting relative frequencies either online or offline The art of data compression is the art of source modeling EE465: Introduction to Digital Image Processing

Source Modeling Techniques • Transformation • transform the source into an equivalent yet more convenient representation • Prediction • Predict the future based on the causal past • Pattern matching • Identify and represent repeated patterns EE465: Introduction to Digital Image Processing

Non-image Examples • Transformation in audio coding (MP3) • Audio samples are transformed into frequency domain • Prediction in speech coding (CELP) • Human vocal tract can be approximated by an auto-regressive model • Pattern matching in text compression • Search repeated patterns in texts EE465: Introduction to Digital Image Processing

How to Build a Good Model • Study the source characteristics • The origin of data: computer-generated, recorded, scanned … • It is about linguistics, physics, material science and so on … • Choose or invent the appropriate tool (modeling technique) Q: why pattern matching is suitable for texts, but not for speech or audio? EE465: Introduction to Digital Image Processing

Image Modeling • Binary images • Scanned documents (e.g., FAX) or computer-generated (e.g., line drawing in WORD) • Graphic images • Windows icons, web banners, cartoons • Photographic images • Acquired by digital cameras • Others: fingerprints, CT images, astronomical images … EE465: Introduction to Digital Image Processing

Lossless Image Compression • No information loss – i.e., the decoded image is mathematically identical to the original image • For some sensitive data such as document or medical images, information loss is simply unbearable • For others such as photographic images, we only care about the subjective quality of decoded images (not the fidelity to the original) EE465: Introduction to Digital Image Processing

Run Length Coding (RLC) What is run length? Run length is defined as the length of consecutively identical symbols Examples HHHHHTHHHHHHH Coin-flip 5 1 7 SSSSEEEENNNNWWWW random walk 4 4 4 4 EE465: Introduction to Digital Image Processing

Run Length Coding (Con’t) Y Transformation by run-length counting Entropy coding binary bit stream discrete source X P(Y) probability estimation Y is the sequence of run-lengths from which X can be recovered losslessly EE465: Introduction to Digital Image Processing

RLC of 1D Binary Source 00001110000001111100 X 4 3 6 5 2 Y Huffman coding (need extra 1 bit to denote what the starting symbol is) compressed data Properties • “0” run-length (red) and “1” run-length (green) alternates - run-lengths are positive integers EE465: Introduction to Digital Image Processing

Variation of 1D Binary RLC When P(x=0) is close to 1, we can record run-length of dominant symbol (“0”) only Example 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 11 0 0 0 0 0 0 0 0 1 … 5 7 4 0 8 run-length Properties - all coded symbols are “0” run-lengths - run-length is a nonnegative integer EE465: Introduction to Digital Image Processing

Modeling Run-length geometric source: P(X=k)=(1/2)k, k=1,2,… Run-length k Probability 1/2 1/4 1/8 1/16 1/32 … 1 2 3 4 5 … EE465: Introduction to Digital Image Processing

Golomb Codes Optimal VLC for geometric source: P(X=k)=(1/2)k, k=1,2,… codeword 0 10 110 1110 11110 111110 1111110 11111110 … … k 1 2 3 4 5 6 7 8 … 0 1 1 0 1 0 1 0 … EE465: Introduction to Digital Image Processing

From 1D to 2D white run-length black run-length Question: what distinguishes 2D from 1D coding? Answer: inter-line dependency of run-lengths EE465: Introduction to Digital Image Processing

Relative Address Coding (RAC)* previous line 0 0 0 0 01 1 1 1 1 1 10 0 0 0 0 0 0 0 0 0 0 7 current line 0 0 0 0 00 1 1 1 1 1 1 1 10 0 0 0 0 1 1 0 0 d1=1 d2=-2 NS,run=2 NS – New Start Its variation was adopted by CCITT for Fax transmission EE465: Introduction to Digital Image Processing

Image Example CCITT test image No. 1 Size: 17282376 Raw data (1bps) filesize of ccitt1.pbm: 513216 bytes filesize of ccitt1.tif: 37588 bytes Compression Ratio=13.65 EE465: Introduction to Digital Image Processing

Graphic Images (Cartoons) Total 12 different colors (black,white,red,green,blue,yellow …) • Observations: • dominant background color (e.g., white) • objects only contain a few other colors EE465: Introduction to Digital Image Processing

Palette-based Image Representation color index 0 white Any (R,G,B) 24-bit color can be repre- sented by its index in the palette. 1 black 2 red green 3 4 blue yellow 5 … … EE465: Introduction to Digital Image Processing

RLC of 1D M-ary Source Basic idea Record not only run-lengths but also color indexes Example color sequence WWWWGGGWWWWWWWRRRR… 0 0 0 0 03 3 30 0 0 0 0 0 02 2 2 210 0 0 0 0 0 0 0 … 5 3 7 4 1 8 run-length 0 3 0 2 1 0 color-index (color, run) representation: (0,5) (3,3) (0,7) (2,4) (1,1) (0,8) … Note: run-length is a positive integer EE465: Introduction to Digital Image Processing

Variation of 1D M-ary RLC When P(x=0) is close to 1, we can record run-length of dominant symbol (“0”) only Example 0 0 0 0 0 1 0 0 0 0 0 0 0 4 0 0 0 0 32 0 0 0 0 0 0 0 0 1 … 5 7 4 0 8 run 1 4 3 2 1 level (run, level) representation: (5,1) (7,4) (4,3) (0,2) (8,1) … Properties - “0” run-lengths only - run-length is a nonnegative integer EE465: Introduction to Digital Image Processing

Image Example Raw data: party8.ppm, 526286, 451308 bytes Compressed: party8.bmp, 17094 bytes Compression ratio=26.4 EE465: Introduction to Digital Image Processing

History of Lempel-Ziv Coding • Invented by Lempel-Ziv in 1977 • Numerous variations and improvements since then • Widely used in different applications • Unix system: compress command • Winzip software (LZW algorithm) • TIF/TIFF image format • Dial-up modem (to speed up the transmission) EE465: Introduction to Digital Image Processing

Dictionary-based Coding • Use a dictionary • Think about the evolution of an English dictionary • It is structured - if any random combination of alphabets formed a word, the dictionary would not exist • It is dynamic - more and more words are put into the dictionary as time moves on • Data compression is similar in the sense that redundancy reveals as patterns, just like English words in a dictionary EE465: Introduction to Digital Image Processing

Toy Example I took a walk in town one day And met a cat along the way. What do you think that cat did say? Meow, Meow, Meow entry pattern 1 I took a walk in town one day 2 And met a I took a walk in town one day And met a pig along the way. What do you think that pig did say? Oink, Oink, Oink 3 along the way 4 What do you think that 5 did say? I took a walk in town one day And met a cow along the way. What do you think that cow did say? Moo, Moo, Moo 6 cat 7 meow … … … - from “Wee Sing for Baby” EE465: Introduction to Digital Image Processing

Basic Ideas • Build up the dictionary on-the-fly (based on causal past such that decoder can duplicate the process) • Achieve the goal of compression by replacing a repeated string by a reference to an earlier occurrence • Unlike VLC (fixed-to-variable), LZ parsing goes the other way (variable-to-fixed) EE465: Introduction to Digital Image Processing

Lempel-Ziv Parsing • Initialization: • D={all single-length symbols} • L is the smallest integer such that all codewords whose length is smaller than L have appeared in D (L=1 at the beginning) • Iterations: wnext parsed block of L symbols • Rule A: If wD, then represent w by its entry in D and update D by adding a new entry with w concatenated with its next input symbol • Rule B: If wD, then represent the first L-1 symbols in w by its entry in D and update D by adding a new entry with w EE465: Introduction to Digital Image Processing

Example of Parsing a Binary Stream Dictionary input: 0 1 1 0 0 1 1 1 0 1 1 … entry pattern 1 0 w: 0, 11, 10, 00, 01, 110, 011, … 2 1 rule: L=1 A B B B A B A 3 01 output: 1, 2, 2, 1, 3, 4, 7, … (entries in D) 11 4 L=2 Illustration: Step 1: w=0, Rule A, output 1, add 01 to D, L=2 Step 2: w=11, Rule B, output 2, add 11 to D Step 3: w=10, Rule B, output 2, add 10 to D Step 4: w=00, Rule B, output 1, add 00 to D Step 5: w=01, Rule A, output 3, add 011 to D, L=3 Step 6: w=110, Rule B, output 4, add 110 to D 10 5 6 00 011 7 L=3 110 8 fixed -length variable -length EE465: Introduction to Digital Image Processing

Binary Image Compression Summary • Theoretical aspect • Shannon’s source entropy formula tells us the lower bound for coding a memoryless discrete source • To achieve the source entropy, we need to use variable length codes (i.e., long codeword assigned to small-probability event and vice versa) • Huffman’s algorithm (generating the optimal prefix codes) offers a systematic solution • Practical aspect • Data in the real world contains memory (dependency, correlation …) • We don’t know the probability distribution function of the source • Tricky point: the goal of image compression (processing) is not to make your algorithm work for one image, but for a class of images EE465: Introduction to Digital Image Processing

The Art of Source Modeling How do I know a model matches or not? Hypothesis Model: Bernoulli distribution (P(X=black)=0.1) Observation Data EE465: Introduction to Digital Image Processing

A Matched Example Synthesis Model EE465: Introduction to Digital Image Processing

Good Models for Binary Images Transformation by Lempel-Ziv parsing Y Transformation by run-length counting Huffman coding binary bit stream discrete source X P(Y) probability estimation Y is the sequence of run-lengths from which X can be recovered losslessly Y is the sequence of dictionary entries from which X can be recovered losslessly EE465: Introduction to Digital Image Processing

Remaining Questions • How to choose different compression techniques, say RLC vs. LZC? • Look at the source statistics • For instance, compare binary image (good for RLC) vs. English texts (good for LZC) • What is the best binary image compression technique? • It is called JBIG2 – an enhanced version of JBIG • What is the entropy of binary image source? • We don’t know and the question itself is questionable • A more relevant question would be: what is the entropy for a probabilistic model which we believe generates the source? EE465: Introduction to Digital Image Processing

Binary Image Compression

Binary Image Compression

Presentation Transcript

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image compression

Image Compression

Image Compression Binary Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression

Image Compression