Applications of Data Hiding in Digital Images

Applications of Data Hiding in Digital Images • Tutorial for The ISSPA’99, Brisbane, Australia, August 22-25, 1999 • Jessica Fridrich • Center for Intelligent Systems, SUNY Binghamton, Binghamton, NY 13902-6000, U.S.A. • and • Mission Research Corporation, 1720 Randolph Rd. SE, Albuquerque, NM 87105, U.S.A. • Fax/Ph: (607) 777-2577 • E-mail: fridrich@binghamton.edu • http://www.ssie.binghamton.edu/fridrich

Outline • Introduction to Data Hiding • - History • - Motivation • - Definition • - Terminology • - Properties • Covert communication (steganography) • Digital watermarking (robust message embedding) • Watermarking for tamper detection and authentication • Attacks on hiding schemes • Open problems, challenges

Data Hiding in Digital Imagery • Relatively young and fast-growing field • Well over 90% of all publications were published in the last 6 years • Highly multidisciplinary field combining image and signal processing with cryptography, communication theory, coding theory, signal compression, and the theory of visual perception • Tremendous interest from industry and the military

Data Hiding - History • First techniques included invisible ink, secret writing using chemicals, templates laid over text messages, microdots, changing letter/word/line/paragraph spacing, and changing fonts • Images, video, and audio files provide sufficient redundancy for effective data hiding • PostScript files, PDF files, and HTML can also be used for non-robust data hiding to a limited extent • Executable files provide very little space for data hiding • Fonts

The Need for Data Hiding • Covert communication using images (secret message is hidden in a carrier image) • Ownership of digital images, authentication, copyright • Data integrity, fraud detection, self-correcting images • Traitor-tracing (fingerprinting video-tapes) • Adding captions to images and additional information, such as subtitles or audio tracks, to video (video-in-video) • Intelligent browsers, automatic copyright information, viewing a movie in a given rated version • Copy control (secondary protection for DVD)

Requirements • [Table: each application is rated on a scale from Low to High against six requirements: capacity, robustness, invisibility, security, embedding complexity, detection complexity] • Applications: Covert communication • Copyright protection of images (authentication) • Fingerprinting (traitor-tracing) • Adding captions to images, additional information, such as subtitles, to videos • Image integrity protection (fraud detection) • Copy control in DVD • Intelligent browsers, automatic copyright information, viewing movies in given rated version

Redundancy and Irrelevancy make data hiding possible • Redundancy: information-theoretic, removed by lossless compression • Irrelevancy: perceptual, removed by lossy compression • [Figure: original image with ±2, ±5, and ±31 gray levels added]

Data Hiding - Definition • [Diagram: secret message + key + carrier document → embedding algorithm → transmission via network → detector (with key) → secret message] • Relationship carrier - message • Who extracts the message? (source versus destination coding) • How many recipients are there? • Is the key public knowledge or a shared secret? • Do we embed different messages into one carrier? • Is embedding / detection bundled with a key in tamper-proof hardware? • Is the speed of embedding / detection important?

Properties of hiding schemes • Robustness: The ability to extract hidden information after common image processing operations: linear and nonlinear filters, lossy compression, contrast adjustment, recoloring, resampling, scaling, rotation, noise adding, cropping, printing / copying / scanning, D/A and A/D conversion, pixel permutation in a small neighborhood, color quantization (as in palette images), skipping rows / columns, adding rows / columns, frame swapping, frame averaging (temporal averaging), etc. • Undetectability: The impossibility of proving the presence of a hidden message. This concept is inherently tied to the statistical model of the carrier image. The ability to detect the presence does not automatically imply the ability to read the hidden message. Undetectability should not be mistaken for invisibility, a concept related to human perception. • Invisibility: Perceptual transparency. This concept is based on the properties of the human visual system or the human auditory system. • Security: The embedded information cannot be removed beyond reliable detection by targeted attacks based on full knowledge of the embedding algorithm and the detector (except a secret key), and the knowledge of at least one carrier with a hidden message.

The “Magic” Triangle • There is a trade-off between capacity, invisibility, and robustness • [Diagram: triangle with vertices Capacity, Undetectability, and Robustness, on which naïve steganography, secure steganographic techniques, and digital watermarking are placed] • Additional factors: • Complexity of embedding / extraction • Security

Outline • Introduction • Covert communication (steganography) • Message hiding in RGB images • - Absolutely secure steganographic method • - LSB encoding • Message hiding in palette images • - Permuting the palette • - LSB encoding in the palette • - EZ Stego • - Improved EZ Stego • Digital watermarking (robust message embedding) • Watermarking for tamper detection and authentication • Attacks on watermarks • Open problems, challenges

Covert Communication • Purpose: To conceal the very presence of communication, to make the communication invisible. • Encryption: To make the message unintelligible • [Diagram: Andy and Bob exchange messages under the watch of Warden Willie. Willie: "Secret communication??!!" Caption: "I just posted a picture of my cat on my web page!"]

Covert Communication • Encryption and steganography provide double protection • Randomized message is easier to hide • [Diagram: secret message → encryption unit → embedding algorithm (with carrier image) → modified carrier]

Steganography for RGB images • Absolutely secure steganographic technique • Method: Embed a small message (8 bits) by repeatedly scanning a cover image until a certain password-dependent message-digest function returns the required 8-tuple of bits. • Comments: • Absolute secrecy tantamount to the one-time pad used in cryptography • Guarantees correct noise distribution and undetectability • Time consuming, very limited capacity, not applicable to image carriers for which we only have one copy
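The rescanning idea can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: SHA-256 stands in for the unspecified message-digest function, the first digest byte is taken as the 8-tuple of bits, and `acquire_image` is a hypothetical stand-in that simulates a fresh scan with random bytes.

```python
import hashlib
import random

def digest_byte(password: bytes, image: bytes) -> int:
    """First byte of SHA-256(password || image): the 8 bits this image 'carries'."""
    return hashlib.sha256(password + image).digest()[0]

def acquire_image(rng: random.Random, size: int = 64) -> bytes:
    """Hypothetical stand-in for one fresh scan; real scans differ by sensor noise."""
    return bytes(rng.randrange(256) for _ in range(size))

def find_carrier(message_byte: int, password: bytes, rng: random.Random,
                 max_scans: int = 100_000):
    """Rescan until the keyed digest of an unmodified scan equals the message byte."""
    for scans in range(1, max_scans + 1):
        image = acquire_image(rng)
        if digest_byte(password, image) == message_byte:
            return image, scans
    raise RuntimeError("no suitable scan found")

# The carrier is never modified; the receiver only recomputes the digest.
carrier, scans_needed = find_carrier(0x5A, b"shared password", random.Random(1))
received = digest_byte(b"shared password", carrier)
```

Each scan matches with probability 1/256, so about 256 scans are needed on average for 8 bits, which is why the capacity is so limited.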

Steganography for RGB images • LSB Encoding (Least Significant Bit) • Method: • Replace the LSB of each pixel with the secret message • Pixels may be chosen randomly according to a secret key • Pixels may be chosen adaptively according to neighborhood • Message should always be encrypted • Comments: • The simplest and most common steganographic technique • Premise = changes to the least significant bit will be masked by noise commonly present in digital images • Color images provide more room for hiding messages • If more than one LSB is used, statistically detectable changes may result • A provably secure method should introduce changes consistent with the noise model
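A minimal sketch of LSB encoding with key-dependent pixel selection follows. It treats the image as a flat list of 8-bit samples and uses Python's `random.Random(key)` as the pixel selector; both are simplifying assumptions, and a real system would use a cryptographic generator and an encrypted message.

```python
import random

def select_pixels(n_pixels: int, n_bits: int, key: int) -> list:
    """Key-dependent choice of distinct positions; identical on both ends."""
    return random.Random(key).sample(range(n_pixels), n_bits)

def embed_lsb(pixels: list, bits: list, key: int) -> list:
    """Overwrite the least significant bit of the selected samples."""
    stego = list(pixels)
    for pos, bit in zip(select_pixels(len(pixels), len(bits), key), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego

def extract_lsb(stego: list, n_bits: int, key: int) -> list:
    """Read the LSBs back from the same key-selected positions."""
    return [stego[pos] & 1 for pos in select_pixels(len(stego), n_bits, key)]

rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(1000)]   # stand-in for image samples
message = [0, 1, 1, 0, 0, 1, 0, 1]
stego = embed_lsb(cover, message, key=42)
recovered = extract_lsb(stego, len(message), key=42)
```

No sample changes by more than one gray level, which is the premise that the change hides in the image noise.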

Steganography for palette images • LSB encoding cannot be directly applied to palette-based images because new colors that are not present in the palette would be created. • Two sources of palette images: 1. Color truncation + dithering of photographs 2. Computer-generated images (fractals, cartoons, animations) • A secure steganographic method will produce modified carriers compatible with the source

Possibilities • Hiding in the palette • Hiding in the image data • Non-adaptive techniques • Adaptive techniques • Artifacts: palette artifacts, image data artifacts

Possible approaches • Message hiding in the palette • Permuting palette entries - Image is not modified - Very limited capacity of log2(256!) ≈ 1684 bits ≈ 210 bytes - Too fragile (resaving) - Suspicious palette order is an artifact • LSB encoding in the palette - Very limited capacity (at most 3×256 = 768 bits) - Palette artifacts? • Common disadvantage: Capacity is severely limited and independent of the image size
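As an illustration of hiding in the palette order, the sketch below encodes an integer into a permutation of the palette using the factorial number system. The choice of encoding and all function names are assumptions; the slides do not specify how the permutation is derived from the message. The palette colors are assumed to be distinct.

```python
from math import factorial

def capacity_bits(n_colors: int) -> int:
    """Whole bits that fit into the ordering of n distinct colors: floor(log2(n!))."""
    return factorial(n_colors).bit_length() - 1

def encode_permutation(palette: list, value: int) -> list:
    """Reorder the palette so that its order encodes `value`."""
    pool = sorted(palette)                  # canonical order known to both sides
    if not 0 <= value < factorial(len(pool)):
        raise ValueError("value does not fit into the palette order")
    out = []
    for radix in range(len(pool), 0, -1):
        value, digit = divmod(value, radix)
        out.append(pool.pop(digit))
    return out

def decode_permutation(palette: list) -> int:
    """Recover the integer from the order of the palette entries."""
    pool = sorted(palette)
    value, weight = 0, 1
    for radix, color in zip(range(len(pool), 0, -1), palette):
        digit = pool.index(color)
        pool.pop(digit)
        value += digit * weight
        weight *= radix
    return value

palette = [(i, 255 - i, (7 * i) % 256) for i in range(256)]   # 256 distinct colors
secret = int.from_bytes(b"meet at noon", "big")
permuted = encode_permutation(palette, secret)
```

In a real image file the pixel indices would also have to be remapped to the new palette order so that the displayed image stays the same.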

Possible approaches • Message hiding in the image data - greedy techniques • Decrease color depth and expand (1 bpp): 1. Collapse 256 colors → 128 colors 2. Expand 128 colors → 256 colors by including a close color (e.g., flip the LSB of the blue channel) 3. Embed a binary message into the LSB of the blue channel of randomly selected pixels 4. Read the message from the LSB of the blue channel • Alternatively (3 bpp): 1. Decrease color depth to 32 colors and include all colors obtained from LSB shuffling of all 32 colors (one color produces 2^3 colors) 2. Encode messages into the LSBs of pixel colors

Possible approaches Message hiding in the image data Parity embedding 1. Assign parity to palette colors 2. Embed message bits as the parity of colors

Message: 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 1 1 1 1 • Randomly choose a pixel with color C1 • Find the color in the sorted palette: C1 index = 30 = 00011110 • Replace the LSB of the index to color C1 with the message bit: 00011110 → 00011111 • The new index now points to a neighboring color C2 • Replace the index of the pixel in the original image to point to the new color C2 • Critical assumption: Colors close in the luminance-sorted palette are also close in the color space.
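The index replacement walked through above can be sketched as follows. The luminance weights (0.299, 0.587, 0.114) and the function names are assumptions, since the slide only says the palette is sorted by luminance; the palette is assumed to have an even number of entries so that every sorted position has a partner.

```python
def luminance(color):
    r, g, b = color
    return 0.299 * r + 0.587 * g + 0.114 * b

def sorted_order(palette):
    """Palette indices sorted by luminance, plus the inverse map (index -> position)."""
    order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
    rank = {index: position for position, index in enumerate(order)}
    return order, rank

def embed_bit(pixel_index, bit, palette):
    """Return the palette index whose sorted position has `bit` as its LSB."""
    if len(palette) % 2:
        raise ValueError("this sketch assumes an even palette size")
    order, rank = sorted_order(palette)
    new_rank = (rank[pixel_index] & ~1) | bit
    return order[new_rank]

def extract_bit(pixel_index, palette):
    """The message bit is the LSB of the pixel color's sorted position."""
    _, rank = sorted_order(palette)
    return rank[pixel_index] & 1

# 32 gray levels, already in luminance order, so position equals index here.
gray_palette = [(v, v, v) for v in range(0, 256, 8)]
new_index = embed_bit(30, 1, gray_palette)   # 00011110 becomes 00011111
```

With a palette whose luminance neighbors differ strongly in hue, the same code would swap visibly different colors, which is the weakness the critical assumption points at.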

New approach using color parities • Message hiding in the image data (1 bpp) • (1) For each message bit, randomly select a pixel • (2) Calculate the set of the closest palette colors (in Euclidean norm). The distance d between colors (R1, G1, B1) and (R2, G2, B2) is given by $d^2 = (R_1 - R_2)^2 + (G_1 - G_2)^2 + (B_1 - B_2)^2$ • (3) Find the closest color whose parity agrees with the message bit. The parity of a color is defined as (R + G + B) mod 2. • (4) Change the index for the pixel to point to the new color. • To extract the secret message, pixels are selected using a key and the secret message is simply read by extracting the parity bits of the colors of the selected pixels.
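The four steps above can be sketched as follows. The image is modeled as a flat list of palette indices and the key-dependent pixel selection uses `random.Random(key)`; both are assumptions made to keep the example short.

```python
import random

def parity(color):
    """Parity of a color: (R + G + B) mod 2."""
    return sum(color) % 2

def dist2(c1, c2):
    """Squared Euclidean distance between two RGB colors."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def closest_with_parity(pixel_index, bit, palette):
    """Index of the palette color closest to the pixel's color with parity == bit."""
    target = palette[pixel_index]
    candidates = [i for i, c in enumerate(palette) if parity(c) == bit]
    if not candidates:
        raise ValueError("palette has no color with the required parity")
    return min(candidates, key=lambda i: dist2(palette[i], target))

def embed(indices, bits, palette, key):
    stego = list(indices)
    carriers = random.Random(key).sample(range(len(indices)), len(bits))
    for pos, bit in zip(carriers, bits):
        stego[pos] = closest_with_parity(stego[pos], bit, palette)
    return stego

def extract(stego, n_bits, palette, key):
    carriers = random.Random(key).sample(range(len(stego)), n_bits)
    return [parity(palette[stego[pos]]) for pos in carriers]

palette = [(0, 0, 0), (10, 10, 11), (200, 30, 40), (201, 30, 40),
           (90, 90, 90), (255, 255, 255)]
image = [0, 1, 2, 3, 4, 5] * 5
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(image, bits, palette, key=7)
```

A pixel whose color already has the required parity is left untouched, because its own color is at distance zero.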

Message: 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 1 1 1 1 • Randomly choose a pixel with color C1 • Find the closest colors in the palette • Replace C1 with the closest color that has the same parity as the message bit. Color parity of (R, G, B) = (R + G + B) mod 2. • Advantages over EZ Stego: • The total change to the image due to message embedding is always smaller • We avoid occasionally large changes in color that are possible with EZ Stego

Optimal parity assignment • Oblivious reading requirement: The optimal parity assignment has to be reconstructable from the modified image at the receiving end. • [Diagram: the optimal parity used to embed the message equals the optimal parity computed from the modified carrier to extract the message] • Efficient algorithm for optimal parity assignment • Optimal parity depends only on the palette and does not depend on the image content! • The optimal palette parity is also optimal for multiple-pixel embedding • The average decrease in the RMS error due to optimal palette parity is about 25-35%.

Adaptive Steganography • Non-adaptive steganography = modifications due to message embedding are uncorrelated with image features. Examples are LSB encoding in randomly selected pixels, modulation of randomly selected frequency bins in a fixed band, etc. • Adaptive steganography = modifications are correlated with the image content (features). - Pixels carrying message bits are selected adaptively depending on the image - Avoiding areas of uniform color - Selecting pixels with large local standard deviation • Potential problem with message recovery: We have to be able to extract the same set of message-carrying pixels at the receiving end from the modified image.

Computer generated Julia set • Large areas of uniform color • Internal structure of the image - it is a fractal Julia set • Fonts

Artifacts caused by non-adaptive methods Artifacts around the Julia set. Artifacts in the fonts.

Method 1: Adaptive block embedding • Message embedding • Divide the image into disjoint 3×3 blocks • Randomly choose blocks and evaluate some local statistical quantity, such as standard deviation or number of colors, and decide whether or not a message bit can be embedded (good vs. bad block) • If the block is bad, skip it and do not insert a message bit • If the block is good, insert the bit into the block parity • If the block becomes bad after embedding, keep the change but repeat the same message bit in the next block • Message extraction • Generate the same random walk through the image blocks • Read the parity from all good blocks
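The embedding and extraction rules above can be sketched as follows. Several details are assumptions chosen for illustration: the image is a 2-D list of 8-bit values, a block is "good" when it holds at least three distinct values, the block parity is the sum of its values mod 2, and the parity is changed by flipping the LSB of the centre pixel.

```python
import random

BLOCK = 3

def block_pixels(image, by, bx):
    return [image[by * BLOCK + dy][bx * BLOCK + dx]
            for dy in range(BLOCK) for dx in range(BLOCK)]

def is_good(image, by, bx, min_colors=3):
    """Hypothetical goodness test: at least `min_colors` distinct values in the block."""
    return len(set(block_pixels(image, by, bx))) >= min_colors

def block_parity(image, by, bx):
    return sum(block_pixels(image, by, bx)) % 2

def block_walk(image, key):
    """Key-dependent random walk through all blocks; depends only on the image size."""
    blocks = [(by, bx) for by in range(len(image) // BLOCK)
              for bx in range(len(image[0]) // BLOCK)]
    random.Random(key).shuffle(blocks)
    return blocks

def embed(image, bits, key):
    stego = [row[:] for row in image]
    remaining = list(bits)
    for by, bx in block_walk(stego, key):
        if not remaining:
            break
        if not is_good(stego, by, bx):
            continue                                    # bad block: skip it
        if block_parity(stego, by, bx) != remaining[0]:
            stego[by * BLOCK + 1][bx * BLOCK + 1] ^= 1  # flip LSB of the centre pixel
        if is_good(stego, by, bx):
            remaining.pop(0)                            # bit is readable, move on
        # otherwise the change is kept and the same bit goes into the next block
    if remaining:
        raise ValueError("message does not fit into the good blocks")
    return stego

def extract(stego, n_bits, key):
    bits = []
    for by, bx in block_walk(stego, key):
        if len(bits) == n_bits:
            break
        if is_good(stego, by, bx):
            bits.append(block_parity(stego, by, bx))
    return bits
```

Because the receiver evaluates goodness on the modified image, a block that turned bad during embedding is skipped on both sides, which is what keeps the two walks in step.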

Limitations Ultimately, image understanding is important for secure adaptive steganography. A human can easily recognize that a pixel is actually a dot above the letter "i" and must not be changed. However, it would be very hard to write a computer program capable of making such intelligent decisions in all possible cases. Example of a difficult area for secure adaptive message embedding - fonts on a complex background

Embedding while dithering • True-color images are converted to palette images via color quantization and dithering • Idea: Embed message bits while doing the dithering • [Diagram: true-color image → compute palette → quantize, dither, and embed → 256-color image. A 256-color image can first have its color depth increased by interpolation, or one can start directly with the true-color image.]

Embedding while dithering • 1. Select a random collection of pixels that will carry message bits. 2. For non-message pixels, use classical dither to the closest palette color. 3. For message pixels, dither to the closest color with the right parity. • Rounding error is added to the next pixel: E11 = q11 - p11, and the next pixel is quantized from p12 + E11 • [Diagram: original 24-bit image (p11, p12, ...) → quantizer Q → dithered quantized image (q11, q12, ...)] • Palette P = {q1, …, q256} • Quantizer Q: for non-message pixels, q is the closest palette color; for message pixels, q is the closest palette color with the right parity
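A minimal sketch of the three steps follows. It makes several assumptions for brevity: the whole rounding error is pushed to the next pixel in the same row only (a simplified one-dimensional error diffusion rather than a full kernel), the error is taken as the wanted value minus the quantized value as in standard error diffusion, and message pixels are chosen with `random.Random(key)`.

```python
import random

def parity(color):
    return sum(color) % 2

def closest(palette, target, want_parity=None):
    """Closest palette color to `target`, optionally restricted to one parity."""
    pool = [c for c in palette if want_parity is None or parity(c) == want_parity]
    return min(pool, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, target)))

def dither_and_embed(image, palette, bits, key):
    """image: rows of (R, G, B) tuples; returns rows of palette colors."""
    height, width = len(image), len(image[0])
    carriers = random.Random(key).sample(range(height * width), len(bits))
    bit_at = dict(zip(carriers, bits))
    out = []
    for y, row in enumerate(image):
        error = (0, 0, 0)                    # error is not carried across rows here
        out_row = []
        for x, pixel in enumerate(row):
            wanted = tuple(p + e for p, e in zip(pixel, error))
            # Message pixels are restricted to colors with the required parity.
            q = closest(palette, wanted, bit_at.get(y * width + x))
            error = tuple(w - c for w, c in zip(wanted, q))
            out_row.append(q)
        out.append(out_row)
    return out

def extract(stego, n_bits, key):
    width = len(stego[0])
    carriers = random.Random(key).sample(range(len(stego) * width), n_bits)
    return [parity(stego[pos // width][pos % width]) for pos in carriers]

# 27-color palette; colors containing an odd number of 255s have parity 1.
palette = [(r, g, b) for r in (0, 128, 255) for g in (0, 128, 255) for b in (0, 128, 255)]
```

The extra error caused by forcing a parity at a message pixel is diffused to its neighbor like any other rounding error, which is why this approach tends to look less conspicuous than modifying an already dithered image.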

Performance example • Test image in JPEG format • [Figure: three panels: Original, Non-adaptive, Embedding while dithering]

Outline • Introduction, history, motivation, definition, terminology, properties • Covert communication (steganography) • Digital watermarking (robust message embedding) • - Copyright protection of digital images (authentication) • - Fingerprinting (traitor-tracing) • - Adding captions to images, additional information to videos • - Methods for Robust Data Hiding (Watermarking) • - Image integrity protection (fraud detection) • - Copy control in DVD • Watermarking for tamper detection and authentication • Attacks on watermarks • Open problems, challenges
