Current Trends in Image Quality Perception

Mason Macklem, Simon Fraser University

General Outline • Examine current image quality standard • need for improvements on current standard • Examine common image compression techniques • potential quality techniques applicable to each • Discuss further theoretical and constructive models

Image Compression Model • Transform an image from one domain into a better domain, in which the imperceptible information contained in the image is easily discarded • Goal: more efficient representation

3 Ways to Improve Compression • Better domain: design better image transforms, improve energy compaction • Imperceptible: design better perceptual image metric • Discarded: design better image quantization methods

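Current Standard: MSE-based • Mean-Squared Error (MSE): MSE = (1/N) Σ (x_i − y_i)², summed over the N pixels of the original x and the compressed image y • Root-Mean-Squared Error (RMS): RMS = √MSE • Peak Signal-To-Noise Ratio (PSNR): PSNR = 10 log10(peak² / MSE) dB, where peak is the maximum pixel value (255 for 8-bit images)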
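The three measures named on this slide can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions; the function names, the list-of-rows image representation, and the default peak value of 255 (8-bit grayscale) are assumptions made for the example, not something specified in the slides.

```python
import math

def mse(original, compressed):
    """Mean-squared error between two equally sized grayscale images.

    Images are plain lists of rows of pixel values (an assumption).
    """
    total = 0.0
    count = 0
    for row_o, row_c in zip(original, compressed):
        for p, q in zip(row_o, row_c):
            total += (p - q) ** 2
            count += 1
    return total / count

def rms(original, compressed):
    """Root-mean-squared error: the square root of the MSE."""
    return math.sqrt(mse(original, compressed))

def psnr(original, compressed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, compressed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

original = [[100, 100], [100, 100]]
compressed = [[102, 98], [100, 104]]
print(mse(original, compressed), rms(original, compressed),
      psnr(original, compressed))
```

All three values are monotone functions of one another, so they rank a set of compressed versions of the same image identically.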

MSE-based metrics • Measure image quality locally, i.e. pixel by pixel • Not representative of what the eye actually sees • Return a single number, intended to represent the quality of the compressed image • Not accurate for cross-image or cross-algorithm comparisons

MSE pathologies • Local (pixel-by-pixel) quality measure • does not differentiate between constant (not noticeable) and varying (noticeable) error types • does not take into account local contrast • assumes no delay or noise in channel • Known result: the human visual system (HVS) treats the above error types differently

[Figure: original bird image, shown next to versions with sinusoidal error and with constant error]

[Figure: low contrast = no masking; high contrast = masking]

[Figure: original bird image; sinusoidal error (MSE = 12.34); image offset by 1 pixel (MSE = 230.7)]
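The pathologies above can be reproduced on a synthetic image. The sketch below is only an illustration: the striped test image, the error amplitudes, and the wrap-around shift are assumptions chosen for the example and are unrelated to the bird image or the MSE values quoted on the slides. It shows a constant error and a sinusoidal error receiving the same MSE, while a one-pixel shift, which leaves the picture content unchanged, receives a far larger one.

```python
import math

def mse(a, b):
    """Mean-squared error between two equally sized images (lists of rows)."""
    count = len(a) * len(a[0])
    return sum((p - q) ** 2
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b)) / count

SIZE = 32  # width and height of the synthetic image (assumption)

# Synthetic stand-in for a photograph: vertical stripes, two pixels wide.
image = [[64.0 if (x // 2) % 2 == 0 else 192.0 for x in range(SIZE)]
         for _ in range(SIZE)]

# Constant error: every pixel is brightened by the same amount.
constant = [[p + 4.0 for p in row] for row in image]

# Sinusoidal error with amplitude 4*sqrt(2), so its mean square is also 16.
amplitude = 4.0 * math.sqrt(2.0)
sinusoidal = [[p + amplitude * math.sin(2.0 * math.pi * x / 8.0)
               for x, p in enumerate(row)] for row in image]

# The same image shifted right by one pixel (wrapping around at the edge).
shifted = [row[-1:] + row[:-1] for row in image]

mse_constant = mse(image, constant)
mse_sinusoidal = mse(image, sinusoidal)
mse_shifted = mse(image, shifted)
print(mse_constant, mse_sinusoidal, mse_shifted)
```

The stripes are deliberately high-contrast, which exaggerates the shift penalty; smoother images give a smaller but still disproportionate value.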

MSE Pathology II: Fractal Compression • Based on theory of Partitioned Iterated Function Systems (PIFS) • uses larger blocks contained in the image to represent smaller blocks • represent smaller blocks using displacement vector • match larger to smaller to maintain contraction • blocks chosen to minimize MSE • partly motivated by promising MSE results

Fractal Compression Model • Divide image into domain and range blocks • Find closest affine transformation for each range block from domain blocks • Set maximum depth, code all unmatched blocks directly (e.g. DCT) • Highly computational, dependent on choice of domain and range blocks • Balance computational and quality requirements • fewer blocks checked, lower image quality • slow encoding offset by fast decoding
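The block-matching step can be sketched as a least-squares search. This is a simplified illustration, not the full PIFS encoder: it assumes each domain block has already been downsampled to the size of the range block and flattened into a list, it fits only a contrast scale `s` and brightness offset `o`, and it ignores rotations, reflections, and the contractivity constraint on `s`. The names `fit_affine` and `best_match` are hypothetical.

```python
def fit_affine(domain, rng):
    """Least-squares fit of rng ~ s * domain + o; returns (s, o, mse)."""
    n = len(domain)
    mean_d = sum(domain) / n
    mean_r = sum(rng) / n
    var_d = sum((d - mean_d) ** 2 for d in domain)
    if var_d == 0:
        s = 0.0  # flat domain block: only the offset can help
    else:
        s = sum((d - mean_d) * (r - mean_r)
                for d, r in zip(domain, rng)) / var_d
    o = mean_r - s * mean_d
    err = sum((s * d + o - r) ** 2 for d, r in zip(domain, rng)) / n
    return s, o, err

def best_match(rng, domains):
    """Exhaustively search the domain blocks for the lowest-MSE fit."""
    best = None
    for index, domain in enumerate(domains):
        s, o, err = fit_affine(domain, rng)
        if best is None or err < best[3]:
            best = (index, s, o, err)
    return best

range_block = [10.0, 20.0, 30.0, 40.0]
domain_blocks = [
    [5.0, 5.0, 5.0, 5.0],     # flat block
    [0.0, 20.0, 40.0, 60.0],  # same ramp at twice the contrast
    [7.0, 1.0, 9.0, 3.0],     # unrelated texture
]
index, s, o, err = best_match(range_block, domain_blocks)
print(index, s, o, err)
```

The exhaustive loop in `best_match` is where the encoding cost comes from; the match is selected purely on MSE, which is the dependence the slides question.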

Models to improve computational complexity: • loosen criteria for “matching” blocks, e.g. take the first block below a given threshold, or take the closest block within a given radius • Good MSE/PSNR results not reflected in visual appearance of resulting image • success of fractal compression depends more on internal composition of image than on overall model • if similar blocks are not present among the domain blocks, then dissimilar blocks will be matched

Better transforms & vision models • Choice of better domain highly dependent on visual criteria • Better quality metric impacts the design stage of compression algorithm • better assessment of visual quality = more accurate prediction of compression artifacts • Fractal Compression model depended on inaccurate quality model (?)

Better Transforms • Lossless: • All information in reconstructed image is identical to original image • E.g. BMP, GIF, TIFF (in its usual lossless modes) • Lossy: • Discard information in original image to achieve higher compression rates • Strategically discard only imperceptible information • E.g. JPEG, wavelet compression • In network-based applications, more focus is given to lossy transforms

JPEG • Split image into 8x8 blocks • Image sections small enough to assume high correlation between adjacent pixels • Apply 8x8 DCT to each block • Concentrates the energy of each block in its upper-left (low-frequency) entries • Quantize, run-length encode • Quantization: lossy step, discards information • RLE: takes advantage of sparseness of result
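The run-length step above can be illustrated with a generic (value, count) encoder. This is a simplified sketch and an assumption about form: the actual JPEG entropy coder pairs zero-run lengths with the following nonzero coefficient and then applies Huffman or arithmetic coding, which is not reproduced here. The coefficient sequence is made up for the example.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1  # extend the current run
        else:
            encoded.append([v, 1])  # start a new run
    return [tuple(pair) for pair in encoded]

def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# Hypothetical quantized coefficients: a few nonzero values, then zeros.
coefficients = [52, -3, 0, 0, 0, 1, 0, 0, 0, 0]
pairs = run_length_encode(coefficients)
print(pairs)
```

The longer the zero runs left by quantization, the fewer pairs are needed, which is why the sparseness of the quantized block matters.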

8x8 DCT Matrix
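As a rough sketch of how such a matrix can be built and applied, the code below constructs the orthonormal DCT-II matrix C and transforms a block B as C B Cᵀ. The orthonormal scaling is an assumption (the slide's matrix may be scaled differently), and the helper names are hypothetical. A constant block is used to show all of the energy landing in the upper-left coefficient.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix: row k samples a cosine of frequency k."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """2-D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

flat_block = [[100.0] * 8 for _ in range(8)]
coeffs = dct2(flat_block)
print(round(coeffs[0][0], 6))
```

Because C is orthonormal, the inverse transform is Cᵀ X C, so the DCT itself loses no information; the loss comes from the quantization step.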

JPEG Quantization Matrices • Divide each entry of the transformed (DCT) block by the corresponding entry in the quantization matrix, then round • Class of matrices built into JPEG standard • Contained in the JPEG file, with image information • Flexibility with quantization tables (?)
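The entry-by-entry division can be sketched as follows. The quantization table here is hypothetical, generated by a simple formula that grows with frequency; it is not one of the tables from the JPEG standard, and the coefficient values are made up. Rounding to the nearest integer is the step that discards information.

```python
def make_table(n=8, base=8, step=4):
    """Hypothetical quantization table: coarser steps at higher frequencies."""
    return [[base + step * (u + v) for v in range(n)] for u in range(n)]

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round (lossy)."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply the stored levels back by the table entries."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

table = make_table()

# Made-up DCT coefficients: large DC term, a few small AC terms.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 800.0
coeffs[0][1] = -31.0
coeffs[1][0] = 13.0
coeffs[7][7] = 5.0

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels[0][:2], levels[7][7], restored[0][:2])
```

The small high-frequency coefficient is rounded to zero and cannot be recovered, while the low-frequency terms come back only approximately (−31 returns as −36 in this example).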

MSE Pathology III: DCT • [Figure: original image; sinusoidal error (MSE = 12.34); DCT-based error (MSE = 320.6)]

JPEG2000 & Wavelet Compression • New JPEG standard is wavelet-based • Wavelet compression studied extensively for years • JPEG2000 is the first attempt at standardizing it • WSQ: used to compress fingerprints for the FBI • used in place of JPEG, which quickly blurred important information • Similar compression ratios to JPEG, but with higher quality

Wavelet Transform • Alternative to Fourier transform • Localized in time and frequency • No blocking/windowing artifacts • Compact support • Sums of dilations and translations of (mother) wavelet function

Multi-resolution Analysis • Complete nested sequence of function spaces Vj, with {0} intersection • Scale-invariance: f(t) is in Vj iff f(2t) is in Vj+1 • Shift-invariance: f(t) is in Vj iff f(t-k) is in Vj (k integer) • Shift-invariant basis: V0 has an orthonormal basis of integer shifts of a scaling function • Difference spaces: Wj is the complement of Vj in Vj+1, i.e. Vj+1 = Vj ⊕ Wj • Wavelets: basis functions for the Wj's • express function in terms of scaling function and wavelets

DWT & Filter Banks • DWT: banded matrix, with filter coefficients on diagonals • Multiply matrix by input signal • Highpass filter: flip lowpass coefficients and alternate signs • Discard every other (even-indexed) entry to construct output signal, i.e. downsample by 2

DWT separates function into averages and details • global and local info • Two filters: highpass and lowpass • lowpass: low frequency (averages) • highpass: high frequency (details) • Highpass filter: annihilates a constant signal (no detail info) • Lowpass filter: annihilates an oscillating signal (no global info) • Result: two signals, each half the length of the original • most info in lowpass signal
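The two-filter split can be sketched with the Haar wavelet, the simplest possible choice. Using Haar is an assumption made for illustration (the slides do not name a particular filter), and the function names are hypothetical. The lowpass filter is (1/√2, 1/√2); flipping it and alternating signs gives the highpass filter (1/√2, −1/√2). Each output keeps one value per input pair, which is the downsampling step.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar filter coefficient

def haar_analysis(signal):
    """One DWT level: returns (averages, details), each half as long."""
    low = [(signal[i] + signal[i + 1]) * S
           for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * S
            for i in range(0, len(signal), 2)]
    return low, high

def haar_synthesis(low, high):
    """Inverse of haar_analysis: interleave sums and differences."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out

constant = [5.0] * 8          # no detail information
oscillating = [1.0, -1.0] * 4  # no global information
mixed = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]

low_c, high_c = haar_analysis(constant)
low_o, high_o = haar_analysis(oscillating)
low_m, high_m = haar_analysis(mixed)
rebuilt = haar_synthesis(low_m, high_m)
print(high_c, low_o)
```

For the constant signal the detail half is all zeros, and for the oscillating signal the average half is all zeros, matching the two cases described above. Applying `haar_analysis` again to the lowpass output gives the next, coarser level.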

DWT & Image Compression • [Figure: DWT applied along the columns and rows of the image]

Wavelets and Images • Bottom-up • imperceptible differences separated into details • L1 norm applied to 1st quadrant only • Top-down • 1st quadrant entries give same general image • L1 norm applied to detail quadrants • Both give results similar to MSE-based methods
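A rough sketch of the quadrant-based measures follows, under several assumptions: a single-level Haar transform, a square image with even side length, and the reading that "1st quadrant" means the top-left (lowpass) quadrant while the other three quadrants hold the details. The function names and the checkerboard error pattern are hypothetical. The transform is applied to the difference between the original and distorted images, and the L1 norm is taken separately over the two regions.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_step(values):
    """One Haar level on a 1-D list: averages first, then details."""
    half = len(values) // 2
    low = [(values[2 * i] + values[2 * i + 1]) * S for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) * S for i in range(half)]
    return low + high

def haar_2d(image):
    """Single-level 2-D Haar transform: rows first, then columns."""
    rows = [haar_step(row) for row in image]
    cols = [haar_step(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def quadrant_l1(coeffs):
    """L1 norms of the top-left quadrant and of the three detail quadrants."""
    half = len(coeffs) // 2
    approx = 0.0
    detail = 0.0
    for y, row in enumerate(coeffs):
        for x, c in enumerate(row):
            if y < half and x < half:
                approx += abs(c)
            else:
                detail += abs(c)
    return approx, detail

original = [[10.0] * 4 for _ in range(4)]
# Distorted copy: +/-1 checkerboard error, i.e. purely fine detail.
distorted = [[10.0 + (1.0 if (x + y) % 2 == 0 else -1.0) for x in range(4)]
             for y in range(4)]
difference = [[d - o for d, o in zip(d_row, o_row)]
              for d_row, o_row in zip(distorted, original)]

approx_err, detail_err = quadrant_l1(haar_2d(difference))
approx_orig, detail_orig = quadrant_l1(haar_2d(original))
print(approx_err, detail_err)
```

Here the checkerboard error shows up only in the detail quadrants, so a measure restricted to the first quadrant would report no error at all, while one restricted to the details would report all of it.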

Picture Quality Scale (PQS) • Parameterized error measure • separate image into different types of error • calculate weighted sum, with weights determined by curve-fitting subjective results • Five factors: • normalized MSE (regular and thresholded) • blocking artifacts • MSE on correlated errors • Errors near high-contrast image-transitions

Each factor has an associated error image • Designed so that the contributions to the final quality rating can be localized • A better idea of the location of compression error helps at algorithm design time • Results equivalent to MSE • Miyahara, Kotani & Algazi (JAIST & UC Davis)

Lessons from PQS • Start with visual system • base model on observations of subjects • Localize information about error • using pictorial distance representation, rather than outputting a number to represent quality • Need more than MSE-based measures • PQS fails on same pathologies as MSE

Fidelity vs. Quality • Image Fidelity: • Measured in terms of the “closeness” of an image to an original source, or ideal, image • e.g. MSE measures, PQS • Image Quality: • Measured in terms of a single image’s internal characteristics • Depends on the criteria, application-specific • e.g. medical imaging

Quality-based Approach • Modelled by IPO (Eindhoven) • Natural image as “conveyer of visual information about natural world” • “quality” based on internal properties of the image, judged only against the past experiences of the subject • e.g. “quality” of a picture of grass depends on its ability to conform to the subject’s expectations of the appearance of grass

IPO model • Pros: • Very nice theoretically • Clearly defined notions of quality • Based on theory of cognitive human vision • Flexible for application-specific models • Cons: • Practical to implement? • Subject-specific definition of quality • Subjects are more accurate at relative than at absolute judgments

Next-wave: HVS-based
