Perceived video quality measurement Muhammad Saqib Ilyas CS 584 Spring 2005
Agenda • Motivation • Subjective vs. objective measures • Mean Square Error based metrics • Recommended framework • Perceived video quality metrics • Structural distortion based metrics • Other video quality metrics • Conclusions
Motivation • Background in voice over IP • PSQM (Perceptual Speech Quality Measure) • Improving voice over IP quality over slow WAN links
Subjective measures • Mean opinion score • Most reliable • Slow • Expensive
MSE based metrics • Mean Square Error • Peak SNR
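A minimal sketch of the two metrics named on this slide, assuming 8-bit grayscale frames stored as lists of rows (the function names and the `peak` parameter are illustrative, not from the slides):

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        for r, d in zip(ref_row, dist_row):
            total += (r - d) ** 2
            count += 1
    return total / count

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)

# Tiny 2x2 example frames (assumed values, for illustration only).
ref = [[100, 110], [120, 130]]
dist = [[102, 108], [121, 129]]
print(mse(ref, dist), psnr(ref, dist))
```

Both numbers depend only on pixel differences, so they are cheap to compute but take no account of how visible a given error is.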
Pre-processing • Temporal alignment • Compression, processing and transmission • Color space transformation • Visual blur in HVS • Linear, space-invariant lowpass filter characterized by its PSF (Point Spread Function) • Light adaptation • Weber’s law
Contrast Sensitivity Function • Models sensitivity to different spatial and temporal frequencies • Bandpass in nature • Implementation: • Filter • Weighting factors after channel decomposition • Mostly lowpass filters used • More robust to changes in viewing distance • Temporal filters
Channel decomposition • Visual cortex neurons tuned to • spatial and temporal frequencies • Orientation • Direction of motion • Neuron modeled as 2D Gabor function • Collection of neurons modeled as octave band Gabor filter bank • Spatial spectrum sampled at • Octave intervals in radial frequency dimension • Uniform intervals in orientation dimension
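The Gabor model on this slide can be sketched as follows. This is an illustration under assumptions, not the filter bank of any particular metric: the parameter names, the choice `sigma = 0.5 * wavelength`, and the requirement that `size` be odd are all hypothetical.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, phase=0.0):
    """Real 2D Gabor kernel: Gaussian envelope times an oriented cosine carrier.

    size must be odd so the kernel has a center sample.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def octave_bank(size, base_wavelength, n_scales, n_orientations):
    """Bank sampled at octave intervals in frequency, uniform steps in orientation."""
    bank = []
    for s in range(n_scales):
        wavelength = base_wavelength * 2 ** s  # one octave per scale
        for o in range(n_orientations):
            theta = math.pi * o / n_orientations  # uniform orientation spacing
            bank.append(gabor_kernel(size, wavelength, theta, sigma=0.5 * wavelength))
    return bank

bank = octave_bank(size=9, base_wavelength=4.0, n_scales=3, n_orientations=4)
```

Convolving a frame with each kernel in `bank` would give one channel per scale and orientation pair.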
Channel decomposition • Visual cortex neuron output saturates with increase in contrast • Typically modeled based on application and computational constraints • Sophisticated channel decomposition • Wavelet transforms • DCT
Error normalization and masking • Masking/facilitation • Presence of one image component will decrease/increase visibility of another image component at the same spatial location • Strongest when two signals have the same frequency components and orientation • Implementation • Gain control – space-varying visibility threshold for the particular channel
Error normalization and masking • Base error threshold for every channel is elevated to account for the presence of the reference signal • Elevated threshold used to normalize error signal into JND
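One way to picture threshold elevation and normalization into JND units is the sketch below. The power-law elevation rule and the default `exponent=0.7` are assumptions chosen for illustration; the slides do not specify a particular masking model.

```python
def elevated_threshold(base_threshold, reference_coeff, exponent=0.7):
    """Raise the channel's base visibility threshold when the reference
    signal at this location is strong (assumed power-law masking rule)."""
    ratio = abs(reference_coeff) / base_threshold
    return base_threshold * max(1.0, ratio ** exponent)

def error_in_jnd(reference_coeff, distorted_coeff, base_threshold, exponent=0.7):
    """Express the coefficient error in units of just-noticeable differences."""
    threshold = elevated_threshold(base_threshold, reference_coeff, exponent)
    return (distorted_coeff - reference_coeff) / threshold

# Same absolute error of 4, on a weak and on a strong reference coefficient.
weak = error_in_jnd(0.0, 4.0, base_threshold=2.0)
strong = error_in_jnd(32.0, 36.0, base_threshold=2.0)
```

Under this assumed rule the same absolute error counts for fewer JNDs when the reference coefficient is large, which is the masking effect described above.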
Error pooling • Combines error values from various channels into one • Typically Minkowski pooling is used: $E = \left( \sum_i \sum_k |e_{i,k}|^{\beta} \right)^{1/\beta}$ • $e_{i,k}$ is the normalized and masked error of the k-th coefficient in the i-th channel • $\beta$ is a constant, typically with a value between 1 and 4
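A minimal sketch of Minkowski pooling: each absolute error is raised to the power β, the results are summed over all channels and coefficients, and the 1/β root is taken. The nested-list layout (`errors[i][k]`) is an assumption for illustration.

```python
def minkowski_pool(errors, beta=2.0):
    """Pool normalized errors errors[i][k] (channel i, coefficient k) into one value."""
    total = sum(abs(e) ** beta for channel in errors for e in channel)
    return total ** (1.0 / beta)

# Two channels with one coefficient each (assumed values).
pooled_l2 = minkowski_pool([[3.0], [4.0]], beta=2.0)
pooled_l1 = minkowski_pool([[3.0], [4.0]], beta=1.0)
```

Larger values of β give more weight to the largest errors; β = 1 reduces to a plain sum of absolute errors.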
Video distortion meter • Image quality assessment • Cognitive emulator • Asymmetric tracking • Humans detect quality transition from good to poor more readily
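Asymmetric tracking can be pictured as a smoothing filter that follows quality drops quickly and recoveries slowly. The sketch below is only an illustration of that idea; the function name and the two rate parameters are hypothetical and are not taken from the video distortion meter itself.

```python
def asymmetric_track(frame_quality, drop_rate=0.8, recover_rate=0.1):
    """Track per-frame quality scores, reacting fast to drops and slowly to recovery."""
    if not frame_quality:
        return []
    tracked = []
    state = frame_quality[0]
    for q in frame_quality:
        # Use the fast rate when quality falls below the current estimate.
        rate = drop_rate if q < state else recover_rate
        state += rate * (q - state)
        tracked.append(state)
    return tracked

# A brief quality dip (higher score = better quality, assumed scale).
scores = asymmetric_track([5.0, 1.0, 5.0], drop_rate=0.5, recover_rate=0.25)
```

With these rates the tracked score falls sharply at the dip and has only partly recovered one frame later.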
Multi-metric MPEG quality • Combines • Error sensitivity based metric • Blockiness detection
Digital Video Quality (DVQ) • Simplicity is key consideration • LC: Local contrast is ratio of DCT amplitudes to DC amplitude for a block • CM: Contrast masking
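The local contrast step can be sketched as a block DCT followed by division by the DC term. This is a simplified illustration under assumptions: it uses an orthonormal DCT-II on a square block and divides by the block's own DC coefficient, and it omits the other stages of the metric.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def local_contrast(dct_block):
    """Ratio of each DCT amplitude to the block's DC amplitude."""
    dc = dct_block[0][0]
    if dc == 0:
        return [[0.0 for _ in row] for row in dct_block]
    return [[c / dc for c in row] for row in dct_block]

flat = [[10.0] * 4 for _ in range(4)]           # uniform block: no AC energy
ramp = [[float(x) for x in range(1, 5)] for _ in range(4)]  # horizontal ramp
lc_flat = local_contrast(dct2(flat))
lc_ramp = local_contrast(dct2(ramp))
```

A uniform block has zero contrast in every AC position, while the ramp shows nonzero contrast only along the horizontal-frequency row.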
Others • Moving picture quality metric • Color space based metric • Blocking artifact based metric
Critique • Computational complexity • Memory requirement • Viewing resolution • Resolution of display device • Non-linear relationship between digital pixel value and luminance • Viewing distance • Reliance on linear channel decomposition • Correlation between channels modeled using sophisticated masking techniques • Current masking models inaccurate
Structural Similarity Index Metric (SSIM) • Different kinds of image distortion have different perceived quality • Metrics discussed so far measure error • Error and structural distortion agree quite often • But the same amount of error may lead to different structural distortion • Top-down approach • Simulate the hypothesized functionality of the overall HVS
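A small sketch of the SSIM index computed over a single window, to show how two distortions with the same MSE can score differently. It assumes the commonly published form of the index with constants `k1=0.01` and `k2=0.03` and takes flattened pixel lists; practical implementations evaluate it over local windows and average the results, which this sketch does not do.

```python
def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM index over one window; x and y are flattened pixel lists of equal length."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

original = [10.0, 20.0, 30.0, 40.0]
shifted = [15.0, 25.0, 35.0, 45.0]      # uniform brightness shift, MSE = 25
alternating = [15.0, 15.0, 35.0, 35.0]  # +5/-5 errors, also MSE = 25

ssim_shift = ssim_global(original, shifted)
ssim_alt = ssim_global(original, alternating)
```

Both distorted signals have the same squared error, yet the brightness shift preserves the structure of the original and scores higher than the alternating error.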
Reduced reference metrics • Discussed in proposal presentation • Based on temporal and spatial dissimilarity information
No reference / blind • Complications • Unquantifiable factors when a reference is not available include, but are not limited to: • Aesthetics • Cognitive relevance • Learning • Visual context • Philosophy • All images/videos are perfect unless distorted during: • Acquisition • Processing • Reproduction
No reference • Determining the possible distortion introduced during these stages • Reference is “perfect” natural images/videos • Measured with respect to a model best suited to a given distortion type or application • E.g., natural images/videos do not contain blocking artifacts • To improve prediction • Some HVS aspects are also modeled • Texture and luminance masking
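As an illustration of a distortion-specific no-reference measure, the sketch below estimates blockiness by comparing pixel differences across assumed block boundaries with differences elsewhere. The 8-pixel block size, the horizontal-only check, and the ratio itself are assumptions for illustration, not a published metric.

```python
def blockiness(frame, block=8, eps=1e-9):
    """Ratio of mean horizontal pixel difference at block boundaries to the
    mean difference inside blocks. Values near 1 suggest no blocking."""
    boundary = []
    interior = []
    for row in frame:
        for x in range(1, len(row)):
            diff = abs(row[x] - row[x - 1])
            if x % block == 0:
                boundary.append(diff)   # step across an assumed block edge
            else:
                interior.append(diff)   # step inside a block
    if not boundary or not interior:
        return 0.0  # frame too narrow to contain a block boundary
    mean_boundary = sum(boundary) / len(boundary)
    mean_interior = sum(interior) / len(interior)
    return mean_boundary / (mean_interior + eps)

smooth = [[float(x) for x in range(16)] for _ in range(4)]  # gentle ramp
blocky = [[10.0] * 8 + [50.0] * 8 for _ in range(4)]        # step at x = 8

smooth_score = blockiness(smooth)
blocky_score = blockiness(blocky)
```

No reference frame is needed: the score relies only on the assumption that natural content does not place its strongest edges on a regular 8-pixel grid.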
Other metrics • Marker bits hidden in video frames • Marker bits additionally transmitted on an aux channel • Watermarking
Conclusion • Perceptual quality measures not yet mature • Computational complexity a big hurdle, especially for real-time applications • Accuracy of models also doubtful • Less accurate RR or NR metrics may be used where pin-point accuracy is not required