
Bi-level Video: Video Communication at Very Low Bit Rates

Bi-level Video: Video Communication at Very Low Bit Rates. Jiang Li; Gang Chen; Jizheng Xu; Yong Wang; Hanning Zhou; Keman Yu; King To Ng; Heung-Yeung Shum. OUTLINE. INTRODUCTION (Wireless Video) Scalable Video Coding Introduction (Bi-level Video) Approaches Experimental Result Conclusion.


Presentation Transcript


  1. Bi-level Video: Video Communication at Very Low Bit Rates Jiang Li; Gang Chen; Jizheng Xu; Yong Wang; Hanning Zhou; Keman Yu; King To Ng; Heung-Yeung Shum

  2. OUTLINE • INTRODUCTION (Wireless Video) • Scalable Video Coding • Introduction (Bi-level Video) • Approaches • Experimental Result • Conclusion

  3. INTRODUCTION • QoS requirements for delivery of real-time video • Bandwidth: minimum bandwidth requirements (e.g. 28 kb/s) • Delay constraints (e.g. 1 s) • Upper limits on bit error rate (e.g. 1%)

  4. INTRODUCTION (cont.) • Wireless Channel • Unreliability • Noise, multipath, and shadowing make the bit error rate (BER) high • Bandwidth Fluctuation • A mobile terminal moves between different networks • Multipath fading, co-channel interference, noise disturbances… • Heterogeneity • Receivers may differ in latency requirements, visual quality requirements, processing capabilities, power limitations, and bandwidth limitations…

  5. INTRODUCTION (cont.) • Heterogeneity (video conferencing) • Unicast video distribution using multiple point-to-point connections • [Figure: a sender connected to several receivers over link1 and link2]

  6. INTRODUCTION (cont.) • Multicast video distribution using point-to-multipoint transmission • [Figure: a sender connected to several receivers over link1 and link2] • Lack of flexibility in multicast: receivers may differ in latency requirements, visual quality requirements, processing capabilities, power limitations, and bandwidth limitations…

  7. INTRODUCTION (cont.) • Solutions • Scalable Video Coding • Network-Aware Adaptation of End Systems • Knowing the current status of network resources (e.g. bit error condition, available bandwidth) • Adapting video streams based on network status • Adaptive QoS Support from Networks • Adapting video streams during periods of QoS fluctuations and handoffs

  8. Scalable video coding • [Figure: a video stream split into layers labeled basic, Enhancement 1, and Enhancement 2]

  9. Scalable video coding (cont.) • An example for multicast • [Figure: layers labeled basic, Enhancement 1, and Enhancement 2]

  10. Scalable video coding (cont.) • Scalable video coding mechanisms • SNR Scalability • Degraded quality • Spatial Scalability • Smaller image size • Temporal Scalability • Lower frame rate
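As a rough sketch of how a multicast receiver might use layered streams, the following assumes each layer has a known bit rate and that a receiver takes the base layer plus as many enhancement layers as its bandwidth allows. The function name `select_layers` and all rates are hypothetical values chosen for illustration; they do not come from the slides.

```python
def select_layers(layer_rates_kbps, bandwidth_kbps):
    """Return how many layers (base layer first) fit into the bandwidth.

    An enhancement layer is only useful when every lower layer is also
    received, so selection stops at the first layer that does not fit.
    """
    total = 0.0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > bandwidth_kbps:
            break
        total += rate
        count += 1
    return count


# Hypothetical rates for: basic, Enhancement 1, Enhancement 2.
rates = [10.0, 20.0, 40.0]
for bw in (9.6, 33.6, 128.0):
    print(bw, "kb/s ->", select_layers(rates, bw), "layer(s)")
```

With these made-up rates, the slowest receiver cannot even take the base layer, which is the situation that motivates a coder designed for very low bit rates.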

  11. Introduction for Bi-level Video • Problems on Wireless Networks at Very Low Bit Rates (mobile phone, palm-size PC, …) • Low bandwidth (e.g. 9.6 kb/s) • Weak computational power • Short battery lifetime • Limited display capability (support for only a few colors or a black/white display)

  12. Introduction for Bi-level Video (cont.) • With MPEG-1/2/4 or H.261/H.263 at very low bit rates • Motion is not smooth • Images look like a collection of color blocks (only basic colors are preserved)

  13. Introduction for Bi-level Video (cont.) • Outline features of scenes are more important than basic colors of blocks • Bi-level image • [Figure: a gray-scale image and the corresponding bi-level image]

  14. Approaches • We generate a bi-level image sequence from a gray-scale image sequence by thresholding • Key problems • Noise and flicker in the bi-level image sequence must be filtered out (because they consume too many bits) • How to convert to a bi-level image sequence and process it so that it compresses easily and still keeps acceptable perceptual quality
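The basic thresholding step can be sketched as follows. This is a minimal illustration, assuming a frame stored as a list of rows of 8-bit gray levels and a fixed threshold; the name `to_bilevel` is hypothetical.

```python
def to_bilevel(gray, threshold):
    """Map each gray-scale pixel to 1 (white) if above the threshold, else 0 (black)."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]


frame = [
    [12, 200, 90],
    [130, 40, 255],
]
print(to_bilevel(frame, 128))  # [[0, 1, 0], [1, 0, 1]]
```

Pixels whose gray level sits close to the threshold can flip between black and white from frame to frame, which is the flicker the slide says must be filtered out.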

  15. Approaches (cont.) • [Figure: processing pipeline with four steps labeled step1 to step4]

  16. Approaches (cont.) • Static Region Detection and Duplication • Detecting static regions in the current frame and duplicating their pixels from the previous frame • Filtering out noise and flicker • Bit-rate control

  17. Approaches (cont.) • Dissimilarity threshold • The threshold on the difference between corresponding pixels in two successive frames • The higher the dissimilarity threshold, the more pixels are treated as similar to the corresponding pixels in the previous frame, and the lower the bit rate of the generated bit stream • [Figure: a pixel with value 2 in frame j-1 and 5 in frame j; with Thresh = 2 the pixel keeps the value 5, with Thresh = 6 it is duplicated as 2]

  18. Approaches (cont.) • Calculating the difference between the j & j-1 frames • Laplacian of a pixel • The second derivative of intensity at that pixel • Lj(x,y) = (formula not preserved in the transcript)
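One common discrete approximation of the second derivative, offered here as an assumption rather than as the slide's own formula, is the 4-neighbour Laplacian, where $I_j(x,y)$ denotes the gray-scale intensity of frame $j$ at pixel $(x,y)$:

```latex
L_j(x,y) = I_j(x+1,y) + I_j(x-1,y) + I_j(x,y+1) + I_j(x,y-1) - 4\,I_j(x,y)
```

It is zero in flat regions and large in magnitude at edges, which makes it less sensitive to uniform brightness changes than the raw intensity.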

  19. Approaches (cont.) • Calculating the difference between the j & j-1 frames • If the Laplacian of a region remains unchanged, the region is most likely static • Sum of absolute differences (SAD) of the Laplacians of pixels in a square window surrounding the target pixel (x,y)

  20. Approaches (cont.) • Calculating the difference between the j & j-1 frames • [Figure: the Laplacian difference Lj(x,y) – Lj-1(x,y) and the resulting SADj(x,y)]

  21. Approaches (cont.) • Calculating the difference between the j & j-1 frames • If SADj(x,y) ≤ td (dissimilarity threshold), the pixel is marked as static • [Figure: static (white) and motion (black) regions; label: To avoid misidentified static regions]
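The static-region test can be sketched in pure Python as below. This is an illustration under stated assumptions, not the authors' implementation: the Laplacian is the 4-neighbour form with edge replication at the borders, the square window has a hypothetical `radius` of 1, and the helper names are invented for this example.

```python
def laplacian(img):
    """4-neighbour discrete Laplacian (assumed form); borders replicate the edge."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[px(y, x + 1) + px(y, x - 1) + px(y + 1, x) + px(y - 1, x)
             - 4 * img[y][x] for x in range(w)] for y in range(h)]


def static_mask(prev, cur, td, radius=1):
    """Mark pixel (x, y) static when the SAD of Laplacians in its window is <= td."""
    h, w = len(cur), len(cur[0])
    lp, lc = laplacian(prev), laplacian(cur)
    mask = []
    for y in range(h):
        row = []
        for x in range(w):
            sad = 0
            for v in range(max(y - radius, 0), min(y + radius, h - 1) + 1):
                for u in range(max(x - radius, 0), min(x + radius, w - 1) + 1):
                    sad += abs(lc[v][u] - lp[v][u])
            row.append(sad <= td)
        mask.append(row)
    return mask


def duplicate_static(prev, cur, mask):
    """Copy pixels of static regions from the previous frame into the current one."""
    return [[prev[y][x] if mask[y][x] else cur[y][x]
             for x in range(len(cur[0]))] for y in range(len(cur))]


prev = [[10] * 5 for _ in range(5)]
cur = [row[:] for row in prev]
cur[0][0] = 50   # a real change in the scene
cur[4][4] = 11   # small sensor noise
mask = static_mask(prev, cur, td=20)
out = duplicate_static(prev, cur, mask)
print(out[0][0], out[4][4])  # 50 10
```

In this toy frame the large change survives, while the one-level noise falls under the threshold and is overwritten by the previous frame's value, so it costs no bits in the coded stream.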

  22. Approaches (cont.) • Adaptive Thresholding • Used to convert a gray-scale image to a bi-level image • Using Ridler's Iterative Selection method • e.g. sequence {1,3,5,6,7,9,12,13} • t1 = (1+3+5+6+7+9+12+13)/8 = 7 • tb = (1+3+5+6)/4 = 3.75, to = (9+12+13)/3 = 11.33, t2 = (3.75+11.33)/2 = 7.54 • tb = (1+3+5+6+7)/5 = 4.4, to = (9+12+13)/3 = 11.33, t3 = (4.4+11.33)/2 = 7.87 • tb = (1+3+5+6+7)/5 = 4.4, to = (9+12+13)/3 = 11.33, t4 = (4.4+11.33)/2 = 7.87 • ∴ t = 7.87 • Take threshold = t + tc
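The iteration can be sketched as a short function. This is a minimal version assuming a flat list of gray levels; values exactly equal to the current threshold are left out of both groups, which is an assumption made to match the first step of the example. The name `iterative_threshold` and the `eps` and `max_iter` parameters are hypothetical.

```python
def iterative_threshold(values, eps=1e-6, max_iter=100):
    """Ridler-style iterative selection on a flat list of gray levels."""
    t = sum(values) / len(values)  # initial guess: the global mean
    for _ in range(max_iter):
        below = [v for v in values if v < t]
        above = [v for v in values if v > t]
        if not below or not above:
            break  # the data cannot be split into two groups
        # New threshold: midpoint of the two group means (tb and to).
        t_new = (sum(below) / len(below) + sum(above) / len(above)) / 2
        if abs(t_new - t) < eps:
            return t_new
        t = t_new
    return t


print(round(iterative_threshold([1, 3, 5, 6, 7, 9, 12, 13]), 2))  # 7.87
```

The threshold stops changing once the split of values into the two groups no longer changes, which here happens after the third update.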

  23. Approaches (cont.) • Adaptive Context-based Arithmetic Encoding • Confidence level • The difference between the gray-scale value of the pixel and the threshold • Pixels whose gray-scale value is near the threshold could be classified as either black or white • For pixels whose absolute confidence level is less than the half-width of the threshold band, the bi-level value is assigned according to the indexed probability in the probability table • Coding the whole frame rather than many separate blocks
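A possible reading of the threshold band is sketched below. It assumes that "assigned according to the indexed probability" means taking whichever value the context model currently considers more probable; the function name `classify_pixel` and the `p_white` input (the model's probability that the pixel is white for its context) are hypothetical stand-ins for the coder's probability table.

```python
def classify_pixel(gray, threshold, half_width, p_white):
    """Return 1 (white) or 0 (black) for one pixel.

    confidence = gray - threshold. Outside the threshold band the sign of
    the confidence decides. Inside the band the pixel is ambiguous, so it
    takes the value the context model predicts as more likely (assumed rule).
    """
    confidence = gray - threshold
    if abs(confidence) < half_width:
        return 1 if p_white >= 0.5 else 0
    return 1 if confidence > 0 else 0


print(classify_pixel(130, 128, 5, p_white=0.1))  # 0: ambiguous, model expects black
print(classify_pixel(140, 128, 5, p_white=0.1))  # 1: clearly above the band
```

Letting ambiguous pixels follow the model's prediction makes them cheap to code arithmetically, at the price of a small change in appearance near the threshold.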

  24. Approaches (cont.) • Rate Control • Buffer size B = Imax + 4r/n • Imax: the maximum number of bits per frame that is allowed to be sent to the buffer • r: maximum video bit-rate • n: specified frame rate • Each group of frames (time = t, frame rate = n) is assigned rt bits • The I-frame takes bi bits; each of the remaining tn-1 P-frames is assigned bp = (rt-bi)/(tn-1) bits • [Figure: a buffer of size Imax+4r/n fed by an I-frame followed by P-frames]

  25. Approaches (cont.) • Rate Control • [Figure: buffer levels with two 15% marks; labels: Overflow, increase f to 9; Increase f by 1; Decrease f by 1]
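The two rate-control formulas can be checked with a few lines of arithmetic. The sketch assumes a group of frames lasts t seconds and therefore holds tn frames, one I-frame plus tn-1 P-frames; the numeric values and function names are hypothetical.

```python
def buffer_size(i_max, r, n):
    """B = Imax + 4r/n, in bits."""
    return i_max + 4 * r / n


def p_frame_budget(r, t, n, b_i):
    """bp = (rt - bi) / (tn - 1): bits available for each P-frame of a group."""
    return (r * t - b_i) / (t * n - 1)


# Hypothetical numbers: 9.6 kb/s channel, 5 frames/s, 4-second groups.
r, n, t = 9600, 5, 4
print(buffer_size(i_max=6000, r=r, n=n))         # 13680.0
print(p_frame_budget(r=r, t=t, n=n, b_i=5000))   # 33400 / 19 bits per P-frame
```

The term 4r/n equals four frames' worth of bits at the target rate, so the buffer can absorb one maximum-size frame plus a few average ones.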

  26. Experimental Results • MPEG-4 test video clips: (a) Akiyo, little head motion; (b) Grandma, large head motion; (c) Salesman, complex background • Ordinary clips captured from real scenes using PC digital cameras: (d) Chen, large head motion; (e) Wang, little head motion

  27. Experimental Results (cont.) • [Figure: result frames; panels labeled Complex background and Large head motion]

  28. Experimental Results (cont.) • H.263+ vs. bi-level video

  29. Experimental Results (cont.) • [Figure: (a) Akiyo, (c) Salesman] • Complex but stable background • H.263+ consumes few bits since the difference between two successive frames is almost negligible

  30. Conclusion • An I-frame in bi-level video is much smaller than in conventional DCT-based video • Short start-up time for streaming video • More I-frames can be inserted • Transmission errors can be recovered quickly • Provides a smooth perception of motion even at low frame rates

  31. Conclusion (cont.) • Clearer shape, smoother motion, shorter initial latency, cheaper computation cost • The number of bits per pixel is reduced to 1 • The coder need not estimate and store motion vectors, which reduces computational cost and coding bit-rate • Static region detection reduces flicker effects and therefore improves coding efficiency • The coding system works well on handheld PCs, palm-size PCs, and mobile phones that have only a small display screen and very limited computation capability and transmission bandwidth

  32. Conclusion (cont.) • Microsoft Portrait • Very low bit-rate video conferencing software • http://research.microsoft.com/~jiangli/portrait/
