Learn about using computer vision to identify display bugs on web pages caused by HTML errors, layout problems, and disparities among browsers. Use dynamic time warping and segmentation techniques to compare and analyze web page displays effectively.
Using Computer Vision to Test Web Display Xu Liu liuxu@cs.umd.edu
Why test web display? • Display bugs • [Screenshots: bug in Opera 7.54; normal in IE6]
More bugs • [Screenshots: bug in Opera; normal in Firefox, Mozilla, and IE]
Where do these bugs come from? • Browsers parse DHTML and CSS differently • Web designers don't follow the W3C standards • IE tolerates buggy HTML • JavaScript, ActiveX, Flash
Which kinds of bugs do we have? • Text/image overlap • Incorrect blank area • Missing text/image • Generally, they are all layout problems
How do we detect these bugs? • Is the HTML source helpful? • Yes, but we would need a correct parser, and the parser is exactly what is being tested • HTML does not have a straightforward relationship with the display • Eyes always tell the truth: directly use the snapshot image
Find the outlier • Let them vote: • IE, Firefox, Mozilla, Netscape, Opera, MyIE… • Any browser whose display looks distinct from the others is probably an outlier • Assumption: the majority is correct, the minority is incorrect
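The voting idea above can be sketched in a few lines. This is a minimal illustration, not the system from the talk: the function name `find_outliers`, the `distance` callback, the `threshold`, and the toy "renderings" (made-up lists of row heights) are all assumptions introduced here.

```python
def find_outliers(renderings, distance, threshold):
    """Return the names of renderings that disagree with the majority.

    renderings: dict of browser name -> rendering (any comparable object)
    distance:   hypothetical function(a, b) -> non-negative number
    threshold:  distances at or below this value count as "agree"
    """
    names = list(renderings)
    outliers = []
    for name in names:
        others = [n for n in names if n != name]
        agree = sum(
            1 for other in others
            if distance(renderings[name], renderings[other]) <= threshold
        )
        # Outlier: agrees with fewer than half of the other browsers.
        if agree * 2 < len(others):
            outliers.append(name)
    return outliers


# Toy data (invented for illustration): each rendering is a list of row heights.
pages = {
    "IE": [10, 20, 30],
    "Firefox": [10, 20, 30],
    "Mozilla": [10, 21, 30],
    "Opera": [10, 45, 30],
}


def l1(a, b):
    """Sum of absolute differences between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


print(find_outliers(pages, l1, threshold=2))  # ['Opera']
```

The sketch inherits the slide's assumption: if most browsers share the same bug, the one correct browser is the one that gets flagged.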
Basic Question • How do we compare 2 images? Are they the same? • Example: one image is missing its front and end • These 2 look different, but they should be considered the same
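Simpler Question • How do we compare 2 sequences? • S1: 1 2 3 4 5 6 8 7 • S2: 1 2 4 5 5 5 6 7 • Direct compare: |S1-S2| = 0+0+1+1+0+1+2+0 = 5 • Dynamic Time Warping!! Once the sequences are warped onto each other, the distance drops from 5 to 2 (under an absolute-difference cost, only the unmatched 3 and 8 remain), so they should be considered essentially the same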
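The direct comparison is just a position-by-position sum of absolute differences. A small sketch, with `direct_distance` as a name chosen here for illustration:

```python
s1 = [1, 2, 3, 4, 5, 6, 8, 7]
s2 = [1, 2, 4, 5, 5, 5, 6, 7]


def direct_distance(a, b):
    """Sum of absolute differences at matching positions (no warping)."""
    if len(a) != len(b):
        raise ValueError("direct comparison needs equal-length sequences")
    return sum(abs(x - y) for x, y in zip(a, b))


print([abs(x - y) for x, y in zip(s1, s2)])  # [0, 0, 1, 1, 0, 1, 2, 0]
print(direct_distance(s1, s2))               # 5
```

A single stretched or shifted element throws every later position out of step, which is why this measure penalizes sequences that a human would call the same.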
Dynamic Time Warping (DTW) • A widely used technique in signal processing • Speech recognition, image matching • Example: compare S1 and S2
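For reference, here is a minimal sketch of the textbook DTW recurrence, assuming an absolute-difference cost and the three standard steps (match, stretch either sequence). The talk does not say which cost or step pattern was used, so treat these as assumptions; `dtw_distance` is a name chosen here.

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the prefixes a[:i] and b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matched to b[j-1]: stretch b
                cost[i][j - 1],      # b[j-1] also matched to a[i-1]: stretch a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


s1 = [1, 2, 3, 4, 5, 6, 8, 7]
s2 = [1, 2, 4, 5, 5, 5, 6, 7]
print(dtw_distance(s1, s2))                    # 2
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3]))  # 0
```

Under these assumptions the two sequences from the previous slide score 2, against 5 for the direct comparison; a pure stretch, as in the second call, scores 0. A different local cost or step pattern would give different values.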
Can we directly apply DTW to compare web pages? • No! If we directly compare 2 whole pages, most of the errors will be missed • Local vs. global • Segmentation first!!
How do we derive segmentations? • Edge detection first • Over-segment • Merge
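The three steps can be illustrated on a toy grayscale grid. This is a deliberately simplified stand-in, not the segmentation used in the talk: it only cuts the page into horizontal bands, uses a plain row-to-row difference as the "edge detector", and merges bands by mean intensity. All function names, thresholds, and the toy image are assumptions made for this sketch.

```python
def row_edge_strength(img):
    """Edge strength between consecutive rows (a simple vertical gradient)."""
    return [
        sum(abs(a - b) for a, b in zip(img[r], img[r - 1]))
        for r in range(1, len(img))
    ]


def over_segment(img, edge_threshold):
    """Cut the page at every row boundary whose edge strength exceeds the threshold."""
    cuts = [0]
    for r, strength in enumerate(row_edge_strength(img), start=1):
        if strength > edge_threshold:
            cuts.append(r)
    cuts.append(len(img))
    # Each band is a half-open row range (top, bottom).
    return [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)]


def mean_intensity(img, band):
    top, bottom = band
    pixels = [p for row in img[top:bottom] for p in row]
    return sum(pixels) / len(pixels)


def merge_segments(img, bands, merge_threshold):
    """Merge neighbouring bands whose mean intensities are close."""
    merged = [bands[0]]
    for band in bands[1:]:
        prev = merged[-1]
        if abs(mean_intensity(img, prev) - mean_intensity(img, band)) <= merge_threshold:
            merged[-1] = (prev[0], band[1])
        else:
            merged.append(band)
    return merged


# Toy 6x4 "page": blank band, a slightly noisy content band, blank band.
page = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [8, 8, 8, 8],
    [9, 9, 9, 9],
    [0, 0, 0, 0],
]
bands = over_segment(page, edge_threshold=2)
print(bands)                                           # [(0, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
print(merge_segments(page, bands, merge_threshold=2))  # [(0, 2), (2, 5), (5, 6)]
```

The low edge threshold over-segments on purpose, so that no real boundary is missed; the merge step then removes the spurious cuts inside the content band.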
Next • For every page • We have all its segments • For every segment • If it CANNOT be found in other pages, possibly it is an error
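The loop described above might look like the following sketch. It assumes each segment has already been reduced to a short numeric sequence, that "found" means "within a distance threshold of some segment in any other page", and that a simple L1 distance is good enough for the toy data; a DTW distance could be passed in instead. The names `unmatched_segments` and `l1` and all data are hypothetical.

```python
def unmatched_segments(pages, distance, threshold):
    """For each page, list the indices of segments with no close match elsewhere.

    pages: dict of browser name -> list of segments (each a numeric sequence)
    """
    suspects = {}
    for name, segments in pages.items():
        # Pool the segments of every other page.
        others = [
            seg
            for other, segs in pages.items()
            if other != name
            for seg in segs
        ]
        suspects[name] = [
            i for i, seg in enumerate(segments)
            if all(distance(seg, o) > threshold for o in others)
        ]
    return suspects


def l1(a, b):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy segment profiles (invented for illustration).
pages = {
    "IE": [[1, 1, 1], [5, 5, 5]],
    "Firefox": [[1, 1, 1], [5, 5, 6]],
    "Opera": [[1, 1, 1], [9, 0, 9]],
}
print(unmatched_segments(pages, l1, threshold=1))
# {'IE': [], 'Firefox': [], 'Opera': [1]}
```

Only Opera's second segment has no counterpart in the other pages, so it is the one reported as a possible error.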
Result http://www.microsoft.com/smallbusiness/default.mspx by Opera 7.54
Result http://www.microsoft.com/learning/default.asp by Opera 7.54
Shortcomings and Future Work • Make segmentation more accurate • Make the system completely automatic