Street Smarts: Visual Attention on the Go

Alexander Patrikalakis, May 13, 2009, 6.XXX

Vision of Attention
For machines to recreate human visual attention, we must accept that humans:
• Maintain multi-scale orientation, intensity, and color feature neuronal maps in parallel
• Combine the multi-scale features into a central conspicuity (saliency) map
• Maintain a Winner-Take-All neural network that saccades to, and subsequently inhibits, points of decreasing saliency

Example
• Running object recognition at every point of an image is infeasible time-wise
• Visual attention allows us to find the interesting points quickly
• Ullman agrees: “Recognition over the whole scene leads to a combinatorial explosion.”

Implementation Steps
• Analyzed previous work by Ullman, Itti, and Koch on visual attention
• Implemented the visual saliency model in C++ using Intel OpenCV, IPP, and TBB
• Implemented focus-of-attention (FOA) shifting by saccading to points in order of decreasing saliency map value, which has the same effect as a 2D neuronal matrix
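The FOA shifting step can be sketched in plain C++ as repeated "pick the maximum, then suppress its neighborhood". This is a minimal illustration and not the project's code: the names SaliencyMap and scanFixations, the row-major layout, and the disc-shaped inhibition region with radius inhibitRadius are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container for a row-major saliency map (not from the original project).
struct SaliencyMap {
    int width;
    int height;
    std::vector<float> values;  // size must equal width * height

    float& at(int x, int y) {
        return values[static_cast<std::size_t>(y) * width + x];
    }
};

// Returns up to `count` fixation points (x, y) in order of decreasing saliency.
// After each pick, a disc of radius `inhibitRadius` around the winner is zeroed
// (inhibition of return), so the next iteration moves to a new location.
// The map is taken by value, so the caller's copy is left untouched.
std::vector<std::pair<int, int>> scanFixations(SaliencyMap map, int count,
                                               int inhibitRadius) {
    std::vector<std::pair<int, int>> fixations;
    for (int i = 0; i < count; ++i) {
        auto best = std::max_element(map.values.begin(), map.values.end());
        if (best == map.values.end() || *best <= 0.0f) {
            break;  // nothing salient is left
        }
        const int index = static_cast<int>(best - map.values.begin());
        const int cx = index % map.width;
        const int cy = index / map.width;
        fixations.emplace_back(cx, cy);

        // Zero every pixel within the inhibition disc, clamped to the map bounds.
        const int y0 = std::max(0, cy - inhibitRadius);
        const int y1 = std::min(map.height - 1, cy + inhibitRadius);
        const int x0 = std::max(0, cx - inhibitRadius);
        const int x1 = std::min(map.width - 1, cx + inhibitRadius);
        for (int y = y0; y <= y1; ++y) {
            for (int x = x0; x <= x1; ++x) {
                const int dx = x - cx;
                const int dy = y - cy;
                if (dx * dx + dy * dy <= inhibitRadius * inhibitRadius) {
                    map.at(x, y) = 0.0f;
                }
            }
        }
    }
    return fixations;
}
```

Calling scanFixations(map, 5, 10) would return at most five fixation points, each at least 10 pixels away from every earlier one. Taking the global maximum directly gives the same winner a converged Winner-Take-All network would select, without simulating the network dynamics.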

Results
• Tested the algorithm on 13 geometric scenes and obtained plausible salient winners in each
• Tested the algorithm on 40 natural scenes (roads and highways) and found that signs and signals are very salient (usually saccaded to first)
• The algorithm is resilient to noise and takes advantage of multi-scale analysis

Itti: Normalization
• Promote maps with a small number of strong maxima
• Suppress maps with a large number of equally strong maxima
• Method: scale each map by the difference between its global maximum and the mean of its remaining local maxima
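The normalization method can be sketched as follows. This is a hedged illustration, not the project's implementation: the function name normalizeMap, the 4-neighbor definition of a local maximum, and the initial rescaling to [0, 1] are assumptions. The sketch also squares the difference, which is how the operator is usually stated in Itti, Koch, and Niebur's published model; drop the squaring to follow the slide's wording literally.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: normalizes a row-major feature map in place.
//  1. Rescale values to [0, 1], so the global maximum M becomes 1.
//  2. Average all other local maxima (strictly greater than their 4-neighbors).
//  3. Multiply the whole map by (M - mean)^2.
// A map with one dominant peak keeps most of its energy; a map with many
// equally strong peaks is pushed towards zero.
void normalizeMap(std::vector<float>& map, int width, int height) {
    if (map.empty()) return;
    const auto mm = std::minmax_element(map.begin(), map.end());
    const float minV = *mm.first;
    const float maxV = *mm.second;
    if (maxV - minV <= 0.0f) {  // flat map: nothing is conspicuous
        std::fill(map.begin(), map.end(), 0.0f);
        return;
    }
    for (float& v : map) v = (v - minV) / (maxV - minV);

    auto at = [&](int x, int y) {
        return map[static_cast<std::size_t>(y) * width + x];
    };

    double sumOthers = 0.0;
    int numOthers = 0;
    bool skippedGlobal = false;  // exclude exactly one instance of the global max
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float v = at(x, y);
            const bool isPeak = (x == 0 || v > at(x - 1, y)) &&
                                (x == width - 1 || v > at(x + 1, y)) &&
                                (y == 0 || v > at(x, y - 1)) &&
                                (y == height - 1 || v > at(x, y + 1));
            if (!isPeak) continue;
            if (!skippedGlobal && v == 1.0f) {
                skippedGlobal = true;
                continue;
            }
            sumOthers += v;
            ++numOthers;
        }
    }
    const float meanOthers =
        numOthers > 0 ? static_cast<float>(sumOthers / numOthers) : 0.0f;
    const float weight = (1.0f - meanOthers) * (1.0f - meanOthers);
    for (float& v : map) v *= weight;
}
```

For a one-row map {0, 1, 0, 0.2, 0}, the lone secondary peak gives a weight of about 0.64, so the map is largely preserved. For {0, 1, 0, 1, 0}, the second peak equals the first, the weight is 0, and the map is suppressed entirely.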

Ullman, Itti, Koch: Multi-scale features
[Figures: Multi-scale Architecture; Three Feature Maps]
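One way to picture the multi-scale architecture is an image pyramid combined with a center-surround difference between a fine level and a coarse level. The sketch below is an assumption-laden illustration rather than the deck's implementation: the names Image, downsample, buildPyramid, and centerSurround are hypothetical, a 2x2 box average stands in for Gaussian filtering, and the coarse level is read back with nearest-neighbor lookup instead of interpolation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical single-channel image in row-major order (for example, intensity).
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    float get(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Halves the resolution by averaging each 2x2 block.
Image downsample(const Image& src) {
    Image dst{src.width / 2, src.height / 2, {}};
    dst.pixels.reserve(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            dst.pixels.push_back(0.25f * (src.get(2 * x, 2 * y) +
                                          src.get(2 * x + 1, 2 * y) +
                                          src.get(2 * x, 2 * y + 1) +
                                          src.get(2 * x + 1, 2 * y + 1)));
        }
    }
    return dst;
}

// Level 0 is the input; each further level has half the width and height.
// Stops early if the image becomes too small to halve again.
std::vector<Image> buildPyramid(const Image& base, int levels) {
    std::vector<Image> pyramid{base};
    for (int i = 1; i < levels; ++i) {
        if (pyramid.back().width < 2 || pyramid.back().height < 2) break;
        Image next = downsample(pyramid.back());
        pyramid.push_back(std::move(next));
    }
    return pyramid;
}

// Absolute difference between a fine "center" level c and a coarser "surround"
// level s (s > c), evaluated at the resolution of the center level.
Image centerSurround(const std::vector<Image>& pyramid, int c, int s) {
    const Image& center = pyramid[static_cast<std::size_t>(c)];
    const Image& surround = pyramid[static_cast<std::size_t>(s)];
    const int scale = 1 << (s - c);  // size ratio between the two levels
    Image out{center.width, center.height, {}};
    out.pixels.reserve(center.pixels.size());
    for (int y = 0; y < center.height; ++y) {
        for (int x = 0; x < center.width; ++x) {
            const int sx = std::min(x / scale, surround.width - 1);
            const int sy = std::min(y / scale, surround.height - 1);
            out.pixels.push_back(
                std::fabs(center.get(x, y) - surround.get(sx, sy)));
        }
    }
    return out;
}
```

A uniform image yields an all-zero center-surround map, while an isolated bright pixel stands out strongly against its coarse-scale average. Repeating the computation for several (c, s) pairs and for each feature channel gives the set of multi-scale feature maps that are later normalized and combined.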

Ullman: The Winner-Takes-All (WTA)

Simple Example

Noise Resilience

Multi-scale Advantage 1

Multi-scale Advantage 2

Problematic distractions

Contributions
• Reviewed past work on biologically inspired visual attention models
• Identified Itti’s algorithm as a candidate for saliency detection in natural scenes involving road signs
• Demonstrated the algorithm’s effectiveness on many natural scenes involving road signs
• Created a prototype saliency heuristic for evaluating sign effectiveness
