Understanding conjunction and double feature searches by a saliency map in primary visual cortex
Li Zhaoping, Department of Psychology, University College London, z.li@ucl.ac.uk, www.gatsby.ucl.ac.uk/~zhaoping

The V1 saliency map agrees with visual search behavior.

On double feature searches: The target, a red vertical bar, evokes responses from 3 cell types: (1) orientation selective cells tuned to vertical, (2) color selective cells tuned to red, and (3) conjunctively tuned cells selective to red-vertical. None of the 3 cell types experiences iso-feature suppression, and the most responsive of them should signal the target saliency. Assuming that cells tuned to the single features determine the ease of the corresponding single feature searches, the double feature search should be no more difficult than the easier of the two single feature searches, and may be more efficient than both if the conjunctively tuned cell is the most responsive.

This explains why the double feature advantage is stronger in motion-orientation double feature search than in color-orientation double feature search (Nothdurft 2000), since motion-orientation conjunction cells are more abundant in V1 than color-orientation conjunction cells.

The saliency of an item is measured by its z score, $z = (S - \bar{S})/\sigma$, where $S$ is the V1 response evoked by the item, $\bar{S}$ is the average response over the whole image, and $\sigma$ is the standard deviation of the responses.

Figure labels: Double feature search, orientation-color. Conjunction search, orientation-color. Target lacking a feature, z = 0.8. Distractors irregularly placed, z = 0.22. Distractors dissimilar to each other, z = 0.25. Distractors irregularly placed: target z = -0.63, item next to target z = 0.68.
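The z score used throughout the poster can be sketched in a few lines. This is a minimal illustration, not the V1 model itself: the response values, the function name `z_scores`, and the use of the population standard deviation are all assumptions made for the example.

```python
import numpy as np

def z_scores(responses):
    """Return z = (S - mean(S)) / std(S) for every item response S."""
    s = np.asarray(responses, dtype=float)
    sigma = s.std()  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all responses are identical; z is undefined")
    return (s - s.mean()) / sigma

# Hypothetical responses: 24 distractors followed by one target (last entry).
# Homogeneous background: every distractor evokes the same response.
homogeneous = np.array([0.2] * 24 + [0.4])
# Heterogeneous background: same mean distractor response, larger spread.
heterogeneous = np.array([0.15, 0.25] * 12 + [0.4])

z_homog = z_scores(homogeneous)[-1]    # z score of the target
z_heter = z_scores(heterogeneous)[-1]  # same target, noisier background
print(f"homogeneous: z = {z_homog:.2f}, heterogeneous: z = {z_heter:.2f}")
```

With these made-up numbers the target response is 0.4 in both cases, yet its z score drops from about 4.9 to about 3.1 once the distractor responses vary. The drop comes entirely from the larger σ.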
Two neural substrates are necessary to make a basic feature:
(1) Tuning of the cells' receptive fields to the feature, i.e., a population of V1 cells selective to different values of this feature dimension, such that the feature can be signaled.
(2) Tuning of the horizontal connections to the feature, i.e., selectivity of the horizontal intra-cortical connections to the optimal feature values of both the pre-synaptic and post-synaptic cells in this feature dimension, such that a lack of iso-feature (e.g., iso-orientation) suppression of the target can lead to a relatively higher response.
E.g., a vertical bar pops out among horizontal ones because cells are selective to orientation and horizontal connections link cells tuned to similar orientations; hence responses to the horizontal bars are suppressed by iso-orientation suppression.

V1 produces a saliency map
V1 outputs S to higher visual areas, highlighting important image locations. These locations evoke stronger responses because they have fewer iso-orientation neighbors that suppress them and/or more co-linear neighbors that facilitate them.

Figure labels: Input images. Model outputs. Target, and its z score. Comments. Target differs from background in both color and orientation.

Conjunction search
Hence, on conjunction searches:
• A conjunction of 2 orientations is difficult to find since V1 cells are not tuned to two different orientations that differ significantly from each other.
• A conjunction of motion-orientation (or depth-orientation) is easy to find since many V1 cells are conjunctively tuned to both motion direction (or disparity) and orientation. We predict: there are underlying horizontal connections linking cells tuned conjunctively to the same orientation and motion direction (or disparity).
• A conjunction of color-orientation can be easy or difficult to find depending on the stimuli, since most V1 cells are tuned only to orientation or only to color, and a small population of V1 cells is broadly tuned to both orientation and color. Prediction: Color-orientation conjunction search can be made easier by adjusting the scale and/or density of the stimuli, since V1 cells conjunctively tuned to both orientation and color are mainly tuned to a specific spatial frequency band.

Observations: Double feature searches are easier than the corresponding single feature searches, which in turn are easier than conjunction searches.

Observations: Motion-orientation and depth-orientation conjunction searches are not much more difficult than the single feature searches (Nakayama and Silverman 1986, McLeod et al 1988). Color-orientation conjunction search is more difficult (Treisman and Gelade 1980). The double feature advantage is greater in motion-orientation than in color-orientation (Nothdurft 2000).

The V1 model is based on V1 physiology and anatomy (e.g., horizontal connections linking cells tuned to similar orientations), and has been tested to be consistent with physiological data on contextual influences (e.g., iso-orientation suppression, Knierim and van Essen 1992; co-linear facilitation, Kapadia et al 1995).

Search becomes easier in homogeneous backgrounds, since z increases with decreasing σ.

Figure labels: Single feature search, color. Single feature search, orientation. Z = -0.9. Input to model. V1 model. Homogeneous background, identical distractors regularly placed, z = 3.4. Histogram of all responses S regardless of features, with width σ. Original input. V1 processing. V1 response S. S = 0.2, z = 1.0. S = 0.4, z = 7. S = 0.12, z = -1.3.

Figure labels: Stimuli for a conjunction search for target. Response from a model with conjunction cells. Response from a model without conjunction cells.

A colored bar evokes responses in cells tuned to orientation only, or tuned to color only, or tuned to both color and orientation.
The conjunction cell experiences the least iso-feature suppression and enables pop-out.

A colored bar evokes responses in cells tuned to orientation only or tuned to color only. Figure label: S = 0.22, z = 1.7.

The responses from the orientation selective cells are visualized by the thickness of the black oriented lines, those from color tuned cells by the size of the colored circles, and those from conjunctively tuned cells by the size of the appropriately colored and oriented ellipses. The horizontal connections link cells tuned to similar features (orientation, color, or both).

Search becomes easier in a homogeneous background even when the target has a negative z score, because the items next to the target become more salient in a homogeneous background, attracting attention.

Figure labels: Homogeneous background, identical distractors regularly placed: target z = -0.83, item next to target z = 3.7.

Saliency of an item is assumed to increase with its evoked V1 response. We assume that the efficiency of a visual search task increases with the salience of the target (or of its most salient part, e.g., the horizontal bar in the target cross). The high z score of the horizontal bar, z = 7, is a measure of the cross's salience and enables the cross to pop out, since the V1 response evoked by the horizontal bar is much higher than the average population response of the whole image. The cross has a unique feature, the horizontal bar, which evokes the highest response since it experiences no iso-orientation suppression while all distractors do. Hence, intra-cortical interaction is a neural basis for why feature searches are often efficient.

Question: How much easier is a double feature search than the corresponding single feature searches, and how much easier are the single feature searches than the conjunction search? How do they depend on the underlying features?

Model behavior agrees with the subtle changes in search efficiency seen in visual search asymmetries, i.e., changes in search efficiency when target and distractors swap roles. Shown in 2 examples.
Only the input images are shown for these examples, since the output response differences are too small to be visualized here, although the z score differences can be significant.
Ellipse in circles vs. circle in ellipses. Target: ellipse, z = 2.8. Target: circle, z = 0.7.
Curved line among straight lines vs. straight among curved. Target: curved, z = 1.12. Target: straight, z = 0.3.

V1's output as a saliency map is viewed under the idealization that the top-down feedback to V1 is disabled, e.g., shortly after visual exposure or under anesthesia.

Signaling saliency regardless of features: Contrary to common belief, this does not mean that the cells reporting salience must be untuned to specific features. Here "regardless of" means the following: in this saliency map, the meaning of firing rates for saliency is universal. Given an input scene, the same firing rate from two V1 (output) neurons selective to different features means the same salience value for the two corresponding inputs, even if, say, one of the cells is color selective and responds to a static red bar while the other is tuned to motion and responds to a moving black dot. Usually an image item, say a short red bar, evokes responses from many cells with different optimal features and overlapping tuning curves or receptive fields. The actual input features have to be decoded in a complex and feature specific manner from the population responses. However, locating the most responsive cell to a scene locates the most salient item, whether or not features can be decoded beforehand or simultaneously from the same cell population. It is economical not to use subsequent cell layers (whether they are feature tuned or not) for a saliency map; the small receptive fields in V1 also mean that this saliency map can have a higher resolution.

For more details, see "A saliency map in primary visual cortex" in Trends in Cognitive Sciences, Vol. 6, No. 1, January 2002, p. 9-16.
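The idea that the most responsive cell locates the most salient item, whatever feature that cell is tuned to, can be illustrated with a toy read-out. This is only a sketch under stated assumptions: the response values, the three-row layout, and the helper names `saliency_map` and `most_salient` are hypothetical and are not taken from the V1 model.

```python
import numpy as np

# Hypothetical responses; rows are cell types, columns are item locations.
# Location 2 holds a double feature target, the rest are distractors.
cell_types = ["orientation", "color", "conjunction"]
responses = np.array([
    [0.20, 0.20, 0.35, 0.20],  # orientation tuned cells
    [0.20, 0.20, 0.30, 0.20],  # color tuned cells
    [0.10, 0.10, 0.40, 0.10],  # conjunctively tuned cells
])

def saliency_map(responses):
    """Saliency at each location is the highest response there,
    regardless of which feature the responding cell is tuned to."""
    return responses.max(axis=0)

def most_salient(responses):
    """Return (location, cell type index) of the most responsive cell."""
    cell, location = np.unravel_index(np.argmax(responses), responses.shape)
    return int(location), int(cell)

location, cell = most_salient(responses)
print("most salient location:", location, "signaled by", cell_types[cell])

# Dropping the conjunction row leaves the target's saliency at the larger
# of the two single feature responses, never below it.
print("without conjunction cells:", saliency_map(responses[:2])[2])
```

Firing rates are compared directly across cell types here, which is the sense in which the read-out is "regardless of features": no feature decoding is needed before the maximum is taken.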