
Eye-Based Interaction in Graphical Systems: Theory & Practice



Presentation Transcript


  1. Eye-Based Interaction in Graphical Systems: Theory & Practice Part I Introduction to the Human Visual System

  2. A: Visual Attention “When the things are apprehended by the senses, the number of them that can be attended to at once is small, ‘Pluribus intentus, minor est ad singula sensus’” — William James • Latin translation: “Many filtered into few for perception” • Visual scene inspection is performed minutatim (piecemeal), not in toto

  3. A.1: Visual Attention—chronological review • Qualitative historical background: a dichotomous theory of attention—the “what” and “where” of (visual) attention • Von Helmholtz (ca. 1900): mainly concerned with eye movements to spatial locations, the “where”, i.e., attention as an overt mechanism (eye movements) • James (ca. 1900): defined attention mainly in terms of the “what”, i.e., attention as a more internally covert mechanism

  4. A.1: Visual Attention—chronological review (cont’d) • Broadbent (ca. 1950): defined attention as “selective filter” from auditory experiments; generally agreeing with Von Helmholtz’s “where” • Deutsch and Deutsch (ca. 1960): rejected “selective filter” in favor of “importance weightings”; generally corresponding to James’ “what” • Treisman (ca. 1960): proposed unified theory of attention—attenuation filter (the “where”) followed by “dictionary units” (the “what”)

  5. A.1: Visual Attention—chronological review (cont’d) • Main debate at this point: is attention parallel (the “where”) or serial (the “what”) in nature? • Gestalt view: recognition is a wholistic process (e.g., Kanizsa figure) • Theories advanced through early recordings of eye movements

  6. A.1: Visual Attention—chronological review (cont’d) • Yarbus (ca. 1967): demonstrated sequential, but variable, viewing patterns over particular image regions (akin to the “what”) • Noton and Stark (ca. 1970): showed that subjects tend to fixate identifiable regions of interest, containing “informative details”; coined term “scanpath” describing eye movement patterns • Scanpaths helped cast doubt on the Gestalt hypothesis

  7. A.1: Visual Attention—chronological review (cont’d) Fig.2: Yarbus’ early scanpath recording: • trace 1: examine at will • trace 2: estimate wealth • trace 3: estimate ages • trace 4: guess previous activity • trace 5: remember clothing • trace 6: remember position • trace 7: time since last visit

  8. A.1: Visual Attention—chronological review (cont’d) • Posner (ca. 1980): proposed attentional “spotlight”, a covert mechanism independent of eye movements (akin to the “where”) • Treisman (ca. 1986): once again unified “what” and “where” dichotomy by proposing the Feature Integration Theory (FIT), describing attention as a “glue” which integrates features at particular locations to allow wholistic perception

  9. A.1: Visual Attention—chronological review (cont’d) • Summary: the “what” and “where” dichotomy provides an intuitive sense of attentional, foveo-peripheral visual mechanism • Caution: the “what/where” account is probably overly simplistic and is but one theory of visual attention

  10. B: Neurological Substrate of the Human Visual System (HVS) • Any theory of visual attention must address the fundamental properties of early visual mechanisms • Examination of the neurological substrate provides evidence of limited information capacity of the visual system—a physiological reason for an attentional mechanism

  11. B.1: The Eye Fig. 3: The eye—“the world’s worst camera” • suffers from numerous optical imperfections... • ...endowed with several compensatory mechanisms

  12. B.1: The Eye (cont’d) Fig. 4: Ocular optics

  13. B.1: The Eye (cont’d) • Imperfections: • spherical aberrations • chromatic aberrations • curvature of field • Compensations: • iris—acts as a stop • focal lens—sharp focus • curved retina—matches curvature of field

  14. B.2: The Retina • Retinal photoreceptors constitute the first stage of visual perception • Photoreceptors are transducers converting light energy to electrical impulses (neural signals) • Photoreceptors are functionally classified into two types: rods and cones

  15. B.2: The Retina—rods and cones • Rods: sensitive to dim and achromatic light (night vision) • Cones: respond to brighter, chromatic light (day vision) • Retinal construction: 120M rods, 7M cones arranged concentrically

  16. B.2: The Retina—cellular makeup • The retina is composed of 3 main layers of different cell types (a 3-layer “sandwich”) • Surprising fact: the retina is “inverted”— photoreceptors are found in the bottom layer (furthest away from incoming light) • Connection bundles between layers are called plexiform or synaptic layers

  17. B.2: The Retina—cellular makeup (cont’d) Fig.5: The retinocellular layers (w.r.t. incoming light): • ganglion layer • inner synaptic plexiform layer • inner nuclear layer • outer synaptic plexiform layer • outer layer

  18. B.2: The Retina—cellular makeup (cont’d) Fig.5 (cont’d): The neuron: • all retinal cells are types of neurons • certain neurons mimic a “digital gate”, firing when activation level exceeds a threshold • rods and cones are specific types of dendrites
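
The “digital gate” behavior mentioned on this slide can be illustrated with a simple threshold unit. This is a minimal sketch, not a physiological model: the weighted-sum activation, the function name `fires`, and the example weights are all illustrative assumptions that do not appear in the slides.

```python
def fires(inputs, weights, threshold):
    """Return True when the weighted input sum exceeds the threshold.

    Assumption: activation is modeled as a plain weighted sum, with
    positive weights for excitatory inputs and negative weights for
    inhibitory ones. Real neurons are considerably more complex.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(x * w for x, w in zip(inputs, weights))
    return activation > threshold


if __name__ == "__main__":
    # With these (hypothetical) weights the unit acts like an AND gate:
    # it fires only when both inputs are active.
    print(fires([1, 1], [0.6, 0.6], threshold=1.0))
    print(fires([1, 0], [0.6, 0.6], threshold=1.0))
```

With the weights shown, neither input alone pushes the activation past the threshold, which is the gate-like behavior the slide alludes to.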

  19. B.2: The Retina—retinogeniculate organization (from outside in, w.r.t. cortex) • Outer layer: rods and cones • Inner layer: horizontal cells, laterally connected to photoreceptors • Ganglion layer: ganglion cells, connected (indirectly) to horizontal cells, project via myelinated pathways to the Lateral Geniculate Nuclei (LGN) in the thalamus

  20. B.2: The Retina—receptive fields • Receptive fields: collections of interconnected cells within the inner and ganglion layers • Field organization determines impulse signature of cells, based on cell types • Cells may depolarize due to light increments (+) or decrements (-)

  21. B.2: The Retina—receptive fields (cont’d) Fig.6: Receptive fields: • signal profile resembles a “Mexican hat” • receptive field sizes vary concentrically • color-opposing fields also exist
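
A “Mexican hat” signal profile is often approximated as a difference of two Gaussians: a narrow excitatory center minus a broad inhibitory surround. The sketch below assumes that model. The names `gaussian` and `mexican_hat` and the two widths are illustrative choices, not values taken from the slides.

```python
import math


def gaussian(x, sigma):
    """Unit-area 1-D Gaussian evaluated at offset x."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def mexican_hat(x, sigma_center=1.0, sigma_surround=2.0):
    """Difference-of-Gaussians response of an on-center field at offset x.

    Assumption: the center and surround widths are arbitrary
    illustrative values. Negate the result for an off-center field.
    """
    return gaussian(x, sigma_center) - gaussian(x, sigma_surround)


if __name__ == "__main__":
    # Positive in the center, negative in the surround, near zero far away.
    for offset in (0.0, 1.0, 3.0, 10.0):
        print(offset, mexican_hat(offset))
```

Scaling both widths by the same factor gives a larger field with the same shape, which is one way to represent the concentric variation in receptive field size noted above.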

  22. B.3: Visual Pathways • Retinal ganglion cells project to the LGN along two major pathways, distinguished by morphological cell types: α and β cells • α cells project to the magnocellular (M-) layers • β cells project to the parvocellular (P-) layers • Ganglion cells are functionally classified by three types: X, Y, and W cells

  23. B.3: Visual Pathways—functional response of ganglion cells • X cells: sustained stimulus, location, and fine detail • project along both M- and P-projections • Y cells: transient stimulus, coarse features, and motion • project along only the M-projection • W cells: coarse features and motion • project to the Superior Colliculus (SC)

  24. B.3: Visual Pathways (cont’d) Fig.7: Optic tract and radiations (visual pathways): • The LGN is of particular clinical importance • M- and P-cellular projections are clearly visible under microscope • Axons from M- and P-layers of the LGN terminate in area V1

  25. B.3: Visual Pathways (cont’d) Table 1: Functional characteristics of ganglionic projections

  26. B.4: The Occipital Cortex and Beyond Fig.8: The brain and visual pathways: • the cerebral cortex is composed of numerous regions classified by their function

  27. B.4: The Occipital Cortex and Beyond (cont’d) • M- and P- pathways terminate in distinct layers of cortical area V1 • Cortical cells (unlike center-surround ganglion receptive fields) respond to orientation-specific stimulus • Pathways emanating from V1 joining multiple cortical areas involved in vision are called streams

  28. B.4: The Occipital Cortex and Beyond—directional selectivity • Cortical Directional Selectivity (CDS) of cells in V1 contributes to motion perception and control of eye movements • CDS cells establish a motion pathway from V1 projecting to areas V2 and MT (V5) • In contrast, Retinal Directional Selectivity (RDS) may not contribute to motion perception, but is involved in eye movements

  29. B.4: The Occipital Cortex and Beyond—cortical cells • Two consequences of visual system’s motion-sensitive, single-cell organization: • due to motion sensitivity, eye movements are never perfectly still (instead tiny jitter is observed, termed microsaccade)—if eyes were stabilized, image would fade! • due to single-cell organization, representation of natural images is quite abstract: there is no “retinal buffer”

  30. B.4: The Occipital Cortex and Beyond—2 attentional streams • Dorsal stream: • V1, V2, MT (V5), MST, Posterior Parietal Cortex • sensorimotor (motion, location) processing • the attentional “where”? • Ventral (temporal) stream: • V1, V2, V4, Inferotemporal Cortex • cognitive processing • the attentional “what”?

  31. B.4: The Occipital Cortex and Beyond—3 attentional regions • Posterior Parietal Cortex (dorsal stream): • disengages attention • Superior Colliculus (midbrain): • relocates attention • Pulvinar (thalamus; colocated with LGN): • engages, or enhances, attention

  32. C: Visual Perception (with emphasis on foveo-peripheral distinction) • Measurable performance parameters may often (but not always!) fall within ranges predicted by known limitations of the neurological substrate • Example: visual acuity may be estimated by knowledge of density and distribution of the retinal photoreceptors • In general, performance parameters are obtained empirically

  33. C.1: Spatial Vision • Main parameters sought: visual acuity, contrast sensitivity • Dimensions of retinal features are measured in terms of the scene projected onto the retina, in units of degrees visual angle, A = 2 arctan(S / (2D)), where S is the object size and D is the viewing distance
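
As a worked example, the standard visual-angle relation A = 2 arctan(S / (2D)) can be computed directly. This is a minimal sketch: the function names `visual_angle_deg` and `size_for_angle` are hypothetical, and S and D are assumed to be expressed in the same length units.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of extent `size`
    viewed from `distance`, using A = 2 * arctan(S / (2 * D))."""
    if size < 0 or distance <= 0:
        raise ValueError("size must be non-negative and distance positive")
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def size_for_angle(angle_deg, distance):
    """Inverse relation: object extent that subtends `angle_deg` at `distance`."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # A 1 cm target viewed from 57 cm subtends roughly one degree.
    print(visual_angle_deg(1.0, 57.0))
    # Extent on a screen 60 cm away that covers a 5 degree region.
    print(size_for_angle(5.0, 60.0))
```

The inverse function is the form a display designer would typically need, for instance to convert a region given in degrees into an on-screen extent at a known viewing distance.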

  34. C.1: Spatial Vision—visual angle Fig.9: Visual angle

  35. C.1: Spatial Vision—common visual angles Table 2: Common visual angles

  36. C.1: Spatial Vision—retinal regions • Visual field: 180° horiz. × 130° vert. • Fovea Centralis (foveola): highest acuity • 1.3° visual angle; 25,000 cones • Fovea: high acuity (at 5°, acuity drops to 50%) • 5° visual angle; 100,000 cones • Macula: within “useful” acuity region (to about 30°) • 16.7° visual angle; 650,000 cones • Hardly any rods in the foveal region

  37. C.1: Spatial Vision—visual angle and receptor distribution Fig.10: Retinotopic receptor distribution

  38. C.1: Spatial Vision—visual acuity Fig.11: Visual acuity at eccentricities and light levels: • at photopic (day) light levels, acuity is fairly constant within central 2° • acuity drops off linearly to 5°; drops sharply (exp.) beyond • at scotopic (night) light levels, acuity is poor at all eccentricities
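
The photopic falloff described on this slide can be sketched as a piecewise function of eccentricity. This is only an illustration under several assumptions: the 2° and 5° figures are treated as eccentricities, the 50% level at 5° is taken from the retinal regions slide, and the exponential decay constant `decay_per_deg` is an arbitrary placeholder rather than a measured value.

```python
import math


def relative_acuity(ecc_deg, decay_per_deg=0.1):
    """Relative photopic acuity (1.0 = best) at eccentricity `ecc_deg`.

    Piecewise sketch: constant out to 2 deg, linear drop to 0.5 at
    5 deg, exponential decay beyond. `decay_per_deg` is an assumed,
    illustrative constant.
    """
    e = abs(ecc_deg)
    if e <= 2.0:
        return 1.0
    if e <= 5.0:
        return 1.0 - 0.5 * (e - 2.0) / 3.0
    return 0.5 * math.exp(-decay_per_deg * (e - 5.0))


if __name__ == "__main__":
    for ecc in (0.0, 2.0, 3.5, 5.0, 10.0, 30.0):
        print(ecc, relative_acuity(ecc))
```

A gaze-contingent display could use a function of this kind to decide how much detail to render at a given angular distance from the point of regard, although a real system would fit the curve to psychophysical data.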

  39. C.1: Spatial Vision—measuring visual acuity • Acuity roughly corresponds to receptor distribution in the fovea, but not necessarily in the periphery • Due to various contributing factors (synaptic organization and later-stage neural elements), effective relative visual acuity is generally measured by psychophysical experimentation

  40. C.2: Temporal Vision • Visual response to motion is characterized by two distinct facts: persistence of vision (POV) and the phi phenomenon • POV: essentially describes human temporal sampling rate • Phi: describes threshold above which humans detect apparent movement • Both facts exploited in media to elicit motion perception

  41. C.2: Temporal Vision—persistence of vision Fig.12: Critical Fusion Frequency: • stimulus flashing at about 50-60 Hz appears steady • CFF explains why flicker is not seen when viewing sequence of still images • cinema: 24 fps × 3 = 72 Hz due to 3-bladed shutter • TV: 60 fields/sec, interlaced

  42. C.2: Temporal Vision—phi phenomenon • Phi phenomenon explains why motion is perceived in cinema, TV, graphics • Besides necessary flicker rate (60Hz), illusion of apparent, or stroboscopic, motion must be maintained • Similar to old-fashioned neon signs with stationary bulbs • Minimum rate: 16 frames per second
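
The two temporal conditions above (a flash rate at or above the critical fusion frequency, and a frame rate at or above the apparent-motion minimum) can be checked with simple arithmetic. The sketch below uses the 60 Hz and 16 fps figures from the slides as default thresholds. The function names are hypothetical, and treating each condition as a hard cutoff is a simplifying assumption.

```python
def flash_rate_hz(frames_per_second, flashes_per_frame=1):
    """Rate at which the display flashes, e.g. 24 fps x 3 shutter blades."""
    return frames_per_second * flashes_per_frame


def supports_motion_illusion(frames_per_second, flashes_per_frame=1,
                             cff_hz=60.0, min_motion_fps=16.0):
    """True when both conditions hold: no visible flicker (flash rate at
    or above the CFF) and sustained apparent motion (enough distinct
    frames per second). Thresholds are treated as hard cutoffs here."""
    no_flicker = flash_rate_hz(frames_per_second, flashes_per_frame) >= cff_hz
    apparent_motion = frames_per_second >= min_motion_fps
    return no_flicker and apparent_motion


if __name__ == "__main__":
    print(flash_rate_hz(24, 3))             # cinema with a 3-bladed shutter
    print(supports_motion_illusion(24, 3))  # cinema
    print(supports_motion_illusion(24, 1))  # 24 flashes/sec would flicker
```

Keeping the two checks separate reflects the point made on the slides: showing each frame several times raises the flash rate without adding new motion samples.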

  43. C.2: Temporal Vision—peripheral motion perception • Motion perception is not homogeneous across visual field • Sensitivity to target motion decreases with retinal eccentricity for slow motion... • higher rate of target motion (e.g., spinning disk) is needed to match apparent velocity in fovea • …but, motion is more salient in periphery than in fovea (easier to detect moving targets than stationary ones)

  44. C.2: Temporal Vision—peripheral sensitivity to direction of motion Fig.13: Threshold isograms for peripheral rotary movement: • periphery is twice as sensitive to horizontal-axis movement as to vertical-axis movement • (numbers in diagram are rates of pointer movement in rev./min.)

  45. C.3: Color Vision—cone types • foveal color vision is facilitated by three types of cone photoreceptors • a good deal is known about foveal color vision, relatively little is known about peripheral color vision • of the 7,000,000 cones, most are packed tightly into the central 30° foveal region Fig.14: Spectral sensitivity curves of cone photoreceptors

  46. C.3: Color Vision—peripheral color perception fields • blue and yellow fields are larger than red and green fields • most sensitive to blue, up to 83°; red up to 76°; green up to 74° • chromatic fields do not have definite borders, sensitivity gradually and irregularly drops off over 15-30° range Fig.15: Visual fields for monocular color vision (right eye)

  47. C.4: Implications for Design of Attentional Displays • Need to consider distinct characteristics of foveal and peripheral vision, in particular: • spatial resolution • temporal resolution • luminance / chrominance • Furthermore, gaze-contingent systems must match dynamics of human eye movement

  48. D: Taxonomy and Models of Eye Movements • Eye movements are mainly used to reposition the fovea • Five main classes of eye movements: • saccadic • smooth pursuit • vergence • vestibular • physiological nystagmus • (fixations) • Other types of movements are non-positional (adaptation, accommodation)

  49. D.1: Extra-Ocular Muscles Fig.16: Extrinsic muscles of the eyes: • in general, eyes move within 6 degrees of freedom (6 muscles)

  50. D.1: Oculomotor Plant Fig.17: Oculomotor system: • eye movement signals emanate from three main distinct regions: • occipital cortex (areas 17, 18, 19, 22) • superior colliculus (SC) • semicircular canals (SCC)
