
Perception

This lecture explores the optical quality of the human eye and its impact on perception. Topics include range of focus, image formation on the retina, line spread function, modulation transfer function, Ricco's Law, spatial resolution, Vernier acuity, flicker fusion threshold, frame rate vs. flicker rate, interlaced displays, color sequential displays, Mach banding, Stevens' Power Law, and numeric representation of intensity.


Presentation Transcript


  1. Perception • Kurt Akeley • CS248 Lecture 18 • 29 November 2007 • http://graphics.stanford.edu/courses/cs248-07/

  2. Today • This is the last for-credit lecture • Material from next week's lectures will not be tested • Emphasize perception • Pull together and re-emphasize ideas from earlier lectures • Introduce some new ideas • Tie everything back to performance

  3. Optical quality of the eye • What is the image of this (ideal) line? • Range of focus: • 5” to infinity (you) • 40” to infinity (me, corrected) (Figure: eye with the fovea labeled; image from www.wikipedia.com)

  4. Retinal image of an ideal line Eye image from www.wikipedia.com

  5. Line spread function

  6. Retinal image of a sine wave grating Lower contrast Eye image from www.wikipedia.com

  7. Modulation transfer function

  8. Ricco’s Law • Area and intensity are indistinguishable for objects that subtend less than (roughly) 6 arc min. • This allows antialiasing to work • Especially fractional-width points and lines • Antialiased pixels should subtend less than 6 arc min

  9. Ricco’s Law and line spread (a coincidence?) 6 arc min

  10. Spatial resolution of the eye • Cone spacing in the fovea: • L and M cones: 0.5 arc min • S cones: 10 arc min (thus the lower spectral response seen in the color theory lecture) • Nyquist frequency for foveal photopic vision is 60 cpd • Half the 120 cone/deg density • Nyquist frequency is much lower outside the fovea • Effective receptor density falls to 1/20th that of the fovea (?) • Rendering can take advantage of this • E.g., insets in flight-simulation graphics accelerators
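The Nyquist figures on this slide follow directly from the cone spacing. A minimal sketch of that arithmetic, assuming uniformly spaced receptors (the function name is illustrative, not from the lecture):

```python
def nyquist_cpd(spacing_arcmin: float) -> float:
    """Nyquist frequency, in cycles per degree, for receptors
    spaced `spacing_arcmin` arc minutes apart (uniform spacing assumed)."""
    samples_per_degree = 60.0 / spacing_arcmin  # 60 arc min per degree
    return samples_per_degree / 2.0             # two samples per cycle

lm_cones = nyquist_cpd(0.5)   # L and M cones: 120 cones/deg
s_cones = nyquist_cpd(10.0)   # S cones: 6 cones/deg

print(f"L/M cones: {lm_cones} cpd, S cones: {s_cones} cpd")
```

The L and M result reproduces the 60 cpd quoted on the slide; the S-cone value is the same calculation applied to the 10 arc min spacing.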

  11. No aliasing in foveal vision Peripheral Nyquist frequency (approximate) Foveal Nyquist frequency

  12. No aliasing in foveal S cones either • Optics of the eye are substantially worse for 400 nm light • MTF did not show this (it is an aggregate)

  13. Vernier acuity • Can detect an offset of 5 arc sec • But sensor spacing is 30 arc sec • How does this work? • Not due to random sensor locations (works with very short lines) 5 arc sec

  14. How vernier acuity (probably) works Cone spacing

  15. Display resolution θ d h Satisfies Ricco’s Law (less than 6 arc min)

  16. Matching foveal resolution Foveal resolution
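The geometry behind the two slides above can be sketched numerically. This sketch assumes h is the height of one pixel and d is the viewing distance (the transcript does not define them), so that θ = 2·atan(h / 2d). The pixel pitch and viewing distance are illustrative values, not figures from the lecture.

```python
import math

RICCO_LIMIT_ARCMIN = 6.0    # from the Ricco's Law slide
FOVEAL_CONE_ARCMIN = 0.5    # L and M cone spacing from the resolution slide

def subtended_arcmin(h: float, d: float) -> float:
    """Angle, in arc minutes, subtended by an object of size h at distance d."""
    return math.degrees(2.0 * math.atan(h / (2.0 * d))) * 60.0

def distance_for_angle(h: float, arcmin: float) -> float:
    """Viewing distance at which an object of size h subtends `arcmin`."""
    half_angle = math.radians(arcmin / 60.0) / 2.0
    return h / (2.0 * math.tan(half_angle))

# Hypothetical display: 0.25 mm pixels viewed from 600 mm.
pixel_mm, distance_mm = 0.25, 600.0
theta = subtended_arcmin(pixel_mm, distance_mm)
matching_distance = distance_for_angle(pixel_mm, FOVEAL_CONE_ARCMIN)

print(f"pixel subtends {theta:.2f} arc min "
      f"(Ricco limit {RICCO_LIMIT_ARCMIN} arc min)")
print(f"pixel matches cone spacing at about {matching_distance:.0f} mm")
```

Under these assumed numbers the pixel is well inside the 6 arc min limit, but it would have to be viewed from roughly three times farther away before one pixel shrank to a single cone spacing.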

  17. Flicker • Flicker fusion threshold • Statistically 16 Hz • Increases • In peripheral vision (hence “jumping” numeric or CRT displays, when you aren’t looking directly at them) • With brighter scenes • With viewer fatigue • Flicker rates: • Movies: 48 Hz (typical), 72 Hz (using computer displays) • Video: 60 Hz (US NTSC), 50 Hz (Europe and Asia, PAL) • Computer displays: 60-100 Hz (CRT), no flicker (LCD) • Fluorescent lights: 120 Hz (US), 100 Hz (Europe, Asia)

  18. Frame rate vs. flicker rate • Increasing flicker rate above frame rate: • Avoids flicker-rate problems • But introduces visual artifacts • Image doubling (2x) or even tripling (3x)

  19. Interlaced displays • Two fields per “frame” • Display odd lines in the first field • Display even lines in the second field • “Frame” is misleading: • True interlaced sampling is “flying spot” • Each pixel is sampled and displayed at proportional times • Motion artifacts are avoided • Interlaced frames (e.g., video display of a movie) • All pixels are sampled at the same moment • But display is sequential, causing motion artifacts • Still common in video • 1080i is standard • 1080p is becoming more common Big battle during definition of HDTV!
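The field structure described above can be illustrated with a small sketch. It assumes a frame is a list of scanlines numbered from 1, so the "odd" field is lines 1, 3, 5, and so on; the function names are illustrative.

```python
def split_fields(frame):
    """Split a frame (list of scanlines) into its odd and even fields.
    Scanlines are numbered from 1, so line 1 is frame[0]."""
    odd_field = frame[0::2]    # lines 1, 3, 5, ...
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field

def weave(odd_field, even_field):
    """Reassemble a full frame by alternating lines from the two fields."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame

frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_fields(frame)
print(odd, even)
```

Splitting one frame this way corresponds to the "interlaced frames" case on the slide: both fields come from the same moment in time but are displayed one after the other, which is where the motion artifacts come from. Avoiding them requires sampling each field at its own display time.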

  20. Interlacing and antialiasing • Small moving objects can disappear • Object subtends a single pixel • Fields are rendered properly (not from a single frame) • One solution is antialiasing with a large filter kernel • Rendered objects necessarily subtend more than a single pixel (Figure: field n, field n+1, field n+2)

  21. Color sequential displays • Time-sequential red, green, blue (and sometimes white) • Examples: • Many digital projectors • Professional head-mounted displays • Should render each “frame” separately • Movies don’t • So time sequential projectors yield “rainbow” effects • Simulation systems do • So motion artifacts are avoided

  22. Mach banding – slope discontinuities Same peak intensities

  23. Human response is not linear • Twice as many photons/sec does not appear twice as bright • Instead 5.7 times as many photons appear twice as bright • Brightness (human perception) and intensity (actual photon rate) are related by Stevens’ Power Law: B = k I^0.4
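Stevens' Power Law has the general form B = k · I^a. The exponent implied by this slide can be recovered from the "5.7 times as many photons appear twice as bright" figure. A short sketch, with k set to 1 as an arbitrary assumption:

```python
import math

def stevens_exponent(intensity_ratio: float, brightness_ratio: float) -> float:
    """Solve brightness_ratio = intensity_ratio ** a for the exponent a."""
    return math.log(brightness_ratio) / math.log(intensity_ratio)

def brightness(intensity: float, k: float = 1.0, exponent: float = 0.4) -> float:
    """Perceived brightness under a power law; k = 1 is an arbitrary scale."""
    return k * intensity ** exponent

a = stevens_exponent(5.7, 2.0)
doubling = brightness(2.0) / brightness(1.0)

print(f"implied exponent: {a:.3f}")
print(f"doubling intensity scales brightness by {doubling:.3f}")
```

The implied exponent comes out very close to 0.4, and with that exponent doubling the photon rate raises perceived brightness by only about a third.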

  24. Human sensitivity is not linear either • Can distinguish intensity differences of 1% • Static images • Photopic (intensities bright enough for cones to see) • This corresponds to a linear change in brightness

  25. Motion matters

  26. Numeric representation • Optimal numeric representation would arrange for adjacent intensities to be (barely) indistinguishable. • Thus optimal numeric representation is • nonlinear in intensity (relative differences of 1 percent) • but linear in brightness (absolute differences of k(0.01)^0.4)

  27. Contrast ratio • Visible contrast: • 4-5 orders of magnitude within a scene (at the same time) • 6 orders of magnitude of “adaptation” • Can take up to 40 minutes, though

  28. Solutions • Brightness-linear storage • Use linear arithmetic (get incorrect answers) • Use non-linear arithmetic (get correct answers) • Convert to intensity-linear, operate, convert back • Implement nonlinear arithmetic • Intensity-linear storage • Gamma correct (convert to brightness-linear form) when displaying
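The difference between the two arithmetic options for brightness-linear storage can be seen in a 50/50 blend of black and white. The sketch below assumes 8-bit storage and a pure power-law transfer with gamma 2.4 (the value that appears later in this lecture); real display transfer functions differ in detail, and the function names are illustrative.

```python
GAMMA = 2.4  # assumed pure power law; not a full display model

def to_intensity(code: int) -> float:
    """Decode an 8-bit brightness-linear code to intensity in [0, 1]."""
    return (code / 255.0) ** GAMMA

def to_code(intensity: float) -> int:
    """Encode an intensity in [0, 1] as an 8-bit brightness-linear code."""
    return round(255.0 * intensity ** (1.0 / GAMMA))

def blend_naive(a: int, b: int) -> int:
    """Linear arithmetic applied directly to stored codes (incorrect)."""
    return round((a + b) / 2)

def blend_correct(a: int, b: int) -> int:
    """Convert to intensity-linear, operate, convert back."""
    return to_code((to_intensity(a) + to_intensity(b)) / 2.0)

naive = blend_naive(0, 255)
correct = blend_correct(0, 255)
print(f"naive blend: {naive}, intensity-correct blend: {correct}")
```

Averaging the stored codes lands near mid-scale, whereas half the photons of full white encodes to a noticeably higher code. That gap is the "incorrect answer" the slide warns about, and the extra conversions are the per-operation cost of brightness-linear storage.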

  29. Brightness-linear storage • Intensities can be added, brightnesses cannot • Store image linear in brightness (unusual in 3-D systems) • Best use of available storage precision • 256 representable levels are enough • Requires conversion for each pixel operation (e.g., blend) (Diagram: 8-bit framebuffer, gamma converter, display DAC; path widths labeled n, 8, 8)

  30. Intensity-linear storage • Store image linear in intensity (typical in 3-D systems) • Native arithmetic format • Requires conversion during display • Large brightness steps at low intensities • 256 DAC levels is OK, but frame buffer needs more (Diagram: n-bit framebuffer, gamma converter, display DAC; path widths labeled n, n, 8)

  31. What is n? • Assume 8-bit DAC • Gamma of 2.4 …

  32. Display gamut No finite set of primaries can reproduce the entire gamut. But more primaries do a better job.

  33. Perception and Performance (adapted from my VR2004 keynote)

  34. Latency • For an out-the-window display • 100 to 150 milliseconds • For a head-mounted display • 5 to 15 milliseconds** • Total response latency, sum of • Tracking/input delay, plus • Rendering delay, plus • Display delay • A 72 Hz display refreshes every 14 ms ** source: Fred Brooks

  35. Latency solution • Reduce system latency to 5-15 ms range • Requires 2-4 ms frame time (250-500 Hz) • Assuming 3-frame latency • Estimated cost: 5x
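The frame-time numbers in the two latency slides can be checked with a few lines of arithmetic. This sketch assumes, as the slide does, a 3-frame pipeline; the function names are illustrative.

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Duration of one frame, in milliseconds, at the given refresh rate."""
    return 1000.0 / refresh_hz

def pipeline_latency_ms(refresh_hz: float, frames_of_latency: int = 3) -> float:
    """Total latency when the pipeline is `frames_of_latency` frames deep."""
    return frames_of_latency * frame_time_ms(refresh_hz)

for hz in (72, 250, 500):
    print(f"{hz:>3} Hz: frame {frame_time_ms(hz):5.2f} ms, "
          f"3-frame latency {pipeline_latency_ms(hz):5.2f} ms")
```

At 72 Hz a single refresh already takes about 14 ms, so three frames of latency far exceed the 5 to 15 ms target for head-mounted displays. Frame times of 2 to 4 ms (250 to 500 Hz) bring a 3-frame pipeline to 6 to 12 ms, inside that range.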

  36. Running total

  37. Stereo solution • Binocular disparity is a very strong visual cue • Must render separately for each eye • Occlusion • View-dependent lighting (e.g. reflections, specularity) • Alternatives tend to be hacks • Estimated cost: 2x

  38. Running total

  39. Incorrect retinal cue – blur gradient Correct Incorrect

  40. Focus cue solution • Multiple image plane display • Fixed relationship to viewer (e.g. head mounted) • Low resolution in depth • Non-occluding images with depth filtering • Separate left and right displays (2x cost already accounted) • Leverages 2D technology • Amounts to a 2.5D display • Cost estimate: 3x

  41. Running total

  42. High Dynamic Range (HDR) • Human limitations • 1,000,000:1 range of sensitivity • 100,000:1 contrast within scene • Current displays • CRT 300:1 contrast ratio • LCD 1000:1 contrast ratio • SIGGRAPH 2003 ET • Sunnybrook Technologies (Numbers from Sunnybrook Technologies)

  43. Sunnybrook Technologies • Dual-density display • Conventional LCD panel in front (full-resolution) • White LED array used as back-light (~1/50 resolution)

  44. Sunnybrook Technologies • Scattering masks low resolution LEDs

  45. HDR solution • Requires 16-bit framebuffer components • Rendering • Blending • Full-scene anti-aliasing • Requires multi-resolution rendering • Full-resolution for LCD, corrected for back-lighting • Low-resolution for back-lighting • Estimated cost: 2x

  46. Running total

  47. Field of view • Human field of view (FOV) • Monocular: 160 deg (wide) x 135 deg (high) • Binocular: 200 deg (wide) • Binocular overlap: 120 deg (wide) • Typical screen FOV • 55 deg (wide) x 41 deg (high)

  48. Optical flow matters “Women Go With the (Optical) Flow”, Desney S. Tan, Mary Czerwinski, George Robertson. http://research.microsoft.com/users/marycz/chi2003flow.pdf

  49. FOV solution • Double horizontal FOV to 110 degrees • Double vertical FOV to 80 degrees • Cleverness to distribute resolution ? • e.g. cylindrical projection • Estimated cost: 5x
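The "Running total" slides are figures and their contents are not in this transcript. As a rough sketch, if the per-feature cost estimates quoted so far are assumed to compound multiplicatively (an assumption, not something the transcript states), the cumulative cost can be tallied like this:

```python
from itertools import accumulate
from operator import mul

# Cost estimates quoted in the slides above, in order of appearance.
COST_FACTORS = [
    ("latency", 5),
    ("stereo", 2),
    ("focus cues", 3),
    ("HDR", 2),
    ("field of view", 5),
]

# Assumption: the factors multiply, since each applies to the whole workload.
running = list(accumulate((factor for _, factor in COST_FACTORS), mul))

for (name, factor), total in zip(COST_FACTORS, running):
    print(f"{name:<14} {factor}x  -> running total {total}x")
```

Under that assumption, the five estimates listed up to this point combine to a factor of a few hundred over a baseline system.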

  50. Pixels subtend different angles • Assumes planar display Center pixel Edge Pixel Field of view
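How much the subtended angle varies across a planar display can be estimated from simple geometry. This sketch assumes equal-width pixels on a flat screen centered on the viewer; the 1000-pixel width is illustrative, and the two FOV values are the typical 55 degrees and the doubled 110 degrees from the slides above.

```python
import math

def pixel_angle_deg(index: int, n_pixels: int, fov_deg: float) -> float:
    """Angle subtended by pixel `index` (0 = left edge) on a planar display
    of `n_pixels` equal-width pixels spanning `fov_deg` horizontally.
    The viewing distance is normalized to 1."""
    half_width = math.tan(math.radians(fov_deg) / 2.0)
    pitch = 2.0 * half_width / n_pixels
    left = -half_width + index * pitch
    right = left + pitch
    return math.degrees(math.atan(right) - math.atan(left))

def center_to_edge_ratio(n_pixels: int, fov_deg: float) -> float:
    """How many times larger the center pixel's angle is than the edge pixel's."""
    center = pixel_angle_deg(n_pixels // 2, n_pixels, fov_deg)
    edge = pixel_angle_deg(0, n_pixels, fov_deg)
    return center / edge

ratio_55 = center_to_edge_ratio(1000, 55.0)
ratio_110 = center_to_edge_ratio(1000, 110.0)
print(f"55 deg FOV: {ratio_55:.2f}x, 110 deg FOV: {ratio_110:.2f}x")
```

The ratio is approximately 1 / cos² of the half-angle, so the mismatch grows quickly with field of view. This is one reason a wide planar display needs some cleverness, such as a cylindrical projection, to distribute resolution.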
