Jeff B. Pelz
Visual Perception Laboratory
Carlson Center for Imaging Science
Rochester Institute of Technology
Insights into High-level Visual Perception
or “Where You Look is What You Get”
Students
Roxanne Canosa (Ph.D. Imaging Science)
Jason Babcock (MS Color Science)
Eric Knappenberger (MS Imaging Science)
Dan Lerner (BS Imaging Science)
Marianne Lipps (BS Imaging Science)
“Optical Illusions”
Reveal the shortcomings of the visual system, and our
best effort to make sense from incomplete information
Outline
1. What are the fundamental limitations of the visual system?
2. What strategies are employed to compensate for those limitations?
3. Can we build tools that take advantage of those strategies to inform the design and evaluation of imaging systems?
4. Can we use our understanding of the human visual system to aid design of next-generation computer vision systems?
Introduction
• Visual perception is a complex process that unfolds over time, typically occurring at a level below conscious awareness.
• People are often unaware of the details of how they perform many tasks, including gathering visual information from the environment.
• By monitoring the eye movement patterns of observers as they perform a task, we can learn about task strategy and performance.
Fundamental Limitations
1. What are the fundamental limitations
of the visual system?
The Design of the Visual System
There were evolutionary pressures for high-acuity vision (human as predator) and a wide field of view (human as prey).
Even if the entire cortex were devoted to vision, there are not sufficient resources to represent a large visual field at high acuity.
The Foveal Compromise
The solution favored by nature represented a compromise between the two demands.
The foveal compromise makes use of:
A. Anisotropic sampling of the scene
B. Serial execution (task switching)
C. Limited internal representations
D. Focused attention
A. Anisotropic Sampling of the Visual Field
The foveal compromise
High-acuity central fovea
Limited-acuity periphery
Anisotropic Sampling of the Visual Field
[Figure: photoreceptor density across the retina (periphery, center, periphery)]
If you can read this you must be cheating. +
Anisotropic Sampling of the Visual Field
The visual field must be sampled by the high-acuity fovea:
If you can read this you must be cheating
The foveal compromise requires a mechanism for moving the eyes about the scene.
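The fall-off in acuity away from the fovea can be illustrated with a short sketch. It assumes a commonly used approximation in which the minimum resolvable angle grows linearly with eccentricity; the function name `relative_acuity` and the constant `e2_deg = 2.0` are illustrative assumptions, not values taken from these slides.

```python
def relative_acuity(eccentricity_deg: float, e2_deg: float = 2.0) -> float:
    """Acuity relative to the center of the fovea (1.0 at 0 degrees).

    Assumes the minimum resolvable angle grows linearly with eccentricity.
    e2_deg is the eccentricity at which acuity drops to half its foveal
    value; 2.0 degrees is an illustrative assumption.
    """
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return 1.0 / (1.0 + eccentricity_deg / e2_deg)


# Compare the fovea with increasingly peripheral locations.
for ecc in (0, 2, 10, 40):
    print(f"{ecc:>2} deg from fixation: relative acuity {relative_acuity(ecc):.2f}")
```

Under these assumptions, text only a few degrees from the fixation point is already sampled at a fraction of foveal acuity, which is why the eyes must move to read it.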
Outline
1. Fundamental limitations
2. What strategies are employed to compensate for those limitations?
Foveal Compromise: Eye Movements
Each eye has three agonist-antagonist muscle pairs to rotate the eye horizontally, vertically, and about the optical axis.
Background: Eye Movement Types
Types of Eye Movements
Smooth pursuit: match object motion
Vestibular-ocular response: compensate for self-motion
Vergence: merge images at different distances
Saccades: move fovea to new location
Destabilizing Eye Movements
Image stabilization:
Smooth pursuit
Vestibular-ocular response
Vergence
Image destabilization:
Saccades: shift the fovea to a new image region
• Saccades
Amplitude: < 1° to > 45° visual angle
Velocity: > 600°/second
Frequency: ~3-4/second (> 150,000/day)
Saccades are made to targets requiring high spatial resolution and to the locus of attention.
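The daily saccade count follows from the per-second rate by simple arithmetic. The sketch below assumes 16 waking hours per day; that figure and the function name `saccades_per_day` are assumptions for illustration, not values from the slides.

```python
WAKING_HOURS = 16  # assumption for illustration, not stated in the slides


def saccades_per_day(rate_per_second: float, waking_hours: float = WAKING_HOURS) -> int:
    """Estimate daily saccades from a steady per-second rate while awake."""
    return round(rate_per_second * waking_hours * 3600)


low = saccades_per_day(3)   # lower end of the ~3-4/second range
high = saccades_per_day(4)  # upper end of the range
print(f"{low:,} to {high:,} saccades per day")
```

With these assumptions both ends of the range exceed the 150,000/day figure quoted above.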
B. Serial Execution: Sequential Sampling
Serial Execution: Foveations
With each eye movement, the fovea ‘slides under’ a new portion of the retinal image.
A new portion of the image is sampled, but each new sample is centered on the fovea.
C. Internal Representation
[Figure: images A and B]
If a high-acuity internal representation is built
up over multiple fixations, it should be easy to
detect even small differences between images.
Internal Representation
Following are two versions of the school
children, separated by a blank slide.
There is a difference between the two;
your task is to identify the difference.
View them in alternation, trying to find
the difference. The difference is clearly
visible in the slide at the end.
A: View ~3 sec, then advance
[Blank slide]: View ~1/2 sec, then continue
B: View ~3 sec, then REVERSE
A: Compare to previous slide
Limited Neural Resources
Something beyond variable acuity is responsible.
Deploying attention to different areas in sequence conserves limited resources.
Changes to the scene can be made to unattended regions without affecting conscious perception.
In nature, such changes usually induce apparent motion, drawing attention to the region.
Serial Execution: Eye Movements
The limited-acuity periphery must be sampled by the high-acuity fovea, resulting in serial data acquisition.
The eye movements guiding that acquisition are externally observable markers of acuity demands, deployment of attention, and perceptual strategies.
Serial Execution: Image Preference
3 sec viewing
Outline
1. Fundamental limitations
2. Strategies to compensate for limitations
3. Can we build tools that take advantage of those strategies to inform the design and evaluation of imaging systems?
Measuring eye movements
The Problem:
“After all, the eye is sitting in a bag of fat in a hole in your head, and there are six big muscles pulling on it.”
Cornsweet, 1976
The Solution:
“Barlow photographed a droplet of mercury placed on the limbus. Translations of the head were minimized by having subjects lie on a stone slab with their heads wedged tightly inside a rigid iron frame”
Kowler, 1990
Measuring eye movements
Video-based eyetracker
Limbus eyetracker
Scleral eye-coils
Dual Purkinje eyetracker
Head-mounted eyetracker
Infrared / Video
Headband-mounted eyetracker
Infrared, Video-based Eyetrackers
• Bright Pupil; On-axis Illumination
IRED
IR camera
Remote eyetracker
Infrared / Video
Remote-head eyetracker
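Gaze samples from a video-based eyetracker are usually segmented into fixations and saccades before analysis. The sketch below shows one common approach, a simple velocity threshold; it is not necessarily the method used in this work. The sample layout `(time_s, x_deg, y_deg)`, the function name `classify_samples`, and the 30°/second threshold are assumptions for illustration.

```python
import math
from typing import List, Tuple

# One gaze sample: (time in seconds, horizontal angle, vertical angle), in degrees.
Sample = Tuple[float, float, float]


def classify_samples(samples: List[Sample], velocity_threshold: float = 30.0) -> List[str]:
    """Label each sample 'fixation' or 'saccade' by point-to-point angular velocity.

    velocity_threshold is in degrees/second; 30.0 is an illustrative assumption.
    """
    if not samples:
        return []
    labels = ["fixation"]  # the first sample has no predecessor to compare with
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        speed = math.hypot(x1 - x0, y1 - y0) / dt  # degrees per second
        labels.append("saccade" if speed > velocity_threshold else "fixation")
    return labels


# Hypothetical 60 Hz recording: steady gaze, a 10-degree jump, steady gaze again.
recording = [
    (0 / 60, 0.0, 0.0),
    (1 / 60, 0.1, 0.0),
    (2 / 60, 5.0, 0.0),
    (3 / 60, 10.0, 0.0),
    (4 / 60, 10.1, 0.0),
]
labels = classify_samples(recording)
print(labels)
```

Consecutive 'fixation' samples can then be merged into fixation events with a start time and duration, which is the form used in scanpath and fixation-density plots.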
Change Blindness
Human Computer Interface
[Figure legend: symbol = 250 ms]
Visualization
Image & Subject Dependence
Radiographic Search: Scanpath
Radiographic Search: Fixation Density
Measuring eye movements
These commercially available eyetrackers are restricted to laboratory use.
The ability to monitor perception as people perform real tasks in the real world would allow us to ask new kinds of questions.
RIT Wearable Eyetracker
color CMOS scene camera
calibration laser
hot mirror
folding mirror
IR illuminator/optics module
monochrome CMOS eye camera
Fixation Sequence Before Image Capture
Complex, Familiar Tasks
Outline
1. Fundamental limitations
2. Strategies to compensate for limitations
3. Build design and evaluation tools
4. Can we use our understanding of the human visual system to aid design of next-generation computer vision systems?
Motivation
Because vision is effortless for humans, computer vision was chosen as an early research domain.
Early attempts at computer vision systems attacked the problem by brute force with limited success:
Tried Image Understanding on static 2D images (“From Pixels to Predicates”)
Limited Computational Resources
Even in the face of Moore’s Law, computers will not have sufficient power in the foreseeable future to solve “vision” by brute force.
Computer-based perception faces the same fundamental challenge that human perception did during evolution:
limited computational resources
The Foveal Compromise
The solution favored by nature:
A. Anisotropic sampling of the scene
B. Serial execution (task switching)
C. Limited internal representations
D. Focused attention
Motivation: Cognitive Science
Sensorial Experience
High-level Visual Perception
Attentional Mechanisms
Eye Movements
Motivation: Cognitive Science
Human Cognition
Attentional Mechanisms
Eye Movements
Inspiration - Active Vision
Artificial Intelligence
Computer Vision
“Active Vision”
Human Cognition
Sensorial Experience
High-level Visual Perception
Active Vision
Active vision was the first step. Unlike traditional approaches to computer vision, active vision systems focused on extracting information from dynamic, 3D scenes.
CS @ U Penn
Vision & robotics @ UR
Aloimonos, 1987; Bajcsy, 1988; Ballard, 1989; Brooks, 1991
Inspiration - “Active Vision”
Inspired by anisotropic, binocular vision in humans, researchers built neuromorphic vision systems that took advantage of ‘active’ cameras.
Humanoid robotics @ MIT
Vision & robotics @ UR
Visual routines were an important component
of the Active Vision approach. Pre-defined
routines are scheduled and run to extract
information when and where it is needed.
Limited representation + task-switching
Perceptual Strategies
The deployment of attention and eye movements is controlled below conscious awareness; there must be mechanisms (strategies) that protect us from the constraints of visual perception in the real world and help us make sense of the incomplete data available.
Perceptual Strategies
Beyond the mechanics of how the eyes move during real tasks, we are interested in strategies that may support the conscious perception that is continuous temporally as well as spatially.
Goal - “Strategic Vision”
Strategic Vision can use high-level, top-down strategies for extracting information from complex environments.
One goal of our research is to study human behavior in natural, complex tasks to search for visual routines that emerge under real-world constraints.
Perceptual Strategies
Limited representations: Successive Foveations
[Image sequence: frames at 0 msec, 770 msec, 1400 msec, 2000 msec, 2700 msec, and 2800 msec]
Perceptual Strategies: Look-ahead fixations
[Figure: timeline labeled guiding fixation, look-ahead fixation, and interaction; intervals of 2000 msec and 800 msec]
Perceptual Strategies: Look-ahead fixations
[Figure: two timelines of Sub-tasks and Fixations. Interposed look-ahead: axis from 0 to 5000 milliseconds, with intervening tasks marked. Sequenced look-ahead: axis from 2000 to 7000 milliseconds.]
Perceptual Strategies: Look-ahead fixations
Humans employ strategies to ease the computational and memory loads inherent in complex tasks. Look-ahead fixations represent one such strategy:
Opportunistic execution of information-gathering visual routines to pre-fetch information needed for future subtasks.
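One way to make the idea concrete is to flag fixations that land on an object while the observer is still handling a different one, and before the fixated object is itself used. The sketch below is a simplified, hypothetical operational definition, not the analysis used in this research; the `Fixation` and `Interaction` records, the function `find_look_aheads`, and the 10,000 ms lead window are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fixation:
    start_ms: int
    target: str  # object being fixated


@dataclass
class Interaction:
    start_ms: int
    end_ms: int
    target: str  # object being manipulated


def find_look_aheads(
    fixations: List[Fixation],
    interactions: List[Interaction],
    max_lead_ms: int = 10_000,  # illustrative assumption
) -> List[Fixation]:
    """Return fixations on an object made during work on a different object
    and followed, within max_lead_ms, by an interaction with the fixated object."""
    look_aheads = []
    for fix in fixations:
        # Is another object being handled at the moment of this fixation?
        busy_elsewhere = any(
            act.start_ms <= fix.start_ms < act.end_ms and act.target != fix.target
            for act in interactions
        )
        # Does an interaction with the fixated object begin soon afterwards?
        used_later = any(
            act.target == fix.target and 0 < act.start_ms - fix.start_ms <= max_lead_ms
            for act in interactions
        )
        if busy_elsewhere and used_later:
            look_aheads.append(fix)
    return look_aheads


# Hypothetical hand-washing record: faucet handled first, soap handled later.
interactions = [Interaction(0, 3000, "faucet"), Interaction(5000, 7000, "soap")]
fixations = [
    Fixation(500, "faucet"),  # guides the current action
    Fixation(2000, "soap"),   # glance at the soap while still using the faucet
    Fixation(4800, "soap"),   # guides the upcoming reach; nothing else in hand
    Fixation(5500, "soap"),   # guides the current action
]
result = find_look_aheads(fixations, interactions)
print([(f.start_ms, f.target) for f in result])
```

In this made-up record only the glance at the soap during faucet use qualifies, since it gathers information for a sub-task that has not yet started.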
Conclusions
• Monitoring eye movements gives us a window into perception and cognition that can reveal details not available even to the observer.
• The visual strategies observed can help us understand how people use vision in their interaction with the world, and perhaps aid in the design of artificial systems that take advantage of this knowledge.
Tools that monitor subjects’ eye movements can aid in the design and evaluation of imaging systems.
The design of next-generation computer vision systems may be aided by implementing algorithms derived from an understanding of the strategies the human visual system employs to compensate for limited computational resources.
Questions?