Studying Visual Attention with the Visual Search Paradigm
Marc Pomplun
Department of Computer Science
University of Massachusetts at Boston
E-mail: [email protected]
Homepage: http://www.cs.umb.edu/~marc/
Overview:
• The Feature Integration Theory
• Visual Search
• The Guided Search Theory
• The Area Activation Model
The Binding Problem
• Different features of the visual scene are coded by separate systems
  – e.g., direction of motion, location, color and orientation
• How do we know this?
  – Anatomical & neurophysiological evidence
  – Brain imaging (fMRI & PET)
• So how do we experience a coherent world?
Feature Integration Theory (Treisman et al.)
• Attention is used to bind features together
• Code one object at a time on the basis of its location
• Bind together whatever features are attended at that location
Feature Integration Theory
• Sensory “features” (color, size, orientation, etc.) are coded in parallel by specialized modules
• Modules form two kinds of “maps”
  – Feature maps (e.g., color maps, orientation maps, etc.)
  – A master map of locations
Feature Integration Theory
• Feature maps contain two kinds of information:
  – presence of a feature anywhere in the field (“there’s something red out there”)
  – implicit spatial information about the feature
• Activity in the feature maps can tell us which features are contained in the visual scene.
• It cannot tell us which other features the “green blob” has.
• The master map codes the location of features.
Feature Integration Theory
The basic idea of the FIT is that visual attention is used for
• Locating features
• Binding appropriate features together
There are two stages of object perception:
• Preattentive stage: Individual features are extracted in parallel across the whole visual scene.
• Attentive stage: When attention is directed to a location, the local features are combined to form a whole.
Feature Integration Theory
• Attention moves within the location map
• Focus of attention selects whatever features are linked to that location
• Features of other objects are excluded
• Attended features are then entered into the current temporary object representation
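The two stages described so far can be made concrete with a small sketch. This is only an illustration under simplifying assumptions, not an implementation from the FIT literature: the 4x4 grid, the feature names, and the functions `features_present`, `master_map`, and `bind_at` are all hypothetical.

```python
import numpy as np

# Hypothetical 4x4 visual field containing two objects.
# Each feature map marks where one feature occurs (illustrative names only).
SHAPE = (4, 4)
feature_maps = {
    name: np.zeros(SHAPE, dtype=bool)
    for name in ("red", "green", "vertical", "horizontal")
}
feature_maps["red"][1, 2] = True
feature_maps["vertical"][1, 2] = True
feature_maps["green"][3, 0] = True
feature_maps["horizontal"][3, 0] = True


def features_present(maps):
    """Preattentive stage: which features occur anywhere in the field."""
    return {name for name, fmap in maps.items() if fmap.any()}


def master_map(maps):
    """Master map of locations: True wherever any feature is present."""
    return np.logical_or.reduce(list(maps.values()))


def bind_at(maps, location):
    """Attentive stage: bind all features linked to the attended location."""
    return {name for name, fmap in maps.items() if fmap[location]}


print(features_present(feature_maps))  # all four features, but not how they combine
print(bind_at(feature_maps, (1, 2)))   # the features of the attended object only
```

Note that `features_present` alone cannot say whether the red item is vertical or horizontal; only `bind_at`, which needs an attended location, recovers the conjunction.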
Feature Integration Theory
Empirical evidence for the FIT has been obtained through
• Visual search tasks
• Illusory conjunctions
We will focus on the paradigm of visual search.
Visual Search
Feature Search
• Is there a red T in the display?
[Figure: search display of T’s in which the target differs from the distractors by a single feature]
• Target defined by a single feature
• According to FIT, this should not demand attention
• Target should “pop out”
Conjunction Search
• Is there a red T in the display?
[Figure: search display of T’s and X’s]
• Target is now defined by its shape and color
• This involves binding features and so should demand attention
• Need to attend to each item until target is found
Feature Search
Changing the number of distractors:
[Figure: feature search displays of T’s with increasing numbers of distractors]
Conjunction Search
Changing the number of distractors:
[Figure: conjunction search displays of T’s and X’s with increasing numbers of distractors]
Visual Search Experiments
• Record time taken to determine whether target is present or not
• Vary the number of distractors
• Search for features should be independent of the number of distractors
• Conjunction search should get slower with more distractors
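These two predictions can be illustrated with a toy simulation. It is a sketch under strong assumptions that are not part of the slides: a feature target is found in a single step, a conjunction target is found by a serial, self-terminating scan in random order, and the display size counts the target. The function names are hypothetical.

```python
import random


def items_inspected(display_size, search_type, target_present, rng):
    """Number of items attended before responding (toy FIT account)."""
    if search_type == "feature":
        # Pop-out: one step regardless of display size (simplification).
        return 1
    if not target_present:
        # Serial exhaustive scan: every item must be rejected.
        return display_size
    # Serial self-terminating scan: the target is equally likely
    # to be the 1st, 2nd, ..., or last item inspected.
    return rng.randint(1, display_size)


def mean_inspected(display_size, search_type, trials=10000, seed=0):
    """Average number of inspected items over target-present trials."""
    rng = random.Random(seed)
    total = sum(
        items_inspected(display_size, search_type, True, rng)
        for _ in range(trials)
    )
    return total / trials


for n in (1, 5, 15, 30):
    print(n, mean_inspected(n, "feature"), mean_inspected(n, "conjunction"))
```

Under these assumptions the conjunction column grows roughly as (n + 1) / 2 while the feature column stays constant, which is the qualitative pattern the experiments test for.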
Visual Search
[Figure: plot over display size (1, 5, 15, 30) with vertical axis ticks from 0 to 3000; series: Feature Target, Conjunction Target]
• Conjunction targets demand serial search: significant slope
• Feature targets pop out: flat display size function
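The slope of the display size function can be estimated with an ordinary least-squares line fit. The sketch below uses made-up values purely for illustration; they are NOT read from the figure on this slide, and the function name `search_slope` is hypothetical.

```python
import numpy as np

display_sizes = np.array([1, 5, 15, 30])

# Hypothetical mean search times in arbitrary units (illustration only,
# not data from the lecture).
feature_times = np.array([500.0, 505.0, 498.0, 502.0])
conjunction_times = np.array([520.0, 700.0, 1150.0, 1825.0])


def search_slope(sizes, times):
    """Least-squares slope and intercept of search time over display size."""
    slope, intercept = np.polyfit(sizes, times, 1)
    return float(slope), float(intercept)


print(search_slope(display_sizes, feature_times))      # slope close to zero
print(search_slope(display_sizes, conjunction_times))  # clearly positive slope
```

A slope near zero corresponds to a flat display size function (pop-out); a clearly positive slope is the signature of serial search.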
Problem with FIT: Pop-Out of Conjunction Targets
• A moving X pops out of a display of moving O’s and static X’s
[Figure: search display of O’s and X’s]
• Target is defined by a conjunction of movement and form
• At least some conjunctions do not require focal attention
Guided Search Theory
The Guided Search Theory (GST) is similar to the FIT in that it also assumes two successive stages of visual search performance:
• a preattentive, parallel stage
• an attentive, serial stage
However, the main difference from FIT is that GST assumes that the preattentive stage yields spatial saliency information, which is then used to guide attention in the serial stage.
Guided Search Theory
According to GST, saliency is encoded in an additional map, called the saliency map.
The saliency map is created during the preattentive stage and can combine multiple features if necessary.
In the subsequent serial search process, attention is first directed to the highest “peak” in the saliency map, then to the second-highest, and so on.
This visual guidance allows efficient search even for some conjunction targets.
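The guidance mechanism can be sketched as follows. This is a deliberately minimal illustration, not a published Guided Search implementation: saliency here is only a weighted count of the dimensions on which an item matches the target, and the item representation, the weights, and the function names are assumptions.

```python
def saliency(item, target, weights):
    """Weighted count of the dimensions on which the item matches the target."""
    return sum(w for dim, w in weights.items() if item[dim] == target[dim])


def guided_search(items, target, weights):
    """Attend items from highest to lowest saliency until the target is found.

    Returns the number of items attended, or None if no target is present.
    """
    order = sorted(items, key=lambda it: saliency(it, target, weights),
                   reverse=True)
    for n, item in enumerate(order, start=1):
        if all(item[dim] == target[dim] for dim in target):
            return n
    return None


# Hypothetical conjunction display: a red T among red X's and green T's.
target = {"color": "red", "shape": "T"}
items = [
    {"color": "red", "shape": "X"},
    {"color": "green", "shape": "T"},
    {"color": "red", "shape": "X"},
    {"color": "green", "shape": "T"},
    {"color": "red", "shape": "T"},  # the target
]

print(guided_search(items, target, {"color": 1.0, "shape": 1.0}))
```

With both dimensions contributing, the target is the only item matching on two dimensions, so it forms the highest peak and is attended first. This is how guidance can make a conjunction search efficient.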
Guided Search Theory
Support for the GST comes from eye-movement research.
Eye-movement recording allows researchers to determine the items that a subject looks at during visual search.
Guided Search Theory
[Figure: example of eye movements recorded during a visual search task]
Guided Search Theory
In the previous example,
• 80% of fixations were closest to an item sharing color with the target,
• 20% of fixations were closest to an item sharing orientation with the target.
It seems that the color dimension is guiding the subject’s visual search process.
Of course, due to imprecision of eye movements and their measurement, better statistics are necessary to determine the guiding dimension.
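Percentages of this kind can be computed by assigning each fixation to its nearest item. The sketch below assumes a conjunction display in which every distractor shares exactly one dimension with the target; the coordinates, the item representation, and the function names are hypothetical and chosen only to reproduce an 80/20 split.

```python
import math


def nearest_item(fixation, items):
    """Item whose position is closest to the fixation point."""
    return min(items, key=lambda it: math.dist(fixation, it["pos"]))


def saccadic_selectivity(fixations, items, target):
    """Share of fixations landing nearest to distractors of each type.

    Fixations nearest to the target itself (or to an item sharing no
    dimension with it) are not counted.
    """
    counts = {dim: 0 for dim in target}
    for fixation in fixations:
        item = nearest_item(fixation, items)
        shared = [dim for dim in target if item[dim] == target[dim]]
        if len(shared) == 1:
            counts[shared[0]] += 1
    total = sum(counts.values())
    return {dim: (c / total if total else 0.0) for dim, c in counts.items()}


# Hypothetical display and fixation positions (arbitrary units).
target = {"color": "red", "orientation": "vertical"}
items = [
    {"pos": (0, 0), "color": "red", "orientation": "vertical"},    # target
    {"pos": (10, 0), "color": "red", "orientation": "horizontal"},
    {"pos": (0, 10), "color": "red", "orientation": "horizontal"},
    {"pos": (10, 10), "color": "green", "orientation": "vertical"},
]
fixations = [(9, 1), (1, 9), (9, 0), (0, 11), (10, 9)]

print(saccadic_selectivity(fixations, items, target))
```

Because the nearest-item assignment is sensitive to measurement noise, a real analysis would need many trials before concluding which dimension guides the search.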
Guided Search Theory
In visual search tasks, subjects are usually guided by one target feature or a combination of target features.
This supports the idea of GST that preattentively derived information from multiple dimensions guides and thereby facilitates the subsequent serial search process.
Guided Search Theory
There are two problems with GST:
• According to GST, grouping the guiding distractors should result in reduced guidance (less bottom-up activation). However, the opposite happens.
• There is no quantitative implementation of a Guided Search model that could predict guidance, i.e., saccadic selectivity for a given search task.
To overcome these problems, we proposed the Area Activation Model of saccadic selectivity in visual search tasks.
Area Activation
Assumptions:
• Processing resources during a fixation are distributed like a two-dimensional Gaussian function centered at fixation.
• Fixation positions are chosen to allow a maximum of information processing according to the assumed processing resources.
• Scan paths are chosen in such a way that they connect the optimal fixation positions with minimal eye-movement cost (path length).
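The three assumptions can be strung together in a rough sketch. This is not the author's implementation: it assumes that the activation at a point is the sum of Gaussians centered on the guiding items (so it approximates how much relevant information a fixation at that point would cover), that fixation positions are the highest local peaks on a pixel grid, that the number of fixations is given, and that the scan path can be found by brute force over a handful of points. All names and parameter values are hypothetical.

```python
import itertools
import math

import numpy as np


def activation_map(items, sigma, size):
    """Sum of 2D Gaussians centered on the (x, y) positions of guiding items."""
    ys, xs = np.mgrid[0:size, 0:size]
    act = np.zeros((size, size))
    for x, y in items:
        act += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return act


def top_peaks(act, k):
    """The k highest strict local maxima of the map, as (x, y) positions."""
    h, w = act.shape
    padded = np.pad(act, 1, constant_values=-np.inf)
    is_peak = np.ones((h, w), dtype=bool)
    for dy, dx in itertools.product((-1, 0, 1), repeat=2):
        if (dy, dx) != (0, 0):
            neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            is_peak &= act > neighbor
    ys, xs = np.nonzero(is_peak)
    ranked = sorted(zip(act[ys, xs], xs, ys), reverse=True)[:k]
    return [(int(x), int(y)) for _, x, y in ranked]


def path_length(path):
    """Total Euclidean length of a sequence of points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def shortest_scan_path(start, points):
    """Order of visiting all points from start with minimal path length.

    Brute force over permutations; only feasible for a few fixations.
    """
    best = min(itertools.permutations(points),
               key=lambda order: path_length((start,) + order))
    return list(best)


# Hypothetical display: a cluster of three guiding items and one isolated item.
guiding_items = [(5, 5), (6, 5), (5, 6), (15, 14)]
act = activation_map(guiding_items, sigma=2.0, size=20)
peaks = top_peaks(act, k=2)
print(peaks)
print(shortest_scan_path((10, 10), peaks))
```

In this toy display the cluster produces a higher peak than the isolated item, which is in line with the observation that grouping guiding items increases rather than reduces guidance.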
Area Activation - Strong Guidance
[Figures not preserved]
Area Activation - Weak Guidance
[Figures not preserved]
Area Activation - Empirical Results
[Figure not preserved]
Area Activation
Problems with the Area Activation Model:
• Empirical number of fixations per trial needs to be known in advance.
• Only very basic factors influencing visual search have been implemented so far.
Nevertheless, Area Activation can be considered a very first step towards a quantitative model of visual search.
Conclusions
We have discussed how the visual search paradigm can be employed to investigate the mechanisms of visual attention.
Various models of attention have been developed and evaluated with visual search tasks; in more recent studies, this was done based on eye-movement data.
In the next lecture, we will look at slightly different paradigms, which are aimed at identifying factors that determine visual scan paths.
See you then!