Perceptual and Sensory Augmented Computing
Computer Vision WS 08/09
Image Segmentation
Vittorio Ferrari
ETH Zurich
Slide credits:
K. Grauman, B. Leibe, S. Lazebnik, S. Seitz,
Y. Boykov, W. Freeman, P. Kohli
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Examples of Grouping in Vision
Determining image regions
Grouping video frames into shots
Object-level grouping
Figure-ground
Slide credit: Kristen Grauman
What things should be grouped?
What cues indicate groups?
Similarity in appearance
http://chicagoist.com/attachments/chicagoist_alicia/GEESE.jpg, http://wwwdelivery.superstock.com/WI/223/1532/PreviewComp/SuperStock_1532R-0831.jpg
Slide adapted from Kristen Grauman
Symmetry
http://seedmagazine.com/news/2006/10/beauty_is_in_the_processingtim.php
Slide credit: Kristen Grauman
Common Fate
Image credit: Arthus-Bertrand (via F. Durand)
Slide credit: Kristen Grauman
Proximity
http://www.capital.edu/Resources/Images/outside6_035.jpg
Slide credit: Kristen Grauman
The Gestalt School
• Grouping is key to visual perception
• Elements in a collection can have properties that result
from relationships
“The whole is greater than the sum of its parts”
Illusory/subjective contours
Occlusion
Familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology
Slide credit: Svetlana Lazebnik Image source: Steve Lehar
Gestalt Theory
• Gestalt: whole or group
Whole is greater than sum of its parts
Relationships among parts can yield new properties/features
• Psychologists identified a series of factors that predispose
a set of elements to be grouped (by the human visual system)
Untersuchungen zur Lehre von der Gestalt,
Psychologische Forschung, Vol. 4, pp. 301-350, 1923
http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm
“I stand at the window and see a house, trees, sky.
Theoretically I might say there were 327 brightnesses
and nuances of colour. Do I have "327"? No. I have sky,
house, and trees.”
Max Wertheimer (1880-1943)
Slide credit: B. Leibe
Gestalt Factors
These factors make intuitive sense, but are very difficult to translate
into algorithms.
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Continuity through Occlusion Cues
Slide credit: B. Leibe
Continuity through Occlusion Cues
Continuity, explanation by occlusion
Slide credit: B. Leibe
Continuity through Occlusion Cues
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Continuity through Occlusion Cues
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Figure-Ground Discrimination
Slide credit: B. Leibe
The Ultimate Gestalt test
Slide adapted from B. Leibe
Image Segmentation
• Goal: identify groups of pixels that go together
Slide credit: Steve Seitz, Kristen Grauman
The Goals of Segmentation
• Separate image into objects
(Figure: image and its human segmentations)
Slide credit: Svetlana Lazebnik
The Goals of Segmentation
• Separate image into objects
• Group together similar-looking pixels for efficiency of
further processing
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
“superpixels”
Slide credit: Svetlana Lazebnik
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Image Segmentation: Toy Example
• These intensities define the three groups.
• We could label every pixel in the image according to
which of these it is.
i.e. segment the image based on the intensity feature.
• What if the image isn't quite so simple?
(Figure: input image and its intensity histogram, with three peaks for black, gray, and white pixels, labeled 1, 2, 3)
Slide credit: Kristen Grauman
(Figure: input images and their intensity histograms; axes: intensity vs. pixel count)
Slide credit: Kristen Grauman
• Now how to determine the three main intensities that define our groups?
• We need to cluster.
(Figure: input image and its intensity histogram)
Slide credit: Kristen Grauman
• Goal: choose three “centers” as the representative
intensities, and label every pixel according to which of
these centers it is nearest to.
• Best cluster centers are those that minimize the SSD between all points and their nearest cluster center c_i:
  E = Σ_{clusters i} Σ_{x in cluster i} ||x - c_i||^2
Slide credit: Kristen Grauman
(Figure: intensity histogram with the three cluster centers near 0, 190, and 255, labeled 1, 2, 3)
Clustering
• With this objective, it is a “chicken and egg” problem:
If we knew the cluster centers, we could allocate points to
groups by assigning each to its closest center.
If we knew the group memberships, we could get the centers by
computing the mean per group.
Slide credit: Kristen Grauman
K-Means Clustering
• Basic idea: randomly initialize the k cluster centers, and
iterate between the two steps we just saw.
1. Randomly initialize the cluster centers, c1, ..., cK
2. Given cluster centers, determine points in each cluster
– For each point p, find the closest ci. Put p into cluster i
3. Given points in each cluster, solve for ci
– Set ci to be the mean of points in cluster i
4. If ci have changed, repeat Step 2
• Properties
Will always converge to some solution
Can be a "local minimum"
– Does not always find the global minimum of the objective function
Slide credit: Steve Seitz
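As a concrete illustration, here is a minimal MATLAB sketch of this loop on 1-D intensities (the image variable im, K = 3, and the fixed iteration count are assumptions for illustration; the implicit expansion used below needs MATLAB R2016b or later):

% K-means from scratch on pixel intensities (illustrative sketch)
X = double(im(:));                           % N x 1 intensity feature per pixel
K = 3;
centers = X(randperm(numel(X), K));          % step 1: random initialization
for it = 1:50
    [~, labels] = min(abs(X - centers'), [], 2);   % step 2: assign each point to its closest center
    for k = 1:K                                    % step 3: recompute each center as the cluster mean
        if any(labels == k)
            centers(k) = mean(X(labels == k));
        end
    end
end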
Segmentation as Clustering
(Results shown for K = 2 and K = 3)
img_as_col = double(im(:));                  % one intensity value per pixel
cluster_membs = kmeans(img_as_col, K);       % K-means on the 1-D intensity feature
labelim = zeros(size(im));
for i = 1:K
    inds = find(cluster_membs == i);
    meanval = mean(img_as_col(inds));        % recolor each cluster by its mean intensity
    labelim(inds) = meanval;
end
Slide credit: Kristen Grauman
K-Means Clustering
• Java demo:
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
Feature Space
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based on
intensity similarity
• Feature space: intensity value (1D)
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based
on color similarity
• Feature space: color value (3D)
(Figure: pixels plotted in R-G-B color space; example values such as R=255, G=200, B=250 and R=3, G=12, B=2)
Slide credit: Kristen Grauman
Segmentation as Clustering
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based
on texture similarity
• Feature space: filter bank responses (e.g. 24D)
(Figure: filter bank of 24 filters F1, F2, …, F24; each pixel is described by its 24 filter responses)
Slide credit: Kristen Grauman
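As a rough sketch of such a feature space, the snippet below uses only three filters in place of the 24-filter bank; I is assumed to be a grayscale double image, and fspecial/imfilter/kmeans come from the Image Processing and Statistics Toolboxes:

% Per-pixel texture features from a tiny filter bank (illustrative sketch)
g = fspecial('gaussian', 13, 2);             % smoothing filter
[gx, gy] = gradient(g);                      % two oriented derivative-of-Gaussian filters
F = cat(3, imfilter(I, g), imfilter(I, gx), imfilter(I, gy));
feats = reshape(F, [], size(F, 3));          % one row of filter responses per pixel
labels = kmeans(feats, 5);                   % cluster pixels by texture (K = 5)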
Spatial coherence
• Assigning a cluster label per pixel can produce spatial discontinuities
• How can we ensure they
are spatially smooth?
(Figure: original image vs. image labeled by each cluster center's intensity; a pixel near a region boundary could plausibly go to cluster 1, 2, or 3)
Slide adapted from Kristen Grauman
Spatial coherence
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based on
intensity+position similarity
Way to encode both similarity and proximity.
Slide adapted from Kristen Grauman
(Figure: pixels plotted in the (X, Y, intensity) feature space)
K-Means without spatial information
• K-means clustering based on intensity or color is
essentially vector quantization of the image attributes
Clusters don't have to be spatially coherent
(Figure: image, intensity-based clusters, color-based clusters)
Slide adapted from Svetlana Lazebnik Image source: Forsyth & Ponce
K-Means with spatial information
• K-means clustering based on intensity or color is
essentially vector quantization of the image attributes
Clusters don't have to be spatially coherent
• Clustering based on (r,g,b,x,y) values enforces more
spatial coherence
Slide adapted from Svetlana Lazebnik Image source: Forsyth & Ponce
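A sketch of this idea, assuming im is an H x W x 3 double RGB image; the weight lambda is an illustrative choice (not from the slides) that trades off spatial proximity against color similarity:

% K-means on joint color + position features (illustrative sketch)
[H, W, ~] = size(im);
[X, Y] = meshgrid(1:W, 1:H);
lambda = 0.5;                                % relative weight of position vs. color
feats = [reshape(double(im), H*W, 3), lambda*X(:), lambda*Y(:)];
labelim = reshape(kmeans(feats, 5), H, W);   % K = 5 spatially more coherent clusters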
Summary K-Means
• Pros
Simple, fast to compute
Converges to local minimum
of within-cluster squared error
• Cons/issues
Setting k?
Sensitive to initial centers
Sensitive to outliers
Detects spherical clusters only
Assuming means can be
computed
Slide credit: Kristen Grauman
Probabilistic Clustering
• Basic questions
What's the probability that a point x is in cluster m?
What's the shape of each cluster?
• K-means doesn't answer these questions.
• Basic idea
Instead of treating the data as a bunch of points, assume that
they are all generated by sampling a continuous function.
This function is called a generative model.
Defined by a vector of parameters θ
Slide credit: Steve Seitz
Mixture of Gaussians
• One generative model is a mixture of Gaussians (MoG)
K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
– Blob b is defined by:  P(x | θ_b) = (2π)^(-d/2) |V_b|^(-1/2) exp( -1/2 (x - μ_b)^T V_b^(-1) (x - μ_b) )
Blob b is selected with probability α_b
The likelihood of observing x is a weighted mixture of Gaussians:
  P(x | θ) = Σ_b α_b P(x | θ_b),  with θ = (μ_1, V_1, α_1, …, μ_K, V_K, α_K)
Slide adapted from Steve Seitz
Expectation Maximization (EM)
• Goal
Find blob parameters θ that maximize the likelihood function
over all N data points
• Approach:
1. E-step: given current guess of blobs, compute probabilistic ownership
of each point
2. M-step: given ownership probabilities, update blobs to maximize
likelihood function
3. Repeat until convergence
Slide adapted from Steve Seitz
EM Details
• E-step
Compute the probability that point x_n is in blob b, given the current guess of θ:
  P(b | x_n, θ) = α_b P(x_n | θ_b) / Σ_j α_j P(x_n | θ_j)
• M-step (over the N data points)
Overall probability that blob b is selected:  α_b = (1/N) Σ_n P(b | x_n, θ)
Mean of blob b:  μ_b = Σ_n P(b | x_n, θ) x_n / Σ_n P(b | x_n, θ)
Covariance of blob b:  V_b = Σ_n P(b | x_n, θ) (x_n - μ_b)(x_n - μ_b)^T / Σ_n P(b | x_n, θ)
Slide adapted from Steve Seitz
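A compact MATLAB sketch of these E/M updates for a 1-D intensity mixture (initialization, iteration count, and variable names are illustrative assumptions, not part of the original slides):

% EM for a 1-D mixture of K Gaussians (illustrative sketch)
x = double(im(:));  N = numel(x);  K = 3;
mu = x(randperm(N, K))';                     % 1 x K means
v  = var(x) * ones(1, K);                    % 1 x K variances
a  = ones(1, K) / K;                         % mixing weights alpha_b
for it = 1:100
    % E-step: ownership r(n,b) = P(b | x_n, theta)
    lik = exp(-(x - mu).^2 ./ (2*v)) ./ sqrt(2*pi*v) .* a;
    r   = lik ./ sum(lik, 2);
    % M-step: re-estimate weights, means, variances from the ownerships
    Nb = sum(r, 1);
    a  = Nb / N;
    mu = sum(r .* x, 1) ./ Nb;
    v  = max(sum(r .* (x - mu).^2, 1) ./ Nb, eps);   % guard against collapsing variances
end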
Applications of EM
• Turns out this is useful for all sorts of problems
Any clustering problem
Any model estimation problem
Missing data problems
Finding outliers
Segmentation problems
– Segmentation based on color
– Segmentation based on motion
– Foreground/background separation
...
• EM demo
http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
Slide credit: Steve Seitz
Segmentation with EM
Image source: Serge Belongie
(Figure: original image and EM segmentation results for k = 2, 3, 4, 5)
Slide credit: B. Leibe
Summary: Mixtures of Gaussians, EM
• Pros
Probabilistic interpretation
Soft assignments between data points and clusters
Generative model, can predict novel data points
Relatively compact storage
• Cons
Initialization
– often a good idea to start from output of k-means
Local minima
Need to know number of components K
– solutions: model selection (AIC, BIC), Dirichlet process mixture
Need to choose the generative model (mathematical form of a cluster?)
Numerical problems are often a nuisance
Slide adapted from B. Leibe
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Finding Modes in a Histogram
• How many modes are there?
Mode = local maximum of a given distribution
Easy to see, hard to compute
Slide adapted from Steve Seitz
Mean-Shift Segmentation
• An advanced and versatile technique for clustering-
based segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis,
PAMI 2002.
Slide credit: Svetlana Lazebnik
Mean-Shift Algorithm
• Iterative Mode Search
1. Initialize random seed center and window W
2. Calculate the center of gravity (the "mean") of W, here shown for a uniform window:  mean(W) = (1/|W|) Σ_{x in W} x
3. Shift the search window to the mean
4. Repeat steps 2+3 until convergence
Slide adapted from Steve Seitz
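A minimal sketch of this iteration with a flat (uniform) window of radius h; the data matrix X, the starting center m, and the convergence tolerance are assumptions for illustration:

% Mean-shift mode seeking with a flat window (illustrative sketch)
function m = meanshift_mode(X, m, h)
% X: N x d data points, m: 1 x d initial window center, h: window radius
for it = 1:200
    inwin = sum((X - m).^2, 2) <= h^2;       % points inside the window W
    newm  = mean(X(inwin, :), 1);            % center of gravity (the "mean") of W
    if norm(newm - m) < 1e-6, break; end     % converged on a mode
    m = newm;                                % shift the window to the mean
end
end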
Mean-Shift
(Figure sequence: a window with its region of interest, center of mass, and mean-shift vector; at each iteration the window shifts to its center of mass, moving toward the mode until it converges)
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis
Tessellate the space with windows and run the procedure in parallel.
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis (cont.)
The blue data points were traversed by the windows towards the mode.
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering
• Cluster: all data points in the attraction basin of a mode
• Attraction basin: the region for which all trajectories
lead to the same mode
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering/Segmentation
• Choose features (color, gradients, texture, etc)
• Initialize windows at individual pixel locations
• Start mean-shift from each window until convergence
• Merge windows that end up near the same “peak” or
mode
Slide adapted from Svetlana Lazebnik
Mean-Shift Segmentation Results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Slide credit: Svetlana Lazebnik
More Results
Slide credit: Svetlana Lazebnik
Summary Mean-Shift
• Pros
General, application-independent tool
Model-free, does not assume any prior shape (spherical,
elliptical, etc.) on data clusters
Just a single parameter (window size h)
– h has a physical meaning (unlike k in k-means): the scale of the clustering
Finds variable number of modes given the same h
Robust to outliers
• Cons
Output depends on window size h
Window size (bandwidth) selection is not trivial
Computationally rather expensive
Does not scale well with dimension of feature space
Slide adapted from Svetlana Lazebnik
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Images as Graphs
• Fully-connected graph
Node (vertex) for every pixel
Edge between every pair of pixels (p,q)
Affinity weight wpq for each edge
– wpq measures similarity
– Similarity is inversely proportional to difference
(in color, texture, position, …)
(Figure: pixels p and q connected by an edge with affinity weight w_pq)
Slide adapted from Steve Seitz
Segmentation by Graph Cuts
• Break Graph into Segments
Delete edges crossing between segments
Easiest to break edges with low similarity (low weight)
– Similar pixels should be in the same segments
– Dissimilar pixels should be in different segments
(Figure: the graph broken into segments A, B, and C by deleting low-weight edges)
Slide adapted from Steve Seitz
Measuring Affinity
• Distance:  aff(x, y) = exp( -(1/(2σ_d^2)) ||x - y||^2 )
• Intensity:  aff(x, y) = exp( -(1/(2σ_I^2)) ||I(x) - I(y)||^2 )
• Color:  aff(x, y) = exp( -(1/(2σ_c^2)) dist(c(x), c(y))^2 )   (some suitable color space distance)
• Texture:  aff(x, y) = exp( -(1/(2σ_t^2)) ||f(x) - f(y)||^2 )   (vectors of filter outputs)
Source: Forsyth & Ponce
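A sketch of building such an affinity matrix for N pixels described by feature vectors; the dense N x N matrix is only practical for small problems, pdist2 is from the Statistics Toolbox, and the names f and sigma are illustrative:

% Dense affinity matrix from per-pixel features (illustrative sketch)
% f: N x d matrix, one feature vector (intensity, color, position, ...) per pixel
sigma = 0.1;
W = exp(-pdist2(f, f).^2 / (2*sigma^2));     % aff(x, y) = exp(-||f(x) - f(y)||^2 / (2*sigma^2))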
Scale Affects Affinity
• Small σ: group only nearby points
• Large σ: group far-away points
Slide adapted from Svetlana Lazebnik
(Figure: the same data points grouped with small, medium, and large σ)
Image Source: Forsyth & Ponce
Graph Cut (GC)
• GC = edges whose removal partitions a graph in two
• Cost of a cut
Sum of weights of cut edges:
• A graph cut gives us a segmentation
What is a “good” graph cut and how do we find one?
Slide adapted from Steve Seitz
cut(A, B) = Σ_{p∈A, q∈B} w_{p,q}
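Given such an affinity matrix W, the cost of a cut is a one-liner (sketch; inA is an assumed logical membership vector for segment A):

% Cost of the cut (A, B): total affinity crossing the partition (sketch)
cutAB = sum(sum(W(inA, ~inA)));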
Graph Cut
Image Source: Forsyth & Ponce
Here, the cut is nicely
defined by the block-diagonal
structure of the affinity matrix.
How can this be generalized?
Slide credit: B. Leibe
Minimum Cut
• We can do segmentation by finding the minimum cut in
a graph
Efficient algorithms exist for doing this
• Drawback:
Weight of cut proportional to number of edges in the cut
Minimum cut tends to cut off very small, isolated components
(Figure: the ideal cut vs. cuts with lesser weight that cut off small isolated components)
Slide credit: Khurram Hassan-Shafique
Normalized Cut (NCut)
• Min-cut has bias toward partitioning out small segments
• This can be fixed by normalizing for size of segments
• The normalized cut cost is:
• The exact solution is NP-hard but an approximation can
be computed by solving a generalized eigenvalue
problem.
NCut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
where assoc(A, V) = sum of the weights from nodes in A to all nodes in the graph
J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000
Slide adapted from Svetlana Lazebnik
Interpretation as a Dynamical System
• Treat the edges as springs and 'shake' the system
Elasticity proportional to cost
Vibration “modes” correspond to segments
– Can compute these by solving a generalized eigenvector problem
Slide adapted from Steve Seitz
NCuts as a Generalized Eigenvector Problem
• Definitions
  W: the affinity matrix, W(i, j) = w_{i,j}
  D: the diagonal matrix, D(i, i) = Σ_j W(i, j)
  x: a vector in {1, -1}^N, with x(i) = 1 iff i ∈ A
• Rewriting Normalized Cut in matrix form:
  NCut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
             = (1 + x)^T (D - W) (1 + x) / (k · 1^T D 1) + (1 - x)^T (D - W) (1 - x) / ((1 - k) · 1^T D 1) = …
  with k = Σ_{x_i > 0} D(i, i) / Σ_i D(i, i)
Slide credit: Jitendra Malik
Some More Math…
Slide credit: Jitendra Malik
NCuts as a Generalized Eigenvalue Problem
• After simplification, we get
  NCut(A, B) = y^T (D - W) y / (y^T D y),  with y_i ∈ {1, -b} and y^T D 1 = 0
  This is hard, as y is discrete! Relaxation: let y be continuous.
• This is a Rayleigh Quotient
  Solution given by the "generalized" eigenvalue problem  (D - W) y = λ D y
  Solved by converting to a standard eigenvalue problem
• Subtleties
  The smallest eigenvector is y = 1, with eigenvalue 0
  The optimal solution is the second smallest eigenvector
  Gives a continuous result; it must be converted into discrete values of y
Slide adapted from Alyosha Efros
NCuts Example
Smallest eigenvectors
Image source: Shi & Malik
NCuts segments
Slide credit: B. Leibe
NCuts: Overall Procedure
1. Construct a weighted graph G=(V,E) from an image.
2. Connect each pair of pixels, and assign graph edge weights
   W(i, j) = Prob. that i and j belong to the same region.
3. Solve (D - W) y = λ D y for the smallest few eigenvectors. This yields a continuous solution.
4. Threshold the eigenvectors to get a discrete cut.
   This is where the approximation is made (we're not solving the NP-hard problem).
5. Recursively subdivide if the NCut value is below a pre-specified value.
Slide credit: Jitendra Malik
NCuts Matlab code available at
http://www.cis.upenn.edu/~jshi/software/
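A small sketch of steps 3-4 on an affinity matrix W; for clarity it uses the dense generalized eigensolver eig (which only scales to small graphs), and thresholding at 0 is just one simple discretization:

% Relaxed normalized cut: solve (D - W) y = lambda * D y (illustrative sketch)
d = sum(W, 2);
D = diag(d);
[V, E] = eig(D - W, D);                      % generalized eigenvectors (dense; small N only)
[~, order] = sort(diag(E), 'ascend');
y = V(:, order(2));                          % second-smallest eigenvector
inA = y > 0;                                 % threshold the continuous solution into a cut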
Color Image Segmentation with NCuts
Image Source: Shi & Malik. Slide credit: Steve Seitz
Results with Color & Texture
Summary: Normalized Cuts
• Pros:
Generic framework, flexible to choice of function that computes
weights (“affinities”) between nodes
Does not require any model of the data distribution
• Cons:
Time and memory complexity can be high
– Dense, highly connected graphs: many affinity computations
– Solving eigenvalue problem
Preference for balanced partitions
– If a region is uniform, NCuts will find the
modes of vibration of the image dimensions
Slide credit: Kristen Grauman
Segmentation: Caveats
• We've looked at bottom-up ways to segment an image
into regions, yet finding meaningful segments is
intertwined with the recognition problem.
• Often want to avoid making hard decisions too soon
• Difficult to evaluate; when is a segmentation successful?
Slide credit: Kristen Grauman
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Markov Random Fields
• Allow rich probabilistic models for images
• But built in a local, modular way
Learn local effects, get global effects out
Slide credit: William Freeman
Observed evidence
Hidden “true states”
Neighborhood relations
MRF Nodes as Pixels (or Patches)
(Figure: observed image pixels y_i connected to hidden states x_i, e.g. foreground/background)
φ(x_i, y_i): image-state compatibility;  ψ(x_i, x_j): state-state compatibility between neighboring nodes
Slide adapted from William Freeman
Network Joint Probability
P(x, y) = Π_i φ(x_i, y_i) · Π_{(i,j)} ψ(x_i, x_j)
x: states,  y: image
φ: image-state compatibility function (local observations)
ψ: state-state compatibility function (neighboring nodes)
Slide adapted from William Freeman
Energy Formulation
• Joint probability
  P(x, y) = Π_i φ(x_i, y_i) · Π_{(i,j)} ψ(x_i, x_j)
• Maximizing the joint probability is the same as minimizing its negative log
  -log P(x, y) = -Σ_i log φ(x_i, y_i) - Σ_{(i,j)} log ψ(x_i, x_j) = E(x, y)
  E(x, y) = Σ_i φ(x_i, y_i) + Σ_{(i,j)} ψ(x_i, x_j),  with φ, ψ now denoting the negative log-potentials
• This is similar to free-energy problems in statistical mechanics (spin glass theory). We therefore draw the analogy and call E an energy function.
• φ and ψ are called potentials.
Slide credit: B. Leibe
Energy Formulation
• Energy function
  E(x, y) = Σ_i φ(x_i, y_i) + Σ_{(i,j)} ψ(x_i, x_j)
            (unary potentials)    (pairwise potentials)
• Unary potentials φ(x_i, y_i)
  Encode local information about the given pixel/patch
  How likely is a pixel/patch to be in a certain state (e.g. foreground/background)?
• Pairwise potentials ψ(x_i, x_j)
  Encode neighborhood information
  How different is a pixel/patch's label from that of its neighbor?
  (e.g. here independent of image data, but later based on intensity/color/texture difference)
Slide adapted from B. Leibe
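A sketch of evaluating this energy for a binary labeling on a 4-connected grid; phi (an H x W x 2 array of unary potentials), x (an H x W labeling with values 1 or 2), and the constant Potts cost lambda are all assumptions for illustration:

% MRF energy: sum of unary terms plus a Potts pairwise term (illustrative sketch)
[H, W, ~] = size(phi);
[rr, cc] = ndgrid(1:H, 1:W);
Eunary = sum(phi(sub2ind([H W 2], rr(:), cc(:), x(:))));
lambda = 1;                                            % constant penalty for differing neighbors
Epair  = lambda * ( nnz(x(1:end-1,:) ~= x(2:end,:)) ...   % vertical neighbor pairs
                  + nnz(x(:,1:end-1) ~= x(:,2:end)) );    % horizontal neighbor pairs
E = Eunary + Epair;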
Energy Minimization
• Goal:
Infer the optimal labeling of the MRF.
• Many inference algorithms are available, e.g. Gibbs sampling, simulated annealing
Iterated conditional modes (ICM)
Variational methods
Belief propagation
Graph cuts
• Recently, graph cuts have become a popular tool
  Only suitable for a certain class of energy functions
  But the solution can be obtained very fast for typical vision problems (~1 MPixel/sec).
Slide credit: B. Leibe
Graph Cuts for Optimal Boundary Detection
• Idea: convert MRF into source-sink graph
(Figure: source-sink graph with terminals s and t, n-links between neighboring pixels, hard-constraint t-links, and a cut)
Minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms)
[Boykov & Jolly, ICCV'01]  Slide adapted from Yuri Boykov
Simple Example of Energy
(binary segmentation)   L_p ∈ {s, t}
E(L) = Σ_p D_p(L_p) + Σ_{(p,q)∈N} w_pq · δ(L_p ≠ L_q)
       regional term (t-links)      boundary term (n-links)
w_pq = exp( -(ΔI_pq)^2 / (2σ^2) ),  where ΔI_pq is the intensity difference between neighboring pixels p and q
(Figure: source-sink graph with t-link costs D_p(s), D_p(t) and a cut)
Slide credit: Yuri Boykov
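A sketch of computing these contrast-sensitive n-link weights for the horizontal and vertical neighbors of a grayscale image I (the variable names and the value of sigma are illustrative):

% n-link weights w_pq = exp(-(I_p - I_q)^2 / (2*sigma^2)) (illustrative sketch)
I = double(I);
sigma = 10;
w_right = exp(-(I(:,1:end-1) - I(:,2:end)).^2 / (2*sigma^2));   % links to the right neighbor
w_down  = exp(-(I(1:end-1,:) - I(2:end,:)).^2 / (2*sigma^2));   % links to the lower neighbor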
Adding Regional Properties
(Figure: graph with n-link weights w_pq, t-link costs D_p(s), D_p(t), and a cut)
NOTE: hard constraints are not required, in general.
Regional bias example: suppose we are given "expected" intensities I^s and I^t of object and background:
  D_p(s) = exp( -||I_p - I^s||^2 / (2σ^2) )
  D_p(t) = exp( -||I_p - I^t||^2 / (2σ^2) )
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
Adding Regional Properties
(Figure: graph with n-link weights w_pq, t-link costs D_p(s), D_p(t), and a cut)
  D_p(s) = exp( -||I_p - I^s||^2 / (2σ^2) )
  D_p(t) = exp( -||I_p - I^t||^2 / (2σ^2) )
EM-style optimization: the "expected" intensities I^s and I^t of object and background can be re-estimated
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
Adding Regional Properties
• More generally, the regional bias can be based on any intensity models of object and background
  D_p(L_p) = -log Pr(I_p | L_p)
  given object and background intensity histograms Pr(I | s) and Pr(I | t)
(Figure: the histograms Pr(I_p | s) and Pr(I_p | t) evaluated at the pixel's intensity I_p)
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
How to Set the Potentials? Some Examples
• Color potentials
  e.g. modeled with a Mixture of Gaussians:
  φ(x_i, y_i; θ_φ) = log Σ_k θ_φ(x_i, k) · P(k | x_i) · N(y_i; ȳ_k, Σ_k)
• Edge potentials
  e.g. a "contrast sensitive Potts model":
  ψ(x_i, x_j, g_ij(y); θ_ψ) = θ_ψ^T g_ij(y) · δ(x_i ≠ x_j)
  where g_ij(y) = exp( -||y_i - y_j||^2 / (2 · avg(||y_i - y_j||^2)) )
• The parameters θ_φ and θ_ψ need to be learned, too!
[Shotton & Winn, ECCV'06]
Slide credit: B. Leibe
How Does it Work? The s-t-Mincut Problem
(Figure: example graph with nodes v1 and v2 between Source and Sink; edge costs 2, 9, 5, 4, 2, 1)
Graph (V, E, C)
Vertices V = {v1, v2 ... vn}
Edges E = {(v1, v2) ....}
Costs C = {c(1, 2) ....}
Slide credit: Pushmeet Kohli
The s-t-Mincut Problem
(Figure: the same example graph)
Slide credit: Pushmeet Kohli
What is an st-cut? An st-cut (S, T) divides the nodes between source and sink.
What is the cost of an st-cut? The sum of the costs of all edges going from S to T; here 5 + 2 + 9 = 16.
The s-t-Mincut Problem
(Figure: the same example graph with the minimum-cost cut marked)
Slide credit: Pushmeet Kohli
What is the st-mincut? The st-cut with the minimum cost; here 2 + 1 + 4 = 7.
History of Maxflow Algorithms
Augmenting Path and Push-Relabel
n: #nodes
m: #edges
U: maximum
edge weight
Algorithms
assume non-
negative edge
weights
Slide credit: Andrew Goldberg
How to Compute the s-t-Mincut?
(Figure: the same example graph)
Solve the dual maximum flow problem
In every network, the maximum flow
equals the cost of the st-mincut
Min-cut/Max-flow Theorem
Compute the maximum flow
between Source and Sink
Constraints
Edges: Flow < Capacity
Nodes: Flow in = Flow out
Slide credit: Pushmeet Kohli
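The small example above can be checked with MATLAB's built-in maxflow on a digraph; the node numbering and edge list below are reconstructed from the figure, so treat them as an assumption:

% s-t mincut on the 4-node example via max-flow (illustrative sketch)
% nodes: 1 = source, 2 = v1, 3 = v2, 4 = sink
s = [1 1 2 3 2 3];
t = [2 3 4 4 3 2];
c = [2 9 5 4 2 1];                           % edge capacities as read off the figure
G = digraph(s, t, c);
[mf, ~, cs, ct] = maxflow(G, 1, 4);          % mf = 7, equal to the min-cut cost 2 + 1 + 4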
Maxflow Algorithms
(Figure: the example graph with capacities 2, 9, 5, 4, 2, 1; initially Flow = 0)
Augmenting Path Based Algorithms
1. Find a path from source to sink with positive capacity
2. Push the maximum possible flow through this path
3. Repeat until no path can be found
Algorithms assume non-negative capacity
Slide credit: Pushmeet Kohli
On the example graph, the procedure augments along three paths: source → v1 → sink (bottleneck 2, flow 0 → 2), source → v2 → sink (bottleneck 4, flow 2 → 6), and source → v2 → v1 → sink (bottleneck 1, flow 6 → 7). After that, no path with positive residual capacity remains, so the maximum flow is 7, which equals the cost of the st-mincut (2 + 1 + 4).
Slide credit: Pushmeet Kohli
Maxflow in Computer Vision
• Specialized algorithms for vision
problems
Grid graphs
Low connectivity (m ~ O(n))
• Dual search tree augmenting path
algorithm
[Boykov and Kolmogorov PAMI 2004]
Finds approximate shortest augmenting
paths efficiently
High worst-case time complexity
Empirically outperforms other
algorithms on vision problems
Efficient code available on the web
http://www.adastral.ucl.ac.uk/~vladkolm/software.html
Slide credit: Pushmeet Kohli
When Can s-t Graph Cuts Be Applied?
• s-t graph cuts can only globally minimize binary energies
that are submodular.
• Non-submodular cases can still be addressed with some
optimality guarantees.
Current research topic
E(L) = Σ_p E_p(L_p) + Σ_{(p,q)∈N} E(L_p, L_q),   L_p ∈ {s, t}
        regional term (t-links)     boundary term (n-links)
E(L) can be minimized by s-t graph cuts if each pairwise term satisfies the submodularity ("convexity") condition
  E(s, s) + E(t, t) ≤ E(s, t) + E(t, s)
[Boros & Hammer, 2002; Kolmogorov & Zabih, 2004]
Slide credit: B. Leibe
Dealing with Non-Binary Cases
• For image segmentation, the limitation to binary
energies is a nuisance.
Binary segmentation only
• We would like to solve also multi-label problems.
NP-hard problem with 3 or more labels
• There exist some approximation algorithms which
extend graph cuts to the multi-label case
α-Expansion
αβ-Swap
• They are no longer guaranteed to return the globally optimal result.
But α-Expansion has a guaranteed approximation quality (2-approximation) and converges in a few iterations.
Slide credit: B. Leibe
α-Expansion Move
• Basic idea:
Break the multi-way cut computation into a sequence of binary s-t cuts.
(Figure: the label α expanding against all other labels)
Slide credit: Yuri Boykov
α-Expansion Algorithm
1. Start with any initial solution
2. For each label "α" in any order:
   a. Compute the optimal α-expansion move (an s-t graph cut): each node either switches to α or keeps its current label (so s = α and t = the current label)
   b. Decline the move if there is no energy decrease
3. Repeat step 2 until no expansion move would decrease the energy
• Why does this work well? Each move is optimal within a very large set of possible segmentations (2^N), as sketched below.
Slide adapted from Yuri Boykov
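The outer loop might look like the sketch below; expansion_move and mrf_energy are hypothetical helpers (a real implementation would build the binary s-t graph for each move and call a max-flow solver), so this is a pseudocode-level sketch rather than runnable code:

% Alpha-expansion outer loop (illustrative sketch with hypothetical helpers)
L = initL;                                   % any initial labeling
improved = true;
while improved
    improved = false;
    for alpha = labelset                     % visit each label in any order
        Lnew = expansion_move(L, alpha, unary, pairwise);   % optimal binary s-t cut move
        if mrf_energy(Lnew, unary, pairwise) < mrf_energy(L, unary, pairwise)
            L = Lnew;  improved = true;      % accept only energy-decreasing moves
        end
    end
end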
α-Expansion Moves
• In each α-expansion a given label "α" grabs space from other labels
(Figure: an initial solution followed by a sequence of expansion moves for different labels)
For each move we choose the expansion that gives the largest decrease in the energy: a binary optimization problem
Slide credit: Yuri Boykov
GraphCut Applications: “GrabCut”
• Interactive Image Segmentation [Boykov & Jolly, ICCV'01]
  Rough region cues are sufficient (user segmentation cues)
  The segmentation boundary can be extracted from edges
• Procedure
  The user marks foreground and background regions with a brush
  Get an initial segmentation, then correct it with additional brush strokes (additional segmentation cues)
Slide adapted from Matthieu Bray
GrabCut: Data Model
• Obtained from interactive user input
  The user marks foreground and background regions with a brush
  Alternatively, the user can specify a bounding box
(Figure: foreground and background color models and the global optimum of the unary energy alone)
Slide adapted from Carsten Rother
GrabCut: Coherence Model
• An object is a coherent set of pixels:
  ψ(x, y) = Σ_{(m,n)∈C} δ(x_m ≠ x_n) · exp( -β ||y_m - y_n||^2 )
• How to choose the weight of this coherence term?
(Figure: error (%) over the training set as a function of the weight)
Slide credit: Carsten Rother
Iterated Graph Cuts
(Figure: foreground and background color models (Mixtures of Gaussians in R-G space), the segmentation result, and the energy decreasing after each of iterations 1-4)
Slide credit: Carsten Rother
GrabCut: Example Results
Improving Efficiency of Segmentation
• Problem: Images contain many pixels
Even with efficient graph cuts, an MRF
formulation has too many nodes for
interactive results.
• Efficiency trick: Superpixels
Group together similar-looking
pixels for efficiency of further
processing.
Cheap, local oversegmentation
Important to ensure that superpixels
do not cross boundaries
• Several different approaches possible
Superpixel code available here
http://www.cs.sfu.ca/~mori/research/superpixels/
Image source: Greg Mori. Slide credit: B. Leibe
Superpixels for Pre-Segmentation
(Figure: graph structure built on superpixels and the resulting speedup)
Slide credit: B. Leibe
Summary: Graph Cuts Segmentation
• Pros
Powerful technique, based on probabilistic model (MRF).
Applicable for a wide range of problems.
Very efficient algorithms available for vision problems.
Becoming a de-facto standard for many segmentation tasks.
• Cons/Issues
Graph cuts can only solve a limited class of models
– Submodular energy functions
– Can capture only part of the expressiveness of MRFs
Only approximate algorithms available for multi-label case
Slide credit: B. Leibe