Perceptual and Sensory Augmented Computing
Computer Vision WS 08/09
Image Segmentation
Vittorio Ferrari
ETH Zurich
Slide credits:
K. Grauman, B. Leibe, S. Lazebnik, S. Seitz,
Y. Boykov, W. Freeman, P. Kohli
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Examples of Grouping in Vision
Determining image regions
Grouping video frames into shots
Object-level grouping
Figure-ground
Slide credit: Kristen Grauman
What things should be grouped?
What cues indicate groups?
Similarity in appearance
http://chicagoist.com/attachments/chicagoist_alicia/GEESE.jpg, http://wwwdelivery.superstock.com/WI/223/1532/PreviewComp/SuperStock_1532R-0831.jpg
Slide adapted from Kristen Grauman
Symmetry
http://seedmagazine.com/news/2006/10/beauty_is_in_the_processingtim.php
Slide credit: Kristen Grauman
Common Fate
Image credit: Arthus-Bertrand (via F. Durand)
Slide credit: Kristen Grauman
Proximity
http://www.capital.edu/Resources/Images/outside6_035.jpg
Slide credit: Kristen Grauman
The Gestalt School
• Grouping is key to visual perception
• Elements in a collection can have properties that result
from relationships
“The whole is greater than the sum of its parts”
Illusory/subjective contours
Occlusion
Familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology
Slide credit: Svetlana Lazebnik Image source: Steve Lehar
Gestalt Theory
• Gestalt: whole or group
Whole is greater than sum of its parts
Relationships among parts can yield new properties/features
• Psychologists identified a series of factors that predispose
a set of elements to be grouped (by the human visual system)
Untersuchungen zur Lehre von der Gestalt,
Psychologische Forschung, Vol. 4, pp. 301-350, 1923
http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm
“I stand at the window and see a house, trees, sky.
Theoretically I might say there were 327 brightnesses
and nuances of colour. Do I have "327"? No. I have sky,
house, and trees.”
Max Wertheimer (1880-1943)
Slide credit: B. Leibe
Gestalt Factors
These factors make intuitive sense, but are very difficult to translate
into algorithms.
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Continuity through Occlusion Cues
Slide credit: B. Leibe
Continuity through Occlusion Cues
Continuity, explanation by occlusion
Slide credit: B. Leibe
Continuity through Occlusion Cues
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Continuity through Occlusion Cues
Image source: Forsyth & Ponce. Slide credit: B. Leibe
Figure-Ground Discrimination
Slide credit: B. Leibe
The Ultimate Gestalt test
Slide adapted from B. Leibe
Image Segmentation
• Goal: identify groups of pixels that go together
Slide credit: Steve Seitz, Kristen Grauman
The Goals of Segmentation
• Separate image into objects
(Figure: image and its human segmentations)
Slide credit: Svetlana Lazebnik
The Goals of Segmentation
• Separate image into objects
• Group together similar-looking pixels for efficiency of
further processing
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
“superpixels”
Slide credit: Svetlana Lazebnik
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Image Segmentation: Toy Example
• These intensities define the three groups.
• We could label every pixel in the image according to
which of these it is.
i.e. segment the image based on the intensity feature.
• What if the image isn't quite so simple?
(Figure: input image and its intensity histogram, with three peaks for black, gray, and white pixels, labeled 1, 2, 3)
Slide credit: Kristen Grauman
(Figure: input images and their intensity histograms; axes: intensity vs. pixel count)
Slide credit: Kristen Grauman
• Now how to determine the three main intensities that define our groups?
• We need to cluster.
(Figure: input image and its intensity histogram)
Slide credit: Kristen Grauman
• Goal: choose three “centers” as the representative
intensities, and label every pixel according to which of
these centers it is nearest to.
• Best cluster centers are those that minimize the SSD between all points and their nearest cluster center c_i:
  E = Σ_{clusters i} Σ_{x in cluster i} ||x - c_i||^2
Slide credit: Kristen Grauman
(Figure: intensity histogram with the three cluster centers near 0, 190, and 255, labeled 1, 2, 3)
Clustering
• With this objective, it is a “chicken and egg” problem:
If we knew the cluster centers, we could allocate points to
groups by assigning each to its closest center.
If we knew the group memberships, we could get the centers by
computing the mean per group.
Slide credit: Kristen Grauman
K-Means Clustering
• Basic idea: randomly initialize the k cluster centers, and
iterate between the two steps we just saw.
1. Randomly initialize the cluster centers, c1, ..., cK
2. Given cluster centers, determine points in each cluster
– For each point p, find the closest ci. Put p into cluster i
3. Given points in each cluster, solve for ci
– Set ci to be the mean of points in cluster i
4. If ci have changed, repeat Step 2
• Properties
Will always converge to some solution
Can be a "local minimum"
– Does not always find the global minimum of the objective function
Slide credit: Steve Seitz
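As a concrete illustration, here is a minimal MATLAB sketch of this loop on 1-D intensities (the image variable im, K = 3, and the fixed iteration count are assumptions for illustration; the implicit expansion used below needs MATLAB R2016b or later):

% K-means from scratch on pixel intensities (illustrative sketch)
X = double(im(:));                           % N x 1 intensity feature per pixel
K = 3;
centers = X(randperm(numel(X), K));          % step 1: random initialization
for it = 1:50
    [~, labels] = min(abs(X - centers'), [], 2);   % step 2: assign each point to its closest center
    for k = 1:K                                    % step 3: recompute each center as the cluster mean
        if any(labels == k)
            centers(k) = mean(X(labels == k));
        end
    end
end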
Segmentation as Clustering
(Results shown for K = 2 and K = 3)
img_as_col = double(im(:));                  % one intensity value per pixel
cluster_membs = kmeans(img_as_col, K);       % K-means on the 1-D intensity feature
labelim = zeros(size(im));
for i = 1:K
    inds = find(cluster_membs == i);
    meanval = mean(img_as_col(inds));        % recolor each cluster by its mean intensity
    labelim(inds) = meanval;
end
Slide credit: Kristen Grauman
K-Means Clustering
• Java demo:
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
Feature Space
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based on
intensity similarity
• Feature space: intensity value (1D)
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based
on color similarity
• Feature space: color value (3D)
(Figure: pixels plotted in R-G-B color space; example values such as R=255, G=200, B=250 and R=3, G=12, B=2)
Slide credit: Kristen Grauman
Segmentation as Clustering
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based
on texture similarity
• Feature space: filter bank responses (e.g. 24D)
(Figure: filter bank of 24 filters F1, F2, …, F24; each pixel is described by its 24 filter responses)
Slide credit: Kristen Grauman
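As a rough sketch of such a feature space, the snippet below uses only three filters in place of the 24-filter bank; I is assumed to be a grayscale double image, and fspecial/imfilter/kmeans come from the Image Processing and Statistics Toolboxes:

% Per-pixel texture features from a tiny filter bank (illustrative sketch)
g = fspecial('gaussian', 13, 2);             % smoothing filter
[gx, gy] = gradient(g);                      % two oriented derivative-of-Gaussian filters
F = cat(3, imfilter(I, g), imfilter(I, gx), imfilter(I, gy));
feats = reshape(F, [], size(F, 3));          % one row of filter responses per pixel
labels = kmeans(feats, 5);                   % cluster pixels by texture (K = 5)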
Spatial coherence
• Assigning a cluster label per pixel can produce spatial discontinuities
• How can we ensure they
are spatially smooth?
(Figure: original image vs. image labeled by each cluster center's intensity; a pixel near a region boundary could plausibly go to cluster 1, 2, or 3)
Slide adapted from Kristen Grauman
Spatial coherence
• Depending on what we choose as the feature space, we
can group pixels in different ways.
• Grouping pixels based on
intensity+position similarity
Way to encode both similarity and proximity.
Slide adapted from Kristen Grauman
(Figure: pixels plotted in the (X, Y, intensity) feature space)
K-Means without spatial information
• K-means clustering based on intensity or color is
essentially vector quantization of the image attributes
Clusters don't have to be spatially coherent
(Figure: image, intensity-based clusters, color-based clusters)
Slide adapted from Svetlana Lazebnik Image source: Forsyth & Ponce
K-Means with spatial information
• K-means clustering based on intensity or color is
essentially vector quantization of the image attributes
Clusters don't have to be spatially coherent
• Clustering based on (r,g,b,x,y) values enforces more
spatial coherence
Slide adapted from Svetlana Lazebnik Image source: Forsyth & Ponce
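A sketch of this idea, assuming im is an H x W x 3 double RGB image; the weight lambda is an illustrative choice (not from the slides) that trades off spatial proximity against color similarity:

% K-means on joint color + position features (illustrative sketch)
[H, W, ~] = size(im);
[X, Y] = meshgrid(1:W, 1:H);
lambda = 0.5;                                % relative weight of position vs. color
feats = [reshape(double(im), H*W, 3), lambda*X(:), lambda*Y(:)];
labelim = reshape(kmeans(feats, 5), H, W);   % K = 5 spatially more coherent clusters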
Summary K-Means
• Pros
Simple, fast to compute
Converges to local minimum
of within-cluster squared error
• Cons/issues
Setting k?
Sensitive to initial centers
Sensitive to outliers
Detects spherical clusters only
Assuming means can be
computed
Slide credit: Kristen Grauman
Probabilistic Clustering
• Basic questions
What's the probability that a point x is in cluster m?
What's the shape of each cluster?
• K-means doesn't answer these questions.
• Basic idea
Instead of treating the data as a bunch of points, assume that
they are all generated by sampling a continuous function.
This function is called a generative model.
Defined by a vector of parameters θ
Slide credit: Steve Seitz
Mixture of Gaussians
• One generative model is a mixture of Gaussians (MoG)
K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
– Blob b is defined by:  P(x | θ_b) = (2π)^(-d/2) |V_b|^(-1/2) exp( -1/2 (x - μ_b)^T V_b^(-1) (x - μ_b) )
Blob b is selected with probability α_b
The likelihood of observing x is a weighted mixture of Gaussians:
  P(x | θ) = Σ_b α_b P(x | θ_b),  with θ = (μ_1, V_1, α_1, …, μ_K, V_K, α_K)
Slide adapted from Steve Seitz
Expectation Maximization (EM)
• Goal
Find blob parameters θ that maximize the likelihood function
over all N data points
• Approach:
1. E-step: given current guess of blobs, compute probabilistic ownership
of each point
2. M-step: given ownership probabilities, update blobs to maximize
likelihood function
3. Repeat until convergence
Slide adapted from Steve Seitz
EM Details
• E-step
Compute the probability that point x_n is in blob b, given the current guess of θ:
  P(b | x_n, θ) = α_b P(x_n | θ_b) / Σ_j α_j P(x_n | θ_j)
• M-step (over the N data points)
Overall probability that blob b is selected:  α_b = (1/N) Σ_n P(b | x_n, θ)
Mean of blob b:  μ_b = Σ_n P(b | x_n, θ) x_n / Σ_n P(b | x_n, θ)
Covariance of blob b:  V_b = Σ_n P(b | x_n, θ) (x_n - μ_b)(x_n - μ_b)^T / Σ_n P(b | x_n, θ)
Slide adapted from Steve Seitz
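A compact MATLAB sketch of these E/M updates for a 1-D intensity mixture (initialization, iteration count, and variable names are illustrative assumptions, not part of the original slides):

% EM for a 1-D mixture of K Gaussians (illustrative sketch)
x = double(im(:));  N = numel(x);  K = 3;
mu = x(randperm(N, K))';                     % 1 x K means
v  = var(x) * ones(1, K);                    % 1 x K variances
a  = ones(1, K) / K;                         % mixing weights alpha_b
for it = 1:100
    % E-step: ownership r(n,b) = P(b | x_n, theta)
    lik = exp(-(x - mu).^2 ./ (2*v)) ./ sqrt(2*pi*v) .* a;
    r   = lik ./ sum(lik, 2);
    % M-step: re-estimate weights, means, variances from the ownerships
    Nb = sum(r, 1);
    a  = Nb / N;
    mu = sum(r .* x, 1) ./ Nb;
    v  = max(sum(r .* (x - mu).^2, 1) ./ Nb, eps);   % guard against collapsing variances
end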
Applications of EM
• Turns out this is useful for all sorts of problems
Any clustering problem
Any model estimation problem
Missing data problems
Finding outliers
Segmentation problems
– Segmentation based on color
– Segmentation based on motion
– Foreground/background separation
...
• EM demo
http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
Slide credit: Steve Seitz
Segmentation with EM
Image source: Serge Belongie
(Figure: original image and EM segmentation results for k = 2, 3, 4, 5)
Slide credit: B. Leibe
Summary: Mixtures of Gaussians, EM
• Pros
Probabilistic interpretation
Soft assignments between data points and clusters
Generative model, can predict novel data points
Relatively compact storage
• Cons
Initialization
– often a good idea to start from output of k-means
Local minima
Need to know number of components K
– solutions: model selection (AIC, BIC), Dirichlet process mixture
Need to choose the generative model (mathematical form of a cluster?)
Numerical problems are often a nuisance
Slide adapted from B. Leibe
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Finding Modes in a Histogram
• How many modes are there?
Mode = local maximum of a given distribution
Easy to see, hard to compute
Slide adapted from Steve Seitz
Mean-Shift Segmentation
• An advanced and versatile technique for clustering-
based segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis,
PAMI 2002.
Slide credit: Svetlana Lazebnik
Mean-Shift Algorithm
• Iterative Mode Search
1. Initialize random seed center and window W
2. Calculate the center of gravity (the "mean") of W, here shown for a uniform window:  mean(W) = (1/|W|) Σ_{x in W} x
3. Shift the search window to the mean
4. Repeat steps 2+3 until convergence
Slide adapted from Steve Seitz
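A minimal sketch of this iteration with a flat (uniform) window of radius h; the data matrix X, the starting center m, and the convergence tolerance are assumptions for illustration:

% Mean-shift mode seeking with a flat window (illustrative sketch)
function m = meanshift_mode(X, m, h)
% X: N x d data points, m: 1 x d initial window center, h: window radius
for it = 1:200
    inwin = sum((X - m).^2, 2) <= h^2;       % points inside the window W
    newm  = mean(X(inwin, :), 1);            % center of gravity (the "mean") of W
    if norm(newm - m) < 1e-6, break; end     % converged on a mode
    m = newm;                                % shift the window to the mean
end
end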
Mean-Shift
(Figure sequence: a window with its region of interest, center of mass, and mean-shift vector; at each iteration the window shifts to its center of mass, moving toward the mode until it converges)
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis
Tessellate the space with windows and run the procedure in parallel.
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis (cont.)
The blue data points were traversed by the windows towards the mode.
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering
• Cluster: all data points in the attraction basin of a mode
• Attraction basin: the region for which all trajectories
lead to the same mode
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering/Segmentation
• Choose features (color, gradients, texture, etc)
• Initialize windows at individual pixel locations
• Start mean-shift from each window until convergence
• Merge windows that end up near the same “peak” or
mode
Slide adapted from Svetlana Lazebnik
Mean-Shift Segmentation Results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Slide credit: Svetlana Lazebnik
More Results
Slide credit: Svetlana Lazebnik
Summary Mean-Shift
• Pros
General, application-independent tool
Model-free, does not assume any prior shape (spherical,
elliptical, etc.) on data clusters
Just a single parameter (window size h)
– h has a physical meaning (unlike k in k-means): the scale of the clustering
Finds variable number of modes given the same h
Robust to outliers
• Cons
Output depends on window size h
Window size (bandwidth) selection is not trivial
Computationally rather expensive
Does not scale well with dimension of feature space
Slide adapted from Svetlana Lazebnik
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Images as Graphs
• Fully-connected graph
Node (vertex) for every pixel
Edge between every pair of pixels (p,q)
Affinity weight wpq for each edge
– wpq measures similarity
– Similarity is inversely proportional to difference
(in color, texture, position, …)
(Figure: pixels p and q connected by an edge with affinity weight w_pq)
Slide adapted from Steve Seitz
Segmentation by Graph Cuts
• Break Graph into Segments
Delete edges crossing between segments
Easiest to break edges with low similarity (low weight)
– Similar pixels should be in the same segments
– Dissimilar pixels should be in different segments
(Figure: the graph broken into segments A, B, and C by deleting low-weight edges)
Slide adapted from Steve Seitz
Measuring Affinity
• Distance:  aff(x, y) = exp( -(1/(2σ_d^2)) ||x - y||^2 )
• Intensity:  aff(x, y) = exp( -(1/(2σ_I^2)) ||I(x) - I(y)||^2 )
• Color:  aff(x, y) = exp( -(1/(2σ_c^2)) dist(c(x), c(y))^2 )   (some suitable color space distance)
• Texture:  aff(x, y) = exp( -(1/(2σ_t^2)) ||f(x) - f(y)||^2 )   (vectors of filter outputs)
Source: Forsyth & Ponce
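A sketch of building such an affinity matrix for N pixels described by feature vectors; the dense N x N matrix is only practical for small problems, pdist2 is from the Statistics Toolbox, and the names f and sigma are illustrative:

% Dense affinity matrix from per-pixel features (illustrative sketch)
% f: N x d matrix, one feature vector (intensity, color, position, ...) per pixel
sigma = 0.1;
W = exp(-pdist2(f, f).^2 / (2*sigma^2));     % aff(x, y) = exp(-||f(x) - f(y)||^2 / (2*sigma^2))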
Scale Affects Affinity
• Small σ: group only nearby points
• Large σ: group far-away points
Slide adapted from Svetlana Lazebnik
(Figure: the same data points grouped with small, medium, and large σ)
Image Source: Forsyth & Ponce
Graph Cut (GC)
• GC = edges whose removal partitions a graph in two
• Cost of a cut
Sum of weights of cut edges:
• A graph cut gives us a segmentation
What is a “good” graph cut and how do we find one?
Slide adapted from Steve Seitz
cut(A, B) = Σ_{p∈A, q∈B} w_{p,q}
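Given such an affinity matrix W, the cost of a cut is a one-liner (sketch; inA is an assumed logical membership vector for segment A):

% Cost of the cut (A, B): total affinity crossing the partition (sketch)
cutAB = sum(sum(W(inA, ~inA)));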
Graph Cut
Image Source: Forsyth & Ponce
Here, the cut is nicely
defined by the block-diagonal
structure of the affinity matrix.
How can this be generalized?
Slide credit: B. Leibe
Minimum Cut
• We can do segmentation by finding the minimum cut in
a graph
Efficient algorithms exist for doing this
• Drawback:
Weight of cut proportional to number of edges in the cut
Minimum cut tends to cut off very small, isolated components
(Figure: the ideal cut vs. cuts with lesser weight that cut off small isolated components)
Slide credit: Khurram Hassan-Shafique
Normalized Cut (NCut)
• Min-cut has bias toward partitioning out small segments
• This can be fixed by normalizing for size of segments
• The normalized cut cost is:
• The exact solution is NP-hard but an approximation can
be computed by solving a generalized eigenvalue
problem.
NCut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
where assoc(A, V) = sum of the weights from nodes in A to all nodes in the graph
J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000
Slide adapted from Svetlana Lazebnik
Interpretation as a Dynamical System
• Treat the edges as springs and 'shake' the system
Elasticity proportional to cost
Vibration “modes” correspond to segments
– Can compute these by solving a generalized eigenvector problem
Slide adapted from Steve Seitz
NCuts as a Generalized Eigenvector Problem
• Definitions
  W: the affinity matrix, W(i, j) = w_{i,j}
  D: the diagonal matrix, D(i, i) = Σ_j W(i, j)
  x: a vector in {1, -1}^N, with x(i) = 1 iff i ∈ A
• Rewriting Normalized Cut in matrix form:
  NCut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
             = (1 + x)^T (D - W) (1 + x) / (k · 1^T D 1) + (1 - x)^T (D - W) (1 - x) / ((1 - k) · 1^T D 1) = …
  with k = Σ_{x_i > 0} D(i, i) / Σ_i D(i, i)
Slide credit: Jitendra Malik
Some More Math…
Slide credit: Jitendra Malik
NCuts as a Generalized Eigenvalue Problem
• After simplification, we get
  NCut(A, B) = y^T (D - W) y / (y^T D y),  with y_i ∈ {1, -b} and y^T D 1 = 0
  This is hard, as y is discrete! Relaxation: let y be continuous.
• This is a Rayleigh Quotient
  Solution given by the "generalized" eigenvalue problem  (D - W) y = λ D y
  Solved by converting to a standard eigenvalue problem
• Subtleties
  The smallest eigenvector is y = 1, with eigenvalue 0
  The optimal solution is the second smallest eigenvector
  Gives a continuous result; it must be converted into discrete values of y
Slide adapted from Alyosha Efros
NCuts Example
Smallest eigenvectors
Image source: Shi & Malik
NCuts segments
Slide credit: B. Leibe
NCuts: Overall Procedure
1. Construct a weighted graph G=(V,E) from an image.
2. Connect each pair of pixels, and assign graph edge weights
   W(i, j) = Prob. that i and j belong to the same region.
3. Solve (D - W) y = λ D y for the smallest few eigenvectors. This yields a continuous solution.
4. Threshold the eigenvectors to get a discrete cut.
   This is where the approximation is made (we're not solving the NP-hard problem).
5. Recursively subdivide if the NCut value is below a pre-specified value.
Slide credit: Jitendra Malik
NCuts Matlab code available at
http://www.cis.upenn.edu/~jshi/software/
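A small sketch of steps 3-4 on an affinity matrix W; for clarity it uses the dense generalized eigensolver eig (which only scales to small graphs), and thresholding at 0 is just one simple discretization:

% Relaxed normalized cut: solve (D - W) y = lambda * D y (illustrative sketch)
d = sum(W, 2);
D = diag(d);
[V, E] = eig(D - W, D);                      % generalized eigenvectors (dense; small N only)
[~, order] = sort(diag(E), 'ascend');
y = V(:, order(2));                          % second-smallest eigenvector
inA = y > 0;                                 % threshold the continuous solution into a cut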
Color Image Segmentation with NCuts
Image Source: Shi & Malik. Slide credit: Steve Seitz
Results with Color & Texture
Summary: Normalized Cuts
• Pros:
Generic framework, flexible to choice of function that computes
weights (“affinities”) between nodes
Does not require any model of the data distribution
• Cons:
Time and memory complexity can be high
– Dense, highly connected graphs: many affinity computations
– Solving eigenvalue problem
Preference for balanced partitions
– If a region is uniform, NCuts will find the
modes of vibration of the image dimensions
Slide credit: Kristen Grauman
Segmentation: Caveats
• We've looked at bottom-up ways to segment an image
into regions, yet finding meaningful segments is
intertwined with the recognition problem.
• Often want to avoid making hard decisions too soon
• Difficult to evaluate; when is a segmentation successful?
Slide credit: Kristen Grauman
Topics of This Lecture
• Introduction
Gestalt principles
Image segmentation
• Segmentation as clustering
k-Means
Feature spaces
Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Markov Random Fields
• Allow rich probabilistic models for images
• But built in a local, modular way
Learn local effects, get global effects out
Slide credit: William Freeman
Observed evidence
Hidden “true states”
Neighborhood relations
MRF Nodes as Pixels (or Patches)
(Figure: observed image pixels y_i connected to hidden states x_i, e.g. foreground/background)
φ(x_i, y_i): image-state compatibility;  ψ(x_i, x_j): state-state compatibility between neighboring nodes
Slide adapted from William Freeman
Network Joint Probability
P(x, y) = Π_i φ(x_i, y_i) · Π_{(i,j)} ψ(x_i, x_j)
x: states,  y: image
φ: image-state compatibility function (local observations)
ψ: state-state compatibility function (neighboring nodes)
Slide adapted from William Freeman
Energy Formulation
• Joint probability
  P(x, y) = Π_i φ(x_i, y_i) · Π_{(i,j)} ψ(x_i, x_j)
• Maximizing the joint probability is the same as minimizing its negative log
  -log P(x, y) = -Σ_i log φ(x_i, y_i) - Σ_{(i,j)} log ψ(x_i, x_j) = E(x, y)
  E(x, y) = Σ_i φ(x_i, y_i) + Σ_{(i,j)} ψ(x_i, x_j),  with φ, ψ now denoting the negative log-potentials
• This is similar to free-energy problems in statistical mechanics (spin glass theory). We therefore draw the analogy and call E an energy function.
• φ and ψ are called potentials.
Slide credit: B. Leibe
Energy Formulation
• Energy function
  E(x, y) = Σ_i φ(x_i, y_i) + Σ_{(i,j)} ψ(x_i, x_j)
            (unary potentials)    (pairwise potentials)
• Unary potentials φ(x_i, y_i)
  Encode local information about the given pixel/patch
  How likely is a pixel/patch to be in a certain state (e.g. foreground/background)?
• Pairwise potentials ψ(x_i, x_j)
  Encode neighborhood information
  How different is a pixel/patch's label from that of its neighbor?
  (e.g. here independent of image data, but later based on intensity/color/texture difference)
Slide adapted from B. Leibe
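A sketch of evaluating this energy for a binary labeling on a 4-connected grid; phi (an H x W x 2 array of unary potentials), x (an H x W labeling with values 1 or 2), and the constant Potts cost lambda are all assumptions for illustration:

% MRF energy: sum of unary terms plus a Potts pairwise term (illustrative sketch)
[H, W, ~] = size(phi);
[rr, cc] = ndgrid(1:H, 1:W);
Eunary = sum(phi(sub2ind([H W 2], rr(:), cc(:), x(:))));
lambda = 1;                                            % constant penalty for differing neighbors
Epair  = lambda * ( nnz(x(1:end-1,:) ~= x(2:end,:)) ...   % vertical neighbor pairs
                  + nnz(x(:,1:end-1) ~= x(:,2:end)) );    % horizontal neighbor pairs
E = Eunary + Epair;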
Energy Minimization
• Goal:
Infer the optimal labeling of the MRF.
• Many inference algorithms are available, e.g. Gibbs sampling, simulated annealing
Iterated conditional modes (ICM)
Variational methods
Belief propagation
Graph cuts
• Recently, graph cuts have become a popular tool
  Only suitable for a certain class of energy functions
  But the solution can be obtained very fast for typical vision problems (~1 MPixel/sec).
Slide credit: B. Leibe
Graph Cuts for Optimal Boundary Detection
• Idea: convert MRF into source-sink graph
(Figure: source-sink graph with terminals s and t, n-links between neighboring pixels, hard-constraint t-links, and a cut)
Minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms)
[Boykov & Jolly, ICCV'01]  Slide adapted from Yuri Boykov
Simple Example of Energy
(binary segmentation)   L_p ∈ {s, t}
E(L) = Σ_p D_p(L_p) + Σ_{(p,q)∈N} w_pq · δ(L_p ≠ L_q)
       regional term (t-links)      boundary term (n-links)
w_pq = exp( -(ΔI_pq)^2 / (2σ^2) ),  where ΔI_pq is the intensity difference between neighboring pixels p and q
(Figure: source-sink graph with t-link costs D_p(s), D_p(t) and a cut)
Slide credit: Yuri Boykov
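A sketch of computing these contrast-sensitive n-link weights for the horizontal and vertical neighbors of a grayscale image I (the variable names and the value of sigma are illustrative):

% n-link weights w_pq = exp(-(I_p - I_q)^2 / (2*sigma^2)) (illustrative sketch)
I = double(I);
sigma = 10;
w_right = exp(-(I(:,1:end-1) - I(:,2:end)).^2 / (2*sigma^2));   % links to the right neighbor
w_down  = exp(-(I(1:end-1,:) - I(2:end,:)).^2 / (2*sigma^2));   % links to the lower neighbor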
Adding Regional Properties
(Figure: graph with n-link weights w_pq, t-link costs D_p(s), D_p(t), and a cut)
NOTE: hard constraints are not required, in general.
Regional bias example: suppose we are given "expected" intensities I^s and I^t of object and background:
  D_p(s) = exp( -||I_p - I^s||^2 / (2σ^2) )
  D_p(t) = exp( -||I_p - I^t||^2 / (2σ^2) )
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
Adding Regional Properties
(Figure: graph with n-link weights w_pq, t-link costs D_p(s), D_p(t), and a cut)
  D_p(s) = exp( -||I_p - I^s||^2 / (2σ^2) )
  D_p(t) = exp( -||I_p - I^t||^2 / (2σ^2) )
EM-style optimization: the "expected" intensities I^s and I^t of object and background can be re-estimated
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
Adding Regional Properties
• More generally, the regional bias can be based on any intensity models of object and background
  D_p(L_p) = -log Pr(I_p | L_p)
  given object and background intensity histograms Pr(I | s) and Pr(I | t)
(Figure: the histograms Pr(I_p | s) and Pr(I_p | t) evaluated at the pixel's intensity I_p)
[Boykov & Jolly, ICCV'01]  Slide credit: Yuri Boykov
How to Set the Potentials? Some Examples
• Color potentials
  e.g. modeled with a Mixture of Gaussians:
  φ(x_i, y_i; θ_φ) = log Σ_k θ_φ(x_i, k) · P(k | x_i) · N(y_i; ȳ_k, Σ_k)
• Edge potentials
  e.g. a "contrast sensitive Potts model":
  ψ(x_i, x_j, g_ij(y); θ_ψ) = θ_ψ^T g_ij(y) · δ(x_i ≠ x_j)
  where g_ij(y) = exp( -||y_i - y_j||^2 / (2 · avg(||y_i - y_j||^2)) )
• The parameters θ_φ and θ_ψ need to be learned, too!
[Shotton & Winn, ECCV'06]
Slide credit: B. Leibe
How Does it Work? The s-t-Mincut Problem
(Figure: example graph with nodes v1 and v2 between Source and Sink; edge costs 2, 9, 5, 4, 2, 1)
Graph (V, E, C)
Vertices V = {v1, v2 ... vn}
Edges E = {(v1, v2) ....}
Costs C = {c(1, 2) ....}
Slide credit: Pushmeet Kohli
The s-t-Mincut Problem
(Figure: the same example graph)
Slide credit: Pushmeet Kohli
What is an st-cut? An st-cut (S, T) divides the nodes between source and sink.
What is the cost of an st-cut? The sum of the costs of all edges going from S to T; here 5 + 2 + 9 = 16.
The s-t-Mincut Problem
(Figure: the same example graph with the minimum-cost cut marked)
Slide credit: Pushmeet Kohli
What is the st-mincut? The st-cut with the minimum cost; here 2 + 1 + 4 = 7.
History of Maxflow Algorithms
Augmenting Path and Push-Relabel
n: #nodes
m: #edges
U: maximum
edge weight
Algorithms
assume non-
negative edge
weights
Slide credit: Andrew Goldberg
How to Compute the s-t-Mincut?
(Figure: the same example graph)
Solve the dual maximum flow problem
In every network, the maximum flow
equals the cost of the st-mincut
Min-cut/Max-flow Theorem
Compute the maximum flow
between Source and Sink
Constraints
Edges: Flow < Capacity
Nodes: Flow in = Flow out
Slide credit: Pushmeet Kohli
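The small example above can be checked with MATLAB's built-in maxflow on a digraph; the node numbering and edge list below are reconstructed from the figure, so treat them as an assumption:

% s-t mincut on the 4-node example via max-flow (illustrative sketch)
% nodes: 1 = source, 2 = v1, 3 = v2, 4 = sink
s = [1 1 2 3 2 3];
t = [2 3 4 4 3 2];
c = [2 9 5 4 2 1];                           % edge capacities as read off the figure
G = digraph(s, t, c);
[mf, ~, cs, ct] = maxflow(G, 1, 4);          % mf = 7, equal to the min-cut cost 2 + 1 + 4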
Maxflow Algorithms
(Figure: the example graph with capacities 2, 9, 5, 4, 2, 1; initially Flow = 0)
Augmenting Path Based Algorithms
1. Find a path from source to sink with positive capacity
2. Push the maximum possible flow through this path
3. Repeat until no path can be found
Algorithms assume non-negative capacity
Slide credit: Pushmeet Kohli
On the example graph, the procedure augments along three paths: source → v1 → sink (bottleneck 2, flow 0 → 2), source → v2 → sink (bottleneck 4, flow 2 → 6), and source → v2 → v1 → sink (bottleneck 1, flow 6 → 7). After that, no path with positive residual capacity remains, so the maximum flow is 7, which equals the cost of the st-mincut (2 + 1 + 4).
Slide credit: Pushmeet Kohli
Maxflow in Computer Vision
• Specialized algorithms for vision
problems
Grid graphs
Low connectivity (m ~ O(n))
• Dual search tree augmenting path
algorithm
[Boykov and Kolmogorov PAMI 2004]
Finds approximate shortest augmenting
paths efficiently
High worst-case time complexity
Empirically outperforms other
algorithms on vision problems
Efficient code available on the web
http://www.adastral.ucl.ac.uk/~vladkolm/software.html
Slide credit: Pushmeet Kohli
When Can s-t Graph Cuts Be Applied?
• s-t graph cuts can only globally minimize binary energies
that are submodular.
• Non-submodular cases can still be addressed with some
optimality guarantees.
Current research topic
E(L) = Σ_p E_p(L_p) + Σ_{(p,q)∈N} E(L_p, L_q),   L_p ∈ {s, t}
        regional term (t-links)     boundary term (n-links)
E(L) can be minimized by s-t graph cuts if each pairwise term satisfies the submodularity ("convexity") condition
  E(s, s) + E(t, t) ≤ E(s, t) + E(t, s)
[Boros & Hammer, 2002; Kolmogorov & Zabih, 2004]
Slide credit: B. Leibe
Dealing with Non-Binary Cases
• For image segmentation, the limitation to binary
energies is a nuisance.
Binary segmentation only
• We would like to solve also multi-label problems.
NP-hard problem with 3 or more labels
• There exist some approximation algorithms which
extend graph cuts to the multi-label case
α-Expansion
αβ-Swap
• They are no longer guaranteed to return the globally optimal result.
But α-Expansion has a guaranteed approximation quality (2-approximation) and converges in a few iterations.
Slide credit: B. Leibe
α-Expansion Move
• Basic idea:
Break the multi-way cut computation into a sequence of binary s-t cuts.
(Figure: the label α expanding against all other labels)
Slide credit: Yuri Boykov
α-Expansion Algorithm
1. Start with any initial solution
2. For each label "α" in any order:
   a. Compute the optimal α-expansion move (an s-t graph cut): each node either switches to α or keeps its current label (so s = α and t = the current label)
   b. Decline the move if there is no energy decrease
3. Repeat step 2 until no expansion move would decrease the energy
• Why does this work well? Each move is optimal within a very large set of possible segmentations (2^N), as sketched below.
Slide adapted from Yuri Boykov
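The outer loop might look like the sketch below; expansion_move and mrf_energy are hypothetical helpers (a real implementation would build the binary s-t graph for each move and call a max-flow solver), so this is a pseudocode-level sketch rather than runnable code:

% Alpha-expansion outer loop (illustrative sketch with hypothetical helpers)
L = initL;                                   % any initial labeling
improved = true;
while improved
    improved = false;
    for alpha = labelset                     % visit each label in any order
        Lnew = expansion_move(L, alpha, unary, pairwise);   % optimal binary s-t cut move
        if mrf_energy(Lnew, unary, pairwise) < mrf_energy(L, unary, pairwise)
            L = Lnew;  improved = true;      % accept only energy-decreasing moves
        end
    end
end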
α-Expansion Moves
• In each α-expansion a given label "α" grabs space from other labels
(Figure: an initial solution followed by a sequence of expansion moves for different labels)
For each move we choose the expansion that gives the largest decrease in the energy: a binary optimization problem
Slide credit: Yuri Boykov
GraphCut Applications: “GrabCut”
• Interactive Image Segmentation [Boykov & Jolly, ICCV'01]
  Rough region cues are sufficient (user segmentation cues)
  The segmentation boundary can be extracted from edges
• Procedure
  The user marks foreground and background regions with a brush
  Get an initial segmentation, then correct it with additional brush strokes (additional segmentation cues)
Slide adapted from Matthieu Bray
GrabCut: Data Model
• Obtained from interactive user input
  The user marks foreground and background regions with a brush
  Alternatively, the user can specify a bounding box
(Figure: foreground and background color models and the global optimum of the unary energy alone)
Slide adapted from Carsten Rother
GrabCut: Coherence Model
• An object is a coherent set of pixels:
  ψ(x, y) = Σ_{(m,n)∈C} δ(x_m ≠ x_n) · exp( -β ||y_m - y_n||^2 )
• How to choose the weight of this coherence term?
(Figure: error (%) over the training set as a function of the weight)
Slide credit: Carsten Rother
Iterated Graph Cuts
(Figure: foreground and background color models (Mixtures of Gaussians in R-G space), the segmentation result, and the energy decreasing after each of iterations 1-4)
Slide credit: Carsten Rother
GrabCut: Example Results
Improving Efficiency of Segmentation
• Problem: Images contain many pixels
Even with efficient graph cuts, an MRF
formulation has too many nodes for
interactive results.
• Efficiency trick: Superpixels
Group together similar-looking
pixels for efficiency of further
processing.
Cheap, local oversegmentation
Important to ensure that superpixels
do not cross boundaries
• Several different approaches possible
Superpixel code available here
http://www.cs.sfu.ca/~mori/research/superpixels/
Image source: Greg Mori. Slide credit: B. Leibe
Superpixels for Pre-Segmentation
(Figure: graph structure built on superpixels and the resulting speedup)
Slide credit: B. Leibe
Summary: Graph Cuts Segmentation
• Pros
Powerful technique, based on probabilistic model (MRF).
Applicable for a wide range of problems.
Very efficient algorithms available for vision problems.
Becoming a de-facto standard for many segmentation tasks.
• Cons/Issues
Graph cuts can only solve a limited class of models
– Submodular energy functions
– Can capture only part of the expressiveness of MRFs
Only approximate algorithms available for multi-label case
Slide credit: B. Leibe