Q&Aof GrabCut$ - Artificial Intelligencevision.stanford.edu/teaching/cs231b_spring1213/slides/segmentatio… · Lecture 1 - !!! Philipp Krähenbühl! 5 69May913$ Initialisation •

Lecture 1 - !!!

Philipp Krähenbühl!

Q&A of GrabCut

Philipp Krähenbühl

6-‐May-‐13 1

Lecture 1 - !!!


Goal

6-‐May-‐13 2

•  Bounding Box – provided

Lecture 1 - !!!


Goal

6-‐May-‐13 3

•  SegmentaFon of object within bounding box

Lecture 1 - !!!


Overview

•  Boykov & Jolly segmentaFon model

– SegmentaFon using Graph Cuts •  GMM foreground and background model

–  IteraFve opFmizaFon •  Graph Cuts •  GMM esFmaFon

6-‐May-‐13 4

E(α, z) = D(αn, zn )n∑ +γ wn,m[αn ≠αm ]

n,m∑

D(αn,θ, zn )

Lecture 1 - !!!

Philipp Krähenbühl! 6-‐May-‐13 5

Initialisation• User initialises trimap T by supplying only TB. The fore-ground is set to TF = /0; TU = TB, complement of the back-ground.

• Initialise αn = 0 for n ! TB and αn = 1 for n ! TU .• Background and foreground GMMs initialised from setsαn = 0 and αn = 1 respectively.

Iterative minimisation1. Assign GMM components to pixels: for each n in TU ,

kn := argminkn

Dn(αn,kn,θ ,zn).

2. Learn GMM parameters from data z:θ := argmin

θU(α,k,θ ,z)

3. Estimate segmentation: use min cut to solve:min

{αn: n!TU}minkE(α,k,θ ,z).

4. Repeat from step 1, until convergence.

5. Apply border matting (section 4).

User editing• Edit: fix some pixels either to αn = 0 (background brush)or αn = 1 (foreground brush); update trimap T accord-ingly. Perform step 3 above, just once.

• Refine operation: [optional] perform entire iterative min-imisation algorithm.

Figure 3: Iterative image segmentation in GrabCut

1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N

(a) (b) (c)Figure 4: Convergence of iterative minimization for the data offig. 2f. (a) The energy E for the llama example converges over12 iterations. The GMM in RGB colour space (side-view showingR,G) at initialization (b) and after convergence (c). K = 5 mixturecomponents were used for both background (red) and foreground(blue). Initially (b) both GMMs overlap considerably, but are bet-ter separated after convergence (c), as the foreground/backgroundlabelling has become accurate.

3.3 User Interaction and incomplete trimaps

Incomplete trimaps. The iterative minimisation algorithm al-lows increased versatility of user interaction. In particular, incom-plete labelling becomes feasible where, in place of the full trimapT , the user needs only specify, say, the background region TB, leav-ing TF = 0. No hard foreground labelling is done at all. Iterativeminimisation (fig. 3) deals with this incompleteness by allowingprovisional labels on some pixels (in the foreground) which cansubsequently be retracted; only the background labels TB are takento be firm— guaranteed not to be retracted later. (Of course a com-plementary scheme, with firm labels for the foreground only, is alsoa possibility.) In our implementation, the initial TB is determined bythe user as a strip of pixels around the outside of the marked rect-angle (marked in red in fig. 2f).

Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�

I�n�t�e�r�a�c�t�i�o�n�

Figure 5: User editing. After the initial user interaction and seg-mentation (top row), further user edits (fig. 3) are necessary. Mark-ing roughly with a foreground brush (white) and a backgroundbrush (red) is sufficient to obtain the desired result (bottom row).

Further user editing. The initial, incomplete user-labelling is of-ten sufficient to allow the entire segmentation to be completed au-tomatically, but by no means always. If not, further user editingis needed [Boykov and Jolly 2001], as shown in fig.5. It takes theform of brushing pixels, constraining them either to be firm fore-ground or firm background; then the minimisation step 3. in fig. 3is applied. Note that it is sufficient to brush, roughly, just part of awrongly labeled area. In addition, the optional “refine” operation offig. 3 updates the colour models, following user edits. This prop-agates the effect of edit operations which is frequently beneficial.Note that for efficiency the optimal flow, computed by Graph Cut,can be re-used during user edits.

4 Transparency

Given that a matting tool should be able to produce continuous al-pha values, we now describe a mechanism by which hard segmenta-tion, as described above, can be augmented by “border matting”, inwhich full transparency is allowed in a narrow strip around the hardsegmentation boundary. This is sufficient to deal with the problemof matting in the presence of blur and mixed pixels along smoothobject boundaries. The technical issues are: Estimating an alpha-map for the strip without generating artefacts, and recovering theforeground colour, free of colour bleeding from the background.

4.1 Border Matting

Border matting begins with a closed contourC, obtained by fitting apolyline to the segmentation boundary from the iterative hard seg-mentation of the previous section. A new trimap {TB,TU ,TF} iscomputed, in which TU is the set of pixels in a ribbon of width ±wpixels either side of C (we use w = 6). The goal is to compute themap αn, n ! TU , and in order to do this robustly, a strong modelis assumed for the shape of the α-profile within TU . The form ofthe model is based on [Mortensen and Barrett 1999] but with twoimportant additions: regularisation to enhance the quality of the es-timated α-map; and a dynamic programming (DP) algorithm forestimating α throughout TU .Let t = 1, . . . ,T be a parameterization of contourC, periodic with

period T , as curve C is closed. An index t(n) is assigned to eachpixel n ! TU , as in fig. 6(c). The α-profile is taken to be a soft step-function g (fig. 6c): αn = g

!

rn;Δt(n),σt(n)"

, where rn is a signeddistance from pixel n to contour C. Parameters Δ,σ determine thecentre and width respectively of the transition from 0 to 1 in the

Project 2

Not required

OpFonal

Lecture 1 - !!!


NotaFon

•  Trimaps – Sets of pixels – TB: background – TF: foreground – TU: undecided

•  SegmentaFon – αi for each pixel i

6-‐May-‐13 6




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Lecture 1 - !!!


IniFalizaFon

•  Outside –  Background (fixed) –  αn=0

•  Inside –  IniFal foreground –  αn=1 –  updated

6-‐May-‐13 7




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Lecture 1 - !!!


GMM IniFalizaFon

•  Find parameters θ •  Standard method –  random + EM

•  Re-‐esFmate θ in step 2 –  IniFalizaFon only needed for step 1

– Trick: IniFalize k instead

6-‐May-‐13 8




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Lecture 1 - !!!


GMM IniFalizaFon

•  Trick: IniFalize k – k-‐means clustering – skip step 1 in first iteraFon

6-‐May-‐13 9




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Lecture 1 - !!!


GMM assignment

•  Gaussian log prob.

•  Enumerate all kn •  Each pixel already assigned to FG or BG (αn fixed)

6-‐May-‐13 10




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Dn (αn,kn,θ, zn ) =

− logπ (αn,kn )+12logdetΣ(αn,kn )

+12zn −µ(αn,kn )[ ]T Σ(αn,kn )

−1 zn −µ(αn,kn )[ ]

Lecture 1 - !!!


GMM leaning

•  Mixture param.

•  F(k) set of FG pixels assigned to comp. k

6-‐May-‐13 11




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


π (α =1,k) =F(k)F(k)

k∑

µ(α =1,k) =meann∈F (k )

(zn )

Σ(α =1,k) = covn∈F (k )

(zn )

Lecture 1 - !!!


SegmentaFon

•  Find segmentaFon that minimizes

– Reduces to Boykov & Jolly

– Solved using GraphCut

6-‐May-‐13 12




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


minkE(α,k,θ, z) =

minknD(αn,kn,θ, zn )

D(αn ,θ ,zn ) n

∑ +γ wn,m[αn ≠αm ]n,m∑

Lecture 1 - !!!


Use energy.h and maxflow.cpp // initialization !std::vector<Energy::Var> vars(N); !Energy e; !!// add a node !vars[i] = e.add_variable(); !!// add the unary term for a node !e.add_term1(vars[i], u0, u1); !// add the pairwise term for an edge !e.add_term2(vars[i], vars[j], p00, p01, p10, p11); !!// perform energy minimization !Energy::TotalValue mnE = e.minimize(); !!// get new labels !if (e.get_var(vars[i])) label[i] = 1; !else label[i] = 0; !

6-‐May-‐13 13




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


minkE(α,k,θ, z) =

D(αn,θ, zn )n∑ +γ wn,m[αn ≠αm ]

n,m∑

D(0,θ, zn ) D(1,θ, zn )

Lecture 1 - !!!


Use energy.h and maxflow.cpp // initialization !std::vector<Energy::Var> vars(N); !Energy e; !!// add a node !vars[i] = e.add_variable(); !!// add the unary term for a node !e.add_term1(vars[i], u0, u1); !// add the pairwise term for an edge !e.add_term2(vars[i], vars[j], p00, p01, p10, p11); !!// perform energy minimization !Energy::TotalValue mnE = e.minimize(); !!// get new labels !if (e->get_var(vars[i])) label[i] = 1; !else label[i] = 0; !

6-‐May-‐13 14




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


minkE(α,k,θ, z) =


n,m∑

? ? ? ?

Lecture 1 - !!!


Use energy.h and maxflow.cpp // initialization !std::vector<Energy::Var> vars(N); !Energy e; !!// add a node !vars[i] = e.add_variable(); !!// add the unary term for a node !e.add_term1(vars[i], u0, u1); !// add the pairwise term for an edge !e.add_term2(vars[i], vars[j], 0, vij, vij, 0); !!// perform energy minimization !Energy::TotalValue mnE = e.minimize(); !!// get new labels !if (e->get_var(vars[i])) label[i] = 1; !else label[i] = 0; !

6-‐May-‐13 15




kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


minkE(α,k,θ, z) =


n,m∑

vij = γ exp(−β zi − zj2)

Lecture 1 - !!!





kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


vij = γ exp(−β zi − zj2)

Lecture 1 - !!!





kn := argminkn

Dn(αn,kn,θ ,zn).


θU(α,k,θ ,z)








1 4 8 12

Energy E

RED

GREE

N

RED

GREE

N




Automatic�

Segmentation�

Automatic�

Segmentation�

U�s�e�r�




4 Transparency


4.1 Border Matting



!

rn;Δt(n),σt(n)"


Lecture 1 - !!!


OpFmizaFons

•  Use MEX files for Graph Cut – Call C/C++ code from matlab – complile: “mex a.cpp b.cpp c.cpp …”

•  Vectorize –  f: Nx3 matrix of RGB color values – a: N-‐dimensional binary vector (segmentaFon) –  f(a==1,:): foreground features –  f(a==0,:): background features

6-‐May-‐13 18

Lecture 1 - !!!


ImplementaFon QuesFons?

6-‐May-‐13 19

Lecture 1 - !!!


Extensions

•  User interacFon or border manng •  Play with GMMs – Vary number of components – Different iniFalizaFon – Different color space (Lab)

•  Different Color model – Histogram based model

6-‐May-‐13 20

Lecture 1 - !!!


Extensions •  Different Neighborhood System –  4 connected –  8 connected –  fully connected (DenseCRF)

•  Efficient Inference in Fully Connected CRFs with Gaussian Edge PotenFals [Krähenbühl and Koltun 2011]

•  Different “affinity” –  Lab color difference –  Contour detector gPb

•  Contour DetecFon and Hierarchical Image SegmentaFon [Arbelaez etal 2010]

6-‐May-‐13 21

Lecture 1 - !!!


QuesFons?

6-‐May-‐13 22

Documents

Q&Aof GrabCut$ - Artificial Intelligencevision.stanford.edu/teaching/cs231b_spring1213/slides/segmentatio… · Lecture 1 - !!! Philipp Krähenbühl! 5 69May913$ Initialisation •