Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley


Page 1: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Probabilistic Models for Parsing Images

Xiaofeng Ren
University of California, Berkeley

Page 2: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Parsing Images

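[Figure: an example image parsed at several levels. Region labels: Tiger, Grass, Water, Sand. Scene labels: outdoor, wildlife. Further labels: Tiger, tail, eye, legs, head, back, shadow, mouse.]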

Page 3: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

A Classical View of Visual Processing

• Low-level (Image Processing): Pixels & Pixel Features
• Mid-level (Perceptual Organization): Contours & Regions
• High-level (Recognition): Objects & Scenes

[Figure labels: Tiger, Grass, Water, Sand]

Page 4: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Models for Parsing Images

• Low-level (Image Processing): Pixels
• Mid-level (Perceptual Organization): Contours & Regions
• High-level (Recognition): Objects & Scenes

A unified framework incorporating all levels of abstraction

Page 5: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Probabilistic Models for Images

Markov Random Fields [Geman & Geman 84]

[Diagram: Pixels and their Labels]

Very limited representational power

• Image restoration
• Edge detection
• Texture synthesis
• Segmentation
• Super-resolution
• Contour completion
• ………

Empirical evidence against pixel-based MRF [Ren & Malik 02]

Page 6: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Where is Structure?

Our perception of structure is disrupted.

We cannot efficiently reason about structure if we cannot represent it.

Page 7: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Outline

• Parsing Images
• Building a Mid-level Representation
• Probabilistic Models for Mid-level Vision
  • Contour Completion
  • Figure/Ground Organization
• Combining Mid- and High-level Vision
  • Object Segmentation
  • Finding People
• Conclusion & Future Work

Page 8: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Outline

• Parsing Images
• Building a Mid-level Representation
• Probabilistic Models for Mid-level Vision
  • Contour Completion
  • Figure/Ground Organization
• Combining Mid- and High-level Vision
  • Object Segmentation
  • Finding People
• Conclusion & Future Work

Page 9: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Local Edge Detection

Use the Pb (probability of boundary) edge detector: combining local brightness, texture and color contrasts.

Page 10: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Piece-wise Linear Approximation

Recursively split the boundaries (using angles) until each piece is approximately straight
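
The recursive splitting step can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function names, the 20-degree threshold and the exact angle criterion (turning angle at a point, measured against the two endpoints of the current piece) are all assumptions. It expects a contour given as a list of at least two (x, y) points.

```python
import math

def turning_angle(a, p, b):
    """Angle in radians between vectors a->p and p->b (0 when a, p, b lie on a straight line in order)."""
    v1 = (p[0] - a[0], p[1] - a[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.acos(max(-1.0, min(1.0, cos)))

def split_contour(points, max_angle=math.radians(20)):
    """Return sorted breakpoint indices so that every piece is approximately straight.

    max_angle is a hypothetical tolerance, not a value taken from the slides.
    """
    def recurse(lo, hi):
        if hi - lo < 2:
            return []
        # Interior point that bends the most relative to the endpoints of this piece.
        best = max(range(lo + 1, hi),
                   key=lambda i: turning_angle(points[lo], points[i], points[hi]))
        if turning_angle(points[lo], points[best], points[hi]) <= max_angle:
            return []  # piece is straight enough
        return recurse(lo, best) + [best] + recurse(best, hi)

    return [0] + recurse(0, len(points) - 1) + [len(points) - 1]
```

For an L-shaped contour the only interior breakpoint returned is the corner; a straight contour is kept as a single piece.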

Page 11: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Constrained Delaunay Triangulation (CDT)
• A variant of the standard Delaunay Triangulation
• Keeps a given set of edges in the triangulation
• Widely used in geometric modeling and finite elements

Page 13: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Scale Invariance of CDT

Page 14: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

The CDT Graph: Summary

• millions of pixels → 1000 edges
• fast to compute
• scale-invariant
• completes gaps
• little loss of structure
• longer ranges of interaction

Pixels → Superpixels

Principle of Uniform Connectedness: use homogeneous regions as entry-level units in perceptual organization. [Palmer and Rock 94]

[Ren & Malik; ICCV 2003] [Ren, Fowlkes & Malik; ICCV 2005]

Page 15: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Analogy with Natural Language Parsing

Language: Letters → Words → Phrases → Sentences & Paragraphs

Images: Pixels → Superpixels → Contours & Regions → Objects & Scenes

Page 16: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Outline

• Parsing Images
• Building a Mid-level Representation
• Probabilistic Models for Mid-level Vision
  • Contour Completion
  • Figure/Ground Organization
• Combining Mid- and High-level Vision
  • Object Segmentation
  • Finding People
• Conclusion & Future Work

Page 17: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Mid-level Vision

• It is not low-level vision (which can be computed independently in a local neighborhood).
• It is not high-level vision (which assumes knowledge of particular object categories & scenes).

Problems in mid-level vision
• Curvilinear grouping
• Figure/ground organization
• Region segmentation

Page 18: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Mid-level Vision

Problems in mid-level vision
• Curvilinear grouping
• Figure/ground organization
• Region segmentation

Page 19: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Curvilinear Grouping

Boundaries are smooth in nature!

A number of associated visual phenomena:
• Good continuation
• Visual completion
• Illusory contours

Page 20: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Beyond Local Edge Detection

There is psychophysical evidence that we are approaching the limit of local edge detection

Smoothness of boundaries in natural images provides an important contextual cue.

Page 21: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Inference on the CDT Graph

[Figure: the CDT graph with a binary variable $X_e$ on each edge $e$]

$X_e \in \{0,1\}$; 1: boundary, 0: non-boundary

Estimate the marginal $P(X_e)$

Random Field: defines a joint probability distribution on all $\{X_e\}$

Page 22: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Conditional Random Fields (CRF)

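Edge potentials $\exp(\beta_i \phi_i)$

Junction potentials $\exp(\alpha_j \psi_j)$

[Della Pietra, Della Pietra & Lafferty 97] [Lafferty, McCallum & Pereira 01]

$$P(X \mid \Theta) = \frac{1}{Z(\Theta)} \exp\Big\{ \sum_i \beta_i \phi_i(X) + \sum_j \alpha_j \psi_j(X) \Big\}$$

$$Z(\Theta) = \sum_X \exp\Big\{ \sum_i \beta_i \phi_i(X) + \sum_j \alpha_j \psi_j(X) \Big\}$$

where $X = \{X_1, X_2, \dots, X_m\}$, $\phi_i$ and $\psi_j$ are the edge and junction features, and $\beta_i$, $\alpha_j$ are their weights.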

Undirected graphical model with potential functions in the exponential family
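
To make the definition concrete, here is a small brute-force sketch of such a model over binary edge labels. It enumerates every labeling, so it only works for a handful of edges; the talk itself uses belief propagation on the CDT graph. The function name, the argument layout and the example scores are hypothetical.

```python
import itertools
import math

def crf_marginals(num_edges, edge_scores, junctions, junction_score):
    """Brute-force marginals P(X_e = 1) and partition function Z for a tiny CRF.

    edge_scores[e]   : log-potential added when X_e = 1 (assumed form)
    junctions        : list of tuples of edge indices that meet at each junction
    junction_score   : function mapping the labels at one junction to a log-potential
    """
    Z = 0.0
    on_mass = [0.0] * num_edges
    for X in itertools.product((0, 1), repeat=num_edges):
        log_w = sum(edge_scores[e] * X[e] for e in range(num_edges))
        log_w += sum(junction_score(tuple(X[e] for e in J)) for J in junctions)
        w = math.exp(log_w)          # unnormalised weight of this labeling
        Z += w
        for e in range(num_edges):
            if X[e]:
                on_mass[e] += w
    return [m / Z for m in on_mass], Z

# Example: three edges meeting at one junction; line endings (degree 1) are penalised.
marginals, Z = crf_marginals(
    num_edges=3,
    edge_scores=[2.0, 2.0, -1.0],
    junctions=[(0, 1, 2)],
    junction_score=lambda labels: -3.0 if sum(labels) == 1 else 0.0,
)
```

With no junctions, each edge marginal reduces to the logistic function of its own score, which is a quick sanity check on the enumeration.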

Page 23: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Edge Potential: Local Contrast

potentials $\exp(\beta_i \phi_i)$

$\phi$ = average contrast on each edge $e$

Page 24: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Junction Potential: Degree

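[Figure: a junction with three incident edges, each carrying a label $X_e$]

The degree of the junction depends on the assignments of $\{X_e\}$:
• deg=0 (no lines): labels 0, 0, 0
• deg=1 (line ending): labels 1, 0, 0
• deg=2 (continuation): labels 1, 0, 1
• deg=3 (T-junction): labels 1, 1, 1

$\psi_j = \mathbf{1}(\deg = j)$, potentials $\exp(\alpha_j \psi_j)$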

Page 25: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Junction Potential: Continuity

deg=2 (continuation): labels 1, 0, 1

$\psi = g(\theta) \cdot \mathbf{1}(\deg = 2)$
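
A rough sketch of how a junction's degree and continuity terms could be combined into one log-potential. The weights, the argument names and in particular the smoothness function standing in for g are illustrative assumptions; the slides do not give the form of g.

```python
import math

def junction_degree(labels):
    """Number of incident edges labeled as boundary (X_e = 1)."""
    return sum(labels)

def smoothness(theta):
    """Hypothetical stand-in for g: 1 for a straight continuation, 0 for a full reversal."""
    return 1.0 - theta / math.pi

def junction_log_potential(labels, directions, degree_weights, continuity_weight):
    """Log-potential of one junction.

    labels        : 0/1 label for each incident edge
    directions    : outgoing direction (radians) of each incident edge at the junction
    degree_weights: dict mapping degree -> weight of the indicator 1(deg = j)
    """
    on = [d for x, d in zip(labels, directions) if x == 1]
    deg = len(on)
    score = degree_weights.get(deg, 0.0)
    if deg == 2:
        # Angle between the two outgoing directions, folded into [0, pi].
        diff = abs(on[0] - on[1]) % (2 * math.pi)
        diff = min(diff, 2 * math.pi - diff)
        theta = math.pi - diff  # turning angle: 0 when the contour passes straight through
        score += continuity_weight * smoothness(theta)
    return score
```

Only degree-2 junctions receive the continuity term, matching the indicator 1(deg = 2) on the slide.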

Page 26: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Learning the Parameters

2.46 0.87 1.14 0.01

mid-level representation + probabilistic framework + large annotated datasets

Compare to [Geman and Geman 84]

Page 27: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Evaluation: Precision vs Recall

[Figure: precision-recall curve; vertical axis Precision, horizontal axis Recall. High threshold: few detections. Low threshold: lots of detections.]

Detections are matched to groundtruth.

Precision = matched pairs / total detections

Recall = matched pairs / total groundtruth
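
The two ratios, and the sweep over thresholds that traces out the curve, can be sketched as below. This assumes the matching between detections and groundtruth has already been computed and that each matched detection corresponds to a distinct groundtruth element; the function names and the input layout are hypothetical.

```python
def precision_recall(matched_pairs, total_detections, total_groundtruth):
    """Precision = matched / detections, Recall = matched / groundtruth."""
    precision = matched_pairs / total_detections
    recall = matched_pairs / total_groundtruth
    return precision, recall

def pr_curve(scores, matched, total_groundtruth, thresholds):
    """One (threshold, precision, recall) triple per threshold with at least one detection.

    scores  : detector response of each candidate
    matched : True if that candidate is matched to a groundtruth element
    """
    curve = []
    for t in thresholds:
        kept = [m for s, m in zip(scores, matched) if s >= t]
        if not kept:
            continue  # no detections at this threshold: precision is undefined
        p, r = precision_recall(sum(kept), len(kept), total_groundtruth)
        curve.append((t, p, r))
    return curve
```

Raising the threshold keeps fewer detections, which tends to raise precision and lower recall.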

Page 28: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Curvilinear grouping improves boundary detection, both for low-recall and high-recall

Horse dataset of [Borenstein and Ullman 02], 175 images training, 175 testing

“Mid-level vision is useful”

[Ren, Fowlkes & Malik; ICCV 2005]

Page 29: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Image Pb CRF

Page 30: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Image Pb CRF

Page 31: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Mid-level Vision

Problems in mid-level vision
• Curvilinear grouping
• Figure/ground organization
• Region segmentation

Page 32: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Mid-level Vision

Problems in mid-level vision
• Curvilinear grouping
• Figure/ground organization
• Region segmentation

Page 33: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Figure/Ground Organization

A contour belongs to one of the two (but not both) abutting regions.

• Figure (face), Ground (shapeless)
• Figure (goblet), Ground (shapeless)

Important for the perception of shape

Page 34: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Inference on the CDT Graph

[Figure: the CDT graph with a variable $X_e$ on each edge $e$]

$X_e \in \{-1,1\}$; 1: Left is Figure, -1: Right is Figure

Local Model: Convexity, Parallelism,…

Global Model: Consistency at T-junctions

Page 35: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Results

Using human segmentations:
• Chance: 50.0%
• Baseline Size/Convexity: 55.6%
• Local Shapemes: 64.8%
• Averaging shapemes on segmentation boundaries: 72.0%
• Shapemes + CRF: 78.3%
• Dataset Consistency: 88.0%

[Ren, Fowlkes & Malik; ECCV 2006]

Page 36: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Models for Contour Labeling

TigerGrass

Water

Sand

Labels {Xe}

Curvilinear Grouping, Figure/Ground Assignment

Contours&

Regions

Objects&

Scenes

Pixels

Superpixels CRF

Page 37: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Line Labeling

> : contour direction
+ : convex edge
- : concave edge

Reviving the old tradition with modern technologies, for more realistic applications

possible junctions (constraints)

CSP

[Clowes 1971, Huffman 1971; Waltz 1972]

Page 38: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Parsing Images

TigerGrass

Water

Sand

Add region-based variables and cues

Joint contour and region inference

Add high-level knowledge (objects)

Contours&

Regions

Objects&

Scenes

Pixels

Superpixels

Page 39: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Object Segmentation

Object-specific cues:
• Shape
• Region support
• Color/Texture

Page 40: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Inference on the CDT Graph

[Figure: the CDT graph with contour variables $X_e$, region variables $Y_t$ and a single object variable $Z$]

• Contour variables $\{X_e\}$
• Region variables $\{Y_t\}$
• Object variable $\{Z\}$: encoding location, scale, pose, etc.

Integrating $\{X_e\}$, $\{Y_t\}$ and $\{Z\}$: low/mid/high-level cues

Page 41: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Grouping Cues

Low-level Cues
• Edge energy along edge e: $L_1(X_e \mid I)$
• Brightness/texture similarity between two regions s and t: $L_2(Y_s, Y_t \mid I)$

Mid-level Cues
• Edge collinearity and junction frequency at vertex V: $M_1(X_V \mid I)$
• Consistency between edge e and two adjoining regions s and t: $M_2(X_e, Y_s, Y_t)$

High-level Cues
• Texture similarity of region t to exemplars: $H_1(Y_t \mid I)$
• Compatibility of region support with pose: $H_2(Y_t, Z \mid I)$
• Compatibility of local edge shape with pose: $H_3(X_e, Z \mid I)$

Page 42: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Cue Integration in CRF

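$$P(X, Y, Z \mid I, \Theta) = \frac{1}{Z(I,\Theta)} \exp\{-E(X, Y, Z \mid I, \Theta)\}$$

$$\begin{aligned}
E = {} & -\sum_e L_1(X_e \mid I) - \sum_{s,t} L_2(Y_s, Y_t \mid I) \\
& -\sum_V M_1(X_V \mid I) - \sum_{s,t} M_2(X_e, Y_s, Y_t) \\
& -\sum_t H_1(Y_t \mid I) - \sum_t H_2(Y_t, Z \mid I) - \sum_e H_3(X_e, Z \mid I)
\end{aligned}$$

Here $Z(I,\Theta)$ is the partition function, distinct from the object variable $Z$.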

Estimate the marginal posteriors of X, Y and Z

Page 43: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Object knowledge helps a lot

Mid-level Cues still useful

[Ren, Fowlkes & Malik; NIPS 2005]

Page 44: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Input Input Pb Output Contour Output Figure

Page 45: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Input Input Pb Output Contour Output Figure

Page 46: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Finding People

The challenges:

• Pose: articulation + self-occlusion
• Clothing
• Lighting
• Clutter
• ……

Page 47: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Finding People: Top-Down

Objects&

Scenes

Pixels

Top-down approaches

• 3D model-based: fails most of the time
• 2D template-based: needs lots of training data

Contours&

Regions

Objects&

Scenes

Pixels

Superpixels

Page 48: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Finding People: Bottom-Up

[Diagram: bottom-up processing, Pixels → Superpixels → Contours & Regions → Objects & Scenes]

Page 49: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

[Ren, Berg & Malik; ICCV 2005]

Page 51: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Tracking People as Blobs

Blob tracking != Rectangle tracking

… k-1, k, k+1, …

Figure/Ground Segmentation

Object, Background

Appearance Model

Temporal Coherence

Page 54: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Preliminary Results

Tracking = Repeated Segmentation

(video)

Page 55: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Conclusion

Constrained Delaunay Triangulation (CDT)

Conditional Random Fields (CRF)

Quantitative evaluations

Integration of mid-level with high-level vision

Page 56: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Future Work

Contours&

Regions

Objects&

Scenes

Pixels

Superpixels

A richer and more consistent mid-level representation

Higher-order potential functions

Using mid-level representation for general object recognition

A high-fidelity tracking system

Finding people in static images

Page 57: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Thank You

Page 58: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Acknowledgements

Joint work with Charless Fowlkes, Alex Berg, and Jitendra Malik.

References
• X. Ren, C. Fowlkes and J. Malik. Figure/Ground Assignment in Natural Images. In ECCV 2006.
• X. Ren, C. Fowlkes and J. Malik. Cue Integration in Figure/Ground Labeling. In NIPS 2005.
• X. Ren, A. Berg and J. Malik. Recovering Human Body Configurations using Pairwise Constraints between Parts. In ICCV 2005.
• X. Ren, C. Fowlkes and J. Malik. Scale-Invariant Contour Completion using Conditional Random Fields. In ICCV 2005.
• X. Ren and J. Malik. Learning a Classification Model for Segmentation. In ICCV 2003.
• X. Ren and J. Malik. A Multi-Scale Probability Model for Contour Completion based on Image Statistics. In ECCV 2002.

Page 62: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Finding People from Bottom-Up

Detecting parts

Superpixels

Assembling parts

Integer Quadratic Programming (IQP)

Objects&

Scenes

Pixels

Superpixels

Page 63: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Finding People in Video

Contours&

Regions

Pixels

Superpixels

Additional information: motion, appearance, temporal consistency

How much can we do without an object model (blob tracking)?

…………

Page 65: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour.

Do I have "327"?

No. I have sky, house, and trees.

---- Max Wertheimer, 1923

Page 66: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Learning the Parameters

Maximum-likelihood estimation in CRF

Let $\tilde X$ denote the groundtruth labeling on the CDT graph:

$$L(\tilde X \mid \theta) = \frac{1}{Z(\theta)} \exp\Big\{ \sum_{\text{edge}} \sum_s \theta_s\, g_s(\tilde X) + \sum_{\text{factor}} \sum_t \theta_t\, f_t(\tilde X) \Big\}$$

$$\frac{\partial}{\partial \theta_t} \log L(\tilde X \mid \theta) = \langle f_t \rangle_{\tilde P(X)} - \langle f_t \rangle_{P(X \mid \theta)}$$

Gradient descent works well
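
The gradient of the log-likelihood in an exponential-family model is the empirical feature expectation minus the model's feature expectation. The sketch below applies that update to a toy model small enough to enumerate exactly. The two features, the learning rate and the toy data are hypothetical and chosen only for illustration; on a real CDT graph the model expectation would have to be approximated, for example with belief propagation.

```python
import itertools
import math

def features(X):
    """Hypothetical features of a labeling X (tuple of 0/1): [number of 'on' edges, 1 if exactly two are on]."""
    deg = sum(X)
    return [float(deg), 1.0 if deg == 2 else 0.0]

def model_expectation(theta, num_vars):
    """Return (E_{P(X|theta)}[f(X)], log Z) by enumerating all labelings."""
    Z = 0.0
    acc = [0.0] * len(theta)
    for X in itertools.product((0, 1), repeat=num_vars):
        f = features(X)
        w = math.exp(sum(t * fi for t, fi in zip(theta, f)))
        Z += w
        acc = [a + w * fi for a, fi in zip(acc, f)]
    return [a / Z for a in acc], math.log(Z)

def empirical_expectation(data):
    """Average feature vector over the groundtruth labelings."""
    fs = [features(X) for X in data]
    return [sum(f[k] for f in fs) / len(fs) for k in range(len(fs[0]))]

def avg_log_likelihood(theta, data):
    """Mean log-likelihood: theta . (empirical features) - log Z."""
    _, log_Z = model_expectation(theta, len(data[0]))
    emp = empirical_expectation(data)
    return sum(t * e for t, e in zip(theta, emp)) - log_Z

def fit(data, steps=2000, lr=0.5):
    """Gradient ascent on the log-likelihood (descent on its negative)."""
    theta = [0.0, 0.0]
    emp = empirical_expectation(data)
    for _ in range(steps):
        model, _ = model_expectation(theta, len(data[0]))
        # gradient component = empirical expectation - model expectation
        theta = [t + lr * (e - m) for t, e, m in zip(theta, emp, model)]
    return theta
```

At convergence the model's feature expectations match the empirical ones, which is the usual moment-matching condition for maximum likelihood in this model family.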

Page 67: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Global Consistency

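[Figure: junction configurations with figure (F) and ground (G) labels on each side of the contours; some configurations are common, others uncommon]

Use junction potentials to encode junction type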

Page 68: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Image Groundtruth Local Global

Page 69: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Results

Without human segmentations:
• Chance: 50.0%
• Baseline Size/Convexity: N/A
• Local Shapemes: 64.9%
• Averaging shapemes on segmentation boundaries: 66.5%
• Shapemes + CRF: 68.9%
• Dataset Consistency: 88.0%

Page 70: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Image Pb Local Global

Page 71: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Outline

• Parsing Images
• Building a Mid-level Representation
• Probabilistic Models for Mid-level Vision
  • Contour Completion
  • Figure/Ground Organization
• Combining Mid- and High-level Vision
  • Object Segmentation
  • Finding People
• Conclusion & Future Work

Page 72: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Detecting Parts: CDT

• Candidate parts as parallel line segments (Ebenbreite)
• Automatic scale selection from bottom-up
• Feature combination with a logistic classifier

Page 73: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Assembling Parts: IQP

Candidates {Ci} Parts {Lj}

(Lj1,Ci1=(Lj1))

(Lj2,Ci2=(Lj2))

Cost for a partial assignment {(Lj1,Ci1), (Lj2,Ci2)}:

assignment

2

)22)(11( 2,1

2,1)22(),11(

kijij jj

k

jjk

ijijkf

H

Page 74: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Testing the Markov Assumption

The Markov Model for Contours:

• Curvature = white noise (independent)
• Tangent direction t = random walk

P( t(s+1) | t(s),…) = P( t(s+1) | t(s) )

Dynamic Programming

t(s)

t(s+1)

s

s+1

[Mumford 1994, Williams & Jacobs 1995]

Page 75: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Testing the Markov Assumption

Segment the contours at high-curvature positions

If the Markov assumption holds,
• at each step, a high-curvature event happens with probability p;
• high-curvature events are independent from step to step.

Therefore, if L is the length of the contour segment between high-curvature points,

$P(L=k) = p(1-p)^k$
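
A short numerical check of what this geometric law implies. Under the Markov assumption the tail P(L ≥ k) = (1-p)^k is memoryless, so it decays by a fixed factor per step. A power law behaves differently: doubling the length always shrinks the tail by the same factor, which is the scale-invariant behavior. The function names and the parameter values in the usage lines are illustrative assumptions, not numbers from the talk.

```python
def geometric_pmf(k, p):
    """P(L = k) = p (1 - p)^k for k = 0, 1, 2, ..."""
    return p * (1.0 - p) ** k

def geometric_tail(k, p):
    """P(L >= k) = (1 - p)^k under the Markov (independent-step) assumption."""
    return (1.0 - p) ** k

def power_law_tail(k, exponent):
    """Unnormalised power-law tail, proportional to k^(-exponent) for k >= 1."""
    return float(k) ** (-exponent)

# Geometric: the ratio between consecutive lengths is constant (exponential decay).
step_ratio = geometric_pmf(11, 0.1) / geometric_pmf(10, 0.1)

# Power law: the ratio between lengths k and 2k is constant, whatever k is.
doubling_ratio = power_law_tail(20, 2.0) / power_law_tail(10, 2.0)
```

On a plot, the geometric law is a straight line in log-probability versus length, while a power law is a straight line in log-probability versus log-length.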

Page 76: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Berkeley Segmentation Dataset

[Martin, Fowlkes, Tal and Malik, ICCV 2001]

1,000 images, >14,000 segmentations

Page 77: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Exponential vs Power Law

[Figure: probability vs contour segment length L, comparing the two laws]

• Markov Assumption → Exponential Law
• Scale Invariance → Power Law: $P(L) \propto 1/L^{2.40}$

Page 78: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Scale Invariance

Arbitrary viewing distance

Hierarchy of Parts: Finger, Leg, Torso

Page 79: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

A Scale-Invariant Representation

TigerGrass

Water

Sand

Scale Space

Re-scale

? A scale-invariant representation for contours

Page 80: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Gap-Filling Property of CDT

A typical scenario of contour completion

[Figure: two high-contrast contour pieces separated by a low-contrast gap]

CDT picks the “right” edge, completing the gap

Page 81: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

No Loss of Structure

• Use $P_{human}$, the soft groundtruth label defined on CDT graphs: precision close to 100%
• Pb averaged over CDT edges: no worse than the original Pb
• Increase in asymptotic recall rate: completion of gradientless contours

Page 82: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Uniform Connectedness

Connected regions of homogeneous properties (brightness, color, texture) are perceived as entry-level units. [Palmer & Rock, 1994]

“Classical principles of grouping operate after UC, creating superordinate units consisting of two or more entry-level units.”

“… UC (uniform connectedness) cannot be reduced to grouping principles, because it is not a form of grouping at all…”

Page 83: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Local Model

“Bi-gram” model:

contrast + continuity

binary classification (0,0) vs (1,1)

logistic classifier

“Tri-gram” model:

1 2

L LPbL

=

Xe

Page 84: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Building a CRF Model

What are the features?
• edge features: low-level “edgeness” (Pb)
• junction features: junction type, continuity

How to make inference?
• Loopy Belief Propagation

How to learn the parameters?
• Gradient Descent on Max. Likelihood

$X = \{X_1, X_2, \dots, X_m\}$

Estimate $P(X_i \mid \Theta)$

Page 85: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Junction and Continuity

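Junction types $(\deg_g, \deg_c)$:

[Figure: example junctions with $\deg_g=1, \deg_c=0$; $\deg_g=0, \deg_c=2$; $\deg_g=1, \deg_c=2$; $\deg_g=0, \deg_c=0$]

$$f_{(a,b)}(X) = \mathbf{1}(\deg_g = a,\ \deg_c = b), \qquad \text{potential } \exp\big(\alpha_{(a,b)}\, f_{(a,b)}\big)$$

Continuity term for degree-2 junctions ($\deg_g + \deg_c = 2$):

$$\exp\big(\gamma \cdot g(\theta) \cdot \mathbf{1}(\deg_g + \deg_c = 2)\big)$$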

Page 86: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Interpreting the Parameters

=2.46 =0.87 =1.14 =0.01

=-0.59

=-0.98

Line endings and junctions are rare

Completed edges are weak

Page 87: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Continuity improves boundary detection in both low-recall and high-recall ranges

Global inference helps; mostly in low-recall/high-precision

Roughly speaking,

CRF>Local>CDT only>Pb

Page 90: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Image Pb Local Global

Page 91: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Figure/Ground Principles

• Convexity
• Parallelism
• Surroundedness
• Symmetry
• Common Fate
• Familiar Configuration
• ……

[Figure: example contours with figure (F) and ground (G) sides marked]

Page 92: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Figure/Ground Dataset

Page 93: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Figure/Ground Assignment in Natural Images

• Local Model: use shapemes (prototypical local shapes) to capture contextual information
• Global Model: use CRF to enforce consistency at junctions

Page 94: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Shapemes: Prototypical Local Shapes

……

local shapes

collect

cluster

Average shape in each shapeme cluster

Page 95: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Shapemes for F/G Discrimination

L R

L:93.84%

L:49.80%

L:89.59%

L:11.69%

L:66.52%

L: 4.98%

Which side is Figure?

Train a logistic classifier to linearly combine the shapeme cues

Page 96: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

CRF for Figure/Ground

$F = \{F_1, F_2, \dots, F_m\}$

$F_i \in \{\text{Left}, \text{Right}\}$

• Put potential functions at junctions
• One feature for each junction type

[Figure: junction types with figure (F) and ground (G) labels, e.g. { (F,G),(G,F),(F,G) }, { (G,F),(F,G) }, { (F,G),(F,G),(F,G) }]

Page 97: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Results

Page 98: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

CDT vs K-Neighbor

An alternative scheme for completion: connect to k-nearest neighbor vertices, subject to visibility

CDT achieves higher asymptotic recall rates

Page 99: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Inference w/ Belief Propagation

Loopy Belief Propagation
• just like belief propagation
• iterates message passing until convergence
• lacks theoretical foundations and is known to have convergence issues
• however, is becoming popular in practice
• typically applied on pixel grids

Works well on CDT graphs
• converges fast (<10 iterations)
• produces empirically sound results

Page 100: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Shape Context

Count the number of edge points inside each bin

“log-polar”

count=4

count=6

[Belongie, Malik & Puzicha, ICCV 2001] [Berg & Malik, CVPR 2001]
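
The bin counting can be sketched as a small log-polar histogram. The number of bins, the radius limits and the function signature below are illustrative assumptions, not the settings used in the cited work.

```python
import math

def shape_context(center, edge_points, num_radial=3, num_angular=8, r_min=1.0, r_max=16.0):
    """Count edge points in log-polar bins around `center`.

    Returns hist[r_bin][a_bin]; points closer than r_min or at r_max and beyond are ignored.
    """
    hist = [[0] * num_angular for _ in range(num_radial)]
    log_span = math.log(r_max / r_min)
    for (x, y) in edge_points:
        dx, dy = x - center[0], y - center[1]
        r = math.hypot(dx, dy)
        if r < r_min or r >= r_max:
            continue
        # Radial bins are uniform in log(r), so bins grow wider further from the center.
        r_bin = min(int(num_radial * math.log(r / r_min) / log_span), num_radial - 1)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = int(num_angular * angle / (2 * math.pi)) % num_angular
        hist[r_bin][a_bin] += 1
    return hist
```

Because the radial bins are logarithmic, nearby edge points are described at finer resolution than distant ones.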

Page 101: Probabilistic Models for Parsing Images Xiaofeng Ren University of California, Berkeley

Compare to DDMCMC

We try to solve the same problem: a unified framework for image parsing.

• Mid-level representation: CDT vs “atomic regions”
• Probabilistic model: discriminative vs generative
• Inference mechanism: belief propagation vs MCMC
• Quantitative evaluation

We try to develop models step by step.