Factoring Scenes into 3D Structure and Style David Fouhey Thesis Committee: Abhinav Gupta (Co-Chair) Martial Hebert (Co-Chair) Deva Ramanan William T. Freeman, Massachusetts Institute of Technology Andrew Zisserman, University of Oxford


Factoring Scenes into 3D Structure and Style

David Fouhey

Thesis Committee:

Abhinav Gupta (Co-Chair)

Martial Hebert (Co-Chair)

Deva Ramanan

William T. Freeman, Massachusetts Institute of Technology

Andrew Zisserman, University of Oxford

Image = 3D Structure x Style

3D Structure: what surfaces are where (the underlying scene geometry)

Style: viewpoint-independent, canonical (fronto-parallel) texture

Example

Image = 3D Structure x Style


You See…


Unfortunately…


Why Can We Solve It?

Not all factorizations are equally likely!

3D Structure Style


The Problem

Image = 3D Structure (?) x Style (?)

Representations/Visualization

Surface Normal Legend

Sample Room

3D Structure Style


Contributions

Our First Contribution

Image = 3D Structure x Style

Data-Driven 3D Primitives for Single Image Understanding. Fouhey, Gupta, Hebert. In ICCV ‘13.


Supervised Approach

Data-Driven 3D Primitives for Single Image Understanding. Fouhey, Gupta, Hebert. In ICCV ‘13.


Issue #1 – Data

Wasteful: no cross-viewpoint sharing


Solution

Single Image 3D Without a Single 3D Image. Fouhey, Hussain, Gupta, Hebert. In ICCV ‘15.

Style Element Detections

Explicit factorization via style elements: they are shared across viewpoints and do not require training data

Issue #2

When do we apply domain knowledge/constraints?


Solution

“Planar”

“Point Contact”

“Cylindrical”

3D Shape Attributes. Fouhey, Gupta, Zisserman. In CVPR ’16.

Higher-order Shape Properties

Issue #3

The world is much more constrained than per-pixel predictions allow, but more detailed than global properties can capture.

Solution

Unfolding an Indoor Origami World. Fouhey, Gupta, Hebert. In ECCV ’14.

Mid-level constraints, discrete scene parses

Dissertation Contributions

Image = 3D Structure x Style

1. Local image-based cues

2. Local style-based cues

3. Cues for higher-order 3D structure (e.g., “Planar”)

4. Constraints on 3D structure

5. Data-driven dense normal estimation as a scene understanding task

RELATED WORK


Human Vision

• Monocular cues are integral to “normal” vision

• Monocular can override binocular: monocular illusions persist under binocular conditions

Gehringer and Engel, Journal of Experimental Psychology: Human Perception and Performance, 1986


Human Vision

Higher order properties are not obtained from depthmaps

“It is rather unlikely that the attitudes [i.e., normals] are derived from a pictorial depthmap” - Koenderink, van Doorn, Kappers ‘96

“Judgements about the curvature of local surface patches were too precise to be based on a symbolic representation of surface orientation” - Johnston and Passmore, ‘93

Koenderink, van Doorn, Kappers. Pictorial surface attitude and local depth comparisons. Perception and Psychophysics, 1996

Johnston and Passmore. Independent encoding of surface orientation and surface curvature. Vision Research, 1994

Recovering 3D Structure

Line-Based Primitives: Roberts 1963, Guzman 1968, Huffman 1971, Clowes 1971, Waltz 1975, Kanade 1980, Sugihara 1986, Malik 1987, etc.

Volumetric Primitives: Binford 1971, Brooks 1979, Biederman 1987, etc.

Recovering 3D Structure

Qualitative Orientation: Hoiem et al., 2005

Quantitative Depth: Saxena et al., 2005

Image Factorization

Barrow and Tenenbaum 1978; Tappen et al. 2002, 2006, Grosse et al. 2009, Barron et al. 2012, etc.

Shape-from-X: Malik et al. 1997, Criminisi et al. 2000, Forsyth 2002, Zhang et al. 2014, etc.

Content & Style: Tenenbaum et al. 1997; Elgammal et al. 2004, Wang et al. 2007, Pirsiavash 2009, etc.

SURFACE NORMALS

Surface Normals

(0.82, -0.21, 0.53)

Quantitative Orientation

Obtaining Normals

Color Image, Normals

Evaluating Normals

Input, GT, Prediction

Per-pixel angular error E (e.g., 10°). Aggregate over the entire dataset and compute: mean(E), median(E), sqrt(mean(E^2)), and mean(E < t) for t = 11.25°, 22.5°, 30°
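The aggregate statistics listed on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the thesis evaluation code: the function names `angular_error_deg` and `normal_metrics` are hypothetical, normals are assumed to be given as 3-tuples, and the RMSE is taken as the square root of the mean squared error.

```python
import math
from statistics import mean, median

def angular_error_deg(n_pred, n_gt):
    """Angle in degrees between two 3D vectors (normalized internally)."""
    dot = sum(p * g for p, g in zip(n_pred, n_gt))
    norm = math.sqrt(sum(p * p for p in n_pred)) * math.sqrt(sum(g * g for g in n_gt))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))

def normal_metrics(pred, gt, thresholds=(11.25, 22.5, 30.0)):
    """Summary stats over per-pixel errors E: mean, median, RMSE, % below t."""
    errors = [angular_error_deg(p, g) for p, g in zip(pred, gt)]
    stats = {
        "mean": mean(errors),
        "median": median(errors),
        "rmse": math.sqrt(mean(e * e for e in errors)),
    }
    for t in thresholds:
        # Percentage of pixels whose error is strictly below the threshold.
        stats[f"pct_below_{t}"] = 100.0 * mean(1.0 if e < t else 0.0 for e in errors)
    return stats
```

In practice `pred` and `gt` would hold every valid pixel of every test image, so the statistics are pooled over the dataset rather than averaged per image.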


Why Normals?

• Direct modeling produces better results

• Observable from perspective cues as opposed to scaling

• Fewer ambiguities than depth


DATA-DRIVEN 3D PRIMITIVES

Local image-based cues

Image = 3D Structure x Style

Previous Primitives

Segments: Hoiem et al. 2005, Saxena et al. 2005, Ramalingam et al. 2008, etc.

Rooms: Hedau et al. 2009, Flint et al. 2010, Flint et al. 2011, Satkin et al. 2012, Schwing et al. 2012, etc.

Cuboids: Lee et al. 2010, Gupta et al. 2010, Gupta et al. 2011, Xiao et al. 2012, Schwing et al. 2013, etc.

Lines + Planes: Kanade 1981, Sugihara 1986, Liebowitz et al. 1998, Criminisi et al. 1999, Lee et al. 2009, etc.

Objective

Visually Discriminative (Image)

Geometrically Informative (Surface Normals)

Similar ideas presented concurrently at ICCV ‘13: Owens et al., Shape Anchors for Data-Driven Multi-view Reconstruction; Dollar et al., Structured Forests for Fast Edge Detection

Representation

Detector: w

Canonical Form: N

Instances: y

Objective

Primitive Patch

Regularized classifier; loss for labels determined by geometry

Minimize intra-cluster geometric distance

Solve with an approach similar to block-coordinate descent
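One way to write an objective with the three annotated pieces is sketched below. The notation is an assumption built on the symbols w, N, y from the Representation slides (taken here as detector, canonical form, and instance indicators): R is a regularizer on the detector, L is a classification loss on the appearance x_i^A of patch i, Δ is a geometric distance between the canonical form and the patch geometry x_i^G, and c_1, c_2 are trade-off weights. It is a sketch under those assumptions, not necessarily the exact form on the slide.

```latex
\min_{w,\,y}\; R(w) \;+\; \sum_{i} \Big( c_1\, y_i\, \Delta\big(N, x_i^{G}\big) \;+\; c_2\, L\big(w, x_i^{A}, y_i\big) \Big)
```

Under this reading, a block-coordinate-style scheme would alternate between updating w with y held fixed (a standard classifier fit) and updating y and N with w held fixed (a reassignment of patches to the primitive).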


Learned Primitives


Interpretation from Primitives


Results – Quantitative

Summary Stats (°), Lower Better: Mean, Median, RMSE. % Good Pixels, Higher Better: 11.25°, 22.5°, 30°.

Method          Mean   Median   RMSE   11.25°   22.5°   30°
Karsch et al.   40.7   37.8     46.9   8.1      25.9    38.2
Hoiem et al.    41.2   35.1     49.2   9.0      31.2    43.5
Saxena et al.   48.0   43.1     57.0   10.7     27.0    36.3
RF+SIFT         36.0   33.4     41.7   11.4     31.4    44.5
3DP             34.2   30.0     41.4   18.6     38.6    49.9

Karsch et al., ECCV 2012; Hoiem et al., ICCV 2005; Saxena et al., NIPS 2005

Fouhey, Gupta, Hebert, ICCV ‘13.

Issues

Pure memorization: no sharing between views

Learning requires a specialized sensor


STYLE ELEMENTS

Local style-based cues

Image = 3D Structure x Style

A Different Idea

Image = 3D Structure x Style

Images: these are easy to get in bulk

3D Structure: we can put priors on this


Style Elements

Factorization

Image = 3D Structure x Style

Solving for Style

Image 3D Structure

Vanishing Points

Style


Solving for 3D Structure

Style Element

Input Image

HOG, Dalal and Triggs ’05; ELDA from Hariharan et al. ‘12

Rectified Images

Solving for 3D Structure

Style Element

Input Image

Detection + Orientation

Solving for 3D Structure over a Dataset

Set of Images

Style Element

Detection + Orientation


2 Key Assumptions

Style and 3D structure are independent

On average, 3D structure is a box


Plotting Detections

Surface Orientation

X Location


Box Assumption

Surface Orientation

X Location


Verifying Style Elements

Surface Orientation

X Location


$$\sum_{i=1}^{W} \lVert \mathrm{Prior}_i - \mathrm{Data}_i \rVert$$
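The verification score compares what the box prior expects at each X location with what a candidate element's detections actually show. The sketch below is one hedged reading of that idea: the binning into W columns, the use of an absolute difference, and the helper names `binned_fraction` and `prior_data_discrepancy` are all assumptions for illustration, not the thesis implementation.

```python
def binned_fraction(detections, num_bins, label):
    """Fraction of detections with orientation `label` in each X-location bin.

    detections: list of (x, orientation) pairs, with x normalized to [0, 1].
    Bins without detections contribute 0.0.
    """
    counts = [0] * num_bins
    hits = [0] * num_bins
    for x, orientation in detections:
        i = min(int(x * num_bins), num_bins - 1)  # keep x == 1.0 in the last bin
        counts[i] += 1
        hits[i] += orientation == label
    return [h / c if c else 0.0 for h, c in zip(hits, counts)]

def prior_data_discrepancy(prior, data):
    """Sum over bins of |Prior_i - Data_i| (absolute difference is assumed)."""
    return sum(abs(p - d) for p, d in zip(prior, data))
```

A candidate style element whose detections match the box prior gets a small discrepancy and would be kept; one whose detections are spread without regard to the prior gets a large discrepancy and would be rejected.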


Hypothesize and Verify Pipeline


Discovered Style Elements

Columns: Element, Detections

Rows: Vertical, Horizontal

Interpreting


Results

Input

GT

Output


Quantitative Results

All Pixels

Method           Median Error (Lower Better)   Pixels < 11.25° (Higher Better)   Pixels < 30° (Higher Better)
Style Elements   21.7°                         36.8%                             55.4%
3DP              19.2°                         39.2%                             57.8%
Origami World    17.9°                         40.5%                             58.9%
Disc. Coding     23.5°                         27.7%                             58.7%

Vertical, Pixels < 30° (Higher Better): 58.8%, 59.7%

3DP: Fouhey et al. ICCV ’13; Origami World: Fouhey et al. ECCV ’14; Disc. Coding: Ladicky et al. ECCV ‘14

Scaling Up To The World

RGBD Datasets Internet Images

Images from Places-205, Zhou et al. NIPS ‘15

Results on Internet Images

Supermarket

Museum

Laundromat

Locker Room

Automatically Discovered Style Elements

Quantitative Results

Pixels < 30 Degrees: 3DP 59.2%, Style Elements 62.9%

10 categories from Places-205 Dataset; images sparsely manually annotated

3DP: Fouhey et al. ICCV ‘13. Images from Places-205, Zhou et al. NIPS ‘15

The Story So Far

Unconstrained Outputs Constrained Outputs


3D SHAPE ATTRIBUTES

Cues for higher-order 3D structure

Image = 3D Structure x Style

Planar


Goal: 3D Shape Attributes

Not Planar

Smooth surface

1 point of contact

Not point contact

Has Hole

Not thin structures

…

Data


3D Shape Attributes

Curvature (4 Total), e.g., Planar Surfaces, Cylindrical Surfaces

Contact (2 Total): Point or Line, Multiple

Occupancy (6 Total), e.g., Thin Structures, Has Hole

Examples

Positives: Has Planar Surfaces


Examples

Negatives: Has Planar Surfaces


Examples

Positives: Has Point/Line Contact


Examples

Negatives: Has Point/Line Contact


Examples

Positives: Has Thin Structures


Examples

Negatives: Has Thin Structures


Data

Princeton, Columbus, Toronto, Yorkshire, Malaga, London

Data

Artists (242): R. Serra, H. Moore, A. Calder, …

Works (2187): Two Forms, The Arch, Knife Edge, 5 Swords, Eagle, Gwenfritz, …

Images: 143K

Learning To Predict

Input, Conv. Layers, FC Layers (VGG-M)

Outputs: 12D Shape Attributes, 1024D Shape Embedding

Triplet loss as in Schultz and Joachims ’04, Schroff et al. ’14, Wang et al. ‘15, Parkhi et al. ‘15
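The triplet loss cited here can be sketched in its common hinge form. This is a generic illustration rather than the training code behind the slide: squared Euclidean distance, the margin value, and the function names are assumptions, and embeddings are plain Python lists instead of 1024-dimensional network outputs.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: zero once the negative is farther from the anchor
    than the positive by at least `margin` (in squared distance)."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)
```

For a shape embedding, the anchor and positive would plausibly be two images of the same object and the negative an image of a different one, so that the embedding pulls together views that share 3D shape.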


Qualitative Results

Point/Line Contact: Most to Least

Rough Surface

Indirect Baselines

Image to depth (Z):
• SIRFS (Barron et al. ’15)
• CNN (Eigen et al. ’14)

Depth (Z) to attributes:
• KDES+SVM (Bo et al. ‘11)
• HHA+CNN (Gupta et al. ‘14)

Planar = Yes, Holes = Yes, …, 2+ Contacts = No, …

Quantitative Results

Criterion: mean AUC of ROC.

Indirect baselines (depth from Barron ‘15 or Eigen ‘14, features from KDES or HHA): 58.5, 59.4, 61.2, 62.5

End-to-end: 72.3
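The mean-AUC criterion can be sketched as follows: compute the area under the ROC curve separately for each attribute, then average. The pairwise (rank-statistic) formulation below is a standard equivalent of integrating the ROC curve; the function names and the list-of-columns input layout are assumptions for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(score_columns, label_columns):
    """Average the per-attribute AUCs (one column of scores per attribute)."""
    aucs = [roc_auc(s, l) for s, l in zip(score_columns, label_columns)]
    return sum(aucs) / len(aucs)
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a sort-based version would be the usual choice at dataset scale.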


PASCAL VOC Results

Planarity: Most to Least

Rough Surface: Most to Least

Point/Line Contact: Most to Least

The Story So Far

Planar

?

CONSTRAINTS ON 3D STRUCTURE

Mid-level constraints on 3D Structure

Image = 3D Structure x Style

Mid-level in the Past

Huffman 71, Clowes 71, Kanade 80, 81, Sugihara 86, Malik 87, etc.

Our Mid-Level Constraints

Our Output

Input: Single Image

Output: Discrete Scene Parse

Parameterization

vp1, vp2, vp3

VP Estimator from Hedau et al., 2009

Two VPs give grid cell

Encoding Surface Normals

x1,…, x400 x401,…, x800 x801,…, x1200
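The index ranges above suggest a flat vector of 1200 variables, split into three blocks of 400. A minimal sketch of that layout follows, under the assumption (not stated on the slide) that each block corresponds to one of the three grid orientations and that each variable is a binary indicator for one grid cell. The names `var_index` and `var_owner` are hypothetical helpers, and the code uses 0-based positions where the slide uses 1-based subscripts.

```python
CELLS_PER_ORIENTATION = 400  # block size read off the slide (x1..x400, etc.)
NUM_ORIENTATIONS = 3         # assumed: one block per grid orientation


def var_index(orientation, cell):
    """Return the 0-based position in x of a cell in a given orientation block."""
    if not 0 <= orientation < NUM_ORIENTATIONS:
        raise ValueError("orientation out of range")
    if not 0 <= cell < CELLS_PER_ORIENTATION:
        raise ValueError("cell out of range")
    return orientation * CELLS_PER_ORIENTATION + cell


def var_owner(index):
    """Inverse mapping: 0-based position in x -> (orientation, cell)."""
    if not 0 <= index < NUM_ORIENTATIONS * CELLS_PER_ORIENTATION:
        raise ValueError("index out of range")
    return divmod(index, CELLS_PER_ORIENTATION)
```

Under this reading, the slide's x401 (0-based position 400) is the first cell of the second block.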


Formulation


Constraints


Unaries

Unary Evidence: (1) 3DP (2) Room Box Fitting

Low c

High c


Binaries


Convex/Concave Constraints

Detected Concave (-)

Detecting Convex/Concave

Ground-Truth Discontinuities similar to Gupta, Arbelaez, Malik, 2013

3DP from Fouhey, Gupta, Hebert, 2013

Input 3D Primitive Bank

Use 3DP to Transfer Convex/Concave


Smoothness


Solving the Model


Qualitative Results

Results – Quantitative

Summary Stats (°) (Lower Better): Mean, Median

% Good Pixels (Higher Better): 11.25°, 22.5°, 30°

Method        Mean   Median   11.25°   22.5°   30°
Proposed      35.2   17.9     40.5     54.1    58.9
3DP           36.3   19.2     39.2     52.9    57.8
Ladicky '14   33.5   23.1     27.7     49.0    58.7

Fouhey et al. ICCV '13; Ladicky et al. ECCV '14
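The columns above are per-pixel angular error statistics. A minimal sketch of how such numbers can be computed from predicted and ground-truth normals is given below. It assumes that the error is the angle between the two unit normals and that a pixel counts as "good" when its error is strictly below the threshold; the slides do not spell out either detail. The function names are hypothetical.

```python
import math
from statistics import mean, median


def angular_error_deg(n_pred, n_gt):
    """Angle in degrees between two 3D normals (need not be unit length)."""
    dot = sum(p * g for p, g in zip(n_pred, n_gt))
    norm = math.sqrt(sum(p * p for p in n_pred)) * math.sqrt(sum(g * g for g in n_gt))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def normal_error_stats(pred, gt, thresholds=(11.25, 22.5, 30.0)):
    """Mean and median error (lower is better) plus % of pixels under each threshold."""
    errors = [angular_error_deg(p, g) for p, g in zip(pred, gt)]
    stats = {"mean": mean(errors), "median": median(errors)}
    for t in thresholds:
        # Assumed convention: strictly below the threshold counts as good.
        stats[t] = 100.0 * sum(e < t for e in errors) / len(errors)
    return stats
```

For example, `normal_error_stats(pred, gt)[11.25]` gives the percentage of pixels whose predicted normal is within 11.25° of the ground truth, where `pred` and `gt` are equal-length lists of 3-tuples.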


CONCLUSIONS & FUTURE WORK


Today

Image (3D Structure x Style)

3D Structure

Style

Planar

Non-Planar

Cylindrical

Rough Surf

Pnt/L Contact

Mult. Contact

Empty

Mult. Pieces

Holes

Thin

Mirror Sym.

Cubic Aspect

“Planar”

Future Work


Further Factorization

Image

3D Structure

Style

Viewpoint

True 3D


Reuniting 3Ds (Multiview)

+

E.g., Concha et al. Autonomous Robots '15, Hadfield et al. ICCV '15, Hane et al. CVPR '15

Monocular and multi-view cues

Supervised and unsupervised models

RGB

RGBD


Reuniting 3Ds (Single View)

Texture

Top-down

Semantics

Occlusion

Shading


Thank you

Image (3D Structure x Style)

3D Structure

Style

“Planar”