Source: web.mit.edu/tkonkle/www/POSTERS/Konkle_2006_VSS.pdf


Constructing Depth Information in Briefly Presented Scenes

Talia Konkle, Elisa McDaniel, Michelle R. Greene, Aude Oliva

Brain and Cognitive Sciences, MIT

Question: How does pictorial depth information unfold within the glance?

Experiment 1: How do depth planes “fill in”?

Experiment 2: Is performance driven by pixel-distance or depth disparity?

Experiment 3: Is global image structure obligatory in depth judgments?


1. How does depth information “fill in” as the viewing time increases?

2. Which factor influences two-point depth discrimination ability more: local pixel distances or global depth structure?

3. Is there an obligatory contribution of global image structure in depth comparisons?

Task and Image Specifics:

- 6 images, each with 8 selected points ranked from foreground to background surfaces; the rankings were confirmed independently by two of the authors.

- Points were selected such that, across all 28 pairwise comparisons, the lowest dot along the y-axis was the closest in depth ~50% of the time.

- Each trial consists of a depth comparison of 2 points from the 8 choose 2 = 28 possible comparisons, displayed for 1 of 4 presentation durations [40, 80, 120, 240 ms]. Thus, each image is viewed 112 times over the course of the experiment.

- 13 participants (1 excluded because he failed to complete the experiment)

- Picture size ~10 degrees; target dots ~0.6 degrees.
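The trial counts above follow directly from the design arithmetic. A minimal sketch (illustrative only, not the authors' code; the image and point labels are assumed) that enumerates the Experiment 1 trial set:

```python
# Sketch: enumerate the Experiment 1 trials to verify the stated counts:
# 8 choose 2 = 28 point pairs x 4 durations = 112 views per image.
from itertools import combinations

IMAGES = range(6)                 # 6 scene images (assumed labels)
POINTS = range(1, 9)              # 8 depth-ranked points per image
DURATIONS_MS = [40, 80, 120, 240]

trials = [(img, pair, dur)
          for img in IMAGES
          for pair in combinations(POINTS, 2)
          for dur in DURATIONS_MS]

views_per_image = len(list(combinations(POINTS, 2))) * len(DURATIONS_MS)
print(views_per_image)   # 28 * 4 = 112
print(len(trials))       # 6 * 112 = 672 trials total
```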

Trial sequence: Fixation (200-400 ms), Pre-Cue (800 ms), Presentation ([40, 80, 120, 240] ms), Mask (100 ms), Response: which dot is closer? (auditory feedback for incorrect responses)
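The trial timeline can be written out as a simple schedule. A hypothetical stub (not the authors' experiment code; the function and phase names are my own), with the fixation interval jittered between 200 and 400 ms:

```python
# Sketch of the trial timeline: ordered phases and durations for one trial.
import random

def trial_phases(presentation_ms, rng=random.Random(0)):
    """Return the ordered (phase, duration_ms) schedule; None = until keypress."""
    return [
        ("fixation", rng.randint(200, 400)),  # jittered fixation interval
        ("pre-cue", 800),
        ("presentation", presentation_ms),    # 40, 80, 120, or 240 ms
        ("mask", 100),
        ("response", None),                   # unspeeded; feedback if incorrect
    ]

phases = trial_phases(120)
timed = sum(d for _, d in phases if d is not None)
assert 1220 <= timed <= 1420   # 200-400 + 800 + 120 + 100
```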

[Figure: Overall Performance as a function of Presentation Time. Percent correct (0.50-1.00) plotted against presentation time (0-300 ms).]

[Figure: Comparison matrices over the 8 depth-ranked points (1 = foreground, 8 = background) at 40, 80, 120, and 240 ms; color scale 0-1 proportion correct.]

Trial sequence: Fixation (200-400 ms), Pre-Cue (800 ms), Presentation ([40, 80, 120, 240] ms), Mask (100 ms), Response: similar or different in depth? (auditory feedback for incorrect responses)

Results:

* Performance is not at ceiling by the end of the glance

* Accurate depth comparisons emerge in a coarse-to-fine manner

[Figure: Performance for all Depth-Plane Comparisons as a function of Presentation Time. In each matrix of point pairs (1 = foreground, 8 = background), near-diagonal cells are similar in depth and off-diagonal cells are different in depth.]

Respond whether the two points presented are similar or different in depth.

Task Specifics:

- 64 images, selected such that each image appears in only two of the six conditions.

- 30 comparisons per cell (15 foreground / 15 background)

- 10 Participants (2 excluded because overall performance was not significantly above chance)

- Large pixel distance: > ~6 degrees
- Small pixel distance: < ~2 degrees

Design: {Similar in Depth, Different in Depth} x {Small Pixel Distance (<2 visual degrees), Large Pixel Distance (>5 visual degrees)}; each cell is 50% foreground and 50% background comparisons.

1) The distribution of y-axis differences between the points is matched between the similar-in-depth comparisons and the different-in-depth comparisons.

[Figure: Performance over Time. Percent correct (0.40-1.00) vs. presentation time (0-280 ms) for four conditions: different depth / small pixel, different depth / large pixel, similar depth / small pixel, similar depth / large pixel.]

[Figure: Foreground and Background Comparisons. Percent correct (0.50-1.00) vs. presentation time (0-280 ms) for foreground, background, and different-depth-plane comparisons.]

Results:

* Depth judgments of different depth planes improve over time, independent of the pixel distance between the points.

* Depth judgments of similar depth planes are easier when the points are near each other in the image. When the points are far apart in the image, performance remains poor, even with more viewing time.

* Depth judgments in the foreground are performed more accurately than depth comparisons in the background.

2) The difference between the average value of a local patch (~0.5 degrees) around each point is approximately balanced across the foreground, background, and different-in-depth comparisons.
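This patch control can be illustrated with a small sketch (an illustration, not the authors' code; it assumes a grayscale image stored as a 2D array, and the patch half-width standing in for ~0.5 degrees would in practice depend on display resolution and viewing distance):

```python
# Sketch: mean intensity of a small square patch centred on a point,
# the quantity that was approximately balanced across comparison types.
import numpy as np

def patch_mean(image, x, y, half_width):
    """Mean pixel value of a (2*half_width+1)^2 patch centred on (x, y),
    clamped at the image borders."""
    h, w = image.shape
    x0, x1 = max(0, x - half_width), min(w, x + half_width + 1)
    y0, y1 = max(0, y - half_width), min(h, y + half_width + 1)
    return float(image[y0:y1, x0:x1].mean())

# Toy example: on a uniform image, the patch mean equals the uniform value.
img = np.full((100, 100), 0.5)
assert patch_mean(img, 50, 50, 8) == 0.5
```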

Two-Point Comparison Controls:

Across all 6 conditions, the two-point comparisons were selected such that the following constraints were matched:

Task Specifics and Point Selection

- 240 ms presentation time
- From 64 images, 15 point comparisons chosen for each of 8 conditions: {small/large pixel distance} x {original image similar/different depth} x {embedded image similar/different depth}

- Each point comparison was shown in the original, baseline, and embedded conditions
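The 8 conditions are the product of the three two-level factors listed above; a sketch (condition labels assumed, not the authors' code):

```python
# Sketch: enumerate the Experiment 3 conditions as a 2 x 2 x 2 factorial.
from itertools import product

PIXEL_DISTANCE = ["small", "large"]
ORIGINAL_DEPTH = ["similar", "different"]  # depth relation in the original image
EMBEDDED_DEPTH = ["similar", "different"]  # depth relation in the embedded image

conditions = list(product(PIXEL_DISTANCE, ORIGINAL_DEPTH, EMBEDDED_DEPTH))
print(len(conditions))              # 2 * 2 * 2 = 8
comparisons = len(conditions) * 15  # 15 point comparisons per condition
print(comparisons)                  # 120
```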

Display conditions: Original, Baseline, Embedded (with either matched or mismatched global structure).

Trial sequence: Fixation (200-400 ms), Pre-Cue (800 ms), Presentation (240 ms), Mask (100 ms), Response: similar or different in depth? (auditory feedback for incorrect responses)

[Figure: Overall Performance. Percent correct (0.50-1.00) for the original and embedded conditions; baseline performance is 77%.]

[Figure: Effect of Compatible and Incompatible Global Structure (difference of the embedded condition from baseline). Percent-correct difference (-0.15 to 0.05) for matched vs. mismatched global structure.]

Two possibilities: depth is built locally, or depth is filled in globally.

Results:

* Global image structure that is incompatible with the embedded point comparison impairs performance. Performance is similar to baseline when the global image structure is compatible.

* Global image information obligatorily influences a comparative depth judgment.

Respond which dot is closer to you on the image by pressing the corresponding button on the keyboard.

Purpose: Explore the trends of perceived depth in more natural visual settings, rather than examining the contributions of single or interacting depth cues.


Conclusions:

* Local two-point comparisons of depth in a natural image are necessarily influenced by the global structure of the image.

* Depth comparison accuracy generally emerges in a coarse-to-fine, foreground-to-background manner. Proximity of the comparison points in the two-dimensional image benefits performance only when the two points are similar in depth.