
Patch Complexity, Finite Pixel Correlations and Optimal Denoising

Anat Levin, Boaz Nadler, Fredo Durand and Bill Freeman

Weizmann Institute, MIT CSAIL

Page 2: Image denoising

Many research efforts have been invested, and results are becoming harder and harder to improve: are we reaching saturation?

• What uncertainty is inherent in the problem?
• How much further can we improve results?

Page 3: Denoising Uncertainty

What is the volume of all clean images x that can explain a noisy image y?

[Figure: a noisy input image y]

Page 4: Denoising Uncertainty

What is the volume of all clean images x that can explain a noisy image y?

Multiple clean images lie within the noise level.

[Figure: a noisy input image y and several plausible clean explanations]

Page 5: Denoising limits - prior work

• Signal processing assumptions (Wiener filter, Gaussian priors)
• Limits on super-resolution: numerical arguments, no prior [Baker & Kanade 02]
• Sharp bounds for perfectly piecewise-constant images [Korostelev & Tsybakov 93, Polzehl & Spokoiny 03]
• Non-local means: asymptotically optimal for infinitely large images; no analysis of finite-size images [Buades, Coll & Morel 05]
• Natural image denoising limits, but with many assumptions that may not hold in practice and that affect the conclusions [Chatterjee & Milanfar 10]

Page 6: MMSE denoising bounds

The MMSE is the expected conditional variance, and it is achieved by the conditional mean:

\[
\mathrm{MMSE} = \int p(y)\,\mathrm{Var}(x_c \mid y)\,dy
= \iint p(y)\,p(x_c \mid y)\,\big(\mu(y) - x_c\big)^2\,dx_c\,dy,
\qquad \mu(y) = E[x_c \mid y].
\]

The MMSE with the exact p(x) (and not with the heuristics used in practice) is, by definition, the best possible denoising.

Using internal image statistics or class-specific information might provide practical benefits, but, by definition, cannot perform better than the MMSE.
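For completeness, a standard one-line argument (my own addition, not from the slides) for why the conditional mean attains this bound: for any estimator \(\hat{x}(y)\),

\[
E\big[(\hat{x}(y)-x_c)^2 \mid y\big] = \mathrm{Var}(x_c \mid y) + \big(\hat{x}(y)-\mu(y)\big)^2 \;\ge\; \mathrm{Var}(x_c \mid y),
\]

with equality exactly when \(\hat{x}(y)=\mu(y)\); integrating over \(p(y)\) gives the MMSE expression above.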

Page 7: MMSE with a finite support

MMSE_d = the best possible result of any algorithm that can utilize a d = k x k window w_d around the pixel of interest (e.g. the spatial kernel size in the bilateral filter, or the patch size in non-parametric methods).

Non-Local Means: effective support = the entire image.

\[
\mathrm{MMSE}_d = \int p(y_{w_d})\,\mathrm{Var}(x_c \mid y_{w_d})\,dy_{w_d}
= \iint p(y_{w_d})\,p(x_c \mid y_{w_d})\,\big(\mu_d(y_{w_d}) - x_c\big)^2\,dx_c\,dy_{w_d},
\qquad \mu_d(y_{w_d}) = E[x_c \mid y_{w_d}].
\]

Page 8: Estimating denoising bounds in practice

Challenge: compute the MMSE without knowing p(x).

The trick [Levin & Nadler CVPR11]: we don't know p(x), but we can sample from it, and evaluate the MMSE non-parametrically.

\[
\mathrm{MMSE} = \iint p(y)\,p(x_c \mid y)\,\big(\mu(y) - x_c\big)^2\,dx_c\,dy
\]

Sample x_i ~ p(x), and replace the densities by sample means:

\[
\hat{p}(y) = \frac{1}{N}\sum_i p(y \mid x_i),
\qquad
\hat{\mu}(y) = \frac{\sum_i p(y \mid x_i)\,x_{i,c}}{\sum_i p(y \mid x_i)}.
\]
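A minimal sketch of this non-parametric estimator (my own illustration, assuming i.i.d. Gaussian noise of standard deviation sigma and a database of clean patches drawn from p(x); the function and variable names are hypothetical):

```python
import numpy as np

def nonparametric_mmse(noisy_patches, clean_samples, sigma):
    """Monte-Carlo estimate of the MMSE for denoising the central pixel.

    noisy_patches: (M, d) array of noisy patches y, assumed drawn from p(y).
    clean_samples: (N, d) array of clean patches x_i drawn from p(x).
    sigma: noise standard deviation (Gaussian noise assumed).
    """
    d = clean_samples.shape[1]
    c = d // 2                      # index of the central pixel
    errs = []
    for y in noisy_patches:
        # Likelihood p(y | x_i) for every clean sample, up to a constant factor.
        # (For large patches / small sigma a log-sum-exp would be needed for stability.)
        sq_dist = np.sum((clean_samples - y) ** 2, axis=1)
        w = np.exp(-sq_dist / (2.0 * sigma ** 2))
        w_sum = w.sum()
        if w_sum == 0:              # no sample explains y: estimate unreliable, skip
            continue
        # Conditional mean of the central pixel: mu(y) = sum_i w_i x_{i,c} / sum_i w_i
        mu = np.dot(w, clean_samples[:, c]) / w_sum
        # Conditional variance = squared error of the optimal estimator at this y
        var = np.dot(w, (clean_samples[:, c] - mu) ** 2) / w_sum
        errs.append(var)
    # Averaging the conditional variance over samples of p(y) approximates the MMSE.
    return np.mean(errs)
```

The estimate is only reliable when the likelihood weights are well supported by the sample set, which is exactly the small-patch / large-noise regime discussed on the next slide.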

Page 9: MMSE as a function of patch size

[Levin & Nadler CVPR11]: for small patches / large noise, the non-parametric approach can accurately estimate the MMSE.

[Plot: estimated MMSE as a function of patch size]

Page 10: MMSE as a function of patch size

How much better can we do by increasing the window size?

[Plot: estimated MMSE as a function of patch size]

Page 11: Towards denoising bounds

Questions:

• For non-parametric methods: how does the difficulty of finding nearest neighbors relate to the potential gain, and how can we make better use of a given database size?

• For any possible method: computational issues aside, what is the best possible restoration? Can we achieve zero error?

Page 12: Patch Complexity

[Figure]

Page 13: Patch Complexity

[Figure]

Page 14: Patch Complexity

[Figure: empty neighbors set]

Page 15: Patch Complexity

[Figure]

Page 16: Patch complexity vs. PSNR gain

Law of diminishing return: when an increase in patch width requires many more training samples, the performance gain is smaller.

Smooth regions: easy to increase support, large gain.
Textured regions: hard to increase support, small gain.

This motivates adaptive patch-size selection in denoising algorithms (see paper).

Page 17: Pixel Correlation and PSNR gain

[Figure: a single noisy observation y_1]

Page 18: Pixel Correlation and PSNR gain

[Figure: observations y_1, y_2 in the independent and in the fully dependent case]

Page 19: Pixel Correlation and PSNR gain

Independent: going from 1D to 2D spreads the samples, so there are few neighbors, and y_2 carries no information about the pixel of interest: no gain from y_2.

Fully dependent: when the samples are correlated, going from 1D to 2D does not spread the samples, so there are many neighbors, and y_2 acts as a second observation: a factor 2 variance reduction.
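To make the two extremes concrete (standard arithmetic, my own addition): write y_i = x_i + n_i with independent noise n_i of variance \(\sigma^2\). Then

\[
\text{Fully dependent } (x_2 = x_1):\quad
\mathrm{Var}\!\Big(\tfrac{y_1+y_2}{2} - x_1\Big) = \mathrm{Var}\!\Big(\tfrac{n_1+n_2}{2}\Big) = \frac{\sigma^2}{2}
\quad (\text{factor 2 reduction}),
\]
\[
\text{Independent } (x_2 \perp x_1):\quad
y_2 \text{ carries no information about } x_1, \text{ so the error from } (y_1,y_2) \text{ equals the error from } y_1 \text{ alone}.
\]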

Page 20: Towards denoising bounds

Questions:

• For non-parametric methods: how does the difficulty of finding nearest neighbors relate to the potential gain, and how can we make better use of a given database size?

• For any possible method: computational issues aside, what is the best possible restoration? Can we achieve zero error?
  - What is the convergence rate as a function of patch size?

Page 21: The Dead Leaves model (Matheron 68)

Image = a random collection of finite-size piecewise-constant regions.

Region intensity = a random variable with a uniform distribution.

Page 22: Optimal denoising in the Dead Leaves model

Given a segmentation oracle, the best possible denoising is to average all observations within the segment.

Expected reconstruction error:
\[
\mathrm{MMSE} = \int p(s)\,\frac{\sigma^2}{s}\,ds,
\]
where s = the number of pixels in the segment.
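The \(\sigma^2/s\) term is simply the noise variance left after averaging s independent noisy observations of the same constant intensity (standard arithmetic, added for completeness):

\[
\mathrm{Var}\!\left(\frac{1}{s}\sum_{i=1}^{s}(x + n_i) - x\right)
= \frac{1}{s^2}\sum_{i=1}^{s}\mathrm{Var}(n_i)
= \frac{\sigma^2}{s}.
\]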

Page 23: Optimal patch denoising & Dead Leaves

p(s) = the probability of a random pixel belonging to a segment of size s pixels.

For a given window size d:
- If the segment has size s smaller than d, we can only average over s pixels.
- Otherwise, we use all d pixels inside the window (but not the full segment).

\[
\mathrm{MMSE}_d
= \underbrace{\int_{s<d} p(s)\,\frac{\sigma^2}{s}\,ds}_{\text{segment area} < d}
+ \underbrace{\int_{s\ge d} p(s)\,\frac{\sigma^2}{d}\,ds}_{\text{segment area} \ge d}
\]

What is p(s)?

Page 24: Scale invariance in natural images

Down-scaling natural images does not change their statistical properties [Ruderman, Field, etc.].

Theorem: in a scale-invariant distribution, the segment size distribution must satisfy
\[
p(s) \propto \frac{1}{s}.
\]

[Plot: empirical segment size distribution and a 1/s fit (repeating the experiment of Alvarez, Gousseau, and Morel)]
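A heuristic way to see why only a 1/s law can be scale invariant (my own one-line sketch, not from the slides): ds/s is, up to a constant, the unique measure invariant under rescaling s -> a s, since

\[
\int_{s_1}^{s_2} \frac{ds}{s} = \log\frac{s_2}{s_1} = \int_{a s_1}^{a s_2} \frac{ds}{s}
\quad \text{for any } a > 0,
\]

so the fraction of pixels in segments of sizes between s_1 and s_2 depends only on the ratio s_2/s_1, exactly as scale invariance requires.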

Page 25: Optimal patch denoising & scale invariance

Substituting p(s) ∝ 1/s into the finite-window error:
\[
\mathrm{MMSE}_d
= \underbrace{\int_{s<d} p(s)\,\frac{\sigma^2}{s}\,ds}_{\text{segment area} < d}
+ \underbrace{\int_{s\ge d} p(s)\,\frac{\sigma^2}{d}\,ds}_{\text{segment area} \ge d},
\qquad p(s) \propto \frac{1}{s},
\]
gives a power-law convergence:
\[
\mathrm{MMSE}_d \approx \mathrm{MMSE}_\infty + \frac{c}{d}.
\]
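A small numeric sanity check of this power-law behavior (my own illustration, not the paper's experiment; it assumes a discrete segment-size distribution p(s) ∝ 1/s on a finite range and noise variance σ² = 1):

```python
import numpy as np

# Toy dead-leaves computation: per-pixel segment-size distribution p(s) ~ 1/s
# on a finite range of segment sizes, noise variance sigma^2 = 1 (assumed values).
s = np.arange(1, 10_001, dtype=float)
p = 1.0 / s
p /= p.sum()
sigma2 = 1.0

def mmse_d(d):
    """Oracle MMSE for a d-pixel window: average over min(s, d) pixels per segment."""
    return np.sum(p * sigma2 / np.minimum(s, d))

ds = np.array([4, 8, 16, 32, 64, 128, 256, 512], dtype=float)
vals = np.array([mmse_d(d) for d in ds])

# Fit the power-law form MMSE_d ~ e + c / d used on the following slides
# (a straight line in the variable 1/d) and inspect how well it matches.
c, e = np.polyfit(1.0 / ds, vals, 1)
print("fitted MMSE_inf ~", e, ", c ~", c)
print("fit residuals:", vals - (e + c / ds))
```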

Page 26: Empirical PSNR vs. window size

Good fit with a power law:
\[
\mathrm{MMSE}_d \approx e + \frac{c}{d}
\]

Poor fit with an exponential curve (implied by Markov models):
\[
\mathrm{MMSE}_d \approx e + c\,r^{d}
\]

[Plots: PSNR as a function of window size, with power-law (left) and exponential (right) fits]

Page 27: Extrapolating optimal PSNR

\[
\mathrm{MMSE}_d \approx \mathrm{MMSE}_\infty + \frac{c}{d}
\]

Future sophisticated denoising algorithms appear to have modest room for improvement: ~0.6-1.2 dB.
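A sketch of how such an extrapolation could be carried out (my own illustration; the window sizes and MMSE numbers below are made-up placeholders, not the paper's measurements):

```python
import numpy as np

# Hypothetical measured MMSE_d values for a few window sizes d (placeholder numbers),
# e.g. obtained with the non-parametric estimator sketched earlier.
d = np.array([9.0, 25.0, 49.0, 81.0, 121.0])        # window sizes (k x k pixels)
mmse_d = np.array([52.0, 44.0, 41.0, 39.5, 38.7])   # estimated MMSE per window size

# Fit MMSE_d ~ MMSE_inf + c / d and read off the extrapolated limit MMSE_inf.
c, mmse_inf = np.polyfit(1.0 / d, mmse_d, 1)

# Convert the best measured value and the extrapolated limit to PSNR
# (8-bit images, peak value 255) to see how much headroom remains.
psnr = lambda mse: 10.0 * np.log10(255.0 ** 2 / mse)
print("PSNR at largest window:", psnr(mmse_d[-1]))
print("extrapolated optimal PSNR:", psnr(mmse_inf))
print("remaining headroom (dB):", psnr(mmse_inf) - psnr(mmse_d[-1]))
```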

Page 28: Summary: inherent uncertainty of denoising

Non-parametric methods: Law of diminishing return

- When increasing patch size requires a significant increase in training data, the gain is low

- Correlation with new pixels makes it easier to find samples AND makes them more useful

- Adaptive denoising

For any method:

Optimal denoising error as a function of window size follows a power-law convergence

- Scale invariance, dead leaves

- Extrapolation predicts denoising bounds

Page 29: Scope and limitations

Scope:
- The MMSE is, by definition, the lowest possible MSE of any algorithm, including algorithms using object recognition, depth estimation, multiple images of the same scene, internal image statistics, etc.

Limitations:
- MSE as the error metric
- Our database
- The power-law extrapolation is a conjecture for real images