
Page 1: Inverse modeling in inverse problems using optimization

Summer School 2014: Inverse Problem and Image Processing

Inverse modeling in inverse problems

using optimization

Mila Nikolova

CMLA – CNRS, ENS Cachan, France

[email protected]

http://mnikolova.perso.math.cnrs.fr

Institute of Applied Physics and Computational Mathematics, Old yard

Beijing, 14-17 July, 2014

Page 2: Inverse modeling in inverse problems using optimization

2

object uo (scene, body, earth) → captured energy (reflected or emitted) → sampling, quantization (signal or image) → processing → data v → output u

Mathematical model: v = Transform(uo) • (Perturbations)

Some transforms: loss of pixels, blur, Fourier transform, Radon transform, frame transform (· · · )

Processing tasks: u = recover(uo), u = objects of interest(uo), (· · · )

Mathematical tools: PDEs, Statistics, Functional anal., Matrix anal., (· · · )

Page 3: Inverse modeling in inverse problems using optimization

3

Example due to R. S. Wilson

uo (unknown: signal, picture, density map), v (data, degraded) = Transform(uo) • n (noise)

An ill-posed inverse problem

uo = [ 1 1 1 1 ]T    Transform: A = [ 10  7  8  7
                                       7  5  6  5
                                       8  6 10  9
                                       7  5  9 10 ],    rank(A) = 4

• no noise: v = Auo = [ 32 23 33 31 ]T ⇒ u = A−1v = uo

• with noise: v = Auo + n = [ 32.1 22.9 33.1 30.9 ]T

Least-squares solution: u = argmin_{u∈R4} ∥Au − v∥2 = A−1v ⇒ u = [ 9.2 −12.6 4.5 −1.1 ]T

Tikhonov regularization: u = argmin_{u∈R4} Fv(u),

Fv(u) := ∥Au − v∥2 + β ∑_{i=1}^{3} (u[i+1] − u[i])2,    β = 1 ⇒ u = [ 1 1.01 1.02 0.98 ]T

[Plots against the index 1..4: left, the least-squares solution (range about −10 to 9); right, the data v (range 23 to 32).]
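A minimal numpy sketch of the computation above (matrix, data and β = 1 are the slide's values; the code is an added illustration):

```python
import numpy as np

A = np.array([[10., 7., 8., 7.],
              [ 7., 5., 6., 5.],
              [ 8., 6., 10., 9.],
              [ 7., 5., 9., 10.]])
u_o = np.ones(4)
v_exact = A @ u_o                       # [32, 23, 33, 31]
v_noisy = np.array([32.1, 22.9, 33.1, 30.9])

# Least squares: rank(A) = 4, so it reduces to u = A^{-1} v (very unstable)
u_ls = np.linalg.solve(A, v_noisy)      # ≈ [9.2, -12.6, 4.5, -1.1]

# Tikhonov with first-order differences: minimize ||Au - v||^2 + beta*||Du||^2,
# via the normal equations (A^T A + beta D^T D) u = A^T v
beta = 1.0
D = np.diff(np.eye(4), axis=0)          # 3x4 matrix computing u[i+1] - u[i]
u_tik = np.linalg.solve(A.T @ A + beta * D.T @ D, A.T @ v_noisy)
print(u_ls, u_tik)                      # u_tik ≈ [1, 1.01, 1.02, 0.98]
```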

Page 4: Inverse modeling in inverse problems using optimization

4 Image/signal processing tasks often require solving ill-posed inverse problems

Out-of-focus picture: v = a ∗ uo + noise = Auo + noise

A is ill-conditioned ≡ (nearly) noninvertible

Least-squares solution: u = argmin_u ∥Au − v∥2

Tikhonov regularization: u := argmin_u ∥Au − v∥2 + β ∑_i ∥Giu∥2 for Gi ≈ ∇, β > 0

Original uo Blur a Data v u: Least-squares u: Tikhonov

Page 5: Inverse modeling in inverse problems using optimization

5

[Three example data sets; the tomography panel is a sinogram with axis θ (degrees).]

Impulse noise Jitter (video) Radon (tomography)

⇓ ⇓ ⇓

Formulate your problem as the minimization (maximization) of a functional (an energy) whose solution is the sought-after signal/image

Page 6: Inverse modeling in inverse problems using optimization

6

Goal of this tutorial: How to choose your energy Fv?

Approach: Salient features of the minimizers of classes of energies Fv

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 7: Inverse modeling in inverse problems using optimization

7

1. Energy minimization methods

uo (unknown) v (data) = Transform(uo)• (Perturbations)

solution u

u close to the data production model: Ψ(u, v) (data-fidelity)

u coherent with priors and desiderata: Φ(u) (prior: functional, constraint)

Combining models: u := argmin_{u∈Ω} Fv(u)    (P)

Fv(u) := Ψ(u, v) + βΦ(u), β > 0

How to choose (P) to get a good u ?

Applications: Denoising, Segmentation, Deblurring, Tomography, Seismic imaging, Zoom,

Superresolution, Compression, Learning, Motion estimation, Pattern recognition (· · · )

The m× n image u is stored in a p = mn-length vector, u ∈ Rp, data v ∈ Rq

Page 8: Inverse modeling in inverse problems using optimization

8

Ψ usually models the production of data v ⇒ Ψ = − log(Likelihood(v|u))

v = Auo + n for n white Gaussian noise ⇒ Ψ(u, v) ∝ ∥Au − v∥22

The information on u that we have is implicitly contained in Ψ. It is scarcely enough.

A good prior Φ is needed to solve our task.

Φ model for the unknown u (statistics, smoothness, edges, textures, expected features)

• Bayesian approach

• Variational approach

Both approaches lead to similar energies

Prior via a regularization term: Φ(u) = ∑_i φ(∥Giu∥)

φ : R+ → R+ potential function (PF)

Gi — linear operators. Examples: Id, ∇, ∇2, ∇W̃ for W̃ a left inverse of a frame, if u = W(image)

Page 9: Inverse modeling in inverse problems using optimization

9

Bayes: U, V random variables, Likelihood fV|U(v|u), Prior fU(u) ∝ exp(−λΦ(u))

Maximum a Posteriori (MAP) yields the most likely solution u given the data V = v:

u = argmax_u fU|V(u|v) = argmin_u ( −ln fV|U(v|u) − ln fU(u) ) = argmin_u ( Ψ(u, v) + βΦ(u) ) = argmin_u Fv(u)

MAP is a very usual way to combine models on data-acquisition and priors

Finding realistic models for the data-acquisition fV|U and the prior fU is still an open question

[Plots over 1..100, values 0..20:] Original uo ∼ fU (—) and data v = uo + noise (· · · ), noise ∼ fV|U | The true MAP u (—) and the original uo (· · · )

Are you satisfied with the solution?

Page 10: Inverse modeling in inverse problems using optimization

10

• Minimizer approach (the core of our tutorials)

− Analyze the main properties exhibited by the (local) minimizers u of Fv as a function of the shape of Fv

Strong results. Rigorous tools for modelling

− Conceive Fv so that the properties of u satisfy your requirements. (a “chicken and egg” problem?)

“There is nothing quite as practical as a good theory.” Kurt Lewin

Page 11: Inverse modeling in inverse problems using optimization

11 Illustration: the role of the smoothness of Fv

stair-casing: Fv(u) = ∑_{i=1}^{p} (ui − vi)2 + β ∑_{i=1}^{p−1} |ui − ui+1|    (smooth data term, non-smooth regularization)

exact data-fit: Fv(u) = ∑_{i=1}^{p} |ui − vi| + β ∑_{i=1}^{p−1} (ui − ui+1)2    (non-smooth data term, smooth regularization)

both effects: Fv(u) = ∑_{i=1}^{p} |ui − vi| + β ∑_{i=1}^{p−1} |ui − ui+1|    (non-smooth, non-smooth)

neither effect: Fv(u) = ∑_{i=1}^{p} (ui − vi)2 + β ∑_{i=1}^{p−1} (ui − ui+1)2    (smooth, smooth)

Data (−−−), Minimizer (—)

We shall explain why these effects occur and how to use them.
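All four energies are convex, so they can be compared numerically. A minimal sketch, assuming the cvxpy package is available; the test signal is an arbitrary stand-in, not the slide's data:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
v = np.concatenate([np.zeros(20), 4 * np.ones(20)]) + 0.3 * rng.standard_normal(40)

def minimize_energy(data_l1, reg_l1, beta=1.0):
    """Minimize a fidelity + beta * regularization energy over u."""
    u = cp.Variable(v.size)
    fid = cp.norm1(u - v) if data_l1 else cp.sum_squares(u - v)
    d = cp.diff(u)                      # first-order differences u[i+1] - u[i]
    reg = cp.norm1(d) if reg_l1 else cp.sum_squares(d)
    cp.Problem(cp.Minimize(fid + beta * reg)).solve()
    return u.value

u_tv   = minimize_energy(False, True)   # smooth + non-smooth: stair-casing
u_fit  = minimize_energy(True,  False)  # non-smooth + smooth: exact data-fit
u_both = minimize_energy(True,  True)   # both effects
u_quad = minimize_energy(False, False)  # smooth + smooth: neither effect
```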

Page 12: Inverse modeling in inverse problems using optimization

12

Some energy functions

Regularization [Tikhonov, Arsenin 77]: Fv(u) = ∥Au − v∥2+β∥Gu∥2, G = I or G ≈ ∇

Focus on edges, contours, segmentation, labeling

Statistical framework

Potts model [Potts 52] (ℓ0 semi-norm applied to differences):

Fv(u) = Ψ(u, v) + β ∑_{i,j} ϕ(u[i] − u[j]),    ϕ(t) := { 0 if t = 0 ;  1 if t ≠ 0 }

Line process in Markov random field priors [Geman, Geman 84]: (u, ℓ) = argmin_{u,ℓ} Fv(u, ℓ)

Fv(u, ℓ) = Ψ(u, v) + β ∑_i ( ∑_{j∈Ni} φ(u[i] − u[j])(1 − ℓi,j) + ∑_{(k,n)∈Ni,j} V(ℓi,j, ℓk,n) )

[ ℓi,j = 0 ⇔ no edge ],  [ ℓi,j = 1 ⇔ edge between i and j ]

Page 13: Inverse modeling in inverse problems using optimization

13

Image credits: S. Geman and D. Geman 1984. Restoration with 5 labels using the Gibbs sampler.

“We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model.” [S. Geman, D. Geman 84]

Page 14: Inverse modeling in inverse problems using optimization

14 PDEs framework for Φ(u)

M.-S. functional [Mumford, Shah 89]: Fv(u, L) = ∫_Ω (u − v)2 dx + β ( ∫_{Ω\L} ∥∇u∥2 dx + α|L| )

discrete version: Φ(u) = ∑_i φ(∥Giu∥), φ(t) = min{t2, α}, Gi ≈ ∇

Total Variation (TV) [Rudin, Osher, Fatemi 92]: Fv(u) = ∥u − v∥22 + β TV(u)

TV(u) = ∫ ∥∇u∥2 dx ≈ ∑_i ∥Giu∥2

Various edge-preserving functions φ (t ↦ φ(t)) to define Φ

φ is edge-preserving if lim_{t→∞} φ′(t)/t = 0    [Charbonnier, Blanc-Féraud, Aubert, Barlaud 97 ...]

Minimizer approach

ℓ1 data-fidelity [Nikolova 02]: Fv(u) = ∥Au − v∥1 + βΦ(u)

L1 − TV model [T. Chan, Esedoglu 05]: Fv(u) = ∥u − v∥1 + β TV(u)

CPU time! Computers ↑↑

Page 15: Inverse modeling in inverse problems using optimization

15 Original uo | Data v = a ∗ uo + n

Fv(u) = ∥Au − v∥2 + β ∑_i φ((∇u)[i])

convex φ: φ(t) = |t|^α, α ∈ (1, 2) (smooth at 0) | φ(t) = |t| (nonsmooth at 0)

nonconvex φ: φ(t) = αt2/(1 + αt2) (smooth at 0) | φ(t) = α|t|/(1 + α|t|) (nonsmooth at 0)

nonconvex φ: φ(t) = min{αt2, 1} | φ(t) = 1 − 1l(t=0)

[Restorations shown with cross-sections of rows 54 and 90.]

Page 16: Inverse modeling in inverse problems using optimization

16

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 17: Inverse modeling in inverse problems using optimization

17

2 Regularity Results

Optimization problems

[Six sketches of Fv over Ω:]

Fv nonconvex: two local minimizers | Fv convex, non-coercive, Ω = R: no minimizer | Fv convex, non-coercive, Ω compact

Fv strictly convex, Ω nonconvex | Fv non strictly convex: a set of minimizers | Fv strictly convex on R

Page 18: Inverse modeling in inverse problems using optimization

18

Fv : Ω → R, Ω ⊂ Rp

• Set of optimal solutions: Û = { û ∈ Ω : Fv(û) ≤ Fv(u) ∀u ∈ Ω }

Û = {û} if Fv is strictly convex

Û ≠ ∅ if Fv is coercive, or if Fv is continuous and Ω compact

Otherwise – check (e.g. see if Fv is asymptotically level stable [Auslender, Teboulle 03])

• Nonconvex problems:

Algorithms may get trapped in local minima

A “good” local minimizer can be satisfying

Global optimization – difficult, but progress, e.g. [Robini, Reissman JGO 13]

Convex relaxation methods, see, e.g., [Yuan, Bae, Tai CVPR 10]

• Attention to numerical errors

Page 19: Inverse modeling in inverse problems using optimization

19

Definition: U : O → Rp, O ⊂ Rq open, is a (local) minimizer function for F_O := {Fv : v ∈ O} if Fv has a strict (local) minimum at U(v), ∀v ∈ O

Minimizer functions – a useful tool to analyze the properties of minimizers...

Fv(u) = (u − v)2 + β√(α + u2) | Fv(u) = (u − v)2 + β αu2/(1 + αu2) | Fv(u) = (u − v)2 + β α|u|/(1 + α|u|)

minimizer function (••••) | local minimizer functions (••••) | global minimizer function (••••)

Each blue curve: u ↦ Fv(u) for v ∈ {0, 2, · · · }

Question 1 What do these plots reveal about the local / global minimizer functions?

Page 20: Inverse modeling in inverse problems using optimization

20

Fv(u) = ∥Au − v∥22 + βΦ(u),  Φ(u) = ∑_i φ(∥Giu∥2),  u ∈ Rp, v ∈ Rq

φ : R+ → R increasing, continuous, φ(t) > φ(0) ∀t > 0

Gi linear operators Rp → Rs, s ≥ 1

φ′(0+) > 0 ⇒ Φ is nonsmooth on ∪_i { u : Giu = 0 }

Systematically: ker A ∩ ker G = {0},  G := [ G1 ; G2 ; · · · ]

Recall:

Fv has a (local) minimum at u ⇒ δFv(u)(d) = lim_{t↓0} ( Fv(u + td) − Fv(u) ) / t ≥ 0, ∀d ∈ Rp

Fv nonconvex ⇒ there may be many local minima

Page 21: Inverse modeling in inverse problems using optimization

21

• N = { (s, t) : t = ± arctan(s) }

• N is closed in R2 and its Lebesgue measure in R2 is L2(N) = 0

• (x, y) drawn at random in R2

Question 2 What is the chance that (x, y) ∈ N?

Page 22: Inverse modeling in inverse problems using optimization

22

Stability of the minimizers of Fv [Durand & Nikolova 06]

Assumptions: φ : R+ → R is continuous and Cm, m ≥ 2, on R+ \ {θ1, · · · , θn}; edge-preserving, possibly non-convex; and rank(A) = p

A. Local minimizers

(knowing local minimizers is important)

There is a closed N ⊂ Rq with Lebesgue measure Lq(N) = 0 such that ∀v ∈ Rq \N ,

every (local) minimizer u of Fv is given by u = U(v) where U is a Cm−1 (local) minimizer

function.

Question 3 For v ∈ Rq \N, compare U(v) and U(v + ε) where ε ∈ Rq is small enough.

B. Global minimizers

• ∃ N ⊂ Rq with Lq(N) = 0 and Int(Rq \ N) dense in Rq such that ∀v ∈ Rq \ N, Fv has a unique global minimizer.

• There is an open subset of Rq \ N, dense in Rq, where the global minimizer function U is Cm−1-continuous.

Question 4 What is the chance that v ∈ N? What can happen if v ∈ N?

Page 23: Inverse modeling in inverse problems using optimization

23

Nonasymptotic bounds on minimizers [Nikolova 07]

Classical bounds hold for β → 0 or β → ∞

Assumption: φ is piecewise C1

• φ is strictly increasing, or rank(A) = p:

u is a (local) minimizer of Fv ⇒ ∥Au∥ ≤ ∥v∥

• ∥φ′∥∞ = constant (φ is edge-preserving) and rank(A) = q ≤ p:

u is a (local) minimizer of Fv ⇒ ∥v − Au∥∞ ≤ (β/2) ∥φ′∥∞ ∥(AA∗)−1A∥∞ ∥G∥1

∥φ′∥∞ = 1, A = Id and G — 1st-order differences:

signal ⇒ ∥v − u∥∞ ≤ β ;  image ⇒ ∥v − u∥∞ ≤ 2β

Question 5 If v = uo + n for n Gaussian noise, is it possible to clean v

from this noise by minimizing Fv? (See Ψ on p. 8.)

Page 24: Inverse modeling in inverse problems using optimization

24

Non-Smooth Energies, Side Derivatives, Subdifferential

Rademacher’s theorem: If Fv : Rp → R is Lipschitz continuous, then Fv is differentiable (in

the usual sense) almost everywhere in Rp.

A kink is a point u where ∇Fv(u) is not defined (in the usual sense).

Example: Fv(u) = (1/2)(u − v)2 + β|u| for β = 1 > 0 and u, v ∈ R

[Top row: plots of u ↦ Fv(u) on [−1, 1] for v = −0.9, v = −0.2, v = 0.95, v = 1.1. Bottom row: a related plot for each v (see Question 6).]

u = { v + β if v < −β ;  0 if |v| ≤ β ;  v − β if v > β }

Question 6 Comment on the minimizers in the 1st row. What is drawn on the 2nd row?
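The closed-form minimizer above is soft thresholding; a one-line numpy sketch (an added illustration, with the slide's β = 1 and values of v):

```python
import numpy as np

def soft_threshold(v, beta):
    """Minimizer of 0.5*(u - v)**2 + beta*|u| over u in R."""
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

print(soft_threshold(np.array([-0.9, -0.2, 0.95, 1.1]), 1.0))  # [0, 0, 0, 0.1]
```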

Page 25: Inverse modeling in inverse problems using optimization

25

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 26: Inverse modeling in inverse problems using optimization

26

3 Minimizers under Non-Smooth Regularization

Fv(u) = Ψ(u, v) + β ∑_{i=1}^{r} φ(∥Giu∥),  Ψ ∈ Cm with m ≥ 2,  φ ∈ Cm(R∗+),  0 < φ′(0+) ≤ ∞

φ(t): t^α, α ∈ (0, 1) | αt/(αt + 1) | ln(αt + 1) | 1 − α^t, α ∈ (0, 1) | (· · · ), α > 0

3

t

φ

α = 0.6

0 10

1

t

α = 4

0 10

2

t

φ

α = 2

0 10

1

t

φ

α = 0.5

0 10

inf

φ′

0 10

4

φ′

0 10

2

φ′

0 10

0.7

φ′

φ(t) = t and Giu ≈ (∇u)i ⇒ Φ(u) = TV(u) (total variation) [Rudin, Osher, Fatemi 92]

Page 27: Inverse modeling in inverse problems using optimization

27

General case: Fv(u) = Ψ(u, v) + β ∑_{i=1}^{r} φ(∥Giu∥),  Ψ ∈ Cm with m ≥ 2,  φ′(0+) > 0   [Nikolova 97, 00]

Let u be a (local) minimizer of Fv. Set h := { i : Giu = 0 }.
Then ∃ O ⊂ Rq open, ∃ U ∈ Cm−1 (local) minimizer function such that

v′ ∈ O, u′ = U(v′) ⇒ Giu′ = 0, ∀i ∈ h

∀ h ⊂ {1, .., r}:  Oh := { v ∈ Rq : GiU(v) = 0, ∀i ∈ h } ⇒ Lq(Oh) > 0

Data v yield (local) minimizers u of Fv such that Giu = 0 for a set of indexes h

Gi = ∇i ⇒ u[i] = u[j] for many neighbors (i, j) (the “stair-casing” effect)

Giu = u[i] ⇒ many samples u[i] = 0 – highly used in Compressed Sensing

Question 7 What happens if the Gi yield second-order differences?

The property fails if Fv is smooth, except for v ∈ N where N is closed and Lq(N) = 0.
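A small numerical illustration of the property (assuming cvxpy; the signal is a synthetic stand-in): with first-order differences Gi, minimizing a TV-regularized energy produces many differences that are exactly zero, up to solver tolerance:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
v = np.repeat([0., 3., 1.], 30) + 0.2 * rng.standard_normal(90)

u = cp.Variable(90)
cp.Problem(cp.Minimize(cp.sum_squares(u - v) + 2.0 * cp.norm1(cp.diff(u)))).solve()

d = np.diff(u.value)
# most differences vanish: the minimizer is piecewise constant (stair-casing)
print("zero differences:", np.sum(np.abs(d) < 1e-6), "of", d.size)
```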

Page 28: Inverse modeling in inverse problems using optimization

28

[Denoising of a noisy step signal (plots over 1..100, values 0..4) for four PFs:]

φ(t) = √(α + t2), φ′(0) = 0 (smooth at 0) | φ(t) = (t + α sign(t))2, φ′(0+) = 2α

φ(t) = |t|, φ′(0+) = 1 | φ(t) = α|t|/(1 + α|t|), φ′(0+) = α

Fv(u) = ∥u − v∥2 + β ∑ φ(|u[i] − u[i−1]|)

Page 29: Inverse modeling in inverse problems using optimization

29

TV energy: Fv(u) = ∥Au− v∥2 + βTV(u)

Original Data Restored: TV energy

Image credit to the authors: D. C. Dobson and F. Santosa, “Recovery of blocky images

from noisy and blurred data”, SIAM J. Appl. Math., 56 (1996), pp. 1181-1199.

Page 30: Inverse modeling in inverse problems using optimization

30

Questions to clarify the main property

Let uo ∈ R and pdf(uo) = (1/2) e^(−|uo|) (Laplacian distribution)

Question 8 Give Pr(uo = 0).

Let v = uo + n where pdf(n) = (1/(σ√(2π))) e^(−n2/(2σ2)) (centered Gaussian distribution)

The corresponding MAP energy to recover uo from v reads

Fv(u) = (1/2)(u − v)2 + β|u| for β = σ2

Question 9 Give the minimizer function U for Fv.

Useful reminder on p. 24.

Question 10 Determine the set { ν ∈ R : U(ν) = 0 }. Comment on the result.

Page 31: Inverse modeling in inverse problems using optimization

31

Disparity estimation

(a) Left input image (b) Right input image (c) True disparity

Figure 7. Rectified stereo image pair and the ground-truth disparity. Light gray pixels indicate structures near to the camera, and black pixels correspond to unknown disparity values.

[Regularizer profiles on [−2, 2]:] quadratic | TV | Huber | Lipschitz

Image credits to the authors: Pock, Cremers, Bischof, and Chambolle “Global Solutions of

Variational Models with Convex Regularization”, SIIMS 3(4) 2010, pp. 1122-1145

Page 32: Inverse modeling in inverse problems using optimization

32

Minimization of Fv(u) = ∥u− v∥22 + βTV(u), β = 100 and β = 180

Page 33: Inverse modeling in inverse problems using optimization

33

Questions relevant to the Potts model (see p. 12)

Here φ(t) = { 0 if t = 0 ;  1 if t ≠ 0 }

Question 11 Compute the global minimizer of Fv(u) = (u− v)2 + βφ(u) for u, v ∈ R

and β > 0, according to the value of v.

Consider Fv(u) = ∥u − v∥22 + β ∑_{i=1}^{p} φ(u[i]) for β > 0 and u, v ∈ Rp.

Note: ∑_{i=1}^{p} φ(u[i]) = #{ i : u[i] ≠ 0 } = ℓ0(u) is the counting norm.

The global minimizer function U : Rp → Rp for Fv has p components which depend on v.

Question 12 Compute each component Ui.

Question 13 Let h ⊂ {1, · · · , p}. Determine the subset Oh ⊂ Rp such that if v ∈ Oh then the global minimizer u of Fv satisfies u[i] = 0, ∀i ∈ h, and u[i] ≠ 0 if i ∉ h.

Page 34: Inverse modeling in inverse problems using optimization

34

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 35: Inverse modeling in inverse problems using optimization

35

4 Minimizers relevant to non-smooth data-fidelity

General case [Nikolova 02]

Fv(u) = ∑_i ψ(|aiu − v[i]|) + βΦ(u),  ai = row i of A,  Φ ∈ Cm,  ψ ∈ Cm(R∗+),  ψ′(0+) > 0

Let u be a (local) minimizer of Fv. Set h := { i : aiu = v[i] }.
Then ∃ O ⊂ Rq open, ∃ U ∈ Cm−1 (local) minimizer function such that

v′ ∈ O, u′ = U(v′) ⇒ aiu′ = v′[i], ∀i ∈ h

∀ h ⊂ {1, .., q}:  Oh := { v ∈ Rq : aiU(v) = v[i], ∀i ∈ h } ⇒ Lq(Oh) > 0

(Local) minimizers u of Fv achieve an exact fit to (noisy) data

aiu = v[i] for a certain number of indexes i

Property fails if F is fully smooth, except for v ∈ N where N is closed and Lq(N) = 0.

Page 36: Inverse modeling in inverse problems using optimization

36

Question 14 Suggest cases when you would like that your minimizer obeys this property.

Question 15 Compute the minimizer of Fv(u) = |u− v|+ βu2 for u, v ∈ R and β > 0.

Question 16 Can you find a relationship between the properties of the minimizer when φ′(0+) > 0 (chapter 3, p. 26) and when ψ′(0+) > 0 (chapter 4, p. 35)?

Page 37: Inverse modeling in inverse problems using optimization

37

Original uo Data v = uo+outliers

Restoration u for β = 0.14 Residuals v − u

Fv(u) = ∑_i |u[i] − v[i]| + β ∑_i ∑_{j∈Ni} |u[i] − u[j]|^1.1

Page 38: Inverse modeling in inverse problems using optimization

38

Restoration u for β = 0.25 | Residuals v − u

Fv(u) = ∑_i |u[i] − v[i]| + β ∑_i ∑_{j∈Ni} |u[i] − u[j]|^1.1

Restoration u for β = 0.2 | Residuals v − u

TV-like energy: Fv(u) = ∑_i (u[i] − v[i])2 + β ∑_i ∑_{j∈Ni} |u[i] − u[j]|

Page 39: Inverse modeling in inverse problems using optimization

39

Detection and cleaning of outliers using ℓ1 data-fidelity [Nikolova 04]

Fv(u) = ∑_{i=1}^{p} |u[i] − v[i]| + (β/2) ∑_{i=1}^{p} ∑_{j∈Ni} φ(|u[i] − u[j]|)

φ: smooth, convex, edge-preserving

Assumptions: the data v contain uncorrupted samples v[i]; v[i] is an outlier if |v[i] − v[j]| ≫ 0, ∀j ∈ Ni

v ∈ Rp ⇒ u = argmin_u Fv(u),  h = { i : u[i] = v[i] }

v[i] is regular if i ∈ h;  v[i] is an outlier if i ∈ hc

Outlier detector: v ↦ hc(v) = { i : u[i] ≠ v[i] }
Smoothing: u[i] for i ∈ hc = estimate of the outlier

Justification based on the properties of u

Page 40: Inverse modeling in inverse problems using optimization

40

L. Bar, A. Brook, N. Sochen and N. Kiryati,

“Deblurring of Color Images Corrupted by Impulsive Noise”,

IEEE Trans. on Image Processing, 2007

Fv(u) = ∥Au− v∥1 + βΦ(u)

blurred, noisy (r.-v.) zoom - restored

Page 41: Inverse modeling in inverse problems using optimization

41

Recovery of frame coefficients using ℓ1 data-fitting [Durand, Nikolova 07]

• Data: v = uo + noise

• Frame coefficients: y = Wv = Wuo + noise,  W̃ = a left inverse of W

• Hard thresholding: yT[i] := { 0 if |y[i]| ≤ T ;  y[i] if |y[i]| > T } keeps the relevant information if T is small

• u = W̃yT — Gibbs oscillations and wavelet-shaped artifacts

• Hybrid energy methods — combine fitting to yT with a prior Φ(u)

[Bobichon, Bijaoui 97], [Coifman, Sowa 00], [Durand, Froment 03]...
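In code, hard thresholding is a one-liner (numpy sketch, added for illustration):

```python
import numpy as np

def hard_threshold(y, T):
    """Keep y[i] where |y[i]| > T, set it to 0 otherwise."""
    return np.where(np.abs(y) > T, y, 0.0)
```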

Page 42: Inverse modeling in inverse problems using optimization

42

Desiderata: Fy convex and

Keep x[i] = yT[i] for significant coefs (y[i] ≈ (Wuo)[i]) and for thresholded coefs ((Wuo)[i] ≈ 0)

Restore x[i] ≠ yT[i] for outliers (|y[i]| ≫ |(Wuo)[i]|, frame-shaped artifacts) and for edge coefs (|(Wuo)[i]| > |yT[i]| = 0, “Gibbs” oscillations)

Then: minimize  Fy(x) = ∑_i λi |(x − yT)[i]| + ∫_Ω φ(|∇W̃x|)  ⇒  x

u = W̃x for W̃ a left inverse of W, φ edge-preserving

Question 17 Explain why the minimizers of Fy fulfill the desiderata.

Hint: “good” coefficients fitted exactly, “bad” coefficients corrected according to the prior.

Page 43: Inverse modeling in inverse problems using optimization

43

[Plots over 1..500, values 0..100:]

Original and data | Sure-shrink method | Hard thresholding

Total variation | The proposed method | Magnitude of coefficients (original ×, threshold ∗, restored; zoom on 410–425)

Restored signal (—), original signal (- -).

Page 44: Inverse modeling in inverse problems using optimization

44

Fast 2-stage restoration under impulse noise [R.Chan, Nikolova et al. 04,05,08]

1. Approximate the outlier-detection stage by rank-order filter

(e.g. adaptive or center-weighted median)

Corrupted pixels: hc = { i : v[i] ≠ ṽ[i] } where ṽ = Rank-Order Filter(v)

⇒ improved speed and accuracy

2. Restore u (denoise, deblur) using an edge-preserving energy method

subject to aiu = v[i] for all i ∈ h
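A rough Python sketch of stage 1 (assumptions: scipy's median filter stands in for the rank-order filter; the restoration stage is only indicated, since the edge-preserving energy is chosen separately):

```python
import numpy as np
from scipy.ndimage import median_filter

def detect_corrupted(v, size=3):
    """Stage 1: flag pixels that a rank-order (here: median) filter changes."""
    v_tilde = median_filter(v, size=size)
    return v != v_tilde          # boolean mask of suspected outliers (h^c)

# Stage 2 (not shown): minimize an edge-preserving energy over the flagged
# pixels only, under the constraints u[i] = v[i] for every unflagged pixel i.
```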

Page 45: Inverse modeling in inverse problems using optimization

45

50% RV noise | ACWMF | DPVM | Our method

70% SP noise (6.7 dB) | Adapt. med. (25.8 dB) | Our method (29.3 dB) | Original Lena

Page 46: Inverse modeling in inverse problems using optimization

46

One-step real-time dejittering of digital video [Nikolova 09]

• Image u ∈ Rm×n, rows ui, its pixels ui[j]

• Data: vi[j] = ui[j + di], di integer, |di| ≤ M, typically M ≤ 20.

• Restore u ≡ restore di, 1 ≤ i ≤ m

[Figure: original and jittered image; the same column in the original (left) and in the jittered (right) image.]

The gray-values of the columns of natural images can be seen as large pieces of 2nd- (or 3rd-) order polynomials, which is false for their jittered versions.

Page 47: Inverse modeling in inverse problems using optimization

47

Each row ui is restored using di = argmin_{|di|≤N} F(di)

F(di) = ∑_{j=N+1}^{c−N} | vi[j + di] − 2ui−1[j] + ui−2[j] |^α,  α ∈ {0.5, 1},  N ≥ M

Question 18 Explain why the minimizers of F can solve the problem as stated.

Question 19 What changes if α = 1 or if α = 0.5?

Question 20 Is it easy to solve the numerical problem?

A Monte-Carlo experiment shows that in almost all cases α = 0.5 is better.

[Results: jittered on [−20, 20], α = 1 | jitter 6 sin(n/4), α = 1 ⇒ restored ≡ original]
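A sketch of the row-by-row search in numpy (array conventions and border handling are simplifying assumptions of mine; rows are assumed restored top to bottom):

```python
import numpy as np

def dejitter_row(v_row, u_prev1, u_prev2, N=20, alpha=0.5):
    """Restore one row: exhaustive search of the shift d over |d| <= N,
    comparing against the two previously restored rows."""
    j = np.arange(N + 1, v_row.size - N)          # interior columns
    costs = [np.sum(np.abs(v_row[j + d] - 2 * u_prev1[j] + u_prev2[j]) ** alpha)
             for d in range(-N, N + 1)]
    d = np.argmin(costs) - N
    return np.roll(v_row, -d)                     # undo the shift (borders wrap)
```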

Page 48: Inverse modeling in inverse problems using optimization

48

[Results (original above, restored below):]

Jittered {−8, . . . , 8} | Original image | α = 1 | Zooms

Lena (512×512), jitter M = 6, α ∈ {1, 1/2} ⇒ restored ≡ original | Lena (256×256), jitter {−6, .., 6}, α ∈ {1, 1/2}

Page 49: Inverse modeling in inverse problems using optimization

49

Jitter -15,..,15 α = 1, α = 0.5 Original image

Page 50: Inverse modeling in inverse problems using optimization

50

Jitter | Jittered image | Bayesian TV | Bake & Shake

Original | Our: α = 0.5 | Our: error uo − u

[Kokaram 98, Laborelli 03, Shen 04, Kang 06, Scherzer 11]

Page 51: Inverse modeling in inverse problems using optimization

51

5. Comparison with Fully Smooth Energies

Fv(u) = Ψ(u, v) + βΦ(u), Fv ∈ Cm with m ≥ 2, plus easy assumptions. If h ≠ ∅, then the sets

{ v ∈ Rq : Fv has a minimum at u with Giu = 0, ∀i ∈ h }  and  { v ∈ Rq : Fv has a minimum at u with aiu = v[i], ∀i ∈ h }

are closed and negligible in Rq.

For Fv smooth, the chance that noisy data v yield a minimizer u of Fv which for some i satisfies exactly Giu = 0 or aiu = v[i] is negligible.

Nearly all v ∈ Rq lead to u = U(v) satisfying Giu ≠ 0, ∀i, and aiu ≠ v[i], ∀i

Question 21 What are the consequences if one approximates a nonsmooth energy

by a smooth energy?

Page 52: Inverse modeling in inverse problems using optimization

52

Questions to clarify the theoretical results

Let u ∈ Rp and v ∈ Rq.

Consider that A ∈ Rq×p and G ∈ Rr×p satisfy ker(A) ∩ ker(G) = 0.

Fv(u) = ∥Au− v∥22 + β∥Gu∥22 for β > 0

Question 22 Calculate ∇Fv(u).

Question 23 Determine the minimizer function U .

Let Gi ∈ R1×p denote the ith row of G.

Question 24 Characterize the set K = { ν ∈ Rq : GiU(ν) = 0 }.

Let ai ∈ R1×p denote the ith row of A.

Question 25 Characterize the set L = { ν ∈ Rq : aiU(ν) = ν[i] }.

Page 53: Inverse modeling in inverse problems using optimization

53

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 54: Inverse modeling in inverse problems using optimization

54

6 Nonconvex Regularization: Why Are Edges Sharp? [Nikolova 04, 10]

Fv(u) = ∥Au − v∥2 + β ∑_{i∈J} φ(∥Giu∥),  J = {1, · · · , r}

Standard assumptions on φ: C2 on R+ and lim_{t→∞} φ″(t) = 0, as well as either

φ′(0) = 0 (Φ is smooth)  or  φ′(0+) > 0 (Φ is nonsmooth)

[Plots of φ and φ″: φ(t) = αt2/(1 + αt2), with φ″ changing sign at an inflection (τ, T marked); φ(t) = αt/(1 + αt), with φ″ < 0 everywhere.]

Page 55: Inverse modeling in inverse problems using optimization

55

Illustration on R:  Fv(u) = (u − v)2 + βφ(|u|), u, v ∈ R

[Plots of u ↦ u + (β/2)φ′(u) with thresholds θ0 < θ1 and ξ0 < ξ1, for φ′(0) = 0 and for φ′(0+) > 0.]

No local minimizer in (θ0, θ1)

∃ ξ0 > 0, ∃ ξ1 > ξ0:
|v| ≤ ξ1 ⇒ |u0| ≤ θ0 (strong smoothing)
|v| ≥ ξ0 ⇒ |u1| ≥ θ1 (loose smoothing)

∃ ξ ∈ (ξ0, ξ1):
|v| < ξ ⇒ global minimizer = u0 (strong smoothing)
|v| > ξ ⇒ global minimizer = u1 (loose smoothing)

For v = ξ the global minimizer jumps from u0 to u1 ≡ decision for an “edge”

Since [Geman & Geman 1984], various nonconvex Φ have been used to produce minimizers with smooth regions and sharp edges

Page 56: Inverse modeling in inverse problems using optimization

56

Sharp edge property

There exist θ0 ≥ 0 and θ1 > θ0 such that any (local) minimizer u of Fv satisfies

either ∥Giu∥ ≤ θ0 or ∥Giu∥ ≥ θ1, ∀i ∈ J

h0 = { i : ∥Giu∥ ≤ θ0 } (homogeneous regions)
h1 = { i : ∥Giu∥ ≥ θ1 } (edges)

When β increases, θ0 decreases and θ1 increases.

In particular, φ′(0+) > 0 ⇒ θ0 = 0: fully segmented image (Giu = 0, ∀i ∈ h0)

Question 26 Explain the prior model involved in Fv when φ is nonconvex, with φ′(0) = 0 and with φ′(0+) > 0.

Page 57: Inverse modeling in inverse problems using optimization

57

Image Reconstruction in Emission Tomography

Original phantom | Emission tomography simulated data [intensity scale 0–4]

Reconstructions using Fv(u) = Ψ(u, v) + β ∑_i ∑_{j∈Ni} φ(|u[i] − u[j]|), Ψ smooth, convex:

φ smooth (Huber function) | φ(t) = t/(α + t) (non-smooth, non-convex)

Page 58: Inverse modeling in inverse problems using optimization

58

Selection for the global minimizer

Additional assumptions: ∥φ∥∞ < ∞, Gi — 1st-order differences, A∗A invertible

1lΣ[i] = { 1 if i ∈ Σ ⊂ {1, .., p} ;  0 else }

Original: uo = ξ1lΣ, ξ > 0.  Data: v = ξA1lΣ = Auo.  u = global minimizer of Fv.

Sketch of the results:

∃ ξ1 > 0 such that ξ ≥ ξ1 ⇒ u has perfect edges

Moreover:

• Φ nonsmooth: ξ ≥ ξ1 ⇒ u = c·uo, c < 1, lim_{ξ→∞} c = 1

• φ(t) = η for t ≥ η (constant tail): ξ ≥ ξ1 ⇒ u = uo

This holds true also for φ(t) = min{αt2, 1} and for φ(t) = { 0 if t = 0 ;  1 if t ≠ 0 }

Page 59: Inverse modeling in inverse problems using optimization

59

Comparison with Convex Edge-Preserving Regularization

[Denoising of a noisy step signal (plots over 1..100, values 0..4):]

Data v = uo + n | φ(t) = |t| | φ(t) = α|t|/(1 + α|t|)

original data | φ(t) = |t|^1.4 | φ(t) = min{αt2, 1}

Question 27 Why are edges sharper when φ is nonconvex?

Page 60: Inverse modeling in inverse problems using optimization

60

[Plots of u ↦ Fv(u) for v ∈ {0, 2, · · · , 22}, with θ0, θ1 marked:]

Fv(u) = (u − v)2 + β α|u|/(1 + α|u|) | Fv(u) = (u − v)2 + β αu2/(1 + αu2) | Fv(u) = (u − v)2 + β√(α + u2)

global minimizer function (••••) | global minimizer functions (••••) | unique minimizer function (••••)

Each blue curve: u ↦ Fv(u) for v ∈ {0, 2, · · · }

Question 28 How does the global minimizer evolve when v increases?

Page 61: Inverse modeling in inverse problems using optimization

61

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 62: Inverse modeling in inverse problems using optimization

62

7. Nonsmooth data-fidelity and regularization

Consequence of §3 and §4: if Φ and Ψ are both non-smooth,

Giu = 0 for i ∈ hφ ≠ ∅  and  aiu = v[i] for i ∈ hψ ≠ ∅

The L1-TV energy

T. F. Chan and S. Esedoglu, “Aspects of Total Variation Regularized L1 Function

Approximation”, SIAM J. on Applied Mathematics, 2005

Fv(u) = ∥u − 1lΩ∥1 + β ∫_{Rd} ∥∇u(x)∥2 dx  where  1lΩ(x) := { 1 if x ∈ Ω ;  0 else }

• ∃ u = 1lΣ (Ω convex ⇒ Σ ⊂ Ω, and u is unique for almost every β > 0)

• contrast invariance: if u minimizes Fv for v = 1lΩ, then cu minimizes Fcv:
the contrast of image features is more important than their shapes

• critical values β∗: β < β∗ ⇒ objects in u keep a good contrast; β > β∗ ⇒ they suddenly disappear
⇒ data-driven scale selection

Page 63: Inverse modeling in inverse problems using optimization

63

Binary images by L1 − TV [T. Chan, S. Esedoglu, Nikolova 06]

Classical approach to find a binary image u = 1lΣ from binary data 1lΩ, Ω ⊂ R2

Σ = argmin_Σ ∥1lΣ − 1lΩ∥22 + β TV(1lΣ) — a nonconvex problem (⋆); usual techniques (curve evolution, level-sets) fail

Σ solves (⋆) ⇔ u = 1lΣ minimizes ∥u − 1lΩ∥1 + β TV(u) (convex)

Data Restored

Page 64: Inverse modeling in inverse problems using optimization

64

Multiplicative noise removal on Frame coefficients [Durand, Fadili, Nikolova 09]

Multiplicative noise arises in various active imaging systems e.g. synthetic aperture radar

• Original image: So

• One shot: Σk = So ηk

• Data: Σ = (1/K) ∑_{k=1}^{K} Σk = So (1/K) ∑_{k=1}^{K} ηk = So η  where pdf(η) = Gamma density

• Log-data: v = log Σ = log So + log η = u0 + n

• Frame coefficients: y = Wv = Wu0 + Wn (W curvelets)

[Plots: pdf(η) = pdf(ηk), pdf(n), pdf(Wn) for K = 1; pdf(η), pdf(n), pdf(Wn) for K = 10.]

Question 29 Comment on the noise distribution of Wn.

Page 65: Inverse modeling in inverse problems using optimization

65

• Hard Thresholding: yT[i] = { 0 if |y[i]| ≤ T ;  y[i] otherwise }, ∀i ∈ I, T > 0 (suboptimal).

I1 = { i ∈ I : |y[i]| > T } and I0 = I \ I1

• Restored coefficients: x = argmin_x Fy(x)  (ℓ1 − TV energy)

Fy(x) = λ0 ∑_{i∈I0} |x[i]| + λ1 ∑_{i∈I1} |x[i] − y[i]| + ∥W̃x∥TV

S = B exp(W̃x), where W̃ is a left inverse of W and B a bias correction

Question 30 Explain the job the minimizer x of Fy should do.

Some comparisons:

• BS [Chesneau, Fadili, Starck 08]: Block-Stein thresholding of the curvelet coefficients, ≈ minimax over a large class of images with additive noise; optimal threshold T = 4.50524

• AA [Aubert, Aujol 08]: Ψ = − Log-Likelihood(Σ), Φ = TV(Σ) (i.e. Fv ≡ MAP for Σ)

• SO [Shi, Osher 08]: relaxed inverse scale-space for Fv(u) = ∥v − u∥22 + β TV(u) ≈ MAP(u); stopping rule: k∗ = max{ k ∈ N : Var(u(k) − uo) ≥ Var(n) }

A Monte-Carlo comparative experiment confirms the proposed method

Page 66: Inverse modeling in inverse problems using optimization

66

Noisy Fields K = 1 (512×512) SO: PSNR=9.59, MAE=196 AA: PSNR=15.74, MAE=76.66

BS: PSNR=22.52, MAE=35.22 Fields (original) Our: PSNR=22.89, MAE=33.67

Page 67: Inverse modeling in inverse problems using optimization

67

Noisy K = 10 SO: PSNR=25.36, MAE=25.14 AA: PSNR=17.13, MAE=65.40

BS: PSNR=27.24, MAE=19.61 Fields (original) Our: PSNR=28.04, MAE=18.19

Page 68: Inverse modeling in inverse problems using optimization

68

Noisy City K = 1 (512×512) SO: PSNR=18.39, MAE=24.08 AA: PSNR=22.18, MAE=13.71

BS: PSNR=22.25, MAE=13.96 City (original) Our: PSNR=22.64, MAE=13.39

Page 69: Inverse modeling in inverse problems using optimization

69

Noisy K = 4 SO: PSNR=24.40, MAE=10.76 AA: PSNR=24.55, MAE=10.06

BS: PSNR=24.92, MAE=9.87 City (original) Our: PSNR=25.84, MAE=9.09

Page 70: Inverse modeling in inverse problems using optimization

70

C. Clason, B. Jin, K. Kunisch

“Duality-based splitting for fast ℓ1 − TV image restoration”, 2012,

http://math.uni-graz.at/optcon/projects/clason3/

Scanning transmission electron microscopy (2048× 2048 image)

true image noisy image restoration

Page 71: Inverse modeling in inverse problems using optimization

71

ℓ1 data-fidelity with concave regularization [Nikolova, Ng, Tam 12]

Fv(u) = ∑_{i∈I} |aiu − v[i]| + β ∑_{j∈J} φ(∥Gju∥2),  φ′(0+) > 0,  φ″(t) < 0 ∀t > 0

I = {1, · · · , q},  J = {1, · · · , r};  φ is strictly concave on [0, +∞).

φ(t): αt/(αt + 1) | 1 − α^t, α ∈ (0, 1) | ln(αt + 1) | (t + ε)^α, α ∈ (0, 1), ε > 0 | (· · · )

[Plots of φ for α = 4 | α = 0.5 | α = 2 | α = 0.3, ε = 0.02.]

Motivation

• This family of objective functions had never been considered before

• Fv can be seen as an extension of L1 − TV

• u a (local) minimizer of Fv ⇒ many i, j such that aiu = v[i] and Gju = 0

Page 72: Inverse modeling in inverse problems using optimization

72

Minimizers of Fv(u) = ∥u − v∥1 + β ∑_{i=1}^{p−1} φ(|u[i + 1] − u[i]|)

φ(t) = αt/(αt + 1) for α = 4 | φ(t) = ln(αt + 1) for α = 2

[Plots over 1..71:]
β ∈ {78, · · · , 156} | β ∈ 0.1 × {10, · · · , 14}
β ∈ {157, · · · , 400} | β ∈ 0.1 × {16, · · · , 30}

Data samples (○), minimizer samples u[i] (+++).

Page 73: Inverse modeling in inverse problems using optimization

73

[Plots over 1..71 with marks at 5, 20, 53, 71:]

(a) φ(t) = αt/(αt + 1), α = 4, β = 3 | (b) φ(t) = 1 − α^t, α = 0.1, β = 2.5

(c) φ(t) = ln(αt + 1), α = 2, β = 1.3 | (d) φ(t) = (t + 0.1)^α, α = 0.5, β = 1.4

Denoising: data samples (○) are corrupted with Gaussian noise. Minimizer samples u[i] (+++). Original (−−−). β is the largest value such that the gate at 71 survives.

Page 74: Inverse modeling in inverse problems using optimization

74

Zooms on (a)–(d), around samples 5–20 and 53–71:

Constant pieces: solid black line. Data points v[i] fitted exactly by the minimizer u (○).

Page 75: Inverse modeling in inverse problems using optimization

75

φ(t) = t, β = 0.8 (ℓ1 − TV), the convex relaxation of Fv closest to ℓ1 − TV | the minimizer for φ(t) = αt/(αt + 1), α = 4, β = 3

error for φ(t) = αt/(αt + 1), α = 4, β = 3 | φ(t) = αt/(αt + 1), α = 4, β = 3

∥original − u∥∞ = 0.24;  original ∈ [0, 12], data v ∈ [−0.6, 12.9]

Page 76: Inverse modeling in inverse problems using optimization

76

On the figures, the u are global minimizers of Fv, computed with a Viterbi algorithm (a sketch follows below).

Question 31 Can you sketch the main properties of the minimizers of Fv?

Question 32 What seems to be the role of the asymptotics of φ?

Numerical evidence: there are critical values β1 < · · · < βn such that

• β ∈ [βi, βi+1) ⇒ the minimizer remains unchanged

• β ≥ βi+1 ⇒ the minimizer is simplified

The result is proven (under conditions) for the minimizers of L1 − TV in [Chan, Esedoglu 2005]
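A sketch of global minimization by dynamic programming (a Viterbi-type recursion) for the 1-D denoising case A = Id; the quantized label grid is an assumption of mine, the slides do not specify the discretization:

```python
import numpy as np

def viterbi_min(v, labels, beta, phi):
    """Global minimizer of sum |u[i]-v[i]| + beta*sum phi(|u[i+1]-u[i]|)
    over u taking values in a finite label grid."""
    n, L = v.size, labels.size
    data = np.abs(labels[None, :] - v[:, None])            # fidelity per label
    pair = beta * phi(np.abs(labels[:, None] - labels[None, :]))
    cost = data[0].copy()
    back = np.zeros((n, L), dtype=int)
    for i in range(1, n):
        tot = cost[:, None] + pair                         # transition costs
        back[i] = np.argmin(tot, axis=0)                   # best predecessor
        cost = tot[back[i], np.arange(L)] + data[i]
    idx = np.empty(n, dtype=int)
    idx[-1] = np.argmin(cost)
    for i in range(n - 1, 0, -1):                          # backtracking
        idx[i - 1] = back[i, idx[i]]
    return labels[idx]

# e.g. phi = lambda t: 4 * t / (1 + 4 * t)   (the concave PF with alpha = 4)
```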

Page 77: Inverse modeling in inverse problems using optimization

77

Given v ∈ R, consider the function

Fv(u) = |u − v| + βφ(|u|) for φ(t) = αt/(1 + αt),  u ∈ R, β > 0

Question 33 Does Fv have a global minimizer for any v? Explain.

Question 34 Determine φ″(u) for u ∈ R \ {0}.

Question 35 Show that ∀v ∈ R, any minimizer u of Fv obeys u ∈ {0, v}.

Question 36 Can you extend this result to the other φ on p. 71?

Page 78: Inverse modeling in inverse problems using optimization

78

• Fv does have global minimizers, for any ai, any v and any β > 0.

• Let u be a (local) minimizer of Fv. Set

I0 = { i ∈ I : aiu = v[i] },  J0 = { j ∈ J : Gju = 0 }

u is the unique point solving the linear system  aiu = v[i] ∀i ∈ I0,  Gju = 0 ∀j ∈ J0

Each pixel of a (local) minimizer u of Fv is involved in (at least) one equation aiu = v[i], or in (at least) one equation Gju = 0, or in both types of equations.

• “Contrast invariance” of (local) minimizers

• The matrix with rows ( ai, ∀i ∈ I0, Gj, ∀j ∈ J0 ) has full column rank

• All (local) minimizers of Fv are strict

Page 79: Inverse modeling in inverse problems using optimization

79

MR Image Reconstruction from Highly Undersampled Data

0-filling Fourier | ∥ · ∥22 + TV | ∥ · ∥1 + TV | Our method

[Error maps on the scale −0.15..0.2.]

Reconstructed images from 7% noisy randomly selected samples in the k-space. Our method for φ(t) = αt/(αt + 1).

Page 80: Inverse modeling in inverse problems using optimization

80

MR Image Reconstruction from Highly Undersampled Data

0-filling Fourier | ∥ · ∥22 + TV | ∥ · ∥1 + TV | Our method

[Error maps on the scale −0.06..0.08.]

Reconstructed images from 5% noisy randomly selected samples in the k-space. Our method for φ(t) = αt/(αt + 1).

Page 81: Inverse modeling in inverse problems using optimization

81

Cartoon: Observed | ℓ1-TV | Our method, φ(t) = αt/(αt + 1)

Page 82: Inverse modeling in inverse problems using optimization

82

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual

9. Inverse modeling and Bayesian MAP – there is distortion (p. 98)

10. Some References (p. 103)

Page 83: Inverse modeling in inverse problems using optimization

83

8. Fully smoothed ℓ1 − TV

Fv(u) = Ψ(u, v) + βΦ(u), β > 0

Ψ(u, v) = ∑_{i=1}^{p} ψ(u[i] − v[i])  and  Φ(u) = ∑_i φ(|Giu|)

ψ(·) := ψ(·, α1),  φ(·) := φ(·, α2),  (α1, α2) > 0

Gi ∈ R1×p – forward discretization: N4 – only vertical and horizontal differences; N8 – diagonal differences are added.

(ψ, φ) belong to the family of functions θ(·, α) : R → R satisfying

H1 For any α > 0 fixed, θ(·, α) is Cs-continuous with s ≥ 2, even, and θ″(t, α) > 0, ∀t ∈ R.

H2 For any α > 0 fixed, |θ′(t, α)| < 1 and, for t > 0 fixed, θ′(t, α) is strictly decreasing in α > 0:

α > 0 ⇒ lim_{t→∞} θ′(t, α) = 1,  where θ′(t, α) := (d/dt) θ(t, α)

t ∈ R ⇒ lim_{α→0} θ′(t, α) = 1 and lim_{α→∞} θ′(t, α) = 0.

⇒ Fv is a fully smoothed ℓ1 − TV energy.

Page 84: Inverse modeling in inverse problems using optimization

84

Choices for θ(·, α) obeying H1 and H2 (θ | θ′):

f1:  √(t2 + α) | t/√(t2 + α)
f2:  α log(cosh(t/α)) | tanh(t/α)
f3:  |t| − α log(1 + |t|/α) | t/(α + |t|)

When α ↘ 0, θ(·, α) becomes stiff near the origin.

[Plots of f1 for α = 0.05 (—–) and α = 0.5 (−−−): θ(t) = √(t2 + α), θ′(t) = t/√(t2 + α), (θ′)−1(y) = y√(α/(1 − y2)).]
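The three pairs (θ, θ′) of the table, as a numpy sketch (added for illustration):

```python
import numpy as np

def f1(t, a):  return np.sqrt(t**2 + a)               # theta
def df1(t, a): return t / np.sqrt(t**2 + a)           # theta'

def f2(t, a):  return a * np.log(np.cosh(t / a))
def df2(t, a): return np.tanh(t / a)

def f3(t, a):  return np.abs(t) - a * np.log(1 + np.abs(t) / a)
def df3(t, a): return t / (a + np.abs(t))
```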

Page 85: Inverse modeling in inverse problems using optimization

85

The minimizers u of Fv can decrease the quantization noise:

Real-valued original | v quantized on {0, · · · , 15} | Restored u

Page 86: Inverse modeling in inverse problems using optimization

86

[Nikolova, Wen, R. Chan 12]

• For any β > 0, Fv has a unique minimizer for every v ∈ Rp, and the minimizer function U : Rp → Rp is Cs−1.

Define G := ∪_{i=1}^{p} ∪_{j=1}^{p} { g ∈ R1×p : g[i] = −g[j] = 1, i ≠ j, g[k] = 0 if k ∉ {i, j} }

All difference operators Gi belong to G.

NG := ∪_{g∈G} { v ∈ Rp : gU(v) = 0 }  and  NI := ∪_{i=1}^{p} ∪_{j=1}^{p} { v ∈ Rp : Ui(v) = v[j] }

Question 37 How to interpret the sets NG and NI?

• The sets NG and NI are closed in Rp and obey

Lp(NG) = 0 and Lp(NI) = 0

The property is true for any β > 0 and (α1, α2) > 0.

Page 87: Inverse modeling in inverse problems using optimization

87

• Rp \ (NG ∪ NI) is open and dense in Rp.

The elements of NG ∪ NI are highly exceptional in Rp.

• The minimizers u of Fv generically satisfy u[i] ≠ u[j] for any (i, j) such that i ≠ j, and u[i] ≠ v[j] for any (i, j).

The minimizers u of Fv have pixel values that are different from each other and different from any data pixel.

Question 38 Describe the consequences if ℓ1 − TV is approximated by a smooth function like Fv.

Recall the illustration on p. 24 and the results in section 3 (p. 26) and section 4 (p. 35).

Page 88: Inverse modeling in inverse problems using optimization

88

Further... [Bauss, Nikolova, Steidl 13]

• For any α1 > 0 fixed, there is an inverse function (ψ′)−1(·, α1) : (−1, 1) → R which is odd, Cs−1 and strictly increasing.

α1 ↦ (ψ′)−1(y, α1) is also strictly increasing on (0, +∞), for any y ∈ (0, 1).

• Set η := ∥G∥1. Then

βη < 1 ⇒ ∥u − v∥∞ ≤ (ψ′)−1(βη, α1), ∀v ∈ Rp

• Also, ∥u − v∥∞ → (ψ′)−1(βη, α1) as α2 → 0.

Full control on the bound ∥u − v∥∞.

Question 39 Can you suggest applications where the properties of Fv are important?

Page 89: Inverse modeling in inverse problems using optimization

89

Exact histogram specification

• v – input digital gray-value m × n image, stored as a p := mn vector

• v[i] ∈ {0, · · · , L − 1} ∀i ∈ {1, · · · , p};  8-bit image ⇒ L = 256

• Histogram of v: Hv[k] = (1/p) #{ i ∈ {1, · · · , p} : v[i] = k }, ∀k ∈ {0, · · · , L − 1}

• Target histogram: ζ = (ζ[1], · · · , ζ[L])

• Goal of histogram specification (HS): convert v into u so that Hu = ζ

order the pixels in v: i ≺ j if v[i] < v[j]

i1 ≺ i2 ≺ · · · ≺ i_{ζ[1]} ≺ · · · ≺ i_{p−ζ[L]+1} ≺ · · · ≺ ip  (consecutive groups of sizes ζ[1], · · · , ζ[L])

• An ill-posed problem for digital (quantized) images, since p ≫ L

• An issue: obtain a meaningful total strict ordering of all pixels in v

Histogram equalization is the particular case of HS where ζ[k] = p/L, ∀k ∈ {0, · · · , L − 1}
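Once a strict total order of the pixels is available, exact HS itself is a short computation; a numpy sketch (function and variable names are mine):

```python
import numpy as np

def exact_hs(order, target_counts):
    """order: pixel indices sorted from darkest to brightest (strict total order).
    target_counts: zeta[k], k = 0..L-1, summing to the number of pixels."""
    u = np.empty(order.size, dtype=int)
    bounds = np.concatenate([[0], np.cumsum(target_counts)])
    for k in range(len(target_counts)):
        # assign gray level k to the next zeta[k] pixels in the ordering
        u[order[bounds[k]:bounds[k + 1]]] = k
    return u
```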

Page 90: Inverse modeling in inverse problems using optimization

90

Histogram Equalization (HE) using Matlab and our ordering

input image (512×512) | HE by ”histeq” | HE by ”sort” | HE with our ordering

[zooms (region 449–512 × 64–512) and histograms (0–255) for each result]
Page 91: Inverse modeling in inverse problems using optimization

91

Modern sorting algorithms

For any pixel v[i], extract K auxiliary values ak[i], k ∈ {1, · · · , K}, from v. Set a0 := v. Then

i ≺ j if v[i] ≤ v[j] and ak[i] < ak[j] for some k ∈ {0, · · · , K}.

Local Mean Algorithm (LM) [Coltuc, Bolon, Chassery 06]

− If two pixels are equal and their local means coincide, take a larger neighborhood.
− The procedure smooths edges, and the sorting often fails.

Wavelet Approach (WA) [Wan, Shi 07]

− Use wavelet coefficients from different subbands to order the pixels.
− Heavy, with a high level of failure.

Specialized variational approach (SVA) [Nikolova, Wen and R. Chan 12]

− Minimize Fv for a parameter choice yielding ∥u − v∥∞ ≲ 0.1.
− Almost no failure, a faithful order and a fast algorithm. [Nikolova 13]
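The auxiliary-key ordering can be sketched with numpy's lexsort (the keys ak below are placeholders for whichever auxiliary information is extracted):

```python
import numpy as np

def total_order(v, aux_keys):
    """Sort pixel indices by v first, breaking ties with a_1, a_2, ... in turn."""
    # np.lexsort uses the LAST key as the primary sort key
    keys = tuple(reversed([v] + list(aux_keys)))
    return np.lexsort(keys)
```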

Page 92: Inverse modeling in inverse problems using optimization

92

Some results using Fv for color image enhancement

New fast histogram based color enhancement algorithm. [Nikolova, Steidl 14]

• [NS 14] M. Nikolova and G. Steidl, “Fast Hue and Range Preserving Histogram Specification:

Theory and New Algorithms for Color Image Enhancement”, IEEE Trans. Image Process., to

appear.

• [HYL 11] J. H. Han, S. Yang, and B. U. Lee, A novel 3-D color histogram equalization method

with uniform 1-D gray scale histogram, IEEE Trans. Image Process., vol. 20, no. 2, pp.

506-512, Feb. 2011.

• [BCPR 07] M. Bertalmío, V. Caselles, E. Provenzi, and A. Rizzi, “Perceptual color correction through variational techniques”, IEEE Trans. Image Process., vol. 16, no. 4, pp. 1058–1072, Apr. 2007.

• [APBC 09] R. Palma-Amestoy, E. Provenzi, M. Bertalmío, and V. Caselles, “A perceptually inspired variational framework for color enhancement”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 3, pp. 458–474, 2009.

• [G 12] P. Getreuer, “Automatic color enhancement (ACE) and its fast implementation”, Image Processing On Line, DOI: 10.5201/ipol.2012.g-ace, vol. 2012.

Page 93: Inverse modeling in inverse problems using optimization

93

club (1800 × 3200) | Hist.-based [NS 14]

Hist.-based [NS 14] | Perceptual [APBC 09]

[histograms 0–255 below each result]

Page 94: Inverse modeling in inverse problems using optimization

94

boy-on-stones (800 × 800) | Hist.-based [NS 14] | Hist.-based [HYL 11]

Perceptual [BCPR 07] | Perceptual [APBC 09] | ACE [G 12]

[histograms: input 0–239, results 0–255]

Page 95: Inverse modeling in inverse problems using optimization

95

orchid (768 × 1024) | Hist.-based [NS 14] | Hist.-based [HYL 11]

Perceptual [APBC 09] | Perceptual [BCPR 07] | ACE [G 12]

Input “orchid” with a bad flashlight effect. [histograms 0–255]

Page 96: Inverse modeling in inverse problems using optimization

96

snake (1000 × 1000) | Hist.-based [HYL 11] | Hist.-based [NS 14]

Goal – enhance the snake. [histograms: input 0–197, results 0–255]

Page 97: Inverse modeling in inverse problems using optimization

97

Summer School 2014: Inverse Problem and Image Processing

Tutorial: Inverse modeling in inverse problems using optimization

Outline

1. Energy minimization methods (p. 7)

2. Regularity results (p. 17)

3. Non-smooth regularization – minimizers are sparse in a given subspace (p. 26)

4. Non-smooth data-fidelity – minimizers fit exactly some data entries (p. 35)

5. Comparison with Fully Smooth Energies (p. 51)

6. Non-convex regularization – edges are sharp (p. 54)

7. Nonsmooth data-fidelity and regularization – peculiar features (p. 62)

8. Fully smoothed ℓ1−TV models – bounding the residual (p. 83)

9. Inverse modeling and Bayesian MAP – there is distortion

10. Some References (p. 103)

Page 98: Inverse modeling in inverse problems using optimization

98

9 Inverse modeling and Bayesian MAP [Nikolova 07]

MAP estimators to combine noisy data and prior

Bayesian approach: U, V random variables, events {U = u}, {V = v}.

Likelihood fV|U(v|u), Prior fU(u) ∝ exp(−λΦ(u)), Posterior fU|V(u|v) = fV|U(v|u) fU(u) (1/Z)

MAP: u = the most likely solution given the recorded data V = v:

u = argmax_u fU|V(u|v) = argmin_u ( −ln fV|U(v|u) − ln fU(u) ) = argmin_u ( Ψ(u, v) + βΦ(u) )

MAP is the most frequent way to combine models on data-acquisition and priors

Realistic models for data-acquisition fV|U and prior fU ⇒ u must be coherent with fV|U and fU

In practice one needs that Û ∼ fU and AÛ − V ∼ fN, i.e. f_Û ≈ fU and f_N̂ ≈ fN for N̂ := AÛ − V

Our analytical results show that both models (fV|U and fU) are violated in a MAP estimate

Page 99: Inverse modeling in inverse problems using optimization

99

Example: MAP shrinkage [Simoncelli99, Belge-Kilmer00, Antoniadis02]

• Noisy wavelet coefficients: y = Wv = Wuo + n = xo + n,  n ∼ N(0, σ2I)

• Prior: the xo[i] are i.i.d., f(xo[i]) = (1/Z) e^(−λ|xo[i]|^α) (Generalized Gaussian, GG)

Experiments have shown that α ∈ (0, 1) for many real-world images

• MAP restoration ⇔ x̂[i] = argmin_{t∈R} ( (t − y[i])2 + λ|t|^α ), ∀i

(α, λ, σ) fixed — 10 000 independent trials:

(1) sample x ∼ fX and n ∼ N(0, σ2), (2) form y = x + n, (3) compute the true MAP x̂

[Histograms over the 10 000 trials:]

GG prior, α = 1.2, λ = 1/2 | the true MAP x̂ | noise N(0, σ2) | noise estimate y − x̂

GG prior, α = 1/2, λ = 2 | the true MAP x̂ | noise N(0, σ2) | noise estimate y − x̂
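The protocol above can be sketched in numpy (my assumptions: the GG prior is sampled through a Gamma draw, the scalar MAP is found by brute force on a grid, and the parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, lam, sigma, n_trials = 1.2, 2.0, 1.0, 10_000

# (1) sample x ~ f_X with f_X(x) ∝ exp(-lam*|x|^alpha):
#     if W ~ Gamma(1/alpha, 1), then (W/lam)^(1/alpha) has the law of |x|
w = rng.gamma(1.0 / alpha, 1.0, n_trials)
x = (w / lam) ** (1.0 / alpha) * rng.choice([-1.0, 1.0], n_trials)
# (2) form y = x + n
y = x + sigma * rng.standard_normal(n_trials)
# (3) the true MAP, by brute-force minimization of (t - y)^2 + lam*|t|^alpha
t = np.linspace(-12, 12, 801)
cost = (t[None, :] - y[:, None]) ** 2 + lam * np.abs(t) ** alpha
x_map = t[np.argmin(cost, axis=1)]
# compare the histogram of x with that of x_map (cf. the figure above)
```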

Page 100: Inverse modeling in inverse problems using optimization

100 Theoretical explanations

V = AU + N and fU|V continuous ⇒

Pr(GiU = 0) = 0, ∀i;  Pr(aiU = V[i]) = 0, ∀i;  Pr(θ0 < ∥GiU∥ < θ1) > 0, ∀i

The analytical results on û = argmin_u Fv(u) = MAP yield:

• fU continuous and non-smooth at 0, φ′(0+) > 0 (Ch. 3, p. 26):

v ∈ Oh ⇒ [ Giû = 0, ∀i ∈ h ] ⇒ Pr(GiÛ = 0, ∀i ∈ h) ≥ Pr(V ∈ Oh) > 0

The effective prior: Giû = 0 for many i (e.g. locally constant images)

• fN continuous and nonsmooth at 0, ψ′(0+) > 0 (Ch. 4, p. 35):

v ∈ Oh ⇒ [ aiû = v[i], ∀i ∈ h ] ⇒ Pr(aiÛ = V[i], ∀i ∈ h) ≥ Pr(V ∈ Oh) > 0

The effective model: there are uncorrupted data entries.

• −ln fU (resp. φ) continuous and nonconvex ⇒ Pr(θ0 < ∥GiÛ∥ < θ1) = 0, ∀i

The effective prior: edges.

• −ln fU nonconvex, nonsmooth at 0, φ′(0+) > 0 and φ″ ≤ 0 (Ch. 6, p. 54):

⇒ Pr(∥GiÛ∥ = 0) > 0 and Pr(0 < ∥GiÛ∥ < θ1) = 0

Page 101: Inverse modeling in inverse problems using optimization

101

Illustration

Original differences: Ui − Ui+1 i.i.d. ∼ f(t) ∝ e^(−λφ(t)) on [−γ, γ], φ(t) = α|t|/(1 + α|t|)

[Plots over 1..100, values 0..20:]

Original uo (—) sampled from f for α = 10, λ = 1, γ = 4; data v = uo + n (· · · ), n ∼ N(0, σ2I), σ = 5 | The true MAP û (—), β = 2σ2λ, versus the original uo (· · · )

Knowing the true distributions, with the true parameters, is not enough.

Combining models remains an open problem

Page 102: Inverse modeling in inverse problems using optimization

102

Knowledge of the features of the minimizers enables new energies, yielding appropriate solutions, to be conceived

‘‘ We’re in Act I of a digital revolution.’’

Jay Cassidy (film editor at Mathematical Technologies Inc.)

Thank you!

10 Some References

Page 103: Inverse modeling in inverse problems using optimization

103

10 Some References

1. Alliney S (1992) Digital filters as absolute norm regularizers. IEEE Trans Signal Process

SP-40:1548–1562

2. Ambrosio L, Fusco N, Pallara D (2000) Functions of bounded variation and free discontinuity Problems.

Oxford Mathematical Monographs, Oxford University Press

3. Antoniadis A, Fan J (2001) Regularization of wavelet approximations. J Am Stat Assoc 96:939–967

4. Aubert G, Kornprobst P (2006) Mathematical problems in image processing, 2nd edn. Springer, Berlin

5. Aubert G, Aujol J.-F. (2008) A variational approach to remove multiplicative noise, SIAM J. on Appl.

Maths., 68:925-946

6. Aujol J-F, Gilboa G, Chan T, Osher S (2006) Structure-texture image decomposition - modeling,

algorithms, and parameter selection. Int J Comput Vis 67:111–136

7. Auslender A and Teboulle M. (2003) Asymptotic Cones and Functions in Optimization and Variational

Inequalities. Springer, New York

8. Attouch H, Bolte J, Svaiter BF (2013) Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math Program 137:91–129

9. Bar L, Brook A, Sochen N, Kiryati N (2007) Deblurring of color images corrupted by impulsive noise.

IEEE Trans Image Process 16:1101–1111

10. Bar L, Kiryati N, Sochen N (2006) Image deblurring in the presence of impulsive noise, International. J

Comput Vision 70:279–298

Page 104: Inverse modeling in inverse problems using optimization

104

11. Bar L, Sochen N, Kiryati N (2005) Image deblurring in the presence of salt-and-pepper noise. In

Proceeding of 5th international conference on scale space and PDE methods in computer vision, ser

LNCS, vol 3439, pp 107–118

12. Bae E, Yuan J, Tai X.-C. (2011) Global minimization for continuous multiphase partitioning problems

using a dual approach, Int. J. Comput. Vis., 92:112–129.

13. Baus F, Nikolova M, Steidl G. (2014) Fully smoothed ℓ1-TV models: Bounds for the minimizers and

parameter choice, J Math Imaging Vis 48:295-307

14. Belge M, Kilmer M, Miller E (2000) Wavelet domain image restoration with adaptive edge-preserving

regularization. IEEE Trans Image Process 9:597–608

15. Besag JE (1974) Spatial interaction and the statistical analysis of lattice systems (with discussion). J

Roy Stat Soc B 36:192–236

16. Besag JE (1989) Digital image processing : towards Bayesian image analysis. J Appl Stat 16:395–407

17. Black M, Rangarajan A (1996) On the unification of line processes, outlier rejection, and robust

statistics with applications to early vision. Int J Comput Vis 19:57–91

18. Blake A, Zisserman A (1987) Visual reconstruction. MIT Press, Cambridge

19. Bobichon Y, Bijaoui A (1997) Regularized multiresolution methods for astronomical image

enhancement. Exp Astron 7:239–255

20. Bolte, J., Sabach, S. and Teboulle, M. (2013) Proximal alternating linearized minimization for

nonconvex and nonsmooth problems, Math. Program. Ser. A, online

21. Bouman C, Sauer K (1993) A generalized Gaussian image model for edge-preserving MAP estimation. IEEE Trans Image Process 2:296–310

Page 105: Inverse modeling in inverse problems using optimization

105

22. Bouman C, Sauer K (1996) A unified approach to statistical tomography using coordinate descent

optimization. IEEE Trans Image Process 5:480–492

23. Bredies K, Kunisch K, Pock T (2010) Total generalized variation. SIAM J Imaging Sci 3(3): 480–491.

24. Bredies K, and Holler, M. (2014) Regularization of linear inverse problems with total generalized

variation. J Inverse and Ill-posed Problems DOI: 10.1515/jip-2013-0068.

25. Candes EJ, Donoho D, Ying L (2005) Fast discrete curvelet transforms. SIAM Multiscale Model Simul

5:861–899

26. Candes EJ, Guo F (2002) New multiscale transforms, minimum total variation synthesis. Applications

to edge-preserving image reconstruction. Signal Process 82:1519–1543

27. Catte F, Coll T, Lions PL, Morel JM (1992) Image selective smoothing and edge detection by nonlinear

diffusion (I). SIAM J Num Anal 29:182–193

28. Chambolle A (2004) An Algorithm for Total Variation Minimization and Application. J. Math Imaging

Vis 20:89–98

29. Chan T, Esedoglu S, Nikolova M. (2006) Algorithms for finding global minimizers of image

segmentation and denoising models. SIAM J. Appl. Math. 66:1632-1648

30. Chan T, Esedoglu S (2005) Aspects of total variation regularized L1 function approximation. SIAM J

Appl Math 65:1817–1837

31. Chan TF, Wong CK (1998) Total variation blind deconvolution. IEEE Trans Image Process 7:370–375

32. Charbonnier P, Blanc-Feraud L, Aubert G, Barlaud M (1997) Deterministic edge-preserving regularization in computed imaging. IEEE Trans Image Process 6:298–311

33. Chellappa R, Jain A (1993) Markov random fields: theory and application. Academic Press, Boston

34. Chen X, Ng M, Zhang C (2012) Non-Lipschitz ℓp-regularization and box constrained model for image restoration. IEEE Trans Image Process 21:4709–4721

35. Chesneau C, Fadili J, Starck J-L (2010) Stein block thresholding for image denoising. Appl Comput Harmon Anal 28:67–88

36. Ciarlet PG (1989) Introduction to numerical linear algebra and optimization. Cambridge University Press, Cambridge

37. Coifman RR, Sowa A (2000) Combining the calculus of variations and wavelets for image enhancement. Appl Comput Harmon Anal 9:1–18

38. Do MN, Vetterli M (2005) The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans Image Process 15:1916–1933

39. Dobson D, Santosa F (1996) Recovery of blocky images from noisy and blurred data. SIAM J Appl Math 56:1181–1199

40. Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81:425–455

41. Donoho DL, Johnstone IM (1995) Adapting to unknown smoothness via wavelet shrinkage. J Am Stat Assoc 90:1200–1224

42. Dontchev AL, Zolezzi T (1993) Well-posed optimization problems. Springer, New York

43. Durand S, Froment J (2003) Reconstruction of wavelet coefficients using total variation minimization. SIAM J Sci Comput 24:1754–1767

44. Durand S, Nikolova M (2006) Stability of minimizers of regularized least squares objective functions I: study of the local behavior. Appl Math Optim 53:185–208; II: study of the global behavior. Appl Math Optim 53:259–277

45. Durand S, Nikolova M (2007) Denoising of frame coefficients using ℓ1 data-fidelity term and edge-preserving regularization. SIAM Multiscale Model Simul 6:547–576

46. Durand S, Fadili J, Nikolova M (2010) Multiplicative noise removal using L1 fidelity on frame coefficients. J Math Imaging Vis 36:201–226

47. Duval V, Aujol J-F, Gousseau Y (2009) The TVL1 model: a geometric point of view. SIAM Multiscale Model Simul 8:154–189

48. Ekeland I, Temam R (1976) Convex analysis and variational problems. North-Holland/SIAM, Amsterdam

49. Fessler J (1996) Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): applications to tomography. IEEE Trans Image Process 5:493–506

50. Fiacco A, McCormick G (1990) Nonlinear programming. Classics in Applied Mathematics. SIAM, Philadelphia

51. Geman D (1990) Random fields and inverse problems in imaging. In: Ecole d’Ete de Probabilites de Saint-Flour XVIII – 1988. Lecture Notes in Mathematics, vol 1427. Springer, pp 117–193

52. Geman D, Reynolds G (1992) Constrained restoration and recovery of discontinuities. IEEE Trans Pattern Anal Mach Intell PAMI-14:367–383

53. Geman D, Yang C (1995) Nonlinear image recovery with half-quadratic regularization. IEEE Trans Image Process IP-4:932–946

54. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell PAMI-6:721–741

55. Green PJ (1990) Bayesian reconstructions from emission tomography data using a modified EM algorithm. IEEE Trans Med Imaging MI-9:84–93

56. Lustig M, Donoho D, Santos JM, Pauly JM (2008) Compressed sensing MRI: a look at how CS can improve on current imaging techniques. IEEE Signal Process Mag 25:72–82

57. Haddad A, Meyer Y (2007) Variational methods in image processing. In: Perspectives in Nonlinear Partial Differential Equations in Honor of Haïm Brezis. Contemp Math (AMS) 446:273–295

58. Hiriart-Urruty J-B, Lemarechal C (1996) Convex analysis and minimization algorithms, vols I and II. Springer, Berlin

59. Hofmann B (1986) Regularization for applied inverse and ill-posed problems. Teubner, Leipzig

60. Kak A, Slaney M (1987) Principles of computerized tomographic imaging. IEEE Press, New York

61. Keren D, Werman M (1993) Probabilistic analysis of regularization. IEEE Trans Pattern Anal Mach Intell PAMI-15:982–995

62. Li S (1995) Markov random field modeling in computer vision, 1st edn. Springer, New York

63. Li SZ (1995) On discontinuity-adaptive smoothness priors in computer vision. IEEE Trans Pattern Anal Mach Intell PAMI-17:576–586

64. Luisier F, Blu T (2008) SURE-LET multichannel image denoising: interscale orthonormal wavelet thresholding. IEEE Trans Image Process 17:482–492

65. Malgouyres F (2002) Minimizing the total variation under a general convex constraint for image restoration. IEEE Trans Image Process 11:1450–1456

66. Morel J-M, Solimini S (1995) Variational methods in image segmentation. Birkhauser, Basel

67. Morozov VA (1993) Regularization methods for ill-posed problems. CRC Press, Boca Raton

68. Moulin P, Liu J (1999) Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors. IEEE Trans Inf Theory 45:909–919

69. Moulin P, Liu J (2000) Statistical imaging and complexity regularization. IEEE Trans Inf Theory 46:1762–1777

70. Mumford D, Shah J (1989) Optimal approximations by piecewise smooth functions and associated variational problems. Commun Pure Appl Math 42:577–684

71. Nashed M, Scherzer O (1998) Least squares and bounded variation regularization with nondifferentiable functionals. Numer Funct Anal Optim 19:873–901

72. Nikolova M (2000) Local strong homogeneity of a regularized estimator. SIAM J Appl Math 61:633–658

73. Nikolova M (2000) Thresholding implied by truncated quadratic regularization. IEEE Trans Signal Process 48:3437–3450

74. Nikolova M (2002) Minimizers of cost-functions involving nonsmooth data-fidelity terms. Application to the processing of outliers. SIAM J Num Anal 40:965–994

75. Nikolova M (2004) A variational approach to remove outliers and impulse noise. J Math Imaging Vis 20:99–120

76. Nikolova M (2004) Weakly constrained minimization. Application to the estimation of images and signals involving constant regions. J Math Imaging Vis 21:155–175

77. Nikolova M (2005) Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. SIAM Multiscale Model Simul 4:960–991

78. Nikolova M (2007) Analytical bounds on the minimizers of (nonconvex) regularized least-squares. AIMS J Inverse Probl Imag 1:661–677

79. Nikolova M (2007) Model distortions in Bayesian MAP reconstruction. AIMS J Inverse Probl Imag 1:399–422

80. Nikolova M, Ng M, Tam CP (2013) On ℓ1 data fitting and concave regularization for image recovery. SIAM J Sci Comput 35:397–430

81. Nikolova M (2009) One-iteration dejittering of digital video images. J Vis Commun Image Represent 20:254–274

82. Nikolova M (2013) Description of the minimizers of least squares regularized with ℓ0 norm. Uniqueness of the global minimizer. SIAM J Imaging Sci 6:904–937

83. Nikolova M, Wen YW, Chan R (2013) Exact histogram specification for digital images using a variational approach. J Math Imaging Vis 46:309–325

84. Nikolova M, Steidl G (2014) Fast hue and range preserving histogram specification: theory and new algorithms for color image enhancement. IEEE Trans Image Process (to appear)

85. Papadakis N, Yildizoglu R, Aujol J-F, Caselles V (2013) High-dimension multi-label problems: convex or nonconvex relaxation? SIAM J Imaging Sci 6:2603–2639

86. Perona P, Malik J (1990) Scale-space and edge detection using anisotropic diffusion. IEEE Trans Pattern Anal Mach Intell PAMI-12:629–639

87. Potts RB (1952) Some generalized order-disorder transformations. Proc Cambridge Philos Soc 48:106–109

88. Rockafellar RT, Wets RJ-B (1997) Variational analysis. Springer, New York

89. Rudin L, Osher S, Fatemi E (1992) Nonlinear total variation based noise removal algorithms. Physica D 60:259–268

90. Robini M, Magnin I (2010) Optimization by stochastic continuation. SIAM J Imaging Sci 3:1096–1121

91. Robini M, Reissman P-J (2013) From simulated annealing to stochastic continuation: a new trend in combinatorial optimization. J Global Optim 56:185–215

92. Sauer K, Bouman C (1993) A local update strategy for iterative reconstruction from projections. IEEE Trans Signal Process SP-41:534–548

93. Scherzer O, Grasmair M, Grossauer H, Haltmeier M, Lenzen F (2009) Variational methods in imaging. Springer, New York

94. Tautenhahn U (1994) Error estimates for regularized solutions of non-linear ill-posed problems. Inverse Probl 10:485–500

95. Tikhonov A, Arsenin V (1977) Solutions of ill-posed problems. Winston, Washington

96. Vogel C (2002) Computational methods for inverse problems. Frontiers in Applied Mathematics, vol 23. SIAM, Philadelphia

97. Welk M, Steidl G, Weickert J (2008) Locally analytic schemes: a link between diffusion filtering and wavelet shrinkage. Appl Comput Harmon Anal 24:195–224

98. Winkler G (2006) Image analysis, random fields and Markov chain Monte Carlo methods: a mathematical introduction, 2nd edn. Applications of Mathematics (Stochastic Modelling and Applied Probability), vol 27. Springer, Berlin

99. Yuan J, Bae E, Tai X-C (2010) A study on continuous max-flow and min-cut approaches. In: Proc IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2217–2224

100. Yuan J, Bae E, Tai X-C, Boykov Y (2010) A continuous max-flow approach to Potts model. In: Computer Vision – ECCV 2010, LNCS. Springer, pp 379–392