Topics In Digital Contents Signals – Special Issue 01

The Least-Square Optimization and Sparse Linear System Solver

Presented by Ji-yong Kwon, Visual Computing Lab.
Outline

• What is the least-square optimization?
  – Optimization
  – Least-square optimization
  – Application to computer graphics
• Poisson image cloning
• What is the sparse linear system?
  – Dense matrix vs. sparse matrix
  – Steepest-descent approach
  – Conjugate Gradient method
Reference

• Valuable reading materials
  – Practical Least-Squares for Computer Graphics
    • Pighin and Lewis
    • ACM SIGGRAPH 2007 course note
    • http://graphics.stanford.edu/~jplewis/lscourse/ls.pdf
  – An Introduction to the Conjugate Gradient Method Without the Agonizing Pain
    • J. R. Shewchuk
    • CMU tech. report
    • http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf
The Least-Square Optimization
Simple Example

• Example
  – Line equation passing through two points on the a-b plane
    • One unique solution

Line equation: $ax + by + 1 = 0$

We know two points $(a_1, b_1)$ and $(a_2, b_2)$ on the line:

$$a_1 x + b_1 y + 1 = 0, \qquad a_2 x + b_2 y + 1 = 0$$

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{a_1 b_2 - a_2 b_1} \begin{bmatrix} b_1 - b_2 \\ a_2 - a_1 \end{bmatrix}$$
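The closed-form solution above is easy to check in code. A minimal C++ sketch (my own illustration, not from the slides; the function name is made up):

```cpp
#include <cassert>
#include <cmath>

// Solve a1*x + b1*y + 1 = 0, a2*x + b2*y + 1 = 0 by Cramer's rule.
// Returns false when the 2x2 matrix is singular, true otherwise.
bool lineThroughTwoPoints(double a1, double b1, double a2, double b2,
                          double& x, double& y) {
    double det = a1 * b2 - a2 * b1;          // determinant of [a1 b1; a2 b2]
    if (std::fabs(det) < 1e-12) return false;
    x = (b1 - b2) / det;                     // Cramer's rule, rhs = (-1, -1)
    y = (a2 - a1) / det;
    return true;
}
```

For example, the points (1, 0) and (0, 1) give the line $-a - b + 1 = 0$, i.e. $x = y = -1$.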
Simple Example

• Example
  – Line equation passing through three points on the a-b plane
    • No exact solution

Line equation: $ax + by + 1 = 0$

We know three points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$:

$$a_1 x + b_1 y + 1 = 0, \quad a_2 x + b_2 y + 1 = 0, \quad a_3 x + b_3 y + 1 = 0$$

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$$
Why Optimization?

• Observation
  – Unfortunately, many problems do not have a unique solution:
    • Too many solutions, or
    • No exact solution.
  – Concept of optimization
    • Find an approximated solution
      – Not exactly satisfying the conditions,
      – But satisfying them as much as possible.
• Strategy
  – Set the objective (or energy) function
  – Find a solution that minimizes (or maximizes) the objective function.
Why Optimization?

• Objective function
  – A.k.a. energy function
  – Input: a set of variables that we want to know
  – Output: a scalar value
  – The output value estimates the quality of a solution
    • Generally, a small output value (small energy) → a good solution
  – A solution that minimizes the output of the objective function → optimized solution
  – Designing a good objective function is the most important task in optimization.
Simple Example

• Example again
  – Line equation passing through three points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$ on the a-b plane
    • No exact solution,
    • But we can compute an approximated solution.
  – Passing through all points is impossible,
  – So find the line that minimizes the distances from all points.
Objective Function

• Example again
  – How to compute the 'distances'?
    • Setting the objective function

For a point on the line: $ax + by + 1 = 0$
For a point off the line: $ax + by + 1 > 0$ or $ax + by + 1 < 0$

Objective function:

$$O(x, y) = \sum_{i=1}^{3} (a_i x + b_i y + 1)^2$$
Optimization Problem

• Problem description
  – Find the line coefficients (x, y) that minimize the sum of squared distances between the line and the given points.
  – Mathematically,

$$\text{minimize } \sum_{i=1}^{3} (a_i x + b_i y + 1)^2$$

  – More compact description:

$$(x_o, y_o) = \underset{x,\,y}{\operatorname{argmin}} \sum_{i=1}^{3} (a_i x + b_i y + 1)^2$$
Solution

• Solution of the example

$$O(x, y) = \sum_{i=1}^{3} (a_i x + b_i y + 1)^2$$

  – The objective function has a parabolic shape
  → The objective function is minimized where its gradient is zero:

$$\frac{\partial O}{\partial x} = \sum_{i=1}^{3} 2(a_i x + b_i y + 1)\, a_i = 2\left(\sum_{i=1}^{3} a_i^2\, x + \sum_{i=1}^{3} a_i b_i\, y + \sum_{i=1}^{3} a_i\right) = 0$$

$$\frac{\partial O}{\partial y} = \sum_{i=1}^{3} 2(a_i x + b_i y + 1)\, b_i = 2\left(\sum_{i=1}^{3} a_i b_i\, x + \sum_{i=1}^{3} b_i^2\, y + \sum_{i=1}^{3} b_i\right) = 0$$
Solution

• Solution of the example

$$\sum_{i=1}^{3} a_i^2\, x + \sum_{i=1}^{3} a_i b_i\, y = -\sum_{i=1}^{3} a_i, \qquad \sum_{i=1}^{3} a_i b_i\, x + \sum_{i=1}^{3} b_i^2\, y = -\sum_{i=1}^{3} b_i$$

$$\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = -\begin{bmatrix} \sum a_i \\ \sum b_i \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} x \\ y \end{bmatrix} = -\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum a_i \\ \sum b_i \end{bmatrix}$$
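The normal equations above translate directly into code. A small C++ sketch (my own illustration; the helper name `fitLine` is made up):

```cpp
#include <cassert>
#include <cmath>

// Fit (x, y) minimizing sum_i (a_i*x + b_i*y + 1)^2 over n points by
// forming and solving the 2x2 normal equations from the slide.
void fitLine(const double* a, const double* b, int n, double& x, double& y) {
    double saa = 0, sab = 0, sbb = 0, sa = 0, sb = 0;
    for (int i = 0; i < n; ++i) {
        saa += a[i] * a[i];   // sum a_i^2
        sab += a[i] * b[i];   // sum a_i b_i
        sbb += b[i] * b[i];   // sum b_i^2
        sa  += a[i];          // sum a_i
        sb  += b[i];          // sum b_i
    }
    // Invert [saa sab; sab sbb] applied to (-sa, -sb) by Cramer's rule.
    double det = saa * sbb - sab * sab;
    x = (-sa * sbb + sb * sab) / det;
    y = (-sb * saa + sa * sab) / det;
}
```

Three points that already lie on $-a - b + 1 = 0$, e.g. (1, 0), (0, 1), (0.5, 0.5), recover the exact coefficients $x = y = -1$.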
Squared Distance

• Why 'squared'?
  – Naïve sum: $O(x, y) = \sum_{i=1}^{3} (a_i x + b_i y + 1)$
    • Each distance can be positive or negative,
    • So a sum of signed distances does not estimate the quality of the solution.
  – Sum of absolute distances: $O(x, y) = \sum_{i=1}^{3} |a_i x + b_i y + 1|$
    • Distances are 0 or positive,
    • But the minimum cannot be computed easily (not differentiable at the minimum point).
  – Sum of squared distances: $O(x, y) = \sum_{i=1}^{3} (a_i x + b_i y + 1)^2$
    • Distances are 0 or positive,
    • Differentiable at the minimum point,
    • The shape of the squared function is parabolic.
Another Solution

• Pseudo-inverse
  – The inverse matrix can be computed only if the matrix is square (and non-singular).
  – The pseudo-inverse matrix:

$$\mathbf{A}^{+} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T, \qquad \mathbf{A}^{+}\mathbf{A} = \mathbf{I} \quad (\text{but } \mathbf{A}\mathbf{A}^{+} \neq \mathbf{I} \text{ in general})$$

  – From the example before, $\mathbf{A}\mathbf{x} = \mathbf{b}$ with

$$\mathbf{A} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}, \qquad \mathbf{x} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}$$

  – The solution computed by the pseudo-inverse method equals the solution computed by least-square optimization.
Background

• Background of matrix differentiation
  – A very convenient technique for deriving matrix systems
  – Reference
    • A. M. Mathai, 'Jacobians of Matrix Transformations and Functions of Matrix Argument', World Scientific Publishing, 1997
  – Contents covered in this lecture:
    • A scalar-valued function of a vector
    • A vector-valued function of a vector

For $\mathbf{x} = (x_1, x_2, \ldots, x_p)^T$ and a scalar function $y = f(\mathbf{x})$:

$$\frac{\partial y}{\partial \mathbf{x}} = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_p} \right)^T$$
Background

• Theorem 1
  – Let $\mathbf{x} = (x_1, x_2, \ldots, x_p)^T$ be the vector of variables and $\mathbf{a} = (a_1, a_2, \ldots, a_p)^T$ be a constant vector; then

$$y = \mathbf{a}^T \mathbf{x} = \mathbf{x}^T \mathbf{a} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = \mathbf{a}$$

$$y = \mathbf{x}^T \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$$

$$y = \mathbf{x}^T \mathbf{A} \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\,\mathbf{x}$$
Background

• Proof of Theorem 1, with $\mathbf{x} = (x_1, \ldots, x_p)^T$ and $\mathbf{a} = (a_1, \ldots, a_p)^T$:

$$y = \mathbf{a}^T\mathbf{x} = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p \;\Rightarrow\; \frac{\partial y}{\partial x_i} = a_i \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = \mathbf{a}$$

$$y = \mathbf{x}^T\mathbf{x} = x_1^2 + x_2^2 + \cdots + x_p^2 \;\Rightarrow\; \frac{\partial y}{\partial x_i} = 2x_i \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$$

$$y = \mathbf{x}^T\mathbf{A}\mathbf{x} = \sum_{i=1}^{p}\sum_{j=1}^{p} a_{ij}\, x_i x_j \;\Rightarrow\; \frac{\partial y}{\partial x_i} = \sum_{j=1}^{p} a_{ij} x_j + \sum_{j=1}^{p} a_{ji} x_j \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\,\mathbf{x}$$
Background

• Theorem 2
  – Let $\mathbf{x} = (x_1, x_2, \ldots, x_p)^T$ and let $\mathbf{y} = (y_1, y_2, \ldots, y_p)^T$ be a vector function of $\mathbf{x}$; then the Jacobian is

$$\mathbf{J} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_p} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_p}{\partial x_1} & \cdots & \dfrac{\partial y_p}{\partial x_p} \end{bmatrix}, \qquad d\mathbf{y} = \mathbf{J}\, d\mathbf{x}$$
Matrix Formulation

• Example again,
  – Can be described in matrix form:

$$O(x, y) = \sum_{i=1}^{3} (a_i x + b_i y + 1)^2 = \left\| \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} - \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \right\|^2$$

$$O(\mathbf{x}) = \|\mathbf{A}\mathbf{x} - \mathbf{b}\|^2 = (\mathbf{A}\mathbf{x} - \mathbf{b})^T (\mathbf{A}\mathbf{x} - \mathbf{b})$$
Matrix Formulation

• Matrix formulation of the least-square optimization:

$$\begin{aligned} O &= (\mathbf{A}\mathbf{x} - \mathbf{b})^T (\mathbf{A}\mathbf{x} - \mathbf{b}) \\ &= \mathbf{x}^T \mathbf{A}^T \mathbf{A} \mathbf{x} - 2\mathbf{b}^T \mathbf{A} \mathbf{x} + \mathbf{b}^T \mathbf{b} \\ \frac{\partial O}{\partial \mathbf{x}} &= 2\mathbf{A}^T\mathbf{A}\mathbf{x} - 2\mathbf{A}^T\mathbf{b} = \mathbf{0} \\ \mathbf{A}^T\mathbf{A}\mathbf{x} &= \mathbf{A}^T\mathbf{b} \\ \mathbf{x} &= (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b} \end{aligned}$$

→ Can be solved by using a linear system solver.
Constraints

• Slightly different example
  – Line that minimizes the distances from the red points,
  – With one additional constraint:
    • The line should pass through the white point $(a_c, b_c)$.

$$\text{minimize } \sum_{i=1}^{3} (a_i x + b_i y + 1)^2 \quad \text{subject to} \quad a_c x + b_c y + 1 = 0$$
Constraints

• Constraints
  – A.k.a. hard constraints
    • c.f. soft constraints → objective (energy) term
  – A condition that must be satisfied
  – Constrained optimization
    • Optimization with some constraints
  – (Linear / non-linear) (equality / inequality) constraints
  – This lecture only covers linear equality constraints.
Constrained Optimization

• Lagrange multiplier
  – A constrained optimization can be expressed as an unconstrained optimization with a Lagrange multiplier $\lambda$:

$$\text{minimize } O(x, y) \ \text{subject to} \ C(x, y) = c \quad\Longleftrightarrow\quad \text{minimize } O(x, y) + \lambda\,(C(x, y) - c)$$

  – Why is this possible?
    • This lecture does not cover the theory of the Lagrange multiplier.
    • Reference: http://en.wikipedia.org/wiki/Lagrange_multipliers
Constrained Optimization

• Solution of constrained optimization

$$\mathbf{x}_o = \underset{\mathbf{x}}{\operatorname{argmin}}\ \tfrac{1}{2}\|\mathbf{A}\mathbf{x} - \mathbf{b}\|^2 + \lambda\,(\mathbf{c}^T\mathbf{x} + 1)$$

  – At the minimum point, the gradient of the objective function should be zero:

$$\frac{\partial O}{\partial \mathbf{x}} = \mathbf{A}^T\mathbf{A}\mathbf{x} - \mathbf{A}^T\mathbf{b} + \lambda\,\mathbf{c} = \mathbf{0}, \qquad \frac{\partial O}{\partial \lambda} = \mathbf{c}^T\mathbf{x} + 1 = 0$$

$$\begin{bmatrix} \mathbf{A}^T\mathbf{A} & \mathbf{c} \\ \mathbf{c}^T & 0 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \lambda \end{bmatrix} = \begin{bmatrix} \mathbf{A}^T\mathbf{b} \\ -1 \end{bmatrix} \quad\Longleftrightarrow\quad \hat{\mathbf{A}}\hat{\mathbf{x}} = \hat{\mathbf{b}}$$

  – Also can be solved by using a linear system solver.
Constrained Optimization

• Case of multiple constraints
  – Multiple Lagrange multipliers $\boldsymbol{\lambda}$:

$$\mathbf{x}_o = \underset{\mathbf{x}}{\operatorname{argmin}}\ \tfrac{1}{2}\|\mathbf{A}\mathbf{x} - \mathbf{b}\|^2 + \boldsymbol{\lambda}^T(\mathbf{C}\mathbf{x} - \mathbf{c})$$

$$\frac{\partial O}{\partial \mathbf{x}} = \mathbf{A}^T\mathbf{A}\mathbf{x} - \mathbf{A}^T\mathbf{b} + \mathbf{C}^T\boldsymbol{\lambda} = \mathbf{0}, \qquad \frac{\partial O}{\partial \boldsymbol{\lambda}} = \mathbf{C}\mathbf{x} - \mathbf{c} = \mathbf{0}$$

$$\begin{bmatrix} \mathbf{A}^T\mathbf{A} & \mathbf{C}^T \\ \mathbf{C} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} \mathbf{A}^T\mathbf{b} \\ \mathbf{c} \end{bmatrix} \quad\Longleftrightarrow\quad \hat{\mathbf{A}}\hat{\mathbf{x}} = \hat{\mathbf{b}}$$
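The KKT system above can be assembled and solved with a small dense solver. A C++ sketch for the single-constraint line example (the helpers `solveDense` and `constrainedFit` are my own illustrative names, not from the lecture's library):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Gaussian elimination with partial pivoting for a small dense system.
std::vector<double> solveDense(std::vector<std::vector<double>> M,
                               std::vector<double> rhs) {
    int n = (int)rhs.size();
    for (int k = 0; k < n; ++k) {
        int piv = k;                               // pick the largest pivot
        for (int r = k + 1; r < n; ++r)
            if (std::fabs(M[r][k]) > std::fabs(M[piv][k])) piv = r;
        std::swap(M[k], M[piv]);
        std::swap(rhs[k], rhs[piv]);
        for (int r = k + 1; r < n; ++r) {          // eliminate column k
            double f = M[r][k] / M[k][k];
            for (int c = k; c < n; ++c) M[r][c] -= f * M[k][c];
            rhs[r] -= f * rhs[k];
        }
    }
    std::vector<double> x(n);
    for (int k = n - 1; k >= 0; --k) {             // back substitution
        double s = rhs[k];
        for (int c = k + 1; c < n; ++c) s -= M[k][c] * x[c];
        x[k] = s / M[k][k];
    }
    return x;
}

// Minimize 1/2 sum_i (a_i*x + b_i*y + 1)^2 subject to ac*x + bc*y + 1 = 0
// via the KKT system [A^T A, c; c^T, 0][x; lambda] = [A^T b; -1].
std::vector<double> constrainedFit(const double* a, const double* b, int n,
                                   double ac, double bc) {
    double saa = 0, sab = 0, sbb = 0, sa = 0, sb = 0;
    for (int i = 0; i < n; ++i) {
        saa += a[i] * a[i]; sab += a[i] * b[i]; sbb += b[i] * b[i];
        sa += a[i]; sb += b[i];
    }
    std::vector<std::vector<double>> M = {
        { saa, sab, ac  },
        { sab, sbb, bc  },
        { ac,  bc,  0.0 } };
    std::vector<double> rhs = { -sa, -sb, -1.0 };
    return solveDense(M, rhs);  // returns {x, y, lambda}
}
```

Note the constraint row holds exactly, while the data terms are satisfied only in the least-squares sense.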
Implementation

• How to solve the linear system?
  – Provided by many libraries.
  – Using OpenCV (legacy C API):
    • Data structure for storing a matrix
      – CvMat *aMat = cvCreateMat(nRow, nCol, CV_32F);
      – cvReleaseMat(&aMat);
    • Set and get an element of a matrix
      – cvmSet(aMat, m, n, 1.0f);
      – float mn = cvmGet(aMat, m, n);
    • Linear operations
      – cvAdd(aMat, bMat, cMat);
      – cvGEMM(aMat, aMat, 1.0f, NULL, 0.0f, ataMat, CV_GEMM_A_T); // ataMat = A^T A
    • Solver
      – cvSolve(aMat, bVect, xVect, CV_LU);
Practical Example

• Poisson image cloning
  – An interesting solution for compositing images
  – Paste the modified gradient of the source image while satisfying the boundary colors.
Poisson Image Cloning

• Problem description
  – Source image pixel: $s_{x,y}$
  – Target image pixel: $t_{x,y}$
  – Unknown new image pixel: $n_{x,y}$
  – Objective
    • Minimize the difference of gradients between the new image and the source image
  – Constraint
    • Pixel values at the boundary should be equal to those of the target image.
Poisson Image Cloning

• Problem description
  – Mathematical formulation:

$$\text{minimize } \sum_{x,y} \left( n_{x+1,y} - n_{x,y} - (s_{x+1,y} - s_{x,y}) \right)^2 + \left( n_{x,y+1} - n_{x,y} - (s_{x,y+1} - s_{x,y}) \right)^2$$

$$\text{subject to } n_{x,y} = t_{x,y} \ \text{for } (x, y) \in \text{boundary}$$
Poisson Image Cloning

• Matrix formulation

$$\mathbf{n}_o = \underset{\mathbf{n}}{\operatorname{argmin}}\ \tfrac{1}{2}\|\mathbf{G}\mathbf{n} - \mathbf{G}\mathbf{s}\|^2 + \boldsymbol{\lambda}^T(\mathbf{B}\mathbf{n} - \mathbf{B}\mathbf{t})$$

where each row of $\mathbf{G}$ holds a $-1$ at pixel $(x, y)$ and a $+1$ at its neighbor $(x+1, y)$ or $(x, y+1)$, and each row of $\mathbf{B}$ holds a $1$ at one boundary pixel $(x, y)$.

$$\frac{\partial O}{\partial \mathbf{n}} = \mathbf{G}^T\mathbf{G}\mathbf{n} - \mathbf{G}^T\mathbf{G}\mathbf{s} + \mathbf{B}^T\boldsymbol{\lambda} = \mathbf{0}, \qquad \frac{\partial O}{\partial \boldsymbol{\lambda}} = \mathbf{B}\mathbf{n} - \mathbf{B}\mathbf{t} = \mathbf{0}$$

$$\begin{bmatrix} \mathbf{G}^T\mathbf{G} & \mathbf{B}^T \\ \mathbf{B} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{n} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} \mathbf{G}^T\mathbf{G}\mathbf{s} \\ \mathbf{B}\mathbf{t} \end{bmatrix}$$
Poisson Image Cloning

• Why is it called 'Poisson'?
  – Setting $\partial O / \partial n_{x,y} = 0$ for an interior pixel yields the 5-point stencil $(-1, -1, 4, -1, -1)$:

$$4 n_{x,y} - n_{x-1,y} - n_{x+1,y} - n_{x,y-1} - n_{x,y+1} = 4 s_{x,y} - s_{x-1,y} - s_{x+1,y} - s_{x,y-1} - s_{x,y+1}$$

  – which is a discretization of the Poisson equation

$$\nabla^2 n = \nabla^2 s, \qquad \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
Poisson Image Cloning

• Implementation issues
  – Computing $\mathbf{G}^T\mathbf{G}$ explicitly can be expensive → construct $\mathbf{L} = \mathbf{G}^T\mathbf{G}$ directly.
  – The number of neighbors is not always equal to four: interior pixels get the stencil $(-1, -1, 4, -1, -1)$, while pixels with three or two neighbors get center values 3 or 2 with the corresponding $-1$ entries.
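Constructing $\mathbf{L} = \mathbf{G}^T\mathbf{G}$ directly, as suggested above, amounts to emitting the stencil per pixel. A possible C++ sketch (the `Triplet` struct and `buildLaplacian` are illustrative names, not the lecture's library):

```cpp
#include <cassert>
#include <vector>

// One non-zero entry of a sparse matrix, stored as a (row, col, value) triplet.
struct Triplet { int row, col; double val; };

// Build L = G^T G for a w x h grid directly from the stencil, instead of
// forming G and multiplying: the diagonal holds the number of grid
// neighbors (4 inside, 3 on edges, 2 at corners), and each existing
// neighbor contributes a -1 entry.
std::vector<Triplet> buildLaplacian(int w, int h) {
    std::vector<Triplet> L;
    const int dx[4] = { -1, 1, 0, 0 };
    const int dy[4] = { 0, 0, -1, 1 };
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int row = y * w + x, degree = 0;
            for (int k = 0; k < 4; ++k) {
                int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
                L.push_back({ row, ny * w + nx, -1.0 });  // -1 per neighbor
                ++degree;
            }
            L.push_back({ row, row, (double)degree });    // diagonal 4/3/2
        }
    return L;
}
```

On a 3 x 3 grid this produces a 4 on the diagonal for the center pixel and a 2 for each corner, matching the stencils on the slide.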
Sparse Linear System Solver
Dense vs. Sparse

• In the previous example,
  – Assume the size of the composite image is 200 x 200
  – 200 x 200 → 40,000 pixels → 40,000 unknowns
  – We should solve the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$
  – Size of A: 40,000 x 40,000 → 1,600,000,000 elements → at 4 bytes per float, about 6.4 GB
  – Computing the inverse of a (40,000 x 40,000) matrix is very, very expensive.
Dense vs. Sparse

• Concept of the dense / sparse matrix
  – Dense matrix
    • A matrix with few zero elements
  – Sparse matrix
    • A matrix with many zero elements, e.g.

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$

• Storing the dense / sparse matrix
  – Dense storage: store all elements → [1, 0, 0; 0, -1, 0; 0, 0, 2]
  – Sparse storage: store only the non-zero elements as (row, column, value) triplets → [(1, 1, 1), (2, 2, -1), (3, 3, 2)]
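Triplet storage also makes matrix-vector multiplication cost $O(\text{nnz})$ instead of $O(n^2)$, since only stored non-zeros are touched. An illustrative C++ sketch (names are mine):

```cpp
#include <cassert>
#include <vector>

// Sparse matrix in (row, col, value) triplet form, as on the slide
// (0-based indices here).
struct Entry { int row, col; double val; };

// y = A * x touching only the stored non-zeros.
std::vector<double> spmv(const std::vector<Entry>& A,
                         const std::vector<double>& x) {
    std::vector<double> y(x.size(), 0.0);
    for (const Entry& e : A)
        y[e.row] += e.val * x[e.col];  // accumulate one non-zero's contribution
    return y;
}
```

For the diagonal example above, multiplying by (3, 4, 5) costs three operations rather than nine.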
Dense vs. Sparse

• Linear system of Poisson image cloning

$$\begin{bmatrix} \mathbf{G}^T\mathbf{G} & \mathbf{B}^T \\ \mathbf{B} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{n} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} \mathbf{G}^T\mathbf{G}\mathbf{s} \\ \mathbf{B}\mathbf{t} \end{bmatrix}$$

  – $\mathbf{G}^T\mathbf{G}$ has at most 5 non-zero elements per row.
  – The number of non-zero elements of $\mathbf{B}$ equals the number of boundary pixels.
  – For 200 x 200 images → only about (40,000 x 5 + α) stored elements
  – Efficient matrix-vector multiplication.
Steepest-Descent Method

• How to solve the sparse linear system?
  – Computing the inverse is expensive
  → Find an optimized solution iteratively
  – Strategy of the iterative method:
    • Set the initial solution
    • Until the objective value converges,
      – Compute the gradient of the objective function at the current solution
      – Move the solution in the inverse direction of the gradient:

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \alpha\, \nabla O(\mathbf{x}_i)$$
Steepest-Descent Method

• Steepest-descent method for a linear system

$$f(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x} - \mathbf{b}^T\mathbf{x} + c$$

  – Gradient of the objective function (assuming A is symmetric):

$$\nabla f(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b}$$

  – Next iteration, with residual $\mathbf{r}_i = \mathbf{b} - \mathbf{A}\mathbf{x}_i$:

$$\mathbf{x}_{i+1} = \mathbf{x}_i + \alpha_i\, \mathbf{r}_i$$

  – How to determine the step size $\alpha_i$?
Steepest-Descent Method

• Determine the optimal step
  – $\alpha_i$ minimizes the objective function along the search direction, where the directional derivative is zero:

$$\frac{d}{d\alpha} f(\mathbf{x}_{i+1}) = \nabla f(\mathbf{x}_{i+1})^T \frac{d\mathbf{x}_{i+1}}{d\alpha} = \nabla f(\mathbf{x}_{i+1})^T \mathbf{r}_i = 0$$

$$\left(\mathbf{b} - \mathbf{A}(\mathbf{x}_i + \alpha_i \mathbf{r}_i)\right)^T \mathbf{r}_i = 0 \;\Rightarrow\; \mathbf{r}_i^T\mathbf{r}_i - \alpha_i\, \mathbf{r}_i^T\mathbf{A}\mathbf{r}_i = 0 \;\Rightarrow\; \alpha_i = \frac{\mathbf{r}_i^T \mathbf{r}_i}{\mathbf{r}_i^T \mathbf{A}\, \mathbf{r}_i}$$
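Putting the update rule and the optimal step size together gives the whole algorithm. A self-contained C++ sketch (a small dense matrix stands in for a sparse one to keep the example short; names are mine):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense stand-in for a sparse matrix

Vec matVec(const Mat& A, const Vec& x) {
    Vec y(x.size(), 0.0);
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

double dot(const Vec& u, const Vec& v) {
    double s = 0.0;
    for (size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
    return s;
}

// Steepest descent for A x = b (A symmetric positive definite):
// r_i = b - A x_i, alpha_i = (r_i^T r_i) / (r_i^T A r_i),
// x_{i+1} = x_i + alpha_i r_i.
Vec steepestDescent(const Mat& A, const Vec& b, int maxIter, double tol) {
    Vec x(b.size(), 0.0);  // initial solution x_0 = 0
    for (int it = 0; it < maxIter; ++it) {
        Vec Ax = matVec(A, x);
        Vec r(b.size());
        for (size_t i = 0; i < b.size(); ++i) r[i] = b[i] - Ax[i];
        double rr = dot(r, r);
        if (rr < tol * tol) break;                 // converged
        double alpha = rr / dot(r, matVec(A, r));  // optimal step size
        for (size_t i = 0; i < x.size(); ++i) x[i] += alpha * r[i];
    }
    return x;
}
```

For $\mathbf{A} = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix}$, $\mathbf{b} = (1, 2)^T$, the iteration approaches the exact solution $(1/11,\ 7/11)$.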
Conjugate Gradient Method

• Performance of the steepest-descent method
  – The simplest iterative algorithm that solves the linear system,
  – But slow convergence,
    • Especially near the optimized point.

• Conjugate Gradient Method
  – One of the most popular methods for solving sparse linear systems
  – Uses the gradient direction and its conjugate directions to find the solution
  – In exact arithmetic, the conjugate gradient method converges within at most N iterations for an (N x N) matrix system
  – This lecture does not cover the details of the CGM.
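For comparison with steepest descent, a minimal unpreconditioned CG loop looks like this (a sketch with a dense stand-in matrix; see Shewchuk's notes for the derivation of the updates):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

Vec mul(const Mat& A, const Vec& x) {
    Vec y(x.size(), 0.0);
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

double dot(const Vec& u, const Vec& v) {
    double s = 0.0;
    for (size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
    return s;
}

// Unpreconditioned conjugate gradient for A x = b (A symmetric positive
// definite). In exact arithmetic it terminates within n iterations.
Vec conjugateGradient(const Mat& A, const Vec& b, double tol) {
    size_t n = b.size();
    Vec x(n, 0.0), r = b, d = b;  // x_0 = 0, so r_0 = b - A x_0 = b
    double rr = dot(r, r);
    for (size_t it = 0; it < n && rr > tol * tol; ++it) {
        Vec Ad = mul(A, d);
        double alpha = rr / dot(d, Ad);            // step along direction d
        for (size_t i = 0; i < n; ++i) { x[i] += alpha * d[i]; r[i] -= alpha * Ad[i]; }
        double rrNew = dot(r, r);
        double beta = rrNew / rr;                  // next conjugate direction
        for (size_t i = 0; i < n; ++i) d[i] = r[i] + beta * d[i];
        rr = rrNew;
    }
    return x;
}
```

On the same 2 x 2 example as the steepest-descent sketch, CG reaches the exact solution within two iterations.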
Conjugate Gradient Method

• Many improved versions of the CGM
  – For stabilized and fast convergence:
    – Preconditioned Conjugate Gradient Method
    – Conjugate Gradient Squared Method
    – Bi-Conjugate Gradient Method
    – Bi-Conjugate Gradient Stabilized Method
  – Reference
    • J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, CMU tech. report
    • Wikipedia
Implementation

• Libraries for sparse linear solvers
  – TAUCS
    • http://www.tau.ac.il/~stoledo/taucs/
  – OpenNL
    • http://alice.loria.fr/index.php/software/4-library/23-opennl.html

• My original library for the CGM
  – Made by Ji-yong Kwon
  – A simple implementation for dense / sparse matrices
  – Features
    • 2 types of data structure: CDenseVect, CSparseMat
    • Linear operations between them
    • A set of sparse matrix solvers (CGM, BiCGSTAB)
    • Multi-core processing
Implementation

• Basic usage
  – CSparseMat aMat;
    CDenseVect bVect, xVect;
    CSparseSolverBiCGSTAB solver;
  – Memory allocation
    • bVect.Init(nRow);
    • aMat.Create(nRow, nCol, 32); // maximum number of elements per row
    • Automatic de-allocation
  – Set/get element
    • bVect.Set(row, 1.0f);
      float value = bVect.Get(row);
    • aMat.AddElement(row, col, 1.0f);
      float value = aMat.GetElement(row, elementId);
Implementation

• Basic usage
  – Solver initialization
    • solver.InitSolver(aMat, bVect, xVect, 1.0f);
      – This function initializes the solver's state
      – xVect should be initialized beforehand.
  – Solve
    • while(solver.CheckTermination())
        solver.OneStep(aMat, bVect, xVect);
  – Residual
    • float residual = solver.GetResidual();
  – Additional comments
    • For Visual Studio, set project property → C/C++ → Language → Provide OpenMP: Yes
    • Release mode is much faster than debug mode.
Summary

• Concept of the least-square optimization
  – Useful for converting a hard problem into an approximated version
  – Can be solved by using a linear system solver,
    • Even when the problem has multiple linear equality constraints.

• Concept of the sparse linear system
  – Solving a large linear system with a dense matrix can be very expensive
  → Use iterative methods to solve the sparse linear system efficiently.

• The most important thing is to design the objective function.