CONJUGATE GRADIENT METHOD
Monica GarikaChandana Guduru
METHODS TO SOLVE LINEAR SYSTEMS
Direct methods
- Gaussian elimination method
- LU factorization method
- Simplex method of linear programming
Iterative methods
- Jacobi method
- Gauss-Seidel method
- Multigrid method
- Conjugate gradient method
Conjugate Gradient method
The CG method is an algorithm for the numerical solution of a particular system of linear equations Ax = b, where A is symmetric (A = Aᵀ) and positive definite (xᵀ A x > 0 for all nonzero vectors x). If A is symmetric and positive definite, then solving Ax = b is equivalent to minimizing the quadratic function
Q(x) = ½ xᵀA x − xᵀb + c
Conjugate gradient method
The conjugate gradient method builds up a solution x* ∈ Rⁿ in at most n steps in the absence of round-off errors. When round-off errors are considered, more than n steps may be needed to get a good approximation of the exact solution x*. For sparse matrices, a good approximation of the exact solution can often be achieved in fewer than n steps, even in the presence of round-off errors.
In oil reservoir simulation, the number of linear equations corresponds to the number of grid cells of the reservoir. The unknown vector x is the oil pressure of the reservoir: each element of x is the oil pressure of a specific grid cell.
Practical Example
Linear System
A x = b, where A is a square matrix, x is the unknown vector (what we want to find), and b is the known vector.
Positive Definite Matrix
xᵀ A x = [x1 x2 … xn] A [x1 x2 … xn]ᵀ > 0 for every nonzero x
Procedure
- Find an initial guess x0 for the solution
- Generate successive approximations to the solution, starting from x0
- Generate residuals
- Generate search directions
x0 = 0, r0 = b, p0 = r0
for k = 1, 2, 3, ...
    αk = (rk−1ᵀ rk−1) / (pk−1ᵀ A pk−1)    (step length)
    xk = xk−1 + αk pk−1                   (approximate solution)
    rk = rk−1 − αk A pk−1                 (residual)
    βk = (rkᵀ rk) / (rk−1ᵀ rk−1)          (improvement)
    pk = rk + βk pk−1                     (search direction)
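As a concrete sketch, the iteration above can be implemented in Python with NumPy; the function name, the tolerance, and the small test system below are illustrative choices, not part of the original slides:

```python
import numpy as np

def conjugate_gradient(A, b, max_iter=None, tol=1e-10):
    """Conjugate gradient for Ax = b, A symmetric positive definite.

    Follows the iteration above: x0 = 0, r0 = b, p0 = r0.
    """
    n = b.shape[0]
    if max_iter is None:
        max_iter = n                  # at most n steps in exact arithmetic
    x = np.zeros(n)
    r = b.copy()                      # r0 = b - A x0 = b, since x0 = 0
    p = r.copy()                      # p0 = r0
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)     # step length
        x = x + alpha * p             # approximate solution
        r = r - alpha * Ap            # residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:     # residual sufficiently small
            break
        beta = rs_new / rs_old        # improvement
        p = r + beta * p              # search direction
        rs_old = rs_new
    return x

# Illustrative 2x2 SPD system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b))  # True
```

Note that the β update reuses rᵀr from the previous step, so each iteration needs only one matrix-vector product.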
Conjugate gradient iteration
Each iteration of the conjugate gradient method has the form
x(t) = x(t−1) + s(t) d(t)
where x(t) is the new approximation computed from the old value x(t−1), s(t) is a scalar step size, and d(t) is the direction vector. Before the first iteration, the values of x(0), d(0), and g(0) must be set.
Steps of the conjugate gradient method
Every iteration t calculates x(t) in four steps:
Step 1: Compute the gradient: g(t) = A x(t−1) − b
Step 2: Compute the direction vector: d(t) = −g(t) + [g(t)ᵀ g(t) / g(t−1)ᵀ g(t−1)] d(t−1)
Step 3: Compute the step size: s(t) = −[d(t)ᵀ g(t)] / [d(t)ᵀ A d(t)]
Step 4: Compute the new approximation: x(t) = x(t−1) + s(t) d(t)
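One detail worth checking: the gradient g(t) = A x(t−1) − b is exactly the negated residual b − A x(t−1), so this gradient-based formulation agrees with the residual-based iteration on the earlier slides. A minimal sketch (the matrix and vectors are illustrative, not from the slides):

```python
import numpy as np

# The gradient of Q(x) = 1/2 x^T A x - x^T b is A x - b, which is the
# negated residual r = b - A x. Check this at an arbitrary point.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.array([0.5, -0.25])   # any current approximation
g = A @ x - b                # Step 1 of the iteration above
r = b - A @ x                # residual from the earlier formulation
print(np.allclose(g, -r))    # True
```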
Sequential Algorithm
1) x0 := 0
2) r0 := b − A x0
3) p0 := r0
4) k := 0
5) kmax := maximum number of iterations to be done
6) if k < kmax then perform steps 8 to 16
7) if k = kmax then exit
8) v := A pk
9) αk := (rkᵀ rk) / (pkᵀ v)
10) xk+1 := xk + αk pk
11) rk+1 := rk − αk v
12) if rk+1 is sufficiently small then go to 16
13) βk := (rk+1ᵀ rk+1) / (rkᵀ rk)
14) pk+1 := rk+1 + βk pk
15) k := k + 1 and go to step 6
16) result := xk+1
Complexity analysis
- Identify data dependencies
- Identify eventual communications
The method requires a large number of operations; as the number of equations increases, the complexity also increases.
Why parallelize?
Parallelizing the conjugate gradient method is a way to increase its performance. It saves memory, because each processor stores only the portions of the rows of matrix A that contain non-zero elements, and it executes faster, because the matrix is divided into portions that are processed concurrently.
How to parallelize?
For example, choose a row-wise block-striped decomposition of A and replicate all vectors. The multiplication of A by a vector may then be performed without any communication, but an all-gather communication is needed to replicate the result vector. The overall time complexity of the parallel algorithm is
Θ(n²·w / p + n·log p)
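The row-wise block-striped matrix-vector product can be sketched in plain NumPy, simulating p workers in a single process; `array_split` stands in for the row distribution and `concatenate` for the all-gather step (a real implementation would use message passing, e.g. MPI):

```python
import numpy as np

def block_striped_matvec(A, x, p):
    """Simulate a row-wise block-striped A @ x across p workers.

    Each worker holds a contiguous block of rows of A plus a full
    (replicated) copy of x, computes its slice of the product locally
    with no communication, and an all-gather (here: concatenate)
    replicates the result vector.
    """
    row_blocks = np.array_split(A, p, axis=0)            # distribute rows
    local_results = [blk @ x for blk in row_blocks]      # local mat-vec
    return np.concatenate(local_results)                 # "all-gather"

# Illustrative data
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
x = rng.standard_normal(6)
print(np.allclose(block_striped_matvec(A, x, 3), A @ x))  # True
```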
Row wise Block Striped Decomposition of a Symmetrically Banded Matrix
Dependency Graph in CG
Algorithm of a Parallel CG on each Computing Worker (cw)
1) Receive cw_r0, cw_A, cw_x0
2) cw_p0 := cw_r0
3) k := 0
4) kmax := maximum number of iterations to be done
5) if k < kmax then perform steps 7 to 22
6) if k = kmax then exit
7) v := cw_A · cw_pk
8) compute cw_Nαk (local part of the numerator of αk)
9) compute cw_Dαk (local part of the denominator of αk)
10) Send cw_Nαk, cw_Dαk
11) Receive αk
12) cw_xk+1 := cw_xk + αk · cw_pk
13) compute the partial result of rk+1: cw_rk+1 := cw_rk − αk · v
14) Send cw_xk+1, cw_rk+1
15) Receive signal
16) if signal = solution reached, go to 23
17) compute cw_Nβk (local part of the numerator of βk)
18) compute cw_Dβk (local part of the denominator of βk)
19) Send cw_Nβk, cw_Dβk
20) Receive βk
21) cw_pk+1 := cw_rk+1 + βk · cw_pk
22) k := k + 1
23) Result reached
Speedup of Parallel CG on Grid Versus Sequential CG on Intel
Communication and Waiting Time of the Parallel CG on Grid
We consider the difference between f at the solution x of Ax = b and f at any other vector p:

f(p) = ½ pᵀA p − bᵀp + c
f(x) = ½ xᵀA x − bᵀx + c

f(p) − f(x) = ½ pᵀA p − bᵀp − ½ xᵀA x + bᵀx
(replace b = Ax)   = ½ pᵀA p − xᵀA p − ½ xᵀA x + xᵀA x
(simplify)         = ½ pᵀA p − xᵀA p + ½ xᵀA x
                   = ½ (p − x)ᵀ A (p − x)
(positive definiteness)   ½ (p − x)ᵀ A (p − x) > 0 for all p ≠ x

Hence f(p) > f(x) for every p ≠ x, so the solution x of Ax = b is the unique minimizer of f.
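The claim above, that f(p) − f(x) = ½ (p − x)ᵀ A (p − x) > 0 for every p ≠ x, can be checked numerically on a randomly generated SPD matrix (all data below are illustrative; the constant c is dropped since it cancels in the difference):

```python
import numpy as np

def f(A, b, x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x (constant c omitted)."""
    return 0.5 * x @ A @ x - b @ x

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)        # SPD by construction
b = rng.standard_normal(4)
x_star = np.linalg.solve(A, b)     # the solution of A x = b

# f(p) - f(x*) should equal 1/2 (p - x*)^T A (p - x*) and be >= 0
for _ in range(100):
    p = rng.standard_normal(4)
    gap = f(A, b, p) - f(A, b, x_star)
    quad = 0.5 * (p - x_star) @ A @ (p - x_star)
    assert gap >= 0 and np.isclose(gap, quad)
print("x* minimizes f")
```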
Parallel Computation Design
Iterations of the conjugate gradient method can be executed only in sequence, so the most advisable approach is to parallelize the computations carried out within each iteration. The most time-consuming computation is the multiplication of matrix A by the vectors x and d. Additional operations of lower computational complexity are vector processing procedures (inner product, addition, subtraction, multiplication by a scalar). When implementing the parallel conjugate gradient method, parallel algorithms for matrix-vector multiplication can therefore be used.
“Pure” Conjugate Gradient Method (Quadratic Case)
Step 0: Starting at any x0, define d0 = −g0 = b − Q x0, where gk = Q xk − b is the gradient of the objective function at the point xk.
Step 1: Using dk, calculate the new point xk+1 = xk + αk dk, where
αk = −(gkᵀ dk) / (dkᵀ Q dk)
Step 2: Calculate the new conjugate gradient direction dk+1 = −gk+1 + βk dk, where
βk = (gk+1ᵀ Q dk) / (dkᵀ Q dk)
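These quadratic-case formulas can be verified numerically: αk is the exact minimizer along dk (so gk+1ᵀ dk = 0), and the new direction is Q-conjugate to the old one (dk+1ᵀ Q dk = 0). A sketch with illustrative data:

```python
import numpy as np

# One step of the "pure" CG recurrence on a quadratic, checking the
# line-minimum and Q-conjugacy properties of the formulas above.
rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
Q = M @ M.T + 5 * np.eye(5)          # SPD by construction
b = rng.standard_normal(5)

x = np.zeros(5)
g = Q @ x - b                        # g0; note d0 = -g0 = b - Q x0
d = -g
alpha = -(g @ d) / (d @ Q @ d)       # step length alpha_k
x_new = x + alpha * d
g_new = Q @ x_new - b
beta = (g_new @ Q @ d) / (d @ Q @ d)  # beta_k
d_new = -g_new + beta * d

assert np.isclose(g_new @ d, 0.0)     # alpha was the exact line minimum
assert np.isclose(d_new @ Q @ d, 0.0)  # directions are Q-conjugate
print("directions are Q-conjugate")
```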
ADVANTAGES
1) The gradient is always nonzero and linearly independent of all previous direction vectors.
2) A simple formula determines the new direction; it is only slightly more complicated than steepest descent.
3) The process makes good progress because it is based on gradients.
ADVANTAGES
The simple formulae for updating the direction vector are attractive. The method is slightly more complicated than steepest descent, but converges faster. The conjugate gradient method is an indirect (iterative) solver, is used to solve large systems, requires a small amount of memory, and does not require explicit matrix factorizations.
Conclusion
The conjugate gradient method is a linear solver used in a wide range of engineering and science applications.
However, conjugate gradient has a complexity drawback due to the high number of arithmetic operations involved in matrix-vector and vector-vector multiplications.
Our implementation reveals that despite the communication cost involved in a parallel CG, a performance improvement compared to a sequential algorithm is still possible.
References
- Dimitri P. Bertsekas and John N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods.
- Thomas Rauber and Gudula Rünger, Parallel Programming for Multicore and Cluster Systems.
- Gene Golub and James M. Ortega, Scientific Computing: An Introduction with Parallel Computing.
- Michael J. Quinn, Parallel Programming in C with MPI and OpenMP.
- Jonathan Richard Shewchuk, "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain", Edition 1¼, School of Computer Science, Carnegie Mellon University.