
Numerical Solution of

Skew-Symmetric Linear Systems

by

Tracy Lau

B.Sc., The University of British Columbia, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

The Faculty of Graduate Studies

(Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

December 2009

© Tracy Lau 2009


Abstract

We are concerned with iterative solvers for large and sparse skew-symmetric linear systems. First we discuss algorithms for computing incomplete factorizations as a source of preconditioners. This leads to a new Crout variant of Gaussian elimination for skew-symmetric matrices. Details on how to implement the algorithms efficiently are provided, and a few numerical results are presented for these preconditioners. We also examine a specialized preconditioned minimum residual solver. An explicit derivation is given, detailing the effects of skew-symmetry on the algorithm.


Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
2 Factorizations of skew-symmetric matrices
   2.1 Gaussian elimination variants
   2.2 Bunch's LDLT decomposition
   2.3 Incomplete factorizations
   2.4 Crout factorization for skew-symmetric matrices
   2.5 Numerical results
3 Minimum residual iterations for skew-symmetric systems
   3.1 Skew-Lanczos
   3.2 QR factorization of Tk+1,k
   3.3 Skew-MINRES
   3.4 Preconditioned skew-MINRES
4 Conclusions and future work
Bibliography


List of Tables

2.1 Convergence results of preconditioned GMRES for the test problem.


List of Figures

2.1 Residual vector norms for the test problem.


Acknowledgements

I am deeply indebted to my supervisor, Chen Greif, for tirelessly guiding and encouraging me on the journey that led to this thesis, sharing his knowledge and enthusiasm, and for supporting me in all my endeavours. Jim Varah has also assisted in my work and kindly volunteered to be a reader of this thesis, providing many helpful comments.

Scientific Computing Lab members Dan Li, Ewout van den Berg, Hui Huang, and Shidong Shan have been of much help and shown me great kindness throughout my graduate studies. I am especially grateful to Ewout for consistently going out of his way to offer his expertise and excellent advice from the very first time I stepped into the lab for my undergraduate research. I thank all my labmates for fielding my numerous questions, sharing their wisdom and experience, and for their good company and humour.


Chapter 1

Introduction

A real skew-symmetric matrix S is one that satisfies S = −Sᵀ. Such matrices are found in many applications, perhaps not explicitly, but more likely as an implicit part of the problem. Any general square matrix can be split into a symmetric part and a skew-symmetric part,

A = (A + Aᵀ)/2 + (A − Aᵀ)/2,

and any non-symmetric linear system has a nonzero skew component. In some problems, the skew component may dominate the system, for example in convection-dominated convection-diffusion problems.

Here are a few basic properties of a skew-symmetric matrix A:

• Its diagonal is zero, since aij = −aji implies aii = 0 for all i.

• For any x, xᵀAx = 0, since the scalar xᵀAx satisfies (xᵀAx)ᵀ = −xᵀAx.

• If the dimension n of A is odd, then A is singular.

• A⁻¹, if it exists, is also skew-symmetric.

• All eigenvalues of A are either 0 or occur in purely imaginary complex conjugate pairs.

Skew-symmetric matrices can be factorized as PAPᵀ = LDLᵀ, where P is a permutation matrix, L is block lower triangular, and D is block diagonal with 1×1 and 2×2 blocks. This is similar to the LDLT factorization for symmetric indefinite matrices [6, §11.1].
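These properties are easy to check numerically. The following NumPy sketch (an illustration of ours, not from the thesis, which uses Matlab prototypes) verifies the symmetric/skew split and the properties above on random test data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random square matrix and its symmetric/skew split (hypothetical test data).
n = 6
M = rng.standard_normal((n, n))
Sym = (M + M.T) / 2            # symmetric part
A = (M - M.T) / 2              # skew-symmetric part
assert np.allclose(Sym + A, M)

assert np.allclose(np.diag(A), 0)          # zero diagonal
x = rng.standard_normal(n)
assert abs(x @ A @ x) < 1e-12              # x^T A x = 0 for every x
Ainv = np.linalg.inv(A)
assert np.allclose(Ainv, -Ainv.T)          # the inverse is skew-symmetric too

# Eigenvalues are purely imaginary (or zero).
ev = np.linalg.eigvals(A)
assert np.allclose(ev.real, 0, atol=1e-10)

# Odd dimension forces singularity: det(A) = det(-A^T) = (-1)^n det(A).
B = rng.standard_normal((5, 5))
B = (B - B.T) / 2
assert abs(np.linalg.det(B)) < 1e-10
```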

In this thesis, we are concerned with solving large and sparse skew-symmetric systems using iterative solvers, and so preconditioners are the focus of the first part. With the goal of using incomplete factorizations as preconditioners, we begin in Chapter 2 by discussing general factorizations of skew-symmetric matrices. The Bunch decomposition [1] is stable with appropriate monitoring of element growth, and it can be adapted to produce incomplete factorizations. By using skew-symmetry, it halves work and storage when compared to general LU factorizations. For an efficient algorithm for sparse matrices, however, we turn to the Crout variants of Gaussian elimination, and see how exactly to incorporate pivoting such that skew-symmetry is preserved. A few numerical experiments illustrate the performance of preconditioners generated by this procedure.

In Chapter 3, we discuss the iterative solvers themselves. The various properties of skew-symmetric matrices can lead to specific variants of existing solvers. We focus on a minimum residual algorithm, starting with a specialized Lanczos procedure and examining how the matrix structure affects MINRES [10]. Finally, we examine the preconditioned skew-MINRES algorithm.

Thesis contributions

In the section on factorizations, we derive a new algorithm to compute incomplete factorizations of skew-symmetric matrices, very much in the spirit of [8, Algorithm 3.3] for symmetric matrices. Our algorithm is based on an incomplete Crout variant of Gaussian elimination and incorporates the partial pivoting strategy of Bunch [1].

Implementation details are given for the two main factorization algorithms we discuss. In particular, we consider the issues involved with working on only half of the matrix. Prototype code stemming from our work was developed in Matlab, and numerical experiments were run to test our preconditioners.

Our work on MINRES unpacks the details of the derivation of the method given in [5]. We present explicit algorithms for both the unpreconditioned and preconditioned schemes.


Chapter 2

Factorizations of skew-symmetric matrices

We are interested in using iterative solvers on skew-symmetric systems, and so it is natural to explore the options for preconditioning. In particular, we look at incomplete LDLT factorizations based on the well-known ILU(0) and ILUT factorizations [11].

We start by briefly looking at two basic variants of Gaussian elimination, since the order in which the elimination is executed affects the type of dropping scheme that can be implemented. The Bunch LDLT decomposition is then discussed, along with some considerations for working with skew-symmetric matrices. We then review incomplete factorizations and mention how they can be generated with the Bunch decomposition. However, Crout factorizations may be better suited to generating incomplete factorizations for sparse matrices, so we derive a skew-symmetric version. Finally, a few numerical experiments examine the performance of the preconditioners generated by this new variant.

2.1 Gaussian elimination variants

Gaussian elimination for a general matrix A is typically presented in its KIJ form (Algorithm 2.1), so-named due to the order of the loop indices.

Algorithm 2.1 KIJ Gaussian elimination for general matrices
1: for k = 1 to n − 1 do
2:   for i = k + 1 to n do
3:     aik = aik/akk
4:     for j = k + 1 to n do
5:       aij = aij − aik akj
6:     end for
7:   end for
8: end for

At step k, every row below row k is modified as Schur complement updates for the Ak+1:n,k+1:n submatrix are computed. This is fine for a dense matrix, but it is inefficient when working with matrices stored in sparse mode. In this case, the IKJ variant of Gaussian elimination (Algorithm 2.2) is more efficient, modifying only row i in step i. It accesses rows above row i, which have previously been modified, but not those below. This is referred to as a "delayed-update" variant, since all Schur complement updates for a particular element aij are computed within step i [8].

Algorithm 2.2 IKJ Gaussian elimination for general matrices
1: for i = 2 to n do
2:   for k = 1 to i − 1 do
3:     aik = aik/akk
4:     for j = k + 1 to n do
5:       aij = aij − aik akj
6:     end for
7:   end for
8: end for

The KIJ and IKJ algorithms produce the same complete factorization of a matrix if it has an LU decomposition [11, Proposition 10.3]. Both of these variants can be adapted in various ways to decomposing skew-symmetric matrices. Beyond modifying them to compute the block LDLT factorization, the idea is to use skew-symmetry wherever possible to make gains over the plain Gaussian elimination process.
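As a sanity check of this equivalence, here is a NumPy sketch (our illustration; variable names are ours) of Algorithms 2.1 and 2.2, tested on a diagonally dominant matrix so that no pivoting is needed:

```python
import numpy as np

def kij_lu(A):
    # KIJ Gaussian elimination (Algorithm 2.1): eager Schur complement updates.
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i, k] /= A[k, k]
            for j in range(k + 1, n):
                A[i, j] -= A[i, k] * A[k, j]
    return A          # strictly lower part holds L, upper part holds U

def ikj_lu(A):
    # IKJ Gaussian elimination (Algorithm 2.2): delayed updates, row i at step i.
    A = A.astype(float).copy()
    n = A.shape[0]
    for i in range(1, n):
        for k in range(i):
            A[i, k] /= A[k, k]
            for j in range(k + 1, n):
                A[i, j] -= A[i, k] * A[k, j]
    return A

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 7)) + 7 * np.eye(7)   # diagonally dominant test matrix
F1, F2 = kij_lu(A), ikj_lu(A)
assert np.allclose(F1, F2)                        # same factors from both variants
L = np.tril(F1, -1) + np.eye(7)
U = np.triu(F1)
assert np.allclose(L @ U, A)                      # and they really factor A
```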

2.2 Bunch’s LDLT decomposition

A method for decomposing a skew-symmetric matrix A into LDLT is detailed in [1]. It takes advantage of the structure of A to halve both work and storage, and stability can be increased with pivoting. It generates the factorization PAPᵀ = LDLᵀ, where P is a permutation matrix, L is block unit lower triangular containing the multipliers, and D is block diagonal with 1×1 and 2×2 blocks. Because the blocks of D are also skew, they are either 0 or of the form

[ 0  −α
  α   0 ],

and in the latter case they are nonsingular in this decomposition.

We describe here a simplified version of the decomposition, and consider throughout only nonsingular matrices. This implies that n is even, and that D will have only 2×2 blocks on its diagonal. When performing operations on only half the matrix, we will use the lower triangular part.


The decomposition starts by partitioning A as

A = [ S  −Cᵀ
      C   B  ],

where S is 2×2, C is (n−2)×2, and B is (n−2)×(n−2). Note that S and B are also skew-symmetric. If S = [0 −a21; a21 0] is nonsingular, i.e., a21 ≠ 0, then the first step in the factorization is

A = [ S  −Cᵀ ] = [ I     0 ] [ S       0       ] [ I  −S⁻¹Cᵀ ]
    [ C   B  ]   [ CS⁻¹  I ] [ 0  B + CS⁻¹Cᵀ   ] [ 0     I   ].

The Schur complement B + CS⁻¹Cᵀ is again skew-symmetric, so to continue the factorization, repeat this first step on the (n−2)×(n−2) submatrix. Because the submatrix in each step is skew-symmetric, the factorization can be carried out using only the lower triangular part of the matrix. At each step, A can be overwritten with the multipliers in CS⁻¹ as well as the lower triangular parts of S and B + CS⁻¹Cᵀ.
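The block identity above can be verified directly. The following NumPy sketch (our illustration, with random test data) performs one elimination step and checks that the Schur complement is skew-symmetric and that the three factors reproduce A:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
A = rng.standard_normal((n, n))
A = A - A.T                         # skew-symmetric test matrix

S = A[:2, :2]                       # 2x2 pivot block [[0, -a21], [a21, 0]]
C = A[2:, :2]
B = A[2:, 2:]
Sinv = np.linalg.inv(S)

schur = B + C @ Sinv @ C.T
assert np.allclose(schur, -schur.T)       # Schur complement is skew again

# Note (CS^-1)^T = -S^-1 C^T because S^-T = -S^-1, so the right factor is L^T.
Lb = np.block([[np.eye(2), np.zeros((2, n - 2))],
               [C @ Sinv, np.eye(n - 2)]])
Db = np.block([[S, np.zeros((2, n - 2))],
               [np.zeros((n - 2, 2)), schur]])
assert np.allclose(Lb @ Db @ Lb.T, A)     # one step reproduces A exactly
```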

If S is singular, i.e., a21 = 0, there must be some i, 2 ≤ i ≤ n, such that ai1 ≠ 0, since A is nonsingular. Interchange the ith and second rows and columns to move ai1 into the (2, 1) position. This gives

A = P1ᵀ [ S  −Cᵀ ] P1,
        [ C   B  ]

where S is now nonsingular and P1 = P1ᵀ is obtained by interchanging the ith and second columns of the identity matrix.

A high-level summary of the decomposition is given in Algorithm 2.3. The factors are obtained as L, which is the collection of CS⁻¹ from each step, and D, the collection of matrices S from each step. Both L and D can be stored in half of A, and pivoting information can be stored in one vector.

Algorithm 2.3 Bunch LDLT for skew-symmetric matrices
1: for k = 1, 3, . . . , n − 1 do
2:   find pivot apq and permute A
3:   set S = Ak:k+1,k:k+1, C = Ak+2:n,k:k+1, B = Ak+2:n,k+2:n
4:   Ak+2:n,k:k+1 = CS⁻¹
5:   Ak+2:n,k+2:n = B + CS⁻¹Cᵀ
6: end for

With regard to using only half of A, there is a small implementation issue relating to the exact amount of storage required in total. When computing the multiplier CS⁻¹ and the Schur complement B + CS⁻¹Cᵀ, it makes sense to avoid computing CS⁻¹ twice. It is then natural to overwrite C with CS⁻¹, but Cᵀ is still needed in the computation of the Schur complement. There are two straightforward ways to resolve this. If memory is not an issue, temporarily store either C or CS⁻¹, which only requires two vectors in either case, so that both are available for the Schur complement computation that follows. Alternatively, use the fact that Cᵀ = ((CS⁻¹)S)ᵀ, so that Cᵀ may be recovered from S and CS⁻¹. Since S is skew-symmetric, this amounts to multiplying two vectors by scalars and swapping their positions; in practice this only involves indexing into the proper elements if the Schur complement is computed element-wise.

Pivoting and symmetric permutations

The pivoting scheme used above only ensures that the factorization can be completed; it does nothing to control element growth in the factors. In [1], Bunch proposes a partial pivoting scheme that finds max 2≤p≤n, 1≤q≤2 |apq|. It then moves that element into the block pivot A1:2,1:2 by interchanging the first and qth rows and columns, followed by interchanging the pth and second rows and columns.

Of course, interchanging rows and columns should also exploit skew-symmetry and be performed only on the lower triangular part of the matrix. Let us examine a small 6×6 example to see how skew-symmetry affects the way elements are moved around by symmetric permutations.

Example 2.1. Let A be as shown below. The largest element in magnitude in the first two columns is a51 = 12. It is already in the first column, so we need only interchange the fifth and second rows and columns. Let Ā denote the permuted matrix.

A = [  0   −1   −5  −11  −12   −3
       1    0   −2   −7   −8   −4
       5    2    0  −13  −15   −6
      11    7   13    0  −10  −14
      12    8   15   10    0   −9
       3    4    6   14    9    0 ]

Ā = [  0  −12   −5  −11   −1   −3
      12    0   15   10    8   −9
       5  −15    0  −13    2   −6
      11  −10   13    0    7  −14
       1   −8   −2   −7    0   −4
       3    9    6   14    4    0 ]

The full matrices are shown only for reference: only the lower triangular part may be accessed. Starting in the (5, 2) position, ā52 = −a52. Elements in Ā to the right of this position take their values from elements above this position in A, and vice versa. For example, we set ā53 = −a32 instead of a23, which is off limits. That takes care of the elements that change sign. To the left of the second column, rows two and five are swapped, giving ā21 = 12 and ā51 = 1. Likewise, below row five, columns two and five are swapped; ā62 = 9 and ā65 = 4 in this example.

In general, suppose rows and columns p and q are to be interchanged, where p < q, and let Ā denote the result. Then

• āqp = −aqp,

• ātp = −aqt and āqt = −atp for t = p + 1, . . . , q − 1,

• āpt = aqt and āqt = apt for t = 1, . . . , p − 1, and

• ātp = atq and ātq = atp for t = q + 1, . . . , n.
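These interchange rules can be checked against permuting the assembled full matrix. A NumPy sketch of our own (0-based indices, illustrative only):

```python
import numpy as np

def swap_lower(Alow, p, q):
    # Interchange rows/cols p < q of a skew-symmetric matrix stored only in
    # its strictly lower triangle, following the four rules above (0-based).
    A = Alow.copy()
    n = A.shape[0]
    A[q, p] = -A[q, p]
    for t in range(p + 1, q):          # between p and q: move with a sign flip
        A[t, p], A[q, t] = -A[q, t], -A[t, p]
    for t in range(p):                 # left of column p: plain row swap
        A[p, t], A[q, t] = A[q, t], A[p, t]
    for t in range(q + 1, n):          # below row q: plain column swap
        A[t, p], A[t, q] = A[t, q], A[t, p]
    return A

# Check against symmetric permutation of the full matrix A = Alow - Alow^T.
rng = np.random.default_rng(4)
n = 6
Alow = np.tril(rng.standard_normal((n, n)), -1)
full = Alow - Alow.T
p, q = 1, 4                            # swap rows/cols 2 and 5 (1-based)
perm = np.arange(n)
perm[[p, q]] = perm[[q, p]]
expected = np.tril(full[np.ix_(perm, perm)], -1)
assert np.allclose(swap_lower(Alow, p, q), expected)
```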

While the Bunch partial pivoting scheme can limit element growth, it does not guarantee that the elements in L are smaller than one in magnitude. Indeed, even in Example 2.1, |ā32| = 15 > 12 = |ā21|. Thus element growth would have to be monitored to guarantee stability, similar to Gaussian elimination with partial pivoting.

The rook pivoting strategy chooses a pivot that is maximal in both its original row and column; see [6, §9.1] for its use with general matrices. It is considered as an alternative pivoting scheme for this decomposition in [5]. We only mention here that adapting rook pivoting to using only half of A is straightforward: where either A:,p or Ap,: was searched in the full matrix for its maximal entry in magnitude, now search instead Ap,1:p−1 and Ap:n,p.

2.3 Incomplete factorizations

Incomplete factorizations of the coefficient matrix of a linear system are a rich source of preconditioners [11, §10.3]. There are numerous ways to generate them, but here we focus on adapting two of the basic ones, namely ILU(0) and ILUT.

In ILU(0), the sparsity pattern of the factors matches that of the original matrix [11, §10.3.2]. In the skew-symmetric case, since operations are performed on 2×2 blocks, the analogous ILDLT(0) produces factors that match the block sparsity pattern of A. This is fairly straightforward to incorporate into the Bunch decomposition: compute the multiplier in line 4 of Algorithm 2.3 only if the original 2×2 block in A has nonzero norm, and then compute the update for a block in line 5 only if the corresponding multiplier from CS⁻¹ has nonzero norm.

For general matrices, ILUT is a more accurate factorization than ILU(0) [11, §10.4]. It uses two dropping rules to determine which elements to replace with zero. The first rule limits the amount of work performed by ignoring elements with a small norm: if the norm of an element is below a threshold relative to the norm of its row, set that element to zero to skip further computations with it. The second dropping rule limits the amount of memory used to store L: in each row, allow at most a fixed number of nonzero blocks, keeping those that are largest in magnitude and setting the rest to zero. The diagonal element is never dropped.
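The two dropping rules can be sketched as follows in NumPy. This is a simplified scalar version of our own (the thesis applies the rules to 2×2 blocks and never drops the diagonal block); `tau` and `lfil` are the usual ILUT threshold and fill parameters:

```python
import numpy as np

def drop_column(col, tau, lfil):
    # ILUT-style dual dropping applied to one column (scalar, illustrative):
    # rule 1: zero entries below tau * ||col||; rule 2: keep at most the
    # lfil entries of largest magnitude.
    col = col.copy()
    col[np.abs(col) < tau * np.linalg.norm(col)] = 0.0
    if np.count_nonzero(col) > lfil:
        small = np.argsort(np.abs(col))[:-lfil]   # all but the lfil largest
        col[small] = 0.0
    return col

v = np.array([4.0, -0.01, 2.5, 0.03, -1.0, 0.2])
w = drop_column(v, tau=0.01, lfil=3)
assert np.count_nonzero(w) == 3
assert w[0] == 4.0 and w[2] == 2.5      # the largest entries survive
```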

For the skew-symmetric version of ILUT, which we call ILDLTT, we again treat 2×2 blocks rather than individual elements. If ILDLTT were computed using the Bunch decomposition, then column-based dropping should be used instead of the row-based dropping described above; again, any dropping would be performed after line 4 in Algorithm 2.3. However, the Bunch decomposition is general and not specialized for sparse matrices. When working with such matrices in a compressed storage format, it is not particularly efficient due to the pattern in which matrix entries are accessed. We now turn to the Crout factorization, which can compute ILDLTT efficiently.

2.4 Crout factorization for skew-symmetric matrices

It is possible to adapt ILUT, derived from IKJ Gaussian elimination, to skew-symmetric matrices by creating a 2×2 block version that also halves both work and storage. However, preserving symmetry so that these savings are possible is impractical once pivoting is introduced. The Compressed Sparse Row storage scheme [11, §3.4], the data structure typically used for sparse matrices, does not easily lend itself to symmetric pivoting in IKJ Gaussian elimination. Since pivoting is often necessary, we follow [8], which discusses sparse symmetric matrices, in using a Crout variant of Gaussian elimination. As a side note, the Crout factorization allows for efficient implementation of more robust dropping strategies [9], but we will not discuss them here. The full Crout LU factorization for general matrices from [11, Algorithm 10.8] is shown here in Algorithm 2.4.

Algorithm 2.4 Crout LU factorization for general matrices
1: for k = 1 : n do
2:   for i = 1 : k − 1 and if aki ≠ 0 do
3:     ak,k:n = ak,k:n − aki ai,k:n
4:   end for
5:   for i = 1 : k − 1 and if aik ≠ 0 do
6:     ak+1:n,k = ak+1:n,k − aik ak+1:n,i
7:   end for
8:   for i = k + 1 : n do
9:     aik = aik/akk
10:  end for
11: end for

The Crout form of Gaussian elimination for general matrices allows for both symmetric pivoting and the inclusion of dropping rules to produce incomplete factorizations. Additionally, it is easily adapted to produce a block factorization. It is similar to IKJ Gaussian elimination in that it also involves a delayed update. At each step k, the kth row of U and the kth column of L are computed. Note that if A were symmetric, lines 3 and 6 would produce essentially the same numbers. The Ak+1:n,k+1:n submatrix is not accessed.
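A NumPy sketch of Algorithm 2.4, of our own making (no pivoting, so it is tested on a diagonally dominant matrix):

```python
import numpy as np

def crout_lu(A):
    # Crout LU (Algorithm 2.4): at step k, finish row k of U and column k of L
    # with delayed updates; the trailing submatrix is never touched.
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n):
        for i in range(k):                      # delayed updates for row k of U
            if A[k, i] != 0:
                A[k, k:] -= A[k, i] * A[i, k:]
        for i in range(k):                      # delayed updates for column k of L
            if A[i, k] != 0:
                A[k + 1:, k] -= A[i, k] * A[k + 1:, i]
        A[k + 1:, k] /= A[k, k]                 # scale to obtain the multipliers
    return A

rng = np.random.default_rng(5)
A = rng.standard_normal((7, 7)) + 7 * np.eye(7)
F = crout_lu(A)
L = np.tril(F, -1) + np.eye(7)
U = np.triu(F)
assert np.allclose(L @ U, A)
```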

We now derive an incomplete skew-symmetric version of the Crout factorization by making modifications to Algorithm 2.4. The new version is summarized in Algorithm 2.5. This algorithm operates on 2×2 blocks and produces an LDLT factorization. As with the Bunch factorization, work and storage can be halved, and so lines 2–4 of Algorithm 2.4 are redundant for skew-symmetric matrices.

Algorithm 2.5 Incomplete skew-symmetric Crout factorization, ILDLTC
1: for k = 1, 3, . . . , n − 1 do
2:   for i = 1, 3, . . . , k − 2 and ‖Ak:k+1,i:i+1‖ ≠ 0 do
3:     Ak:n,k:k+1 = Ak:n,k:k+1 − Ak:n,i:i+1 Ai:i+1,i:i+1 (Ak:k+1,i:i+1)ᵀ
4:   end for
5:   apply dropping rules to Ak+2:n,k:k+1
6:   for i = k + 2, k + 4, . . . , n − 1 do
7:     Ai:i+1,k:k+1 = Ai:i+1,k:k+1 (Ak:k+1,k:k+1)⁻¹
8:   end for
9: end for


In line 7, there is no explicit inversion of Ak:k+1,k:k+1, since

Ai:i+1,k:k+1 (Ak:k+1,k:k+1)⁻¹ = Ai:i+1,k:k+1 [   0    −ak+1,k ]⁻¹
                                             [ ak+1,k    0    ]

                              = (1/ak+1,k) · Ai:i+1,k:k+1 [  0  1 ]
                                                          [ −1  0 ],

which can be trivially computed element-wise. To keep to using only half of the matrix, line 3 can also be computed element-wise so that only ai+1,i is required from Ai:i+1,i:i+1.

Again there is the issue of overwriting A with the multipliers in L in line 3. If the full matrix were being operated on, the line would read Ak:n,k:k+1 = Ak:n,k:k+1 − Ak:n,i:i+1 Ai:i+1,k:k+1. Since Ai:i+1,k:k+1 is in the upper half of A and not available, we would like to use −(Ak:k+1,i:i+1)ᵀ, but the required value for a particular i would have been overwritten in line 7 during the ith step of the factorization. However, Ai:i+1,k:k+1 can be recovered as Ak:k+1,i:i+1 Ai:i+1,i:i+1; transposing and negating gives line 3 as shown.

Incorporating thresholding to obtain an incomplete LDLT factorization only requires the addition of line 5. Either method presented in §2.3 can be used here, and the algorithm is certainly not restricted to these dropping schemes.
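Leaving out pivoting and dropping, the skeleton of Algorithm 2.5 can be sketched in NumPy as follows. This is our illustration, not the thesis code: it uses full storage for clarity and assumes n is even with all leading block pivots nonsingular:

```python
import numpy as np

def skew_crout_ldlt(A):
    # Crout LDL^T of a skew-symmetric matrix with 2x2 pivot blocks,
    # no pivoting or dropping (full-storage sketch of Algorithm 2.5).
    A = A.astype(float).copy()
    n = A.shape[0]                               # n assumed even
    for k in range(0, n, 2):
        # delayed updates for block column k (line 3 of Algorithm 2.5)
        for i in range(0, k, 2):
            A[k:, k:k + 2] -= A[k:, i:i + 2] @ A[i:i + 2, i:i + 2] @ A[k:k + 2, i:i + 2].T
        # scale below the pivot block by D_k^{-1} (line 7); no explicit
        # inverse, since [[0, -a], [a, 0]]^{-1} = (1/a) [[0, 1], [-1, 0]]
        a = A[k + 1, k]
        A[k + 2:, k:k + 2] = A[k + 2:, k:k + 2] @ np.array([[0.0, 1.0], [-1.0, 0.0]]) / a
    L = np.tril(A, -1) + np.eye(n)
    D = np.zeros((n, n))
    for k in range(0, n, 2):
        d = A[k + 1, k]                          # the (k+1, k) entry holds D, not L
        L[k + 1, k] = 0.0
        D[k + 1, k], D[k, k + 1] = d, -d
    return L, D

rng = np.random.default_rng(6)
n = 8
A = rng.standard_normal((n, n))
A = A - A.T
L, D = skew_crout_ldlt(A)
assert np.allclose(L @ D @ L.T, A)
assert np.allclose(D, -D.T)
```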

Pivoting

Bunch’s partial pivoting strategy employed in the LDLT decomposition issomewhat more involved when used in a delayed update algorithm. InCrout, once pivoting is incorporated, it is no longer true that elements inAk+2:n,k+2:n are not accessed at step k. As we shall see though, it will be onlyone column that is needed from this submatrix. Algorithm 2.6 summarizesthe result of incorporating Bunch pivoting into ILDLTC.

When choosing a pivot in the usual KIJ Gaussian elimination schemes,the set of elements from which the pivot is chosen has already been modifiedfrom the original matrix in previous steps with Schur complement updates.So in Algorithm 2.5, a pivot may only be chosen after the updates in lines2–4. In Algorithm 2.6, instead of writing these updates directly into A, westore them in a temporary block vector w. This is due to the swapping ofelements to and from the submatrix Ak+2:n,k+2:n when performing row andcolumn interchanges.


Algorithm 2.6 ILDLTC-BP
1: for k = 1, 3, . . . , n − 1 do
2:   w1:k−1,1:2 = 0, wk:n,1:2 = Ak:n,k:k+1
3:   for i = 1, 3, . . . , k − 2 and ‖Ak:k+1,i:i+1‖ ≠ 0 do
4:     wk:n,1:2 = wk:n,1:2 − Ak:n,i:i+1 Ai:i+1,i:i+1 (Ak:k+1,i:i+1)ᵀ
5:   end for
6:   find max |wpq′| s.t. p > q′
7:   set q = q′ + k − 1
8:   if p > k + 1 then
9:     z1:p−1 = −(Ap,1:p−1)ᵀ, zp:n = Ap:n,p
10:    for i = 1, 3, . . . , k − 2 do
11:      zk:n = zk:n − Ak:n,i:i+1 Ai:i+1,i:i+1 (Ap,i:i+1)ᵀ
12:    end for
13:  end if
14:  update A with w and z
15:  interchange rows and columns k and q
16:  interchange rows and columns k + 1 and p
17:  apply dropping rules to Ak+2:n,k:k+1
18:  for i = k + 2, k + 4, . . . , n − 1 do
19:    Ai:i+1,k:k+1 = Ai:i+1,k:k+1 (Ak:k+1,k:k+1)⁻¹
20:  end for
21: end for

For simplicity, we want to maintain the invariant that Ak+2:n,k+2:n is untouched after each step of the elimination. Holding the updates in a temporary vector allows us to write into A only the updates for elements that remain, after permutations are performed, in Ak:n,k:k+1. Any elements that are brought into Ak+2:n,k:k+1 by row and column interchanges must also be updated first. Such elements lie in one column, and we use the temporary vector z for this purpose. We illustrate this update process with the following example.

Example 2.2. We ignore dropping rules for the moment and step through one step of both the Bunch (Algorithm 2.3) and Crout (Algorithm 2.6) factorizations, paying attention to how elements are updated in the context of pivoting. In making this comparison, we see an instance of how delayed-update algorithms differ from the usual KIJ-based algorithms.


Let us take A to be

A = [  0  −10    −1    −4    −2    −4    −9    −3
      10    0    −2    −5    −3   −10    −1    −2
       1    2     0  −0.7  −4.9 −11.2 −10.3  −2.6
       4    5   0.7     0  −2.2    −9  −3.9  −3.3
       2    3   4.9   2.2     0 −13.8 −12.5  −5.5
       4   10  11.2     9  13.8     0  −1.4 −11.8
       9    1  10.3   3.9  12.5   1.4     0 −10.5
       3    2   2.6   3.3   5.5  11.8  10.5     0 ].

We will now focus only on the lower triangular part of A. Let B(1) and C(1) denote what A looks like after the first step of Bunch and Crout, respectively. Note that the lower right submatrix of C(1) is unmodified from A.

B(1) = [    0
           10     0
         −0.2   0.1     0
         −0.5   0.4     1     0
         −0.3   0.2     5     2     0
           −1   0.4    11     7    13     0
         −0.1   0.9    12     8    15    10     0
         −0.2   0.3     3     4     6    14     9     0 ]

C(1) = [    0
           10     0
         −0.2   0.1     0
         −0.5   0.4   0.7     0
         −0.3   0.2   4.9   2.2     0
           −1   0.4  11.2     9  13.8     0
         −0.1   0.9  10.3   3.9  12.5   1.4     0
         −0.2   0.3   2.6   3.3   5.5  11.8  10.5     0 ]

In the second step, k = 3, we see from B(1) that the next pivot element is in the (7, 3) position. Contrast this with the (7, 3) entry of C(1), which is not the largest element in magnitude in the third and fourth columns. The Schur complement update needs to be computed first for these two columns, and so w is loaded with C(1)3:8,3:4. After the update, w matches the third and fourth columns of B(1).

With the pivot element in the (p, q) = (7, 3) position, only rows and columns 4 and 7 need to be interchanged to move it into the block pivot.


Performing this interchange on B(1),

PB(1)Pᵀ = [    0
              10     0
            −0.2   0.1     0
            −0.1   0.9    12     0
            −0.3   0.2     5   −15     0
              −1   0.4    11   −10    13     0
            −0.5   0.4     1    −8    −2    −7     0
            −0.2   0.3     3     9     6    14     4     0 ].

Some entries now sitting in the third and fourth columns did not receive a Schur complement update (they originated to the right of the fourth column), while some updated entries have moved out to the right. This tells us which elements of w need to be written into A in the Crout factorization, namely, those corresponding to updated entries that remain within the third and fourth columns. Elements of w corresponding to entries that are now to the right of the fourth column should not be written.

Since p > k+1 in our example, this also shows which elements to the right of the fourth column in C(1) need to be loaded into z and updated, namely column seven. Due to the restriction to the lower half of the matrix, z is also loaded from the seventh row, as stated in line 9 of Algorithm 2.6. The elements of C(1) that must be updated before the matrix is permuted are exactly those that end up in the third and fourth columns of PB(1)Pᵀ; this follows from how symmetric permutations are performed. The updated and permuted matrix in the Crout factorization is

[    0
    10     0
  −0.2   0.1     0
  −0.1   0.9    12     0
  −0.3   0.2     5   −15     0
    −1   0.4    11   −10  13.8     0
  −0.5   0.4     1    −8  −2.2    −9     0
  −0.2   0.3     3     9   5.5  11.8   3.3     0 ],

which agrees with PB(1)Pᵀ in the first four columns. From here, the multipliers are computed in both factorizations, the Bunch factorization also computes Schur complements, and the second step is complete.

We summarize the update procedure for line 14 of Algorithm 2.6 with the following Matlab code:


if q == k
    A(k+1:n,k) = w(k+1:n,1);
    if p == k+1
        A(k+2:n,k+1) = w(k+2:n,2);
    else
        A(p,k+1:p-1) = -z(k+1:p-1);
        A(p+1:n,p) = z(p+1:n);
    end
else % q == k+1
    A(k+1,k) = w(k+1,1);
    A(k+2:p-1,k+1) = w(k+2:p-1,2);
    A(p+1:n,k+1) = w(p+1:n,2);
    if p > k+1
        A(p,k:p-1) = -z(k:p-1);
        if p < n
            A(p+1:n,p) = z(p+1:n);
        end
    end
end

The idea is to update elements so that the pivot can be found, and then take permutations into account and write only the updates into A for elements that end up in columns k and k+1. There are a few equivalence classes of possible pivot locations within these two columns, and the exact update details, which are trivial and somewhat tedious, are in the code shown above.

We note that rook pivoting may again be used instead of Bunch partial pivoting here, but it is slightly more involved. Recall from §2.2 that rook pivoting on half the matrix involves looking at A_{p,k:p-1} and A_{p:n,p} in each step. In Algorithm 2.6, after finding the maximal element in magnitude of w, z is exactly the next row/column searched in rook pivoting. At each subsequent step of rook pivoting, another vector serving a role similar to z will be needed to temporarily store updates before finding its maximal element. This is certainly more work, but as noted in [5], the number of rook steps needed to find the pivot is typically small.

2.5 Numerical results

We compare in Matlab the performance of preconditioners generated from the incomplete factorizations ILDLT and ILU. We use our ILDLTC-BP routine to generate ILDLT factors. For the general ILU factors, we use Matlab's built-in luinc routine. For the test problem, we use the skew-symmetric part of the matrix for the 3D convection-diffusion equation, discretized using the centered finite difference scheme on the unit cube with Dirichlet boundary conditions. The mesh Reynolds numbers are 0.48, 0.5, and 0.52. The grid size is 24, so A has dimension n = 13,824. The number of nonzeros of A is 79,488, and the number of nonzeros in its full LU factorization is 7,220,435.
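To make the construction of such a test matrix concrete, here is a minimal Python sketch of a 1D analogue (the thesis's problem is 3D; the function and parameter names `conv_diff_1d`, `n`, `c`, `d` are our own, not from the thesis). It builds the centered-difference convection-diffusion matrix and extracts its skew-symmetric part S = (A − Aᵀ)/2, the kind of matrix fed to the solvers above.

```python
def conv_diff_1d(n, c=1.0, d=1.0):
    """Centered-difference convection-diffusion matrix on (0,1) with
    Dirichlet boundaries: a 1D analogue of the thesis's 3D test problem.
    Returns the matrix and the mesh Reynolds number c*h/(2d)."""
    h = 1.0 / (n + 1)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0 * d / h ** 2
        if i > 0:
            A[i][i - 1] = -d / h ** 2 - c / (2.0 * h)
        if i < n - 1:
            A[i][i + 1] = -d / h ** 2 + c / (2.0 * h)
    return A, c * h / (2.0 * d)

def skew_part(A):
    """S = (A - A^T)/2, the skew-symmetric part used as the test matrix."""
    n = len(A)
    return [[0.5 * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]

A, re_h = conv_diff_1d(24)   # 24 interior points per dimension, as in the thesis
S = skew_part(A)
```

The diffusion term is symmetric and drops out of S entirely, so S carries only the ±c/(2h) convection couplings.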

Our results are presented in Table 2.1. The right-hand vector b is set to be Ax_e where x_e is the normalized all-ones vector. Preconditioners are generated with a few different parameters for each of the methods. For luinc the parameter is the drop tolerance. For ildlc-bp, the first parameter shown is the drop tolerance and the second is the maximum number of nonzero blocks allowed per row. The number of nonzeros (nnz) reported is that of L+U for luinc and L+D for ildlc-bp since we need not store L^T. We use Matlab's gmres with tolerance 10^{-6} and restart of 30 iterations. The remaining columns of the table show the number of iterations required for convergence and the 2-norms of the error and residual.

preconditioner   parameters   nnz         itn   err    res
luinc            1e-1         4,409,643   21    1.50   7e-7
luinc            1e-2         6,832,291   31    2e-5   8e-7
ildlc-bp         1e-2, 50     411,779     9     1.22   3e-7
ildlc-bp         1e-3, 50     489,190     9     1.14   1e-7

Table 2.1: Convergence results of preconditioned GMRES for test problem.

A plot of the relative residual norms at each step of GMRES is given in Figure 2.1 with no preconditioner and with the second and fourth preconditioners from the table. Unpreconditioned GMRES does not converge within 500 outer iterations.

With the ILDLT preconditioners, GMRES converges to within the desired tolerance faster than with either of the ILU preconditioners. For all the preconditioned systems, the desired tolerance is achieved for the norm of the residual; however, most of them are not so well conditioned and we see that the error is relatively large except in the case of luinc with tolerance 1e-2. Unfortunately, the storage cost for achieving such accuracy is prohibitively large; the number of nonzeros in this luinc preconditioner is nearly that of the full factorization. When trying to obtain a sparser factor by using a higher drop tolerance with luinc, the error is similar to that for ILDLT preconditioners. For the latter, however, the factors are much sparser, requiring an order of magnitude less storage than either of the ILU preconditioners. This gain is beyond that coming from not storing L^T.

[Figure 2.1: Residual vector norms for test problem. Relative residual versus iteration for unpreconditioned GMRES, luinc (1e-2), and ildlc-bp (1e-3, 50).]

There is potential for further gains in using ILDLT over ILU. Gaining insight into the parameter choice may be useful. Let us note that the factors generated were often quite ill-conditioned. It seems from some numerical experiments that this is due to large isolated eigenvalues in the preconditioned matrix. Seeking ways to improve conditioning remains an item for future work.

Chapter 3

Minimum residual iterations for skew-symmetric systems

For skew-symmetric systems, it is natural to turn to short recurrence solvers such as the ones used in the symmetric case, for example, conjugate gradient and MINRES. These save on storage that would be required when using general non-symmetric solvers such as GMRES. Both CG and MINRES are slightly different when adapted to skew-symmetric systems. Greif and Varah give a detailed account of skew-CG in [5] and summarize the derivation for skew-MINRES, including preconditioning for both. See also [7] for an unpreconditioned version of MINRES for shifted skew-symmetric systems.

In this chapter, we elaborate on skew-MINRES. We first discuss the unpreconditioned algorithm, starting with the Lanczos procedure and developing skew-MINRES, before adding preconditioning. Skew-symmetry introduces a number of zeros in predictable places, all due to the zero diagonal in the matrix. This slightly simplifies the iteration compared to that for standard symmetric systems.

3.1 Skew-Lanczos

The Arnoldi iteration computes the Hessenberg reduction of a general matrix A = V_k H_k V_k^T with H_k k×k upper Hessenberg and V_k n×k orthonormal. For symmetric A, it is referred to as the Lanczos iteration and H_k is tridiagonal [2, §6.6.1]. Similarly, if A were skew-symmetric then H_k, which we shall now label T_k, is also skew-symmetric:

\[
T_k^T = (V_k^T A V_k)^T = V_k^T A^T V_k = -V_k^T A V_k = -T_k.
\]

This gives rise to an analogous skew-Lanczos iteration [5]. The iteration starts with the n×n skew-symmetric matrix A and some initial vector v_0. We take the first vector in V_k to be v_1 = v_0/‖v_0‖_2. At the kth step of the iteration, we have a sequence of vectors v_i such that

\[
AV_k = V_{k+1}T_{k+1,k}. \tag{3.1}
\]


The columns of V_k form an orthonormal basis of K_k(A; r_0). Due to skew-symmetry, T_{k+1,k} is even sparser than that in the symmetric case:

\[
T_{k+1,k} =
\begin{pmatrix}
0 & \alpha_1 & & & & \\
-\alpha_1 & 0 & \alpha_2 & & & \\
 & -\alpha_2 & 0 & \alpha_3 & & \\
 & & -\alpha_3 & \ddots & \ddots & \\
 & & & \ddots & 0 & \alpha_{k-1} \\
 & & & & -\alpha_{k-1} & 0 \\
 & & & & & -\alpha_k
\end{pmatrix}.
\]

The kth column of (3.1) satisfies

\[
Av_k = \alpha_{k-1}v_{k-1} - \alpha_k v_{k+1}.
\]

Rearranging this gives the two-term recurrence,

\[
v_{k+1} = (\alpha_{k-1}v_{k-1} - Av_k)/\alpha_k,
\]

instead of the three-term recurrence found in the symmetric case. The α_i are chosen to normalize the v_i. If ever α_i is effectively 0, the algorithm breaks down and another vector cannot be generated. This implies Av_i ∈ K_i(A; r_0).

Algorithm 3.1 summarizes the skew-Lanczos iteration. The results of the algorithm are the set of v_i that make up V_{k+1} and the α_i that define T_{k+1,k}. Note that A need not be available explicitly; it suffices to have a routine that returns matrix-vector products with A.

Algorithm 3.1 Skew-Lanczos
1: set v_1 = b/‖b‖_2, α_0 = 0, v_0 = 0
2: for i = 1 to k do
3:    z_i = Av_i − α_{i−1}v_{i−1}
4:    α_i = ‖z_i‖_2
5:    if α_i = 0 then
6:       quit
7:    end if
8:    v_{i+1} = −z_i/α_i
9: end for
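Algorithm 3.1 is only a few lines in code. The following is a plain-Python sketch (our own, not the thesis's implementation), with `matvec` standing in for the routine returning products with A:

```python
import math

def skew_lanczos(matvec, b, k):
    """Sketch of Algorithm 3.1 (skew-Lanczos).  matvec(v) must return
    A v for a skew-symmetric A.  Returns the vectors v_1..v_{m+1} and
    alpha_1..alpha_m defining T_{m+1,m} (m <= k; smaller on breakdown)."""
    n = len(b)
    nb = math.sqrt(sum(t * t for t in b))
    V = [[t / nb for t in b]]                     # v_1 = b / ||b||_2
    alphas = []
    v_prev, a_prev = [0.0] * n, 0.0               # v_0 = 0, alpha_0 = 0
    for _ in range(k):
        Av = matvec(V[-1])
        z = [Av[j] - a_prev * v_prev[j] for j in range(n)]  # z_i = A v_i - alpha_{i-1} v_{i-1}
        a = math.sqrt(sum(t * t for t in z))                # alpha_i = ||z_i||_2
        if a == 0.0:                              # breakdown: no further vector can be generated
            break
        v_prev, a_prev = V[-1], a
        V.append([-t / a for t in z])             # v_{i+1} = -z_i / alpha_i
        alphas.append(a)
    return V, alphas
```

Because A is skew-symmetric, the two-term recurrence alone keeps all the v_i orthonormal, which a small run confirms numerically.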


3.2 QR factorization of T_{k+1,k}

The skew-MINRES procedure that follows uses the QR factorization of T_{k+1,k},

\[
T_{k+1,k} = Q_{k+1}\begin{pmatrix} R_k \\ 0 \end{pmatrix} = \bar{Q}_{k+1}R_k,
\]

where Q_{k+1} is (k+1)×(k+1) and R_k is k×k. It is convenient to define \bar{Q}_{k+1}, which consists of the first k columns of Q_{k+1}. The relatively simple structure of T_{k+1,k} in turn gives a relatively simple R_k, and we can derive explicit formulas for each of its nonzero elements. The orthonormal matrix Q is not needed explicitly for skew-MINRES, but it is a product of Givens rotation matrices.

The first step of skew-Lanczos produces T_{2,1} = \begin{pmatrix} 0 \\ -\alpha_1 \end{pmatrix}. Applying a Givens rotation,

\[
G_1^T T_{2,1} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ -\alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ 0 \end{pmatrix} = \begin{pmatrix} R_1 \\ 0 \end{pmatrix},
\]

so R_1 = (α_1). Taking Q_2 = G_1, the first factorization is complete.

Each G_i denotes a particular rotation, but its dimension will vary depending on context, padding it with the identity matrix as necessary. In the kth iteration, G_1^T will first be applied to T_{k+1,k}, followed by G_2^T and so on up to G_k^T. They are all of dimension (k+1)×(k+1), and Q_{k+1} is defined as Q_{k+1} = G_1 G_2 \cdots G_k.

The second step of skew-Lanczos produces

\[
T_{3,2} = \begin{pmatrix} 0 & \alpha_1 \\ -\alpha_1 & 0 \\ 0 & -\alpha_2 \end{pmatrix}.
\]

Applying the first rotation matrix,

\[
G_1^T T_{3,2} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & \alpha_1 \\ -\alpha_1 & 0 \\ 0 & -\alpha_2 \end{pmatrix}
= \begin{pmatrix} \alpha_1 & 0 \\ 0 & \alpha_1 \\ 0 & -\alpha_2 \end{pmatrix}.
\]

To zero out −α_2, use the rotation \begin{pmatrix} c_2 & s_2 \\ -s_2 & c_2 \end{pmatrix}^T, where c_2 and s_2 are α_1/r_{2,2} and α_2/r_{2,2}, respectively. This replaces the α_1 in the second column with r_{2,2} = \sqrt{\alpha_1^2 + \alpha_2^2}. In full,

\[
G_2^T(G_1^T T_{3,2}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_2 & -s_2 \\ 0 & s_2 & c_2 \end{pmatrix}
\begin{pmatrix} \alpha_1 & 0 \\ 0 & \alpha_1 \\ 0 & -\alpha_2 \end{pmatrix}
= \begin{pmatrix} \alpha_1 & 0 \\ 0 & \sqrt{\alpha_1^2 + \alpha_2^2} \\ 0 & 0 \end{pmatrix}.
\]


To explicitly see the pattern forming, we continue for a few more iterations, generating R_k from T_{k+1,k}. For k = 3,

\[
G_3^T(G_2^T G_1^T T_{4,3}) =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} \alpha_1 & 0 & -\alpha_2 \\ 0 & \sqrt{\alpha_1^2+\alpha_2^2} & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\alpha_3 \end{pmatrix}
= \begin{pmatrix} \alpha_1 & 0 & -\alpha_2 \\ 0 & \sqrt{\alpha_1^2+\alpha_2^2} & 0 \\ 0 & 0 & \alpha_3 \\ 0 & 0 & 0 \end{pmatrix},
\]

with r_{13} = −α_2, r_{33} = α_3, c_3 = 0, s_3 = 1.

For k = 4,

\[
G_4^T(Q_4^T T_{5,4}) =
\begin{pmatrix}
\alpha_1 & 0 & -\alpha_2 & 0 \\
0 & \sqrt{\alpha_1^2+\alpha_2^2} & 0 & -\alpha_3 s_2 \\
0 & 0 & \alpha_3 & 0 \\
0 & 0 & 0 & \sqrt{(\alpha_3 c_2)^2 + \alpha_4^2} \\
0 & 0 & 0 & 0
\end{pmatrix},
\]

with c_4 = α_3 c_2/r_{44} and s_4 = α_4/r_{44}. For k = 5, we again have c_5 = 0 and s_5 = 1. For k = 6,

\[
G_6^T(Q_6^T T_{7,6}) =
\begin{pmatrix}
\alpha_1 & 0 & -\alpha_2 & 0 & 0 & 0 \\
0 & \sqrt{\alpha_1^2+\alpha_2^2} & 0 & -\alpha_3 s_2 & 0 & 0 \\
0 & 0 & \alpha_3 & 0 & -\alpha_4 & 0 \\
0 & 0 & 0 & \sqrt{(\alpha_3 c_2)^2+\alpha_4^2} & 0 & -\alpha_5 s_4 \\
0 & 0 & 0 & 0 & \alpha_5 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{(\alpha_5 c_4)^2+\alpha_6^2} \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

with c_6 = α_5 c_4/r_{66} and s_6 = α_6/r_{66}.

In summary, the QR factorization of T_{k+1,k} can be stated explicitly. First define c_0 = 1, and the c_i and s_i are

\[
c_i = \begin{cases} 0 & i \text{ odd} \\ \dfrac{\alpha_{i-1}c_{i-2}}{r_{ii}} & i \text{ even} \end{cases},
\qquad
s_i = \begin{cases} 1 & i \text{ odd} \\ \dfrac{\alpha_i}{r_{ii}} & i \text{ even} \end{cases}.
\]

Q_{k+1} = G_1 G_2 \cdots G_k and each G_i is a Givens rotation matrix, which is essentially the identity matrix with the substitution

\[
G_i(i:i+1,\, i:i+1) = \begin{pmatrix} c_i & s_i \\ -s_i & c_i \end{pmatrix}.
\]


There are only two nonzero diagonals in R_k given by

\[
r_{i,i} = \begin{cases} \alpha_i & i \text{ odd} \\ \sqrt{(\alpha_{i-1}c_{i-2})^2 + \alpha_i^2} & i \text{ even} \end{cases},
\qquad
r_{i,i+2} = \begin{cases} -\alpha_{i+1} & i \text{ odd} \\ -\alpha_{i+1}s_i & i \text{ even} \end{cases}.
\]
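These closed-form expressions can be checked numerically. The sketch below (our own construction; the names `skew_T` and `qr_by_formulas` are hypothetical) builds T_{k+1,k} from given α_i, applies the rotations G_i^T with the c_i and s_i above, and confirms that the result is upper triangular with exactly the two stated nonzero diagonals.

```python
import math

def skew_T(alphas):
    """T_{k+1,k} from alpha_1..alpha_k: zero diagonal, alpha_j on the
    super-diagonal, -alpha_j on the sub-diagonal."""
    k = len(alphas)
    T = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):
        if j >= 1:
            T[j - 1][j] = alphas[j - 1]
        T[j + 1][j] = -alphas[j]
    return T

def qr_by_formulas(alphas):
    """Apply G_1^T..G_k^T with the closed-form c_i, s_i to T_{k+1,k}.
    Returns the rotated matrix (R_k stacked over a zero row) and the
    c, s lists (1-based storage; c[0] holds c_0 = 1)."""
    k = len(alphas)
    T = skew_T(alphas)
    c = [1.0] + [0.0] * k
    s = [0.0] * (k + 1)
    for i in range(1, k + 1):
        if i % 2 == 1:
            c[i], s[i] = 0.0, 1.0
        else:
            r = math.hypot(alphas[i - 2] * c[i - 2], alphas[i - 1])
            c[i], s[i] = alphas[i - 2] * c[i - 2] / r, alphas[i - 1] / r
        for col in range(k):              # G_i^T acts on rows i-1, i (0-based)
            top, bot = T[i - 1][col], T[i][col]
            T[i - 1][col] = c[i] * top - s[i] * bot
            T[i][col] = s[i] * top + c[i] * bot
    return T, c, s
```

Running this for a sample α sequence reproduces r_{i,i} and r_{i,i+2} as given, with every other entry (including the last row) zero.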

3.3 Skew-MINRES

Having developed skew-Lanczos and examined the QR factorization of the resulting skew-symmetric tridiagonal matrix, we have the tools to adapt MINRES into skew-MINRES to solve Ax = b where A is skew-symmetric.

At each iteration, we seek x_k such that ‖b − Ax_k‖_2 is minimized over the set x_0 + K_k(A; r_0). Thus x_k = x_0 + V_k y_k for some y_k, with V_k generated by skew-Lanczos using v_0 = r_0.

Throughout, we can use the initial guess x_0 = 0 without loss of generality, for suppose we had x_0 ≠ 0. Then define b′ = b − Ax_0 = r_0, implying K_k(A; r_0) = K_k(A; b′), and rewrite the original problem as

\[
\min_{x_k \in x_0 + K_k(A;r_0)} \|b - Ax_k\|_2
= \min_{(x_k - x_0) \in K_k(A;r_0)} \|b - Ax_0 - A(x_k - x_0)\|_2
= \min_{x'_k \in K_k(A;b')} \|b' - Ax'_k\|_2,
\]

with x′_k := x_k − x_0. This new problem is equivalent to the original one with the new iterates being related to the original x_k, and in particular, x′_0 = x_0 − x_0 = 0. Hence from here on we will consider K_k(A; r_0) to be K_k(A; b), and

\[
x_k = V_k y_k. \tag{3.2}
\]

Making the usual substitutions,

\[
\min_{x_k} \|b - Ax_k\|_2 = \min_{y_k} \|b - AV_k y_k\|_2 = \min_{y_k} \|b - V_{k+1}T_{k+1,k}y_k\|_2.
\]

Since V_{k+1} is orthonormal and v_1 = b/‖b‖_2, the problem becomes

\[
\min_{y_k} \|\rho e_1 - T_{k+1,k}y_k\|_2, \tag{3.3}
\]

where ρ = ‖b‖_2 and e_1 is the first standard basis vector of size k+1.


Define W_k = V_k R_k^{-1} and z_k = \bar{Q}_{k+1}^T \rho e_1. Combining (3.2), (3.3), and the QR factorization of T_{k+1,k}, the kth approximation to the solution of the system can be written as

\[
x_k = (V_k R_k^{-1})(\bar{Q}_{k+1}^T \rho e_1) = W_k z_k.
\]

Now we step through skew-MINRES iterations to derive expressions for W_k and z_k, beginning with the latter. Define I_k to be the first k rows of the (k+1)×(k+1) identity matrix. Rewrite z_k = \bar{Q}_{k+1}^T \rho e_1 = I_k Q_{k+1}^T \rho e_1 = I_k G_k^T (Q_k^T \rho e_1).

Note z_{k-1} = I_{k-1}(Q_k^T \rho e_1). For k = 1,

\[
z_1 = I_1 Q_2^T \rho e_1 = I_1 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \rho \\ 0 \end{pmatrix} = I_1 \begin{pmatrix} 0 \\ \rho \end{pmatrix} = 0,
\]

so x_1 = 0 also. For k = 2 to 6,

\[
z_2 = I_2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_2 & -s_2 \\ 0 & s_2 & c_2 \end{pmatrix}\begin{pmatrix} 0 \\ \rho \\ 0 \end{pmatrix}
= I_2 \begin{pmatrix} 0 \\ \rho c_2 \\ \rho s_2 \end{pmatrix} = \begin{pmatrix} 0 \\ \rho c_2 \end{pmatrix},
\]

\[
z_3 = I_3 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ \rho c_2 \\ \rho s_2 \\ 0 \end{pmatrix}
= I_3 \begin{pmatrix} 0 \\ \rho c_2 \\ 0 \\ \rho s_2 \end{pmatrix} = \begin{pmatrix} 0 \\ \rho c_2 \\ 0 \end{pmatrix},
\]

\[
z_4 = I_4 \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & c_4 & -s_4 \\ 0 & 0 & 0 & s_4 & c_4 \end{pmatrix}\begin{pmatrix} 0 \\ \rho c_2 \\ 0 \\ \rho s_2 \\ 0 \end{pmatrix}
= I_4 \begin{pmatrix} 0 \\ \rho c_2 \\ 0 \\ \rho c_4 s_2 \\ \rho s_2 s_4 \end{pmatrix} = \begin{pmatrix} 0 \\ \rho c_2 \\ 0 \\ \rho c_4 s_2 \end{pmatrix},
\]

z_5 = I_5(0, ρc_2, 0, ρc_4s_2, 0, ρs_2s_4)^T = (0, ρc_2, 0, ρc_4s_2, 0)^T, and

z_6 = I_6(0, ρc_2, 0, ρc_4s_2, 0, ρc_6s_2s_4, ρs_2s_4s_6)^T = (0, ρc_2, 0, ρc_4s_2, 0, ρc_6s_2s_4)^T.

In general, z_1 = 0 and

\[
z_k = \begin{pmatrix} z_{k-1} \\ \zeta_k \end{pmatrix},
\]

where

\[
\zeta_k = \begin{cases} 0 & k \text{ odd} \\ \rho\, c_k \displaystyle\prod_{i=2,4,\ldots,k-2} s_i & k \text{ even} \end{cases}.
\]

As for W_k, only the even-indexed columns are of interest since x_k = W_k z_k and all odd-indexed elements of z_k are zero. Indeed, skew-MINRES will only produce an approximation x_k for k even.


For k = 2, W_2 := V_2 R_2^{-1} = \begin{pmatrix} \frac{v_1}{r_{1,1}} & \frac{v_2}{r_{2,2}} \end{pmatrix} and w_2 = \frac{v_2}{r_{2,2}}. For k = 4, 6, \ldots, recalling the structure of R_k gives v_k = w_{k-2}r_{k-2,k} + w_k r_{k,k}. Rearranging this,

\[
w_k = \frac{v_k - w_{k-2}r_{k-2,k}}{r_{k,k}}.
\]

In the end, each even iteration of skew-MINRES produces

\[
x_k = w_2\zeta_2 + w_4\zeta_4 + \cdots + w_k\zeta_k.
\]

Algorithm 3.2 shows the method in detail, incorporating skew-Lanczos. As in GMRES with a nonsingular A, here if α_i = 0 for some i, then the residual is also 0 and the solution has been found [11, Proposition 6.10].

Algorithm 3.2 Skew-MINRES
1: p = ‖b‖_2, α_0 = 0, v_0 = w = 0, v_1 = b/p, c = 1, s = 0, x_0 = 0
2: for k = 1, 2, ..., until convergence do
3:    z_k = Av_k − α_{k−1}v_{k−1}
4:    α_k = ‖z_k‖_2
5:    v_{k+1} = −z_k/α_k
6:    if k even then
7:       r = √((α_{k−1}c)^2 + α_k^2)
8:       w = (v_k + α_{k−1}sw)/r
9:       c = α_{k−1}c/r
10:      ζ = cp
11:      x_k = x_{k−2} + ζw
12:      s = α_k/r
13:      p = ps
14:   end if
15: end for
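Algorithm 3.2 translates almost line for line into code. The following is our own plain-Python sketch (not the thesis's implementation); as a convergence test it uses the scalar p, which after the line-13 update equals ρs_2s_4⋯s_k, as the residual-norm estimate:

```python
import math

def skew_minres(matvec, b, maxit=100, tol=1e-10):
    """Sketch of Algorithm 3.2 (skew-MINRES).  matvec(v) must return
    A v for a skew-symmetric A.  Iterates are produced at even k only;
    p tracks rho*s_2*s_4*...*s_k for the stopping test."""
    n = len(b)
    p = math.sqrt(sum(t * t for t in b))            # rho = ||b||_2
    v_prev, v = [0.0] * n, [t / p for t in b]       # v_0 = 0, v_1 = b/p
    a_prev, c, s = 0.0, 1.0, 0.0                    # alpha_0, c, s
    w, x = [0.0] * n, [0.0] * n
    for k in range(1, maxit + 1):
        Av = matvec(v)
        z = [Av[j] - a_prev * v_prev[j] for j in range(n)]   # line 3
        a = math.sqrt(sum(t * t for t in z))                  # line 4
        if k % 2 == 0:                                        # lines 6-14
            r = math.hypot(a_prev * c, a)
            w = [(v[j] + a_prev * s * w[j]) / r for j in range(n)]
            c = a_prev * c / r
            x = [x[j] + c * p * w[j] for j in range(n)]       # zeta = c*p
            s = a / r
            p *= s
        if a == 0.0 or abs(p) < tol:    # alpha_k = 0 => solution found (see text)
            break
        v_prev, a_prev = v, a
        v = [-t / a for t in z]                               # line 5
    return x
```

Note the even-step update runs before the breakdown check, since the final α_k = 0 step still contributes the last correction to x_k.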

Typically, the norm of the residual is used to determine convergence and, as usual, it can be computed without calculating b − Ax explicitly. Note that y_k solves \min_{y_k}\|\rho e_1 - T_{k+1,k}y_k\| by satisfying T_k y_k = \rho e_1 [11, Proposition 6.9]. Thus the residual vector itself is

\[
\begin{aligned}
r_k &= b - Ax_k \\
&= \rho v_1 - AV_k y_k \\
&= \rho v_1 - V_{k+1}T_{k+1,k}y_k \\
&= \rho V_k e_1 - (V_k T_k + \alpha_k v_{k+1}e_k^T)y_k \\
&= V_k(\rho e_1 - T_k y_k) - \alpha_k v_{k+1}e_k^T y_k \\
&= -\alpha_k e_k^T y_k\, v_{k+1},
\end{aligned}
\]

where e_k^T y_k is the kth element of y_k. Its norm is given by

\[
\begin{aligned}
\|r_k\|_2 &= |\alpha_k e_k^T y_k| \\
&= |\alpha_k e_k^T (V_k^T W_k z_k)| \\
&= |\alpha_k e_k^T (R_k^{-1} z_k)| \\
&= |\alpha_k \zeta_k / r_{kk}| \\
&= |\zeta_k|,
\end{aligned}
\]

exactly as in the general case.

3.4 Preconditioned skew-MINRES

We can precondition skew-symmetric systems in a manner very similar to that in the symmetric case [4, Chapter 8] to obtain a preconditioned skew-MINRES iteration, or skew-PMINRES. To preserve skew-symmetry, a symmetric positive definite preconditioner

\[
M = LL^T
\]

is used in a symmetric preconditioning scheme, giving the system

\[
L^{-1}AL^{-T}\hat{x} = L^{-1}b, \qquad x = L^{-T}\hat{x}. \tag{3.4}
\]

Instead of searching over K_k(A; b) for iterates, skew-PMINRES searches over K_k(L^{-1}AL^{-T}; L^{-1}b), so we start by developing preconditioned skew-Lanczos. If regular skew-Lanczos (Algorithm 3.1) were run with the matrix L^{-1}AL^{-T} and initial vector \hat{v}_1 = L^{-1}b/\|L^{-1}b\|_2, then each iteration would produce

\[
\begin{aligned}
\hat{z}_k &= L^{-1}AL^{-T}\hat{v}_k - \alpha_{k-1}\hat{v}_{k-1} \\
\alpha_k &= \sqrt{\hat{z}_k^T \hat{z}_k} \\
\hat{v}_{k+1} &= -\hat{z}_k/\alpha_k.
\end{aligned}
\]

Using this directly in a preconditioned MINRES algorithm would require having L explicitly, yet M may not always be available in factored form. Additionally, it would only compute iterates \hat{x}_k, and an additional solve with L^T would be required to obtain x_k. We make modifications to arrive at skew-PMINRES which addresses both of these issues. It requires only one solve with M at each iteration.

First, to combine the L and L^T solves in preconditioned skew-Lanczos, define

\[
v_k = L\hat{v}_k, \qquad z_k = L\hat{z}_k, \qquad u_k = L^{-T}L^{-1}v_k = M^{-1}v_k.
\]

Note that the v_k are no longer orthonormal; it is the \hat{v}_k that are orthonormal and form a basis for K_k(L^{-1}AL^{-T}; L^{-1}b). Now each iteration generates

\[
\begin{aligned}
z_k &= LL^{-1}AL^{-T}L^{-1}v_k - \alpha_{k-1}LL^{-1}v_{k-1} \\
&= Au_k - \alpha_{k-1}v_{k-1} \\
\alpha_k &= \sqrt{(L^{-1}z_k)^T(L^{-1}z_k)} = \sqrt{z_k^T M^{-1}z_k} \\
v_{k+1} &= -z_k/\alpha_k \\
u_{k+1} &= M^{-1}v_{k+1}.
\end{aligned}
\]

Redefine u_k = M^{-1}z_k to remove the unnecessary preconditioner solve for α_k. Then

\[
\alpha_k = \sqrt{z_k^T u_k}, \qquad u_{k+1} = -u_k/\alpha_k.
\]

This gives Algorithm 3.3, the preconditioned skew-Lanczos routine that generates the V_k and T_{k+1,k} used in skew-PMINRES.

There is only one further change to make in skew-MINRES to obtain the skew-PMINRES algorithm; one that makes the algorithm explicitly compute iterates x_k approximating the solution to Ax = b rather than the \hat{x}_k of (3.4).

Consider line 8 in Algorithm 3.2, w = (v_k + α_{k−1}sw)/r. The x_k generated in line 11 are simply linear combinations of the w_k, which in turn are linear combinations of the v_k. However, these are the \hat{v}_k from preconditioned skew-Lanczos. To obtain x_k, we need L^{-T}\hat{v}_k = L^{-T}L^{-1}v_k. If the solve with L^T is introduced here, then line 11 computes the x_k desired. Conveniently,


Algorithm 3.3 Preconditioned skew-Lanczos
1: α_0 = 0, v_0 = 0, v_1 = b/√(b^T M^{-1}b), u_1 = M^{-1}v_1
2: for i = 1, 2, ..., k do
3:    z_i = Au_i − α_{i−1}v_{i−1}
4:    u_i = M^{-1}z_i
5:    α_i = √(z_i^T u_i)
6:    v_{i+1} = −z_i/α_i
7:    u_{i+1} = −u_i/α_i
8: end for

the u_k from preconditioned skew-Lanczos are exactly L^{-T}L^{-1}v_k, and line 8 of Algorithm 3.2 becomes

\[
w_k = (u_k + \alpha_{k-1}s_{k-2}w_{k-2})/r_k.
\]

With this we have skew-PMINRES in Algorithm 3.4. The inputs are A, b, and either M or L. It works on the preconditioned system of (3.4), and it returns approximations x_k to the solution of Ax = b.

Algorithm 3.4 Skew-PMINRES
1: p = √(b^T M^{-1}b), α_0 = s = 0, c = 1, v_0 = w = x_0 = 0, v_1 = b/p, u_1 = M^{-1}v_1
2: for k = 1, 2, ..., until convergence do
3:    z_k = Au_k − α_{k−1}v_{k−1}
4:    u_k = M^{-1}z_k
5:    α_k = √(z_k^T u_k)
6:    v_{k+1} = −z_k/α_k
7:    u_{k+1} = −u_k/α_k
8:    if k even then
9:       r = √((α_{k−1}c)^2 + α_k^2)
10:      w = (u_k + α_{k−1}sw)/r
11:      c = α_{k−1}c/r
12:      ζ = cp
13:      x_k = x_{k−2} + ζw
14:      s = α_k/r
15:      p = ps
16:   end if
17: end for
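As a plain-Python sketch of Algorithm 3.4 (ours, not the thesis's code), with `msolve` applying M^{-1}: note that the u_k wanted on line 10 is the one in the sense M^{-1}v_k (produced by line 7 of the previous pass), so the sketch keeps M^{-1}v_k and M^{-1}z_k in separate variables to resolve the reuse of the name u_k.

```python
import math

def skew_pminres(matvec, msolve, b, maxit=100, tol=1e-10):
    """Sketch of Algorithm 3.4 (skew-PMINRES).  matvec(v) returns A v
    for skew-symmetric A; msolve(v) returns M^{-1} v for an SPD M.
    u holds M^{-1} v_k and uz holds M^{-1} z_k, since both appear."""
    n = len(b)
    u = msolve(b)
    p = math.sqrt(sum(b[j] * u[j] for j in range(n)))    # sqrt(b^T M^{-1} b)
    v_prev, v = [0.0] * n, [t / p for t in b]            # v_1 = b/p
    u = [t / p for t in u]                               # u_1 = M^{-1} v_1
    a_prev, c, s = 0.0, 1.0, 0.0
    w, x = [0.0] * n, [0.0] * n
    for k in range(1, maxit + 1):
        Au = matvec(u)
        z = [Au[j] - a_prev * v_prev[j] for j in range(n)]   # line 3
        uz = msolve(z)                                        # line 4
        a = math.sqrt(sum(z[j] * uz[j] for j in range(n)))    # line 5
        if k % 2 == 0:                                        # lines 8-16
            r = math.hypot(a_prev * c, a)
            w = [(u[j] + a_prev * s * w[j]) / r for j in range(n)]  # line 10, u = M^{-1} v_k
            c = a_prev * c / r
            x = [x[j] + c * p * w[j] for j in range(n)]       # zeta = c*p
            s = a / r
            p *= s
        if a == 0.0 or abs(p) < tol:
            break
        v_prev, a_prev = v, a
        v = [-t / a for t in z]                               # line 6
        u = [-t / a for t in uz]                              # line 7
    return x
```

With M = I (msolve the identity) this reduces to the unpreconditioned skew-MINRES sketch; a diagonal SPD M makes for a cheap test.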


For computing the residual easily in the preconditioned algorithm, the derivation is similar to that for plain skew-MINRES. The problem now solved is that given in (3.4), yet the residual of interest is still b − Ax_k. Instead of (3.1), we now have (L^{-1}AL^{-T})(L^{-1}V_k) = (L^{-1}V_{k+1})T_{k+1,k}, or

\[
AM^{-1}V_k = V_{k+1}T_{k+1,k},
\]

since it is the L^{-1}v_k that are orthonormal here. Additionally, it is \hat{x}_k that can be written as L^{-1}V_k y_k, for some y_k. By the definition of \hat{x}_k,

\[
x_k = L^{-T}L^{-1}V_k y_k = M^{-1}V_k y_k.
\]

Putting these together,

\[
\begin{aligned}
r_k &= b - Ax_k \\
&= \rho v_1 - AM^{-1}V_k y_k \\
&= \rho v_1 - V_{k+1}T_{k+1,k}y_k.
\end{aligned}
\]

The remainder of the derivation is exactly as shown previously in the unpreconditioned case, so again \|r_k\|_2 = |\zeta_k|.


Chapter 4

Conclusions and future work

We discussed the Bunch LDL^T decomposition for skew-symmetric matrices and derived a Crout-based factorization for computing incomplete LDL^T factorizations with pivoting. Implementation details have been given on how to practically save on computational work and storage by performing operations on only the lower triangular half of the matrix. This may be useful in writing more efficient implementations using compressed matrix storage schemes.

Numerical results show that a skew-symmetric preconditioner can be an improvement over the general incomplete LU factorization. For preconditioners that gave roughly the same error and residual in GMRES, the skew preconditioner was much sparser than the general one and required fewer iterations for convergence. Further work here may investigate the effects of the parameters of ILDLTC-BP on the performance of preconditioned GMRES. Another item is to explore improving the conditioning of the factors, since those produced by our factorization were not as robust as expected.

We have also presented the details of a skew preconditioned MINRES iteration, derived in [5], starting by adapting Lanczos to skew-symmetric systems and seeing that the structure results in skew-MINRES producing approximations every other iteration. While preconditioning was incorporated, the procedure relies on having a symmetric positive definite preconditioner. Given the gains observed when using a skew-symmetric preconditioner in GMRES, future work includes investigating whether it is possible to derive a form of MINRES that can utilize skew-symmetric preconditioners. It is not clear to us how to formulate such a MINRES procedure.

Finally, we return to the point mentioned originally that skew-symmetric systems arise from non-symmetric systems. Another potential area of work lies in incorporating the above into general non-symmetric solvers when dealing with systems that have a dominant skew part.


Bibliography

[1] J. R. Bunch. A note on the stable decomposition of skew-symmetric matrices. Math. Comp., 38(158):475–479, 1982.

[2] J. W. Demmel. Applied Numerical Linear Algebra. SIAM, 1997.

[3] I. S. Duff. The design and use of a sparse direct solver for skew symmetric matrices. J. Comput. Appl. Math., 226(1):50–54, 2009.

[4] A. Greenbaum. Iterative Methods for Solving Linear Systems. SIAM, 1997.

[5] C. Greif and J. M. Varah. Iterative solution of skew-symmetric linear systems. SIAM J. Matrix Anal. Appl., 31(2):584–601, 2009.

[6] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, 2nd edition, 2002.

[7] R. Idema and C. Vuik. A minimal residual method for shifted skew-symmetric systems. Technical Report 07-09, Delft University of Technology, 2007.

[8] N. Li and Y. Saad. Crout versions of the ILU factorization with pivoting for sparse symmetric matrices. Electron. Trans. Numer. Anal., 20:75–85, 2005.

[9] N. Li, Y. Saad, and E. Chow. Crout versions of ILU for general sparse matrices. SIAM J. Sci. Comput., 25(2):716–728, 2003.

[10] C. C. Paige and M. A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12(4):617–629, 1975.

[11] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia, PA, 2nd edition, 2003.

[12] L. N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, 1997.