
High accuracy multigrid solution of the 3D convection–diffusion equation

Murli M. Gupta a,1, Jun Zhang b,*

a Department of Mathematics, The George Washington University, Washington, DC 20052, USA

b Department of Computer Science, University of Kentucky, 773 Anderson Hall, Lexington, KY 40506-0046, USA

Abstract

We present an explicit fourth-order compact finite difference scheme for approximating the three-dimensional (3D) convection–diffusion equation with variable coefficients. This 19-point formula is defined on a uniform cubic grid. Fourier smoothing analysis is performed to show that the smoothing factor of certain relaxation techniques used with the scheme is smaller than 1. We design a parallelization-oriented multigrid method for fast solution of the resulting linear system using a four-color Gauss–Seidel relaxation technique for robustness and efficiency, and a scaled residual injection operator to reduce the cost of the multigrid inter-grid transfer operator. Numerical experiments on a 16 processor vector computer are used to test the high accuracy of the discretization scheme as well as the fast convergence and the parallelization or vectorization efficiency of the solution method. Several test problems are solved and highly accurate solutions of the 3D convection–diffusion equations are obtained for small to medium values of the grid Reynolds number. Effects of using different residual projection operators are compared on both vector and serial computers. © 2000 Elsevier Science Inc. All rights reserved.

Keywords: 3D convection–diffusion equation; Fourth-order compact scheme; Multigrid method; Four-color Gauss–Seidel relaxation

1. Introduction

Numerical simulation of three-dimensional (3D) problems tends to be computationally intensive and may be prohibitive on conventional computers due to the requirements on the memory and the CPU time to obtain solutions with desired accuracy. Traditional numerical schemes have low accuracy and require extremely fine discretization. The size of the resulting linear systems for 3D problems is usually so large that even modern computers may not be able to handle them. One approach to alleviate these difficulties is to use higher-order methods, which yield approximate solutions with comparable accuracy using much coarser discretization, resulting in linear systems of smaller size.

Applied Mathematics and Computation 113 (2000) 249–274, www.elsevier.nl/locate/amc

* Corresponding author.
1 E-mail address: [email protected], [email protected] (M.N. Gupta).

0096-3003/00/$ - see front matter © 2000 Elsevier Science Inc. All rights reserved.
PII: S0096-3003(99)00085-5

In the two-dimensional (2D) case, a number of fourth-order compact finite difference schemes have been designed for convection–diffusion equations and the Navier–Stokes equation by several authors (see, e.g., Refs. [9,13,17]). These schemes have good numerical stability and provide high accuracy approximations. Our recent studies [23,25] indicate that the fourth-order compact schemes work very well with the multigrid method. Such fourth-order methods have been found [15] to be computationally more efficient than the traditional methods for the diffusion dominated problems, and for the problems where convection is not very strong. We have shown that in order to obtain computed solutions of a given accuracy, the fourth-order compact schemes using relatively coarser discretization may be hundreds of times faster and use less memory than the lower-order schemes.

In this paper, we consider the numerical solution of the 3D convection–diffusion equation

$$\Delta u(x,y,z) + \big(\lambda(x,y,z),\ \mu(x,y,z),\ \phi(x,y,z)\big) \cdot \nabla u(x,y,z) = f(x,y,z), \tag{1}$$

for a specified forcing function $f$ in a continuous 3D domain $\Omega$ with Dirichlet boundary conditions prescribed on $\partial\Omega$. Here $\lambda$, $\mu$, $\phi$ and $f$ are assumed to be continuously differentiable and $\Omega$ is a union of rectangular solids.

Eq. (1) is very important in computational fluid dynamics to model the transport phenomena. The functions $\lambda$, $\mu$ and $\phi$ are convection coefficients. For large values of the convection coefficients, Eq. (1) is said to be convection-dominated.

For the convection-dominated problems, basic iterative methods fail to converge when used for solving linear systems resulting from the standard 7-point central difference discretization. The first-order upwind difference scheme, though stable, yields solutions of poor resolution and extremely fine discretization is needed for the approximate solutions to be sufficiently accurate. Recently, Greif and Varah [11] used a cyclic reduction technique to precondition linear systems resulting from the first- and second-order discretizations of Eq. (1) with constant coefficients and showed that the reduced systems have better algebraic properties than the original systems. On the other hand, our experience with the 2D fourth-order compact difference schemes suggests that their 3D counterparts may provide an effective approach to stable and high accuracy solution for Eq. (1) [13,15,23,25].


For the discretized 3D convection–diffusion equation, there is a lack of detailed discussion on the multigrid solution method, although multigrid techniques for 3D nonsymmetric problems have been discussed in Refs. [3,20] and elsewhere. For some diffusion problems and the Poisson equation, multigrid solution strategies on vector machines were investigated in Refs. [10,16]. Dendy's black box multigrid methods [8] were recently used by Bandy [2] (with new inter-grid transfer operators) to solve some interface problems.

In this paper, we present an explicit fourth-order finite difference scheme, and the related multigrid solution strategies, for Eq. (1). In Section 2, we present the fourth-order compact finite difference scheme and compare the advantages and implementation costs of this scheme with the standard 7-point scheme in the context of the basic iterative methods. Section 3 contains solution strategies for the resulting linear system, including the multigrid method and the associated Fourier smoothing analysis, the grid coloring strategy, and multigrid inter-grid transfer operators. In Section 4, we report results of numerical experiments to verify the accuracy and computational efficiency of the multigrid solution method. We obtain high accuracy solutions of 3D convection–diffusion equations in a highly efficient procedure for small to medium values of the grid Reynolds number.

2. Finite difference scheme

Fig. 1. Labeling of the 3D grid points in a unit cube.

The discretization is carried out on a uniform 3D grid with meshsize $h$, and we use a local coordinate system where the unit cubic grids are labeled as in Fig. 1. The approximate value of a function $u(x,y,z)$ at an internal mesh point $(i,j,k)$ is denoted by $u_0$. The approximate values of its immediate 18 neighboring mesh points are denoted by $u_l$, $l = 1, 2, \ldots, 18$, as in Fig. 1. The 8 corner points of the unit cube are not used in our finite difference scheme. The discrete values of $\lambda_l$, $\mu_l$, $\phi_l$ and $f_l$ for $l = 0, 1, \ldots, 6$ are defined analogously. We expand $u(x,y,z)$ in Taylor series at a grid point 0 as

$$u(x,y,z) = \sum_{i,j,k} a_{i,j,k}\, x^i y^j z^k. \tag{2}$$

The convection coefficients $\lambda$, $\mu$, $\phi$ and the forcing function $f$ are expanded in Taylor series analogously. The expansions are substituted into Eq. (1) to obtain a finite difference approximation of order 4. This is achieved by truncating the Taylor series to order 4 (by setting all the Taylor series coefficients $a_{i,j,k}$ to zero for $i + j + k > 4$). The procedure is straightforward but extremely tedious and requires substantial symbolic manipulations. Some fourth-order compact finite difference schemes for the 3D elliptic differential equations were obtained by Ananthakrishnaiah et al. [1] using a lot of pencil and paper analysis. Such approximations may now be derived using symbolic software such as Mathematica or Maple [12]. The fourth-order 19-point approximation of Eq. (1) is given by

$$\sum_{l=0}^{18} c_l u_l = F_0, \tag{3}$$

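where the coefficients $c_l$ are given by

$$
\begin{aligned}
c_0 &= -\big[24 + h^2(\lambda_0^2 + \mu_0^2 + \phi_0^2) + h(\lambda_1 - \lambda_3 + \mu_2 - \mu_4 + \phi_5 - \phi_6)\big],\\
c_1 &= 2 - \tfrac{h}{4}(2\lambda_0 - 3\lambda_1 - \lambda_2 + \lambda_3 - \lambda_4 - \lambda_5 - \lambda_6) + \tfrac{h^2}{8}\big[4\lambda_0^2 + \lambda_0(\lambda_1 - \lambda_3) + \mu_0(\lambda_2 - \lambda_4) + \phi_0(\lambda_5 - \lambda_6)\big],\\
c_2 &= 2 - \tfrac{h}{4}(2\mu_0 - \mu_1 - 3\mu_2 - \mu_3 + \mu_4 - \mu_5 - \mu_6) + \tfrac{h^2}{8}\big[4\mu_0^2 + \lambda_0(\mu_1 - \mu_3) + \mu_0(\mu_2 - \mu_4) + \phi_0(\mu_5 - \mu_6)\big],\\
c_3 &= 2 + \tfrac{h}{4}(2\lambda_0 + \lambda_1 - \lambda_2 - 3\lambda_3 - \lambda_4 - \lambda_5 - \lambda_6) + \tfrac{h^2}{8}\big[4\lambda_0^2 - \lambda_0(\lambda_1 - \lambda_3) - \mu_0(\lambda_2 - \lambda_4) - \phi_0(\lambda_5 - \lambda_6)\big],\\
c_4 &= 2 + \tfrac{h}{4}(2\mu_0 - \mu_1 + \mu_2 - \mu_3 - 3\mu_4 - \mu_5 - \mu_6) + \tfrac{h^2}{8}\big[4\mu_0^2 - \lambda_0(\mu_1 - \mu_3) - \mu_0(\mu_2 - \mu_4) - \phi_0(\mu_5 - \mu_6)\big],\\
c_5 &= 2 - \tfrac{h}{4}(2\phi_0 - \phi_1 - \phi_2 - \phi_3 - \phi_4 - 3\phi_5 + \phi_6) + \tfrac{h^2}{8}\big[4\phi_0^2 + \lambda_0(\phi_1 - \phi_3) + \mu_0(\phi_2 - \phi_4) + \phi_0(\phi_5 - \phi_6)\big],\\
c_6 &= 2 + \tfrac{h}{4}(2\phi_0 - \phi_1 - \phi_2 - \phi_3 - \phi_4 + \phi_5 - 3\phi_6) + \tfrac{h^2}{8}\big[4\phi_0^2 - \lambda_0(\phi_1 - \phi_3) - \mu_0(\phi_2 - \phi_4) - \phi_0(\phi_5 - \phi_6)\big],\\
c_7 &= 1 + \tfrac{h}{2}(\lambda_0 + \mu_0) + \tfrac{h}{8}(\lambda_2 - \lambda_4 + \mu_1 - \mu_3) + \tfrac{h^2}{4}\lambda_0\mu_0,\\
c_8 &= 1 - \tfrac{h}{2}(\lambda_0 - \mu_0) - \tfrac{h}{8}(\lambda_2 - \lambda_4 + \mu_1 - \mu_3) - \tfrac{h^2}{4}\lambda_0\mu_0,\\
c_9 &= 1 - \tfrac{h}{2}(\lambda_0 + \mu_0) + \tfrac{h}{8}(\lambda_2 - \lambda_4 + \mu_1 - \mu_3) + \tfrac{h^2}{4}\lambda_0\mu_0,\\
c_{10} &= 1 + \tfrac{h}{2}(\lambda_0 - \mu_0) - \tfrac{h}{8}(\lambda_2 - \lambda_4 + \mu_1 - \mu_3) - \tfrac{h^2}{4}\lambda_0\mu_0,\\
c_{11} &= 1 + \tfrac{h}{2}(\lambda_0 + \phi_0) + \tfrac{h}{8}(\lambda_5 - \lambda_6 + \phi_1 - \phi_3) + \tfrac{h^2}{4}\lambda_0\phi_0,\\
c_{12} &= 1 + \tfrac{h}{2}(\mu_0 + \phi_0) + \tfrac{h}{8}(\mu_5 - \mu_6 + \phi_2 - \phi_4) + \tfrac{h^2}{4}\mu_0\phi_0,\\
c_{13} &= 1 - \tfrac{h}{2}(\lambda_0 - \phi_0) - \tfrac{h}{8}(\lambda_5 - \lambda_6 + \phi_1 - \phi_3) - \tfrac{h^2}{4}\lambda_0\phi_0,\\
c_{14} &= 1 - \tfrac{h}{2}(\mu_0 - \phi_0) - \tfrac{h}{8}(\mu_5 - \mu_6 + \phi_2 - \phi_4) - \tfrac{h^2}{4}\mu_0\phi_0,\\
c_{15} &= 1 + \tfrac{h}{2}(\lambda_0 - \phi_0) - \tfrac{h}{8}(\lambda_5 - \lambda_6 + \phi_1 - \phi_3) - \tfrac{h^2}{4}\lambda_0\phi_0,\\
c_{16} &= 1 + \tfrac{h}{2}(\mu_0 - \phi_0) - \tfrac{h}{8}(\mu_5 - \mu_6 + \phi_2 - \phi_4) - \tfrac{h^2}{4}\mu_0\phi_0,\\
c_{17} &= 1 - \tfrac{h}{2}(\lambda_0 + \phi_0) + \tfrac{h}{8}(\lambda_5 - \lambda_6 + \phi_1 - \phi_3) + \tfrac{h^2}{4}\lambda_0\phi_0,\\
c_{18} &= 1 - \tfrac{h}{2}(\mu_0 + \phi_0) + \tfrac{h}{8}(\mu_5 - \mu_6 + \phi_2 - \phi_4) + \tfrac{h^2}{4}\mu_0\phi_0,\\
F_0 &= \tfrac{h^2}{2}(6f_0 + f_1 + f_2 + f_3 + f_4 + f_5 + f_6) + \tfrac{h^3}{4}\big[\lambda_0(f_1 - f_3) + \mu_0(f_2 - f_4) + \phi_0(f_5 - f_6)\big].
\end{aligned}
$$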

When $\lambda = \mu = \phi = 0$, Eq. (1) reduces to the 3D Poisson equation and the finite difference scheme (3) reduces to the 19-point formula; see, e.g., Ref. [12].
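As an illustrative check of scheme (3), the following Python sketch evaluates the 19 coefficients and the right-hand side in the special case of constant convection coefficients, where differences such as $\lambda_1 - \lambda_3$ vanish. The names `OFFSETS`, `stencil_constant` and `rhs_constant` are hypothetical. The numbering of the neighbors is an assumption inferred from the signs of the coefficients, since Fig. 1 is not reproduced here.

```python
# Offsets (dx, dy, dz) of stencil points 0..18. ASSUMPTION: this numbering is
# inferred from the signs in the coefficient formulas, not read off Fig. 1.
OFFSETS = [
    (0, 0, 0),
    (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1),
    (1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0),
    (1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1),
    (1, 0, -1), (0, 1, -1), (-1, 0, -1), (0, -1, -1),
]


def stencil_constant(lam, mu, phi, h):
    """Return [c_0, ..., c_18] of scheme (3) for constant lam, mu, phi."""
    vel = (lam, mu, phi)
    coeffs = []
    for off in OFFSETS:
        axes = [a for a in range(3) if off[a] != 0]
        if not axes:  # centre point: c_0
            c = -(24.0 + h * h * (lam * lam + mu * mu + phi * phi))
        elif len(axes) == 1:  # face neighbours: c_1..c_6
            a = axes[0]
            c = 2.0 + off[a] * h * vel[a] + 0.5 * h * h * vel[a] ** 2
        else:  # edge neighbours: c_7..c_18
            a, b = axes
            sa, sb = off[a], off[b]
            c = (1.0 + 0.5 * h * (sa * vel[a] + sb * vel[b])
                 + 0.25 * h * h * sa * sb * vel[a] * vel[b])
        coeffs.append(c)
    return coeffs


def rhs_constant(f, x, y, z, lam, mu, phi, h):
    """Right-hand side F_0 of scheme (3) at (x, y, z), constant coefficients."""
    fv = [f(x + dx * h, y + dy * h, z + dz * h) for dx, dy, dz in OFFSETS[:7]]
    return (0.5 * h * h * (6.0 * fv[0] + sum(fv[1:]))
            + 0.25 * h ** 3 * (lam * (fv[1] - fv[3]) + mu * (fv[2] - fv[4])
                               + phi * (fv[5] - fv[6])))


if __name__ == "__main__":
    c = stencil_constant(0.0, 0.0, 0.0, 0.1)
    print(c[0], c[1], c[7])  # centre, face and edge weights in the Poisson case
```

With zero convection the centre, face and edge weights are −24, 2 and 1. For any constant coefficients the 19 weights sum to zero, so constant functions are reproduced exactly.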

Our fourth-order finite difference approximation given above is in a compact form in the sense that it only involves the 18 neighboring grid points nearest to the reference grid point in a unit cube. No special formulas are needed for approximations at grid points near the boundaries. The compactness also means that the computed accuracy of the scheme is increased at the expense of only a slight increase in the density of the sparse matrix structure as compared to the conventional 7-point $O(h^2)$ stencil. A major advantage of the compact scheme over noncompact schemes is that it reduces communication costs in parallel computation associated with domain decomposition techniques.

After suitable boundary conditions are incorporated into the finite difference equations (3) at all internal grid points, we obtain a system of linear equations

$$A u = b. \tag{4}$$

The coefficient matrix $A$ is very large and sparse; it is nonsymmetric and nonpositive definite if the convection coefficients are not identically zero. The matrix $A$ loses diagonal dominance when the convection coefficients assume large values. Our experience with the 2D problems [23] showed that there is no stability difficulty with the fourth-order compact schemes even without the diagonal dominance. Our Fourier smoothing analysis in Section 3.2 and numerical results in Section 4 show that there are no stability problems in the 3D case either, and that the multigrid method converges without the diagonal dominance.

Except for nominal arithmetic operations to compute the stencil coefficients (which can usually be done once for all grid points at the beginning of the computation), one iteration of a Gauss–Seidel type iterative method with the 19-point scheme requires 37 operations, while one iteration with the standard 7-point scheme requires 13 operations; thus one iteration with the 19-point scheme costs almost 3 times as much as one with the 7-point scheme. The relatively high cost of the 19-point scheme is repaid by the high accuracy of the computed solution. Suppose that the meshsize used for the 19-point scheme is $h_{19}$ and that for the 7-point scheme is $h_7$. If comparable accuracy can be achieved by choosing $h_{19} = 2h_7$, the size of the linear system from the 19-point scheme is only about $1/8$ of the size of the linear system from the 7-point scheme. If the convergence rate remains the same, the 19-point scheme is $104/37 \approx 2.8$ times faster than the 7-point scheme. In practice, the 19-point scheme frequently requires much coarser discretization, say, $h_{19} > 4h_7$ or even $h_{19} = 8h_7$, to yield a comparable accuracy. This indicates that the relative computational cost of using the 19-point compact scheme may be even lower.
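The factor $104/37$ follows from multiplying the operation count per grid point by the number of unknowns, which scales like $h^{-3}$ in 3D. The short sketch below reproduces that arithmetic; the function name `speedup` is hypothetical, and the calculation assumes, as in the text, that both schemes need the same number of iterations.

```python
from fractions import Fraction


def speedup(mesh_ratio, ops19=37, ops7=13):
    """Work of the 7-point scheme divided by work of the 19-point scheme.

    mesh_ratio = h19 / h7; the 7-point grid then has mesh_ratio**3 times
    as many unknowns. Equal iteration counts are ASSUMED for both schemes.
    """
    return Fraction(ops7 * mesh_ratio ** 3, ops19)


if __name__ == "__main__":
    for ratio in (2, 4, 8):
        print(ratio, float(speedup(ratio)))
```

For a mesh ratio of 2 this gives $13 \cdot 8/37 = 104/37$. The values for ratios 4 and 8 are extrapolations under the same assumption, not results reported in the paper.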

3. Solution strategies

In this section, we discuss solution strategies for efficiently solving the linear system (4).

3.1. Multigrid method

It is well known that classical iterative (relaxation) methods converge very slowly for solving large linear systems. We resort to the multigrid method, which has been shown to be very effective for solving discretized elliptic problems [21].

The multigrid method is based on the idea that classical relaxation methods strongly damp the oscillatory error components, but converge slowly for smooth error components. Hence, after a few relaxation sweeps, we compute the smooth residual and project it to a coarser grid on which the smooth error components become more oscillatory. Solving the residual equation on a coarse grid, interpolating the error correction back to the fine grid, and adding it to the current approximate solution gives us the two-level method. The multigrid method exploits the idea that the residual equation on the coarse grid has a similar structure as the original problem on the fine grid and the basic idea of the two-level method can be applied recursively. Therefore, on the coarse grid, relaxation sweeps are carried out and the smooth residual is projected to an even coarser grid. This process may go down to the coarsest grid where a direct solver or several relaxation sweeps may be employed to obtain the solution (both approaches are cheap because the size of the linear system on the coarsest grid is small). Then the corrections are interpolated back to finer grids until the process reaches the finest grid and the fine grid approximate solution is corrected. The procedure just described is a multigrid V-cycle algorithm. In a multigrid V($\nu_1$, $\nu_2$)-cycle algorithm, we carry out $\nu_1$ relaxation sweeps on a given grid before going to a coarser grid and do $\nu_2$ relaxation sweeps after adding the coarse grid correction to the current approximation.
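The recursive structure of a V($\nu_1$, $\nu_2$)-cycle can be sketched on a much simpler model problem. The Python code below is not the 3D solver of this paper: it assumes the 1D problem $-u'' = f$ on $(0,1)$ with zero Dirichlet data, the standard 3-point stencil, lexicographic Gauss–Seidel smoothing, full weighting and linear interpolation. All function names are hypothetical.

```python
import math


def smooth(u, f, h, sweeps):
    """Lexicographic Gauss-Seidel sweeps for -u'' = f, u = 0 at both ends."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """Residual r = f - A u at interior points (zero at the boundary)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def v_cycle(u, f, h, nu1=2, nu2=1):
    """One V(nu1, nu2)-cycle; len(u) - 1 must be a power of two."""
    n = len(u) - 1
    if n == 2:  # coarsest grid: a single unknown, solved exactly
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    smooth(u, f, h, nu1)  # pre-smoothing
    r = residual(u, f, h)
    nc = n // 2
    rc = [0.0] * (nc + 1)
    for i in range(1, nc):  # full-weighting restriction of the residual
        rc[i] = 0.25 * (r[2 * i - 1] + 2.0 * r[2 * i] + r[2 * i + 1])
    ec = v_cycle([0.0] * (nc + 1), rc, 2.0 * h, nu1, nu2)
    for i in range(nc):  # linear interpolation of the coarse correction
        u[2 * i] += ec[i]
        u[2 * i + 1] += 0.5 * (ec[i] + ec[i + 1])
    smooth(u, f, h, nu2)  # post-smoothing
    return u


def solve(n=64, cycles=10):
    """Run V-cycles for f = pi^2 sin(pi x), whose exact solution is sin(pi x)."""
    h = 1.0 / n
    f = [math.pi ** 2 * math.sin(math.pi * i * h) for i in range(n + 1)]
    u = [0.0] * (n + 1)
    for _ in range(cycles):
        v_cycle(u, f, h)
    return u, max(abs(v) for v in residual(u, f, h))
```

Calling `solve()` returns the approximate solution and the maximum residual after the requested number of cycles. The same recursion carries over to 3D once the smoother and the transfer operators are replaced.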

3.2. Fourier smoothing analysis

The convergence of the multigrid method may be predicted by a Fourier smoothing analysis. We assume a cubic domain $\Omega = (0,1)^3$, periodic boundary conditions, a uniform finite difference molecule and row-wise upward point-numbering in each x–y plane marching along the z direction (lexicographic order). Our methodology and notations are similar to those used by Wesseling [21] and exemplified by Bandy [2]. An alternative methodology (half space analysis) can be found in Ref. [6].

Smoothing methods are certain basic iterative procedures, which are typically based on some (regular) splittings of the coefficient matrix $A$ of linear system (4)

$$A = M - N,$$

and the error amplification matrix is given by

$$E = I - M^{-1} A.$$

It is convenient to use stencil notations and we assume that the matrices $A$, $M$ and $N$ are represented in stencil notation by $[A]$, $[M]$ and $[N]$, respectively. For the purpose of the analysis, we assume that the convection coefficients $\lambda$, $\mu$, $\phi$ in Eq. (1) are constant. If, in addition, periodic boundary conditions are assumed, the stencils $[A]$, $[M]$ and $[N]$ are independent of the grid point $(i,j,k)$. We assume that $E$ has a complete set of eigenfunctions (local modes) $\Phi(\theta)$. If the error vectors before and after the smoothing process are $e_k$ and $e_{k+1}$, we have

$$e_{k+1} = E e_k.$$

It follows that

$$E\,\Phi(\theta) = \rho(\theta)\,\Phi(\theta),$$

where $\rho(\theta)$ is an eigenvalue associated with the eigenfunction $\Phi(\theta)$. The eigenfunctions of $E$ are

$$\Phi_{i,j,k}(\theta) = e^{\mathrm{i}(i\theta_1 + j\theta_2 + k\theta_3)}, \qquad \theta \in \Theta(h),$$

where $\mathrm{i} = \sqrt{-1}$ and $\theta = (\theta_1, \theta_2, \theta_3)$. $\Theta(h)$ is defined as

$$\Theta(h) = \Big\{ (\theta_1, \theta_2, \theta_3):\ \theta_l = 2\pi h m_l,\ \ m_l = -\tfrac{n}{2} - 1,\ -\tfrac{n}{2},\ \ldots,\ \tfrac{n}{2};\ \ l = 1, 2, 3 \Big\}, \tag{5}$$

and $h = 1/n$ is the uniform meshsize used in our finite difference scheme ($n$ is the number of grid points in each coordinate direction and is assumed to be even). The corresponding (absolute values of the) eigenvalues of $E$ are given by

$$\rho(\theta) = \frac{\Big|\sum_{d} N(d)\, e^{\mathrm{i}(i\theta_1 + j\theta_2 + k\theta_3)}\Big|}{\Big|\sum_{d} M(d)\, e^{\mathrm{i}(i\theta_1 + j\theta_2 + k\theta_3)}\Big|}, \qquad \theta \in \Theta(h), \tag{6}$$

where $d = (i,j,k)$. For simplicity, we only perform here the smoothing analysis of the (point) lexicographic Gauss–Seidel method, though most other relaxation methods may be analyzed analogously. After substituting the stencil values (see Fig. 1 and Eq. (3)) into Eq. (6), we have


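$$
\rho(\theta) = \frac{\left|\begin{aligned} & c_1 e^{\mathrm{i}\theta_1} + c_2 e^{\mathrm{i}\theta_2} + c_5 e^{\mathrm{i}\theta_3} + c_7 e^{\mathrm{i}(\theta_1+\theta_2)} + c_8 e^{\mathrm{i}(-\theta_1+\theta_2)} \\ & + c_{11} e^{\mathrm{i}(\theta_1+\theta_3)} + c_{12} e^{\mathrm{i}(\theta_2+\theta_3)} + c_{13} e^{\mathrm{i}(-\theta_1+\theta_3)} + c_{14} e^{\mathrm{i}(-\theta_2+\theta_3)} \end{aligned}\right|}{\left|\begin{aligned} & c_0 + c_3 e^{-\mathrm{i}\theta_1} + c_4 e^{-\mathrm{i}\theta_2} + c_6 e^{-\mathrm{i}\theta_3} + c_9 e^{\mathrm{i}(-\theta_1-\theta_2)} + c_{10} e^{\mathrm{i}(\theta_1-\theta_2)} \\ & + c_{15} e^{\mathrm{i}(\theta_1-\theta_3)} + c_{16} e^{\mathrm{i}(\theta_2-\theta_3)} + c_{17} e^{\mathrm{i}(-\theta_1-\theta_3)} + c_{18} e^{\mathrm{i}(-\theta_2-\theta_3)} \end{aligned}\right|}.
$$

In the multigrid method, we are interested in determining how the high frequencies (relative to the meshsize) are damped. The set of the low frequencies is defined as

$$\Theta_L(h) = \Theta(h) \cap \Big(-\frac{\pi}{2}, \frac{\pi}{2}\Big)^3;$$

the set of the high frequencies is defined as

$$\Theta_H(h) = \Theta(h) \setminus \Theta_L(h).$$

The Fourier smoothing factor is defined as

$$\mu = \max_{\theta \in \Theta_H(h)} \rho(\theta). \tag{7}$$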

The effect of Dirichlet boundary conditions can be incorporated heuristically as follows. Since the errors are zero on the boundary, we ignore the frequencies associated with $\theta_1 = 0$, and/or $\theta_2 = 0$ and/or $\theta_3 = 0$. We then redefine the set of high frequencies in the case of the Dirichlet boundary conditions as

$$\tilde\Theta_H(h) = \Theta_H(h) \cap \{\theta \in \Theta(h):\ \theta_1 \neq 0 \text{ and/or } \theta_2 \neq 0 \text{ and/or } \theta_3 \neq 0\},$$

and the corresponding smoothing factor $\tilde\mu$ is defined as

$$\tilde\mu = \max_{\theta \in \tilde\Theta_H(h)} \rho(\theta). \tag{8}$$

The above two definitions of smoothing factor are grid-dependent because they depend on $h$. The smoothing factor definition can be made grid-independent by changing the definition of the discrete set $\Theta(h)$ in Eq. (5) to be

$$\Theta(h) = \{(\theta_1, \theta_2, \theta_3):\ \theta_l \in [-\pi, \pi],\ l = 1, 2, 3\},$$

and by replacing Eqs. (7) and (8) with

$$\mu = \sup_{\theta \in \Theta_H(h)} \rho(\theta), \quad \text{and} \quad \tilde\mu = \sup_{\theta \in \tilde\Theta_H(h)} \rho(\theta),$$

respectively.

It is very difficult to obtain an analytical solution to Eq. (8). We numerically compute $\tilde\mu$ for some values of the convection coefficients. Four cases are considered:

Case 1: $\lambda = \mu = \phi = Re$; Case 2: $\lambda = \mu = -\phi = Re$;
Case 3: $\lambda = \mu = Re$, $\phi = 0$; Case 4: $\lambda = -\mu = Re$, $\phi = 0$;

where $Re$ varies from 1 to $10^9$. Throughout this paper, $Re$ is a constant used to roughly indicate the magnitude of the convection coefficients and is often referred to as the Reynolds number reflecting the ratio of the convection to diffusion. We choose a uniform gridsize $h = 0.01$. For the above four cases, we plot in Fig. 2 the smoothing factor $\tilde\mu$ as a function of logarithm of $Re$.
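A numerical evaluation of the smoothing factor can be sketched as follows. This is a minimal illustration under several assumptions, not the code used for Fig. 2: the convection coefficients are constant, the lexicographic order runs with x fastest and z slowest, the frequencies are $\theta_l = 2\pi m_l/n$ with the $n$ distinct values $m_l = -n/2+1, \ldots, n/2$, and the Dirichlet heuristic is read as dropping every mode with at least one zero component. The number of modes `n` is decoupled from `h` so that the example stays cheap. All names are hypothetical.

```python
import cmath
import math
from itertools import product


def stencil_constant(lam, mu, phi, h):
    """Map offset (dx, dy, dz) -> coefficient of scheme (3), constant coefficients."""
    vel = (lam, mu, phi)
    st = {}
    for off in product((-1, 0, 1), repeat=3):
        axes = [a for a in range(3) if off[a] != 0]
        if len(axes) == 0:
            st[off] = -(24.0 + h * h * (lam * lam + mu * mu + phi * phi))
        elif len(axes) == 1:
            a = axes[0]
            st[off] = 2.0 + off[a] * h * vel[a] + 0.5 * h * h * vel[a] ** 2
        elif len(axes) == 2:
            a, b = axes
            st[off] = (1.0 + 0.5 * h * (off[a] * vel[a] + off[b] * vel[b])
                       + 0.25 * h * h * off[a] * off[b] * vel[a] * vel[b])
        # the 8 cube corners (three nonzero offsets) are not in the stencil
    return st


def smoothing_factor(lam, mu, phi, h, n=16):
    """Max of rho(theta) over the high-frequency modes, lexicographic Gauss-Seidel."""
    st = stencil_constant(lam, mu, phi, h)
    worst = 0.0
    modes = range(-n // 2 + 1, n // 2 + 1)
    for m in product(modes, repeat=3):
        if 0 in m:  # ASSUMED reading of the Dirichlet heuristic
            continue
        if all(4 * abs(ml) < n for ml in m):  # low frequency: |theta_l| < pi/2
            continue
        theta = [2.0 * math.pi * ml / n for ml in m]
        num = 0j
        den = 0j
        for (dx, dy, dz), c in st.items():
            term = c * cmath.exp(1j * (dx * theta[0] + dy * theta[1] + dz * theta[2]))
            if (dz, dy, dx) <= (0, 0, 0):  # centre or already-updated point: part of M
                den += term
            else:  # not yet updated: part of N
                num += term
        worst = max(worst, abs(num) / abs(den))
    return worst


if __name__ == "__main__":
    print(smoothing_factor(0.0, 0.0, 0.0, h=1.0 / 16, n=16))
```

Under these assumptions the split between `den` and `num` places $c_0, c_3, c_4, c_6, c_9, c_{10}, c_{15}, \ldots, c_{18}$ in the denominator, matching the explicit expression for $\rho(\theta)$. Larger `n` gives a finer sampling of the frequency cube at a cost that grows like $n^3$.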

It is clear from Fig. 2 that the smoothing factor $\tilde\mu$ is smaller than 1 uniformly in $Re$, which indicates that the iterations would converge for all values of the Reynolds number $Re$. We note here that the smoothing factor $\tilde\mu$ increases as $Re$ increases. For the isotropic cases (Cases 1 and 2), $\tilde\mu$ approaches some constants (0.7212 and 0.8199, respectively) as $Re \to \infty$. For the anisotropic cases (Cases 3 and 4), $\tilde\mu$ approaches 1 from below as $Re \to \infty$. In all cases, $\tilde\mu$ seems to reach an asymptotic value when $Re \ge 10^6$, which is usually referred to as the Reynolds number limit. The behavior of these iterative methods changes little beyond $Re = 10^6$.

The Fourier smoothing analysis indicates that the (point) lexicographic Gauss–Seidel relaxation method may be a good smoother when the Reynolds number $Re$ is not too large and convection is not dominant in Eq. (1). We expect that the multigrid method using our 19-point compact scheme with the (point) lexicographic Gauss–Seidel relaxation and appropriate inter-grid transfer operators would converge for all (moderate) values of the convection coefficients, though the convergence may deteriorate for convection-dominated and anisotropic problems. (We note here that the asymptotic behavior exhibited in Fig. 2 may not be reflected in actual computations for convection dominated cases where special techniques may need to be employed. Further, the asymptotic behavior exhibited for $Re$ values in the range $10^0$–$10^9$ does not necessarily provide an indicator of performance for large Reynolds numbers.)

Fig. 2. Smoothing factor $\tilde\mu$ as a function of logarithm of $Re$.

3.3. Parallelization and vectorization by coloring grid

It is known that the relaxation direction has a strong influence on the convergence of the lexicographic Gauss–Seidel type methods when the convection–diffusion equation represents strong convection in a particular direction. The direction of the relaxation should generally follow the convection direction, otherwise the convergence could deteriorate seriously. When the convection direction is unknown, it is important to use robust methods whose convergence does not strongly depend on the convection direction.

Since all grid points are updated using similar information (simultaneous replacement), the convergence of the Jacobi iteration method does not depend on the convection direction and the Jacobi method can be totally parallelized. The drawback of the Jacobi method is that when it is used as a smoother in the multigrid method, it usually needs to be damped by a damping factor which is very difficult to estimate for most practical problems. Even with a damping factor obtained by trial and error, the smoothing effect of the (damped) Jacobi relaxation is usually poor.

The lexicographic Gauss–Seidel relaxation, which has a better smoothing effect than the Jacobi relaxation, is often used as the smoother in the multigrid method. For parallelization and vectorization benefit, we may reorder the grid points by dividing them into several colored groups so that relaxation sweeps can be carried out with respect to each group independently. The rate of convergence of a colored Gauss–Seidel relaxation lies between that of the lexicographic Gauss–Seidel (following the convection direction) and the Jacobi relaxations. In the 2D case, four colors are needed to decouple a 9-point compact scheme and there are essentially two different arrangements of the four colors that may decouple the grid points completely. In the 3D case with our 19-point compact scheme, we find that four colors are also sufficient, but there is only one arrangement that can be used to completely decouple the 3D grid points. For simplicity, we assume that red (R), black (B), green (G) and orange (O) colors are used. For a grid point with a given color, it is necessary that the nearest grid points along the three coordinate directions are marked with different colors. Fig. 3 depicts a reference point colored with red and its 18 nearest neighboring grid points are colored with black, green and orange. Note that updating a red point needs values of 2 nearest and 4 next nearest grid points marked with each of the other three colors. This arrangement is potentially advantageous for estimating correct weights for a scaled residual injection factor by a heuristic residual analysis [22], since, on parallel computers, the residual injection may be more cost-effective than the traditional full-weighting operator.


For the 19-point compact discretization scheme, we noted above that if the grid is colored by four colors, all grid points with each color can be updated simultaneously on parallel computers and four sub-sweeps can be carried out to perform a Gauss–Seidel relaxation on the whole grid. This approach is referred to as four-color Gauss–Seidel relaxation in the sequel. Our four-color Gauss–Seidel relaxation is programmed as four parts (four separate loops); each part updates all grid points of the same color simultaneously using values of the nearest grid points of different colors.
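The decoupling property of a four-coloring can be checked mechanically. The sketch below uses one coloring rule that is consistent with the residual relations (10)–(12); since Fig. 3 is not reproduced here, the rule itself and the names `color`, `neighbour_colors` and `is_decoupled` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# The 18 neighbour offsets of the 19-point stencil: faces and edges, no corners.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3)
              if 1 <= sum(abs(c) for c in d) <= 2]

_NAMES = {(0, 0): "R", (1, 0): "B", (0, 1): "G", (1, 1): "O"}


def color(i, j, k):
    """ASSUMED colouring rule, chosen to agree with Eqs. (10)-(12)."""
    return _NAMES[((i + k) % 2, (j + k) % 2)]


def neighbour_colors(i, j, k):
    """Count the colours of the 18 stencil neighbours of point (i, j, k)."""
    return Counter(color(i + di, j + dj, k + dk) for di, dj, dk in NEIGHBOURS)


def is_decoupled(size=4):
    """True if no point shares its colour with any of its 18 stencil neighbours."""
    return all(color(i, j, k) not in neighbour_colors(i, j, k)
               for i, j, k in product(range(size), repeat=3))


if __name__ == "__main__":
    print(is_decoupled(), dict(neighbour_colors(0, 0, 0)))
```

With this rule each point sees six neighbors of each of the other three colors (2 nearest and 4 next nearest), and the only points of the same color inside the surrounding cube are its 8 corners, which the 19-point scheme does not use.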

The four-color Gauss–Seidel relaxation leads to highly vectorizable and parallelizable solvers. Efficient vectorization is obtained since relaxation on grid points with the same color no longer contains vector feedback dependencies [2]. Parallelization is obtained since the grid points with each color are now decoupled and all the equations of a single color can be computed independently of the other colors. The computations are performed in a number of parallel operations equal to the number of independent colors. In addition to the gains in parallelization and vectorization, practical experience showed that better convergence and smoothing properties are usually obtained with multiple color ordering [2].

3.4. Standard inter-grid transfer operators

Fig. 3. Decoupling the 3D grid points with four colors.

The prolongation operator for transferring coarse grid correction from coarse to fine grids is the trilinear interpolation. Specifically, correction values of common points of fine and coarse grids are directly transferred. Depending on the location, correction values of other fine grid points are obtained by averaging correction values of the nearest two, four or eight points on the coarse grid. For detailed formulas, see Ref. [21, p. 67].
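The averaging rule described above is the tensor product of 1D linear interpolation. A minimal Python sketch, assuming the coarse grid has `nc + 1` points per direction stored as nested lists and the fine grid has `2 * nc + 1`; the function name `prolong_trilinear` is hypothetical.

```python
def prolong_trilinear(ec):
    """Interpolate coarse values ec[i][j][k] to the fine grid by trilinear interpolation."""
    nc = len(ec) - 1
    n = 2 * nc

    def parents(i):
        # Fine index -> list of (coarse index, weight) along one axis.
        if i % 2 == 0:
            return [(i // 2, 1.0)]          # coincides with a coarse point
        return [(i // 2, 0.5), (i // 2 + 1, 0.5)]  # midway between two coarse points

    fine = [[[0.0] * (n + 1) for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            for k in range(n + 1):
                # 1, 2, 4 or 8 coarse parents, depending on how many indices are odd
                fine[i][j][k] = sum(
                    wi * wj * wk * ec[ci][cj][ck]
                    for ci, wi in parents(i)
                    for cj, wj in parents(j)
                    for ck, wk in parents(k)
                )
    return fine
```

A fine point with one, two or three odd indices receives the average of two, four or eight coarse values, respectively, while points common to both grids are copied.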

The residual restriction operator for projecting residual from fine to coarse grids is the full-weighting scheme. The full-weighting operator evaluates residuals at all fine grid points and projects a weighted average of residuals at the nearest 27 points in a cube centered at the reference point according to the following formula:

$$
\begin{aligned}
\bar r_{\bar i,\bar j,\bar k} = \frac{1}{64}\Big[\, & 8 r_{i,j,k} + 4\big(r_{i+1,j,k} + r_{i-1,j,k} + r_{i,j+1,k} + r_{i,j-1,k} + r_{i,j,k+1} + r_{i,j,k-1}\big) \\
& + 2\big(r_{i+1,j+1,k} + r_{i-1,j+1,k} + r_{i+1,j-1,k} + r_{i-1,j-1,k} \\
& \qquad + r_{i+1,j,k+1} + r_{i-1,j,k+1} + r_{i,j+1,k+1} + r_{i,j-1,k+1} \\
& \qquad + r_{i+1,j,k-1} + r_{i-1,j,k-1} + r_{i,j+1,k-1} + r_{i,j-1,k-1}\big) \\
& + r_{i+1,j+1,k+1} + r_{i-1,j+1,k+1} + r_{i+1,j-1,k+1} + r_{i-1,j-1,k+1} \\
& + r_{i+1,j+1,k-1} + r_{i-1,j+1,k-1} + r_{i+1,j-1,k-1} + r_{i-1,j-1,k-1} \Big].
\end{aligned} \tag{9}
$$

Here $r_{i,j,k}$ is the residual on the fine grid at a red point $(i,j,k)$, and $\bar r_{\bar i,\bar j,\bar k}$ is the quantity to be transferred to the corresponding coarse grid point $(\bar i, \bar j, \bar k) = (i/2, j/2, k/2)$.

3.5. Scaled residual injection operator

Another method to transfer residuals from fine to coarse grids is to directly inject the residual. This approach is based on the observation that the residual is smooth and the residual values at nearby grid points do not differ by a large amount after several relaxation sweeps. In this way, only the values of the residual corresponding to the coarse grid need to be evaluated. The cost of the injection operator is about one-eighth of that of the full-weighting operator, though the standard full-weighting operator is usually more accurate than the injection operator. However, since the cost of the multigrid method using the full-weighting operator is approximately equivalent to one more relaxation sweep than that of the multigrid method using the injection operator, the efficiency of the overall scheme will be affected by the trade-offs between more relaxation sweeps (smoother residual) and more accurate (more costly) residual projection operators. In the case of the 3D 7-point scheme for some diffusion problems and on a vector computer, these trade-offs were studied by Gary et al. [10], who preferred simpler inter-grid transfer operators with strong relaxation over the costlier weighting schemes.


When colored relaxations are used as smoothers in the multigrid method, the method of direct residual injection is inaccurate. This was discovered for the red–black Gauss–Seidel relaxation in our earlier work in two dimensions [22] where we found the scaled injection operator (half-injection or under-injection) to be extremely effective for all types of 2D problems. We extend this operator to 3D problems because of its computational efficiency.

For the four-color Gauss–Seidel relaxation in the 3D case, the residual values at grid points with different colors may differ by a substantial amount. In particular, the residual values at all the orange points are zero, because the equations at these points are satisfied exactly after one full relaxation sweep (one sub-sweep on each of the four colored groups of grid points).

In order to find the correct scaling factor for the four-color Gauss–Seidel relaxation, we perform a heuristic residual analysis similar to our analysis for 2D problems [22,24]. The residual analysis is based on some heuristic assumptions on the properties of the residual: First, the residual is sufficiently smooth after the pre-smoothing sweeps. Second, the residual values in a small neighborhood are approximately equal provided all grid points are subject to the same relaxation method. (In a colored relaxation, the second assumption means that only the residual values at the nearest grid points with the same color are approximately equal.) Moreover, we assume that the anisotropy is not strong. (In the current case, this assumption means that the convection coefficients do not differ by a large magnitude.)

To find the optimal residual injection operator with the optimal scaling parameter, we consider the full-weighting scheme (9) with the four-color Gauss–Seidel relaxation. Since residual values at all the orange points are zero, as argued above, we have

$$r_{i+1,j+1,k} = r_{i+1,j-1,k} = r_{i-1,j+1,k} = r_{i-1,j-1,k} = r_{i,j,k+1} = r_{i,j,k-1} = 0. \tag{10}$$

The residual values at grid points with different colors are nonzero. To determine an approximate relationship among their values, we use the following arguments. The four-color Gauss–Seidel relaxation is performed in the order of red, black, green and orange points. At each full sweep, the red points are updated by using old values of all other colors; the black points are updated by using 6 new red points, 6 old green points and 6 old orange points. Hence, the residual values at black points are smaller than those at red points. Since the update of black points uses $1/3$ new values, we assume that the residual value at a black point is about $2/3$ of that at the nearest red points, i.e.

$$r_{i+1,j,k} = r_{i-1,j,k} = r_{i,j+1,k+1} = r_{i,j-1,k+1} = r_{i,j+1,k-1} = r_{i,j-1,k-1} \approx \frac{2}{3}\, r_{i,j,k}. \tag{11}$$

Analogously, since updating a green point uses 6 new red points, 6 new black points and 6 old orange points, we assume that the residual value at a green point is about $1/3$ of that of the nearest red points, i.e.

$$r_{i,j+1,k} = r_{i,j-1,k} = r_{i+1,j,k+1} = r_{i-1,j,k+1} = r_{i+1,j,k-1} = r_{i-1,j,k-1} \approx \frac{1}{3}\, r_{i,j,k}. \tag{12}$$

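We look for an optimal scaling factor $\alpha$ such that $\alpha r_{i,j,k}$ approximates $\bar r_{\bar i,\bar j,\bar k}$ as accurately as possible. We substitute $\bar r_{\bar i,\bar j,\bar k} = \alpha r_{i,j,k}$ and Eqs. (10)–(12) into Eq. (9), taking the residuals at the eight corner points, which have the same color as the reference point, to be approximately $r_{i,j,k}$ in line with the second assumption above. After simplification, we obtain $\alpha = 1/2$. Thus, for the 3D four-color Gauss–Seidel relaxation with the 19-point compact scheme, we propose a half-injection operator to transfer residuals from fine to coarse grids. A similar half-injection operator was used by us with the red–black Gauss–Seidel relaxation for the 2D 5-point scheme [22].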
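The value $\alpha = 1/2$ can be reproduced with exact rational arithmetic. The sketch below builds the full-weighting weights of Eq. (9) as a tensor product of the 1D weights $(1/4, 1/2, 1/4)$ and applies the assumed residual ratios by color from Eqs. (10)–(12). The coloring rule and the treatment of the eight corner points as red points with ratio 1 are assumptions consistent with those equations; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# ASSUMED residual at each colour relative to the red reference point,
# following Eqs. (10)-(12): orange 0, black 2/3, green 1/3, red 1.
RATIO = {"R": Fraction(1), "B": Fraction(2, 3), "G": Fraction(1, 3), "O": Fraction(0)}

_NAMES = {(0, 0): "R", (1, 0): "B", (0, 1): "G", (1, 1): "O"}


def color(di, dj, dk):
    """Colour of the point at offset (di, dj, dk) from a red point (assumed rule)."""
    return _NAMES[((di + dk) % 2, (dj + dk) % 2)]


def full_weighting_weight(offset):
    """Weight of one of the 27 points in Eq. (9): 8, 4, 2 or 1 over 64."""
    w = Fraction(1)
    for c in offset:
        w *= Fraction(1, 2) if c == 0 else Fraction(1, 4)
    return w


def half_injection_factor():
    """Scaling factor alpha obtained by inserting the residual ratios into Eq. (9)."""
    return sum(full_weighting_weight(d) * RATIO[color(*d)]
               for d in product((-1, 0, 1), repeat=3))


if __name__ == "__main__":
    print(half_injection_factor())
```

Each of the four groups (centre, faces, edges, corners) contributes $1/8$ under these assumptions, which gives the total of $1/2$.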

Suppose the order of the restriction operator is $M_r$ and that of the prolongation operator is $M_p$. Let $M_d$ be the order of the underlying differential equation. Brandt [4] argued that the inequality

$$M_r + M_p \ge M_d \tag{13}$$

must be satisfied in order to guarantee asymptotic convergence. For the scaled injection implementation, we have $M_r = 0$, $M_p = 2$ and $M_d = 2$ and inequality (13) is satisfied. It is often suggested in the literature that in order to guarantee per cycle convergence, we should also have $M_r \ge M_d$. The latter condition is not satisfied with our scaled injection operator. Nevertheless, the numerical results in the next section demonstrate that our implementation with scaled injection may be more efficient in terms of less CPU time than the full-weighting operator.

4. Numerical results

In our numerical experiments, the domain $\Omega$ is chosen as the unit cube $(0,1)^3$. We solve Dirichlet problems for the convection–diffusion equations (1) for a number of test problems; the forcing function $f(x,y,z)$ in Eq. (1) is generated from the knowledge of the exact solution $u(x,y,z)$. Our codes were run mainly on a C-90 supercomputer at the Pittsburgh Supercomputing Center and we used the Cray Fortran 77 programming language in single precision arithmetic (roughly equivalent to double precision on conventional machines). The C-90 is a vector machine with 16 processors. We precomputed and stored all entries of the coefficient matrices on all grid levels (this computational cost is included in the CPU time reported). Unless otherwise indicated explicitly, we used a uniform meshsize varying from $h = 1/16$ to $h = 1/64$, with the finest grid containing 250,047 unknowns. The initial approximation for all iterations was taken to be $u(x,y,z) = 0$ and the computations were terminated when the residual in discrete $L_2$-norm was reduced by a factor of $10^{10}$. We report maximum errors as the maximum absolute error over all grid points.


4.1. Test problems

We chose four test problems with both constant and variable coefficients. All test problems were solved using the standard multigrid technique with either the standard full-weighting or the scaled injection operator as the restriction operator, and the standard trilinear interpolation as the prolongation operator. The test problems were solved with our 19-point finite difference approximation scheme and with the standard upwind difference approximation scheme for the convection–diffusion equation (1). We did not test the 7-point central difference approximation because it is unstable for large Re. All test problems were solved for small to moderate values of the Reynolds number ($Re \le 10^3$).

In this paper, we present numerical results for small to moderate $Re$ ($\le 10^3$), the same range as tested by Dendy [8] and Bandy [2]. Problems with large Reynolds numbers require completely different solution strategies.

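Problem 1.

$$
\begin{aligned}
\lambda(x,y,z) &= -Re \cos\alpha \cos\beta, \\
\mu(x,y,z) &= -Re \cos\alpha \sin\beta, \\
\phi(x,y,z) &= -Re \sin\alpha, \\
u(x,y,z) &= \sin \pi x \, \sin \pi y \, \sin \pi z,
\end{aligned}
$$

with $\alpha = 35^\circ$, $\beta = 45^\circ$. This problem has constant coefficients.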

Problem 2.

$$
\begin{aligned}
\lambda(x,y,z) &= Re \sin y \, \sin z \, \cos x, \\
\mu(x,y,z) &= Re \sin x \, \sin z \, \cos y, \\
\phi(x,y,z) &= Re \sin x \, \sin y \, \cos z, \\
u(x,y,z) &= xyz(1-x)(1-y)(1-z)\exp(x+y+z).
\end{aligned}
$$

This problem has a smooth solution, and most solution methods should work well on it.

Problem 3.

$$
\begin{aligned}
\lambda(x,y,z) &= Re \, x(1-2y)(1-z), \\
\mu(x,y,z) &= Re \, y(1-2z)(1-x), \\
\phi(x,y,z) &= Re \, z(1-2x)(1-y), \\
u(x,y,z) &= \sin \pi x + \sin \pi y + \sin \pi z + \sin 3\pi x + \sin 3\pi y + \sin 3\pi z.
\end{aligned}
$$

This problem has variable coefficients, and there is a 3D stagnation point in the computational domain at $(0.5, 0.5, 0.5)$ where all convection coefficients vanish. There are many 2D stagnation points where at least two convection coefficients vanish. These stagnation points present convergence difficulty for the multigrid solution method, especially when Re is large [7].
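The stagnation structure of Problem 3 is easy to check numerically. The sketch below evaluates the three convection coefficients of Problem 3 and counts how many vanish at a given point; the function names, the variable names `lam`, `mu`, `phi` and the sample points are illustrative choices, not part of the original experiments.

```python
def coefficients_problem3(x, y, z, re):
    """Convection coefficients of Problem 3 at the point (x, y, z)."""
    lam = re * x * (1 - 2 * y) * (1 - z)
    mu = re * y * (1 - 2 * z) * (1 - x)
    phi = re * z * (1 - 2 * x) * (1 - y)
    return lam, mu, phi


def count_vanishing(x, y, z, re=100.0):
    """Number of convection coefficients that are exactly zero at (x, y, z)."""
    return sum(1 for c in coefficients_problem3(x, y, z, re) if c == 0.0)


at_center = count_vanishing(0.5, 0.5, 0.5)    # all three coefficients vanish
on_line = count_vanishing(0.25, 0.5, 0.5)     # two vanish on the line y = z = 0.5
generic = count_vanishing(0.25, 0.3, 0.7)     # none vanish at a generic point
```

Every interior point on the lines y = z = 0.5, z = x = 0.5 and x = y = 0.5 has at least two vanishing coefficients, which is consistent with the statement that there are many 2D stagnation points.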

Problem 4.

$$
\begin{aligned}
\lambda(x,y,z) &= Re \, x(1-y)(2-z), \\
\mu(x,y,z) &= Re \, y(1-z)(2-x), \\
\phi(x,y,z) &= Re \, z(1-x)(2-y), \\
u(x,y,z) &= (\exp(25x) + \exp(25y) - 2)/(\exp(25) - 1).
\end{aligned}
$$

Here the solution is almost zero everywhere except near $x = 1$ and $y = 1$, where it has a very thin boundary layer. Most solution methods have difficulty in accurately resolving the solution of such problems.
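To get a feel for how thin the layer of Problem 4 is relative to the meshes used, the following sketch evaluates the exact solution and counts the interior grid points along the edge y = 0 at which the solution exceeds a small threshold. The threshold of 0.01 and the helper names are arbitrary illustrative assumptions; the paper does not define the layer width this way.

```python
import math


def exact_solution_problem4(x, y):
    """Exact solution of Problem 4 (it does not depend on z)."""
    return (math.exp(25 * x) + math.exp(25 * y) - 2.0) / (math.exp(25.0) - 1.0)


def points_in_layer(n, threshold=0.01):
    """Interior points i/n (i = 1..n-1) on the line y = 0 where u > threshold."""
    return sum(1 for i in range(1, n)
               if exact_solution_problem4(i / n, 0.0) > threshold)


center_value = exact_solution_problem4(0.5, 0.5)   # tiny away from the layers
corner_value = exact_solution_problem4(1.0, 1.0)   # both layers meet here
coarse_points = points_in_layer(16)                # mesh h = 1/16
fine_points = points_in_layer(64)                  # mesh h = 1/64
```

Under this (assumed) threshold only a couple of points of the h = 1/16 mesh fall inside the layer along that line, against roughly a dozen for h = 1/64, which fits the observation that accuracy improves quickly once the mesh places some points inside the boundary layers.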

4.2. Computed accuracy

We first tested the computed accuracy of the finite difference scheme. For several values of Re, we refined the meshsize from $h = 1/16$ to $h = 1/64$ to demonstrate that the computed accuracy increases rapidly. For a fourth-order convergence rate, the errors are expected to decrease approximately by a factor of 16 when the meshsize is halved. The accuracy data are given in Table 1, which also contains the estimated order of convergence computed as $\ln(|e_{32}|/|e_{64}|)/\ln 2$, where $|e_{32}|$ and $|e_{64}|$ are the maximum errors associated with $h = 1/32$ and $h = 1/64$, respectively. It is clear from Table 1 that our finite difference scheme yields a fourth-order convergence rate for small to moderate Reynolds numbers, though there is a slight degradation in the rates of convergence as Re is increased. We still obtain convergence rates of order $O(h^3)$ or better and expect that fourth-order convergence would be present when the grid size $h$ is further refined. Similar behavior is observed for the fourth-order compact scheme, and for all other methods, in the case of large Reynolds numbers in two dimensions [13,23].
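The estimated order of convergence can be recomputed directly from the tabulated errors. The sketch below applies the formula above to the Problem 1 entries of Table 1 for h = 1/32 and h = 1/64; the dictionary layout and function name are illustrative. Because the table entries are rounded to three significant digits, recomputed rates may differ from the printed ones in the last digit.

```python
import math


def estimated_order(err_coarse, err_fine):
    """ln(|e_coarse| / |e_fine|) / ln 2 for errors on meshes h and h/2."""
    return math.log(err_coarse / err_fine) / math.log(2.0)


# Problem 1 maximum errors from Table 1, keyed by Re: (h = 1/32, h = 1/64).
problem1 = {
    1: (9.45e-7, 5.96e-8),
    10: (3.59e-6, 2.24e-7),
    100: (1.03e-4, 6.45e-6),
    1000: (1.15e-3, 8.36e-5),
}

orders = {re: estimated_order(e32, e64) for re, (e32, e64) in problem1.items()}

for re, p in orders.items():
    print(f"Re = {re:>4}: estimated order {p:.2f}")
```

An error ratio of 16 between successive meshes corresponds to an order of exactly 4, and a ratio of 8 to an order of 3.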

For comparison, we did similar computations using the standard 7-point upwind scheme. These results are listed in Table 2, and it is clear that the 7-point upwind scheme yields approximate solutions of first-order accuracy. As expected, the solutions produced by the fourth-order compact scheme using a coarse grid ($h = 1/16$) are much more accurate than those produced by the first-order upwind scheme using $h = 1/64$.

The resolution of thin boundary layers is a difficult problem with all discretization schemes. If the discretization is not fine enough and there are not enough points in the boundary layers, no standard scheme yields good results. The fourth-order compact scheme and the upwind scheme both performed relatively poorly at $Re = 10^3$. However, as the mesh is refined to allow some points inside the boundary layers, the accuracy of the fourth-order compact scheme increases rapidly. This observation agrees with that made in the 2D case by Gupta et al. [13].

Standard treatments for boundary layers include local adaptive mesh refinement techniques and graded mesh techniques. The graded mesh techniques have been successfully used with the 2D fourth-order compact schemes [14,19], and it has been shown that the fourth-order accuracy can be recovered by such treatments. Similar techniques could be utilized for 3D problems as well.

4.3. Vectorization and ordering effect

We now compare the four-color Gauss–Seidel relaxation with the lexicographic Gauss–Seidel relaxation in terms of the number of iterations and the corresponding CPU time in seconds. A multigrid V(2,2)-cycle algorithm was used to solve the resulting linear system. The standard coarsening technique ($H/h = 2$) was used, and the same 19-point discretization scheme was applied on all grids. On the coarsest grid with only one unknown ($h = 1/2$), one iteration was performed to obtain the exact solution. As mentioned earlier, the direction of the lexicographic Gauss–Seidel relaxation influences the convergence of the iterative method. We define the "forward" direction such that the relaxation visits the grid points following the dimensional direction $0 \to 1$. Similarly, the "backward" direction is defined as following $1 \to 0$. The direction of the four-color Gauss–Seidel relaxation is understood analogously within each sub-sweep.

Table 1
Maximum errors and the estimated order of convergence rate for the fourth-order compact scheme

Test problem   Meshsize           Re = 10^0   Re = 10^1   Re = 10^2   Re = 10^3
Problem 1      h = 1/16           1.50(-5)    5.73(-5)    1.59(-3)    9.04(-3)
               h = 1/32           9.45(-7)    3.59(-6)    1.03(-4)    1.15(-3)
               h = 1/64           5.96(-8)    2.24(-7)    6.45(-6)    8.36(-5)
               Convergence rate   4.00        4.00        4.00        3.78
Problem 2      h = 1/16           1.54(-6)    3.69(-6)    4.01(-5)    5.54(-4)
               h = 1/32           9.58(-8)    2.33(-7)    2.50(-6)    4.01(-5)
               h = 1/64           5.98(-9)    1.45(-8)    1.56(-7)    2.50(-6)
Problem 3      h = 1/16           1.47(-3)    1.49(-3)    9.41(-3)    9.39(-2)
               h = 1/32           9.07(-5)    9.23(-5)    6.12(-4)    7.69(-3)
               h = 1/64           5.56(-6)    5.75(-6)    3.87(-5)    5.54(-4)
Problem 4      h = 1/16           2.13(-2)    3.49(-2)    1.32(-1)    2.77(-1)
               h = 1/32           1.43(-3)    2.46(-3)    1.49(-2)    5.42(-2)
               h = 1/64           9.11(-5)    1.60(-4)    1.17(-3)    6.86(-3)
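The grid hierarchy and the two sweep directions used in this section can be sketched as follows. The helper names are hypothetical, and the choice of which index runs fastest in the lexicographic ordering is an assumption made for illustration; the text only fixes the direction (0 to 1 or 1 to 0) along each dimension.

```python
def grid_levels(n_finest):
    """(n, interior unknowns) per level for h = 1/n, coarsening by H/h = 2."""
    levels = []
    n = n_finest
    while n >= 2:
        levels.append((n, (n - 1) ** 3))
        n //= 2
    return levels


def lexicographic_order(n, forward=True):
    """Interior indices (i, j, k) visited along 0 -> 1 or along 1 -> 0."""
    idx = range(1, n) if forward else range(n - 1, 0, -1)
    return [(i, j, k) for k in idx for j in idx for i in idx]


levels = grid_levels(64)                       # finest mesh h = 1/64
forward = lexicographic_order(4, forward=True)
backward = lexicographic_order(4, forward=False)
```

Starting from h = 1/64, standard coarsening gives six levels and ends at h = 1/2, where the single interior unknown can be solved for in one step. In this sketch the backward ordering is exactly the forward ordering reversed.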

To test the grid-independent convergence of the multigrid method, we used $h = 1/32$ and $h = 1/64$. Numerical results for several values of Re are given in Tables 3–6. We note that for small Reynolds number problems with the lexicographic Gauss–Seidel relaxation, the relaxation direction had almost no effect on the convergence, but did affect the computational efficiency on the vector computer. For the current test situation, the backward relaxation was more efficient than the forward relaxation in terms of CPU time. Only a minimal direction effect was observed with the four-color Gauss–Seidel relaxation; hence the latter may be more robust with respect to computational efficiency.

Table 2
Maximum errors and the estimated order of convergence rate for the standard upwind scheme

Test problem   Meshsize           Re = 10^0   Re = 10^1   Re = 10^2   Re = 10^3
Problem 1      h = 1/16           1.45(-2)    1.28(-1)    2.71(-1)    2.98(-1)
               h = 1/32           8.09(-3)    6.82(-2)    1.48(-1)    1.66(-1)
               h = 1/64           4.28(-3)    3.54(-2)    7.82(-2)    8.87(-2)
Problem 2      h = 1/16           8.17(-4)    6.31(-3)    2.32(-2)    2.98(-2)
               h = 1/32           3.65(-4)    3.15(-3)    1.23(-2)    1.60(-2)
               h = 1/64           1.71(-4)    1.58(-3)    6.34(-3)    8.33(-3)
Problem 3      h = 1/16           7.93(-2)    7.13(-2)    5.69(-1)    1.05(+0)
               h = 1/32           1.93(-2)    4.60(-2)    3.22(-1)    5.92(-1)
               h = 1/64           4.69(-3)    2.75(-2)    1.72(-1)    3.04(-1)
Problem 4      h = 1/16           1.74(-1)    3.08(-1)    5.40(-1)    6.04(-1)
               h = 1/32           5.15(-2)    1.38(-1)    3.03(-1)    3.65(-1)
               h = 1/64           1.51(-2)    6.18(-2)    1.59(-1)    2.04(-1)



For relatively large Reynolds number problems, convergence of the multigrid method with the lexicographic Gauss–Seidel relaxation was strongly affected by the relaxation direction, whereas the four-color Gauss–Seidel relaxation was much more robust. Moreover, even with a favorable relaxation direction, the lexicographic Gauss–Seidel took more CPU time than the four-color Gauss–Seidel method. The gains in computational efficiency for the four-color Gauss–Seidel method are due to the vectorization efficiency resulting from a complete decoupling of the grid space, and analogous CPU timing results may not be reproduced on serial computers.

Table 3

            Lexicographic Gauss–Seidel              Four-color Gauss–Seidel
            Forward             Backward            Forward             Backward
Re          Iteration  CPU      Iteration  CPU      Iteration  CPU      Iteration  CPU
h = 1/32
10^0        9          3.85     9          2.24     8          1.01     8          1.02
10^1        9          3.85     10         2.44     10         1.13     10         1.14
10^2        8          3.45     28         5.92     18         1.62     18         1.64
10^3        63         23.65    82         16.38    61         4.15     61         4.19
h = 1/64
10^0        9          31.75    9          18.58    8          6.85     8          6.86
10^1        9          31.76    10         20.16    10         7.60     10         7.62
10^2        9          31.77    26         45.41    19         10.45    19         10.47
10^3        40         126.32   102        165.43   60         22.92    60         22.97

Table 4
Number of iterations and the CPU time in seconds for Problem 2

            Lexicographic Gauss–Seidel              Four-color Gauss–Seidel
            Forward             Backward            Forward             Backward
Re          Iteration  CPU      Iteration  CPU      Iteration  CPU      Iteration  CPU
h = 1/32
10^0        9          6.09     9          4.49     8          3.23     8          3.26
10^1        9          6.11     9          4.51     8          3.26     8          3.27
10^2        17         9.06     9          4.51     14         3.64     14         3.65
10^3        50         21.29    25         7.61     30         4.62     30         4.65
h = 1/64
10^0        9          50.97    9          37.48    8          25.69    8          25.71
10^1        9          50.93    9          37.78    8          26.17    8          26.18
10^2        16         72.20    10         39.34    14         27.72    14         27.74
10^3        64         219.65   17         50.51    34         33.40    34         33.43

We observe from Table 6 that the convergence does not seem to depend upon the ordering of the grid points or the relaxation direction when the problem contains stagnation points (representing recirculating flow, Problem 3). In this case, the four-color Gauss–Seidel relaxation provides substantial savings in CPU time through better vectorization efficiency.

Table 5

            Lexicographic Gauss–Seidel              Four-color Gauss–Seidel
            Forward             Backward            Forward             Backward
Re          Iteration  CPU      Iteration  CPU      Iteration  CPU      Iteration  CPU
h = 1/32
10^0        8          3.59     8          2.18     7          1.06     7          1.08
10^1        9          3.98     8          2.21     7          1.07     7          1.10
10^2        11         4.72     11         2.80     10         1.25     10         1.28
10^3        40         15.46    40         8.44     40         3.11     40         3.16
h = 1/64
10^0        8          29.81    8          17.97    7          7.63     7          7.67
10^1        9          32.93    8          18.23    7          7.77     7          7.79
10^2        10         36.03    10         21.49    10         8.68     10         8.81
10^3        32         103.04   29         51.50    24         13.14    24         13.19

Table 6

            Lexicographic Gauss–Seidel              Four-color Gauss–Seidel
            Forward             Backward            Forward             Backward
Re          Iteration  CPU      Iteration  CPU      Iteration  CPU      Iteration  CPU
h = 1/32
10^0        8          3.57     8          4.05     7          1.04     7          1.05
10^1        8          3.59     8          4.09     8          1.13     8          1.14
10^2        18         7.27     8          4.09     14         1.49     14         1.53
10^3        78         29.37    74         32.57    70         4.92     70         4.97
h = 1/64
10^0        8          29.62    8          33.75    7          7.44     7          7.46
10^1        8          29.77    8          34.03    7          7.63     7          7.65
10^2        17         57.12    9          37.58    14         9.73     14         9.76
10^3        66         206.10   52         191.14   54         21.85    54         21.96

For small to moderate Reynolds number cases, our multigrid implementation demonstrates typical grid-independent convergence. For problems with large values of Re, our method did not show optimal performance, but this could possibly be achieved by suitable modification of our code.

4.4. Efficiency of scaled residual injection

In this section, we compare the efficiency of the standard full-weighting and the scaled residual injection. The comparisons were done on the C-90 vector computer and on a (serial) SUN workstation. A multigrid V(1,1)-cycle algorithm was used for the comparisons, and the results are listed in Tables 7–9. Since the memory of the workstation is not as large as that of the C-90, we only tested cases with $h = 1/32$.

We observe that there was no significant difference in the number of iterations needed for convergence when using different residual projection operators. However, the scaled injection operator does save CPU time. The savings are modest on the C-90 vector computer, as the preprocessing procedure (computing the coefficient matrices on all grid levels) took a major portion of the CPU time. (It is noted that solving a 3D convection–diffusion equation on a $33 \times 33 \times 33$ grid in 3 seconds is no small achievement.) On the serial SUN computer, the CPU savings are much more attractive.
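The size of these savings can be quantified from the reported timings. The sketch below uses the Problem 1 CPU times of Table 7 (h = 1/32) and computes the relative saving of scaled injection over full weighting on each machine; the data layout and function name are illustrative. Note that the iteration counts of the two operators differ for some values of Re, so these percentages combine the per-cycle cost and the number of cycles.

```python
# (weighting, injection) CPU seconds from Table 7, Problem 1, h = 1/32,
# keyed by the Reynolds number.
c90 = {1: (0.90, 0.84), 10: (1.04, 0.87), 100: (1.70, 1.53), 1000: (4.42, 3.56)}
sun = {1: (12.33, 10.45), 10: (16.11, 11.23),
       100: (33.22, 25.78), 1000: (99.25, 70.92)}


def percent_saving(weighting, injection):
    """Relative CPU saving of injection over weighting, in percent."""
    return 100.0 * (weighting - injection) / weighting


c90_savings = {re: percent_saving(*t) for re, t in c90.items()}
sun_savings = {re: percent_saving(*t) for re, t in sun.items()}

for re in c90:
    print(f"Re = {re:>4}: C-90 {c90_savings[re]:5.1f}%   SUN {sun_savings[re]:5.1f}%")
```

For every Reynolds number in this table the relative saving on the serial workstation is larger than on the C-90, in line with the remark that the fixed preprocessing cost dominates the short run times on the vector machine.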

5. Concluding remarks

The traditional second-order central difference and the first-order upwind schemes have their inherent difficulties, although some defect-correction techniques may be used to combine these two schemes to yield stable and second-order methods for diffusion-dominated problems. It was remarked by Brandt and Yavneh [5] that, at least in the 2D case, some defect-correction techniques may fail to improve computed accuracy for high Reynolds number flow problems and the first-order upwind schemes may yield unreliable computational results. On the other hand, the fourth-order compact schemes, which combine the advantages of both the first- and second-order schemes, provide high accuracy solutions without loss of stability. Our Fourier smoothing analysis and numerical results confirm the numerical stability of the 19-point compact scheme in the context of the multigrid method.

Table 7
Comparison of full-weighting and scaled residual injection for Problem 1 (h = 1/32)

        Iteration               CPU time on C-90        CPU time on SUN
Re      Weighting  Injection    Weighting  Injection    Weighting  Injection
10^0    11         11           0.90       0.84         12.33      10.45
10^1    15         12           1.04       0.87         16.11      11.23
10^2    35         34           1.70       1.53         33.22      25.78
10^3    110        102          4.42       3.56         99.25      70.92

We studied the parallelization and vectorization potential of the Gauss–Seidel relaxation by partitioning the grid space with four colors. It was argued by Bandy [2] that eight colors are needed to decouple the grid points if all 27 points of the unit cubic grid are used in a 3D finite difference scheme. Our compact scheme is truly advantageous in the sense that we are able to reduce the number of stencil points from 27 to 19 (by more than a quarter) and to halve the number of necessary colors, from eight to four.
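The color counts can be checked by brute force. The sketch below tests whether a given coloring assigns different colors to every pair of points coupled by a stencil. The specific four-coloring used here is one possibility that works for a 19-point stencil (center, 6 face neighbors and 12 edge neighbors); it is an assumption made for illustration, and the labelling of colors in the paper's own partition may differ.

```python
from itertools import product

# Neighbor offsets: the 19-point stencil couples a point to its 6 face and
# 12 edge neighbors; the 27-point stencil adds the 8 corner neighbors.
NEIGHBORS_19 = [d for d in product((-1, 0, 1), repeat=3)
                if 1 <= sum(abs(c) for c in d) <= 2]
NEIGHBORS_27 = [d for d in product((-1, 0, 1), repeat=3) if any(d)]


def four_color(i, j, k):
    """One possible 4-coloring (illustrative; not necessarily the paper's)."""
    return ((i + k) % 2) + 2 * ((j + k) % 2)


def eight_color(i, j, k):
    """Parity coloring with eight colors."""
    return (i % 2) + 2 * (j % 2) + 4 * (k % 2)


def decouples(color, neighbors, n=6):
    """True if no point shares its color with any of its stencil neighbors."""
    for i, j, k in product(range(n), repeat=3):
        for di, dj, dk in neighbors:
            if color(i, j, k) == color(i + di, j + dj, k + dk):
                return False
    return True


four_ok_19 = decouples(four_color, NEIGHBORS_19)    # four colors suffice
four_ok_27 = decouples(four_color, NEIGHBORS_27)    # corners break this coloring
eight_ok_27 = decouples(eight_color, NEIGHBORS_27)  # eight colors suffice
```

Under this coloring, two points share a color only when they differ by a corner offset, which is exactly the coupling that a 19-point stencil omits. Points of one color can therefore be updated independently of each other within a sub-sweep.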

Our test problems included both constant and variable coefficient cases with recirculation and boundary layers. The numerical experiments verified the fourth-order convergence rate of our scheme for sufficiently fine discretization and for small to moderate values of the Reynolds number. Computational results indicate that the four-color Gauss–Seidel relaxation is computationally more efficient than the lexicographic Gauss–Seidel relaxation and more robust with respect to the relaxation direction.

Table 8

        Iteration               CPU time on C-90        CPU time on SUN
Re      Weighting  Injection    Weighting  Injection    Weighting  Injection
10^0    11         11           3.14       3.08         14.46      11.84
10^1    12         11           3.19       3.10         14.34      11.41
10^2    26         26           3.68       3.55         26.26      21.05
10^3    49         49           4.50       4.23         46.13      36.01

Table 9

        Iteration               CPU time on C-90        CPU time on SUN
Re      Weighting  Injection    Weighting  Injection    Weighting  Injection
10^0    11         11           1.02       0.97         10.25      8.39
10^1    11         11           1.05       0.99         10.49      8.22
10^2    18         18           1.30       1.20         16.57      12.91
10^3    79         81           3.45       3.08         67.81      55.10

Based on a heuristic residual analysis, and extending our 2D work, we proposed a scaled injection operator for the four-color Gauss–Seidel multigrid method. Numerical results on both vector and serial computers show that the scaled injection operator is more cost-effective than the standard full-weighting operator.

For small to moderate Reynolds number problems, the four-color point Gauss–Seidel relaxation seems to provide an efficient, easily parallelizable and vectorizable approach. It is very easy to implement and should be preferred to the costlier alternating plane relaxation approach.

As predicted by Fourier smoothing analysis and observed in numerical experiments, the rate of convergence of our method deteriorates somewhat as the problems become convection-dominated. Based on our experience and the literature on the 2D compact schemes, we note that the convergence rates are subject to degradation for large values of the Reynolds number Re. It is expected, though, that an $O(h^4)$ convergence rate would be recovered with sufficiently fine grids, as has been observed in 2D.

The finite difference approximations used here can be derived using symbolic software such as Mathematica [12]. Once a high order difference approximation is available, it could be used with multigrid techniques to obtain high accuracy solutions with high computational efficiency.

We have used the minimal residual smoothing acceleration to accelerate the multigrid method for solving the 2D convection–diffusion equations with high Reynolds numbers [23] and expect this method to work well in the 3D case too. With both strategies, a suitable scaling of the residual may be necessary to keep the fine and coarse grid solutions in suitable scale. On the other hand, a different approach using multilevel ILU preconditioning techniques with some iterative accelerators may be more robust with respect to the variation of the convection magnitude [18].

Acknowledgements

This research was partially supported by a grant (DMS970001P) from the Pittsburgh Supercomputing Center.

References

[1] U. Ananthakrishnaiah, R. Manohar, J.W. Stephenson, Fourth-order finite difference methods for three-dimensional general linear elliptic problems with variable coefficients, Numer. Methods Partial Differential Equations 3 (1987) 229–240.
[2] V.A. Bandy, Black box multigrid for convection–diffusion equations on advanced computers, Ph.D. Thesis, University of Colorado at Denver, 1996.
[3] A. Behie, P.A. Forsyth Jr., Multi-grid solution of three dimensional problems with discontinuous coefficients, Appl. Math. Comput. 13 (1983) 229–240.
[4] A. Brandt, Rigorous quantitative analysis of multigrid: I. Constant coefficients two-level cycle with $L_2$-norm, SIAM J. Numer. Anal. 31 (1994) 1695–1730.
[5] A. Brandt, I. Yavneh, Inadequacy of first-order upwind difference schemes for some recirculating flows, J. Comput. Phys. 93 (1991) 128–143.
[6] A. Brandt, I. Yavneh, On multigrid solution of high-Reynolds incompressible entering flows, J. Comput. Phys. 101 (1992) 151–164.
[7] A. Brandt, I. Yavneh, Accelerated multigrid convergence and high-Reynolds recirculating flows, SIAM J. Sci. Comput. 14 (1993) 607–626.
[8] J.E. Dendy Jr., Two multigrid methods for three-dimensional equations with highly discontinuous coefficients, SIAM J. Sci. Statist. Comput. 8 (1987) 673–685.
[9] S.C.R. Dennis, J.D. Hudson, Compact $h^4$ finite-difference approximations to operators of Navier–Stokes type, J. Comput. Phys. 85 (1989) 390–416.
[10] J. Gary, S. McCormick, R. Sweet, Successive overrelaxation, multigrid, and preconditioned conjugate gradient algorithms for solving a diffusion problem on a vector computer, Appl. Math. Comput. 13 (1983) 285–309.
[11] C. Greif, J.M. Varah, Iterative solution of cyclically reduced systems arising from discretization of the three dimensional convection–diffusion equation, SIAM J. Sci. Comput. (to appear).
[12] M.M. Gupta, J. Kouatchou, Symbolic derivation of finite difference approximations for three dimensional Poisson equation, Numer. Methods Partial Differential Equations 14 (1998) 593–606.
[13] M.M. Gupta, R.P. Manohar, J.W. Stephenson, A single cell high order scheme for the convection–diffusion equation with variable coefficients, Int. J. Numer. Methods Fluids 4 (1984) 641–651.
[14] M.M. Gupta, R.P. Manohar, J.W. Stephenson, High-order difference schemes for two-dimensional elliptic equations, Numer. Methods Partial Differential Equations 1 (1985) 71–80.
[15] M.M. Gupta, J. Kouatchou, J. Zhang, A fourth order compact solver for convection–diffusion equations, J. Comput. Phys. 132 (1997) 226–232.
[16] W.H. Holter, A vectorized multigrid solver for the three dimensional Poisson equation, in: A.H.L. Emmen (Ed.), Supercomputer Applications, North-Holland, Amsterdam, 1985, pp. 17–32.
[17] M. Li, T. Tang, B. Fornberg, A compact fourth-order finite difference scheme for the steady incompressible Navier–Stokes equations, Int. J. Numer. Methods Fluids 20 (1995) 1137–1151.
[18] Y. Saad, J. Zhang, BILUM: block versions of multi-elimination and multi-level ILU preconditioner for general sparse linear systems, SIAM J. Sci. Comput. (to appear).
[19] W.F. Spotz, G.F. Carey, Formulation and experiments with high-order compact schemes for nonuniform grids, Int. J. Numer. Methods Heat Fluid Flow 8 (3) (1998) 288–303.
[20] C.-A. Thole, U. Trottenberg, Basic smoothing procedures for the multigrid treatment of elliptic 3D-operator, in: D. Braess, W. Hackbusch, U. Trottenberg (Eds.), Advances in Multi-Grid Methods, Notes on Numerical Fluid Mechanics, vol. 11, Vieweg, Braunschweig, 1985, pp. 102–111.
[21] P. Wesseling, An Introduction to Multigrid Methods, Wiley, Chichester, 1992.
[22] J. Zhang, A cost-effective multigrid projection operator, J. Comput. Appl. Math. 76 (1996) 325–333.
[23] J. Zhang, Accelerated high accuracy multigrid solution of the convection-diffusion equation with high Reynolds number, Numer. Methods Partial Differential Equations 77 (1997) 73–89.
[24] J. Zhang, Residual scaling techniques in multigrid. II: Practical applications, Appl. Math. Comput. 90 (1998) 229–252.
[25] J. Zhang, On convergence and performance of iterative methods with fourth-order compact schemes, Numer. Methods Partial Differential Equations 14 (1998) 263–280.
