

Geometry Optimization in Density Functional Methods

J. ULISES REVELES, ANDREAS M. KÖSTER

Departamento de Química, CINVESTAV, Avenida Instituto Politécnico Nacional 2508, A.P. 14-740 México D.F. 07000, México

Received 23 January 2004; Accepted 23 January 2004

Abstract: The geometry optimization in delocalized internal coordinates is discussed within the framework of the density functional theory program deMon. A new algorithm for the selection of primitive coordinates according to their contribution to the nonredundant coordinate space is presented. With this new selection algorithm, the excessive increase in computational time and the deterioration of the performance of the geometry optimization for floppy molecules and systems with high average coordination numbers are avoided. A new step selection based on the Cartesian geometry change is introduced. It combines the trust radius and line search methods. The structure of the new geometry optimizer is described. The influence of the SCF convergence criteria and the grid accuracy on the geometry optimization is discussed. A performance analysis of the new geometry optimizer using different start Hessian matrices, basis sets, and grid accuracies is given.

© 2004 Wiley Periodicals, Inc. J Comput Chem 25: 1109–1116, 2004

Key words: density functional methods; geometry optimization; deMon

Introduction

The recent improvements in computer architectures and the development of efficient density functional theory (DFT) methods have enabled the reliable theoretical investigation of much larger molecules than ever before. As a consequence, first-principles geometry optimizations of systems with up to 100 atoms have become a standard task in modern quantum chemistry applications. These studies require robust and user-friendly implementations of geometry optimizers. Over the last years, several optimization algorithms based on natural [1,2], redundant [3], or delocalized [4] internal coordinates have been established. Optimizations using these coordinates are normally far more efficient, in terms of optimization cycles, than optimizations in Cartesian coordinates. Moreover, the use of appropriate coordinates is one of the best ways to improve the robustness of the geometry optimization. This is particularly important for molecules containing unusual types of bonds or highly coordinated atoms, for which the estimation of a reliable initial force constant matrix by molecular mechanics or semiempirical methods may fail.

Originally, the generation of so-called natural internal coordinates for arbitrary molecular topologies was described by Pulay and Fogarasi [1]. The delocalized internal coordinates introduced by Baker, Kessi, and Delley [4] represent a considerable simplification for the automatic generation of redundant coordinates. Both coordinate sets significantly reduce the harmonic and anharmonic coupling between the various degrees of freedom.

There are, however, some disadvantages in using the automatically generated delocalized internal coordinates. Gradients and Hessians have to be transformed into the delocalized internal coordinates. This transformation involves the construction and inversion of the transformation matrix (the Wilson B-Matrix) together with a number of matrix multiplications. These operations introduce a cubic scaling with respect to the number of atoms. In most traditional ab initio approaches it matters little how much time is spent in the optimization step because most of the CPU time is used for the energy and gradient evaluation.

For DFT methods the situation is different, as the scaling problem for the generation of atomic Becke weights has already shown [5]. In the DFT program deMon [6], which uses the variational approximation of the Coulomb potential [7] and the asymptotic expansion for the calculation of three-center electron repulsion integrals [8], linear algebra operations become dominant for matrix dimensions above 2000 to 3000 [9]. Because the Wilson B-Matrix may become large for automatically generated delocalized internal coordinates, the computational impact of the coordinate and gradient transformations has to be considered in DFT methods. Several approaches have been developed that address this problem by reducing the scaling of the transformations [10–12].

Correspondence to: J. U. Reveles; e-mail: [email protected]

Contract/grant sponsor: CONACyT; contract/grant numbers: 154871 (to J.U.R.), 36037-E, and 40379-F


For systems with high average coordination numbers or with floppy coordinates, the selection of primitive coordinates becomes mandatory to avoid an excessive increase in the computational time for the optimization steps and to improve the performance of the geometry optimization [13,14]. Moreover, a preselection of primitive coordinates also considerably reduces the computational effort for the inversion of the B-Matrix, because its dimension is reduced. In this article a new scheme for the selection of primitive coordinates is presented that avoids an excessive increase in computational time and is robust for floppy molecules and systems with high average coordination numbers.

Besides the selection of coordinates and the initial structure, the efficiency and stability of a geometry optimizer depend on the start Hessian, the update of the Hessian, and the step selection. For the update of the Hessian and the step selection, accurate gradients are mandatory. In DFT methods using the variational approximation of the Coulomb potential, the gradient accuracy is coupled to the self-consistent field (SCF) convergence and the accuracy of the numerical integration of the exchange-correlation potential [15]. Therefore, we will describe how the SCF convergence criteria can be automatically adapted to the geometry convergence. For the numerical integration of the exchange-correlation energy and potential, an adaptive grid procedure is used in deMon [16]. The effect of different grid accuracies on the geometry optimization will also be discussed.

The article is organized in the following manner. In the next section we review the formalism of delocalized internal coordinates. A new scheme for the selection of primitive coordinates, based on their contributions to the nonredundant space, is presented in the third section. Then the new optimization algorithm of deMon is presented. A new scheme for the delocalized internal step selection, based on the Cartesian geometry change, is discussed. Details of the implementation of the structure optimizer are also given. The performance of the deMon optimizer is then analyzed, and conclusions are given.

Delocalized Internal Coordinates

For the structure optimization in delocalized internal coordinates [4], the geometry of the system is described using a set of $n_p$ primitive internal coordinates $\{p_i\}$ collected in the vector $\mathbf{p}$. Generally, $\mathbf{p}$ will consist of all stretches, all bends, and all dihedrals that are obtained from some atomic connectivity criteria. The displacement vector $\Delta\mathbf{p}$ is related to the corresponding Cartesian displacement vector $\Delta\mathbf{x}$ by the (primitive) Wilson matrix [17], $\mathbf{P}$:

$$\Delta\mathbf{p} = \mathbf{P}\,\Delta\mathbf{x} \tag{1}$$

where the elements of $\mathbf{P}$ are defined as:

$$P_{ij} = \frac{\partial p_i}{\partial x_j} \tag{2}$$

From the $\mathbf{P}$ matrix, the square symmetric $\mathbf{G}$ matrix of dimension $n_p$ is constructed as:

$$\mathbf{G} = \mathbf{P}\,\mathbf{M}\,\mathbf{P}^{T} \tag{3}$$

The auxiliary matrix $\mathbf{M}$ depends on the convention used for the construction of the Cartesian coordinates from the stretches and bends [18]. In deMon, $\mathbf{M}$ is a unit matrix with zero diagonal elements $M_{11}$, $M_{22}$, $M_{33}$, $M_{44}$, $M_{55}$, and $M_{88}$. These elements correspond to the Cartesian coordinates that are arbitrarily fixed (i.e., atom 1 is placed at the origin, atom 2 on the z axis, and atom 3 in the xz plane). Thus, the $\mathbf{M}$ matrix removes the external degrees of freedom of the molecule in the transformation of coordinates, gradients, and second derivatives.

Diagonalizing the $\mathbf{G}$ matrix results in two sets of eigenvectors: a set of $n = 3N - 6$ ($n = 3N - 5$ for linear systems, $N$ being the number of atoms in the system) nonredundant eigenvectors with eigenvalues $\lambda > 0$ (in deMon a threshold of $10^{-7}$ is used) and a set of $n_p - n$ redundant eigenvectors with eigenvalues $\lambda = 0$. The eigenvalue equation for $\mathbf{G}$ can be written as:

$$\mathbf{G}\,(\mathbf{U}\;\mathbf{R}) = (\mathbf{U}\;\mathbf{R})\begin{pmatrix}\boldsymbol{\Lambda} & \mathbf{0}\\ \mathbf{0} & \mathbf{0}\end{pmatrix} \tag{4}$$

Here, $\mathbf{U}$ is the set of nonredundant eigenvectors of $\mathbf{G}$ with eigenvalues $\lambda > 0$, which are collected in the diagonal matrix $\boldsymbol{\Lambda}$, and $\mathbf{R}$ is the redundant set. In this way a natural separation of the redundant and nonredundant subspaces of the original coordinate space is achieved. The redundant subspace can be discarded for the optimization steps.
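The separation into nonredundant and redundant subspaces can be illustrated with a small numerical sketch. It is not taken from deMon: the water geometry, the choice of four primitives (one of them deliberately redundant), and the finite-difference Wilson matrix are assumptions made for illustration. Only the form of G, the zeroed diagonal elements of M, and the threshold of 10^-7 come from the text.

```python
import numpy as np

def primitives(x):
    """Redundant primitives of water: r(O-H1), r(O-H2), r(H1-H2), angle(H1-O-H2)."""
    o, h1, h2 = x.reshape(3, 3)
    v1, v2 = h1 - o, h2 - o
    r1, r2 = np.linalg.norm(v1), np.linalg.norm(v2)
    angle = np.arccos(np.dot(v1, v2) / (r1 * r2))
    return np.array([r1, r2, np.linalg.norm(h1 - h2), angle])

def wilson_matrix(x, h=1e-5):
    """Primitive Wilson matrix P_ij = dp_i/dx_j, here by central differences."""
    P = np.empty((primitives(x).size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        P[:, j] = (primitives(x + step) - primitives(x - step)) / (2.0 * h)
    return P

# Illustrative geometry in bohr: O at the origin, H1 on the z axis, H2 in the xz plane.
r, theta = 1.81, np.deg2rad(104.5)
x = np.array([0.0, 0.0, 0.0,
              0.0, 0.0, r,
              r * np.sin(theta), 0.0, r * np.cos(theta)])

P = wilson_matrix(x)

# M: unit matrix with M11, M22, M33, M44, M55 and M88 set to zero
# (0-based indices 0, 1, 2, 3, 4, 7).
M = np.eye(x.size)
fixed = [0, 1, 2, 3, 4, 7]
M[fixed, fixed] = 0.0

G = P @ M @ P.T                       # eq. (3)
lam, vec = np.linalg.eigh(G)          # G is symmetric
nonredundant = lam > 1e-7             # threshold quoted for deMon
U, R = vec[:, nonredundant], vec[:, ~nonredundant]

print("eigenvalues of G:", lam)
print("nonredundant:", U.shape[1], "redundant:", R.shape[1])
```

For this nonlinear triatomic the four primitives split into n = 3N − 6 = 3 nonredundant combinations and one redundant combination, which is the partition expressed by eq. (4).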

The $n$ nonredundant eigenvectors collected in $\mathbf{U}$ define a transformation matrix from the redundant set of primitive coordinates to a set of delocalized internal coordinates, $\mathbf{s}$. Each delocalized internal coordinate is a linear combination of all primitive coordinates of the underlying coordinate space $\mathbf{p}$:

$$\mathbf{s} = \mathbf{U}^{T}\mathbf{p} \tag{5}$$

Using eq. (1), the displacement vector $\Delta\mathbf{s}$ in delocalized internal coordinates can be related to the Cartesian displacement vector by:

$$\Delta\mathbf{s} = \mathbf{U}^{T}\mathbf{P}\,\Delta\mathbf{x} = \mathbf{B}\,\Delta\mathbf{x} \tag{6}$$

Here, we define the (active) Wilson matrix $\mathbf{B}$ for the transformation of the Cartesian displacement vector $\Delta\mathbf{x}$ to the delocalized internal displacement vector $\Delta\mathbf{s}$ as:

$$\mathbf{B} = \mathbf{U}^{T}\mathbf{P} \tag{7}$$

With this B-Matrix of dimension $n \times 3N$, the Cartesian gradient and Hessian matrix can be calculated from the delocalized internal gradient and Hessian as:

$$\mathbf{g}_x = \mathbf{B}^{T}\mathbf{g}_s \tag{8}$$

$$\mathbf{H}_x = \mathbf{B}^{T}\mathbf{H}_s\mathbf{B} + \mathbf{B}'^{T}\mathbf{g}_s \tag{9}$$

Here, the prime denotes differentiation with respect to the Cartesian coordinates. Using eq. (6) and assuming that the $\mathbf{U}$ matrix does not depend on the Cartesian coordinates, we find for the transformation of the delocalized internal Hessian into the Cartesian Hessian:

1110 Reveles and Koster • Vol. 25, No. 9 • Journal of Computational Chemistry


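$$\mathbf{B}'^{T}\mathbf{g}_s = \mathbf{P}'^{T}\mathbf{U}\,\mathbf{g}_s \tag{10}$$

$$\mathbf{H}_x = \mathbf{B}^{T}\mathbf{H}_s\mathbf{B} + \mathbf{P}'^{T}\mathbf{U}\,\mathbf{g}_s \tag{11}$$

where the elements of $\mathbf{P}'$ are defined as:

$$P'_{ijk} = \frac{\partial^{2} p_i}{\partial x_j\,\partial x_k} \tag{12}$$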

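The left Moore–Penrose inverse of the transposed B-Matrix is then constructed as:

$$(\mathbf{B}^{T})^{+} = (\mathbf{B}\mathbf{M}\mathbf{B}^{T})^{-1}\mathbf{B}\mathbf{M}, \qquad (\mathbf{B}^{T})^{+}\mathbf{B}^{T} = \mathbf{E} \tag{13}$$

This generalized inverse matrix is used for the calculation of the delocalized internal gradient and Hessian matrix from the Cartesian gradient and Hessian matrix. From eqs. (8) and (11) it follows that:

$$\mathbf{g}_s = (\mathbf{B}^{T})^{+}\mathbf{g}_x \tag{14}$$

$$\mathbf{H}_s = (\mathbf{B}^{T})^{+}\left(\mathbf{H}_x - \mathbf{P}'^{T}\mathbf{U}\,\mathbf{g}_s\right)\mathbf{B}^{+} \tag{15}$$

where $\mathbf{B}^{+} = [(\mathbf{B}^{T})^{+}]^{T}$ is the transpose of the generalized inverse defined in eq. (13).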
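A short numerical check of eqs. (13) and (14) may help. The sketch below is an illustration under stated assumptions: the Wilson matrix is a random stand-in rather than one derived from a molecule, and the names `Bt_inv` and `g_s_ref` are hypothetical. It verifies that the matrix of eq. (13) is a left inverse of the transposed B-Matrix and that it recovers a delocalized internal gradient from the corresponding Cartesian gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_prim = 3, 4
P = rng.normal(size=(n_prim, 3 * n_atoms))   # synthetic stand-in for a Wilson matrix

# M as described for deMon: unit matrix with six zeroed diagonal elements.
M = np.eye(3 * n_atoms)
fixed = [0, 1, 2, 3, 4, 7]
M[fixed, fixed] = 0.0

lam, vec = np.linalg.eigh(P @ M @ P.T)       # G = P M P^T, eq. (3)
keep = lam > 1e-7
U = vec[:, keep]
B = U.T @ P                                  # eq. (7), shape n x 3N

BMBt = B @ M @ B.T                           # only an n x n matrix has to be inverted
Bt_inv = np.linalg.solve(BMBt, B @ M)        # eq. (13): (B M B^T)^-1 B M

g_s_ref = np.array([0.01, -0.02, 0.03])      # arbitrary internal gradient
g_x = B.T @ g_s_ref                          # eq. (8)
g_s = Bt_inv @ g_x                           # eq. (14)

print(np.allclose(Bt_inv @ B.T, np.eye(U.shape[1])), np.allclose(g_s, g_s_ref))
```

With this construction B M B^T equals U^T G U, that is, the diagonal matrix of the nonredundant eigenvalues, so the inversion in eq. (13) is well conditioned as long as the eigenvalue threshold is respected.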

The optimization step is then performed in the delocalized internal space $\mathbf{s}$ and a new structure is obtained. The structure definition in delocalized internal coordinates is transformed back to Cartesian coordinates $\mathbf{x}$, which are used for the next energy and gradient calculation. The backtransformation is accomplished via the iterative formula:

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + (\mathbf{B}^{+})^{(k)}\left(\mathbf{s} - \mathbf{s}^{(k)}\right) \tag{16}$$

Here, $\mathbf{s}$ is the (known) new geometry in delocalized internal coordinates, $(\mathbf{B}^{+})^{(k)}$ is the transpose of the generalized inverse of eq. (13) evaluated at $\mathbf{x}^{(k)}$, and the iterative procedure is started ($k = 1$) using the old Cartesian geometry and the old B-Matrix. The recalculation and inversion of the B-Matrix in each iterative cycle incorporates the dependency of the nonredundant space definition on the Cartesian coordinates. It should be noted that this procedure involves only the inversion of a matrix of size $3N$, which is computationally negligible for medium-size systems in our DFT implementation. The procedure is terminated when the delocalized internal coordinates $\mathbf{s}^{(k)}$ generated at the $k$th iteration from the Cartesian set $\mathbf{x}^{(k)}$ are identical to $\mathbf{s}$ within a given tolerance (root-mean-square of $\Delta\mathbf{s}$ or $\Delta\mathbf{x}$ smaller than $10^{-4}$). This iterative procedure has proven numerically stable and usually converges fast (fewer than five iterative cycles) for medium-size systems with several hundred atoms. In the rare case where the iteration exceeds 50 steps, due to a large step size, the delocalized internal displacement is scaled by a factor $f$. This factor $f$ is initially set to 1. Every time the backtransformation is restarted it is reduced to $f = f/2$. If the scaling factor becomes smaller than $10^{-3}$, the optimization is restarted. In this case, the primitive coordinates are rebuilt and the matrix $\mathbf{U}$ and the delocalized internal Hessian matrix are recalculated.
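The iterative backtransformation can be sketched for a water molecule as follows. This is a minimal illustration, not the deMon code: the geometry, the four primitives, the finite-difference Wilson matrix, and the function names are assumptions, and only the RMS of the internal residual is used as the stopping test. The limits (tolerance 10^-4, 50 cycles, halving of f down to 10^-3) follow the text.

```python
import numpy as np

FIXED = [0, 1, 2, 3, 4, 7]          # Cartesian components frozen by M (0-based)

def primitives(x):
    """r(O-H1), r(O-H2), r(H1-H2), angle(H1-O-H2) for water."""
    o, h1, h2 = x.reshape(3, 3)
    v1, v2 = h1 - o, h2 - o
    r1, r2 = np.linalg.norm(v1), np.linalg.norm(v2)
    angle = np.arccos(np.dot(v1, v2) / (r1 * r2))
    return np.array([r1, r2, np.linalg.norm(h1 - h2), angle])

def wilson_matrix(x, h=1e-5):
    """Primitive Wilson matrix by central differences (illustrative only)."""
    P = np.empty((4, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        P[:, j] = (primitives(x + step) - primitives(x - step)) / (2.0 * h)
    return P

def back_transform(x_old, s_new, U, M, tol=1e-4, max_iter=50):
    """Iterate eq. (16); return the new Cartesian geometry and the cycle count."""
    x = x_old.copy()
    for k in range(1, max_iter + 1):
        ds = s_new - U.T @ primitives(x)
        if np.sqrt(np.mean(ds**2)) < tol:
            return x, k
        B = U.T @ wilson_matrix(x)                      # B is rebuilt in every cycle
        B_plus = M @ B.T @ np.linalg.inv(B @ M @ B.T)   # transpose of eq. (13)
        x = x + B_plus @ ds
    raise RuntimeError("backtransformation did not converge")

def scaled_back_transform(x_old, s_old, ds, U, M):
    """Halve the internal step (f = f/2) whenever the iteration fails."""
    f = 1.0
    while f >= 1e-3:
        try:
            x_new, cycles = back_transform(x_old, s_old + f * ds, U, M)
            return x_new, cycles, f
        except RuntimeError:
            f /= 2.0
    raise RuntimeError("scaling factor below 1e-3: restart the optimization")

r, theta = 1.81, np.deg2rad(104.5)
x0 = np.array([0.0, 0.0, 0.0, 0.0, 0.0, r,
               r * np.sin(theta), 0.0, r * np.cos(theta)])
M = np.eye(9)
M[FIXED, FIXED] = 0.0
P0 = wilson_matrix(x0)
lam, vec = np.linalg.eigh(P0 @ M @ P0.T)
U = vec[:, lam > 1e-7]

s_old = U.T @ primitives(x0)
ds = np.array([0.05, -0.03, 0.02])              # arbitrary step in internal space
x_new, cycles, f = scaled_back_transform(x0, s_old, ds, U, M)
print("cycles:", cycles, "scale factor:", f)
```

Because M zeroes the frozen Cartesian components, the update never moves atom 1 from the origin, atom 2 off the z axis, or atom 3 out of the xz plane.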

Primitive Coordinate Selection

The generation of primitive coordinates starts with the determination of chemical bonds. In our case, two atoms are considered bonded if their separation is less than 1.3 times the sum of the covalent radii of the two atoms. From these bonds the primitive angles and dihedral angles are constructed. To ensure the connectivity of the whole system, a Z-Matrix is built from the Cartesian coordinates of the system. This is particularly important for weakly bound complexes, where the distance criterion alone would generate disconnected fragments. Every near-linear bend (>170°) in the Z-Matrix is replaced by two collinear bends by adding a dummy atom.
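The distance criterion and the construction of bends from bonds can be sketched in a few lines. The covalent radii below are approximate illustrative values; the text does not state which radii deMon tabulates, and the function names are hypothetical. The Z-Matrix step that guarantees connectivity and the treatment of near-linear bends are left out.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstrom (illustrative values only).
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def find_bonds(symbols, coords, factor=1.3):
    """Atoms i and j are bonded if their distance is below factor * (r_i + r_j)."""
    bonds = []
    for i, j in combinations(range(len(symbols)), 2):
        limit = factor * (COVALENT_RADIUS[symbols[i]] + COVALENT_RADIUS[symbols[j]])
        if dist(coords[i], coords[j]) < limit:
            bonds.append((i, j))
    return bonds

def find_bends(bonds):
    """Bends a-centre-b from every pair of bonds that share the centre atom."""
    neighbours = {}
    for i, j in bonds:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    return [(a, centre, b)
            for centre, atoms in sorted(neighbours.items())
            for a, b in combinations(sorted(atoms), 2)]

# Water in angstrom: O, H, H.
symbols = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.96), (0.93, 0.0, -0.24)]

bonds = find_bonds(symbols, coords)
bends = find_bends(bonds)
print(bonds, bends)
```

For the water example the two O–H contacts pass the criterion while the H–H contact does not, so a single bend centred on oxygen is generated.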

For systems with high average coordination numbers, a large number of primitive coordinates may be generated. This large number of coordinates increases the computational effort for the inversion of the B-Matrix and reduces the performance of the geometry optimization. Therefore, special methods for the selection of primitive coordinates have been developed. Arnim et al. [13] proposed a hybrid method to reduce the number of generated primitive coordinates. In this method, natural internal coordinates based on a topological analysis are used for all parts of the molecule where this is possible, and delocalized internal coordinates are taken for the remainder. For van der Waals complexes and transition states with floppy coordinates, Lindh et al. [14] proposed a selection method for primitive coordinates. In this method, the coordinates are selected according to their significance with respect to the energy, through a model Hessian.

In this article, we propose a new and simple algorithm for the selection of primitive coordinates. We perform a singular value decomposition (SVD) of the relative weights of the primitive coordinates. These relative weights are computed as the square sum of the coefficients of the primitive coordinates in the active delocalized internal coordinates. We discard the less significant coordinates, corresponding to small relative weights. The selection is performed first with the stretches, then with stretches and bends, and finally with stretches, bends, and dihedral angles.

In detail, the procedure is as follows. The stretches from the Z-Matrix (generated or read from the input) initialize the list of primitive coordinates. Additional primitive stretches are defined based on the atomic connectivity criteria given above. With these $n_p$ stretches the primitive Wilson matrix $\mathbf{P}$ and the corresponding $\mathbf{G}$ matrix for the stretches are constructed according to eqs. (2) and (3). After the diagonalization of the $\mathbf{G}$ matrix, a set of $m$ ($m \le n$) eigenvectors $\mathbf{U}$ with eigenvalues $\lambda$ larger than an eigenvalue threshold (set to 0.01 in deMon) is selected. The relative weights $w_i$ of the primitive coordinates in the $\mathbf{U}$ vectors are calculated as:

$$w_i = \sum_{j=1}^{m} U_{ij}^{2} \quad \forall\; i = 1, 2, \ldots, n_p \tag{17}$$

The primitive stretch with the smallest weight is eliminated if its weight is smaller than a weight threshold (set to 0.3 in deMon) and the number of primitives is larger than $2N$. This elimination procedure continues until the relative weights of the remaining stretches are all larger than the weight threshold or the number of primitive stretches has reached the upper limit of $2N$. After a set of primitive stretches is defined, the bends from the Z-Matrix are added to the primitive list. Additional bends are assigned to each two pairs of atoms having one atom in common, for example, A–B, B–C → A–B–C. Near-linear bends (>170°) are skipped. The selection procedure described above is repeated for stretches and bends. The upper limit of stretches and bends together is set to $4N$.

Finally, the dihedral angles from the Z-Matrix are added to the primitive list. Additional dihedral angles are defined for each pair of bends sharing two terminal atoms, for example, A–B–C, B–C–D → A–B–C–D or A–B–C, B–A–D → C–B–A–D. Dihedral angles with near-linear bends are skipped. The selection procedure described above is repeated for stretches, bends, and dihedral angles. The upper limit for the number of stretches, bends, and dihedral angles together is set to $6N$. We will refer to this procedure as the SVD reduction of primitive coordinates.
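One elimination pass of the procedure described above can be sketched as follows. This is a hedged illustration rather than the deMon routine: the function name, the toy Wilson matrices, and the reading that elimination stops as soon as either condition fails are assumptions. The thresholds 0.01 and 0.3 and the weight definition of eq. (17) come from the text.

```python
import numpy as np

def svd_reduce(P, M, max_primitives, eig_tol=0.01, weight_tol=0.3):
    """Return the indices of the primitives (rows of P) that are kept."""
    kept = list(range(P.shape[0]))
    while len(kept) > max_primitives:
        Pk = P[kept]
        lam, vec = np.linalg.eigh(Pk @ M @ Pk.T)   # G for the current primitive set
        U = vec[:, lam > eig_tol]                  # the m selected eigenvectors
        w = np.sum(U**2, axis=1)                   # relative weights, eq. (17)
        weakest = int(np.argmin(w))
        if w[weakest] >= weight_tol:               # every primitive is significant
            break
        del kept[weakest]
    return kept

# Toy example 1: the fourth row is almost entirely redundant and gets a tiny weight.
P_a = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.05, 0.0, 0.0]])
print(svd_reduce(P_a, np.eye(3), max_primitives=3))

# Toy example 2: three primitives share two degrees of freedom evenly (weights 2/3),
# so none is removed even though the count exceeds the limit.
P_b = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
print(svd_reduce(P_b, np.eye(2), max_primitives=2))
```

In a full implementation the same pass would be run three times, with limits of 2N, 4N, and 6N, as stretches, bends, and dihedral angles are added to the list.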

If the symmetry of the system should be preserved during the course of an optimization, special attention has to be given to the selection of primitive coordinates. In a first step it has to be ensured that the underlying set of primitive coordinates $\{p_i\}$ is complete with respect to the symmetry operations. For this reason the totally symmetric projection operator [19],

$$\hat{P}^{(A_1)} = N \sum_{K} \chi^{(A_1)}(K)\,\hat{K}, \tag{18a}$$

is applied to each primitive coordinate, and the resulting symmetry-adapted primitive coordinate is added to the set $\{p_i\}$ if it is not already included.

In the next step, the symmetry-adapted coordinates and the corresponding Wilson matrices are generated. Using these symmetry-adapted primitive coordinates, the same selection procedure as described above can be applied. Instead of the primitive stretches, angles, and dihedral angles, symmetry-adapted stretches, angles, and dihedral angles are now eliminated. This procedure ensures the construction of symmetry-adapted delocalized internal coordinates.

Optimization Algorithm

Step Selection

The rational function optimization (RFO) method was used for determining the geometry step [20]. Because the approximation to the true potential energy is based on a second-order Taylor expansion or a corresponding Padé approximation, it is valid only in a restricted region around the point of the expansion. This so-called trust region is assumed to have the shape of a hypersphere whose radius is the trust radius [21]. In our implementation, we restrict the delocalized internal step size based on the root-mean-square of the Cartesian geometry change (RMSX). We found that this is a more homogeneous measure of the geometrical step because bonds and angles are not considered for the restriction of the step. In every iterative backtransformation step, the total RMSX is compared with the trust radius. If the RMSX is larger than the trust radius, the delocalized internal step is scaled by the ratio of the trust radius to the RMSX and the iterative procedure is restarted.
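The RMSX-based restriction can be sketched as a small wrapper around a backtransformation. This is a simplified illustration: the check is made once per completed backtransformation rather than inside every iterative cycle, and `restrict_step`, `toy_back_transform`, and the linear toy mapping are assumptions made for the example.

```python
from math import sqrt

def rms(values):
    return sqrt(sum(v * v for v in values) / len(values))

def restrict_step(ds, back_transform, trust_radius, max_restarts=20):
    """Scale the internal step until the RMS Cartesian change fits the trust radius."""
    for _ in range(max_restarts):
        dx = back_transform(ds)                  # Cartesian change for this step
        rmsx = rms(dx)
        if rmsx <= trust_radius * (1.0 + 1e-12): # small slack for rounding errors
            return ds, rmsx
        scale = trust_radius / rmsx
        ds = [scale * d for d in ds]             # shrink and restart
    raise RuntimeError("step restriction did not converge")

def toy_back_transform(ds):
    """Hypothetical linear mapping from internal to Cartesian displacements."""
    return [2.0 * d for d in ds]

step, rmsx = restrict_step([0.3, 0.4], toy_back_transform, trust_radius=0.1)
print(step, rmsx)
```

The direction of the internal step is preserved; only its length changes, which is what makes the RMSX a uniform measure across stretches, bends, and dihedral angles.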

The initial value of the trust radius is set to 0.1 a.u., and it is only updated if the root-mean-square of the delocalized gradients (RMSG) is smaller than 0.05. The update of the trust radius is based on the comparison of the quadratic model with the real potential energy surface. At each optimization step, the predicted energy change $\Delta E_P$ based on the quadratic model for the previous step is compared with the actual energy change $\Delta E$:

$$r = \frac{\Delta E}{\Delta E_P} \tag{18b}$$

If $r$ is smaller than 0.25 and the energy has increased, the trust radius is reduced to one quarter of its value. If $r$ is larger than 0.75 and the energy has decreased, the trust radius is doubled. If $r$ is in the range of 0.25 to 0.75, the trust radius remains unchanged. After the update, the trust radius is bounded to be in the range of 0.002 to 0.30. If the optimization is restarted, the trust radius is initialized to 0.05.
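The update rule translates directly into a small function. The sketch below uses the thresholds quoted in the text; the function name and the guard against a vanishing predicted energy change are additions made for the example.

```python
def update_trust_radius(trust, dE, dE_pred, rmsg):
    """Trust radius update from the ratio r = dE / dE_pred of eq. (18b)."""
    if rmsg >= 0.05 or dE_pred == 0.0:   # no update far from convergence
        return trust
    r = dE / dE_pred
    if r < 0.25 and dE > 0.0:            # poor model and the energy went up
        trust = trust / 4.0
    elif r > 0.75 and dE < 0.0:          # good model and the energy went down
        trust = 2.0 * trust
    return min(max(trust, 0.002), 0.30)  # bounds applied after the update

print(update_trust_radius(0.1, -0.9e-3, -1.0e-3, rmsg=0.01))
print(update_trust_radius(0.1, 1.0e-4, -1.0e-3, rmsg=0.01))
```

The first call represents a step whose energy lowering matches the prediction closely, the second a step on which the energy rose although a decrease was predicted.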

We found that a combination of the trust radius and line search methods sometimes improves the performance of the geometry optimization. Therefore, we incorporated in deMon the partial line search method proposed by Head and Zerner [22]. We define a penalty variable $L$, which determines the application of the line search and switches off the trust radius step selection. The initial value of $L$ is set to 0. If the energy increases in an optimization step, $L$ is increased according to $L = L + 1000$. If the RMSG is smaller than $10^{-3}$, $L$ is increased according to $L = L + 1600$. If the energy decreases in an optimization step, $L$ is decreased according to $L = L/2$. The line search is activated if $L$ is larger than 1500.
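The bookkeeping for the penalty variable can be sketched as follows. The text does not state in which order the three rules are applied within one optimization step; the sketch applies them in the order in which they are listed, which is an assumption, and the function names are hypothetical.

```python
def update_penalty(L, dE, rmsg):
    """Update the line search penalty L after one optimization step."""
    if dE > 0.0:          # energy increased
        L += 1000.0
    if rmsg < 1e-3:       # gradient already small
        L += 1600.0
    if dE < 0.0:          # energy decreased
        L /= 2.0
    return L

def use_line_search(L):
    return L > 1500.0

L = 0.0
for dE in (1e-4, 1e-4, -1e-4):          # two uphill steps, then one downhill step
    L = update_penalty(L, dE, rmsg=0.01)
    print(L, use_line_search(L))
```

With this ordering two consecutive energy increases switch the line search on, and a subsequent energy decrease switches it off again.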

Selection of Start Hessian

For the start Hessian we have implemented a diagonal estimation of the initial Hessian in the primitive space, $\mathbf{H}_p$, with two empirical formulas: the model Hessian from the empirical formulas of Fisher and Almlöf [23] and the model Hessian from Lindh et al. [24]. The Cartesian Hessian is then calculated, using the primitive Wilson matrix $\mathbf{P}$, as:

$$\mathbf{H}_x = \mathbf{P}^{T}\mathbf{H}_p\mathbf{P} \tag{19}$$

From this Cartesian Hessian the delocalized internal Hessian is obtained by the following equation:

$$\mathbf{H}_s = (\mathbf{B}^{T})^{+}\,\mathbf{H}_x\,\mathbf{B}^{+} \tag{20}$$

Here, $(\mathbf{B}^{T})^{+}$ refers to the inverse Wilson matrix as defined in eq. (13), and $\mathbf{B}^{+}$ is its transpose. By default, the delocalized internal Hessian is updated by the BFGS method [25]. The update is performed only if the RMSG is smaller than 0.15 a.u.
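The chain from a diagonal primitive Hessian to the delocalized internal Hessian, followed by one BFGS update, can be sketched numerically. The Wilson matrix and the diagonal force constants below are synthetic placeholders, not the Fisher or Lindh model values, and `bfgs_update` is a hypothetical helper that implements the standard BFGS formula for a Hessian.

```python
import numpy as np

rng = np.random.default_rng(1)
n_atoms, n_prim = 3, 4
P = rng.normal(size=(n_prim, 3 * n_atoms))     # synthetic stand-in for a Wilson matrix
H_p = np.diag([0.5, 0.5, 0.5, 0.2])            # placeholder diagonal force constants

M = np.eye(3 * n_atoms)
fixed = [0, 1, 2, 3, 4, 7]
M[fixed, fixed] = 0.0

lam, vec = np.linalg.eigh(P @ M @ P.T)
U = vec[:, lam > 1e-7]
B = U.T @ P                                    # eq. (7)
Bt_inv = np.linalg.solve(B @ M @ B.T, B @ M)   # eq. (13)

H_x = P.T @ H_p @ P                            # eq. (19)
H_s = Bt_inv @ H_x @ Bt_inv.T                  # eq. (20)

def bfgs_update(H, step, dgrad, rmsg, rmsg_max=0.15):
    """Standard BFGS Hessian update, skipped while the RMSG is too large."""
    if rmsg >= rmsg_max:
        return H
    Hs = H @ step
    return H + np.outer(dgrad, dgrad) / (dgrad @ step) - np.outer(Hs, Hs) / (step @ Hs)

step = np.array([0.01, -0.02, 0.005])          # arbitrary internal displacement
dgrad = (H_s + 0.1 * np.eye(3)) @ step         # gradient change of a model surface
H_new = bfgs_update(H_s, step, dgrad, rmsg=0.01)
print(np.allclose(H_new @ step, dgrad))
```

With the M matrix used here, the generalized inverse applied to the transposed primitive Wilson matrix returns the transposed U matrix, so eq. (20) reduces to the projection of the primitive Hessian onto the nonredundant eigenvectors; the test below checks this identity numerically.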

Convergence Criteria and Adaptive SCF Convergence

By default, a structure optimization is converged in deMon if the RMSG and the largest component of the delocalized internal gradient are smaller than $3.0 \times 10^{-4}$ a.u. and $4.5 \times 10^{-4}$ a.u., respectively. Additionally, either the root-mean-square of the delocalized internal displacement and the largest component of the displacement must be smaller than $1.2 \times 10^{-3}$ a.u. and $1.8 \times 10^{-3}$ a.u., respectively, or the change in energy between the previous two geometries must be smaller than $5 \times 10^{-6}$ a.u.
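The default convergence test combines a gradient condition with either a displacement or an energy condition. A minimal sketch with the thresholds quoted above follows; the function and argument names are hypothetical.

```python
from math import sqrt

def rms(values):
    return sqrt(sum(v * v for v in values) / len(values))

def is_converged(grad_s, step_s, energy_change):
    """Gradient test plus either the displacement test or the energy test (a.u.)."""
    gradient_ok = rms(grad_s) < 3.0e-4 and max(abs(g) for g in grad_s) < 4.5e-4
    step_ok = rms(step_s) < 1.2e-3 and max(abs(d) for d in step_s) < 1.8e-3
    energy_ok = abs(energy_change) < 5.0e-6
    return gradient_ok and (step_ok or energy_ok)

# Small gradient and small step, but a sizeable energy change: converged.
print(is_converged([1e-4, -2e-4, 1e-4], [5e-4, -1e-3, 2e-4], -1e-4))
# Small gradient, large step, and large energy change: not converged.
print(is_converged([1e-4, 1e-4, 1e-4], [5e-3, 0.0, 0.0], -1e-4))
```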

We use a dynamical SCF convergence criterion in the geometry optimizations. The convergence criterion is tightened based on the RMSG, ensuring more accurate energies and gradients at the end of the optimization. If the RMSG is smaller than $10^{-2}$, the convergence criterion is set to $5 \times 10^{-6}$. If the RMSG is smaller than $5 \times 10^{-3}$, the convergence criterion is set to $10^{-6}$. Finally, if the RMSG is smaller than $10^{-3}$, the convergence criterion is tightened to $5 \times 10^{-7}$.
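The tightening schedule can be written as a simple lookup. The function name is hypothetical, and the fallback value of 10^-5 for large gradients is taken from the initial SCF convergence criterion quoted in the Computational Details.

```python
def scf_tolerance(rmsg, initial=1.0e-5):
    """SCF convergence criterion as a function of the RMS gradient (a.u.)."""
    if rmsg < 1.0e-3:
        return 5.0e-7
    if rmsg < 5.0e-3:
        return 1.0e-6
    if rmsg < 1.0e-2:
        return 5.0e-6
    return initial

for rmsg in (0.05, 8.0e-3, 2.0e-3, 5.0e-4):
    print(rmsg, scf_tolerance(rmsg))
```

The tightest criterion is only paid for in the last few cycles, when accurate gradients matter most for the Hessian update and the convergence test.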

Discussion

Computational Details

All calculations were performed with the deMon program [6], using the local spin density approximation (LSDA) and the Vosko–Wilk–Nusair (VWN) [26] correlation functional if not otherwise stated. For the preoptimization, the STO-3G basis set [27] and the auxiliary set A2 [28] were used. For the following optimization the DZVP basis set [28] and the auxiliary set A2 were employed. The deMon default settings for the SCF have been used: a tight-binding start density, an initial SCF convergence criterion of $10^{-5}$, an adaptive grid tolerance of $10^{-5}$ (MEDIUM), and the DIIS procedure. The structures were fully optimized without symmetry constraints. Reported computation times refer to a 2.4 GHz Xeon processor with 2 Gbyte of RAM.

Performance Analysis

For the performance analysis of the geometry optimization described here, the test suite of Baker was used [29]. In Table 1, the number of optimization steps for the test molecules is listed. In our STO-3G calculations, two different initial Hessian approximations from Fisher [23] and Lindh [24] were used. For comparison, the STO-3G optimization from Baker at the Hartree–Fock level [29] is also listed. In these calculations a molecular mechanics initial Hessian was used. As Table 1 shows, the optimization at the STO-3G level performs slightly better using the empirical model from Fisher (236 steps) for the initial Hessian than using the model from Lindh (239 steps). However, with both initial Hessian approximations implemented in deMon, the performance of our DFT geometry optimization is in the same range as the one from Baker at the Hartree–Fock level (240 steps). This indicates that the noise of the numerical integration of the exchange-correlation energy and potential does not influence the gradient accuracy in our implementation. The optimization of all systems converged to the same equilibrium structures with the two different initial Hessian approximations. The maximum energy difference between the optimized structures is smaller than 0.075 kcal/mol. This difference is within the energy accuracy of the default adaptive grid of deMon used in the optimization procedure.

In a next step we repeated the optimization of the test molecules in Table 1 using the initial Hessian from Fisher and an improved grid tolerance of $10^{-6}$. This increased the CPU time for the optimization by almost a factor of 2. Compared with the optimization with the same initial Hessian and the default grid tolerance of $10^{-5}$, a maximum energy deviation in the optimized structures of 0.048 kcal/mol was found. Therefore, we conclude that the default grid tolerance of $10^{-5}$ is sufficient and economical for the geometry optimization in deMon.

To test the stability of the new geometry optimizer with respect to the basis set size, we repeated the optimization of the test suite with the DZVP basis set using the initial Hessian approximations from Fisher and Lindh. These results are reported in the last two columns of Table 1. The sum of all steps decreases from 236 to 233 and from 239 to 233 for the initial Hessian of Fisher and Lindh, respectively. The optimization of all systems converged to the same equilibrium structures with the two different initial Hessian approximations. The maximum energy difference between the optimized structures is smaller than 0.018 kcal/mol.

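Table 1. Performance Analysis of the Geometry Optimization with the VWN/STO-3G and VWN/DZVP Models Using Two Different Initial Hessian Matrices. Entries are numbers of optimization steps.

| Molecule | STO-3G, Baker [29] | STO-3G, this work, Fisher [23] | STO-3G, this work, Lindh [24] | DZVP, this work, Fisher [23] | DZVP, this work, Lindh [24] |
|---|---|---|---|---|---|
| Water | 6 | 4 | 4 | 4 | 5 |
| Ammonia | 6 | 7 | 6 | 7 | 6 |
| Ethane | 5 | 4 | 4 | 5 | 5 |
| Acetylene | 6 | 6 | 4 | 5 | 4 |
| Allene | 5 | 5 | 12 | 6 | 6 |
| Hydroxysulphane | 8 | 7 | 8 | 7 | 9 |
| Benzene | 4 | 4 | 3 | 3 | 3 |
| Methylamine | 6 | 5 | 4 | 4 | 4 |
| Ethanol | 6 | 6 | 5 | 5 | 4 |
| Acetone | 6 | 7 | 5 | 6 | 5 |
| Disilyl ether | 8 | 10 | 13 | 8 | 9 |
| 1,3,5-Trisilacyclohexane | 8 | 10 | 9 | 6 | 7 |
| Benzaldehyde | 6 | 7 | 9 | 9 | 6 |
| 1,3-Difluorobenzene | 5 | 6 | 4 | 7 | 7 |
| 1,3,5-Trifluorobenzene | 5 | 6 | 5 | 5 | 7 |
| Neopentane | 5 | 4 | 4 | 5 | 4 |
| Furan | 8 | 7 | 6 | 7 | 7 |
| Naphthalene | 5 | 7 | 6 | 6 | 6 |
| 1,5-Difluoronaphthalene | 6 | 8 | 6 | 7 | 6 |
| 2-Hydroxybicyclopentane | 15 | 17 | 19 | 15 | 18 |
| ACHTAR10 | 12 | 12 | 12 | 8 | 8 |
| ACANIL01 | 8 | 13 | 7 | 15 | 7 |
| Benzidine | 9 | 7 | 7 | 7 | 7 |
| Pterin | 10 | 8 | 9 | 11 | 9 |
| Difuropyrazine | 9 | 8 | 8 | 11 | 11 |
| Mesityl oxide | 7 | 7 | 6 | 7 | 6 |
| Histidine | 19 | 16 | 25 | 16 | 32 |
| Dimethylpentane | 12 | 6 | 7 | 7 | 7 |
| Caffeine | 12 | 9 | 9 | 13 | 8 |
| Menthone | 13 | 13 | 13 | 11 | 10 |
| Sum | 240 | 236 | 239 | 233 | 233 |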


Primitive Coordinate Selection

To test the new algorithm for the selection of primitive coordinates, the following 10 systems were optimized using the DZVP basis set and the empirical Hessian matrix from Fisher: the aluminium nonamer ($\mathrm{Al_9}$), cubane ($\mathrm{C_8H_8}$), polyhedral boron hydride ($\mathrm{B_{12}H_{12}^{2-}}$), perylene ($\mathrm{C_{20}H_{12}}$), dimethanospiro[2.2]octaplane ($\mathrm{C_{23}H_{24}}$), yohimbine ($\mathrm{C_{21}H_{26}N_2O_3}$), tetrahydrocannabinol ($\mathrm{C_{21}H_{32}O_2}$), 2-methyl-5-ethyl-9-propylhexadecane ($\mathrm{C_{22}H_{46}}$), tetraphenylporphyne ($\mathrm{C_{44}H_{30}N_4}$), and jawsamicine ($\mathrm{C_{32}H_{43}N_3O_6}$). The structures of these systems are shown in Figure 1. We selected them because they represent cage or ring systems where the generation of primitive coordinates yields a large number of coordinates. The chain system (2-methyl-5-ethyl-9-propylhexadecane) was included as a reference. The aluminium cluster start geometry was taken from Wales [30], the tetraphenylporphyne start geometry was taken from Silvers et al. [31], and the rest of the geometries were constructed with the graphical molecular builder Molden [32] and preoptimized at the VWN/STO-3G level.

Figure 1. Structures of the test set molecules for the selection of primitive coordinates.

In Table 2 the number of degrees of freedom n and the number of generated primitive bonds, angles, and dihedrals are listed. The total number of primitive coordinates np is also given. The numbers in parentheses refer to the number of primitive coordinates before SVD reduction. As Table 2 shows, the SVD reduction of primitive coordinates is performed for all systems except the 2-methyl-5-ethyl-9-propylhexadecane chain and tetraphenylporphyne. Large SVD reductions occur for the B12H12^2-, Al9, and C23H24 cages. For all systems the np/n ratio is less than 2, indicating a relatively small set of primitive coordinates for the optimization. The number of optimization steps for the test systems ranges from 3 to 35. The cubane molecule required only three optimization steps because the initial geometry was very close to the minimum. In contrast, the jawsamicine molecule needed 35 optimization steps because the initial STO-3G geometry was far from the minimum. The optimization of jawsamicine was also the most demanding in CPU time. The full optimization of this 91 atom system took roughly 1 CPU day on the 2.4 GHz Xeon processor.
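The np/n ratios and the size of the SVD reduction discussed here can be re-derived from the entries of Table 2. The following Python sketch is only a bookkeeping aid and is not part of deMon; the names TABLE_2 and ratio_table are illustrative assumptions, and the numbers are copied from Table 2.

```python
# Bookkeeping sketch (not part of deMon): re-derive the np/n ratios
# discussed in the text from the entries of Table 2.
# Each entry: (n, np after SVD reduction, np before SVD reduction).
# For systems without SVD reduction both np values are identical.
TABLE_2 = {
    "Al9": (21, 33, 104),
    "C8H8": (42, 63, 153),
    "B12H12^2-": (66, 110, 446),
    "C20H12": (90, 153, 202),
    "C23H24": (129, 204, 485),
    "C21H26N2O3": (150, 255, 338),
    "C21H32O2": (159, 278, 337),
    "C22H46": (198, 390, 390),
    "C44H30N4": (228, 443, 443),
    "C32H43N3O6": (246, 450, 512),
}


def ratio_table(table):
    """Return {molecule: (np/n after reduction, np/n before reduction)}."""
    return {
        name: (np_after / n, np_before / n)
        for name, (n, np_after, np_before) in table.items()
    }


ratios = ratio_table(TABLE_2)

if __name__ == "__main__":
    for name, (after, before) in ratios.items():
        print(f"{name:12s} np/n = {after:5.2f}  (before SVD: {before:5.2f})")
```

With these numbers every reduced set satisfies np/n < 2, whereas the unreduced B12H12^2- set contains between six and seven primitives per degree of freedom (446/66).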

To compare the new algorithm for the selection of primitive coordinates with the selection algorithm from Lindh14 for weakly bound systems, we performed a set of geometry optimizations of small water clusters (H2O)n, n = 2 to 5. Because of the hydrogen bonds in these clusters, the generalized gradient approximation with the exchange functional of Becke33 and the correlation functional of Lee, Yang, and Parr34 (BLYP) was used. The starting geometries were taken from NEMO.35 In Table 3, we report the convergence statistics. We obtained a total of 49 optimization cycles, compared with the 42 cycles obtained by Lindh with his selection scheme. This difference is mainly due to the different theoretical models (MP2/aug-cc-pVDZ for Lindh and BLYP/DZVP/A2 for the present work) used in the optimization of the water clusters. For all systems we obtained an np/n ratio smaller than 1.2. Examination of the generated coordinates showed that most of them are intramolecular coordinates and that the intermolecular coordinates correspond only to hydrogen bonds. In this way, the water clusters are defined by the most relevant coordinates.
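As a quick consistency check on the totals quoted here, the per-cluster cycle counts of Table 3 can be summed, and the number of internal degrees of freedom of each cluster can be written down. The sketch below assumes the usual 3N - 6 count for a nonlinear system of N atoms and reads the first two columns of Table 3 as Lindh's FCW and standard schemes; the names CYCLES, degrees_of_freedom, and column_sums are illustrative and do not come from deMon.

```python
# Consistency-check sketch for Table 3 (names are illustrative assumptions).
# Cycle counts per cluster size, copied from Table 3:
# (Lindh FCW, Lindh standard, this work).
CYCLES = {
    2: (9, 10, 7),     # water dimer
    3: (12, 14, 10),   # water trimer
    4: (8, 12, 14),    # water tetramer
    5: (13, 24, 18),   # water pentamer
}


def degrees_of_freedom(n_molecules):
    """Internal degrees of freedom 3N - 6 of (H2O)n with N = 3n atoms."""
    n_atoms = 3 * n_molecules
    return 3 * n_atoms - 6


def column_sums(cycles):
    """Sum each column of the cycle table."""
    return tuple(sum(col) for col in zip(*cycles.values()))


sums = column_sums(CYCLES)
dof = {m: degrees_of_freedom(m) for m in CYCLES}
```

The column sums reproduce the totals 42, 60, and 49. An np/n ratio below 1.2 then means, for example, fewer than 47 primitive coordinates for the pentamer (n = 39).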

Table 2. Selection of Primitive Coordinates by the SVD Method and Geometry Optimization with the VWN/DZVP Model and the Fisher Initial Hessian Matrix.23

Molecule      n     Bonds     Angles      Dihedrals   np          Cycles   Time (s)
Al9           21    20 (21)   7 (40)      6 (43)      33 (104)    20       1374.271
C8H8          42    20        14 (48)     29 (85)     63 (153)    3        104.332
B12H12^2-     66    42        25 (123)    43 (281)    110 (446)   15       1187.324
C20H12        90    37        57 (63)     59 (102)    153 (202)   18       3588.834
C23H24        129   59        58 (136)    87 (290)    204 (485)   26       10638.443
C21H26N2O3    150   56        83 (107)    116 (175)   255 (338)   34       18091.121
C21H32O2      159   57        91 (110)    130 (170)   278 (337)   21       9169.285
C22H46        198   67        132         191         390         9        4642.250
C44H30N4      228   86        140         217         443         12       35992.168
C32H43N3O6    246   91        146 (170)   213 (251)   450 (512)   35       86097.135

( ): Primitives before SVD reduction of coordinates.

Table 3. Convergence Statistics for Water Clusters, Initial Geometries from NEMO.35

Molecule          Lindh14,a FCW   Lindh14,a Standard   This work b
Water dimer       9               10                   7
Water trimer      12              14                   10
Water tetramer    8               12                   14
Water pentamer    13              24                   18
Sum               42              60                   49

a MP2/aug-cc-pVDZ.
b BLYP/DZVP/A2, Fisher initial Hessian.

Conclusions

In this article a new scheme for the selection of primitive coordinates is presented. For larger cage systems this selection procedure becomes mandatory to avoid an excessive increase in the computational time for the optimization steps. The new selection scheme (SVD) is implemented in the framework of the geometry optimizer in the DFT program deMon, and it was tested with systems with high average coordination numbers and with floppy molecules. This optimizer works with delocalized internal coordinates. Its special adaptation to the DFT implementation in deMon is described. The test calculations show that the default adaptive grid in deMon with a grid tolerance of 10^-5 is well suited for the geometry optimization. In all cases energy differences of less than 0.1 kcal/mol were found for the different optimized structures. Also, the comparison with Hartree–Fock optimizations indicates the numerical stability of our implementation. Applications to medium-size systems with up to 100 atoms show that these molecules can be fully optimized within 1 to 2 CPU days.

Acknowledgment

The authors would like to thank Patrizia Calaminici, Alberto Vela, and Gabriel Merino for valuable discussions.

References

1. Pulay, P.; Fogarasi, G.; Pang, F.; Boggs, J. E. J Am Chem Soc 1979, 101, 2550.
2. Fogarasi, G.; Zhou, X.; Taylor, P. W.; Pulay, P. J Am Chem Soc 1992, 114, 8191.
3. Peng, C.; Ayala, P. Y.; Schlegel, H. B. J Comp Chem 1996, 17, 49.
4. Baker, J.; Kessi, A.; Delley, B. J Chem Phys 1996, 105, 192.
5. Stratmann, R. E.; Scuseria, G. E.; Frisch, M. J. Chem Phys Lett 1996, 257, 213.
6. Köster, A. M.; Geudtner, G.; Goursot, A.; Heine, T.; Reveles, J. U.; Vela, A.; Salahub, D. R. deMon; NRC: Canada, 2003; http://www.deMon-software.com
7. Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. J Chem Phys 1979, 71, 4993; Mintmire, J. W.; Dunlap, B. I. Phys Rev A 1982, 25, 88.
8. Köster, A. M. J Chem Phys 2003, 118, 9943.
9. Köster, A. M.; Goursot, A.; Salahub, D. R. In Comprehensive Coordination Chemistry II, From Biology to Nanotechnology; McCleverty, J.; Meyer, T. J.; Lever, B., Eds.; Elsevier: New York, 2003, Vol. 1.
10. Farkas, O.; Schlegel, H. B. J Chem Phys 1998, 109, 7100.
11. Farkas, O.; Schlegel, H. B. J Chem Phys 1999, 111, 10806.
12. Baker, J.; Kinghorn, D.; Pulay, P. J Chem Phys 1999, 110, 4986.
13. Arnim, M. V.; Ahlrichs, R. J Chem Phys 1999, 111, 9183.
14. Lindh, R.; Bernhardsson, A.; Schütz, M. Chem Phys Lett 1999, 303, 567.
15. Köster, A. M.; Calaminici, P.; Gomez, Z.; Reveles, J. U. In Reviews in Modern Quantum Chemistry (A Celebration of the contributions of Robert G. Parr); Sen, K. D., Ed.; World Scientific: Singapore, 2001, Vol. 2.
16. Krack, M.; Köster, A. M. J Chem Phys 1998, 108, 3226.
17. Wilson, E. B.; Decius, J. C.; Cross, P. C. Molecular Vibrations; McGraw-Hill: New York, 1955.
18. Pulay, P. Mol Phys 1969, 17, 197.
19. Nakamoto, K. Infrared and Raman Spectra of Inorganic and Coordination Compounds, Part A: Theory and Applications in Inorganic Chemistry; Wiley-Interscience Publication: New York, 1997, p. 71, 5th ed.
20. Banerjee, A.; Adams, N.; Simons, J.; Shepard, R. J Chem Phys 1996, 89, 52; Baker, J. J Comp Chem 1996, 7, 385.
21. Fletcher, R. Practical Methods of Optimization; Wiley: New York, 1981, Vol. 1.
22. Head, J. D.; Zerner, M. C. Adv Quantum Chem 1989, 20, 239.
23. Fisher, T. H.; Almlöf, J. J Phys Chem 1992, 96, 9768.
24. Lindh, R.; Bernhardsson, A.; Karlström, G.; Malmqvist, P. Chem Phys Lett 1995, 241, 423.
25. Broyden, C. G. J Inst Math Appl 1970, 6, 76; Fletcher, R. Comp J 1970, 13, 317; Goldfarb, D. Math Comput 1970, 24, 23; Shanno, D. F. Math Comput 1970, 24, 647.
26. Vosko, S.; Wilk, L.; Nusair, M. Can J Phys 1980, 58, 1200.
27. Hehre, W. J.; Stewart, R. F.; Pople, J. A. J Chem Phys 1969, 51, 2657; Hehre, W. J.; Ditchfield, R.; Stewart, R. F.; Pople, J. A. J Chem Phys 1970, 52, 2769.
28. Godbout, N.; Salahub, D. R.; Andzelm, J.; Wimmer, E. Can J Chem 1992, 70, 560.
29. Baker, J. J Comput Chem 1993, 14, 1085.
30. www-wales.ch.cam.ac.uk/ jon/structures/SC.
31. Silvers, S. J.; Tulinsky, A. J Am Chem Soc 1967, 89, 3331.
32. Schaftenaar, G.; Noordik, J. H. J Comput-Aided Mol Design 2000, 14, 123.
33. Becke, A. D. Phys Rev A 1988, 38, 3098.
34. Lee, C.; Yang, W.; Parr, R. G. Phys Rev B 1988, 37, 785; Colle, R.; Salvetti, D. Theor Chim Acta 1975, 37, 329; Colle, R.; Salvetti, D. J Chem Phys 1983, 79, 1404.
35. Åstrand, P.-O.; Wallqvist, A.; Karlström, G. J Chem Phys 1994, 100, 3726.
