Geometry Optimization in Cartesian Coordinates: The End of the 2-Matrix?
Jon Baker* and Warren J. Hehre Department of Chemistry, University of California, Iruine, California 9271 7
Received 28 August 1990; accepted 3 December 1990
Geometry optimization directly in Cartesian coordinates using the EF and GDIIS algorithms with standard Hessian updating techniques is compared and contrasted with optimization in internal coordinates utilizing the well known 2-matrix formalism. Results on a test set of 20 molecules show that, with an appropriate initial Hessian, optimization in Cartesians is just as efficient as optimization in internals, thus rendering it unnecessary to construct a 2-matrix in situations where Cartesians are readily available, for example from structural databases or graphical model builders.
Geometry optimization-the location of stationary points on potential energy surfaces-is of major im- portance in computational chemistry. All modern methods for geometry optimization are based on gra- dient techniques and analytical gradients are now routinely available for a wide range of ab initio wavefunctions (for a review see reference 1). A num- ber of algorithms exist for efficiently locating both energy minima, e.g., Schlegel's algorithm: Pulay's GDIIS method: and transition states, e.g., Baker's EF algorithm: and all the most widely used ab initio computational programs, e.g., the GAUSSIAN series: CADPAC; GAMESS,7 etc. incorporate modules for geometry optimization. However, despite the fact that gradients are invariably calculated in Cartesian coordinates, the majority of such programs actually perform the optimization in internal coordinates (bond lengths, bond angles, dhedral angles) utilizing the well known Z-matrix formalism. This is also true of the AMPAC program package: which incorpo- rates many of Dewar's semiempirical procedures, e.g., MND0,S AMl.'O Only molecular mechanics pro- grams, which employ empirical force fields, e.g., MM2," TRIPOS," routinely carry out geometry op- timizations directly in Cartesian coordinates (al- though these and other molecular mechanics force fields are actually parameterized in terms of internal coordinates).
There are several reasons why internal coordi- nates, as implemented via a Z-matrix, are so widely used. Perhaps the most important is the fact that internal coordinates (bond lengths and angles) are
*Author to whom all correspondence should be addressed.
central to the way chemists think about molecular geometries. Molecular construction using a Z-matrix is not difficult, at least for small to medium sized acyclic systems, and symmetry can be readily im- posed by constraining appropriate geometrical pa- rameters. Furthermore, many gradient-based al- gorithms for geometry optimization require the sec- ond derivative matrix (the Hessian)-or at least a suitable approximation to it-and this is often guessed empirically to start off the optimization and updated after each cycle.2~'"'' For the most part, purely empirical techniques for obtaining an initial Hessian are best handled in internal coordinates, where the individual elements may be interpreted in terms of bond stretching or angle bending force con- stants. Finally, internal coordinate representations will have already eliminated the (six) degrees of free- dom associated with overall molecular orientation (translation and rotation), as well as any additional redundancies due to symmetry; these will need to be eliminated from treatments based on Cartesian coordinates.
There are, however, significant disadvantages with the use of internal coordinates as implemented in the Z-matrix formalism. The major disadvantage from the user's point of view is that as the system becomes larger it becomes increasingly more diffi- cult to construct a suitable Z-matrix, i.e., one having the correct symmetry with the appropriate number of variables (degrees of freedom). This is particularly true for cyclic molecules and even more so for mol- ecules incorporating fused rings. Additionally, there is always the danger of inadvertently defining non- independent variables and consequently having fewer degrees of freedom than are needed. Even with a "good" 2-matrix, it is possible during the course of the optimization for bond angles to move outside
Journal of Computational Chemistry, Vol. 12, No. 5, 606-610 (1991) 0 1991 by John Wiley & Sons, Inc. CCC 0192-8651 /91/050606-05$04.00
GEOMETRY OPTIMIZATION IN CARTESIAN COORDINATES 607
the range 0 < angle < 180, which may then cause problems for any related diheral angle.
Another disadvantage is that the set of internal coordinates chosen to carry out the optimization can have a significant effect on the rate of convergence; this is especially true for cyclic systems where vari- ables are likely to be highly coupled. The major prob- lem here is that such coupling is not present in the initial Hessian, which is usually diagonally dominant if not diagonal itself, and it can take several cycles before this information is built into the Hessian by the updating procedure. In theory, if exact second derivatives were available at each cycle, the rate of convergence would be insensitive to the choice of coordinates; however, second derivatives are typi- cally only available for SCF wavefunctions and even here are still relatively costly to calculate (approx- imately three times the CPU time of a single gra- dient).
Finally, internal coordinates do not lend them- selves to graphics based molecule building tech- niques or to the use of existing structural databases, e.g., the Cambridge Crystallographic Data Base,16 the entries of which are in terms of Cartesian coordi- nates. Without the ability to perform optimizations directly in terms of Cartesian coordinates neither of these valuable resources are likely to be efficiently utilized by chemists.
The main objective of this article is to demonstrate that, given a suitable approximate initial Hessian, geometry optimization can be carried out just as efficiently in Cartesian coordinates as in internal co- ordinates, at least for those systems currently ame- nable to ab initio treatments (say up to fifteen or so heavy, i.e., non-hydrogen atoms). We see manip- ulations involving Cartesian coordinates replacing those involving internal coordinates particularly for cyclic compounds, where Z-matrix construction is nontrivial and often time consuming. Optimizations in internal coordinates will of course continue to be employed, in particular for small acyclic molecules and in situations where geometrical variables need to be constrained.
We concentrate in this article on minimization, em- ploying two currently available algorithms: the EF algorithm4-which although written primarily to lo- cate transition states is also an efficient minimizer- and a modified version of Pulays GDIIS algorithm? The EF algorithm has been used extensively for ge- ometry ~ptimization.~ GDIIS, based on the popular DIIS method for accelerating SCF convergence,ls has not been widely used, although recent work by Cum- mins and Gready has demonstrated the advantages of this approach at the semiempirical 1e~el. l~ Pulays original algorithm3 involved a static Hessian (taken
to be a unit matrix) with no updating, but using a variable metric, i.e., updating the Hessian, has proved to be generally superior;lg consequently this is the approach adopted in the present work. Two further modifications have been made: (1) a restriction on the total step size (not greater than 0.3 au), and ( 2 ) only those points within a certain distance (again 0.3 au) of the current point are used to obtain the next point; points further than this distance are not included in the iterative subspace. Although perhaps contrary to the philosophy behind the method, in practice (1) prevents wild steps early on in the op- timization, while rejecting inappropriate earlier ge- ometries, as in (2), improves the convergence. GDIIS is not switched on during an optimization unless the Hessian is positive definite and until the rms gradient is below a user supplied tolerance (default 0.1 au); until these conditions are satisfied the EF algorithm is used. With these modifications, GDIIS has proven to be an efficient and reliable minimizer for both noncyclic and cyclic systems (despite rec- ommendations to the contrary in reference 19).
For optimization in Cartesian coordinates to be viable a suitable approximation to the initial Hessian matrix has to be provided and for this purpose we have used an empirical Hessian obtained from the TRIPOS 5.2 force field.12 The TRIPOS force field is geared entirely towards minimization, and conse- quently is unlikely to be appropriate for transition state searches. Obtaining an approximate initial Hes- sian suitable for a transition state optimization is still an unsolved problem.
A few comments are needed to clarify how the Cartesian optimizations are carried out in practice. The full 3N x 3N (Nis the number of atoms) Hessian in Cartesian coordinates is treated by first projecting out vectors corresponding to translations and s i n - itesimal rotations constructed using the Eckart con- ditions;2 the resulting matrix is then diagonalized and eigenvectors with zero eigenvalues rejected. The remaining vectors are checked carefully for sym- metry. On the first optimization cycle the entire Hes- sian is reconstructed using only those eigenvectors which preserve symmetry (these vectors are them- selves symmetry purified if necessary). In this way, a slightly impure Hessian can be input to start off the optimization while still taking advantage of any molecular symmetry. The Hessian inverse required to implement GDIIS is also constructed from the eigenvectors and eigenvalues, again allowing full mo- lecular symmetry to be utilized. At the same time the eigenvalues can be used to check whether or not the Hessian is positive definite; if negative eigenval- ues appear during the course of a GDIIS optimization (perhaps resulting from an inappropriate Hessian up- date or an indefinite initial Hessian) then the EF algorithm is used to calculate the new step on that cycle instead.
Both the EF and GDIIS algorithms have been in-
608 BAKER AND HEHRE
corporated into the SPARTAN ab initio program system.2l This has been employed for all studies reported here.
RESULTS AND DISCUSSION
A set of 20 molecules was selected and equilibrium geometries calculated at the SCF level with a variety of basis sets using both 2-matrix and Cartesian op- timization. Within each optimization type (internal and Cartesian) four calculations were performed on each system: (1) minimization using the EF algo- rithm with a unit matrix as the initial Hessian; (2) minimization with the initial Hessian estimated via the TRIPOS 5.2 force field; (3) minimization using GDIIS with a unit Hessian, and (4) minimization us- ing GDIIS and the TRIPOS Hessian. Convergence criteria were as follows: 0.0003 au on the rms and maximum gradient component, 0.0003 A on bond lengths and 0.05 on both bond and dihedral angles; for Cartesian optimizations the maximum displace- ment on any component was 0.0003 A. Results are given in Table I.
Of the molecules chosen (Table I) the fist three are small organics, the next ten form part of a test suite used to check the performance of the SPAR- TAN package, benzaldehyde, pterin and 1,4,5-trihy- droxyanthraquinone were taken from reference 19
(f, g, and i in that article, respectively) and the last four were taken from reference 12 (these were orig- inally taken from the Cambridge Structural Data Base13). This test set contains several potentially awkward cyclic systems and two molecules for which there are no standard parameters in the force field (entries four and five in Table I). The majority of compounds have either very low (C2, C,) or no symmetry.
Several features emerge from the results in Table I: For optimizations in internal coordinates, use of the molecular mechanics Hessian (instead of a unit Hessian) typically reduces the number of cycles re- quired to reach convergence by one or two cycles, although in some cases the savings are even greater. Use of the TRIPOS Hessian is much more important for Cartesian optimization. The number of cycles required to achieve convergence is typically reduced by more than 50%. Comparing internal and Cartesian optimizations, this reduction on using the TRIPOS Hessian changes Cartesian optimizations from com- pletely uncompetitive (vis a vis internal coordinates and a unit Hessian) to extremely competitive. The performance of the two optimization algorithms (EF and GDIIS) is markedly similar, for both cyclic and noncyclic molecules.
Some of the larger molecules in Table I warrant specific mention. No attempt was made for these systems to utilize optimum 2-matrix variables
Table I. Number of optimization cycles to reach convergence for EF and GDIIS algorithms for minimization using both internal and Cartesian coordinates.
EF GDIIS EF GDIIS Number Basis Number of TRIFOS TRIPOS TRIPOS TRIPOS
Molecule set of atoms Symmetry variables Unit 5.2 Unit 5.2 Unit 5.2 Unit 5.2
CH,CH,F 6-31G* CH,NH, 6-31G* HCONH, 6-31G* C,H,Li, 6-31G* FC10, STO-3G* O,S(CH3), 3-21G* HzOz 6-31G** CH,CH,OH STOSG C,H,OFCl STOSG P F o l e STO-3G C,H,OF STO-3G CH,CHFCl STOSG 2-fluoro furan STO-3G benzaldehyde STOSG pterin STO-3G 1,4,5-trihydro~y STO-3G
anthraquinone ACTHCPd STO-3G ACYGLYlld STOSG ACHTAJ310d STOSG ACANILO Id STOSG
8 7 6 8 4
11 4 9
12 10 12 8 9
14 17 27
16 15 16 19
11 8 6 9 9 7 9 9 6 4 7 10 4 8 6 9 8 8 4 14 7
13 16 15 21 10 10 9 10 9
18 10 9 18 11 10 15 14 11 25 13 6 31 19 8 51 F Fb
42 F Fb 39 35 20 42 18 9 34 54 8
8 6 13 8 7 14 8 6 17 7 9 15 8 6 11 8 7 23
11 7 17 15 16 30 8 10 35
10 8 18 8 9 31
15 9 22 14 9 19 16 5 26 20 9 23 F Fb 39
F F F 31 21 F 17 8 47 16 8 37
6 7 6
10 5 7
10 24 14 8
10 11 10 8
90 66 15 7
10 6 13 7 15 7 21 9 9 5
19 7 22 10 27 23 38 11 16 7 27 10 31 11 18 10 24 8 20 10 35 11
F F F 67 32 14 35 6
optimization aborted as atoms moved too close (see text). boptimization aborted due to convergence failure in SCF (see text). failed to converge within 100 cycles. dnomenclature as per Cambridge database. See ref. [ 171
GEOMETRY OPTIMIZATION IN CARTESIAN COORDINATES 609
(including dummy atoms where appropriate); all Z-matrices were constructed using valence-type coordinates, with each atom being joined to its neighbor by a bond length, a bond angle and a di- hedral angle. Such a description is often very poor for cyclic systems. This explains the relatively poor performance of the internal coordinate optimization in several cases, and illustrates the pitfalls in using an unsuitable Z-matrix. The benzaldehyde and pterin optimizations are fairly straightforward, but for 1,4,5- trihydroxyanthraquinone the internal coordinate op- timization failed completely. With...