CCP6 Conference on Computational Physics, Gyeongju , Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2 , T. Miyazaki 3 and D. Bowler 1,2 1 Physics and Astronomy Department, University College London, U.K. 2 London Centre for Nanotechnology, University College London, U.K. 3 Computational Materials Science Center, NIMS, Tsukuba, Japan Summary of O(N) DFT: the Conquest code Summary of practical operation of Conquest: linear scaling, parallel scaling Validation against standard plane-wave codes Work in progress on Ge/Si hut-clusters

Summary of O(N) DFT: the Conquest code

Page 1: Summary of  O(N)  DFT: the Conquest code

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006

Order-N DFT calculations with the Conquest code

Michael J. Gillan (1,2), T. Miyazaki (3) and D. Bowler (1,2)

(1) Physics and Astronomy Department, University College London, U.K.

(2) London Centre for Nanotechnology, University College London, U.K.

(3) Computational Materials Science Center, NIMS, Tsukuba, Japan

Summary of O(N) DFT: the Conquest code

Summary of practical operation of Conquest: linear scaling, parallel scaling

Validation against standard plane-wave codes

Work in progress on Ge/Si hut-clusters

Page 2: Summary of  O(N)  DFT: the Conquest code

Si/Ge nanostructures

Epitaxial deposition of Ge on the Si (001) surface. There is a strain mismatch of a few percent, and the strain is relieved by formation of missing-row trenches. Evolution of surface structure with increasing coverage...

Early 2 x N

Late 2 x N

M x N

‘Hut’ clusters

Page 3: Summary of  O(N)  DFT: the Conquest code

Why traditional methods scale as $N^2$ or worse

• There are N Kohn-Sham orbitals $\psi_n(\mathbf{r})$.

• Each Kohn-Sham orbital $\psi_n(\mathbf{r})$ extends over the entire volume V, so the information in each is proportional to N. Hence the amount of information to be calculated is proportional to $N^2$.

• In standard methods, we have to calculate all overlap integrals:

$S_{mn} = \int d\mathbf{r}\, \psi_m(\mathbf{r})\, \psi_n(\mathbf{r})$

• Hence the number of operations is proportional to $N^3$. The prefactor of $N^3$ is small, so the practical dependence is $N^2$, except for very large systems.

Page 4: Summary of  O(N)  DFT: the Conquest code

Density matrix is key to O(N)

• Density matrix is defined in terms of Kohn-Sham eigenfunctions as:

$\rho(\mathbf{r}, \mathbf{r}') = \sum_n f_n\, \psi_n(\mathbf{r})\, \psi_n(\mathbf{r}')$

with $f_n$ the orbital occupation numbers: at T = 0, $f_n$ = 0 or 1. The operator $\rho$ is idempotent: its eigenvalues are 0 or 1 (it is a projector).

• But $E_{\rm tot}$ in DFT can be expressed entirely in terms of $\rho(\mathbf{r}, \mathbf{r}')$, so we have a variational principle: minimize $E_{\rm tot}$ with respect to the density matrix, subject to the conditions that $\rho(\mathbf{r}, \mathbf{r}')$: (i) is symmetric; (ii) is idempotent; (iii) gives the correct number of electrons. It is enough to require that $\rho$ be weakly idempotent: its eigenvalues lie between 0 and 1.

• Now $\rho(\mathbf{r}, \mathbf{r}') \to 0$ as $|\mathbf{r} - \mathbf{r}'| \to \infty$. So we get an upper bound to $E_{\rm tot}$ if we minimize $E_{\rm tot}$ with the additional constraint that $\rho(\mathbf{r}, \mathbf{r}') = 0$ for $|\mathbf{r} - \mathbf{r}'| > R_c$.
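The projector property of the density matrix can be checked numerically on a toy model. The sketch below is an illustration only and is not taken from the talk or from the CONQUEST code: it assumes a nearest-neighbour tight-binding chain with open ends (whose eigenvectors are sine waves), sets $f_n = 1$ for the lowest `n_occ` states, and builds the discrete analogue of the density matrix. The chain model and all function names are assumptions made for the example.

```python
import math

def chain_eigvec(n, m):
    """n-th eigenvector (n = 1..m) of an open nearest-neighbour chain with m sites."""
    norm = math.sqrt(2.0 / (m + 1))
    return [norm * math.sin(n * j * math.pi / (m + 1)) for j in range(1, m + 1)]

def density_matrix(m, n_occ):
    """rho[j][k] = sum_n f_n psi_n(j) psi_n(k), with f_n = 1 for the n_occ lowest states."""
    rho = [[0.0] * m for _ in range(m)]
    for n in range(1, n_occ + 1):
        psi = chain_eigvec(n, m)
        for j in range(m):
            for k in range(m):
                rho[j][k] += psi[j] * psi[k]
    return rho

def matmul(a, b):
    """Plain-Python product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

m, n_occ = 8, 4                      # toy sizes, chosen only for illustration
rho = density_matrix(m, n_occ)
rho2 = matmul(rho, rho)

# The trace counts the occupied states; rho^2 = rho expresses idempotency.
trace = sum(rho[j][j] for j in range(m))
idempotency_error = max(abs(rho2[j][k] - rho[j][k]) for j in range(m) for k in range(m))
print(f"trace = {trace:.6f}, max |rho^2 - rho| = {idempotency_error:.2e}")
```

In this toy case the trace equals the number of occupied states and the idempotency error is at the level of floating-point round-off, which are conditions (iii) and (ii) of the variational principle.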

Page 5: Summary of  O(N)  DFT: the Conquest code

Density matrix and localised orbitals

• In practice, we cannot work directly with $\rho(\mathbf{r}, \mathbf{r}')$, since it is a function of two vector variables.

• Instead, express $\rho(\mathbf{r}, \mathbf{r}')$ as:

$\rho(\mathbf{r}, \mathbf{r}') = \sum_{i\alpha, j\beta} \phi_{i\alpha}(\mathbf{r})\, K_{i\alpha, j\beta}\, \phi_{j\beta}(\mathbf{r}')$

with localized orbitals $\phi_{i\alpha}(\mathbf{r})$ that vanish outside ‘localization regions’. In practice, take the localization regions spherical, with radius $R_{\rm reg}$ and centred on the atoms. The label $\alpha$ specifies the different localized orbitals on a given atom i.

• Matrix $K_{i\alpha, j\beta}$ is the density matrix in the (non-orthogonal) representation of $\phi_{i\alpha}(\mathbf{r})$.

• Search for the ground state by minimizing $E_{\rm tot}$ with respect to $\phi_{i\alpha}(\mathbf{r})$ and $K_{i\alpha, j\beta}$, subject to idempotency and correct electron number.
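A short counting exercise shows why localized orbitals lead to linear scaling. Two orbitals can have a non-zero overlap only if their localization spheres intersect, that is, if their centres are closer than twice the region radius. The sketch below assumes a periodic one-dimensional chain of equally spaced atoms; the spacing, the radius and the function name are illustrative assumptions, not values from the talk.

```python
def count_overlapping_pairs(n_atoms, spacing, r_reg):
    """Count ordered atom pairs (i, j) whose localization spheres overlap on a periodic chain."""
    length = n_atoms * spacing
    count = 0
    for i in range(n_atoms):
        for j in range(n_atoms):
            d = abs(i - j) * spacing
            d = min(d, length - d)   # minimum-image distance on the ring
            if d < 2.0 * r_reg:      # two spheres of radius r_reg intersect
                count += 1
    return count

# Arbitrary illustrative geometry (same length unit for spacing and radius).
results = {}
for n in (50, 100, 200):
    nnz = count_overlapping_pairs(n, spacing=2.35, r_reg=4.0)
    results[n] = nnz
    print(f"N = {n:4d}: non-zero atom-pair blocks = {nnz:5d}, per atom = {nnz / n:.1f}")
```

The number of non-zero blocks per atom stays constant as the chain grows, so the total number of matrix elements to store and compute grows linearly with N. The double loop used for counting is itself quadratic; it is there only to keep the example short.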

Page 6: Summary of  O(N)  DFT: the Conquest code

Enforcing idempotency

• In orthogonal tight-binding, the equivalent problem is to minimize:

$E_{\rm tot} = {\rm Tr}[KH]$

subject to weak idempotency of K, and fixed electron number.

• An effective method is that of Li, Nunes and Vanderbilt, in which:

$K = 3L^2 - 2L^3$

with L the ‘auxiliary density matrix’. This works because of the properties of the polynomial $3x^2 - 2x^3$.

• The same scheme is applied in CONQUEST, but in a non-orthogonal version:

$K = 3LSL - 2LSLSL$

where

$S_{i\alpha, j\beta} = \int d\mathbf{r}\, \phi_{i\alpha}(\mathbf{r})\, \phi_{j\beta}(\mathbf{r})$
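The key property of the polynomial $3x^2 - 2x^3$ is that it maps 0 to 0 and 1 to 1 with zero slope at both points, so eigenvalues near 0 or 1 are pushed closer to them. The sketch below illustrates this by applying the polynomial repeatedly to a small symmetric matrix in an orthogonal basis (S equal to the identity), which is the McWeeny purification iteration. It is a toy demonstration under those assumptions: in the Li-Nunes-Vanderbilt scheme the polynomial is instead inserted into the energy, which is then minimized with respect to L. All names and the 2x2 test matrix are made up for the example.

```python
import math

def matmul(a, b):
    """Plain-Python product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mcweeny(mat):
    """Return 3 L^2 - 2 L^3 for a square matrix L (orthogonal basis assumed)."""
    l2 = matmul(mat, mat)
    l3 = matmul(l2, mat)
    size = len(mat)
    return [[3.0 * l2[i][j] - 2.0 * l3[i][j] for j in range(size)] for i in range(size)]

def symmetric_2x2(eig_a, eig_b, theta):
    """Symmetric 2x2 matrix with eigenvalues eig_a, eig_b and eigenvectors rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    off = (eig_a - eig_b) * c * s
    return [[eig_a * c * c + eig_b * s * s, off],
            [off, eig_a * s * s + eig_b * c * c]]

# Start from eigenvalues 0.8 and 0.3: weakly idempotent, but not a projector.
L = symmetric_2x2(0.8, 0.3, theta=0.7)
for _ in range(10):
    L = mcweeny(L)

L2 = matmul(L, L)
idempotency_error = max(abs(L2[i][j] - L[i][j]) for i in range(2) for j in range(2))
trace = L[0][0] + L[1][1]
print(f"trace = {trace:.6f}, max |L^2 - L| = {idempotency_error:.2e}")
```

After a few iterations the eigenvalue 0.8 has converged to 1 and the eigenvalue 0.3 to 0, so the matrix has become a projector onto a single state.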

Page 7: Summary of  O(N)  DFT: the Conquest code

Summary of CONQUEST

• Ground-state search is done as three nested loops:

1. inner: minimization with respect to the auxiliary density matrix $L_{i\alpha, j\beta}$, with cut-off distance $R_L$.

2. middle: self-consistency

3. outer: minimization with respect to $\phi_{i\alpha}$, with cut-off distance $R_{\rm reg}$

• Minimization with respect to $L_{i\alpha, j\beta}$ at fixed Kohn-Sham potential and fixed $\phi_{i\alpha}$ is equivalent to non-orthogonal tight-binding; done by a combination of McWeeny purification and Li-Nunes-Vanderbilt.

• Self-consistency: reduction of density residual to zero: done by ‘Guaranteed Reduction Pulay’.

• Localized orbitals represented by B-splines, or by pseudo-atomic orbitals; minimization by conjugate gradients.

• Written as a parallel code from an early stage.

Page 8: Summary of  O(N)  DFT: the Conquest code

Parallel operation

Main aspects of CONQUEST parallel coding:

• Formation of overlap and Hamiltonian matrix elements by integration over grid points: the integration grid is divided into ‘domains’, with one processor responsible for each domain.

• Matrix operations: atoms divided into ‘primary sets’, with one processor responsible for rows of matrices associated with one primary set.

• Fourier transformation (Hartree potential etc...) is done in parallel.
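The idea of giving each processor its own ‘primary set’ of atoms can be sketched with a simple partition. The slide does not say how CONQUEST chooses the sets, so the contiguous, near-equal split below is purely an assumption for illustration, as are the function name and the toy sizes.

```python
def primary_sets(n_atoms, n_procs):
    """Split atom indices 0..n_atoms-1 into n_procs contiguous, near-equal sets."""
    base, extra = divmod(n_atoms, n_procs)
    sets = []
    start = 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)   # first `extra` processors get one more atom
        sets.append(range(start, start + size))
        start += size
    return sets

# Toy example: 10 atoms shared among 3 processors.
sets = primary_sets(n_atoms=10, n_procs=3)
for p, atoms in enumerate(sets):
    # Each processor would hold the matrix rows belonging to the atoms in its set.
    print(f"processor {p}: atoms {list(atoms)}")
```

A production code would normally also take spatial locality into account, so that atoms in one set are close together and most of their matrix elements involve data held locally.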

Page 9: Summary of  O(N)  DFT: the Conquest code

Practical linear scaling

Tests of convergence of total energy with respect to L-matrix cut-off RL and localization-region radius Rreg. Bulk silicon.

Page 10: Summary of  O(N)  DFT: the Conquest code

Practical parallel scaling

Parallel scaling: increasing number of processors, constant number of atoms per processor

O(N) scaling: constant number of processors, increasing number of atoms

Page 11: Summary of  O(N)  DFT: the Conquest code

Relaxation of Si (001) surface

Method              Basis      Bond length (Å)   Bond angle (deg)
SIESTA/NSC          SZ         2.50              15.9
CONQUEST/NSC        SZ         2.50              14.5
CONQUEST/NSC O(N)   SZ         2.42              15.0
CONQUEST/SC         B-spline   2.37              22.8
SIESTA/SC           DZP        2.40              19.9
VASP                PW         2.41              19.7

Comparison of relaxed structure of Si (001) surface from non-self-consistent and self-consistent SIESTA and CONQUEST calculations using different basis sets with plane-wave results (VASP). Quantities compared are bond length and tilt angle of surface dimer pair.

Page 12: Summary of  O(N)  DFT: the Conquest code

Structure and energetics of Ge/Si hut clusters

Energetics of hut clusters with O(N) electronic-structure calculations. Aims of the investigation:

• The smallest stable hut-clusters contain ~10,000 atoms of Ge. Previous investigations have used continuum elasticity theory combined with DFT calculations. Energetics of faces and edges is important.

• Combination of DFT and elasticity theory is not easy, and it is not known how large the clusters must be for this approach to be valid.

• Our aim: to use the hierarchy of electronic-structure methods to test previous calculations. We are investigating the energetics using tight-binding, non-self-consistent DFT tight-binding, and self-consistent DFT tight-binding. In the future, we will also use full DFT with plane-wave accuracy.

The faces of the hut-clusters are Ge (105) surfaces

Page 13: Summary of  O(N)  DFT: the Conquest code

Ge(105) : surface energy by different methods

Ge(105): surface energy (LDA)

empirical TB:         74.3 meV/Å²
NSC-AITB:             76.5 meV/Å²
SC-AITB:              81.5 meV/Å²
full DFT (blip):      74.8 meV/Å²
STATE (plane wave):   70.0 meV/Å²

Ge/Si(105): surface energy (GGA)

STATE (12.25 Ry):     43.7 meV/Å²
STATE (36 Ry):        45.6 meV/Å²

by Hashimoto et al.

Page 14: Summary of  O(N)  DFT: the Conquest code

Ge(105): cutoff of (auxiliary) density matrix L

Ge(105): NSC-AITB

R_L (bohr)   E_tot (Ha/atom)   F_max (Ha)   E_surf (eV/Å²)
15.4         -4.847635         0.0027       0.0801
20.4         -4.850438         0.0014       0.0753
25.4         -4.851420         0.0007       0.0752
30.4         -4.851811         0.0004       0.0755
diag         -4.852048         0.0001       0.0765

R_L = 20.4 (or 25.4) bohr is enough.
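The convergence with respect to the L-matrix cut-off is easier to judge as a difference from the exact-diagonalisation (‘diag’) row. The short script below does that arithmetic on the numbers from this slide; only the variable and function names are introduced for the example.

```python
# (R_L in bohr, E_tot in Ha/atom, E_surf in eV/A^2), values copied from the slide
rows = [
    (15.4, -4.847635, 0.0801),
    (20.4, -4.850438, 0.0753),
    (25.4, -4.851420, 0.0752),
    (30.4, -4.851811, 0.0755),
]
diag_etot, diag_esurf = -4.852048, 0.0765   # reference: exact diagonalisation

def errors_vs_diag(table, ref_etot, ref_esurf):
    """Return (R_L, dE_tot in mHa/atom, dE_surf in meV/A^2) relative to the reference."""
    return [(r_l, (e_tot - ref_etot) * 1.0e3, (e_surf - ref_esurf) * 1.0e3)
            for r_l, e_tot, e_surf in table]

errors = errors_vs_diag(rows, diag_etot, diag_esurf)
for r_l, d_etot, d_esurf in errors:
    print(f"R_L = {r_l:5.1f} bohr: dE_tot = {d_etot:6.3f} mHa/atom, "
          f"dE_surf = {d_esurf:5.1f} meV/A^2")
```

The total-energy error falls from about 4.4 mHa/atom at 15.4 bohr to about 0.2 mHa/atom at 30.4 bohr, while the surface energy is already within roughly 1 meV/Å² of the reference from 20.4 bohr onwards, consistent with the slide's conclusion.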

Page 15: Summary of  O(N)  DFT: the Conquest code

Strategy for calculations on hut clusters

Quantity studied: total energy per Ge atom of the hut cluster, and its dependence on

• size of hut cluster: a

• spacing of hut clusters: b

• thickness of Si substrate: d

Compared with: energy per Ge atom of the 2xN Ge/Si(001) surface

Energy per Ge atom = (Etot - Esub)/(number of Ge hut atoms)

Methods: (empirical) TB, DFT calculations

[Figure: Ge hut cluster on a Si (001) substrate, with a, b and d labelled.]
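The energy per Ge atom defined on this slide is a one-line formula, written out below as a small helper. The function name and the input numbers are placeholders chosen for the example; they are not results from the talk.

```python
def energy_per_ge_atom(e_tot, e_sub, n_ge_hut_atoms):
    """(E_tot - E_sub) / (number of Ge hut atoms), following the definition on the slide."""
    if n_ge_hut_atoms <= 0:
        raise ValueError("number of Ge hut atoms must be positive")
    return (e_tot - e_sub) / n_ge_hut_atoms

# Placeholder inputs in arbitrary energy units, NOT values from the talk.
example = energy_per_ge_atom(e_tot=-1250.0, e_sub=-1000.0, n_ge_hut_atoms=50)
print(example)
```

Because the substrate energy is subtracted, the result can be compared between hut clusters of different size a, spacing b and substrate thickness d, and with the corresponding quantity for the 2xN surface.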

Page 16: Summary of  O(N)  DFT: the Conquest code

Size:

• 28x28 on 32x32, thickness 8 layers

– 22,746 atoms

• Calculations are done on the Earth Simulator using 64 nodes (512 processors, 1/10 of the ES)

– DMM: ~10 min.

– Initial forces are already small.

– Structure optimisation at the NSC-AITB level

[Figure: hut-cluster model, with labelled dimensions 174 Å and 48 Å.]

Page 17: Summary of  O(N)  DFT: the Conquest code

Ge hut cluster vs. 2xN on Si(001): Tight-binding O(N)

• Type 2 ‘hut’ is more stable than type 1 for small coverage.

• Crossing of the energy per Ge atom between 2xN and hut?

• Substrate of hut cluster: p2x2 (without dimer vacancies)

• Thicker substrates will reduce the energy per Ge atom of larger hut clusters.

-> hut should be more stable

• Similar calculations by order-N DFT in progress...

Page 18: Summary of  O(N)  DFT: the Conquest code

Conquest O(N) DFT on Ge/Si hut-clusters

• Type 1 ‘hut’ is more stable than type 2.

• Further calculations being done now.

• Self-consistent DFT has also been achieved for this size of system.

Minimal-basis, non-self-consistent DFT. Size of system: 22,746 atoms

Page 19: Summary of  O(N)  DFT: the Conquest code

Summary and continuation...

Conquest O(N) DFT has been tested on simple systems, and shows satisfactory agreement with calculations using pseudo-atomic orbitals (Siesta) and with plane-wave calculations.

For the Ge (105) surface, O(N) DFT calculations with different density-matrix cut-off radii RL show rapid convergence with respect to RL. The full blip-function basis in Conquest (equivalent to plane waves) gives good agreement with plane-wave results for the surface formation energy.

Ge/Si hut clusters: O(N) tight-binding calculations on systems of 20,000 – 30,000 atoms are straightforward. O(N) self-consistent DFT calculations on this size of system are feasible.

Preliminary results on Ge/Si hut-clusters suggest cross-over to stable hut-clusters at coverage of about 3 monolayers. However, much more calculation and analysis is still needed.

Page 20: Summary of  O(N)  DFT: the Conquest code

Credits

• National Institute for Materials Science, Tsukuba

• International Center for Young Scientists, NIMS, Tsukuba

• MEXT grant for Scientific Research in Priority Areas “Development of New Quantum Simulators and Quantum Design” (No. 17064017)

• Earth Simulator Center, Yokohama

• UK Engineering and Physical Sciences Research Council

• Royal Society