31
Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency Todd J. Martínez SLAC National Accelerator Laboratory Stanford University SciDAC4 Project: Designing Photocatalysts Through Scalable Quantum Mechanics and Dynamics

Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Todd J. Martínez SLAC National Accelerator Laboratory

Stanford University

SciDAC4 Project: Designing Photocatalysts Through Scalable Quantum Mechanics and Dynamics

Page 2: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Project Overview • Improve computational modeling tools for modern

architectures to enable design of improved photocatalysts • Major Thrusts

– Enable Rapid Prototyping and deployment of multi-level parallel algorithms (Alex Aiken, Kunle Olukotun, Lexing Ying)

• Legion/Regent for high-level parallel expression • DeLite to build domain specific languages for quantum chemistry

and first principle dynamics – Develop and implement modular library frameworks for large-scale

atomistic simulations (Ed Hohenstein, Todd Martínez, Robert Parrish) • Lightspeed framework to build electronic structure from highly

tuned primitives – Design of photoactivated enzymes (Ron Dror, Possu Huang, TJ Lane,

Henry van den Bedem) • Protein design with novel cofactors • QM/MM investigations of photoenzymes

Page 3: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

FAP: CO2 Photochemistry in Nature

hv

Fatty Acid Photodecarboxylase

• Photo-activated production of alkanes

• Requirement for light: not yet clear

• Mechanism: not known

• Project goal: QM calculations to predict mechanism, verify by comparing to spectroscopy, crystallography

Page 4: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Chromophore in FAP

FAD

Like all photoenzymes, FAP contains an “antenna” molecule (cofactor) that absorbs photons and plays a key role in the chemistry

Exp

QM/MM

Absorption Spectrum

QM/MM calculations of excited state mechanism are in progress Large QM regions: Need fast and accurate QM!

See Poster by TJ Lane

Page 5: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Legion

• Task-based programming model – Heterogeneous machines – Distributed memory

• Key features – Programs are written without specifying

where computations will run and where data will be placed

– Separate mapping phase allows program to be tuned to a particular machine

• Currently working to apply Legion to electron repulsion integral generation

Task graph for one time step of a simple application

Page 6: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

DeLite

• DeLite framework allows for easy specification of domain specific languages – Domain specific optimizations – Structure computation and data – Optimize mapping to different hardware targets such as GPUs or

FPGAs

• Working to develop DSL for J/K matrix build • Using DeLite multiscale neural network DSL to machine learn

better exchange-correlation functionals

Learn the nonlinear map from potential to density

Page 7: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Primitives for Electronic Structure?

k

Electron Repulsion Integrals (ERIs)

Density Matrix

SCF Energy

SCF Gradients

We built fast GPU versions of these primitives Build a framework that allows easy reuse?

Continuum Solvation and QM/MM

Page 8: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Lightspeed: EST Primitives ----------------------------------------- Python/C++ API ------------------------------------------

Lightspeed: Library Layout

R.M. Parrish, T.J. Martinez and co-workers, in preparation.

Geometry RHF CIS CASCI THC-PT2

Psidewinder: EST Methods -------------------------------------------- Python API ---------------------------------------------

TeraChem (GPU-based IntBox, CASBox, etc)

OpenMM (CPU/GPU-based MM)

LibXC (CPU-based XC kernels)

Box Layer: Performant Code (Embedded or Linked as needed/available) ------------------------------------------ Wrapped API -------------------------------------------

Users Students Dynamics Codes Applications Codes

Resource List Tensor Basis DFTGrid THCGrid

Data Structures (Simple C++ Classes)

IntBox GridBox DFTBox CASBox THCBox

Compute Primitives (Static Functions)

Future Extensions

Page 9: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Lightspeed

R.M. Parrish, T.J. Martinez and co-workers, in preparation.

Page 10: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Lightspeed Example: QM/MM FOMO-CASCI

Page 11: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Lightspeed Example: QM/MM FOMO-CASCI (continued)

And another dozen lines to production-scale AIMD (see Parrish poster)...

Key Methods:

Page 12: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Lightspeed Enables Rapid Prototyping: Parallax – Neglect of Fragment Differential Overlap (NFDO)

Next Target: Current State of the Art:

Single Points Dynamics

Large Gas-Phase Large PBC

At the Least, Coulomb Embedding is Required: => Parallax Idea:

Page 13: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax: PME Approach

T. Darden, D. York, and L. Pedersen, J. Chem. Phys., 98, 10089 (1993). L. Füsti-Molnár and Peter Pulay, J. Chem. Phys., 117, 7827 (2002).

C.-M. Chang, Y. Shao, and J. Kong, J. Chem. Phys., 136, 114112 (2012).

Raw Density:

Not Fourier Representable

Gaussian-Blurred Density:

Fourier Representable

ESP of blurred density (long-range Ewald ESP):

Page 14: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax: Full Coulomb Embedding (Long Range)

Page 15: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax: Full Coulomb Embedding (Short Range)

=> Linear Scaling, highly but not embarrassingly parallel.

Page 16: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax: Sensitivity to Grid/Ewald Parameters Fourier Grid Spacing: Short-Range Cutoff Distance:

Parallax: RHF/STO-3G 648-Atom Unit Cell -1.6x104 Eh SCF Energy

Page 17: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax: Scaling for Simple Water Box

N=1 N=2 N=4

Parallax: RHF/STO-3G Timings on PBC Water Boxes (10 SCF Iterations + Gradient)

12-Core Intel [email protected] GHz/400 GB RAM

Page 18: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Parallax – Next Steps

• Benchmark accuracy of approach • Application to chemical reactions in solution

– Reacting system treated as a single fragment

• Extend to allow for fragmentation across covalent bonds • Extend to allow for dynamic fragmentation

Page 19: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Recommendation Systems

Page 20: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Recommendation Systems

Dark City Star Wars Zero Dark Thirty

Steel Magnolias

Joe 1 ? 2 1

Fred 5 5 1 1

Jane ? ? 5 ?

Alex ? 3 ? ?

Sara 1 ? ? 5

Need to complete the matrix… Is this possible? Only if the matrix does not have much information! Machine learning strategy: ASSUME the problem is well-posed In example above – 10 numbers out of 20, assume this is enough to determine remaining 10 numbers…

Page 21: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Singular Value Decomposition

= + + …

Diagonal Rank = number nonzero diagonal elements

Assume rank is small and find best low rank approximation that reproduces known entries in the matrix…

Page 22: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Full CI Full Configuration Interaction:

All possible arrangements of N electrons in M orbitals

Need matrix-vector products Hc, vector length NI is factorial in N,M Formal cost: Accounting for sparsity of H:

c1 +c2 +c3 +c4

Page 23: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Rank Sparsity in Full CI? How can we reveal noninformative nature of CI vector?

Rearrange vector to a matrix:

“α-string” “β-string”

c C

Now, play same trick from recommendation systems:

Memory requirement, Computational cost:

Note early work by Koch and later also Taylor…

Page 24: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Are Electronic Wavefunctions Informative?

50 terms (out of 104) are enough for kcal/mol accuracy! Even fewer terms needed

for accurate energy differences!

Page 25: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

rr-FCI Performance

≈1010 determinants! Even the “conventional” FCI timings here are FAR faster than expected – this is because of Fales-Levine work on GPU-based Full CI

1016 determinants

Number of grains of sand on Earth: 1018

Age of the universe: 1017 sec

≈1015 seconds w/ conventional FCI

Page 26: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

rr-FCI • Renaissance in selected CI methods – exploiting element

sparsity of CI vector (Evangelista, Head-Gordon, Umrigar, Alavi): FCI-QMC, Heat Bath-CI, Adaptive Sampling CI

• These also increase the efficiency of full CI – some are deterministic (ASCI) and others are stochastic (FCI-QMC, HBCI)

• rr-FCI exploits rank sparsity of CI vector – There is an rr-FCI value for every element of the CI vector – This is qualitatively different from approximations which neglect small

elements • Deterministic, so gradient is straightforward for dynamics • Working to improve convergence and to exploit mixed

precision

Page 27: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Tensor Hypercontraction

Applying decomposition techniques to 4th order tensor:

P,Q indices have approx. same range as basis

i,j,k,l are unpinned! Can sum over any of these without carrying along the others… Numerical methods to determine X and Z simultaneously… Analytic formulas to determine Z, given X Numerical methods to determine X, given Z Formal scaling of all pair methods reduced to O(N4)!

Page 28: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Implementation of THC

Graphical Representation of J-like term

Where to put parenthesis?

Page 29: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Automatic Factorization for THC

See poster by Ed Hohenstein

Page 30: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Conclusions • Lightspeed framework being developed to rapidly prototype new

algorithms: will provide lessons for more flexible DSLs and implementation of core “boxes” in Legion

• NFDO method being developed using the Lightspeed framework – demonstrates ability to rapidly prototype in this environment

• Rank reduced Full CI leverages low complexity of CI wavefunctions • Rank reduction techniques can also be used for electron repulsion

integrals with tensor hypercontraction – we are exploring automated code generation and optimization to implement THC algorithms

• Simulations of photoenzyme activity in FAP are underway and will be enabled/facilitated by quantum chemistry and first principles dynamics developments

Page 31: Scaling Quantum Mechanics and First Principles Dynamics ... › scidac4pi2018 › presentations › 3...Scaling Quantum Mechanics and First Principles Dynamics for Accuracy and Efficiency

Acknowledgments

• Alex Aiken • Kunle Olukotun • Lexing Ying • TJ Lane • Henry Van Den Bedem • Ron Dror • Possu Huang • Ed Hohenstein • Robert Parrish • Alice Walker • Scott Fales

$$: DOE ASCR / DOE BES

• Henrik Koch • Stefan Seritan • Benjamin Levine • Nick Settje