
Edoardo Aprà

Software needs for Quantum Chemistry Software

OAK RIDGE NATIONAL LABORATORY
U. S. DEPARTMENT OF ENERGY


Acknowledgments

• Part of this work is funded by the U.S. Department of Energy, Office of Advanced Scientific Computing Research and by the Division of Basic Energy Sciences, Office of Science, under contract DE-AC05-00OR22725 with Oak Ridge National Laboratory.

• This research used resources of the National Center for Computational Sciences at Oak Ridge National Laboratory under contract DE-AC05-00OR22725.

Outline

• NWChem and MADNESS introduction
• List of software requirements


Why Was NWChem Developed?

• Developed as part of the construction of the Environmental Molecular Sciences Laboratory (EMSL) at Pacific Northwest National Lab

• Designed and developed to be a highly efficient and portable Massively Parallel computational chemistry package

• Provides computational chemistry solutions that are scalable with respect to chemical system size as well as MPP hardware size

• Provides major modeling and simulation capability for molecular science

What is NWChem used for?

• Provides major modeling and simulation capability for molecular science
− Broad range of molecules, including biomolecules, nanoparticles and heavy elements
− Electronic structure of molecules (non-relativistic, relativistic, structural optimizations and vibrational analysis)

− Increasingly extensive solid state capability (DFT plane-wave, CPMD)

− Molecular dynamics, molecular mechanics

Molecular Science Software Group


GA Tools Overview

• Shared memory model in the context of distributed dense arrays
• Complete environment for parallel code development
• Compatible with MPI
• Data locality control similar to distributed memory/message passing model
• Compatible with other libraries: ScaLAPACK, Peigs, …
• Parallel and local I/O library (Pario)
• De-facto standard communication library for Quantum Chemistry Software (NWChem, Molpro, Molcas, MPQC)
• Other fields have expressed interest: Fusion, Nuclear Physics, Astrophysics, Climate
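To make the shared-memory-over-distributed-arrays model concrete, here is a minimal usage sketch against the Global Arrays C interface (ga.h/macdecls.h). It is illustrative only: the array name, sizes, and MA stack/heap values are made up, not taken from NWChem.

// Minimal GA sketch: create a distributed matrix, write and read a patch one-sidedly.
#include <mpi.h>
#include <vector>
#include <cstdio>
#include "ga.h"
#include "macdecls.h"

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    GA_Initialize();                        // GA runs on top of MPI/ARMCI
    MA_init(C_DBL, 1000000, 1000000);       // local memory allocator (illustrative sizes)

    int dims[2] = {1000, 1000};             // global 1000 x 1000 matrix of doubles
    int g_a = NGA_Create(C_DBL, 2, dims, (char*)"matrix", nullptr);
    GA_Zero(g_a);

    // One-sided access: any process can touch any patch, no receiver-side code.
    int lo[2] = {0, 0}, hi[2] = {9, 9}, ld[1] = {10};   // 10 x 10 patch, hi inclusive
    std::vector<double> buf(100, 1.0);
    if (GA_Nodeid() == 0) NGA_Put(g_a, lo, hi, buf.data(), ld);
    GA_Sync();                              // collective: all outstanding one-sided ops complete
    NGA_Get(g_a, lo, hi, buf.data(), ld);   // every process fetches the same patch

    if (GA_Nodeid() == 0) std::printf("buf[0] = %f on %d processes\n", buf[0], GA_Nnodes());
    GA_Destroy(g_a);
    GA_Terminate();
    MPI_Finalize();
    return 0;
}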

Structure of GA

[Layered diagram]
− Application programming language interface: Fortran 77, F90, C, C++, Java, Python, Babel
− Global Arrays: distributed arrays layer (memory management, index translation)
− Message Passing: global operations
− ARMCI: portable 1-sided communication (put, get, locks, etc.)
− System specific interfaces: LAPI, GM, threads, QSnet, IB, SHMEM, Portals

Global Arrays and MPI are completely interoperable: code can contain calls to both libraries.

NWChem Architecture

[Layered diagram]
− Generic Tasks: Energy, structure, …
− Molecular Calculation Modules: SCF energy, gradient, …; DFT energy, gradient, …; MD, NMR, Solvation, …; Optimize, Dynamics, …
− Molecular Modeling Toolkit: Run-time database, Integral API, Geometry Object, Basis Set Object, PeIGS, …
− Molecular Software Development Toolkit: Global Arrays, Memory Allocator, Parallel IO

• Object-oriented design
− abstraction, data hiding, APIs
• Parallel programming model
− non-uniform memory access, Global Arrays, MPI
• Infrastructure
− GA, Parallel I/O, RTDB, MA, Peigs, ...
• Program modules
− communication only through the database
− persistence for easy restart

Gaussian DFT computational kernel

Evaluation of the XC potential matrix element:

ρ(xq) = ∑μν Dμν χμ(xq) χν(xq)

Fλσ += ∑q wq χλ(xq) Vxc[ρ(xq)] χσ(xq)

my_next_task = SharedCounter()
do i=1,max_i
  if(i.eq.my_next_task) then
    call ga_get()                ! fetch the needed patch of D (Dμν)
    (do work)
    call ga_acc()                ! accumulate the contribution into F (Fλσ)
    my_next_task = SharedCounter()
  endif
enddo
barrier()

Both GA operations (ga_get and ga_acc) depend strongly on the communication latency.
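Filling in the skeleton above, a hedged sketch of the same shared-counter pattern using the GA C interface is shown below. NGA_Read_inc plays the role of SharedCounter(); the array handles, function name, and task body are placeholders rather than NWChem's actual DFT kernel.

// Sketch of the shared-counter work distribution above (GA C interface).
// g_counter is a 1x1 integer global array zeroed before the loop;
// g_d and g_f stand for the global density (D) and Fock (F) matrices.
#include "ga.h"

void xc_task_loop(int g_counter, int g_d, int g_f, int max_i) {
    int zero[1] = {0};
    long my_next_task = NGA_Read_inc(g_counter, zero, 1);   // atomic fetch-and-add
    for (int i = 0; i < max_i; ++i) {
        if (i == my_next_task) {
            // ga_get: fetch the patch of D needed for this batch of grid points
            //   NGA_Get(g_d, lo, hi, d_buf, ld);
            // (do work: evaluate rho(x_q) and the XC contribution on the patch)
            // ga_acc: accumulate the partial result into F
            //   NGA_Acc(g_f, lo, hi, f_buf, ld, &one);
            my_next_task = NGA_Read_inc(g_counter, zero, 1);
        }
    }
    GA_Sync();   // barrier(): all accumulates complete before F is used
}

Each NGA_Read_inc, NGA_Get, and NGA_Acc is a small one-sided message, which is why the loop is latency-bound rather than bandwidth-bound.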

Multiresolution Adaptive Numerical Scientific Simulation

Ariana Beste1, George I. Fann1, Robert J. Harrison1,2, Rebecca Hartman-Baker1, Shinichiro Sugiki1

1 Oak Ridge National Laboratory
2 University of Tennessee, Knoxville

In collaboration with

Gregory Beylkin4, Fernando Perez4, Lucas Monzon4, Martin Mohlenkamp5 and others

4 University of Colorado
5 Ohio University

[email protected]

Multiresolution chemistry objectives

• Complete elimination of the basis error
− One-electron models (e.g., HF, DFT)
− Pair models (e.g., MP2, CCSD, …)
− Bound and continuum states on equal footing
• Correct scaling of cost with system size
• General approach
− Readily accessible by students and researchers
− Higher level of composition
− Direct computation of chemical energy differences
• New computational approaches
− Fast algorithms with guaranteed precision

MADNESS Structure

[Layered diagram]
− MADNESS applications – chemistry, physics, nuclear, ...
− MADNESS math and numerics
− MADNESS parallel runtime
− MPI, Global Arrays, ARMCI, GPC/GASNET

[Diagram: MADNESS at the center, surrounded by related software]
− Chemistry: NWChem, GAMESS, MPQC
− Physics: Open Espresso
− Linear Algebra: ScaLAPACK, LAPACK
− I/O: HDF5
− Visualization: VisIt, OpenDX, VMD
− RAD: Python, Octave, SciLab
− Solvers: Sundance, TOPS, PETSc
− Parallel tools: MPI, Global Arrays, GPC/GASNET, Sengal
− Math: Sage, Maple

MADNESS Runtime Objectives

• Scalability to 1+M processors ASAP
• Runtime responsible for
− scheduling and placement,
− managing data dependencies,
− hiding latency, and
− medium to coarse grain concurrency
• Compatible with existing models
− MPI, Global Arrays
• Borrow successful concepts from Cilk, Charm++, Python
• Anticipating next gen. languages

Key elements

• Futures for
− hiding latency and
− automating dependency management (a sketch follows after this list)
• Global names and name spaces
• Non-process centric computing
− One-sided messaging between objects
− Retain place=process for MPI compatibility

• Dynamic load balancing
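The MADNESS runtime itself is not reproduced here; as a rough analogue of how a future both hides latency and records a data dependency, here is a small C++ sketch using std::future. The functions fetch_remote_block and multiply are invented placeholders.

// Rough analogue only: std::future stands in for a MADNESS future.
#include <future>
#include <vector>
#include <numeric>

std::vector<double> fetch_remote_block(int id) {        // stands in for a remote get
    return std::vector<double>(1000, static_cast<double>(id));
}

double multiply(const std::vector<double>& a, const std::vector<double>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

int main() {
    // Launch the "communication" asynchronously; each future records the dependency.
    auto a = std::async(std::launch::async, fetch_remote_block, 1);
    auto b = std::async(std::launch::async, fetch_remote_block, 2);

    // ... other useful work overlaps with the two fetches (latency hiding) ...

    // The dependency is enforced only at the point of use.
    double dot = multiply(a.get(), b.get());
    return dot > 0.0 ? 0 : 1;
}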

I/O Requirements

• I/O models in NWChem
− local I/O (e.g. internal HD in a node of a cluster)
− parallel I/O
• I/O usage:
− checkpointing (size: O(N²) where N ≈ 10⁴)
− scratch data: computed and written once, read many times (size: as much as we can get)
• MPI-IO (a sketch of a checkpoint write follows below)
• HDF5
• Reliability (parallel filesystems …)
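As one concrete shape the checkpoint I/O could take, here is a hedged MPI-IO sketch in which each process writes its own block of rows of an N x N matrix into one shared file with a collective call; the file name, data distribution, and the assumption that the process count divides N are all illustrative.

// Illustrative checkpoint write: each rank writes its contiguous rows at its own offset.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    const long long N = 10000;                    // N ~ 10^4  =>  ~0.8 GB of doubles total
    const long long rows_per_rank = N / nproc;    // assumes nproc divides N
    std::vector<double> my_rows(rows_per_rank * N, static_cast<double>(rank));

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "checkpoint.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Offset offset = (MPI_Offset)rank * rows_per_rank * N * (MPI_Offset)sizeof(double);
    MPI_File_write_at_all(fh, offset, my_rows.data(),
                          (int)my_rows.size(), MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}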

1-sided Libraries Requirements

• GA is used for the bulk of NWChem communication (one-sided put/get; an illustration follows below)
• ARMCI
• GPC
• GASNet
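For readers less familiar with the one-sided model these libraries provide, here is a small illustration of put/get semantics, written with MPI RMA only because its interface is widely known; GA, ARMCI, GPC, and GASNet expose the same style of operations through their own APIs.

// One-sided put/get: rank 0 writes into rank 1's window and reads it back;
// rank 1 executes no matching receive. Run with at least 2 ranks.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double* base = nullptr;
    MPI_Win win;
    MPI_Win_allocate(100 * sizeof(double), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &base, &win);
    for (int i = 0; i < 100; ++i) base[i] = rank;

    MPI_Win_lock_all(0, win);
    if (rank == 0 && size > 1) {
        double value = 42.0, check = 0.0;
        MPI_Put(&value, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
        MPI_Win_flush(1, win);        // put is complete at the target
        MPI_Get(&check, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
        MPI_Win_flush(1, win);
        std::printf("read back %f from rank 1 without any receive there\n", check);
    }
    MPI_Win_unlock_all(win);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}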


MPI Requirements

• 64-bit integers (again!), illustrated in the sketch below
• Compatibility with other communication libraries in use
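The 64-bit integer issue bites wherever element counts exceed 2^31 - 1; with an MPI that only accepts 32-bit int counts, a large transfer has to be chunked by hand, as in this hedged sketch (the receiver would mirror the same loop).

// Sending more than 2^31 - 1 doubles through an int-count MPI interface by chunking.
#include <mpi.h>
#include <algorithm>
#include <cstdint>

void send_large(const double* buf, int64_t n, int dest, MPI_Comm comm) {
    const int64_t chunk = 1 << 30;                       // safely below INT_MAX elements
    for (int64_t off = 0; off < n; off += chunk) {
        int count = static_cast<int>(std::min(chunk, n - off));
        MPI_Send(buf + off, count, MPI_DOUBLE, dest, 0 /* tag */, comm);
    }
}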


Multicore programming

• Task-based parallelism (Intel Threading Building Blocks? a sketch follows after this list)

• Compatible with MPI and other communication libraries

• Profiling & debugging tools
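Below is a sketch of what task-based multicore parallelism could look like with Intel Threading Building Blocks, one of the candidates named above; the loop body is a placeholder, and TBB's scheduler maps the iteration ranges onto worker-thread tasks.

// Task-based loop parallelism with TBB; the per-element work is a placeholder.
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
#include <vector>
#include <cmath>

int main() {
    const std::size_t n = 1 << 20;
    std::vector<double> x(n, 1.0);

    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n),
        [&](const tbb::blocked_range<std::size_t>& r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                x[i] = std::exp(-x[i]);       // placeholder per-element work
        });
    return x[0] < 1.0 ? 0 : 1;
}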


Linear Algebra - Serial

• Serial (e.g. LAPACK/BLAS; a calling sketch follows below)
− Requirement: ease of installation
− Optimized performance over a large range of N
− 64-bit integers (if possible)
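A hedged sketch of the kind of serial LAPACK call such codes rely on, here DSYEV through the Fortran interface; the lapack_int typedef marks exactly where the 32-bit (LP64) versus 64-bit (ILP64) integer choice enters.

// Calling LAPACK's DSYEV from C++; the integer width must match the LAPACK build.
#include <vector>
#include <cstdio>

typedef int lapack_int;            // change to long long for an ILP64 LAPACK/BLAS

extern "C" void dsyev_(const char* jobz, const char* uplo, const lapack_int* n,
                       double* a, const lapack_int* lda, double* w,
                       double* work, const lapack_int* lwork, lapack_int* info);

int main() {
    lapack_int n = 3, lda = 3, info = 0;
    std::vector<double> a = {2, 1, 0,  1, 2, 1,  0, 1, 2};   // symmetric, column-major
    std::vector<double> w(n);
    lapack_int lwork = 3 * n - 1;                            // minimum workspace for DSYEV
    std::vector<double> work(lwork);

    dsyev_("V", "U", &n, a.data(), &lda, w.data(), work.data(), &lwork, &info);
    if (info == 0)
        std::printf("eigenvalues: %f %f %f\n", w[0], w[1], w[2]);
    return info;
}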


Linear Algebra - Parallel

• Distributed arrays: interface with GA
• Installation
• Extensive use of dense symmetric eigensolvers
− MRRR algorithm in ScaLAPACK (PDSYEVR by C. Voemel; a setup sketch follows below)
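PDSYEVR itself has a long argument list, so only the scaffolding it sits on is sketched here, under hedged assumptions: a BLACS process grid plus a block-cyclic descriptor, with illustrative matrix order, block size, and grid shape, and the eigensolver call left as a comment.

// BLACS grid and block-cyclic descriptor setup that a ScaLAPACK eigensolver call needs.
#include <mpi.h>
#include <cstdio>

extern "C" {
    void Cblacs_pinfo(int* mypnum, int* nprocs);
    void Cblacs_get(int context, int request, int* value);
    void Cblacs_gridinit(int* context, const char* order, int nprow, int npcol);
    void Cblacs_gridinfo(int context, int* nprow, int* npcol, int* myrow, int* mycol);
    void Cblacs_gridexit(int context);
    int  numroc_(const int* n, const int* nb, const int* iproc,
                 const int* isrcproc, const int* nprocs);
    void descinit_(int* desc, const int* m, const int* n, const int* mb, const int* nb,
                   const int* irsrc, const int* icsrc, const int* ictxt,
                   const int* lld, int* info);
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int me, np;
    Cblacs_pinfo(&me, &np);

    int ctxt, nprow = 1, npcol = np;          // illustrative 1 x np process grid
    Cblacs_get(-1, 0, &ctxt);
    Cblacs_gridinit(&ctxt, "Row", nprow, npcol);

    int myrow, mycol;
    Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

    const int n = 1000, nb = 64, izero = 0;   // matrix order and block size (illustrative)
    int mloc = numroc_(&n, &nb, &myrow, &izero, &nprow);   // local row count on this process
    int lld = mloc > 1 ? mloc : 1, info = 0, desc[9];
    descinit_(desc, &n, &n, &nb, &nb, &izero, &izero, &ctxt, &lld, &info);

    if (me == 0) std::printf("descriptor built, info = %d\n", info);
    // ... allocate and fill the local block, then call pdsyevr_ here ...

    Cblacs_gridexit(ctxt);
    MPI_Finalize();
    return 0;
}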


Optimization/Tuning

• Profiling tools
− gprof, TAU
− PAPI (see the sketch after this list)
− Code coverage & instrumentation
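For the PAPI item above, a hedged sketch of the usual counter workflow follows; the preset events chosen here are common but hardware-dependent.

// Basic PAPI workflow: create an event set, count a code region, read the counters.
#include <papi.h>
#include <cstdio>

int main() {
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) return 1;

    int events = PAPI_NULL;
    PAPI_create_eventset(&events);
    PAPI_add_event(events, PAPI_TOT_INS);     // total instructions
    PAPI_add_event(events, PAPI_TOT_CYC);     // total cycles

    long long counts[2] = {0, 0};
    PAPI_start(events);
    volatile double x = 0.0;
    for (int i = 0; i < 1000000; ++i) x += 1e-6;    // region being measured
    PAPI_stop(events, counts);

    std::printf("instructions = %lld, cycles = %lld (x = %f)\n",
                counts[0], counts[1], static_cast<double>(x));
    return 0;
}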


Quality Assurance?

• QA suite in NWChem
− Perl scripts
− Reference output files
− Checks results’ correctness
− Database of performance on various HW?


Debugging

• Parallel debuggers
• Printf?


Thanks

• NWChem group and GA Group – PNNL
• Robert Harrison
• NCCS & CCS – ORNL
• Workshop organizers


Backup Slides


NWChem HW and SW requirements

• low latency and high bandwidth for
− I/O
− communication
• availability of a large amount of aggregate memory and disk
• flat I/O model using local disk
• 64-bit addressing (required by highly correlated methods)
• extensive use of Linear Algebra:
− BLAS
− FFT
− Eigensolvers
• use of other scientific libraries (e.g. MASS, ACML_vec for exp and erfc)

NWChem Porting issues on the XT3

• Re-use of existing serial port for x86_64 (compilers)
• Communication library: GA/ARMCI
− Two ARMCI ports: CRAY-SHMEM & Portals
• Made Pario library compatible with Catamount (using glibc calls)

NWChem Porting - II

• Stability issues with the SHMEM port
− Uncovered a Portals problem: the Cray SHMEM group provided a workaround
− NWChem crashes the whole XT when running large jobs
• Performance of the SHMEM port
− Latency ~ 10 μsec
− Bandwidth
  • Contiguous put/get ~ 2 GB/sec
  • Strided put/get ~ 0.3 GB/sec
• ARMCI port using Portals in progress