Use all the codes! Enabling portable researchers Adam J. Jackson Email: adam.jackson{at}stfc.ac.uk Gitlab/Github: @ajjackson Twitter: @binarystate


Use all the codes!
Enabling portable researchers

Adam J. Jackson

Email: adam.jackson{at}stfc.ac.uk
Gitlab/Github: @ajjackson
Twitter: @binarystate


Overview

1 Background
  Why is everything blue?

2 Portable researchers
  and their tools

3 Plane-wave DFT in ASE
  VASP and CASTEP: similar codes, different interfaces

Image © STFC Alan Ford


About me
• 2006-2011 MEng Chemical Engineering (University of Bath)
• 2011-2016 MRes → PhD → Research Chemistry (University of Bath)
  • Ab initio thermodynamics
  • Photovoltaics
• 2016-2019 Post-doc Chemistry (University College London)
  • Semiconductor defects
  • Transparent conductors
• 2019- Computational Scientist Theoretical Physics (STFC)
  • Inelastic neutron scattering

“DFT practitioner”


Rutherford Appleton Laboratory

Rutherford Appleton Laboratory, Harwell Campus, Oxfordshire

STFC is a UK research council with a mission to:
Promote and support high-quality scientific and engineering research by developing and providing, by any means, facilities and technical expertise in support of basic, strategic and applied research programmes funded by persons established in the UK and elsewhere.
i.e. STFC does research

RAL site includes
• computing/data infrastructure
• Central Laser Facility
• ISIS neutron and muon source
• Diamond Light Source (joint venture with Wellcome Trust)

[Figure: Google Maps imagery of the site, 50 m scale bar. Imagery ©2019 Getmapping plc, Infoterra Ltd & Bluesky, Maxar Technologies, The GeoInformation Group, Map data ©2019. https://www.google.com/maps/@51.5726085,-1....]


Current work: Abins / Mantid
• Inelastic neutron scattering (INS) simulation
• Abins is an “Algorithm” within Mantid, a framework for instrument scientists
• Currently imports vibration/phonon calculation data from CASTEP, CRYSTAL, Gaussian, DMOL3


Portable researchers
So many codes…


Codes used in an early research career
(sample of 1)
• 2011-2016 (Masters, PhD) Ab initio thermodynamics, photovoltaics
  • FHI-aims (DFT, atom-centred numerical orbitals, all-electron)
  • GULP (forcefield)
  • VASP (DFT, plane-wave basis, pseudopotential)
• 2016-19 (Post-doc) Defects, optics, transparent conductors
  • VASP
  • GPAW (DFT, numerical basis sets, pseudopotential)
  • Questaal (GW, LMTO, all-electron)
• 2019-? (National lab) Phonons, INS simulation
  • CASTEP (DFT, plane-wave basis, pseudopotential)
  • CRYSTAL (DFT, Gaussian basis, all-electron)


How to choose an atomistic code

Good reasons to choose a particular code
• Available levels of theory (forcefields, DFT, GW etc.)
• Available analysis tools
• Performance optimisation for available HPC
• Accuracy/reliability established by benchmarking
• Reproducibility of calculations and long-term sustainability of project

Realistic reasons to choose a particular code
• Your supervisor/colleagues already know it
• Cost/availability of licenses
• Your supervisor/colleagues are friends with the developers
• All of your group’s existing tools were written for it


Becoming independent doesn’t actually change this?


Codes used in an early research career

Tools: Phonopy, BANDUP, TDEP, Galore, Sumo, OptaDOS, Abins, ATAT


Importance of good tools

ASE addresses the “tool compatibility” problems:
• By writing personal/internal tools with ASE, we have a low barrier to supporting more atomistic codes
• By developing community tools and new methods with ASE, we bring them to more researchers

A question for coffee / beer / hackathon: can ASE help with the other factors?


Importance of good tools

Anecdote 1: Everyone needs file converters
• 2012 – I write a shell script aims2vasp to convert aims structures to VASP input. It’s not very good.
• 2014 – I write ase-convert, a wrapper around ASE io functions. With only a few lines of code it’s much more useful but needs maintenance.
• 2017 – @ithod adds ase convert command to ASE. It’s essentially identical to ase-convert.
• 2019 – ase convert has had 13 more commits for improvements and maintenance.


Importance of good tools

Anecdote 2: Sumo
• Sumo is a package developed in the Scanlon Materials Theory Group at UCL
• Started as “script pile”; tools for setting up and plotting VASP calculations
• Then we made a few strong design decisions:
  • Use the Pymatgen library for Vasp IO, data containers and utility functions
  • Matplotlib, with opinionated styling
  • Develop openly on Github, package nicely on PyPI


Importance of good tools

Anecdote 2: Sumo – the good
• Training time/effort drastically reduced
• Nice student reports, fewer mistakes
• When people leave the group they retain access to their work, including updates
• Support was added for Questaal, and is in progress for CASTEP
• External users provide bug fixes and ideas for new features


Importance of good tools

Anecdote 2: Sumo – the bad
• Pymatgen tends to move fast and break things; not ideal for a dependency.
• Adding support for other codes is cumbersome; partly because Pymatgen is Vasp-centric, and partly because every code has a different workflow!
• Both of the main developers have moved to development-heavy roles elsewhere. Issues and PRs can sit dormant for a while…


Importance of good tools

Anecdote 3: Galore
• Galore is a little package for simulating photoelectron spectra
• It automates some simple but tedious transformations to an orbital-projected DOS
• It was mostly developed while working for the Scanlon group at UCL, so the priority was VASP

Figure 1: Procedure (left to right) for simulated photoelectron spectrum from ab initio DOS

Photoemission spectra for the same material will vary depending on the radiation source used. The probabilities of the underlying photoionisation events are based on the radiation and orbital energies, as well as the shape of the orbital. In order to account for this it is necessary to apply weighting to states according to their photoionisation cross-sections. This is accomplished by projecting the full DOS onto contributions from atomic s, p, d, f orbitals (PDOS), as is done routinely in analysis of ab initio calculations. It is then assumed that the contributions of these orbital-projected states to the total photoelectron spectrum will be proportional to the photoionisation cross-sections of corresponding orbitals in free atoms. (This is the Gelius model. [10–12]) The free atom cross-sections have been computed by several methods and are available as reference data (e.g. [13]). Implemented with ad-hoc scripts and spreadsheets, this method has already been used in many academic studies. [14–18]

Known limitations and improvements

The Gelius model is based on photoelectrons with short wavelengths, such that the photoionisation matrix elements are dominated by regions of space close to the atomic nuclei. [10–12] This model is expected to break down at longer wavelengths, and in the case of low-energy UPS it is likely that significant scattering effects would be neglected. Asymmetry corrections may be applied to the photoionisation cross-sections to account for the polarisation of sources and angular acceptance range of electron detectors. This is especially relevant for HAXPES measurements where the experimental geometry may be deliberately changed in order to manipulate the spectrum and expose different contributions. [19]

A best-practice approach is to integrate the relevant equations over a range of angles depending on the equipment geometry. [20] Currently this data is not included in Galore, but users are able to include corrected cross-sections from a JSON-formatted data file if available. At high photon energies, it has been observed that intensity changes in oxides do not correlate with photon energy as predicted by the available tabulated data; in particular the intensity of O-2p states in CdO, PbO2 and In2O3 seem to vary more linearly than predicted. [20]

Further reading

For further information about PES there are some helpful reviews in the academic literature, including Refs 21, 22 and 23.

Jackson et al., (2018). Galore: Broadening and weighting for simulation of photoelectron spectroscopy. Journal of Open Source Software, 3(26), 773. https://doi.org/10.21105/joss.00773
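The weighting and broadening steps described in the excerpt can be sketched in plain Python. This is an illustration of the idea only, not Galore's API: the function names, orbital labels and cross-section values are made up, and no normalisation is applied.

```python
import math


def weighted_spectrum(pdos, cross_sections):
    """Sum orbital-projected DOS curves, each scaled by its cross-section.

    pdos maps an orbital label to a list of DOS values on a shared
    energy grid; cross_sections maps the same labels to weights.
    """
    n_points = len(next(iter(pdos.values())))
    total = [0.0] * n_points
    for orbital, dos in pdos.items():
        weight = cross_sections[orbital]
        for i, value in enumerate(dos):
            total[i] += weight * value
    return total


def gaussian_broaden(energies, intensities, fwhm):
    """Broaden a sampled spectrum with an unnormalised Gaussian.

    Each input point contributes a Gaussian of the given full width at
    half maximum; this is a simple O(N^2) sum over grid points.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    broadened = []
    for e_out in energies:
        total = 0.0
        for e_in, y in zip(energies, intensities):
            total += y * math.exp(-0.5 * ((e_out - e_in) / sigma) ** 2)
        broadened.append(total)
    return broadened


# Illustrative numbers only; real cross-sections come from reference data.
pdos = {"O 2p": [0.0, 1.0, 0.0], "Cd 4d": [1.0, 0.0, 0.0]}
cross_sections = {"O 2p": 0.5, "Cd 4d": 2.0}
energies = [0.0, 1.0, 2.0]

spectrum = weighted_spectrum(pdos, cross_sections)
smooth = gaussian_broaden(energies, [0.0, 1.0, 0.0], fwhm=0.5)
```

Changing the radiation source then only means swapping the table of weights; the projected DOS from the ab initio calculation is reused unchanged.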


Importance of good tools

Anecdote 3: Galore
• In Feb 2018 there was a pensions dispute and many university workers went on strike
• One challenge for striking academic researchers is that their future career depends on their outputs
• “What can I do today that won’t help UCL but furthers my selfish research interests?”
• Adding support for GPAW was really easy


Conclusions
• Researchers can be brought into contact with many different atomistic codes – not always for obvious reasons!
• Analysis packages enable new and different science, while simple utilities can dramatically affect productivity
• Open-source development ensures access and rights to benefit from your previous work
• Technical compatibility is also needed if we are to take advantage of this
  • Some codes are easier to interface than others


ASE, Vasp and CASTEP
So close, yet so far…


Two plane-wave pseudopotential codes

VASP
• Development led in Vienna, Austria
• Fortran, MPI
• Somewhat expensive
• Commercial framework MedeA
• Highlights: post-HF (RPA, MP2, BSE), meta-GGA, library of trusted PPs, magnetism, FFT parallelism, TD-HF
• Interfaces: ASE, Phonopy, ShengBTE, Pymatgen, TDEP, ATAT, BANDUP, Calypso, USPEX, AIRSS, KLMC, OCLIMAX, Sumo, Galore…

CASTEP
• Development led in Cambridge, UK
• Fortran, MPI
• Free (for non-commercial use); pretty expensive (for commercial use)
• Commercial framework Materials Studio
• Highlights: DFPT, NMR shifts, EELS, on-the-fly PPs, inbuilt help and unit conversions, preconditioned geometry optimization, efficient phonon methods
• Interfaces: ASE, ShengBTE, OptaDOS, Calypso, USPEX, AIRSS, Abins, OCLIMAX, Soprano, …


Two plane-wave pseudopotential codes

VASP
• Input files: fixed names (POSCAR, POTCAR, KPOINTS, INCAR)
• Output files: binary (e.g. WAVECAR), text (e.g. DOSCAR) and XML (vasprun.xml)
• Esoteric options: EDIFFG sets an energy criterion if positive and a force criterion if negative
• Documentation: online wiki
• Workflow: Tasks such as band structures and RPA can involve multiple runs of VASP with different input and precalculated density, orbitals. Limited restart ability.
• Missing batteries: No plotting included (Sumo). External tools needed for off-Gamma phonons (Phonopy).

CASTEP
• Input files: two files, “seedname” with extensions (seedname.cell, seedname.param)
• Output files: binary (e.g. seedname.pdos_weights) and text (e.g. seedname.phonon)
• Forgiving input: e.g. KPOINTS_MP_GRID is aliased to KPOINT_MP_GRID
• Documentation: inbuilt help
• Workflow: Band structure and optics “tasks” allow separate k-point meshes for SCF and properties. Powerful “phonon” task can restart unfinished DFPT calc.
• Missing batteries: OptaDOS is required for “basic” PDOS, optics analysis.
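The EDIFFG sign convention listed under "Esoteric options" is the kind of rule a wrapper has to encode explicitly. A minimal sketch, with a hypothetical function name; the units (eV for the energy change, eV/Å for forces) are VASP's usual ones, and the zero case is left out because the slide does not cover it:

```python
def interpret_ediffg(ediffg):
    """Translate a VASP EDIFFG value into an explicit stopping criterion.

    Positive values are an energy-change threshold, negative values a
    force threshold whose magnitude is the absolute value.
    """
    if ediffg > 0:
        return {"criterion": "energy", "threshold": ediffg, "unit": "eV"}
    if ediffg < 0:
        return {"criterion": "force", "threshold": -ediffg, "unit": "eV/Angstrom"}
    # Zero is deliberately not handled in this sketch.
    raise ValueError("EDIFFG = 0 is not covered by this sketch")


energy_rule = interpret_ediffg(1e-4)
force_rule = interpret_ediffg(-0.01)
```

Returning a labelled dictionary rather than a bare number makes the overloaded meaning visible to whatever code consumes it.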


Nice features in ASE Calculators
VASP
• The Vasp calculator provides “recipes” for methods which would usually require several flags to be set.
• These have less confusing names (‘revpbe’ -> ‘RE’)
• Per-atom properties (Hubbard U and initial magnetic moments) have awkward INCAR syntax. ASE takes care of this for you.
• Structure is transparently sorted/unsorted to reduce file size/complexity
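The sorting/unsorting idea can be sketched without ASE: atoms are grouped by species before writing, and the inverse permutation is kept so that per-atom results can be returned in the user's original order. This is an illustration with hypothetical names and an alphabetical ordering chosen for simplicity; it is not ASE's implementation.

```python
from itertools import groupby


def sort_by_species(symbols):
    """Return (order, inverse) index lists for grouping atoms by species.

    order[new] is the original index of the atom placed at position new;
    inverse[old] is the sorted position of the atom originally at old.
    The sort is stable, so atoms of one species keep their relative order.
    """
    order = sorted(range(len(symbols)), key=lambda i: symbols[i])
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos
    return order, inverse


symbols = ["O", "Ti", "O", "Ti", "O"]
order, inverse = sort_by_species(symbols)

# Grouped species and counts, as needed for a compact structure file.
sorted_symbols = [symbols[i] for i in order]
species_counts = [(s, len(list(g))) for s, g in groupby(sorted_symbols)]

# Pretend these labels are per-atom results returned in sorted order.
sorted_results = [f"{symbols[i]}{i}" for i in order]

# Undo the sort so results line up with the user's original atom order.
restored = [sorted_results[inverse[i]] for i in range(len(symbols))]
```

The edge cases mentioned later in the talk arise when only one of the two permutations is applied, or when a structure changes between the write and the read.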


Nice features in ASE Calculators
CASTEP
• The inbuilt help system is automatically read to collect information about valid keys for your build of CASTEP. This allows tab completion.
• Dict-style k-point settings have been implemented, providing access to Gamma-centered meshes – more convenient than working out offsets.
• CASTEP is case-insensitive… and so is its ASE property interface.
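"Working out offsets" refers to the fact that a Monkhorst-Pack grid with an even number of points along an axis does not contain the Gamma point. A worked sketch in fractional coordinates, using the standard Monkhorst-Pack formula; the function names are hypothetical and the way a particular code expects the offset to be written (sign, keyword) is not checked here.

```python
from fractions import Fraction


def monkhorst_pack_1d(n, offset=Fraction(0)):
    """Fractional coordinates of an n-point Monkhorst-Pack grid on one axis.

    Uses k_i = (2i - n - 1) / (2n) for i = 1..n, plus an optional offset.
    """
    return [Fraction(2 * i - n - 1, 2 * n) + offset for i in range(1, n + 1)]


def gamma_offset(n):
    """Offset that makes an n-point grid include Gamma (k = 0).

    Odd grids already contain Gamma; even grids need a shift of 1/(2n).
    """
    return Fraction(0) if n % 2 else Fraction(1, 2 * n)


plain_even = monkhorst_pack_1d(4)                     # -3/8, -1/8, 1/8, 3/8
shifted_even = monkhorst_pack_1d(4, gamma_offset(4))  # -1/4, 0, 1/4, 1/2
plain_odd = monkhorst_pack_1d(3)                      # -1/3, 0, 1/3
```

A dict-style setting that asks for a Gamma-centered mesh lets the calculator do this arithmetic per axis instead of the user.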


Challenges for ASE Calculators
VASP
• Many keywords undocumented or part of community mods/extensions
• Sorting/unsorting is complicated, leads to surprises for users – edge cases
• Standard output file easily corrupted, overflows fields
• XML file is invalid if run killed while a tag is open
• Multiple competing implementations within ASE project
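The XML problem is easy to reproduce with the standard library. The sketch below uses a toy document whose tag names are illustrative only (it is not a real vasprun.xml), checks whether it is well formed, and salvages the elements that were closed before the file was cut off. The helper names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the whole document parses, False otherwise."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


def completed_text(xml_bytes, tag):
    """Collect the text of every <tag> element closed before truncation."""
    found = []
    try:
        for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
            if elem.tag == tag:
                found.append(elem.text)
    except ET.ParseError:
        pass  # hit the truncation point; keep what was already closed
    return found


complete = (b"<modeling><calculation><energy>-10.5</energy>"
            b"</calculation></modeling>")
# Same run, killed while writing the second calculation block.
truncated = (b"<modeling><calculation><energy>-10.5</energy></calculation>"
             b"<calculation><energy>-10.")

salvaged = completed_text(truncated, "energy")
```

A whole-document parser rejects the truncated file outright, so a wrapper that wants partial results from a killed run needs an incremental approach like the second function.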


Challenges for ASE Calculators
CASTEP
• The main output files are a “human-readable” .castep file and “machine-readable” .castep_bin file
• Currently we read the .castep file – sensitive to verbosity, missing data
• The .castep_bin file should be more reliable, but…
  • Formatted binary from Fortran
  • No formal specification, version-dependent
  • In discussion with CASTEP developers to make a Python parsing library
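To give a sense of what parsing Fortran binary output involves, here is a sketch of the record layout that common Fortran compilers use for unformatted sequential files: each record is wrapped in a 4-byte length marker before and after the payload. Whether .castep_bin follows exactly this layout, and the big-endian byte order used as the default here, are assumptions; the helper names are hypothetical and the example data is synthetic.

```python
import io
import struct


def write_record(stream, payload, endian=">"):
    """Write one record: length marker, payload bytes, length marker."""
    marker = struct.pack(endian + "i", len(payload))
    stream.write(marker + payload + marker)


def read_records(stream, endian=">"):
    """Yield the payload of each record until the stream is exhausted."""
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) < 4:
            raise ValueError("truncated record marker")
        (length,) = struct.unpack(endian + "i", head)
        payload = stream.read(length)
        tail = stream.read(4)
        if len(payload) != length or len(tail) < 4:
            raise ValueError("truncated record")
        if struct.unpack(endian + "i", tail)[0] != length:
            raise ValueError("mismatched record markers")
        yield payload


# Synthetic file: a text label followed by one double-precision number.
buffer = io.BytesIO()
write_record(buffer, b"HEADER")
write_record(buffer, struct.pack(">d", -10.5))
buffer.seek(0)

records = list(read_records(buffer))
value = struct.unpack(">d", records[1])[0]
```

The markers only delimit records. Knowing what each record means (which is a label, which is an array, in what order) is exactly what a formal specification would provide.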


Challenges for ASE Calculators
CASTEP
• Parameter system is complicated
  • The seedname.cell + seedname.param file does not fit ASE’s Atoms + Calculator model – settings must be dispatched to the correct file
  • Parameter setting is not consistent with other ASE Calculators
  • Reading too much information from output – explicit defaults, redundancy
  • Incompatible parameters require logic to “clobber” existing keys
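The dispatch problem can be sketched as follows: a flat set of user settings is normalised (case folded, aliases resolved) and each key is routed to the file it belongs in. The keyword tables here are a small hand-picked subset for illustration; a real interface would obtain them from CASTEP's help system as described earlier, and the function name is hypothetical.

```python
# Illustrative subsets only, not complete keyword lists.
CELL_KEYS = {"kpoint_mp_grid", "species_pot", "symmetry_generate"}
PARAM_KEYS = {"cut_off_energy", "xc_functional", "task"}
ALIASES = {"kpoints_mp_grid": "kpoint_mp_grid"}


def dispatch(settings):
    """Split flat settings into (cell, param) dicts for the two input files.

    Keys are matched case-insensitively and aliases are mapped to their
    canonical name. If two spellings of one key are given, the later
    one clobbers the earlier one.
    """
    cell, param = {}, {}
    for key, value in settings.items():
        name = key.lower()
        name = ALIASES.get(name, name)
        if name in CELL_KEYS:
            cell[name] = value
        elif name in PARAM_KEYS:
            param[name] = value
        else:
            raise KeyError(f"unknown keyword: {key}")
    return cell, param


cell, param = dispatch({
    "Cut_Off_Energy": 500,
    "KPOINTS_MP_GRID": "4 4 4",
    "task": "SinglePoint",
})
```

Every convenience CASTEP offers a human (aliases, free capitalisation, two files) shows up here as a normalisation step the wrapper must reproduce.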


Observations
• CASTEP generally has a superior user interface
  • Several features that are helpful to humans (aliasing, auto verbosity, file distribution, units) make it harder to wrap in ASE
• VASP has a poor user interface, but is generally simpler to wrap due to
  • More predictable file contents
  • Limited formatting options within a single file
  • Use of machine-readable XML
• In both cases we have used the ASE Calculator to provide additional convenience tools.
• Vasp interface development is more active – but coordination is an issue


Acknowledgements
• Thanks to the conference organisers
• Other people did most of the work on the VASP and CASTEP calculators
• Alex Ganose (Sumo lead, now LBNL)
• STFC colleagues
  • Dominik Jochym (my boss – email dominik.jochym{at}stfc.ac.uk for academic CASTEP licenses. One email per group…)
  • Simone Sturniolo (CASTEP calculator, Python/binary interface)
• CoSeC / UKCP (supporting this trip)


@STFC_Matters Science and Technology Facilities Council