VASP on GPUs: When and how

Max Hutchinson, University of Chicago (UofC)
November 18, 2015

Outline: Intro, Correctness, Usage, Road-map

Source: images.nvidia.com/events/sc15/pdfs/SC5120-vasp-gpus.pdf


Big thanks to

Carnegie Mellon group
  • Michael Widom

ENS/IFPEN group
  • Paul Fleurat-Lessard
  • Thomas Guignon
  • Ani Anciaux-Sedrakian
  • Philippe Sautet

RWTH Aachen group
  • Stefan Maintz
  • Bernhard Eck
  • Richard Dronskowski

University of Vienna group
  • Georg Kresse
  • Martijn Marsman
  • Doris Vogtenhuber

NVIDIA
  • Christoph Angerer
  • Jeroen Bedorf
  • Arash Ashari
  • Mark Berger
  • Sarah Tariq
  • Dusan Stosic
  • Paul Springer
  • Jerry Chen
  • Anthony Scudiero
  • Darko Stosic
  • Przemek Tredak
  • Cliff Woolley


What is VASP?

VASP is a complex package for performing ab-initio quantum-mechanical molecular dynamics (MD) simulations using pseudopotentials or the projector-augmented wave method and a plane wave basis set [1].

[1] VASP the GUIDE


Why VASP?

12-20% of CPU cycles @ HPC centers

Academia
  • Physics
  • Materials science
  • Physical chemistry
  • Chemical engineering

Industry
  • Materials
  • Oil and gas
  • Big semiconductor
  • Chemicals

[Figure: Usage @ Ohio SC's Oakley [2]]

[2] 12/14 – 2/15, via pbsacct


A brief history

Multiple prototypes (2009-2012)
  • Diagonalization for traditional DFT [3][4] (IFPEN, ENS, Aachen)
  • Exact-exchange for hybrid functionals [5] (CMU, UChicago)

Cooperation and tuning (2012-2014)
  • Merge prototypes with VASP 5.3.1
  • Performance tune with NVIDIA engineers

[3] M. Hacene et al., DOI:10.1002/jcc.23096
[4] S. Maintz et al., DOI:10.1016/j.cpc.2011.03.010
[5] M. Hutchinson and M. Widom, DOI:10.1016/j.cpc.2012.02.017


Acceptance and distribution (2015)
  • GPU support accepted by Vienna
  • Integrated development environments
  • Established correctness
  • To be included in standard VASP releases


Establishing correctness

We've taken a three-pronged approach to validation:

1. Internal testing against ∼50 cases collected from collaborators
     Focus on actively ported algorithms and models
2. Acceptance testing against ∼100 cases by Vienna
     Cover a wider variety of VASP usage patterns
3. Beta testing by 37 early access groups
     Cover a wider variety of hardware and environments


Beta testing

Three types of issues
  • Use of unsupported features
  • Merge with site-customized files (esp. main.F)
  • Bugs in edge cases

Generally positive feedback
  • "The short version is 'it works'"
  • "So far I found no problems, the code is fast and stable."
  • "Absolute time to solution is faster with GPUs."


Release schedule

GPU support in official release
  • Add CUDA paths and libraries to makefile.include
  • make gpu gpu_ncl
  • Executables are bin/gpu and bin/gpu_ncl

We expect the release by the end of 2015.


Feature support

Fully supported
  • Davidson
  • R-space projection
  • RMM-DIIS
  • Non-collinear
  • Exact-exchange
  • KPAR

Passively supported
  • [sc]GW[0]
  • Damped
  • All (Algo)

Unsupported
  • G-space projection
  • NCORE > 1
  • EFIELD_PEAD
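A pre-flight check for the unsupported items listed on this slide could look roughly like the sketch below. It is not part of VASP or of the GPU port. The function names are hypothetical, and the mapping of "G-space projection" to LREAL = .FALSE. (including an unset LREAL, which defaults to .FALSE.) is an assumption about how that slide item corresponds to INCAR tags.

```python
import re


def parse_incar(text):
    """Parse 'TAG = value' statements from INCAR-style text into a dict."""
    tags = {}
    for line in text.splitlines():
        # Drop trailing comments introduced by '#' or '!'.
        line = re.split(r"[#!]", line, maxsplit=1)[0]
        # Several statements may share a line, separated by ';'.
        for stmt in line.split(";"):
            if "=" in stmt:
                key, value = stmt.split("=", 1)
                tags[key.strip().upper()] = value.strip()
    return tags


def is_false(value):
    """True for Fortran-style false values such as .FALSE., .F. or F."""
    return value.strip().upper().startswith((".F", "F"))


def unsupported_features(tags):
    """List the slide's unsupported features that this tag set would trigger."""
    problems = []
    # Assumption: G-space projection means LREAL = .FALSE. (also the default).
    if is_false(tags.get("LREAL", ".FALSE.")):
        problems.append("G-space projection (LREAL = .FALSE.)")
    # NCORE defaults to 1 when it is not set.
    if int(tags.get("NCORE", "1")) > 1:
        problems.append("NCORE > 1")
    if "EFIELD_PEAD" in tags:
        problems.append("EFIELD_PEAD")
    return problems


if __name__ == "__main__":
    example = "ALGO = Normal\nLREAL = Auto\nNCORE = 4 ! tuned for the CPU run\n"
    print(unsupported_features(parse_incar(example)))
```

Running the check before submitting a job is one way to catch the "use of unsupported features" issue reported by beta testers.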


Traditional DFT

You should
  • Run with MPS (multi-process service)
  • Experiment with multiple CPU ranks per GPU

Works best
  • Large numbers of bands
  • Large numbers of plane-waves

You can expect a 2-4x speedup for large systems with CPU/GPU balance; better on GPU-heavy workstations.
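To make "multiple CPU ranks per GPU" concrete, the sketch below tabulates how ranks on one node would share GPUs under a simple round-robin assignment. The round-robin rule and the function names are assumptions for illustration only; the slides do not say how the GPU port actually binds ranks to devices.

```python
def ranks_to_gpus(ranks_per_node, gpus_per_node):
    """Return the GPU index used by each local rank (assumed round-robin)."""
    if ranks_per_node < 1 or gpus_per_node < 1:
        raise ValueError("need at least one rank and one GPU per node")
    return [rank % gpus_per_node for rank in range(ranks_per_node)]


def ranks_sharing_each_gpu(ranks_per_node, gpus_per_node):
    """Count how many ranks land on each GPU under the mapping above."""
    counts = [0] * gpus_per_node
    for gpu in ranks_to_gpus(ranks_per_node, gpus_per_node):
        counts[gpu] += 1
    return counts


if __name__ == "__main__":
    # Candidate settings to try on a node that exposes 4 GPU devices.
    for ranks in (4, 8, 16):
        print(ranks, ranks_sharing_each_gpu(ranks, 4))
```

When more than one rank shares a device, as in the 8- and 16-rank cases, MPS is what lets their kernels run concurrently on that GPU.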


Example: Si super-cell
  • 512 Si atoms
  • 1282 bands
  • 864000 PWs
  • Algo = Normal

[Figure: 2xK80 vs 2xHaswell-EP; x-axis: Nodes (1, 2, 4, 8); y-axis ticks: 0 to 4]


Hybrid functionals (exact-exchange)

You should
  • Use 1 or 2 CPU ranks per GPU
  • Set NSIM = NBAND / (2*NCPU)

Works best
  • Large numbers of plane-waves
  • Small number of ionic types

You can expect a 1.5-6x speedup, highly dependent on system size; better on GPU-heavy workstations.
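The NSIM rule of thumb can be worked through with a few lines of Python. This sketch assumes NCPU means the number of CPU (MPI) ranks in the run and that the quotient is rounded down to an integer, with a floor of 1; the slide states neither, and the function name is hypothetical.

```python
def suggested_nsim(nbands, ncpu):
    """Apply NSIM = NBAND / (2 * NCPU), rounded down and kept at 1 or more."""
    if nbands < 1 or ncpu < 1:
        raise ValueError("nbands and ncpu must be positive")
    return max(1, nbands // (2 * ncpu))


if __name__ == "__main__":
    # 216 bands, as in the boron example, for a few assumed rank counts.
    for ncpu in (4, 8, 16):
        print(f"NCPU = {ncpu:2d} -> NSIM = {suggested_nsim(216, ncpu)}")
```

For 216 bands this gives NSIM = 27, 13 and 6 for 4, 8 and 16 ranks, so the suggested block size shrinks as the bands are spread over more ranks.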


Example: β-rhombohedral boron
  • 105 Boron atoms
  • 216 bands
  • 110592 PWs
  • Algo = Normal

[Figure: 2xK80 vs 2xHaswell-EP; x-axis: Nodes (1, 2, 4, 8); y-axis ticks: 0 to 5]


Road-map: Features

1. Gamma-point for very large unit cells

2. G-space projection for small to medium unit cells

3. Van der Waals density functional (vdW-DF)

4. Random phase approximation (RPA)

5. Active support for [sc]GW[0]

6. NCORE > 1 for highly parallel runs


Road-map: Performance

Better performance for moderate sizes
  • Add blocking to all core kernels
  • Add batching to all library calls

Better performance for large sizes
  • Update Magma support
  • Merge with threaded code base to reduce ranks per GPU

Better performance for hybrid functionals
  • Parallelize outer loops
  • Pad projection sizes


Summary

GPU VASP will give you the right answer
  • Extensive testing in Beta and for Vienna's acceptance

GPU VASP will give 2-4x performance on moderate to large systems
  • The bigger the better

We are continuing to add feature support and improve performance
  • Gamma-point is next on the list

When you get GPU support in your next VASP release, try it.

