29
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje 1 Xiaoye Sherry Li Alexander Hexemer 1 Computational Research Advanced Light Source Lawrence Berkeley National Laboratory 09.10.14 International Conference for Parallel Processing 2014, Minneapolis

High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Embed Size (px)

Citation preview

Page 1: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

High-Performance Inverse Modelingwith Reverse Monte Carlo Simulations

Abhinav Sarje1

Xiaoye Sherry Li Alexander Hexemer1Computational Research Advanced Light Source

Lawrence Berkeley National Laboratory

09.10.14

International Conference for Parallel Processing 2014, Minneapolis

Page 2: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Background and Motivation

• X-ray scattering is an important tool for scientists to probe structuralproperties of materials at nano-scales.

• SAXS, Small Angle X-ray scattering, is a widely used type.

• SAXS is primarily used for non-crystalline/amorphous materials.

• Data challenge: Currently about 25-50 TB / month of raw data fromONE beamline, and growing nearly exponentially. A Synchrotron has10s-100 beamlines.

• Beamline scientists have not had HPC resources available. Need tobridge the gap.

• Many materials are too complex to use sophisticated modelingtechniques.

• Reverse Monte Carlo simulation works well with SAXS data.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 3: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

The General RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 4: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Faster Updates of Fourier Transform

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 5: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

A Scaled RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 6: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

A Scaled RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 7: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

A Scaled RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 8: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

A Scaled RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 9: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

A Scaled RMC Modeling Algorithm

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 10: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Autotuned Model Temperature

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 11: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Distributed-Memory Parallelization over Multiple Tiles

• MPI for distributed memory level.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 12: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Distributed-Memory Parallelization over Multiple Tiles

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 13: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Distributed-Memory Parallelization of a Single Tile

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 14: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Subtile Implementations

• Two main types of kernels:1 Data parallel.2 Reduction.

• Multicore CPU implementation with OpenMP.

• Graphics Processor (GPU) implementation with Nvidia CUDA.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 15: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Validation with Model Reconstruction

ActualModels

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 16: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Validation with Model Reconstruction

ActualModels

Start Models

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 17: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Validation with Model Reconstruction

ActualModels

Start Models

ReconstructedModels

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 18: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Model Reconstruction

Simulated and Experimental Patterns Computed Model

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 19: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Environment

• All computations are in double precision.

1 Single node platform:• Dual-socket 2.2 GHz 8-core Intel Xeon E5-2660 (Sandy Bridge)• 16 cores and 32 hardware threads with 2 NUMA regions, and 64 GB RAM• Nvidia K20X (Kepler) graphics processor.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 20: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Number of iterations

101

102

103

104

105

106

107

108

5 10

20

40

80

160

320

500

Tim

e [m

s]

Number of iterations [x1000]

TotalFFT Update

ReductionMiscMPI

Multicore CPU

101

102

103

104

105

106

107

108

5 10

20

40

80

160

320

500

Tim

e [m

s]

Number of iterations [x1000]

TotalFFT Update

ReductionMiscMPI

Kepler GPU

• Tile size of 512 × 512.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 21: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Scaling with number of tiles

101

102

103

104

105

106

107

108

1 2 4 8 16

32

64

Tim

e [m

s]

Number of tiles

TotalFFT Update

ReductionMiscMPI

Multicore CPU

101

102

103

104

105

106

107

108

1 2 4 8 16

32

64

Tim

e [m

s]

Number of tiles

TotalFFT Update

ReductionMiscMPI

Kepler GPU

• Total of 10,000 iterations.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 22: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Scaling with tile size

101

102

103

104

105

106

107

108

64

128

256

512

102

4

204

8

409

6

Tim

e [m

s]

Tile size

TotalFFT Update

ReductionMiscMPI

Multicore CPU

101

102

103

104

105

106

107

108

64

128

256

512

102

4

204

8

409

6

Tim

e [m

s]

Tile size

TotalFFT Update

ReductionMiscMPI

Kepler GPU

• Total of 40,000 iterations.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 23: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Environment

1 Multicore CPU cluster:• Cray XE6 supercomputer (‘Hopper’ at NERSC/LBNL)• Each node is a dual-socket 2.1 GHz 12-core AMD MagnyCours processors• 24 cores per node with 4 NUMA domains, and 32 GB RAM

2 GPU cluster:• Cray XK7 supercomputer (‘Titan’ at OLCF/ORNL)• Each node is a 2.2 GHz 16-core AMD Opteron 6274 (Interlagos)• 16 cores per node with 2 NUMA domains, and 32 GB RAM• One Nvidia K20X (Kepler) GPU on each node.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 24: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Weak scaling

104

105

106

107

24

48

96

192

384

768

153

6

307

2

614

4

122

88

Tim

e [m

s]

Number of Cores

t=2p, s=2048t=p, s=2048t=p, s=512

t=p/2, s=512

Hopper

104

105

106

1 2 4 8 16

32

64

128

256

512

102

4

Tim

e [m

s]

Number of Nodes (GPUs)

t=2p, s=2048t=p, s=2048t=2p, s=512t=p, s=512

Titan

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 25: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Experimental Performance: Strong scaling

104

105

106

107

108

109

1010

24

48

96

192

384

768

153

6

307

2

614

4

122

88

Tim

e [m

s]

Number of Cores

t=1024, s=2048t=1024, s=512t=128, s=2048t=128, s=512

Hopper

103

104

105

106

107

108

1 2 4 8 16

32

64

128

256

512

Tim

e [m

s]

Number of Nodes (GPUs)

t=1024, s=2048t=1024, s=512t=128, s=512

Titan

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 26: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Conclusions

1 Synchrotron beamlines need HPC for data processing to be efficient.

2 Monte-Carlo and Reverse Monte-Carlo simulations are highly usedmethods in scientific computing, and same parallelization techniquescan be applied.

3 RMC works when all other sophisticated methods fail.

4 Reduction operations are nearly one order of magnitude slower onGPUs compared to multicore CPUs. Better caching and lesssynchronizations on CPUs are two primary factors.

5 Even with large number of reduction kernels in the application, at scalethe overall performance on GPUs is about an order of magnitudebetter than on multicore CPUs!

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 27: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Future Work

1 Non-binary model reconstruction.

2 3-D structures reconstruction.

3 Time-series fitting.

4 Realtime?

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 28: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Acknowledgements

• Supported by the Director, Office of Science, of the U.S. Department ofEnergy under Contract No. DE-AC02-05CH11231, and DOE EarlyCareer Research grant awarded to Alexander Hexemer.

• Resource usage at National Energy Research Scientific Computing Center(NERSC) at the Lawrence Berkeley National Laboratory, supported bythe Office of Science of the U.S. Department of Energy under ContractNo. DE-AC02- 05CH11231.

• Resource usage at Oak Ridge Leadership Computing Facility (OLCF) at theOak Ridge National Laboratory, supported by the Office of Science of theU.S. Department of Energy under Contract No. DE-AC05-00OR22725.

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov

Page 29: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced

Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions

Thank you!

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov