Determination of line tension in the 3D Ising model on GPUs. Benjamin Block, Tobias Preis, David Winter, Suam Kim, Peter Virnau, Kurt Binder. University of Mainz, Institute for Physics. SimGPU 2013

DESCRIPTION

This is the talk I gave at the 2nd International Symposium “Computer Simulations on GPU” (SimGPU 2013).

Page 1: Determination of line tension in the 3D Ising model on GPUs

Determination of line tension in the 3D Ising model on GPUs

Benjamin Block, Tobias Preis, David Winter, Suam Kim, Peter Virnau, Kurt Binder

University of Mainz, Institute for Physics

SimGPU 2013

Page 3: Determination of line tension in the 3D Ising model on GPUs

Topics Touched

1. Ising Model on GPU

2. Line Tension Estimation

Page 4: Determination of line tension in the 3D Ising model on GPUs

Ising Model

[Figure panels: Ordered, Random, Transition]

+ nearest-neighbor interaction

Page 5: Determination of line tension in the 3D Ising model on GPUs

Monte Carlo

Perform successive spin flips!

Probability: Metropolis criterion

Inherently serial... but
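The serial procedure described on this slide can be sketched as follows. This is a minimal illustration rather than the code used in the talk: it assumes a 2D square lattice with periodic boundaries, spins stored as +1/-1, and a coupling `J = 1`; the function name `metropolis_sweep` is made up for this example.

```python
import math
import random

def metropolis_sweep(spins, L, beta, rng, J=1.0):
    """Visit every site of an L x L periodic lattice once, in order,
    and attempt a single spin flip with the Metropolis criterion."""
    for i in range(L):
        for j in range(L):
            s = spins[i][j]
            # Sum over the four nearest neighbors (periodic wrap-around).
            nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            # Energy change if spin s is flipped.
            delta_e = 2.0 * J * s * nn
            # Metropolis: always accept downhill moves, otherwise
            # accept with probability exp(-beta * delta_e).
            if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
                spins[i][j] = -s

rng = random.Random(0)
L = 8
spins = [[1] * L for _ in range(L)]
metropolis_sweep(spins, L, beta=0.3, rng=rng)
```

Each flip depends on neighbors that may have just been updated, which is why the plain algorithm is inherently serial.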

Page 6: Determination of line tension in the 3D Ising model on GPUs

GPU Implementation

• GPUs: massively parallel processing

T. Preis, P. Virnau, W. Paul, J. J. Schneider: GPU Accelerated Monte Carlo Simulation of the 2D and 3D Ising Model, J. Comp. Phys. 228 (2009)

• Architecture specific optimization

• Multi GPU implementation

Page 7: Determination of line tension in the 3D Ising model on GPUs

Parallelization of Lattice Updates

Idea: Update non-interacting domains in parallel

Checkerboard Update
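A small sketch of why the checkerboard decomposition gives non-interacting domains, assuming a 2D periodic lattice with even edge length `L` (the helper names are hypothetical): every nearest neighbor of a site has the opposite color, so all sites of one color can be updated at the same time.

```python
def checkerboard_color(i, j):
    """Color 0 or 1 of site (i, j) on the checkerboard."""
    return (i + j) % 2

def neighbors(i, j, L):
    """Nearest neighbors of (i, j) on an L x L periodic lattice."""
    return [((i + 1) % L, j), ((i - 1) % L, j),
            (i, (j + 1) % L), (i, (j - 1) % L)]

def sites_of_color(color, L):
    """All sites that may be updated together in one half-sweep."""
    return [(i, j) for i in range(L) for j in range(L)
            if checkerboard_color(i, j) == color]

def is_independent(color, L):
    """True if no two sites of this color are nearest neighbors."""
    return all(checkerboard_color(ni, nj) != color
               for (i, j) in sites_of_color(color, L)
               for (ni, nj) in neighbors(i, j, L))
```

A full lattice sweep then consists of two half-sweeps, one per color. With periodic boundaries the argument needs an even `L`; for odd `L` the wrap-around links two sites of the same color.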

Page 11: Determination of line tension in the 3D Ising model on GPUs

Reduce slow memory access

[Figure: uint4 blocks in global memory, each handled by one thread]

Idea: Store spins in 128-bit (uint4) chunks

Access 128 spins with one memory lookup

Extract spins into local thread memory (registers) for computation
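The storage idea can be illustrated in plain Python. This sketch models a uint4 as a tuple of four 32-bit words; the bit order (spin `k` in bit `k % 32` of word `k // 32`) is an assumption made for the example, not something the slides specify.

```python
WORD_BITS = 32
WORDS_PER_CHUNK = 4                       # x, y, z, w of a uint4
SPINS_PER_CHUNK = WORD_BITS * WORDS_PER_CHUNK

def pack_chunk(spins):
    """Pack 128 one-bit spins (0 or 1) into four 32-bit words."""
    assert len(spins) == SPINS_PER_CHUNK
    words = []
    for w in range(WORDS_PER_CHUNK):
        word = 0
        for b in range(WORD_BITS):
            word |= (spins[w * WORD_BITS + b] & 1) << b
        words.append(word)
    return tuple(words)

def unpack_chunk(words):
    """Extract the 128 spins again (on a GPU this would happen
    in registers after a single 128-bit load)."""
    return [(words[k // WORD_BITS] >> (k % WORD_BITS)) & 1
            for k in range(SPINS_PER_CHUNK)]

chunk = pack_chunk([1] + [0] * 127)       # only spin 0 is set
```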

Page 17: Determination of line tension in the 3D Ising model on GPUs

Update scheme

1. Extract chunk (uint4) in thread
2. Perform computations (draw random number, evaluate Metropolis criterion)
3. Old spins XOR update pattern = new spins
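The last step of the update scheme is a single bitwise operation. A minimal sketch, with one Python integer standing in for a packed word and hypothetical helper names: the per-spin accept decisions are collected into an update pattern, and XOR flips exactly the accepted spins.

```python
def build_update_pattern(accept_flags):
    """Set bit k of the pattern if the flip of spin k was accepted."""
    pattern = 0
    for bit, accepted in enumerate(accept_flags):
        if accepted:
            pattern |= 1 << bit
    return pattern

def apply_update(old_spins, update_pattern):
    """Old spins XOR update pattern = new spins."""
    return old_spins ^ update_pattern

old = 0b1100
pattern = build_update_pattern([False, True, False, True])  # bits 1 and 3
new = apply_update(old, pattern)
```

Bits where the pattern is 0 are left unchanged, so only one write of the whole chunk back to global memory is needed.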

Page 21: Determination of line tension in the 3D Ising model on GPUs

Multispin Coding?

• Multiple spins are coded in one memory unit (128 spins in 128 bits)

• Computation is not done on the encoded spins in parallel, but serially within each chunk

• Multispin coding algorithms designed for CPUs were not efficient on GPUs

Why?

Page 22: Determination of line tension in the 3D Ising model on GPUs

Multispin Coding

Page 23: Determination of line tension in the 3D Ising model on GPUs

Array of spins (1 bit = 1 spin)

MC step (figure sequence, pages 23 to 31):

• In advance: pooled random patterns for the judgement function (for each energy level)

• Neighbors are evaluated bitwise

• Select one pattern from the pool randomly

• Construct update pattern

• Array of spins XOR update pattern = spins for next step
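The CPU-style multispin scheme with pooled random patterns might look roughly like the sketch below. It is an illustration under several assumptions that are not taken from the slides: a 2D square lattice with four neighbors, 64-bit words, coupling `J = 1`, and a layout in which bit `b` of each neighbor word holds the neighbor of bit `b` of the spin word. All names are hypothetical.

```python
import math
import random

WORD = 64
MASK = (1 << WORD) - 1

def random_pattern(p, rng):
    """Word whose bits are independently 1 with probability p."""
    word = 0
    for b in range(WORD):
        if rng.random() < p:
            word |= 1 << b
    return word

def make_pools(beta, pool_size, rng, J=1.0):
    """In advance: one pool of patterns per energy level with dE > 0."""
    return {
        4: [random_pattern(math.exp(-4.0 * beta * J), rng) for _ in range(pool_size)],
        8: [random_pattern(math.exp(-8.0 * beta * J), rng) for _ in range(pool_size)],
    }

def update_pattern(s, n1, n2, n3, n4, r4, r8):
    """Bitwise judgement for all spins of the word at once."""
    a1, a2, a3, a4 = s ^ n1, s ^ n2, s ^ n3, s ^ n4   # 1 = anti-parallel
    ge1 = a1 | a2 | a3 | a4                           # at least one
    ge2 = ((a1 & a2) | (a1 & a3) | (a1 & a4)
           | (a2 & a3) | (a2 & a4) | (a3 & a4))       # at least two
    downhill = ge2                  # dE <= 0: always flip
    one = ge1 & ~ge2 & MASK         # dE = 4J: flip where r4 is set
    none = ~ge1 & MASK              # dE = 8J: flip where r8 is set
    return downhill | (one & r4) | (none & r8)

def mc_step(s, n1, n2, n3, n4, pools, rng):
    """Select one pattern per level randomly, then XOR."""
    r4 = rng.choice(pools[4])
    r8 = rng.choice(pools[8])
    return s ^ update_pattern(s, n1, n2, n3, n4, r4, r8)

rng = random.Random(42)
pools = make_pools(beta=0.4, pool_size=16, rng=rng)
```

The pool lookup in `mc_step` is a random read from a precomputed table, and the same few patterns are reused many times.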

Page 34: Determination of line tension in the 3D Ising model on GPUs

Downsides of Pooling

• Impairs quality of simulation (the smaller the pool, the less random)

• Low flexibility (external fields...)

• Relies on a lot of precomputation and random memory lookups (GPU killer)

Page 35: Determination of line tension in the 3D Ising model on GPUs

Performance

[Bar chart, built up over pages 35 to 40, comparing four implementations: CPU simple, CPU multispin coding, GPU simple, GPU optimized]

Speedup annotations on the chart: 8x (still one core!), ~ 20x, ~ 200x

Results from 2011

2D Ising

GPU: NVIDIA Tesla S1070

CPU: Intel i7 (2.67 GHz, 1 core)

Page 41: Determination of line tension in the 3D Ising model on GPUs

Simulation on multiple GPUs

Spread spin lattice over many GPUs in different machines

Exchange border information between machines via MPI

Page 42: Determination of line tension in the 3D Ising model on GPUs

[Figure: simulation domains per GPU and border arrays]
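The domain and border layout can be sketched without any GPU or MPI code. In this simplified stand-in, a 1D periodic ring of sites is split into equal domains, each with one border cell on either side, and a plain Python copy takes the place of the MPI transfer. The function names and the 1D layout are assumptions for illustration only.

```python
def split_with_borders(ring, n_domains):
    """Split a periodic ring into domains of equal size, each padded
    with one (still empty) border cell on the left and right."""
    size = len(ring) // n_domains
    assert size * n_domains == len(ring)
    return [[None] + ring[d * size:(d + 1) * size] + [None]
            for d in range(n_domains)]

def exchange_borders(domains):
    """Fill every border cell from the neighboring domain's interior.
    In a multi-GPU run this copy would be a message between machines."""
    n = len(domains)
    for d in range(n):
        left = domains[(d - 1) % n]
        right = domains[(d + 1) % n]
        domains[d][0] = left[-2]     # last interior site of left neighbor
        domains[d][-1] = right[1]    # first interior site of right neighbor

domains = split_with_borders(list(range(8)), 2)
exchange_borders(domains)
```

Only interior cells are read during the exchange, so the order in which domains are processed does not matter.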

Page 43: Determination of line tension in the 3D Ising model on GPUs

Multi-GPU Performance

Measure: Single spin flips per GPU

Communication overhead: bottleneck for small system sizes

Page 44: Determination of line tension in the 3D Ising model on GPUs

• 64 GPUs: 256 GB video memory

• Enough for a lattice of 800,000 x 800,000 spins

• One lattice sweep: 3 seconds on pre-Fermi (S1070) hardware
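A quick back-of-the-envelope check of these numbers, assuming the spins are stored at 1 bit each as described earlier in the talk (the helper name is made up; the slide does not break down what the rest of the memory is used for):

```python
def lattice_memory_gb(lx, ly, bits_per_spin=1):
    """Memory needed for the spins alone, in GB (10^9 bytes)."""
    return lx * ly * bits_per_spin / 8 / 1e9

spins_gb = lattice_memory_gb(800_000, 800_000)   # spins of the full lattice
per_gpu_gb = 256 / 64                            # video memory per GPU
spins_per_gpu_gb = spins_gb / 64                 # spin storage per GPU
```

Under this assumption the spins alone take 80 GB in total, or 1.25 GB of the 4 GB available on each GPU.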

Page 46: Determination of line tension in the 3D Ising model on GPUs

OpenCL?

Page 47: Determination of line tension in the 3D Ising model on GPUs

Platform independence

Page 48: Determination of line tension in the 3D Ising model on GPUs

Kernels

Idea: Hide language differences in macros

Page 49: Determination of line tension in the 3D Ising model on GPUs

Macros expand to different expressions on each platform

• CUDA (Driver API)

• OpenCL

• Host C
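One way to picture the macro idea is a small script that prepends a platform-specific block of `#define` lines to a shared kernel source. The macro names (`KERNEL`, `GLOBAL_PTR`, `GLOBAL_ID_X`), the example kernel, and the `host_thread_index` variable are all invented for this sketch and are not the names used in the talk. The expansions rely only on standard keywords: `__global__`, `blockIdx`, `blockDim`, `threadIdx` in CUDA and `__kernel`, `__global`, `get_global_id` in OpenCL.

```python
PLATFORM_MACROS = {
    "cuda": {
        "KERNEL": 'extern "C" __global__ void',
        "GLOBAL_PTR": "",
        "GLOBAL_ID_X": "(blockIdx.x * blockDim.x + threadIdx.x)",
    },
    "opencl": {
        "KERNEL": "__kernel void",
        "GLOBAL_PTR": "__global",
        "GLOBAL_ID_X": "get_global_id(0)",
    },
    "host_c": {
        "KERNEL": "void",
        "GLOBAL_PTR": "",
        # Hypothetical: a host build would have to supply this index itself.
        "GLOBAL_ID_X": "host_thread_index",
    },
}

# Shared kernel source, written only in terms of the macros.
KERNEL_SOURCE = """KERNEL flip_all(GLOBAL_PTR unsigned int *spins)
{
    spins[GLOBAL_ID_X] ^= 0xFFFFFFFFu;
}
"""

def macro_prelude(platform):
    """#define lines that map the macros onto one platform."""
    return "".join(f"#define {name} {body}\n"
                   for name, body in PLATFORM_MACROS[platform].items())

def build_source(platform):
    """Source text to hand to the platform's compiler."""
    return macro_prelude(platform) + KERNEL_SOURCE

opencl_source = build_source("opencl")
```

The kernel body is written once, and only the prelude changes per platform.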

Page 50: Determination of line tension in the 3D Ising model on GPUs

Initialization

• Initialize

• Load “Device Programs” (Kernels) from source

• Create Data Containers that take care of data

Page 51: Determination of line tension in the 3D Ising model on GPUs

Run kernel with parameters

Use data on host

Page 52: Determination of line tension in the 3D Ising model on GPUs

Cross platform performance

3D Ising example

CPU: i7 Nehalem

Nvidia: GeForce GTX 580

AMD: HD 6970

Page 58: Determination of line tension in the 3D Ising model on GPUs

Results

• Downside: lowest common denominator (CUDA has a lot more features by now)

• No explicit copying needed (the containers' job)

• In our case: OpenCL was 10% slower on the NVIDIA card (GeForce GTX 580)

• Slower on a comparable AMD card (Radeon HD 6970)

• Take this with a grain of salt

Page 59: Determination of line tension in the 3D Ising model on GPUs

Nucleation

Page 60: Determination of line tension in the 3D Ising model on GPUs

Nucleation phenomena

• Nucleation is important in materials research, the atmosphere, etc.

Page 62: Determination of line tension in the 3D Ising model on GPUs

Nucleation

Phase 1 Phase 2

Induced by nuclei!

Page 63: Determination of line tension in the 3D Ising model on GPUs

[Figure panels: most spins up, most spins down]

Page 64: Determination of line tension in the 3D Ising model on GPUs

Heterogeneous Nucleation

Wall attached droplet

Page 66: Determination of line tension in the 3D Ising model on GPUs

Simulation in the Ising Model

Winter D., Virnau P., Binder K., PRL Volume 103 Issue 22 (2009)

Page 68: Determination of line tension in the 3D Ising model on GPUs

Young

Page 69: Determination of line tension in the 3D Ising model on GPUs

Free Energy of Droplet

H = 0, Θ = 90°

Winter D., Virnau P., Binder K., PRL Volume 103 Issue 22 (2009)
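For orientation, the textbook relations usually meant by "Young" and by the classical free energy of a wall-attached droplet are given below. The formulas on the slides themselves are not preserved here, so the notation (wall-vapor, wall-liquid, and liquid-vapor tensions γ_wv, γ_wl, γ_lv) is an assumption and may differ from the authors' symbols.

```latex
% Young's equation for the contact angle \Theta
\gamma_{wv} - \gamma_{wl} = \gamma_{lv} \cos\Theta

% Classical theory: barrier of a wall-attached droplet relative to
% the homogeneous (free) droplet
\Delta F^{*}_{\mathrm{het}} = \Delta F^{*}_{\mathrm{hom}} \, f(\Theta),
\qquad
f(\Theta) = \frac{(1 - \cos\Theta)^{2} (2 + \cos\Theta)}{4}

% Special case \Theta = 90^{\circ}: \cos\Theta = 0
f(90^{\circ}) = \frac{1 \cdot 2}{4} = \frac{1}{2}
```

For Θ = 90° the droplet is a hemisphere, and the classical barrier is half of the homogeneous one.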

Page 70: Determination of line tension in the 3D Ising model on GPUs

Young

Page 72: Determination of line tension in the 3D Ising model on GPUs

Line Contribution

Page 77: Determination of line tension in the 3D Ising model on GPUs

A different method...

Antiperiodic boundary conditions force and stabilize an interface

Surface field H > 0 which tilts interface

Angle is limited by geometry...

Page 78: Determination of line tension in the 3D Ising model on GPUs

Flatten geometry

Flattened geometry in dimension x allows for a stronger tilt

[Figure: simulation box with edge lengths Lx, Ly, Lz]

Page 79: Determination of line tension in the 3D Ising model on GPUs

Boundary Condition

Implementation

Simulate one extra chunk in each dimension

Page 80: Determination of line tension in the 3D Ising model on GPUs

Boundary Condition

Implementation

Periodic: Exchange borders

Page 81: Determination of line tension in the 3D Ising model on GPUs

Boundary Condition

Implementation

APBC (antiperiodic boundary conditions): Read, XOR 1, Write
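A minimal sketch of the difference between the two border treatments, assuming spins packed at 1 bit each into 32-bit words (the function names are hypothetical): a periodic border copies the source word unchanged, while an antiperiodic border XORs every spin bit with 1 before writing it.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1

def periodic_border(source_word):
    """Periodic: the border holds an exact copy of the far side."""
    return source_word

def antiperiodic_border(source_word):
    """Antiperiodic: read, XOR each one-bit spin with 1, write."""
    return source_word ^ WORD_MASK

far_side = 0b1010
border = antiperiodic_border(far_side)
```

Sites next to the border then see inverted neighbors, which is what forces an interface into the system.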

Page 82: Determination of line tension in the 3D Ising model on GPUs

Thermodynamic integration

• Vary box size in all dimensions

• Measure free energies of surfaces by integration over magnetization

Page 84: Determination of line tension in the 3D Ising model on GPUs

• Expressions (1), (2), (3) can be derived for the free energy differences in each dimension

[Equations (1) to (3) and Young's equation: formulas not preserved in this transcript]

Combination of the first two expressions allows extraction of the line tension

Page 85: Determination of line tension in the 3D Ising model on GPUs

• Expressions (1), (2), (3) can be combined into an expression for the line tension

[Formula not preserved in this transcript]

Page 86: Determination of line tension in the 3D Ising model on GPUs

Putting it together


Page 87: Determination of line tension in the 3D Ising model on GPUs

Kim et al. (2011)

T = 3.0

Page 88: Determination of line tension in the 3D Ising model on GPUs

Side view

Top view

Density Profile

3D system: 56 x 120 x 120 spins

Page 95: Determination of line tension in the 3D Ising model on GPUs

Conclusion

• Direct method to measure line tension for tilted surfaces

• Our first real world use of the Ising model on GPUs

• Optimization is important (CPU and GPU) for fair comparison

• Platform independence is possible (useful?)

• The Ising model is a good candidate for parallel processing on GPU clusters

Page 96: Determination of line tension in the 3D Ising model on GPUs

Publications

• Monte Carlo Test of the Classical Theory for Heterogeneous Nucleation Barriers
Winter, D., Virnau, P., Binder, K., Phys. Rev. Lett., Volume 103, Issue 22 (2009)

• Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising model
Block, B., Virnau, P., Preis, T., Computer Physics Communications, Volume 181, Issue 9 (2010)

• Monte Carlo Methods for Estimating Interfacial Free Energies and Line Tensions
Binder, K., Block, B., Das, S. K., Virnau, P., Winter, D., J. Stat. Phys. (2011)

• Platform independent, efficient implementation of the Ising model on parallel acceleration devices
Block, B. J., Eur. Phys. J. Spec. Top. (2012)