31
155 South 1452 East Room 380 Salt Lake City, Utah 84112 1-801-585-1233 This research was sponsored by the National Nuclear Security Administration under the Accelerating Development of Retrofitable CO2 Capture Technologies through Predictivity program through DOE Cooperative Agreement DE-NA0000740 Integration of Reverse Monte-Carlo Ray Tracing within Uintah Todd Harman Department of Mechanical Engineering Jeremy Thornock Department of Chemical Engineering Isaac Hunsaker Graduate Student Department of Chemical Engineering

155 South 1452 East Room 380 Salt Lake City, Utah 84112 1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Embed Size (px)

Citation preview

Page 1: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

155 South 1452 East Room 380 Salt Lake City, Utah 84112

1-801-585-1233

This research was sponsored by the National Nuclear Security Administration under the Accelerating Development of Retrofitable CO2 Capture Technologies through Predictivity program through DOE Cooperative Agreement DE-NA0000740

Integration of Reverse Monte-Carlo Ray Tracing within Uintah

Todd HarmanDepartment of Mechanical Engineering

Jeremy Thornock

Department of Chemical Engineering

Isaac HunsakerGraduate Student

Department of Chemical Engineering

Page 2: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

• Year 2: Demonstration of a fully-coupled problem using RMCRT within ARCHES.

Scalability demonstration.

DeliverablesDeliverables

Page 3: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

ApproachApproach

CFD:

Finest level, (always)

RMCRT:

1 Level: CFD

2 Level: coarsest level

“Data Onion”: finest level,

Research Topic: Region of Interest (ROI)

Page 4: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

2 Levels2 Levels

2 Levels

Page 5: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Data OnionData Onion

3-Levels

Page 6: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Data Onion: ROIData Onion: ROI Implemented

Research Topic: ROI location?

Static:

• User defined region?

Page 7: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Data Onion: ROIData Onion: ROI Implemented

Research Topic: ROI location

Dynamic: • ROI computed every timestep? (abskg sigmaT4)

• ROI proportional to the size of fine level patches?

Page 8: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Status: CompletedStatus: Completed

80% Complete: Data Onion, dynamic & static region of interests.

Testing phase, need benchmarks.

90% Complete: Integration of RMCRT tasks within ARCHES

(2 level)

Page 9: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Status: Work in ProgressStatus: Work in Progress• Single Level

Verification Order of accuracy

# rays (old)

grid resolution

Scalability studies, new mixed scheduler.

• 2 Levels verification

Errors associated with coarsening

Page 10: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Benchmark ProblemBenchmark Problem

S. P. Burns and M.A Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.

Initial Conditions:

- Uniform temperature field

- Analytical function for absorption coefficient

Page 11: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Verification: 1LVerification: 1L

S. P. Burns and M.A Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.

Page 12: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Verification: 1LVerification: 1L

Page 13: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Verification: 2LVerification: 2L

4X error from coarsening abskg

Page 14: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Verification: 2LVerification: 2L

Coarsening: smoothing filter

Error

Abskg

Page 15: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

CollaborationCollaboration

Leverage the work of Dr. Berzin’s team

Hybrid MPI-threaded Task Scheduler (Qingyu Meng)

GPU-RMCRT (Alan Humphrey)

Page 16: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Hybrid MPI-threaded Task SchedulerHybrid MPI-threaded Task Scheduler

Hybrid MPI-threaded Task Scheduler*:

• Memory reduction!

• 13.5Gb -> 1GB per node (12 cores/node)*.

(2 material CFD problem, 20483 cells, on 110592 cores of Jaguar)

• Interconnect drivers and MPI software must be threadsafe.

• RMCRT requires an MPI environmental variable expert!

*Q. Meng, M. Berzins, and J. Schmidt, Using hybrid parallelism to improve memory use in uintah. In Proceeding of the Teragrid 2011.

Page 17: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

MPI-threaded Task SchedulerMPI-threaded Task Scheduler Kraken

100 rays per cell

Page 18: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

MPI-threaded Task SchedulerMPI-threaded Task Scheduler

Difficult to run on Kraken, crashing in mvapich

Further testing needed on bigger machines?

Page 19: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

GPU-RMCRT

Motivation - Utilize all available hardware

Uintah’s asynchronous task-based approach is well suited to take

advantage of GPUs

RMCRT is ideal for GPUs

Keeneland Initial Delivery System360 GPUs

DoE Titan1000s of GPUs

Nvidia M2070/90 Tesla GPU

Multi-core CPU

+

Page 20: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

GPU-RMCRT

• Offload Ray Tracing and RNG to GPU(s)

Available CPU cores can perform other computation.

• Uintah infrastructure supports GPU task scheduling and execution:

Can access multiple GPUs on-node

Uses Nvidia CUDA C/C++

• Using NVIDIA cuRAND Library

GPU-accelerated random number generation (RNG)

Page 21: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Uintah Hybrid CPU/GPU Scheduler

• Create & schedule CPU & GPU tasks

• Enables Uintah to “pre-fetch” GPU data

• Uintah infrastructure manages:

• Queues of CUDA Stream and Event handles

• Device memory allocation and transfers

• Utilize all available: CPU cores and GPUs

Page 22: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Uintah GPU Scheduler Abilities

• Capability jobs run on:

Keeneland Initial Delivery System (NICS)

1440 CPU cores & 360 GPUs simultaneously

Jaguar - GPU partition (OLCF)

15360 CPU cores & 960 GPUs simultaneousl

• Development of GPU RMCRT prototype underway.

Page 23: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Status: PendingStatus: Pending• Head-to-head comparison of RMCRT with Discrete Ordinates Method.

Single level.

Accuracy versus computational cost.

• 2 Levels:

Coarsening error for variable temperature and radiative properties.

• Data Onion:

Serial performance

Accuracy versus number of levels, refinement ratio, dynamic/static ROI.

Scalability Studies

Page 24: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

SummarySummary

• Order of Accuracy: # rays0.5

, grid Cells1

• Accuracy issues related to coarsening data.

• Cost = f( #rays, Grid Cells1.4-1.5 communication….)

Doubling the grid resolution = 20ish X increase in cost.

• Good scalability characteristics

Year 2: Demonstration of a fully-coupled problem using RMCRT within ARCHES.

Scalability demonstration.

SummarySummary

Page 25: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Acknowledgements:

DoE for funding the CSAFE project from 1997-2012, DOE NETL, DOE NNSA, INCITE

NSF for funding via SDCI and PetaApps

Keeneland Computing Facility, supported by NSF under Contract OCI-0910735 Oak Ridge Leadership Computing Facility – DoE Jaguar XK6 System (GPU partition)

http://www.uintah.utah.edu

GPU RMCRTGPU RMCRT

Page 26: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

PhysicsPhysics

• Isotropic scattering added to the model

• Verification testing performed using an exact solution (Siegel, 1987)

• Grid convergence analysis performed

• Discrepancy diminishes with increased mesh refinement

Page 27: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Isotropic Scattering:VerificationIsotropic Scattering:Verification

Seigel, R. “Transient Radiative Cooling of a droplet-filled layer,” ASME Journal of Heat Transfer,109:159-164, 1987.

Benchmark Case of Seigel 1987

• Cube (1m3)

• Uniform Temperature 64.7K

• Mirror surface on all sides

• Black top and bottom walls

• Computed surface fluxes on top & bottom walls

• 10 rays per cell (low)

Page 28: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Isotropic Scattering:VerificationIsotropic Scattering:Verification

Radiative Flux vs Optical Thickness

Seigel, R. “Transient Radiative Cooling of a droplet-filled layer,” ASME Journal of Heat Transfer,109:159-164, 1987.

RMCRT (dots)Exact solution (lines)

Page 29: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

Isotropic Scattering:VerificationIsotropic Scattering:Verification

Grid convergence of the L1 error norms where the scattering coefficient is 8 m-1, and the absorption coefficient is 2m-1.

Page 30: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

DOM vs RMCRTDOM vs RMCRT

IFRF burner simulation (production size run)

• 1344 processors/cores

• Initial conditions taken from a previous run with DOM.

• Domain: (1m x 4.11 m x 1m)

• Resolution: (4.4mm x 8.8mm x 4.4mm) 24 million cells

Page 31: 155 South 1452 East Room 380  Salt Lake City, Utah 84112  1-801-585-1233 This research was sponsored by the National Nuclear Security Administration

DOM vs RMCRTDOM vs RMCRT