Using the PETSc Parallel Software Library in Developing MPP Software for Calculating Exact Cumulative Reaction Probabilities for Large Systems

M. Minkoff and A. Wagner, ANL (MCS/CHM)

Outline: Introduction, Problem Description, MPP Software Tools, Computations, Future Directions
Parallelization of Cumulative Reaction Probabilities (CRP) with PETSc
M. Minkoff (ANL/MCS) and A. Wagner (ANL/CHM)

Goal: calculation of gas-phase rate constants.
– Develop a highly scalable and efficient parallel algorithm for calculating the Cumulative Reaction Probability, P.
– Use parallel subroutine libraries for newer generations of parallel machines to develop parallel CRP simulation software.
– Implement the Miller and Manthe (1994) method for the time-independent solution of P in parallel: P is determined from an eigenvalue problem whose operator involves two Green's functions. The eigenvalues are obtained with a Lanczos method, and the Green's functions are evaluated via a GMRES iteration with a diagonal preconditioner.
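The Miller and Manthe scheme described above can be sketched serially on a toy problem. This is a minimal illustration, not the authors' code: the small tridiagonal "Hamiltonian" and the absorbing-potential profiles eps_r and eps_p are hypothetical placeholders, scipy's ARPACK-based eigs stands in for the Lanczos iteration, and a direct sparse factorization replaces the parallel GMRES solves for robustness.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
# Toy tridiagonal "Hamiltonian" (finite-difference kinetic term + a ramp potential)
main = 2.0 + np.linspace(0.0, 1.0, n)
off = -np.ones(n - 1)
H = sp.diags([off, main, off], [-1, 0, 1], format="csc")

E, eta = 2.5, 0.2                      # total energy and small imaginary shift
A = (E + 1j * eta) * sp.identity(n, format="csc") - H
lu = spla.splu(A)                      # direct solve stands in for parallel GMRES here

# Hypothetical absorbing potentials in the reactant and product regions
eps_r = np.zeros(n); eps_r[:20] = 0.1
eps_p = np.zeros(n); eps_p[-20:] = 0.1

# P(E) = 4 eps_r^(1/2) G(E) eps_p G_dagger(E) eps_r^(1/2), applied matrix-free
def P_apply(v):
    w = np.sqrt(eps_r) * v
    w = lu.solve(w, trans="H")         # G_dagger(E) w  (solve A^H y = w)
    w = eps_p * w
    w = lu.solve(w)                    # G(E) w
    return 4.0 * np.sqrt(eps_r) * w

P = spla.LinearOperator((n, n), matvec=P_apply, dtype=complex)

# A few of the largest eigenvalues of P(E) -- the dominant reaction probabilities
vals = spla.eigs(P, k=4, which="LM", return_eigenvectors=False)
print(np.sort(vals.real)[::-1])
```

In the talk's parallel setting the direct factorization would be replaced by the GMRES iteration, and the eigenvalue loop distributed via PETSc and MPI.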
Benefits of using PETSc

– Sparsity: PETSc supports arbitrarily sparse data structures.
– GMRES: PETSc provides GMRES as an option for the linear solves.

Present tests involve problems in dimensions 3 to 6. Testing is underway on an SGI Power Challenge (ANL) and an SGI/Cray T3E (NERSC). Portability is provided via MPI and PETSc, so higher-dimensional systems are planned for future work.
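PETSc's default AIJ matrix format is compressed sparse row (CSR) storage. As a rough illustration of that layout, here is the same idea using scipy's CSR type (an analogue, not PETSc itself): a matrix is assembled in triplet form and converted to the row-pointer / column-index / value arrays.

```python
import numpy as np
import scipy.sparse as sp

# Assemble a small matrix in triplet (COO) form, then convert to CSR --
# the same compressed-sparse-row layout as PETSc's default AIJ format.
rows = np.array([0, 0, 1, 2, 2, 2])
cols = np.array([0, 2, 1, 0, 1, 2])
vals = np.array([4.0, -1.0, 3.0, -1.0, -2.0, 5.0])
A = sp.coo_matrix((vals, (rows, cols)), shape=(3, 3)).tocsr()

print(A.indptr)   # row pointers: where each row's entries start
print(A.indices)  # column index of each stored nonzero
print(A.data)     # the nonzero values themselves
```

Only the nonzeros are stored, which is what makes the high-dimensional Hamiltonians here tractable.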
Chemical Dynamics Theory
3 angles, 3 stretches: 6 degrees of freedom
Chemical Dynamics Theory

How fast do chemicals react? The rate constant "k" determines it:
– d[X]/dt = -k1[X][Y] + k2[Z][Y]
– many rates are at work in devices
– rates express the interactions in the chemistry
– individual rates are measurable and calculable
– rates depend on T and P
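As a concrete instance of such a rate law, the equation above can be integrated numerically. The rate constants and the reaction scheme (X and Z interconverting, with Y acting catalytically so [Y] is constant) are illustrative assumptions, not values from the talk.

```python
from scipy.integrate import solve_ivp

# Hypothetical rate constants (illustrative values only)
k1, k2 = 0.8, 0.3

def rhs(t, c):
    X, Y, Z = c
    dX = -k1 * X * Y + k2 * Z * Y   # d[X]/dt = -k1[X][Y] + k2[Z][Y]
    return [dX, 0.0, -dX]           # Y is catalytic; X + Z is conserved

# Integrate from [X]=1, [Y]=1, [Z]=0 out to equilibrium
sol = solve_ivp(rhs, (0.0, 50.0), [1.0, 1.0, 0.0], rtol=1e-8, atol=1e-10)
X_end, Y_end, Z_end = sol.y[:, -1]
print(X_end, Z_end)   # at equilibrium [X]/[Z] = k2/k1
```

Computing the k's themselves from first principles is exactly what the CRP calculation below is for.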
Chemical Dynamics Theory

Rates are related to the Cumulative Reaction Probability (CRP), N(E):

N(E) = Tr[P(E)]

N(E) = 4 Tr[ ε_r^(1/2) G(E) ε_p G†(E) ε_r^(1/2) ]

where ε_r and ε_p are absorbing potentials in the reactant and product regions.
Chemical Dynamics Theory

Probability Operator and Its Inverse
– The probability method calculates a few of the largest eigenvalues via iterative methods. The iterative evaluation involves the action of two Green's functions.
– The inverse probability method involves a direct calculation at each iteration to obtain a few of the smallest eigenvalues. At each iteration the action of the Green's function on a vector is required, which leads to solving linear systems involving the Hamiltonian.
Chemical Dynamics Theory

The Green's functions have the form:

G(E) = (E + iε - H)^(-1)

and so we need to solve two linear systems (at each iteration) of the form:

(E + iε - H) y = x, where x is known.

This system is solved via GMRES with preconditioning (initially diagonal scaling).
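A serial sketch of this solve, using scipy's GMRES with a diagonal (Jacobi) preconditioner in place of PETSc's parallel KSP/PC machinery. The toy Hamiltonian and the values of E and eta are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 300
rng = np.random.default_rng(0)
# Toy sparse Hermitian "Hamiltonian", diagonally dominant for easy convergence
diag = 4.0 + rng.random(n)
off = -0.5 * np.ones(n - 1)
H = sp.diags([off, diag, off], [-1, 0, 1], format="csr")

E, eta = 2.0, 0.1
A = sp.identity(n) * (E + 1j * eta) - H   # (E + i*eta - H)

# Diagonal-scaling preconditioner: M^{-1} applies 1/diag(A)
d = A.diagonal()
M = spla.LinearOperator((n, n), matvec=lambda x: x / d, dtype=complex)

# Manufacture a right-hand side from a known solution, then solve
x_true = rng.random(n) + 1j * rng.random(n)
b = A @ x_true
y, info = spla.gmres(A, b, M=M, restart=50, maxiter=1000)

print(info, np.linalg.norm(A @ y - b) / np.linalg.norm(b))
```

In the actual application two such complex shifted-Hamiltonian systems are solved per Lanczos iteration, distributed over processors by PETSc.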
PETSc: Portable, Extensible Toolkit for Scientific Computing

– Focus: data structures and routines for the scalable solution of PDE-based applications
– Object-oriented design using mathematical abstractions
– Freely available and supported research code, available via http://www.mcs.anl.gov/petsc
– Usable in C, C++, and Fortran 77/90 (with minor limitations in Fortran 77/90 due to their syntax)
– Users manual and hyperlinked manual pages for all routines
– Many tutorial-style examples
– Support via email: [email protected]

Satish Balay, William Gropp, Lois McInnes, and Barry Smith
MCS Division, Argonne National Laboratory
[Diagram: PETSc software layers, top to bottom]
– PETSc PDE application codes
– ODE Integrators · Nonlinear Solvers, Unconstrained Minimization · Visualization Interface
– Linear Solvers: Preconditioners + Krylov Methods
– Grid Management · Object-Oriented Matrices, Vectors, Indices
– Profiling Interface
– Computation and Communication Kernels: MPI, MPI-IO, BLAS, LAPACK

Applications can interface to whatever abstraction level is most appropriate.
PETSc Numerical Components

– Matrices: Compressed Sparse Row (AIJ), Blocked Compressed Sparse Row (BAIJ), Block Diagonal (BDIAG), Dense, Other
– Index Sets: Indices, Block Indices, Stride, Other
– Vectors
– Nonlinear Solvers: Newton-based methods (Line Search, Trust Region), Other
– Preconditioners: Additive Schwarz, Block Jacobi, Jacobi, ILU, ICC, LU (sequential only), Others
– Time Steppers: Euler, Backward Euler, Pseudo Time Stepping, Other
– Krylov Subspace Methods: GMRES, CG, CGS, Bi-CG-STAB, TFQMR, Richardson, Chebychev, Other
Sample Scalable Performance

[Figure: six panels vs. number of processors (128-1024) on a 600 MHz T3E with a 2.8M-vertex grid: linear iterations, nonlinear iterations, execution time, aggregate Gflop/s, Mflop/s per processor, and parallel efficiency]

– 3D incompressible Euler
– Tetrahedral grid
– Up to 11 million unknowns
– Based on a legacy NASA code, FUN3d, developed by W. K. Anderson
– Fully implicit steady-state
– Newton-Krylov-Schwarz algorithm with pseudo-transient continuation
– Results courtesy of Dinesh Kaushik and David Keyes, Old Dominion University
Computations via MPI and PETSc
5D/T3E Results for Varying Eigenvalue and G-S Method

[Figure: log-log plot vs. number of processors (1-1000), comparing modified and unmodified Gram-Schmidt (G-S) orthogonalization at eigenvalues 0.32 eV, 0.42 eV, and 0.55 eV]
Parallel Speedup: 5D/6D, ANL/SGI and NERSC/T3E

[Figure: log-log parallel speedup (5D, 6D; T3E, SGI Origin) vs. number of processors (1-100), with curves for ideal speedup, SGI Origin 5D 0.32 eV (unmodified G-S), SGI Origin 6D 0.32 eV, and T3E 6D 0.32 eV]
Storage Required for Higher Dimensions

[Figure: log-scale plot of matrix size (MWords, 0.01 to 10^4) vs. dimension, with an extrapolation curve, for an F = 3.5 cutoff at 0.32 eV (3% accuracy)]
Results and Future Work

– Achieved parallelization with modest effort: performance is suboptimal, but within perhaps 2x of optimal.
– Testing for 6D and 7D is underway; MPP CPU and memory resources can provide the necessary capacity.
– Many degrees of freedom can be approximated, so the maximum dimension needed is ~10.
– Develop block-structured preconditioning methods.
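The block-structured preconditioning mentioned as future work can be prototyped serially. Below is a minimal block-Jacobi sketch, where each diagonal block is inverted directly; the test matrix and the block size are arbitrary illustrative choices, and scipy again stands in for PETSc's parallel machinery.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, bs = 240, 8          # problem size and (hypothetical) block size
rng = np.random.default_rng(1)
diag = 4.0 + rng.random(n)
off = -np.ones(n - 1)
A = sp.diags([off, diag, off], [-1, 0, 1], format="csr")

# Block-Jacobi preconditioner: invert each bs x bs diagonal block directly
blocks = [np.linalg.inv(A[i:i + bs, i:i + bs].toarray()) for i in range(0, n, bs)]

def apply_M(x):
    y = np.empty_like(x)
    for k, B in enumerate(blocks):
        i = k * bs
        y[i:i + bs] = B @ x[i:i + bs]
    return y

M = spla.LinearOperator((n, n), matvec=apply_M)

b = rng.random(n)
y_pre, info_pre = spla.gmres(A, b, M=M, restart=30)   # block-Jacobi preconditioned
y_no, info_no = spla.gmres(A, b, restart=30)          # unpreconditioned, for comparison
print(np.linalg.norm(A @ y_pre - b) / np.linalg.norm(b))
```

Larger blocks capture more of the Hamiltonian's coupling than plain diagonal scaling, at the cost of factoring each block, which is the trade-off such a preconditioner would explore.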