Overview of Recent MCMD Developments
Jarek Nieplocha
CCA Forum Meeting
San Francisco
MCMD Working Group
Recent activities focus on development of specifications for CCA-based processor groups/teams
  BOFs held during CCA meetings in April and July, 2007
  Mini-Workshop held January 24, 2007
  Use cases documented and analyzed
  Wiki webpage and mailing list: https://www.cca-forum.org/wiki/tiki-index.php?page=MCMD-WG
Specifications document version 0.3
  Telecon held Sept 28, 2007
  Several other people sent good comments by email
  Issues about threads, fault tolerant environment, MPI-centric narrative and examples, ID representation
Plans
  Complete work on the spec document by the end of 2007
    Telecon, mailing list discussions and reviews
  Prototype implementation and some application evaluation
    NWChem, subsurface
Multilevel Parallelism
How can applications effectively exploit the massive amount of h/w parallelism available in petaflop-scale machines?
  Massive numbers of CPUs in future systems require algorithm and software redesign to exploit all available parallelism
Multilevel parallelism
  Divide work into parts that can be executed concurrently on groups of processors
  Can exploit massive hardware parallelism
  Increases granularity of computation => improves the overall scalability
[Figure labels: Task 1, Task 2]
Multiple Component Multiple Data
MCMD extends the SCMD (single component multiple data) model that was the main focus of CCA in SciDAC-1
  Prototype solution described at SC'05 for computational chemistry
Allows different groups of processors to execute different CCA components
Main motivation for MCMD is support for multiple levels of parallelism in applications
[Figure: NWChem example, SCMD vs. MCMD]
MCMD Use Cases
Coop Parallelism
Hierarchical Parallelism in Computational Chemistry
Ab Initio Nuclear Structure Calculations
Coupled Climate Modeling
Molecular Dynamics, Multiphysics Simulations
Fusion use-case described at Silver Springs Meeting
Target Execution Model and Global Ids
Global id specification
  global id = <machine id> + <job id> + <task/process rank> + <thread id>
[Figure labels: Single/Multiple mpiruns; MPI Tasks/Processes; Threads]
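The global id composition above can be sketched as a small value type. This is only an illustration, not the representation chosen in the spec document: the class name GlobalId, the field types, and the colon-separated string form are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical composite id: machine + job + task/process rank (+ thread)."""
    machine_id: str                  # assumed to be a string label
    job_id: int                      # assumed to be an integer per mpirun/job
    rank: int                        # task/process rank within the job
    thread_id: Optional[int] = None  # optional; threads may not be in the spec

    def __str__(self) -> str:
        parts = [self.machine_id, str(self.job_id), str(self.rank)]
        if self.thread_id is not None:
            parts.append(str(self.thread_id))
        return ":".join(parts)


# Two processes with the same rank in different jobs get distinct global ids.
a = GlobalId("clusterA", 7, 3)
b = GlobalId("clusterA", 8, 3)
with_thread = GlobalId("clusterA", 7, 3, thread_id=1)
```

Making the type immutable and hashable means ids can be used as dictionary keys or set members, which is convenient when teams are built from processes of several jobs.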
Group Management
Various execution models
  E.g. coop parallelism vs. single mpirun
Programming Models
  Should be MPI-Friendly but also open to other models
  MPI, Threads, GAS models including GA, UPC, HPCS languages
Global process and team ids
Group translators
CCA Processor Teams
We propose to use a slightly different term, process(or) teams, rather than groups
  Avoid confusion with existing terminology and interfaces in programming models
  Some use cases call for something more general than MPI groups
    e.g., COOP with multiple mpiruns
  For example, a CCA team can encompass a collection of processes in two different MPI jobs. We cannot construct a single MPI group corresponding to that.
Operations on CCA teams might not have a direct mapping to group operations in programming models that support groups
[Figure labels: MPI Job A; MPI Job B; MPI groups; CCA Process Team]
CCA Team Service
How do we initialize the application?
  COOP example makes it non-trivial
Provides the following
  Create, destroy, compare, split teams
    More capabilities can be added as required
  Assigns global ids to tasks from one or more jobs running on one or more machines
    Global id = <machine id> + <job id> + <task id>
    Also, <thread id> if we were to support threads at component level in the future
  Locality information
    Gets the job id, machine id, task id of the given task
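The operations listed for the team service (create, destroy, compare, split, locality query) can be sketched in a few lines. Everything here is hypothetical: the class TeamService, its method names, the integer handles, and the tuple form of a global id are assumptions for illustration, not the interface in the specification document. The compare results and the color-based split are loosely modeled on MPI_Group_compare and MPI_Comm_split.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

# Assumed global id layout: (machine id, job id, task id)
GlobalId = Tuple[str, int, int]


class TeamService:
    """Hypothetical in-memory team service; not the CCA specification."""

    def __init__(self) -> None:
        self._teams: Dict[int, Tuple[GlobalId, ...]] = {}
        self._next_handle = 0

    def create(self, members: Iterable[GlobalId]) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._teams[handle] = tuple(members)
        return handle

    def destroy(self, handle: int) -> None:
        del self._teams[handle]

    def members(self, handle: int) -> Tuple[GlobalId, ...]:
        return self._teams[handle]

    def compare(self, a: int, b: int) -> str:
        ma, mb = self._teams[a], self._teams[b]
        if ma == mb:
            return "identical"   # same members, same order
        if set(ma) == set(mb):
            return "similar"     # same members, different order
        return "unequal"

    def split(self, handle: int,
              color_of: Callable[[GlobalId], Hashable]) -> Dict[Hashable, int]:
        """Create one sub-team per color; returns color -> new team handle."""
        by_color: Dict[Hashable, list] = {}
        for gid in self._teams[handle]:
            by_color.setdefault(color_of(gid), []).append(gid)
        return {color: self.create(m) for color, m in by_color.items()}

    @staticmethod
    def locality(gid: GlobalId) -> Dict[str, object]:
        machine, job, task = gid
        return {"machine_id": machine, "job_id": job, "task_id": task}


svc = TeamService()
team = svc.create([("m1", 1, 0), ("m1", 1, 1), ("m2", 2, 0)])
per_job = svc.split(team, color_of=lambda gid: gid[1])  # split by job id
reordered = svc.create([("m2", 2, 0), ("m1", 1, 0), ("m1", 1, 1)])
```

Splitting by job id recovers the per-job subsets of a team that spans several jobs, which is the case a single MPI group cannot express.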
Plugins
[Figure labels: CCA Team Service; Interoperable Group Service Layer; MPI Group Service; GA Group Service; PVM Group Service; XYZ Prog Model's Group Service]
Provide mappings between CCA teams and the task/image/thread groups of the programming models that components are written in
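One way to picture the plugin idea is a registry that hands a CCA team to a per-model translator. The sketch below is an assumption-laden illustration: the names TranslationLayer and RankListPlugin, the to_group method, and the use of plain rank lists as stand-ins for real MPI, GA, or PVM group handles are all invented for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed global id layout: (machine id, job id, task id)
GlobalId = Tuple[str, int, int]


class RankListPlugin:
    """Stand-in for a programming-model group service (hypothetical).

    A real plugin would build a native group object; this one only returns
    the task ids of the team members that belong to the plugin's own job.
    """

    def __init__(self, model: str, job_id: int) -> None:
        self.model = model
        self.job_id = job_id

    def to_group(self, team: Sequence[GlobalId]) -> List[int]:
        return [task for (_machine, job, task) in team if job == self.job_id]


class TranslationLayer:
    """Hypothetical registry mapping a model name to its plugin."""

    def __init__(self) -> None:
        self._plugins: Dict[str, RankListPlugin] = {}

    def register(self, plugin: RankListPlugin) -> None:
        self._plugins[plugin.model] = plugin

    def translate(self, model: str, team: Sequence[GlobalId]) -> List[int]:
        return self._plugins[model].to_group(team)


layer = TranslationLayer()
layer.register(RankListPlugin("PVM", job_id=1))
layer.register(RankListPlugin("MPI", job_id=2))

# A team spanning a PVM job (job 1) and an MPI job (job 2).
cca_team = [("m1", 1, 0), ("m1", 1, 1), ("m2", 2, 0), ("m2", 2, 3)]
mpi_ranks = layer.translate("MPI", cca_team)
pvm_tasks = layer.translate("PVM", cca_team)
```

Keeping the translators behind a registry means a new programming model can be supported by adding a plugin, without changing the team service itself.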
Example
[Figure: Coupled System with Land, Ocean, and I/O components (Ocean Model, Land Model, I/O); process groups PVMProcGroup, GAProcGroup, MPIProcGroup; Global CCA Team spanning PVM Job A and MPI/GA Job B]
Specification Document
Version 0.3 on wiki (Word, PDF)
Please review and contribute
Looking at candidate applications and component s/w for initial evaluation
  Numerical, I/O
Issues from the Telecon
Eliminate threads from the spec
Add more emphasis on mixing multiple programming models
How do we handle global ids?
  Pros and cons of using integers
  Conclusion is to use "global ids" as objects and introduce a new representation called "global ranks".
Need for dynamic team management
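The slides do not define how "global ranks" are derived from global id objects. As one possible reading, offered purely as an assumption, a global rank could be a dense integer assigned by sorting the id objects deterministically. The function name assign_global_ranks and the tuple layout of the ids are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Assumed global id layout: (machine id, job id, task id)
GlobalId = Tuple[str, int, int]


def assign_global_ranks(global_ids: Iterable[GlobalId]) -> Dict[GlobalId, int]:
    """Map each distinct global id to a dense integer rank 0..n-1.

    Assumption: ranks follow the sorted order of the ids, so every process
    that sees the same set of ids computes the same mapping.
    """
    ordered = sorted(set(global_ids))
    return {gid: rank for rank, gid in enumerate(ordered)}


ranks = assign_global_ranks([("m2", 2, 0), ("m1", 1, 1), ("m1", 1, 0)])
```

This keeps the structured id as the primary identity while still giving code a compact integer to index arrays with, which is one of the usual arguments for integer ids.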
Dynamic Behavior
We want to support the dynamic nature of applications
  Application composed of parallel jobs that are launched and complete at different stages of application execution
  Fault tolerance in the style of FT-MPI
    Adaptation to faults
Teams can shrink/expand. Cannot count on persistence of values returned by team service calls.
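The consequence of shrinking and expanding teams is that a value obtained from the team service may be stale by the time it is used. The sketch below illustrates this with a hypothetical DynamicTeam class; the generation counter is an assumption added here as one way a caller could detect that cached values need to be re-queried, and is not taken from the spec.

```python
from typing import Hashable, Iterable, List


class DynamicTeam:
    """Hypothetical team whose membership can change at run time."""

    def __init__(self, members: Iterable[Hashable]) -> None:
        self._members: List[Hashable] = list(members)
        self.generation = 0  # bumped on every membership change (assumption)

    def size(self) -> int:
        return len(self._members)

    def rank_of(self, member: Hashable) -> int:
        return self._members.index(member)

    def remove(self, member: Hashable) -> None:
        """Shrink the team, e.g. after a process failure."""
        self._members.remove(member)
        self.generation += 1

    def add(self, member: Hashable) -> None:
        """Expand the team, e.g. when a new job is launched."""
        self._members.append(member)
        self.generation += 1


team = DynamicTeam(["p0", "p1", "p2"])
cached_rank = team.rank_of("p2")   # value valid for generation 0 only
seen_generation = team.generation

team.remove("p0")                  # team shrinks after a fault

# The cached value must not be reused once the generation has changed.
if team.generation != seen_generation:
    current_rank = team.rank_of("p2")
else:
    current_rank = cached_rank
```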