Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
A DOE Laboratory plan for providing exascale applica9ons and
technologies for cri9cal DOE mission needs
Rick Stevens and Andy White, co-‐chairs Pete Beckman, Ray Bair-‐ANL; Jim Hack, Jeff Nichols, Al Geist-‐ORNL; Horst Simon, Kathy Yelick, John Shalf-‐LBNL; Steve Ashby, Moe Khaleel-‐PNNL; Michel McCoy, Mark Seager, Brent Gorda-‐LLNL; John Morrison, Cheryl Wampler-‐LANL; James
Peery, Sudip Dosanjh, Jim Ang-‐SNL; Jim Davenport, BNL; Fred Johnson, Paul Messina, ex officio
DOE Exascale Ini9a9ve 1
Argonne Na9onal Laboratory Brookhaven Na9onal Laboratory Oak Ridge Na9onal Laboratory
Lawrence Berkeley Na9onal Laboratory Lawrence Livermore Na9onal Laboratory
Los Alamos Na9onal Laboratory Pacific Northwest Na9onal Laboratory
Sandia Na9onal Laboratory
A grand challenge for the 21st century Development of an Exascale Compu9ng System is a Grand Challenge for the 21st Century
The Exascale DraX plan has Four High-‐Level Components
• Science and engineering mission applica9ons
• Systems soXware, tools and programming models • Computer hardware and technology development
• Systems acquisi9on, deployment and opera9ons
DOE Exascale Ini9a9ve 3
The plan targets exascale pla[orm deliveries in 2018 and a robust simula9on environment and science and mission applica9ons by 2020
Co-‐design and co-‐development of hardware, system soXware, programming model and applica9ons requires intermediate (~200 PF/s) pla[orms in 2015
The plan is currently under considera9on for a na9onal ini9a9ve to begin in 2012
Three early funding opportuni9es have been release by DOE this spring to support preliminary research
Broad Community Input • Town Hall Mee9ngs April-‐June 2007 • Scien9fic Grand Challenges Workshops Nov,
2008 – Oct, 2009 – Climate Science (11/08),
– High Energy Physics (12/08), – Nuclear Physics (1/09), – Fusion Energy (3/09), – Nuclear Energy (5/09), – Biology (8/09), – Material Science and Chemistry (8/09),
– Na9onal Security (10/09)
• Exascale Steering Commifee – “Denver” vendor NDA visits 8/2009 – Extreme Architecture and Technology
Workshop 12/2009
– Cross-‐cuing workshop 2/2010
• Interna9onal Exascale SoXware Project – Santa Fe, NM 4/2009 – Paris, France 6/2009 – Tsukuba, Japan 10/2009
DOE Exascale Ini9a9ve 4
MISSION IMPERATIVES
FUNDAMENTAL SCIENCE
Science and Engineering Drivers
• Climate • Nuclear Energy • Combus9on • Advanced Materials • CO2 Sequestra9on • Basic Science
• Common Needs – Mul9scale – Uncertainty Quan9fica9on – Rare Event Sta9s9cs
DOE Exascale Ini9a9ve 5
07! 08! 09! 10! 11! 12! 13! 14! 15! 16! 17! 18! 19! 20! 21! 22! 23! 24!
Computa9onal science roadmap for a predic9ve, integrated climate model
Exascale Ini9a9ve Steering Commifee 6
10-20PF!
150PF! 1-2EF!
Science simulations:!
Climate Drivers:!
1PF!
10EF!
25!
Quantify predictive
skill for hydrologic cycle on
sub continental
space scales!
Exploration of climate
system sensitivities!
Global cloud-system
resolving model
formulations!
Non-hydrostatic
adaptive formulations!
Quantify interactions of
carbon, nitrogen,
methane and water!
Comprehensive and accurate treatment of
aerosol microphysics!
Comprehensive land ice
treatments!
Accurate and bounded
representation of climate extremes!
Direct incorporation !of “human-
dimensions” feedback
mechanisms!
Development of accurate system
initialization techniques!
Global Simulations with 1st Generation coupled carbon and nitrogen components!
Accurate treatment of 200 Km hydrological features!
Global simulations with full chemistry /biogeochemistry and fully coupled stratosphere!
Global simulations capable of accurate treatment of !• 100 Km ! hydrological! features!• Atmospheric ! composition!
Prediction experiments on decadal time scales, 100 Km hydrological features!
Large-ensemble multi-decadal prediction experiments!
Global cloud-system resolving ensemble prediction experiments!
Computa(onal science road map for predic(ve integrated modeling of nuclear systems
DOE Exascale Ini9a9ve 7
Low temperature combus9on for automobile engines can provide high efficiency, low emissions
• Design challenge: • understand new
combus9on modes, control igni9on 9ming, rate of pressure rise, etc.
• Science challenge: • low-‐temperature igni9on
kine9cs and coupling with turbulence are poorly understood
• Approach: • perform mul9-‐scale high-‐
fidelity simula9ons to predict behavior under engine relevant condi9ons
High-‐fidelity combus9on simula9ons require exascale compu9ng to predict the behavior of alterna9ve fuels in novel fuel-‐efficient, clean engines, and so facilitate design of op9mal combined engine-‐fuel system
Chemiluminescence images of stra9fied sequen9al autoigni9on in an HCCI engine courtesy John Dec, Combus9on Research Facility, Sandia Na9onal Laboratories
8 DOE Exascale Ini9a9ve
Mul9-‐scale modeling and simula9on of combus9on at exascale is key for predic9on and design
• Engine design and op9miza9on: – Design and op9miza9on of diesel/
HCCI engines, in geometry with surrogate large-‐molecule fuels representa9ve of actual fuels
• Direct numerical simula9on: – DNS of turbulent jet flames at engine
pressures (30-‐50 atm) with iso-‐butanol (50-‐100 species) for diesel or HCCI engine thermochemical condi9ons
• Molecular scale: – Full-‐scale MD with on-‐the-‐fly ab-‐ini9o
force field. Study combined physical and chemical processes affec9ng nanopar9cle and soot forma9on processes. Size of the problem: >1000 molecules
9 DOE Exascale Ini9a9ve
Evolving the Use of Computa9on for Design
Current tools for engineering
• Reynolds-‐Averaged Navier Stokes
• Calculate bulk effects of turbulence • Turbulent combus9on submodels
calibrated over narrow range
Current tools for science
• Large Eddy Simula9on (LES) with simple chemistry, normal pressures
• Capture transient behavior
• Physics-‐based turbulent combus9on submodels for standard fuels
Future tools for engineering are similar to those used for science • LES with complex chemistry, high
pressures, mul9phase transport
• Cycle-‐to-‐cycle transient behavior for control strategy design
• Physics-‐based turbulent combus9on submodels for biofuels
RANS calcula9on for fuel injector captures bulk behavior
LES calcula9on for fuel injector captures greater range of physical
scales
Mul9-‐scale materials design is cri9cal for improving efficiency and cost of photovoltaics
• Scien9fic and engineering challenges – Predic9on of excited states: energies, forces, band-‐
gaps, level alignment – Understanding and control of carrier genera9on,
recombina9on and transport – Understanding and control of defect interac9ons and
migra9on – Simula9on of interface forma9on and dynamics
• Computa9onal science challenges – Mul9-‐scale methods – Increased spa9al and 9me scale simula9ons – ab ini9o MD methods – Electronic structure methods for excited states
DOE Exascale Ini9a9ve 11
“Extreme-‐scale computa9on can create a breakthrough in the accurate predic9on of the combined effects of mul9ple interac9ng structures and their effects upon electronic excita9on and ul9mate performance of a PV cell.” Discovery in Basic Energy Sciences: The Role of Compu;ng at the Extreme Scale (Workshop Lefer Report)
From Photovoltaic Fundamentals Subpanel, BES workshop 8/2009
Full quantum simula9ons are required for understanding and deploying befer catalysts
• Efficient storage of energy requires chemical bonds
• Catalysts are crucial for inter-‐conversion of electrical and chemical energy
• Development of new catalysts requires full quantum simula9ons
• Now: 1000 atoms, 3 nm • Needed: 10,000 atoms, 6 nm
– 300,000 basis func9ons – > 1020 integrals
• Achievable with Exascale • Founda9on for mul9-‐scale
models
DOE Exascale Ini9a9ve 12
BES/ASCR Extreme scale workshop
Simula9on enables fundamental advances in basic science. • Nuclear Physics
– Quark-‐gluon plasma & nucleon structure – Fundamentals of fission and fusion reac9ons
• Facility and experimental design – Effec9ve design of accelerators – Probes of dark energy and dark mafer – ITER shot planning and device control
• Materials / Chemistry – Predic9ve mul9-‐scale materials modeling:
observa9on to control
– Effec9ve, commercial, renewable energy technologies, catalysts and baferies
• Life Sciences – Befer biofuels – Sequence to structure to func9on – Genotype to Phenotype
Slide 13
ITER
ILC
Hubble image of lensing
Structure of nucleons
These breakthrough scientific discoveries and facilities require exascale applications and resources.
Cross-‐cuing capabili9es are cri9cal to success for DOE mission and science
• Uncertainty quan9fica9on – Predict climate response to energy technology
strategies – Assessment of safety, surety and performance
of the aging/evolving stockpile without nuclear tes9ng
– Energy security
– Responding to natural and manmade hazards
• Mul9-‐scale, mul9-‐physics modeling – Mul9ple physics packages in earth system
model: ocean, land surface, atmosphere, ice
– Mul9ple physics packages in modeling reactor core: neutronics, heat transfer, structures, fluids
• Sta9s9cs of rare events – Severe weather and surprises in climate
system – Accident scenarios in nuclear energy
– Nuclea9on of cracks and damage in materials
4/13/10 Exascale Ini9a9ve Steering Commifee
Mul9-‐scale modeling in fast sodium reactor
Storm track in ESM
urban fire
Geologic sequestra9on
Pine Island Glacier
Uncertainty quan9fica9on is cri9cal and requires exascale resources
Response surface
Posterior explora9on
Finding least favorable priors Bounds on func9onals
Adjoint enabled forward models
Data extrac9on from model
Local approxima9ons, filtering Stochas9c error es9ma9on
15 DOE Exascale Ini9a9ve
EXASCALE RESOURCES Large ensembles Embedded UQ
“We need to be able to make quan9ta9ve statements about the predictability of regional clima9c variables that are of use to society.”
“scien9sts must create new suites of applica9on codes, Integrated Performance and Safety Codes (IPSCs) that incorporate …integrated uncertainty quan9fica9on..”
“computa9onal techniques and needs complement the scien9fic areas that will be pursued with extreme scale compu9ng. Examples include … verifica9on and valida9on issues for extreme scale computa9ons ”
Exascale Systems Targets
Systems 2009 2018 Difference Today & 2018
System peak 2 Pflop/s 1 Eflop/s O(1000)
Power 6 MW ~20 MW (goal)
System memory 0.3 PB 32 -‐ 64 PB O(100)
Node performance 125 GF 1.2 or 15TF O(10) – O(100)
Node memory BW 25 GB/s 2 -‐ 4TB/s O(100)
Node concurrency 12 O(1k) or O(10k) O(100) – O(1000)
Total Node Interconnect BW 3.5 GB/s 200-‐400GB/s (1:4 or 1:8 from memory BW)
O(100)
System size (nodes) 18,700 O(100,000) or O(1M) O(10) – O(100)
Total concurrency 225,000 O(billion) + [O(10) to O(100) for latency hiding]
O(10,000)
Storage Capacity 15 PB 500-‐1000 PB (>10x system memory is min)
O(10) – O(100)
IO Rates 0.2 TB 60 TB/s O(100)
MTTI days O(1 day) -‐ O(10)
Reducing power is fundamentally about
architecture choices and process technology • Memory (2x-‐5x)
– New memory interfaces (chip stacking and vias) – Replace DRAM with zero power non-‐vola9le memory
• Processor (10x-‐20x) – Reducing data movement (func9onal reorganiza9on, > 20x) – Domain/Core power ga9ng and aggressive voltage scaling
• Interconnect (2x-‐5x) – More interconnect on package – Replace long haul copper with integrated op9cs
• Data Center Energy Efficiencies (10%-‐20%) – Higher opera9ng temperature tolerance – Power supply and cooling efficiencies
The Path Forward
• Research Needed to Achieve Exascale Performance
• Extreme voltage scaling to reduce core power
• More parallelism 10x – 100x to achieve target speed • Re-‐architec9ng DRAM to reduce memory power
• New interconnect for lower power at distance • NVM to reduce disk power and accesses
• Resilient design to manage unreliable transistors • New programming models for extreme parallelism
• Applica9ons built for extreme (billion way) parallelism
Co-‐design is a fundamental tenet of the ini9a9ve
Application
Technology
⬆ Model ⬆ Algorithms ⬆ Code
Now, we must expand the co-‐design space to find beGer solu;ons: • new applica;ons & algorithms,
• beGer technology and performance.
⊕ architecture ⊕ programming model ⊕ resilience ⊕ power
Applica9on driven: Find the best technology to run this code. Sub-‐op;mal
Technology driven: Fit your applica9on to this technology. Sub-‐op;mal.
System soXware as currently implemented is not suitable for exascale system
• Barriers – System management SW not parallel
– Current OS stack designed to manage only O(10) cores on node
– Unprepared for industry shiX to NVRAM
– OS management of I/O has hit a wall – Not prepared for massive concurrency
• Technical Focus Areas – Design HPC OS to par99on and manage node
resources to support massively concurrency – I/O system to support on-‐chip NVRAM
– Co-‐design messaging system with new hardware to achieve required message rates
• Technical gaps – 10X: in affordable I/O rates – 10X: in on-‐node message injec9on rates
– 100X: in concurrency of on-‐chip messaging hardware/soXware
– 10X: in OS resource management
20 SoXware challenges in extreme scale systems, Sarkar, 2010
SoXware &
Programming
Mod
els
Hardw
are
Pla[
orms
Track 1
Hardw
are
Pla[
orms
Track n
Basic
Techno
logy
App
lied Math
Uncertainty
Quan9
fica9
on
CoDesign &
App
lica9
ons
SoXware
Conven9o
nal
Facili9
es
Prep
ara9
on
…… Outreach,
Educa9
on and
Training
SoXware &
Programming
Mod
els
Hardw
are
Pla[
orms
Track 1
Hardw
are
Pla[
orms
Track n
Basic
Techno
logy
App
lied Math
Uncertainty
Quan9
fica9
on
CoDesign &
App
lica9
ons
SoXware
Conven9o
nal
Facili9
es
Prep
ara9
on
…… Outreach,
Educa9
on and
Training
What is in this box and how will it be organized?
SoXware &
Programming
Mod
els
Hardw
are
Pla[
orms
Track 1
Hardw
are
Pla[
orms
Track n
Basic
Techno
logy
App
lied Math
Uncertainty
Quan9
fica9
on
CoDesign &
App
lica9
ons
SoXware
Conven9o
nal
Facili9
es
Prep
ara9
on
…… Outreach,
Educa9
on and
Training
What is in this box and how will it be organized?
What is in this box?
• SoXware – Node and System OS – Run9me Libraries – Numerical Libraries – IO Libraries – Filesystems? – Data Management – Visualiza9on Tools? – Compilers? – Parallel Tools?
• Programming Models – Abstrac9ons – Standards – Reference Implementa9ons
• How many na9onal efforts will there be?
• Do we have a strong or weak coordina9on model?
• How much of this is open source?
• How does the IESP relate to the vendor soXware efforts?
Issues to Think About
• We need a soXware roadmap for community developed “open” soXware that is common to mul9ple hardware pla[orms
• The SoXware Roadmap needs to determine what soXware is needed by EI, what capabili9es to be included in each itera9on and how the soXware development ac9vi9es will connect to various na9onal efforts
• IESP in order to play a significant role in the EI needs to assume the role of community soXware supplier to the rest of the program
Issues to Think About • SoXware in general is a weak link in large-‐scale projects • If the IESP is not perceived as a primary provider of cri9cal soXware, then
other groups will feel the need to build soXware development into their plans for Exascale
• Primary provider means: – Owns the milestones for delivering soXware – Has a structure and support needed to deliver soXware on schedule
• Development teams and process • Tes9ng and valida9on • Packaging and Delivery
– Is seen as delivering that capability by all groups involved in exascale – Offloads the responsibility from the vendors and hardware development
projects – Can be counted on by the applica9ons teams to provide what they need – Is the primary organiza9on that is involved in co-‐design addressing the
soXware related items
Governance Models
• Strong Model – Acts and thinks like a (Single) Project – Takes responsibility and provides deliverables to the na9onal
programs and vendors – Strong and par9cipatory governance model
• Weak Model – Provides coordina9on and cheerleading to the mul9tude of
interna9onal “projects” – Supports others that are actually leading the development
programs – Opt in governance model – Not necessarily the primary contact for the rest of the program
Prac9cal Reali9es • We have some challenges with “open source”
– Probably need a well wrifen manifesto on why EI needs open source for key soXware and why this is a good thing not something that is a compe99veness problem
• Not all groups think that soXware should be implemented as an interna9onal effort or that an interna9onal effort can deliver on a schedule
– We need a set of examples where this has worked to demonstrate that this is not a crazy idea
• There is not yet a funding agency consensus that the soXware must be developed in an interna9onal fashion
– Some perceive the afempt to do this as misguided
– Some perceive that the current soXware ecosystem is not very interna9onal
• There is a definite difference in the scope/defini9on of “soXware” from the EU, US and Asian par9cipants that is confusing to program management
– Systems soXware vs applica9ons soXware?
– We need a way to resolve these differences in outlook as the programma9c support will like be dependent on this resolu9on
Need to Accomplish
• Develop a governance model for IESP that has broad buy in from the community and agencies
• Validate the roadmap in discussions with applica9ons, math and vendors – Develop a schedule with firm target dates and resource es9mates
• Develop a program execu9on plan that is firm enough that other parts of the EI can use it for leverage
• Address the “unspoken” concerns via short white papers, manifestos and community outreach
• Start ac9ng like a “project” to exercise the governance model