Simulation in the Automotive Industry: Creating the Next Generation Vehicle | nafems.org/americas | November 14th, 2019 | Troy, MI
The Effect of InfiniBand In-Network
Computing on CAE Simulations
HPC-AI Advisory Council
October 1st, 2019 | Columbus, OH
The HPC-AI Advisory Council
• World-wide HPC non-profit organization
• More than 400 member companies / universities / organizations
• Bridges the gap between HPC-AI usage and its potential
• Provides best practices and a support/development center
• Explores future technologies and future developments
• Leading edge solutions and technology demonstrations
HPC Advisory Council Members
HPC-AI Advisory Council Cluster Center (Examples)
• Supermicro / Foxconn 32-node cluster
– Dual Socket Intel® Xeon® Gold 6138 CPU @ 2.00GHz
• Dell™ PowerEdge™ R730/R630 36-node cluster
– Dual Socket Intel® Xeon® 16-core CPUs E5-2697A V4 @ 2.60GHz
• AMD Daytona_X
– Dual Socket AMD Rome 128 core 8-node cluster @ 2.25GHz
Multiple Applications Best Practices Published
• Abaqus
• ABySS
• AcuSolve
• Amber
• AMG
• AMR
• ANSYS CFX
• ANSYS FLUENT
• ANSYS Mechanical
• BQCD
• BSMBench
• CAM-SE
• CCSM
• CESM
• COSMO
• CP2K
• CPMD
• Dacapo
• Desmond
• DL-POLY
• Eclipse
• FLOW-3D
• GADGET-2
• Graph500
• GROMACS
• Himeno
• HIT3D
• HOOMD-blue
• HPCC
• HPCG
• HYCOM
• ICON
• Lattice QCD
• LAMMPS
• LS-DYNA
• miniFE
• MILC
• MSC Nastran
• MR Bayes
• MM5
• MPQC
• NAMD
• Nekbone
• NEMO
• NWChem
• Octopus
• OpenAtom
• OpenFOAM
• OpenMX
• OptiStruct
• PARATEC
• PFA
• PFLOTRAN
• Quantum ESPRESSO
• RADIOSS
• SNAP
• SPECFEM3D
• STAR-CCM+
• STAR-CD
• VASP
• VSP
• WRF
HPC-AI Advisory Council Activities
• HPC-AI Advisory Council
– More than 400 members, http://www.hpcadvisorycouncil.com/
– Application best practices, case studies
– Development and benchmarking center with remote access for users
– World-wide conferences
• Conferences
– USA (Stanford University) – February
– Switzerland (CSCS) – April
– Student Cluster Competition (ISC) – July
– China (HPC China) – August
– Australia – August
– UK – September
– China – November
• Competitions
– APAC HPC-AI Competition – March
– China – 6th Annual RDMA Competition – May
– ISC Germany – Annual Student Cluster Competition – June
• For more information
– www.hpcadvisorycouncil.com
– [email protected]
HPC|Works Community
Computing Evolution – Compute Centric to Data Centric
[Figure labels: Von Neumann Machine, Neural Networks; Compute-Centric, Data-Centric]
The Need for Intelligent Data Center
CPU-Centric (Onload)
• Move Data to the Compute
• Must Wait for the Data
• Creates Performance Bottlenecks
Data-Centric (Offload)
• Move Compute to the Data
• Analyze Data Everywhere
• Higher Performance and Scale
[Figure: CPU and GPU nodes attached to an onload network, contrasted with the data-centric (offload) layout]
Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)
• Reliable Scalable General Purpose Primitive
– In-network tree-based aggregation mechanism
– Large number of groups
– Multiple simultaneous outstanding operations
• Applicable to Multiple Use-cases
– HPC applications using MPI / SHMEM
– Distributed Machine Learning applications
• Scalable High Performance Collective Offload
– Barrier, Reduce, All-Reduce, Broadcast and more
– Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND
– Integer and Floating-Point, 16/32/64 bits
[Figure: SHARP aggregation tree. Labels: Host, Switch, Data, Aggregated Data, Aggregated Result]
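The tree-based aggregation listed above can be pictured with a small software model. The sketch below is only a simulation in plain Python, not SHARP itself or any vendor API: the function name `tree_allreduce`, the `radix` parameter (children per switch), and the operator table are assumptions made for illustration. Each loop iteration stands for one level of switches combining the data of its children; the final result is then handed back to every host.

```python
from functools import reduce
import operator

# A few of the reduction operators named on the slide, as plain Python callables.
OPS = {
    "sum": operator.add,
    "min": min,
    "max": max,
    "or": operator.or_,
    "xor": operator.xor,
    "and": operator.and_,
}

def tree_allreduce(host_values, op="sum", radix=2):
    """Simulate an all-reduce performed by a tree of switches.

    host_values: one value per host.
    radix: hypothetical number of children per switch.
    Returns (list of per-host results, number of switch levels used).
    """
    if not host_values:
        raise ValueError("at least one host value is required")
    combine = OPS[op]
    level = list(host_values)
    levels = 0
    # Upward pass: every switch aggregates the data of its children.
    while len(level) > 1:
        level = [reduce(combine, level[i:i + radix])
                 for i in range(0, len(level), radix)]
        levels += 1
    # Downward pass: the aggregated result is returned to every host.
    return [level[0]] * len(host_values), levels

if __name__ == "__main__":
    results, levels = tree_allreduce([3, 1, 4, 1, 5], op="sum", radix=2)
    print(results, levels)
```

In this model the number of aggregation steps grows with the depth of the tree rather than with the number of hosts, which is the property the slide associates with scalable collective offload.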
SHARP AllReduce Performance Advantages (128 Nodes)
SHARP AllReduce Performance Advantages 1500 Nodes, 60K MPI Ranks, Dragonfly+ Topology
The Niagara Supercomputer – University of Toronto
OpenFOAM
• An open-source CFD toolbox that can simulate
– Complex fluid flows involving chemical reactions, turbulence, and heat transfer
– Solid dynamics
– Electromagnetics
– The pricing of financial options
OpenFOAM Profiling – MPI/User Time Ratio
• The OpenFOAM simpleFoam solver uses mainly non-blocking communications
• 23% of overall runtime is spent on MPI communication at 16 nodes / 640 MPI cores
• Intel MPI and HPC-X spend the same share of overall runtime on MPI communications
• Most of the MPI time is spent in non-blocking communications (MPI_Waitall 47%, MPI_Isend 47%)
• Most of the MPI calls made by OpenFOAM are MPI_Waitall
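To put these profile numbers in perspective, the short calculation below converts the per-call shares of MPI time into shares of total runtime and derives a simple Amdahl-style ceiling. It uses only the 23% and 47% figures from this slide; the helper names are hypothetical, and the ceiling assumes compute time stays unchanged if communication cost were removed entirely, which is an idealization rather than a measured result.

```python
def mpi_breakdown(mpi_fraction, call_shares):
    """Convert per-call shares of MPI time into shares of total runtime."""
    return {name: mpi_fraction * share for name, share in call_shares.items()}

def max_speedup_if_mpi_free(mpi_fraction):
    """Upper bound on speedup if all MPI time vanished (Amdahl-style)."""
    return 1.0 / (1.0 - mpi_fraction)

if __name__ == "__main__":
    mpi_fraction = 0.23  # share of runtime in MPI at 16 nodes (from the slide)
    shares = {"MPI_Waitall": 0.47, "MPI_Isend": 0.47}  # shares of MPI time
    for name, frac in mpi_breakdown(mpi_fraction, shares).items():
        print(f"{name}: {frac:.1%} of total runtime")
    print(f"ceiling: {max_speedup_if_mpi_free(mpi_fraction):.2f}x")
```

Under these assumptions each of the two calls accounts for roughly 11% of total runtime, and removing all MPI time could speed the run up by at most about 1.3x.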
OpenFOAM Profiling – MPI Time
• MPI profiler shows the type of underlying MPI network communications
– The majority of communications are non-blocking
• The majority of the MPI time is spent on non-blocking communications at 32 nodes
– MPI_Waitall (11% wall), 8-byte MPI_Recv (1.4% wall), 1-byte MPI_Recv (0.7% wall)
– Only 14% of the overall runtime is spent on MPI communications at 32 nodes (when EDR is used)
OpenFOAM Profiling – MPI Communication Topology
• Communication topology shows communication patterns among MPI ranks
• MPI processes mainly communicate with their neighbors, but some other patterns also appear
[Figure panels: 32 Nodes, 16 Nodes, 8 Nodes, 4 Nodes]
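A communication-topology plot of this kind is essentially a rank-by-rank traffic matrix. As a rough sketch of how such a matrix could be built and summarized, the code below accumulates bytes per (source, destination) pair and reports the share exchanged between ranks that are close in rank number. The trace is made up for illustration, the function names are hypothetical, and treating "neighbor" as "small rank distance" is an assumption; the actual profiler and its data format are not described in the slides.

```python
def comm_matrix(num_ranks, messages):
    """Accumulate bytes sent from each source rank to each destination rank."""
    matrix = [[0] * num_ranks for _ in range(num_ranks)]
    for src, dst, nbytes in messages:
        matrix[src][dst] += nbytes
    return matrix

def neighbor_fraction(matrix, max_distance=1):
    """Share of all bytes exchanged between ranks at most max_distance apart."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    near = sum(nbytes
               for src, row in enumerate(matrix)
               for dst, nbytes in enumerate(row)
               if abs(src - dst) <= max_distance)
    return near / total

if __name__ == "__main__":
    # Made-up trace: (source rank, destination rank, bytes).
    trace = [(0, 1, 800), (1, 0, 800), (1, 2, 800), (2, 3, 800), (0, 3, 200)]
    matrix = comm_matrix(4, trace)
    print(f"{neighbor_fraction(matrix):.1%} of bytes go to adjacent ranks")
```

A value close to 100% would correspond to the mostly diagonal pattern described in the bullet above; off-diagonal traffic lowers it.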
OpenFOAM Performance E5-2697A v4 @ 2.60GHz, HDR100
23%
OpenFOAM Performance E5-2697A v4 @ 2.60GHz, HDR100
50%
OpenFOAM Performance Using (HPC-X 2.5 MPI)
35%
LS-DYNA
• LS-DYNA – A general-purpose structural and fluid analysis simulation software package capable of simulating complex real-world problems
– Developed by the Livermore Software Technology Corporation (LSTC)
• LS-DYNA is used by
– Automobile
– Aerospace
– Construction
– Military
– Manufacturing
– Bioengineering
2019, 28 - 29 October 35th INTERNATIONAL CAE CONFERENCE AND EXHIBITION
LS-DYNA Performance Intel Xeon Gold 6138 CPU @ 2.00GHz, HDR100
LS-DYNA Performance Intel Xeon Gold 6138 CPU @ 2.00GHz, HDR100
39 %
ANSYS Fluent
• Computational Fluid Dynamics (CFD)
– Enables the study of the dynamics of things that flow
– Enables a better understanding of qualitative and quantitative physical phenomena in the flow, which is used to improve engineering design
• CFD brings together a number of different disciplines
– Fluid dynamics, mathematical theory of partial differential equations, computational geometry, numerical analysis, computer science
• ANSYS FLUENT is a leading CFD application from ANSYS
– Widely used in almost every industry sector and manufactured product.
ANSYS Fluent E5-2697A v4 @ 2.60GHz, HDR100
26%
ANSYS Fluent E5-2697A v4 @ 2.60GHz, HDR100
15%
InfiniBand QoS
• IBTA Standard
• Application SL (service level) -> VL (virtual lane) mapping
• WRR (weighted round robin) / strict priority setting
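The two mechanisms named above, mapping service levels to virtual lanes and then arbitrating between lanes by weighted round robin or strict priority, can be illustrated with a toy model. The sketch below is a simplified simulation only: the function names, the SL and VL numbers, and the weights are all assumptions, and real InfiniBand arbitration as defined by the IBTA specification is table-driven and more detailed than this.

```python
from collections import deque

def enqueue(packets, sl_to_vl):
    """Place (service level, payload) packets on virtual-lane queues."""
    queues = {vl: deque() for vl in set(sl_to_vl.values())}
    for sl, payload in packets:
        queues[sl_to_vl[sl]].append(payload)
    return queues

def arbitrate(queues, weights, strict_vl=None, slots=8):
    """Choose which virtual lane transmits in each of `slots` send slots.

    weights: {vl: packets served per weighted-round-robin round}.
    strict_vl: if set, that lane is always served first while it has packets.
    Returns the list of (vl, payload) pairs in transmission order.
    """
    sent = []
    while len(sent) < slots and any(queues.values()):
        if strict_vl is not None and queues[strict_vl]:
            sent.append((strict_vl, queues[strict_vl].popleft()))
            continue
        before = len(sent)
        for vl, weight in weights.items():
            for _ in range(weight):
                if queues[vl] and len(sent) < slots:
                    sent.append((vl, queues[vl].popleft()))
        if len(sent) == before:
            break  # nothing servable: avoid spinning on lanes without a weight
    return sent

if __name__ == "__main__":
    # Hypothetical setup: application traffic on SL 1, background on SL 0.
    packets = [(1, f"app{i}") for i in range(4)] + [(0, f"bg{i}") for i in range(8)]
    mapping = {0: 0, 1: 1}   # SL -> VL (assumed one-to-one)
    weights = {1: 3, 0: 1}   # application lane gets 3 packets per round
    print([vl for vl, _ in arbitrate(enqueue(packets, mapping), weights)])
    print([vl for vl, _ in arbitrate(enqueue(packets, mapping), weights, strict_vl=1)])
```

With weighted round robin the application lane receives most, but not all, of the early slots; with strict priority it drains completely before any background packet is sent.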
QoS LS-DYNA test
• Run the test with no background traffic, on 4 nodes
• Add massive background traffic without enabling QoS; the application and the background traffic use the same SL and the same network resources
• Add massive background traffic with QoS enabled, giving the LS-DYNA application priority over the background traffic
LS-DYNA QoS E5-2697A v4 @ 2.60GHz, HDR100
Summary
• HPC cluster environments demand high connectivity throughput and low latency, with low CPU overhead, network flexibility, and high efficiency
• Fulfilling these demands maintains a balanced system that can achieve high application performance and high scaling
• With the increase in the number of CPU cores and application threads, a new HPC cluster architecture is needed: a data-focused architecture
• The Co-Design collaboration enables the development of In-Network Computing technology that breaks the performance and scalability barriers
• The OpenFOAM, LS-DYNA and ANSYS Fluent applications were benchmarked on AMD Rome and Intel CPUs for this study to demonstrate the advantages of In-Network Computing technology
• We have witnessed a performance advantage of up to 50% and linear scalability with InfiniBand In-Network Computing technology
• InfiniBand QoS can be considered in network design. By enabling QoS, we can achieve similar performance for LS-DYNA with and without the background noise.
All trademarks are property of their respective owners. All information is provided “As-Is” without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information
contained herein. HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein
Thank You