© 2014 ANSYS, Inc. November 25, 2014 2
High Performance Computing (HPC) at ANSYS:
An ongoing effort designed to remove computing limitations from engineers who use computer aided engineering in all phases of design, analysis, and testing.
It is a hardware and software initiative!
HPC Defined
Need for HPC
• Impact product design: turbulence, combustion, particle tracking
• Enable large models: assemblies, CAD-to-mesh, capture fidelity
• Allow parametric studies: multiple design ideas, optimize the design, ensure product integrity
HPC Revolution
• Recent advancements have revolutionized the computational speed available on the desktop
– Multi-core processors
• Every core is really an independent processor
– Large amounts of RAM
Typical HPC Growth Path
Desktop User → Workstation and/or Server Users → Cluster Users
Summary
Design Impact
HPC
Using today’s multicore computers is key for companies to remain competitive. The ANSYS HPC product suite scales to whatever computational level is required, from single-user or small user group options at entry level up to virtually unlimited parallel capacity or large user group options at enterprise level.
• Shorter time to solution
• Increase high-fidelity insight
• Examine more design variants faster
Application Example
Benefit of HPC for CFD Applications - Shorter Time to Solution with GPUs
Objective
Meeting engineering services schedule & budget, and technical excellence, are imperative for success.
ANSYS Solution
• PSI evaluates and implements the new technology in software (ANSYS 15.0) and hardware (NVIDIA GPU) as soon as possible.
• GPU produces a 43% reduction in Fluent solution time on an Intel Xeon E5-2687 (8 core, 64GB) workstation equipped with an NVIDIA K40 GPU
Design Impact
Increased simulation throughput allows meeting delivery-time requirements for engineering services.
Images courtesy of Parametric Solutions, Inc.
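As a worked reading of the 43% figure: a reduction in solution time converts to a speedup factor by inverting the remaining fraction of the run time. The sketch below is illustrative only; the function name is hypothetical and the only input taken from the slide is the 43% reduction.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """Convert a fractional reduction in run time into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    # The new run time is (1 - reduction) of the old one, so the
    # speedup is the inverse of that remaining fraction.
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # 43% reduction reported for Fluent on the K40-equipped workstation.
    print(f"{speedup_from_time_reduction(0.43):.2f}x")
```

Under this reading, a 43% shorter solution time corresponds to roughly 1.75 times the throughput on the same workstation.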
Application Example
Benefit of HPC for CFD Applications - Increase High-Fidelity Insight
Objective
Full-stage simulations of turbochargers for diesel engines are needed to reliably understand and optimize their performance prior to physical prototyping.
ANSYS Solution
• ANSYS CFX simulations deliver near-linear parallel processing on a 160-core HPC system upgrade.
• ANSYS HPC performance delivers the ability to consider 5 full-stage compressor or turbine designs in a few hours (compared to many days prior to the upgrade).
Design Impact
ANSYS HPC is enabling Cummins to use larger models with greater geometric detail and more realistic treatment of physical phenomena to generate results in less time.
Courtesy Cummins Turbo Technologies
Application Example
Benefit of HPC for CFD Applications - Increase High-Fidelity Insight
Objective
Overcome the technological challenges of flow assurance and subsea oil processing in the new pre-salt oil fields.
ANSYS Solution
• Transient multiphase simulations with ANSYS Fluent are used to understand sand transport inside the kilometres-long production lines.
• ANSYS HPC performance, together with advanced multiphase models and dynamic meshing features, enables Petrobras to virtually reproduce critical scenarios and complex operations.
Design Impact
Very detailed CFD simulations are providing Petrobras with important physical insights that are guiding the design of the new trend of upstream processing systems in the oil industry.
Application Example
Benefit of HPC for CFD Applications - Examine More Design Variants
Objective
Advance racing boat design to sustain medal-winning performances at the Olympic Games.
ANSYS Solution
• ANSYS CFX is used to optimize the fluid dynamics for different classes of racing kayaks.
• Using HPC, transient simulations of moving boats can be accomplished in just two or three days.
Design Impact
Using HPC, the FES engineers were able to efficiently consider up to 20 different virtual designs per boat class, and from those 20 designs they gained enough confidence to build a single prototype for testing.
Courtesy FES
ANSYS Fluent Scaling at Dual Processors - Faster with More Compute Cores
Intel Xeon E5-2690v2 processors (3 GHz, 20 cores total) with 128 GB of RAM.
[Figure: solver rating (geometric mean) versus number of processes, 1p to 20p]
[Figure: speedup versus number of processes, 1p to 20p]
Higher is better.
ANSYS CFX Scaling at Multiple Nodes - Faster with More Compute Nodes
Each node has 2 X 10-core Intel Xeon E5-2690 v2 processors (3.0 GHz, 1866 MHz) with 128 GB of RAM. InfiniBand FDR.
[Figure: speedup versus number of nodes (1, 2, 4 and 8 nodes)]
ANSYS Fluent Scaling at Multiple Nodes - Faster with More Compute Cores, for Complex Physics
• Hexa mesh (830,000 cells)
• Standard K-Epsilon Turbulence Model
• VOF multiphase model (3 phases): Molten Steel, Foamy Slag, Oxygen
[Figure: measured speedup and ideal speedup versus number of cores, 12 to 72]
cores   overall time (h)   measured speedup   ideal speedup
12      0.56               1.00               1
24      0.29               1.94               2
36      0.21               2.60               3
48      0.17               3.33               4
72      0.12               4.76               6
Courtesy of MORE S.r.l.
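The measured and ideal speedups reported for this case can be turned into a parallel efficiency per core count (measured speedup divided by ideal speedup). The sketch below uses only the numbers from the slide; the names are illustrative, and it assumes, as the ideal-speedup column implies, that the 12-core run is the baseline.

```python
# (cores, overall time in hours, measured speedup) as reported on the slide.
SCALING = [
    (12, 0.56, 1.00),
    (24, 0.29, 1.94),
    (36, 0.21, 2.60),
    (48, 0.17, 3.33),
    (72, 0.12, 4.76),
]
BASE_CORES = 12  # assumption: the 12-core run is the reference (speedup 1.00)


def parallel_efficiency(cores: int, measured_speedup: float,
                        base_cores: int = BASE_CORES) -> float:
    """Measured speedup divided by the ideal (linear) speedup."""
    ideal_speedup = cores / base_cores
    return measured_speedup / ideal_speedup


EFFICIENCY = {cores: parallel_efficiency(cores, s) for cores, _, s in SCALING}

if __name__ == "__main__":
    for cores, eff in EFFICIENCY.items():
        print(f"{cores:3d} cores: {eff:.0%} parallel efficiency")
```

Read this way, the efficiency falls gradually from 100% at 12 cores to just under 80% at 72 cores.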
ANSYS Fluent Scaling at Multiple Nodes - Parallel Efficiency Improving Release-by-Release!
• Segregated implicit solver
• Scalable at ~10K cells per core!
[Figure: Truck_111M Turbulent Flow. Rating versus number of cores (0 to 12,288) for releases 13.0.0, 14.0.0 and 15.0.0]
Rating is jobs per day. A higher rating means faster performance.
[Figure: DLR_96M LES Combustion. Rating versus number of cores (0 to 14,336), R15.0 compared with ideal]
• Pressure-based coupled solver
• Scalable at ~10K cells per core!
Scaling Improvements at 10,000+ Cores Yield Benefits for Smaller Jobs!
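To relate "~10K cells per core" to the core counts on these charts, the arithmetic can be sketched as follows. The cell counts are assumptions read off the case names (111 million and 96 million cells), the core counts are the largest axis ticks rather than confirmed run sizes, and the function names are illustrative.

```python
TARGET_CELLS_PER_CORE = 10_000  # the "~10K cells per core" figure from the slide

# Assumed from the case names; largest axis tick used as the core count.
CASES = {
    "Truck_111M": (111_000_000, 12_288),
    "DLR_96M": (96_000_000, 14_336),
}


def cells_per_core(cells: int, cores: int) -> float:
    """Average number of mesh cells assigned to each core."""
    return cells / cores


def cores_at_target(cells: int, target: int = TARGET_CELLS_PER_CORE) -> int:
    """Core count at which the mesh is split into `target` cells per core."""
    return cells // target


if __name__ == "__main__":
    for name, (cells, cores) in CASES.items():
        print(f"{name}: {cells_per_core(cells, cores):,.0f} cells/core "
              f"at {cores:,} cores; {cores_at_target(cells):,} cores "
              f"at {TARGET_CELLS_PER_CORE:,} cells/core")
```

The same arithmetic suggests why scaling to 10,000+ cores matters for smaller jobs: a model ten times smaller reaches the same cells-per-core ratio at roughly a tenth of the cores.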
ANSYS CFX Scaling at Multiple Nodes - Parallel Efficiency Improving Release-by-Release!
R&D effort to improve HPC scaling in CFX
• Basic & physics specific scaling areas
• Significantly improved scalability
– Up to 89% efficiency at 2048 cores
– HPC improvements are “beta” level for R15.0
[Figure: CFX scaling results for two cases, annotated "4X faster" and "5X faster"]
• Six Stage Axial Compressor
• 13M nodes
• 14 domains, 12 mixing planes
Courtesy Siemens AG, Müllheim, Germany, Paper GT2013-94639
• Duct case
• 150M nodes
ANSYS Fluent 15.0 on GPU Performance of Pressure-Based Solver
Sedan Model
• Sedan geometry
• 3.6M mixed cells
• Steady, turbulent
• External aerodynamics
• Coupled PBNS, DP
• CPU: Intel Xeon E5-2680; 8 cores
• GPU: 2 X Tesla K40
[Figure: Fluent jobs/day for the sedan model. Higher is better.]
Segregated solver, CPU only: 12 Jobs/day
Coupled solver, CPU only: 15 Jobs/day
Coupled solver, CPU + GPU: 27 Jobs/day (1.9x faster than coupled CPU only)
Convergence criteria: 10e-03 for all variables; No. of iterations until convergence: segregated CPU-2798 iterations (7070 secs); coupled CPU-967 iterations (5900 secs); coupled CPU+GPU-985 iterations (3150 secs)
NOTE: Times are for the total solution until convergence
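The jobs/day figures follow from the reported times to convergence. A minimal sketch of that conversion, assuming jobs run back to back for a full 24 hours (an assumption, since the slide does not state how the rating was derived):

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Total solution times until convergence, in seconds, from the slide.
SOLVE_SECONDS = {
    "segregated, CPU only": 7070,
    "coupled, CPU only": 5900,
    "coupled, CPU + GPU": 3150,
}


def jobs_per_day(solve_seconds: float) -> float:
    """Number of back-to-back jobs that fit in 24 hours."""
    return SECONDS_PER_DAY / solve_seconds


if __name__ == "__main__":
    for label, seconds in SOLVE_SECONDS.items():
        print(f"{label}: {jobs_per_day(seconds):.1f} jobs/day")
    gpu_speedup = (SOLVE_SECONDS["coupled, CPU only"]
                   / SOLVE_SECONDS["coupled, CPU + GPU"])
    print(f"GPU speedup for the coupled solver: {gpu_speedup:.1f}x")
```

Rounded to whole jobs, these times reproduce the 12, 15 and 27 jobs/day on the chart, and the ratio of the two coupled-solver times gives the 1.9x annotation.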
ANSYS Fluent 15.0 on GPU Performance of Pressure-Based Solver
All results are based on turbulent flow over a truck case (14 million cells) until convergence, using the steady-state, pressure-based coupled solver in double precision.
• No. of iterations to reach convergence: CPU-531; CPU+GPU-566
• The solution cost is approximate and includes both hardware and software license costs.
• Productivity is based on the number of completed Fluent jobs/day in a multi-user cluster environment.
• Hardware: Intel Xeon E5-2680 (64 CPU cores on 8 sockets), 8 Tesla K40 GPUs
• License: ANSYS Fluent and ANSYS HPC Workgroup 64
[Figure: productivity, benefit and cost for CPU only versus CPU + GPU. Higher is better.]
            CPU only   CPU + GPU
Jobs/day    16         25
Benefit     100%       156%
Cost        100%       125%
TRUCK BODY MODEL (14 million cells)
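The benefit and cost percentages can be combined into a single productivity-per-cost ratio. The sketch below uses only the slide's figures (16 and 25 jobs/day, 100% and 125% cost); the function name and the idea of dividing benefit by cost are illustrative and not part of the original material.

```python
def benefit_cost(jobs_base: float, jobs_new: float,
                 cost_base: float, cost_new: float) -> tuple[float, float, float]:
    """Return (relative benefit, relative cost, benefit per unit cost)."""
    benefit = jobs_new / jobs_base  # productivity gain
    cost = cost_new / cost_base     # relative solution cost
    return benefit, cost, benefit / cost


if __name__ == "__main__":
    benefit, cost, ratio = benefit_cost(16, 25, 100, 125)
    print(f"benefit {benefit:.0%}, cost {cost:.0%}, benefit per cost {ratio:.2f}")
```

With these inputs, 25/16 reproduces the 156% benefit shown on the chart, and each unit of cost buys about 25% more completed jobs with the GPUs than without.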
ANSYS Fluent 15.0 on GPU Better Speedup on Larger Models
Truck Model
• External aerodynamics
• Steady, k-ε turbulence
• Double-precision solver
• CPU: Intel Xeon E5-2667; 12 cores per node
• GPU: Tesla K40, 4 per node
[Figure: ANSYS Fluent time (sec) per iteration, CPU only versus CPU + GPU. Lower is better.]
NOTE: Reported times are per iteration
Model size          CPU only                  CPU + GPU                           Speedup
14 million cells    13 s (36 CPU cores)       9.5 s (36 CPU cores + 12 GPUs)      1.4 X
111 million cells   36 s (144 CPU cores)      18 s (144 CPU cores + 48 GPUs)      2 X
NVIDIA-GPU Solution Fit for ANSYS Fluent
[Flowchart: GPU fit for a CFD analysis, with Yes/No branches. Decision points: "Is it single-phase & flow dominant?", "Pressure-based coupled solver?" and, for the segregated solver, "Is it a steady-state analysis?". Outcomes: "Pressure-based coupled solver: best fit for GPUs", "Not ideal for GPUs", and the recommendation below.]
Consider switching to the pressure-based coupled solver for better performance (faster convergence) and further speedups with GPUs. Please see the next slide.
Scalable HPC Licensing
[Figure: parallel enabled (cores) versus ANSYS HPC Packs per simulation]
Packs per Simulation       1    2    3     4     5
Parallel Enabled (Cores)   8    32   128   512   2048
ANSYS HPC (per-process)
ANSYS HPC Pack
• Each simulation consumes one or more Packs
• Parallel enabled increases quickly with added Packs
ANSYS HPC Workgroup
• 16 to 2048 parallel shared across any number of simulations on a single server (16, 32 and 64 are NEW!)
• 128 to 2048 enterprise parallel deployed and used anywhere in the world
ANSYS HPC Parametric Pack and DSO
• Enables simultaneous execution of multiple design points while consuming just one set of licenses
Single HPC solution for FEA/CFD/FSI and any level of fidelity
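The pack-to-core relationship on this slide can be read as a small lookup. The sketch below encodes only the five values shown (8, 32, 128, 512 and 2048 cores for 1 to 5 packs); the helper name is hypothetical, and the closed form 2 × 4^n is merely an observation that fits those five points, not a documented licensing rule.

```python
# Parallel enabled (cores) for 1..5 ANSYS HPC Packs, as shown on the slide.
PACK_CORES = {1: 8, 2: 32, 3: 128, 4: 512, 5: 2048}

# Observation only: the five tabulated values happen to follow 2 * 4**n.
assert all(cores == 2 * 4 ** packs for packs, cores in PACK_CORES.items())


def packs_needed(cores: int) -> int:
    """Smallest number of packs in the table that enables `cores` cores."""
    if cores < 1:
        raise ValueError("cores must be positive")
    for packs in sorted(PACK_CORES):
        if cores <= PACK_CORES[packs]:
            return packs
    raise ValueError("more than 2048 cores is outside the table shown")


if __name__ == "__main__":
    for cores in (8, 20, 160, 2048):
        print(f"{cores:5d} cores -> {packs_needed(cores)} pack(s)")
```

This illustrates the "increases quickly with added Packs" point: each extra pack quadruples the enabled core count in the table shown.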
15.0 HPC Licensing Enabling GPU Acceleration - One HPC Task Required to Unlock one GPU!
Licensing Examples:
1 x ANSYS HPC Pack: Total 8 HPC Tasks (4 GPUs Max)
Example of Valid Configurations:
• 6 CPU Cores + 2 GPUs
• 4 CPU Cores + 4 GPUs
2 x ANSYS HPC Pack: Total 32 HPC Tasks (16 GPUs Max)
Example of Valid Configurations:
• 24 CPU Cores + 8 GPUs (Total Use of 2 Compute Nodes)
(Applies to all license schemes: ANSYS HPC, ANSYS HPC Pack, ANSYS HPC Workgroup)
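The licensing examples can be checked with a short validity test. This is a sketch under stated assumptions: each CPU core and each GPU consumes one HPC task (per the slide title), and the GPU cap of half the task count is inferred from the two examples (4 of 8, 16 of 32) rather than taken from licensing documentation. The function name is hypothetical.

```python
def is_valid_configuration(cpu_cores: int, gpus: int, hpc_tasks: int) -> bool:
    """Check a CPU + GPU mix against the available HPC tasks."""
    if cpu_cores < 1 or gpus < 0:
        return False
    # One HPC task per CPU core and one per GPU.
    within_tasks = cpu_cores + gpus <= hpc_tasks
    # Assumed cap, inferred from "8 tasks (4 GPUs max)" and "32 tasks (16 GPUs max)".
    within_gpu_cap = gpus <= hpc_tasks // 2
    return within_tasks and within_gpu_cap


if __name__ == "__main__":
    examples = [(6, 2, 8), (4, 4, 8), (24, 8, 32), (6, 4, 8)]
    for cores, gpus, tasks in examples:
        verdict = "valid" if is_valid_configuration(cores, gpus, tasks) else "invalid"
        print(f"{cores} CPU cores + {gpus} GPUs on {tasks} HPC tasks: {verdict}")
```

The three configurations listed on the slide pass this check, while 6 CPU cores + 4 GPUs on a single pack would need 10 tasks and fails.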
HPC Parametric Pack License Scheme - Explore Parametric Designs Faster, More Cost Effectively
Example: Mixing Vessel - ANSYS HPC Parametric Pack
Problem Description
• Improve mixing while reducing energy
• Design objective:
– Optimize the inlet velocities within their operating limits so that both temperature spread at the outlet and pressure drop in the vessel are minimized
• Input Parameters: fluid velocity at the cold and hot inlet (8 Design Points)
• Detail:
– K-Epsilon Model with Standard Wall Functions
– 52,000 nodes and 280,000 elements
– Workstation: HP workstation with dual Intel Xeon E5-2687W (3.10 GHz, 16 cores), 128 GB memory
[Figure: mixing vessel with cold inlet, hot inlet and outlet]
Licensing Solution
• 1 ANSYS Fluent
• 2 ANSYS HPC Parametric Packs
Result/Benefit
• ~4.8x speedup over sequential execution
• Easier and fully automated workflow
Acknowledgment: Paul Schofield and Jiaping Zhang, ANSYS Houston
ANSYS Advantages
HPC for CFD Applications - Final Remarks
• Superior and proven parallel scalability above 80% efficiency with as few as 10,000 cells per CPU core, providing the ability to
– Run bigger models on smaller hardware
– Run smaller models at higher core counts
• Solvers required for complex physics (chemistry, multiphase) are highly optimized to run fast and deliver outstanding parallel scaling on today’s multicore processors
• ANSYS provides flexible, scalable, and cost-attractive HPC licensing!
Courtesy of MORE S.r.l.
“Take Home” Points / Discussion
With HPC, you can increase your engineering productivity by:
• Decreasing your simulation time (increasing throughput)
• Performing larger, more detailed simulations (solving the unsolvable)
• Evaluating more design variations (gaining better insight into product performance)