Introduction to ANSYS HFSS

Lecture 3-2: High Performance Computing (HPC) for HFSS 3D

Release 2015.0, April 15, 2015. © 2015 ANSYS, Inc.


High Performance Computing (HPC) for HFSS


Solution Process

[Figure: solution flow. Initial Mesh → Adaptive Mesh Solve → Frequency Sweep Solve. The figure carries two "HPC" labels.]


HFSS Solvers and Solver Options

[Figure: chart relating solver methods, solution techniques, and HPC options]

Methods: Finite Element, HFSS-IE, Eigenmode, HFSS-TR

Techniques: Direct, Iterative, Hybrid Explicit/Implicit

HPC: Domain Decomposition Methods (DDM), Distributed Matrix Solver, Multi-Threaded Shared Memory


Leveraging High Performance Computing Hardware

• Multi-Threading

• Spectral Domain Method: Distributed Frequency Sweeps

• Distributed Parallel Solvers

• HFSS DDM: Mesh and Matrix based Domain Solver

• HFSS Periodic Domains: Finite Array Domain Solver

• HFSS-IE DDM: Matrix based Domain Solver

• HFSS-Hybrid DDM: Hybrid HFSS/HFSS-IE Domain Solver

• HFSS Distributed Direct: HFSS Direct Solver Memory

[Figure labels: Faster, Bigger]


HFSS with HPC

Faster

Faster: solver technology targeted at using multiple processors/cores to accelerate the solution process.

Multi-Threading

Spectral Domain Method

Distributed HFSS-Transient


HPC: Multi-Threading (MT)

• Multi-Threading (HPC-MT)

• Single workstation solution to increase the speed of the solver

• TAU Initial Mesh Generation

– Parallelized mesh generation

• Direct Matrix Solver

– Parallelized matrix solver

• Iterative Solver

– Parallelized matrix pre-conditioner

– Parallelized excitations

• Field Recovery

– Parallelized field recovery for multiple excitations

• Available in HFSS 3D, HFSS-IE, and HFSS-Transient

HFSS – HPC-MT Processor Performance*

Speed-up vs. number of cores (1 HPC pack = 8 cores):

Cores                     Speed-up
1 (baseline, no HPC)      1x
2                         1.9x
4                         3.6x
8                         5.6x

[Figure: work split across Thread 1 to Thread 4]

*HFSS Direct Matrix Solver
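The speed-up figures on this slide can be turned into parallel efficiency (speed-up divided by core count). The sketch below does that and, as an added assumption not made on the slide, estimates the serial fraction that Amdahl's law would imply for each point. The function names are illustrative only.

```python
# Speed-up figures quoted on the slide for the HFSS direct matrix solver.
SPEEDUP = {1: 1.0, 2: 1.9, 4: 3.6, 8: 5.6}


def parallel_efficiency(cores, speedup):
    """Fraction of ideal (linear) scaling that was achieved."""
    return speedup / cores


def amdahl_serial_fraction(cores, speedup):
    """Serial fraction s implied by Amdahl's law (an assumed model):
    speedup = 1 / (s + (1 - s) / cores), solved for s."""
    if cores < 2:
        raise ValueError("the serial fraction is undefined for a single core")
    return (cores / speedup - 1.0) / (cores - 1.0)


if __name__ == "__main__":
    for cores, speedup in SPEEDUP.items():
        line = f"{cores} cores: efficiency {parallel_efficiency(cores, speedup):.2f}"
        if cores > 1:
            line += f", implied serial fraction {amdahl_serial_fraction(cores, speedup):.3f}"
        print(line)
```

Efficiency falls as cores are added, which is the quantitative form of the later remark that multi-threading does not scale linearly with cores.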


HPC: Spectral Domain Method (SDM)

• Spectral Decomposition Method (HPC-SDM)

• Accelerates frequency sweeps by distributing the spectral content across a network of processors

– Uses RSM

• Increases simulation speed

– Combines with HPC-MT

• Scalable to large numbers of cores

• Available in HFSS 3D and HFSS-IE

[Figure: Frequency 1 to Frequency 4 distributed across processors]

• Interpolating vs. discrete frequency sweep

• Why do we have an interpolating sweep?

– Minimize the number of solved points

• With HPC-SDM it is compelling to run discrete sweeps

– Passive/causal: at least no interpolation noise

– Save fields at each frequency point

• HPC Packs

• 1 Pack: 8 Cores

• 2 Packs: 32 Cores

• 3 Packs: 128 Cores
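The three pack sizes listed here follow the pattern 8 × 4^(packs − 1). The sketch below encodes that pattern; extending it beyond three packs is an extrapolation from the slide rather than something the slide states, and both function names are made up for illustration.

```python
def cores_for_packs(packs: int) -> int:
    """Cores enabled by a number of HPC packs, following the slide's
    8 / 32 / 128 progression (assumed to continue as 8 * 4**(packs - 1))."""
    if packs < 1:
        raise ValueError("at least one pack is required")
    return 8 * 4 ** (packs - 1)


def packs_needed(cores: int) -> int:
    """Smallest number of packs whose core allowance covers `cores`."""
    if cores < 1:
        raise ValueError("core count must be positive")
    packs = 1
    while cores_for_packs(packs) < cores:
        packs += 1
    return packs


if __name__ == "__main__":
    for packs in (1, 2, 3):
        print(packs, "pack(s):", cores_for_packs(packs), "cores")
    print("packs needed for 48 cores:", packs_needed(48))
```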


HFSS: HPC-SDM for Discrete and Interpolating Sweeps

• HPC setup to maximize SDM Factor: Frequency Points vs. Multi-Threading

[Figure: two bar charts of SDM Factor (axis 0.00 to 5.00) for the Local, SDM1, SDM2, and SDM4 configurations]

SDM1: 32 Freq

SDM2: 16 Freq/2 HPC-MT

SDM4: 8 Freq/4 HPC-MT

Discrete sweep:

• Best setup is without multi-threading

• Running more frequency points improves performance

- Multi-Threading does not scale linearly with cores

Interpolating sweep:

• Total core count is the only factor that impacts performance

- It does not matter how the cores are used (frequency points vs. multi-threading); on average the performance is the same

- Multi-Threading does not scale linearly with cores

- Interpolating efficiency increases as the number of simultaneous frequency points decreases
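A rough model shows why a discrete sweep favors more simultaneous frequency points over more threads per point. The sketch below fixes the budget at 32 cores and compares the three configurations from the legend. It assumes every frequency point takes the same time, reads "32 Freq" as 32 points solved at once, and borrows the direct-solver multi-threading speed-ups (1x, 1.9x, 3.6x) quoted earlier in the lecture. The sweep length of 64 points and all names are hypothetical.

```python
import math

# Multi-threading speed-ups quoted earlier in the lecture (direct matrix solver).
MT_SPEEDUP = {1: 1.0, 2: 1.9, 4: 3.6}

# (simultaneous frequency points, threads per point); 32 cores in every case.
CONFIGS = {"SDM1": (32, 1), "SDM2": (16, 2), "SDM4": (8, 4)}


def discrete_sweep_time(n_points, parallel_points, threads, t_point=1.0):
    """Estimated wall time: batches of frequency points run back to back,
    each batch shortened by the multi-threading speed-up."""
    batches = math.ceil(n_points / parallel_points)
    return batches * t_point / MT_SPEEDUP[threads]


def rank_configs(n_points=64):
    """Return (name, estimated time) pairs, fastest first."""
    times = {
        name: discrete_sweep_time(n_points, points, threads)
        for name, (points, threads) in CONFIGS.items()
    }
    return sorted(times.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, time in rank_configs():
        print(f"{name}: {time:.2f} (units of one single-thread frequency solve)")
```

Under these assumptions the configuration without multi-threading comes out fastest, because doubling the threads per point less than halves the time per point.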


HPC: HFSS-Transient Distributed Parallel Solver

• HFSS-Transient Distributed Parallel (HPC-DP)

• Accelerates HFSS-Transient solutions by distributing the excitations across a network of processors

• Increases simulation speed

– Combines with HPC-MT

• Available in HFSS-Transient

[Figure: Excitation 1 to Excitation 4 distributed across processors]


HFSS with HPC

Bigger

Bigger: solver technology targeted at distributing the simulation memory across multiple computers. The distributed nature of the solution may also result in faster simulations, but it is primarily intended to increase capacity.

HFSS DDM (Mesh Based)

HFSS-IE DDM (Matrix Based)

HFSS-Hybrid DDM

HFSS Periodic Domains


HPC: HFSS-DDM (Mesh Based)

• Domain Decomposition Method: Mesh Based

• Distributed memory parallel technique

– Distributes mesh sub-domains to a network of processors/RAM

• Significantly increases simulation capacity

• Highly scalable to large numbers of processors

– Uses industry standard MPI

– Combines with HPC-MT

• Automatic generation of domains by mesh partitioning

– User friendly

– Load balance

• Hybrid iterative & direct solver

– Multi-frontal direct solver for each sub-domain

– Sub-domains exchange information iteratively via Robin’s transmission conditions (RTC)

• Available in HFSS 3D
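The "direct solve per sub-domain, iterative exchange between sub-domains" idea can be illustrated on a toy problem. The sketch below is not the HFSS algorithm: it replaces Robin transmission conditions with a plain exchange of interface values (a block Jacobi iteration) and uses a small, diagonally dominant tridiagonal system split into two sub-domains, each solved directly with the Thomas algorithm. All names and the test matrix are assumptions chosen for illustration.

```python
def thomas_solve(diag, off, rhs):
    """Direct solve of a tridiagonal system with constant diagonal `diag`
    and constant off-diagonals `off` (Thomas algorithm)."""
    n = len(rhs)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (rhs[i] - off * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def ddm_iterate(n=8, diag=4.0, off=-1.0, tol=1e-12, max_iter=200):
    """Solve A x = 1 with two sub-domains that swap interface values."""
    half = n // 2
    b = [1.0] * n
    x = [0.0] * n
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Each sub-domain treats its neighbor's last interface value as known.
        rhs1 = b[:half]
        rhs1[-1] -= off * x[half]
        rhs2 = b[half:]
        rhs2[0] -= off * x[half - 1]
        # Independent direct solves; these could run on separate machines.
        new_x = thomas_solve(diag, off, rhs1) + thomas_solve(diag, off, rhs2)
        change = max(abs(p - q) for p, q in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x, iteration


def max_residual(x, diag=4.0, off=-1.0):
    """Largest entry of |A x - 1| for the full, undivided system."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        ax = diag * x[i]
        if i > 0:
            ax += off * x[i - 1]
        if i < n - 1:
            ax += off * x[i + 1]
        worst = max(worst, abs(ax - 1.0))
    return worst


if __name__ == "__main__":
    solution, iterations = ddm_iterate()
    print("iterations:", iterations, "max residual:", max_residual(solution))
```

Only the two interface values cross the sub-domain boundary in each iteration, which is why the memory for each sub-domain can live on a different machine.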

[Figure: mesh partitioned into Domain 1 to Domain 4]


Domain Decomposition Examples (FEM)

Solution Size    Total RAM (GB)    Elapsed Time (hours)    Distributed Engines
4,861 λ³         160 (DDM)         8                       12
33,750 λ³        300 (DDM)         5                       72
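To see what these two examples mean per machine, the total RAM can be divided by the number of distributed engines. The sketch below reads the values in column order (8 hours on 12 engines, 5 hours on 72 engines) and assumes memory is spread evenly across engines, which the slide does not state. The dictionary keys and function name are illustrative.

```python
# Values read from the two example tables, in column order.
EXAMPLES = [
    {"size_lambda3": 4861, "ram_gb": 160, "hours": 8, "engines": 12},
    {"size_lambda3": 33750, "ram_gb": 300, "hours": 5, "engines": 72},
]


def ram_per_engine(example):
    """Average RAM per distributed engine, assuming an even split."""
    return example["ram_gb"] / example["engines"]


if __name__ == "__main__":
    for example in EXAMPLES:
        print(
            f"{example['size_lambda3']} cubic wavelengths: "
            f"{ram_per_engine(example):.1f} GB per engine on average"
        )
```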


HPC: HFSS-IE DDM (Matrix Based)

• Domain Decomposition Method: Matrix Based

• Distributed memory parallel technique

– Distributes the matrix solution to a network of processors/RAM

• Significantly increases simulation capacity

• Highly scalable to large numbers of machines

– Uses industry standard MPI

– Combines with HPC-MT

• Automatic generation and load balancing of matrix partitions

• Available in HFSS-IE

[Figure: matrix partitioned into Domain 1 to Domain 4; incident wave excitation]

Example at 18 GHz:

Solver             RAM       Elapsed Time
HFSS-IE HPC-DDM    146 GB    7.3 h


HPC: Hybrid HFSS - FEM DDM with IE Regions

[Figure: hybrid model with three regions labeled ❶ FEM-IE and one labeled ❷ IE-Region]

• Domain Decomposition Method for Hybrid Solve

• Extension of HFSS DDM to support the Hybrid FEM/IE solver with IE Regions & FE-BI boundaries

– Distributes mesh sub-domains to network of processors

• FEM volume can be sub-divided into multiple domains

– IE Domains and FE-BI boundaries will be distributed to separate nodes when they become large

• Significantly increases simulation capacity

• Uses Industry Standard MPI

• Available in HFSS 3D with HFSS-IE license

[Figure: FEM volume split into Domain 1 to Domain 3, plus an IE-Domain]


HPC: HFSS-Periodic Domains (Finite Arrays)

• Periodic Domain Decomposition (HPC-PDM)

• Distributed memory parallel technique for finite periodic geometries, such as finite antenna arrays

– Distributes unit cell mesh sub-domains to network of processors/RAM

• Significantly increases simulation capacity

• Highly scalable to large numbers of processors

– Uses industry standard MPI

– Combines with HPC-MT

• Automatic generation of domains

– User friendly and easy to implement

– Efficient simulation of only unique cells

• Available in HFSS 3D

[Figure: finite periodic array split into Domain 1 to Domain 4. Labels: Unit Cell Adaptive Mesh; Finite Array Definition; Linked Mesh: no additional adaptive meshing; Unit Cell Mesh; Finite Periodic Array]


HFSS: HPC-PDM Snowflake Array

529 circular WG elements, 1058 modes. Circularly polarized elements.

Example at 10 GHz:

Solver          RAM      Elapsed Time
HFSS HPC-PDM    62 GB    27 min

[Figure panels: E-field 5 mm above aperture; Array Mask; Composite Excitation]


Analysis Configuration: Manual vs. Automatic

• Automatic Settings of Analysis configurations

• Indicate machines and total number of cores per machine to use in simulations

• Default Settings of Analysis configurations

• Indicate machines, tasks and total number of cores per machine to use in simulations

• Indicate Job Distribution


Multi-level HPC for Speed and Scale

Level 1: Distributed Variations

Level 2: Distributed Memory

32 core DDM per variation

Time for 8 variations, serial: 14:52:57

128 core 'two level', 32 core DDM per variation

Time for 8 variations, four variations in parallel: 3:39:38

~4X faster
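The "~4X" figure can be checked directly from the two elapsed times quoted on this slide. The sketch below converts both to seconds and takes the ratio; the helper names are illustrative.

```python
SERIAL_TIME = "14:52:57"     # 8 variations run one after another, 32 cores each
TWO_LEVEL_TIME = "3:39:38"   # 8 variations, four at a time, 128 cores in total


def hms_to_seconds(text):
    """Convert an H:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(serial=SERIAL_TIME, parallel=TWO_LEVEL_TIME):
    """Ratio of serial to two-level elapsed time."""
    return hms_to_seconds(serial) / hms_to_seconds(parallel)


if __name__ == "__main__":
    print(f"speed-up: {speedup():.2f}x using 4x the cores")
```

The ratio comes out slightly above 4, so distributing the variations costs essentially nothing on top of the per-variation DDM solve in this example.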


Distributed Simulation Technologies Installation

RSM and MPI: manage communications between local and remote computers for HFSS simulations

[Figure: installation options labeled "Use RSM" and "Use MPI"]