Performance Study: Abaqus/Standard 6.8-3

Stan Posey, Director, Industry and Applications Market Development, Panasas, Fremont, CA, USA

Bill Loewe, Ph.D., Sr. Applications Engineer, Panasas, Fremont, CA, USA


Slide 2

Please Keep Confidential Between CSC and Panasas. Please Keep Confidential to Customer and Panasas.

Background on Abaqus/Standard Study

Abaqus is an application from SIMULIA -- not a benchmark kernel

The FEA model and tests are relevant to customer practice

All tests were run on a dedicated system at Panasas

The results were validated by SIMULIA

Since Apr 2007, SIMULIA and Panasas have made joint investments in a business and technical alliance that ensures Abaqus will fully leverage Panasas PanFS

This study demonstrates benefits of the Panasas parallel file system and parallel storage for Abaqus/Standard 6.8-3, with tests for both single-job and multi-job computing

Motivation

Considerations

Slide 3

Abaqus/Standard 6.8-3: Model S4b 5M DOF Non-linear Static Analysis

Automotive engine block cylinder head bolt-up

Panasas Study on Abaqus/Standard 6.8-3

Slide 4

Abaqus/Standard I/O Scheme

CSM implicit solver

Abaqus/Standard is direct and single-step, with out-of-core READS and WRITES

-- I/O occurs in the sparse factor phase of the solver

-- this scheme is for a static solution; for an eigen (Lanczos) solution, I/O can be VERY heavy

-- NOTE: Abaqus also has an implicit iterative solver

Job flow, from start to complete:

Job Task | IO Scheme | IO Operation
Element matrix generation and assembly into global matrix | Work Dir: serial IO | Read nodes, elements and control file
Matrix factor (dominant phase, as much as 85% of total time, often I/O wait) | Scratch Dir: parallel IO | Factor matrix out-of-core, reads/writes
FBS solve phase, stress recovery, multiple RHS's | Work Dir: serial IO | Write solution results [100's of GB of I/O]
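The scheme above separates serial IO in the work directory from parallel IO in the scratch directory, so one way to act on it is to point only the scratch directory at the parallel file system. A minimal sketch, assuming the Abaqus launcher options `job`, `cpus`, `scratch` and `interactive`; the job name and the mount point are hypothetical, and the command is only assembled here, not run:

```python
def build_abaqus_command(job: str, cpus: int, scratch_dir: str) -> list[str]:
    """Assemble an Abaqus command line that places scratch files in scratch_dir.

    Only the scratch location is redirected; the working directory is left
    as whatever directory the command is eventually launched from.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return [
        "abaqus",
        f"job={job}",
        f"cpus={cpus}",
        f"scratch={scratch_dir}",
        "interactive",
    ]


# Hypothetical job name and mount point, for illustration only.
command = build_abaqus_command("s4b", 8, "/panfs/scratch")
print(" ".join(command))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later without shell quoting issues.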

Slide 5

Features of the Hardware System Configurations

Features of Penguin cluster configuration:

Processors: 2.3 GHz QC AMD Opteron
Nodes: 8 x 2 Sockets x 4 cores; 2 GB/core
Interconnect: 10GigE
Local FS: Ext3, single drive per node, 160 GB SATA, 7200 RPM

Features of the Panasas storage system:

3 shelves: 1 director + 10 storage blades
Each shelf 10 TB, total of 30 TB

NOTE: Panasas total 30 TB in 12U, installed and operational in just 1 hr!

[Diagram labels: Cisco Systems; 10 GigE; 8 nodes, 64 cores]

Panasas Study on Abaqus/Standard 6.8-3

Slide 6

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for Single Core: Times for Single Job on a Single Core (1 Job x 1 Core x 1 Node)

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Num Ops | Total Time
PanFS | 12732 | 13674
Local FS | 12770 | 15108

NOTE: Num-Ops times within 1%. Difference is IO

NOTE: PanFS 11% Advantage in Total Time vs. Local FS

Slide 7

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

Numerical vs. IO Computational Profile: Job Profiles of Numerical Ops % vs. IO % (1 Job x 1 Core x 1 Node)

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Total Time | Numerical Operations | IO
PanFS | 13674 | 93% | 7%
Local FS | 15108 | 84% | 16%

[Chart labels: Solver; 97%; 50%]

NOTE: PanFS 11% Advantage in Total Time vs. Local FS
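The profile splits each total time into numerical operations and IO. A minimal sketch of that split, assuming IO time is simply total time minus Num-Ops time and taking the Num-Ops times from the single-core comparison (12732 s for PanFS, 12770 s for Local FS); the function and key names are illustrative:

```python
def io_profile(total_s: float, num_ops_s: float) -> dict[str, float]:
    """Split a job's total time into numerical and IO parts.

    Assumes everything that is not numerical work is IO.
    """
    if not 0 <= num_ops_s <= total_s:
        raise ValueError("num_ops_s must lie between 0 and total_s")
    io_s = total_s - num_ops_s
    return {
        "io_seconds": io_s,
        "num_share": num_ops_s / total_s,
        "io_share": io_s / total_s,
    }


# Times in seconds for 1 Job x 1 Core x 1 Node.
panfs = io_profile(13674, 12732)
local = io_profile(15108, 12770)

for name, profile in (("PanFS", panfs), ("Local FS", local)):
    print(f"{name}: IO {profile['io_seconds']:.0f} s, "
          f"{profile['io_share']:.1%} of total")
```

Under this assumption the IO shares come out near 6.9% and 15.5%, close to the 7% and 16% shown on the chart.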

Slide 8

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for 1 Core x 8 Nodes: Average Times of 8 Simultaneous Jobs (8 Jobs x 1 Core x 8 Nodes)

Average of 8 Jobs | Each on 1 Core | Each on 1 Node | 7 Cores Idle on Each Node

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Num Ops | Total Time
PanFS | 12641 | 14373
Local FS | 12654 | 15064

NOTE: N-Ops times within 1%. Difference is IO

NOTE: PanFS 5% Advantage in Total Time vs. Local FS

Slide 9

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

Numerical vs. IO Computational Profile: Job Profiles of Numerical Ops % vs. IO % (8 Jobs x 1 Core x 8 Nodes)

Average of 8 Jobs | Each on 1 Core | Each on 1 Node | 7 Cores Idle on Each Node

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Total Time | Numerical Operations | IO
PanFS | 14373 | 88% | 12%
Local FS | 15064 | 81% | 19%

[Chart labels: Solver; 97%; 50%]

NOTE: PanFS 5% Advantage in Total Time vs. Local FS

Slide 10

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for 8 Cores x 1 Node: Single Job on Single 8-Core Node (1 Job x 8 Cores x 1 Node)

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Total Time
PanFS | 2946
Local FS | 5495

NOTE: PanFS 46% Advantage in Total Time vs. Local FS

Slide 11

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for Single Job Scaling: Scalability of Single Job from 1 to 8 Cores

5M DOF Engine Block; Total Time in Seconds (lower is better)

Configuration | PanFS | Local FS
1 Job, 1 Core | 13674 | 15108
1 Job, 8 Cores | 2946 | 5495
Speedup on 8 Cores | 4.6x | 2.8x

NOTE: PanFS 58% in Parallel Efficiency vs. 35% for Local FS
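The speedup and parallel efficiency figures follow from the 1-core and 8-core times. A minimal sketch, assuming the usual definitions (speedup is the 1-core time divided by the 8-core time, efficiency is speedup divided by the core count); the function names are illustrative:

```python
def speedup(t_one_core_s: float, t_parallel_s: float) -> float:
    """Ratio of single-core time to parallel time."""
    return t_one_core_s / t_parallel_s


def parallel_efficiency(t_one_core_s: float, t_parallel_s: float, cores: int) -> float:
    """Speedup divided by the number of cores used."""
    return speedup(t_one_core_s, t_parallel_s) / cores


# Total times in seconds: (1 Job x 1 Core, 1 Job x 8 Cores).
times = {"PanFS": (13674, 2946), "Local FS": (15108, 5495)}

for name, (t1, t8) in times.items():
    print(f"{name}: {speedup(t1, t8):.2f}x on 8 cores, "
          f"{parallel_efficiency(t1, t8, 8):.0%} efficiency")
```

With these definitions PanFS comes out at about 4.6x and 58%, while Local FS comes out near 2.7x and 34%, slightly under the rounded 2.8x and 35% on the slide.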

Slide 12

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for 8 Cores x 4 Nodes: Average Times of 4 Simultaneous Jobs (4 Jobs x 8 Cores x 4 Nodes)

Average of 4 Jobs | Each Job on 8 Cores | Each Job on 1 Node Using All 8 Cores

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Total Time
PanFS | 3773
Local FS | 5289

NOTE: PanFS 40% Advantage in Total Time vs. Local FS

Slide 13

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS

S4b Performance for Single vs. Multi-Job: Times of Single 8-way and Multi 8-way Jobs

5M DOF Engine Block; Total Time in Seconds (lower is better)

Case | PanFS | Local FS
1 Job, 8-way (Single Job) | 2946 | 5495
4 Jobs, 8-way (Multi-Job) | 3773 | 5289

NOTE: PanFS degrades 22% for 1 to 4 nodes

NOTE: Local FS about the same for 1 to 4 nodes, each FS on node is independent

NOTE: PanFS 40% Advantage in Total Time vs. Local FS
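The slides do not say which time each percentage is measured against. A small sketch that makes the base explicit; the bases chosen below are an assumption made here because they reproduce the 22%, 40% and 46% labels, not something the slides state:

```python
def relative_difference(a_s: float, b_s: float, base_s: float) -> float:
    """Absolute difference between two times as a fraction of a chosen base."""
    if base_s <= 0:
        raise ValueError("base_s must be positive")
    return abs(a_s - b_s) / base_s


# Total times in seconds for the 8-way jobs.
panfs_single, panfs_multi = 2946, 3773
local_single, local_multi = 5495, 5289

# PanFS slowdown from 1 to 4 jobs, measured against the multi-job time.
degradation = relative_difference(panfs_multi, panfs_single, base_s=panfs_multi)
# Multi-job advantage, measured against the PanFS time.
multi_advantage = relative_difference(local_multi, panfs_multi, base_s=panfs_multi)
# Single-job advantage, measured against the Local FS time.
single_advantage = relative_difference(local_single, panfs_single, base_s=local_single)
# The same PanFS slowdown, measured against the single-job time instead.
degradation_alt = relative_difference(panfs_multi, panfs_single, base_s=panfs_single)

print(f"{degradation:.1%} {multi_advantage:.1%} {single_advantage:.1%} {degradation_alt:.1%}")
```

Measured against the single-job time, the same PanFS slowdown reads as about 28% rather than 22%, so the base matters when comparing percentages across slides.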

Slide 14

Panasas and Intel Abaqus S4b Study

Intel ENDEAVOR Xeon Cluster

Location: Intel HPC Customer Enabling Center, Dupont, WA
Vendor: Intel; 80 nodes; 640 cores; 18 GB memory per node
CPU: Intel Xeon (Nehalem) QC, 2.8 GHz, 8 cores per node
Interconnect: Infiniband
File Systems: Panasas PanFS; Lustre on DDN; Local disk
Operating System: RHEL Linux v5.2

ENDEAVOR File Systems and Storage

PanFS: 2 Shelves AS6000 (1+10 and 2+9), 38 TB FS; network connected through 10GigE switches and IB router, ~ 1.2 GB/s
Lustre: DDN storage, 100 TB FS, ~ 5 GB/s
Local FS: Ext2 FS, 370 GB SATA drive, ~80 MB/s per disk

File System | Test | Write | Read
Panasas | 16 client iozone | 1180 MB/s | 1260 MB/s
DDN/Lustre | 16 client iozone | 5390 MB/s | 3370 MB/s
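The iozone figures are aggregates over 16 clients, while the local disk figure is per disk. A rough sketch that puts them on the same footing, assuming the aggregate bandwidth is shared evenly across the 16 clients (real clients need not see an even share); the names are illustrative:

```python
IOZONE_CLIENTS = 16

# Aggregate iozone bandwidth in MB/s, as listed for each shared file system.
aggregate_mb_s = {
    "Panasas": {"write": 1180, "read": 1260},
    "DDN/Lustre": {"write": 5390, "read": 3370},
}


def per_client(aggregate: float, clients: int = IOZONE_CLIENTS) -> float:
    """Average bandwidth per client under an even-share assumption."""
    if clients < 1:
        raise ValueError("clients must be at least 1")
    return aggregate / clients


per_client_mb_s = {
    fs: {op: per_client(bw) for op, bw in ops.items()}
    for fs, ops in aggregate_mb_s.items()
}

for fs, ops in per_client_mb_s.items():
    print(f"{fs}: {ops['write']:.2f} MB/s write, {ops['read']:.2f} MB/s read per client")
```

These per-client averages can then be set beside the ~80 MB/s quoted for each local disk.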

Slide 15

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2

S4b Performance for Single Job Scaling: Single Job Scalability 8 to 32 Cores; Memory 90%

5M DOF Engine Block; Total Time in Seconds (lower is better)

Number of Cores | PanFS | Local FS
8 | 2613 | 4180
16 | 1268 | 1289
32 | 639 | 574

NOTE: PanFS has a 60% advantage over Local FS for the single-node case, when IO is heavy; the two are in the same range for 2-4 nodes, when the job goes in-memory

Slide 16

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2

S4b Performance for Single Job Scaling: Single Job Scalability on 16 Cores; Memory 90%/70%

5M DOF Engine Block; Total Time in Seconds (lower is better); 16 Cores Each Case

Memory Setting | PanFS | Local FS
Memory 90% | 1268 | 1289
Memory 70% | 1219 | 1219

NOTE: Effect of memory setting

Slide 17

Abaqus/Standard 6.8-3: Comparison of PanFS vs. Local FS Ext2

S4b Performance for Multi-Job Thru-put: Average Times for 8 Jobs, Each 16 Cores; Mem 90% (8 Jobs x 16 Nodes x 128 Cores)

Average Times for 8 Jobs | Each Job on 2 Nodes | Each Job on 16 Cores | Total 128 Cores

5M DOF Engine Block; Total Time in Seconds (lower is better)

File System | Total Time (Average of 8 Jobs)
PanFS | 1294
Local FS | 1360

NOTE: PanFS and Local FS difference ~ 5%
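The average job times can also be read as throughput. A rough sketch, assuming the 8 jobs run concurrently, every batch takes the average time (1294 s on PanFS, 1360 s on Local FS), and a new batch starts as soon as the previous one ends; this steady-state model is an illustration, not something measured in the study:

```python
SECONDS_PER_HOUR = 3600


def jobs_per_hour(concurrent_jobs: int, avg_job_time_s: float) -> float:
    """Steady-state throughput when batches of jobs run back to back."""
    if concurrent_jobs < 1 or avg_job_time_s <= 0:
        raise ValueError("need at least one job and a positive job time")
    return concurrent_jobs * SECONDS_PER_HOUR / avg_job_time_s


# Average total time in seconds for 8 simultaneous 16-core jobs.
panfs_rate = jobs_per_hour(8, 1294)
local_rate = jobs_per_hour(8, 1360)

print(f"PanFS: {panfs_rate:.1f} jobs/hour, Local FS: {local_rate:.1f} jobs/hour")
print(f"Throughput ratio: {panfs_rate / local_rate:.3f}")
```

Under this model the throughput ratio equals the ratio of the average times, which is where the roughly 5% difference shows up.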

Slide 18

Questions

Thank You

For more information, call Panasas at:

1-888-PANASAS (US & Canada)

00 (800) PANASAS2 (UK & France)

00 (800) 787-702 (Italy)

+001 (510) 608-7790 (All Other Countries)