18
1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

Embed Size (px)

Citation preview

Page 1: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

1

Brand-New

Vector Supercomputer

NEC Corporation IT Platform Division

Shintaro MOMOSE

SC13

Page 2: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

New Product

2 2 © NEC Corporation, 2013

NEC Released A Brand-New

Vector Supercomputer, SX-ACE Just Now.

Vector Supercomputer for

Memory Bandwidth Intensive Applications

Sustained performance

Usability

Productivity

Page 3: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

3

Concept

Page 4: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

4 4 © NEC Corporation, 2013 4 4

SX History and Technical Evolutions P

erf

orm

an

ce

1990 2000 2010

SX-2

SX-3

SX-4

SX-5

SX-6

SX-8/8R

SX-9

Bipolar

Water Cooling

Multi node

CMOS

Air Cooling

1 Chip

Vector Processor

3D node

module

Multi-core

All in One Chip

ECO

SX-7

100GF

Processor

NEC has always provided the high sustained

performance by Vector Super-Computer SX series.

ES

ES2

Distributed Parallelization

(MPI-SX)

Support

Over 100 nodes

Cluster

Support >1000 nodes

ECO

Auto Vectorization

Compiler

MPI support

to Multilane IXS

Automatic Parallelization

Function &

SUPER-UX

Page 5: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

1.0E-01

1.0E+00

1.0E+01

1.0E+02

1.0E+03

1.0E+04

1.0E+05

1.0E+06

1.0E+07

1.0E+08

1.0E+09

1.0E+10

2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020

Linpack [TF]Lipack ave. [TF]# of cores# of cores ave.# of nodes# of nodes ave.Core performance [GF]Core perfromance ave. [GF]Frequency [GHz]Frequency ave. [GHz]

Growing of LINPAC performance has been provided by system enlarging User mast spend their time to extract massively parallelism Smaller # of cores with big cores can reduce the difficulty

Frequency [GHz] 115%/year

Exa Flops

Nearly

constant

Increasing

Trend of TOP500 (1st

~ 10th

system)

5 © NEC Corporation, 2013

big

co

re

few

er

co

res

Page 6: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

Required Byte/Flop in Real Applications

According to Japanese Government (MEXT) working group report of wide variety of strategic segment applications, diverse characteristics are observed.

MEXT: Ministry of Education, Culture, Sports, Science & Technology

B/F requirement from each application is differ greatly. Only one architecture cannot cover all application areas.

0.001 0.01 0.1 1 10 100 1000

0.0001

0.001

0.01

0.1

1

10

Required memory capacity [PB] Re

qu

ire

d m

em

ory

ba

nd

wid

th [

Byte

/Flo

p]

Calculation intensive

memory bandwidth intensive

Reference: “Report on Strategic Direction/Development of HPC

in Japan”, March 2012

Structural analysis Fluid dynamics

MD, Weather Cosmo physics Particle physics

Quantum chemistry Nuclear physics

6 © NEC Corporation, 2013

scalar

CPUs

Page 7: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

7 7 © NEC Corporation, 2013

Concepts of SX-ACE

Big Core

Providing higher sustained performance

The highest single core performance: 64GF

The largest single core memory bandwidth: 64~256GB/s

Inherit SX-DNA

Low Power Consumption

Higher power efficiency

Small Installation Space

To reduce floor space cost

1

10

1

5

compared to SX-9

Compared to SX-9

Page 8: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

Providing Big Core

The SX-ACE inherits SX-DNA and overwhelm other CPUs with its

world’s No.1 CPU core performance and word’s No.1 memory

bandwidth.

SX-ACE

Scalar A

Scalar B

Scalar C

64GF

30

24

15

64GB/s

16

6

5

2~4x 4~13x

Single core comparison

peak performance

2~4x performance

compared to competitors

memory bandwidth

4~13x performance

compared to competitors

8 © NEC Corporation, 2013

Page 9: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

9

Architecture

Page 10: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

CPU Architecture (Big Core, Large memory bandwidth)

10 © NEC Corporation, 2013

core core core

RCU

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

MC

crossbar

ADB (Assignable Data Buffer)

SPU VPU

256GB/s

256GB/s

256GB/s

256GB/s

8GB/s x2

8GB/s x2

Memory(DDR3)

Interconnect

Architecture Vector

CORE

Performance 64GFlops

ADB size 1MB

ADB bandwidth 256GB/s

Memory bandwidth 64GB/s~

256GB/s

Memory Byte/Flop 1.0 ~ 4.0

CPU

Cores 4

Performance 256GFlops

Memory bandwidth 256GB/s

Byte/Flop 1.0

Vector Processing Unit

Scalar Processing Unit

Remote access Control Unit

Memory controller

Page 11: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

11

Node card

11cm

37cm

11 © NEC Corporation, 2013

All-in-one Processor

Memory Controller

Very large memory BW

256GB/s BW control

World’s largest BW 256GB/s

SX-ACE CPU

memory

Powerful core

Network controller

I/O Controller

World’s fastest CPU core

64GF x 4cores

1MB ADB/core

8GB/s/direction, Fat-tree

Connection to storage device, Ethernet

4 powerful cores and each controller (memory, network, I/O) are integrated in one-CPU = Power saving

Compact card design = Space saving

Performance: 256GF

Memory bandwidth: 256GB/s

Page 12: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

Node Card

12 (c) NEC Corporation, 2012

37cm

11cm

CPU 4 cores

256GF

256GB/s

Memory 4GB x 16DIMMs

DDR3 2000MHz

Page 13: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

13

System Configuration

13 © NEC Corporation, 2013

Each node is connected with 2 stages full Fat-tree network

Implementing global communication function into HW

SW1 #00

no

de

#000

no

de

#015

SW1 #01

no

de

#016

no

de

#031

SW1 #31

no

de

#496

no

de

#511

SW2 #00 SW2 #15

16 links

16 links

32 links

512 nodes, 2048 cores, 131TFlops, 1B/F

8GB/s /direction

•Fat-tree •HW function of global communication

node

Page 14: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

14

Configuration

14 (c) NEC Corporation, 2013

Node Card

1CPU, 256GF, 256GB/s

2-Node Module

2 nodes = 2 CPUs

16-Node Cage

8 modules = 16 nodes = 16 CPUs

Rack

64 nodes = 16TF, 16TB/s

System

Rack Specifications

16TF, 16TB/s, 64 CPUs

0.75m x 1.5m x 2.0m

30KW

16-Node Cage x4

4 cages = 32 modules = 64 nodes = 64CPUs

Page 15: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

co

ola

nt

pip

e (

inle

t)

co

ola

nt

pip

e (

ou

tle

t)

power junction

rack manager

OS disk

2 nodes 2 nodes

2 nodes 2 nodes

node manager

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

node manager

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

node manager

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

2 nodes 2 nodes

node manager

2 nodes 2 nodes

2 nodes 2 nodes

network switch

42U

15 (c) NEC Corporation, 2013

Detail of Rack Implementation

16-node cage

2-node module

Page 16: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

16 16 © NEC Corporation, 2013

Comparison with same performance (131TF)

Providing 5x smaller space and 10x lower power consumption compared to SX-9 by power saving design and compact implementation.

Downsizing and Power Saving

power 1/10

space 1/5

12m

24m

8m

7m

131TF

288m2

2.4MW

131TF

56m2

0.24MW

SX-9 SX-ACE

25m swimming pool size Meeting room size

80 nodes 512 nodes

Page 17: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

17 (c) NEC Corporation, 2013

NEC’s Exhibitor Forum is Today !

Exhibitor Forum Nov. 19th (Tue), 15:30 – 16:00 Room 501/502 NEC’s Brand-New Vector Supercomputer and HPC Roadmap

Page 18: Brand-New Vector Supercomputer - NEC(Japan)jpn.nec.com/hpc/info/pdf/SC13_NEC.pdf · 1 Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13

©NEC Corporation, 2012 Confidential 18