Interconnect Your Future
russianscdays.org/files/talks16/MKagan_RuSCDays16.pdf


September 2016



The Ever-Growing Demand for Higher Performance

[Timeline: Performance Development, 2000–2020 — Terascale to Petascale ("Roadrunner", the 1st Petascale system) to Exascale; Single-Core to Many-Core; SMP to Clusters]

[Diagram: Co-Design — Hardware (HW), Software (SW), Application (APP)]

The Interconnect is the Enabling Technology


The Intelligent Interconnect to Enable Exascale Performance

CPU-Centric:
• Must wait for the data
• Creates performance bottlenecks
• Limited to main CPU usage
• Results in performance limitation

Co-Design (work on the data as it moves):
• Enables performance and scale
• Creates synergies
• Enables higher performance and scale


ISC’16: Introducing ConnectX-5, the World’s Smartest Adapter


SHArP Performance Advantage

MiniFE is a finite-element mini-application that implements kernels representative of implicit finite-element applications. With SHArP accelerating the AllReduce MPI collective, it shows a 10X to 25X performance improvement!
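For reference, below is a minimal AllReduce in plain C with standard MPI; nothing here is Mellanox-specific, which is the point: SHArP accelerates this same call transparently underneath the MPI library.

/* Minimal sketch of the AllReduce collective that SHArP offloads into the
 * switch fabric. Plain MPI; application code is unchanged when the
 * reduction is accelerated in the network. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = (double)(rank + 1);  /* each rank contributes one value */
    double global = 0.0;

    /* Every rank receives the sum of all contributions. */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %.1f\n", size, global);

    MPI_Finalize();
    return 0;
}

Run with, for example, mpirun -np 4 ./allreduce; on a SHArP-enabled fabric the aggregation tree is built in the switches rather than on the hosts.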


Highest-Performance 100Gb/s Interconnect Solutions

• Adapters: 100Gb/s, 0.6us latency, 200 million messages per second (10 / 25 / 40 / 50 / 56 / 100Gb/s)
• InfiniBand switches: 36 EDR (100Gb/s) ports, <90ns latency, 7.2Tb/s throughput, 7.02 billion msg/sec (195M msg/sec/port)
• Ethernet switches: 32 100GbE ports or 64 25/50GbE ports (10 / 25 / 40 / 50 / 100GbE), 6.4Tb/s throughput
• Transceivers and cables: active optical and copper cables (10 / 25 / 40 / 50 / 56 / 100Gb/s); VCSELs, silicon photonics and copper
• Software: MPI, SHMEM/PGAS, UPC for commercial and open-source applications, leveraging hardware accelerations


The Performance Advantage of EDR 100G InfiniBand (28–80%)

[Chart: application-level performance advantage of EDR 100G InfiniBand, from 28% up to 80%]


Mellanox Connects the World’s Fastest Supercomputer

• #1 on the TOP500 Supercomputing List: National Supercomputing Center in Wuxi, China
• 93 Petaflop performance, 3X higher than #2 on the TOP500
• 40K nodes, 10 million cores, 256 cores per CPU
• Mellanox adapter and switch solutions

The TOP500 list has evolved and now includes HPC as well as Cloud / Web2.0 hyperscale systems:
• Mellanox connects 41.2% of all TOP500 systems
• Mellanox connects 70.4% of the TOP500 HPC platforms
• Mellanox connects 46 Petascale systems, nearly 50% of all Petascale systems

InfiniBand is the interconnect of choice for HPC compute and storage infrastructures.


Machine Learning


GPUDirect Enables an Efficient Training Platform for Deep Neural Networks

[Diagram: CPU-only cluster versus GPU-accelerated nodes]

Training that previously required 1K nodes (16K cores) for 1 week runs on 3 nodes with 3 GPUs for 3 days using Mellanox InfiniBand and GPU-Direct.
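To make the mechanism concrete, here is a hedged sketch in C of CUDA-aware MPI: assuming an MPI library built with CUDA support (e.g. a suitably configured Open MPI) and GPUDirect RDMA enabled on the NIC, a device pointer can be handed straight to MPI.

/* Sketch: sending GPU memory with CUDA-aware MPI; run with exactly two
 * ranks, each with access to a GPU. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    float *d_buf;                        /* device (GPU) memory */
    cudaMalloc((void **)&d_buf, n * sizeof(float));
    cudaMemset(d_buf, 0, n * sizeof(float));

    /* The device pointer goes straight into MPI; with GPUDirect RDMA the
     * NIC reads/writes GPU memory without a host bounce buffer. */
    if (rank == 0)
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}

Without GPUDirect, each side would need an explicit cudaMemcpy staging step through host memory around the MPI call.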


Enable Real-time Decision Making

[Chart: TeraSort execution time in seconds on the Big Sur machine learning platform over an Ethernet network — Intel 10Gb/s vs. Mellanox 10Gb/s vs. Mellanox 40Gb/s; 70% improvement]

[Chart: fraud detection workload, total transaction time in ms (CPU + storage + network + fraud detection algorithm) — existing solution vs. Aerospike with Mellanox + Samsung NVMe; 1.8x more time available for running the fraud detection algorithm]


GPU Computing


Mellanox PeerDirect™ with NVIDIA GPUDirect RDMA

[Chart: 2D stencil benchmark, average time per iteration (us) vs. number of nodes/GPUs — RDMA only vs. RDMA+PeerSync, 27% and 23% faster; up to 102% improvement]


GPU – from Server to Service

• Basic GPU computing
• rCUDA – remote CUDA
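A sketch of what "server to service" means in practice: rCUDA interposes on the CUDA runtime API, so an ordinary CUDA program like the hedged example below runs unmodified while its calls are forwarded over the network (e.g. InfiniBand) to a server that actually hosts the GPU; the transport and server setup are rCUDA configuration details not shown here.

/* Ordinary CUDA runtime calls from C. Under rCUDA the same binary runs
 * unchanged; the client library forwards each call to a remote GPU server. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int ndev = 0;
    cudaGetDeviceCount(&ndev);   /* under rCUDA, this counts remote GPUs */
    printf("visible GPUs: %d\n", ndev);
    if (ndev == 0)
        return 1;

    float *d_buf;
    if (cudaMalloc((void **)&d_buf, 1 << 20) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    /* Allocations, copies, and kernel launches all travel over the
     * network to wherever the GPU actually lives. */
    cudaMemset(d_buf, 0, 1 << 20);
    cudaFree(d_buf);
    return 0;
}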


GPU Virtualization – Benefits

• Flexible resource assignment
• Resource consolidation and management


Storage and Data Access


Storage Technology Evolution

[Chart: access time in microseconds (log scale, 0.1 to 1000) by storage media technology — HD, SSD, NVM]


NVMe Technology


NVMe over Fabrics – RDMA-based networked storage


RDMA-based Remote NVMe Access (NVMe over Fabrics)

[Diagram: an initiator server with a Mellanox NIC connected over 100GbE to a target server with a Mellanox NIC and Micron flash attached over PCIe]

[Chart: read and write operations/sec, local vs. remote access]
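The RDMA mechanism underneath NVMe over Fabrics can be sketched with libibverbs: a buffer registered with the NIC receives a remote key (rkey), and a peer that holds the buffer address and rkey can read or write it directly, with no CPU on the data path. A minimal, hedged C sketch follows (memory registration only; queue-pair setup and the NVMe-oF protocol layer are omitted).

/* Registering a buffer for remote direct access with libibverbs.
 * NVMe over Fabrics builds its data path on this mechanism. */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    size_t len = 4096;
    void *buf = aligned_alloc(4096, len);

    /* Register the buffer so the NIC can read/write it directly. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           len, mr->lkey, mr->rkey);

    /* A remote peer holding (buffer address, rkey) can now issue RDMA
     * READ/WRITE work requests against this memory without involving
     * this host's CPU; connection setup is omitted in this sketch. */

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    free(buf);
    return 0;
}

Compiles against libibverbs (link with -libverbs) and runs on any host with an RDMA-capable NIC.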


BlueField System-on-a-Chip (SoC) Solution

Integration of ConnectX-5 + multicore ARM

State-of-the-art capabilities:
• 10 / 25 / 40 / 50 / 100G Ethernet & InfiniBand
• PCIe Gen3 / Gen4
• Hardware acceleration offloads: RDMA, RoCE, NVMe-oF, RAID

Family of products:
• Range of ARM core counts and I/O ports/speeds
• Price/performance points

Target use cases:
• NVMe flash storage arrays
• Scale-out storage (NVMe over Fabrics)
• Accelerating and virtualizing VNFs
• Open vSwitch (OVS), SDN
• Overlay networking offloads


Scale-Out NVMe Storage System with BlueField – Rack View

[Diagram: storage/NVMf shelf — 8–16 SSD drives on PCIe Gen3/4, DDR4 memory, BMC, and two Ethernet/InfiniBand ports at 10/25/40/50/100G]

[Diagram: rack view — a TOR switch connects compute/storage head nodes and multiple storage/NVMf shelves over the network fabric]


Technology Leadership

[Timeline, 2001–2015: the product stack expands from silicon and boards to systems, cables, software, and optics, culminating in intelligent interconnect processing; milestones include RDMA, RoCE, IO & network virtualization, GPU-Direct, remote GPU (rCUDA), and RDMA NAS]


Strategy: Network-based Computing

[Diagram: network functions move off the CPU into a smart network; network offloads add intelligence to the fabric, freeing computing for applications and increasing datacenter value]


Computing Evolution

[Diagram: computing evolution toward scale-out services]

Thank You
