UCX: An Open Source Framework for HPC Network APIs and Beyond

Presented by: Pavel Shamis / Pasha

ORNL is managed by UT-Battelle for the US Department of Energy




Co-Design Collaboration

Collaborative Effort: Industry, National Laboratories, and Academia

The Next Generation HPC Communication Framework


Challenges

• Performance portability (across various interconnects)
  –  Collaboration between industry and research institutions
     •  …but mostly industry (because they built the hardware)

• Maintenance
  –  Maintaining a network stack is time consuming and expensive
  –  Industry has the resources and the strategic interest for this

• Extendibility
  –  MPI+X+Y?
  –  The exascale programming environment is an ongoing debate


Challenges (CORAL)

Slide source: SC’14, Summit, Bland

How does Summit compare to Titan

Feature | Summit | Titan
Application Performance | 5-10x Titan | Baseline
Number of Nodes | ~3,400 | 18,688
Node performance | > 40 TF | 1.4 TF
Memory per Node | >512 GB (HBM + DDR4) | 38GB (GDDR5+DDR3)
NVRAM per Node | 800 GB | 0
Node Interconnect | NVLink (5-12x PCIe 3) | PCIe 2
System Interconnect (node injection bandwidth) | Dual Rail EDR-IB (23 GB/s) | Gemini (6.4 GB/s)
Interconnect Topology | Non-blocking Fat Tree | 3D Torus
Processors | IBM POWER9, NVIDIA Volta™ | AMD Opteron™, NVIDIA Kepler™
File System | 120 PB, 1 TB/s, GPFS™ | 32 PB, 1 TB/s, Lustre®
Peak power consumption | 10 MW | 9 MW
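
The Summit and Titan figures hint at why the network stack is a challenge: per-node compute grows much faster than node injection bandwidth. The sketch below only does arithmetic on the numbers quoted on this slide. The Summit values are stated as bounds ("~3,400", "> 40 TF"), so the results are rough estimates, and the dictionary and function names are made up for this illustration.

```python
# Figures copied from the Summit vs. Titan comparison on this slide.
# Summit values are projections given as bounds, so every derived
# number here is approximate.
SUMMIT = {"nodes": 3400, "node_tf": 40.0, "inject_gbs": 23.0}
TITAN = {"nodes": 18688, "node_tf": 1.4, "inject_gbs": 6.4}


def aggregate_pf(system):
    """Nodes x per-node TF, converted to PF (1 PF = 1000 TF)."""
    return system["nodes"] * system["node_tf"] / 1000.0


def inject_per_tf(system):
    """Node injection bandwidth per TF of node compute (GB/s per TF)."""
    return system["inject_gbs"] / system["node_tf"]


def summarize():
    return {
        "summit_pf": aggregate_pf(SUMMIT),
        "titan_pf": aggregate_pf(TITAN),
        "node_count_ratio": TITAN["nodes"] / SUMMIT["nodes"],
        "node_perf_ratio": SUMMIT["node_tf"] / TITAN["node_tf"],
        "inject_ratio": SUMMIT["inject_gbs"] / TITAN["inject_gbs"],
        "summit_gbs_per_tf": inject_per_tf(SUMMIT),
        "titan_gbs_per_tf": inject_per_tf(TITAN),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name:>20}: {value:.2f}")
```

With these inputs, node performance grows by roughly 28x while injection bandwidth grows by roughly 3.6x, so the bandwidth available per TF of compute drops by about 8x. That leaves less room for software overhead in the communication path.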


UCX – Unified Communication X Framework

• Unified
  –  Network API for multiple network architectures that target HPC programming models and libraries

• Communication
  –  How to move data from location in memory A to location in memory B, considering multiple types of memories

• Framework
  –  A collection of libraries and utilities for HPC network programmers


History

MXM
●  Developed by Mellanox Technologies
●  HPC communication library for InfiniBand devices and shared memory
●  Primary focus: MPI, PGAS

PAMI
●  Developed by IBM on BG/Q, PERCS, IB VERBS
●  Network devices and shared memory
●  MPI, OpenSHMEM, PGAS, CHARM++, X10
●  C++ components
●  Aggressive multi-threading with contexts
●  Active Messages
●  Non-blocking collectives with hw acceleration support

UCCS
●  Developed by ORNL, UH, UTK
●  Originally based on Open MPI BTL and OPAL layers
●  HPC communication library for InfiniBand, Cray Gemini/Aries, and shared memory
●  Primary focus: OpenSHMEM, PGAS
●  Also supports: MPI

Decades of community and industry experience in development of HPC software


What we are doing differently…

• UCX consolidates multiple industry and academic efforts – Mellanox MXM, IBM PAMI, ORNL/UTK/UH UCCS, etc.

• Supported and maintained by industry –  IBM, Mellanox, NVIDIA, Pathscale


What we are doing differently…

• Co-design effort between national laboratories, academia, and industry

Applications: LAMMPS, NWCHEM, etc.

Programming models: MPI, PGAS/Gasnet, etc.

Middleware:

Driver and Hardware

Co-design


UCX

Applications: MPI, GasNet, PGAS, Task Based Runtimes, I/O

Protocols, Transports, Services

InfiniBand, uGNI, Shared Memory, GPU Memory, Emerging Interconnects


A Collaborative Effort

•  Mellanox co-designs network interface and contributes MXM technology
   –  Infrastructure, transport, shared memory, protocols, integration with OpenMPI/SHMEM, MPICH

•  ORNL co-designs network interface and contributes UCCS project
   –  InfiniBand optimizations, Cray devices, shared memory

•  NVIDIA co-designs high-quality support for GPU devices
   –  GPUDirect, GDR copy, etc.

•  IBM co-designs network interface and contributes ideas and concepts from PAMI

•  UH/UTK focus on integration with their research platforms


Licensing

• Open Source
  –  BSD 3 Clause license
  –  Contributor License Agreement – BSD 3 based


UCX Framework Mission

•  Collaboration between industry, laboratories, and academia

•  Create open-source production grade communication framework for HPC applications

•  Enable the highest performance through co-design of software-hardware interfaces

•  Unify industry - national laboratories - academia efforts

Performance oriented: Optimization for low-software overheads in communication path allows near native-level performance

Community driven: Collaboration between industry, laboratories, and academia

Production quality: Developed, maintained, tested, and used by industry and researcher community

API: Exposes broad semantics that target data centric and HPC programming models and applications

Research: The framework concepts and ideas are driven by research in academia, laboratories, and industry

Cross platform: Support for InfiniBand, Cray, various shared memory (x86-64 and Power), GPUs

Co-design of Exascale Network APIs


Architecture


UCX Framework

UC-S for Services
This framework provides basic infrastructure for component-based programming, memory management, and useful system utilities.
Functionality: Platform abstractions, data structures, debug facilities.

UC-T for Transport
Low-level API that exposes basic network operations supported by the underlying hardware. Reliable, out-of-order delivery.
Functionality: Setup and instantiation of communication operations.

UC-P for Protocols
High-level API that uses the UCT framework to construct protocols commonly found in applications.
Functionality: Multi-rail, device selection, pending queue, rendezvous, tag-matching, software-atomics, etc.
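
To make the split between the two layers concrete, here is a minimal sketch of a protocol layer built on top of a transport that is reliable but may deliver out of order, as the UC-T description states. It is a toy model and not the UCX API: `ToyTransport`, `send_message`, `receive_messages`, and the `(msg_id, index, total)` header are hypothetical names invented for this illustration.

```python
import random


class ToyTransport:
    """Stand-in for a UC-T-like layer: reliable, but may reorder packets."""

    def __init__(self, max_payload, seed=0):
        self.max_payload = max_payload
        self._in_flight = []
        self._rng = random.Random(seed)

    def send(self, packet):
        header, payload = packet
        if len(payload) > self.max_payload:
            raise ValueError("payload exceeds transport limit")
        self._in_flight.append(packet)

    def deliver_all(self):
        """Hand every packet to the receiver exactly once, in shuffled order."""
        packets, self._in_flight = self._in_flight, []
        self._rng.shuffle(packets)
        return packets


def send_message(transport, msg_id, data):
    """UC-P-like send: split data into fragments the transport can carry."""
    step = transport.max_payload
    total = max(1, -(-len(data) // step))  # ceiling division, at least one fragment
    for index in range(total):
        chunk = data[index * step:(index + 1) * step]
        transport.send(((msg_id, index, total), chunk))


def receive_messages(packets):
    """UC-P-like receive: reassemble fragments regardless of arrival order."""
    partial = {}
    complete = {}
    for (msg_id, index, total), chunk in packets:
        slots = partial.setdefault(msg_id, [None] * total)
        slots[index] = chunk
        if all(slot is not None for slot in slots):
            complete[msg_id] = b"".join(partial.pop(msg_id))
    return complete


if __name__ == "__main__":
    link = ToyTransport(max_payload=4)
    send_message(link, msg_id=1, data=b"hello, world")
    send_message(link, msg_id=2, data=b"ucx")
    print(receive_messages(link.deliver_all()))
```

The point of the sketch is the division of labor: the transport only moves bounded packets, while fragmentation and reassembly live in the layer above it.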


A High-level Overview

Applications
•  MPICH, Open-MPI, etc.
•  OpenSHMEM, UPC, CAF, X10, Chapel, etc.
•  Parsec, OCR, Legion, etc.
•  Burst buffer, ADIOS, etc.

UCX

UC-P (Protocols) - High Level API
Transport selection, cross-transport multi-rail, fragmentation, operations not supported by hardware
•  Message Passing API Domain: tag matching, rendezvous
•  PGAS API Domain: RMAs, Atomics
•  Task Based API Domain: Active Messages
•  I/O API Domain: Stream

UC-T (Hardware Transports) - Low Level API
RMA, Atomic, Tag-matching, Send/Recv, Active Message
•  Transport for InfiniBand VERBs driver: RC, UD, XRC, DCT
•  Transport for intra-node host memory communication: SYSV, POSIX, KNEM, CMA, XPMEM
•  Transport for Accelerator Memory communication: GPU
•  Transport for Gemini/Aries drivers: GNI

UC-S (Services)
Common utilities
•  Utilities
•  Data structures
•  Memory Management

Hardware
OFA Verbs Driver, Cray Driver, OS Kernel, Cuda
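
Tag matching, listed under the Message Passing API domain, is commonly implemented with two queues: receives that have been posted but not yet matched, and messages that arrived before any matching receive. The sketch below shows that general technique only. It does not reproduce UCX's implementation, and `TagMatcher`, `ANY_TAG`, and the callback signature are hypothetical names chosen for this example.

```python
from collections import deque

ANY_TAG = None  # hypothetical wildcard, similar in spirit to MPI_ANY_TAG


class TagMatcher:
    def __init__(self):
        self.posted = deque()      # receives waiting for a message: (tag, callback)
        self.unexpected = deque()  # messages waiting for a receive: (tag, payload)

    @staticmethod
    def _matches(recv_tag, msg_tag):
        return recv_tag is ANY_TAG or recv_tag == msg_tag

    def post_recv(self, tag, callback):
        """Post a receive; complete at once if a matching message already arrived."""
        for i, (msg_tag, payload) in enumerate(self.unexpected):
            if self._matches(tag, msg_tag):
                del self.unexpected[i]
                callback(msg_tag, payload)
                return
        self.posted.append((tag, callback))

    def on_message(self, msg_tag, payload):
        """Handle an incoming message; match the oldest compatible posted receive."""
        for i, (recv_tag, callback) in enumerate(self.posted):
            if self._matches(recv_tag, msg_tag):
                del self.posted[i]
                callback(msg_tag, payload)
                return
        self.unexpected.append((msg_tag, payload))


if __name__ == "__main__":
    matcher = TagMatcher()
    matcher.on_message(5, b"arrived early")          # goes to the unexpected queue
    matcher.post_recv(5, lambda t, p: print(t, p))   # matches it immediately
```

Both queues are scanned oldest-first, which preserves the ordering that message-passing libraries typically expect between messages carrying the same tag.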


Preliminary Evaluation (UCT)

•  Two HP ProLiant DL380p Gen8 servers

•  Intel Xeon E5-2697 2.7GHz CPUs

•  Mellanox SX6036 switch

•  Single-port Mellanox Connect-IB FDR (10.10.5056)

•  Mellanox OFED 2.4-1.0.4. (VERBS)

•  Prototype implementation of Accelerated VERBS (AVERBS)


OpenSHMEM and OSHMEM (OpenMPI) Put Latency (shared memory)

[Figure: put latency (usec, log scale) vs. message size (8 to 4MB), intranode. Series: OpenSHMEM-UCX, OpenSHMEM-UCCS, OSHMEM. Lower is better.]


OpenSHMEM and OSHMEM (OpenMPI) Put Injection Rate

[Figure: message rate (put operations / second) vs. message size (8 to 4KB) on Connect-IB. Series: OpenSHMEM-UCX (mlx5), OpenSHMEM-UCCS (mlx5), OSHMEM (mlx5), OSHMEM-UCX (mlx5). Higher is better.]


OpenSHMEM and OSHMEM (OpenMPI) GUPs Benchmark

[Figure: GUPS (billion updates per second) vs. number of PEs (2 to 16, two nodes) on Connect-IB. Series: UCX (mlx5), OSHMEM (mlx5). Higher is better.]
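
For readers unfamiliar with the metric, GUPS counts random read-modify-write updates to a large table, reported in billions per second. The sketch below is a single-process illustration that follows the general shape of the HPCC RandomAccess kernel (XOR updates at pseudo-random indices). It is not the OpenSHMEM benchmark behind this slide, where updates target tables on remote PEs. The function names and default sizes are assumptions made for this example.

```python
import random
import time


def run_updates(table, n_updates, seed):
    """Apply n_updates XOR updates at pseudo-random table locations."""
    rng = random.Random(seed)
    mask = len(table) - 1  # table length must be a power of two
    for _ in range(n_updates):
        value = rng.getrandbits(64)
        table[value & mask] ^= value


def measure_gups(log2_size=16, n_updates=200_000, seed=1):
    """Return (giga-updates per second, updated table) for one timed run."""
    table = list(range(1 << log2_size))
    start = time.perf_counter()
    run_updates(table, n_updates, seed)
    elapsed = time.perf_counter() - start
    return n_updates / elapsed / 1e9, table


if __name__ == "__main__":
    gups, _ = measure_gups()
    print(f"{gups:.6f} GUPS")
```

Because XOR is its own inverse, replaying the same update stream restores the original table, which gives a cheap correctness check. On the scale used in the figure, 0.001 GUPS corresponds to one million updates per second.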


MPICH - Message rate – Preliminary Results

[Figure: message rate (MMPS) vs. message size (1 to 4M) for “non-blocking tag-send” on Connect-IB. Series: MPICH/UCX, MPICH/MXM.]

Slide courtesy of Pavan Balaji, ANL - sent to the ucx mailing list


Where is UCX being used?

• Upcoming release of Open MPI 2.0
• Upcoming release of MPICH
• OpenSHMEM reference implementation
• PARSEC – runtime used on Scientific Linear Libraries


What Next?

•  UCX Consortium!
   –  http://www.csm.ornl.gov/newsite/

•  UCX Specification
   –  Early draft is available online: http://www.openucx.org/early-draft-of-ucx-specification-is-here/

•  Production releases
   –  MPICH, Open MPI, Open SHMEM(s), Gasnet, and more…

•  Support for more networks and applications and libraries

•  UCX Hackathon 2016!
   –  Will be announced on the mailing list and website


https://github.com/orgs/openucx
WEB: www.openucx.org
Contact: [email protected]
Mailing List: https://elist.ornl.gov/mailman/listinfo/ucx-group [email protected]


Acknowledgments


• Thanks to all our partners!


Questions?

Unified Communication - X Framework
WEB: www.openucx.org
Contact: [email protected]
WEB: https://github.com/orgs/openucx
Mailing List: https://elist.ornl.gov/mailman/listinfo/ucx-group [email protected]