Advanced Storage Technologies for High Performance Computing

Sorin Faibish, EMC NAS Senior Technologist
IDC HPC User Forum, April 14-16, Norfolk, VA


Page 1: Advanced Storage Technologies for High Performance Computing

Advanced Storage Technologies for High Performance Computing

Sorin Faibish, EMC NAS Senior Technologist
IDC HPC User Forum, April 14-16, Norfolk, VA

Page 2: Advanced Storage Technologies for High Performance Computing


IDC HPC User Forum 2008

New HPC Storage Intensive Applications

Storage Challenges*

New algorithms that can scale to search and process massive datasets;

New metadata management of distributed data sources;

New platforms that provide uniform high-speed memory access to multi-terabyte data structures;

Hybrid interconnect architectures to process and filter multi-gigabyte data streams from scientific instruments;

High-performance, high-reliability, petascale distributed file systems;

New approaches to software mobility, so that algorithms can execute on nodes where the data resides;

Flexible and high-performance software integration technologies running on diverse computing platforms;

Data signature generation techniques for data reduction and rapid processing.

*Computer Magazine: http://www.computer.org/portal/cms_docs_computer/computer/homepage/0408/R4gei.pdf

Page 3: Advanced Storage Technologies for High Performance Computing


New Storage Technologies for HPC

Storage Technologies

Virtualization to address the multi-core problem

CDP and memory snapshots to address storage failures during computation

DR and distributed cache appliances to address computation across geographies

SSD disk technology to address Data Intensive Super Computing tasks as well as decrease power consumption of storage

pNFS and RDMA technologies to increase the I/O speeds and reduce computation cycles

Storage at Previous HPC User Forum

Page 4: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

Current Implementation
– Application split on multiple single-core SMP HW
– Use middleware SW (Platform)

Page 5: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

Dual-core support added
– Application modified to support SMP dual core
– CPU used: 4x 100% (100%)
– Licenses paid: 4
– Licenses used: 4

Page 6: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

Quad-core chips appear
– CPU used: 4x 100% (4/8=50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or

Page 7: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

Quad-core chips appear
– CPU used: 4x 100% (50%)
– Licenses paid: 8
– Licenses used: 4
– Application must be modified or
– Use VM with CPU affinity
– CPU used: 8x80% (80%)
– Licenses used: 8
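The percentages on these slides follow from simple arithmetic: busy cores times per-core load, divided by the number of cores paid for. The sketch below reproduces the three figures quoted so far. The function name and the scenario labels are illustrative assumptions, not part of the presentation.

```python
def utilization(busy_cores: int, load_per_core: float, total_cores: int) -> float:
    """Fraction of the total (licensed) core capacity that is doing work."""
    return busy_cores * load_per_core / total_cores


# (busy cores, load per busy core, cores/licenses paid for), as quoted on the slides.
scenarios = {
    "dual-core, SMP-aware application": (4, 1.00, 4),
    "quad-core, unmodified application": (4, 1.00, 8),
    "quad-core, VMs with CPU affinity": (8, 0.80, 8),
}

for name, (busy, load, total) in scenarios.items():
    print(f"{name}: {utilization(busy, load, total):.0%}")
```

Under these numbers, the unmodified application leaves half of the paid licenses idle, while the VM approach uses all eight at a lower per-core load.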

Page 8: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

N-core chips are coming
– Use VM with VT support
– CPU used: 2xNx90% (90%)
– Licenses paid=used: 2xN

Page 9: Advanced Storage Technologies for High Performance Computing


New Concept – Better Utilization of multi-cores

Core-agnostic middleware will work with as many cores as are available

– Enabled by pNFS access to shared storage

Page 10: Advanced Storage Technologies for High Performance Computing


CDP + Memory Snapshots in HPC applications

[Diagram labels: SAN; Sun, IBM, HP, HDS, EMC; HPC Application platform support; CDP Journal + Memory Snapshots; CDP Appliance]

CDP Technology will work with Real and Virtual Infrastructures
– VM snapshots on central storage repository
– VM and HW hosts memory snapshots
– Any SAN or NAS storage
– Recover HPC job at any point in time (last minute failure after 2 weeks run)
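The "recover at any point in time" idea can be illustrated with a toy journal that logs every write with a timestamp and replays the log up to a chosen moment. This is only a sketch of the general continuous data protection concept. The class `CdpJournal` and its methods are hypothetical and do not describe any EMC product or appliance interface.

```python
from dataclasses import dataclass, field


@dataclass
class CdpJournal:
    """Toy CDP journal: every write is logged together with its timestamp."""
    entries: list = field(default_factory=list)  # (timestamp, block_id, data), in time order

    def record_write(self, timestamp: float, block_id: int, data: bytes) -> None:
        self.entries.append((timestamp, block_id, data))

    def restore(self, point_in_time: float) -> dict:
        """Rebuild the block image as it looked at point_in_time by replaying the journal."""
        image = {}
        for timestamp, block_id, data in self.entries:
            if timestamp > point_in_time:
                break  # entries are time-ordered, so later writes are skipped
            image[block_id] = data
        return image


journal = CdpJournal()
journal.record_write(1.0, 0, b"step-1")
journal.record_write(2.0, 1, b"step-2")
journal.record_write(3.0, 0, b"corrupt")  # the failure happens here

# Roll back to just before the failing write instead of restarting the job.
state_before_failure = journal.restore(point_in_time=2.5)
print(state_before_failure)
```

In this model a job that fails late in a long run can resume from the last good state rather than from the beginning, which is the scenario the slide describes.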

Page 11: Advanced Storage Technologies for High Performance Computing


Continuous Remote Replication in HPC

[Diagram labels: Site A and Site B, each with SAN, heterogeneous storage (Sun, IBM, HP, HDS, EMC) and a Cache Appliance; HPC Application platform support; HPC Application remote platform; Heterogeneous Blades; VM+HW]

Distributed cache engines allow distributed access to shared storage

– Remote Compute Nodes accessing the shared storage

Page 12: Advanced Storage Technologies for High Performance Computing


SSD Disks in HPC applications

Solid State Disks will replace Disk Drives
– Today HPC workloads are mostly compute intensive
– Data Intensive Super Computing (DISC) applications start to appear (see: IEEE Computer Magazine, April 2008)
– SSD will balance performance between DISC and compute intensive HPC applications
– EMC DMX has SSD today (25 SSD = 800K iops or 5 GB/sec)

[Diagram labels: SAN; EMC; HPC Application platform support; DMX + SSD]
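As a rough check on the DMX figure, the quoted array totals can be divided by the drive count. The sketch assumes the load spreads evenly over the 25 drives and that 1 GB means 1000 MB. Neither assumption is stated on the slide.

```python
SSD_COUNT = 25
ARRAY_IOPS = 800_000        # "800K iops" from the slide
ARRAY_BANDWIDTH_GB_S = 5.0  # "5 GB/sec" from the slide

# Assumption: even distribution across drives, decimal units (1 GB = 1000 MB).
iops_per_ssd = ARRAY_IOPS / SSD_COUNT
mb_s_per_ssd = ARRAY_BANDWIDTH_GB_S * 1000 / SSD_COUNT

print(f"{iops_per_ssd:,.0f} IOPS and {mb_s_per_ssd:.0f} MB/s per SSD")
```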

[Chart: Performance Normalized to Cost. Y-axis: Price/Performance (0 to 2); X-axis: ext2, ext3, Reiserfs, DualFS]

Page 13: Advanced Storage Technologies for High Performance Computing


pNFS addresses the storage access issues

– Removes the server layer between Compute Engines (CE) and shared storage

– Separates metadata (MD) traffic from data traffic

– Asymmetric storage architectures increase scalability

– SSD increases I/O speed

[Diagram: HPC Architecture. Labels: HPC Jobs; Middleware; Compute Engines; Connectivity; NFS Servers; pNFS; Connectivity; SSD Storage; "Storage must be Networked"]

pNFS will deliver very high I/O speeds to HPC

Page 14: Advanced Storage Technologies for High Performance Computing


Metadata (MD) traffic is directed to the single MD server

Data is served by storage servers or storage arrays directly from host to storage

Storage access controlled by iSCSI

I/O to native IB or 10G storage redirected via RDMA in HW
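The split between control path and data path described above can be sketched as a toy model: the client asks the metadata server only for a layout, then fetches the stripes from the data servers directly and in parallel. In real pNFS (NFSv4.1) the layout is obtained with the LAYOUTGET operation. Everything else here (the class names, round-robin striping, the in-memory "servers") is a simplifying assumption for illustration and involves no iSCSI or RDMA.

```python
from concurrent.futures import ThreadPoolExecutor


class DataServer:
    """Holds stripes; stands in for a storage server or array."""
    def __init__(self):
        self.stripes = {}

    def write(self, key, chunk):
        self.stripes[key] = chunk

    def read(self, key):
        return self.stripes[key]


class MetadataServer:
    """Hands out layouts only; file data never passes through it."""
    def __init__(self, data_servers, stripe_size):
        self.data_servers = data_servers
        self.stripe_size = stripe_size
        self.sizes = {}

    def layout(self, name):
        """Return [(server_index, stripe_number), ...] using round-robin striping."""
        count = -(-self.sizes[name] // self.stripe_size)  # ceiling division
        return [(i % len(self.data_servers), i) for i in range(count)]


def write_file(mds, name, payload):
    mds.sizes[name] = len(payload)
    for server_idx, stripe in mds.layout(name):
        start = stripe * mds.stripe_size
        chunk = payload[start:start + mds.stripe_size]
        mds.data_servers[server_idx].write((name, stripe), chunk)


def read_file(mds, name):
    layout = mds.layout(name)  # control path: one metadata request

    def fetch(entry):
        server_idx, stripe = entry
        return mds.data_servers[server_idx].read((name, stripe))  # data path: direct

    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(fetch, layout))  # map preserves stripe order


servers = [DataServer() for _ in range(3)]
mds = MetadataServer(servers, stripe_size=4)
write_file(mds, "job.out", b"0123456789abcdef!")
print(read_file(mds, "job.out"))
```

Because the metadata server returns only the layout, adding data servers increases aggregate bandwidth without putting more load on the metadata path, which is the scalability argument the slides make.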

[Diagram labels: iSCSI (iSER); NFS (pNFS); Storage array; NFS/pNFS; File systems; Data path; Control path; Native IB Storage Array Cache; MetaData Cache; CE Cache; RDMA]

pNFS with InfiniBand RDMA value added to HPC
