Dell EMC HPC Storage with Lustre
China Lustre User Group 2019
Forrest Ling
Standing Committee Member of CCF TCHPC
Dell EMC Greater China
2019.10.15
© Copyright 2019 Dell Inc.
HPC Hotspot : Open Source and HPC Storage
Source: "One Linux Stack To Rule HPC And AI", Nov 26, 2018
[Chart: percentage (%) share by operating system (Linux, Unix, N/A), 1995 to 2015]
HPC Storage at University of Cambridge with 24x Dell R740xd Servers
*Source: Matt Rásó-Barnett, University of Cambridge, LAD’19
HPC Key Components
Compute
Storage + Protection
Software Services
Networks
DELL EMC HPC STORAGE PORTFOLIO
POWERFUL PERFORMANCE | EFFICIENCY | SCALABILITY
Tiers: Tier Zero | High performance and flash (Tier One – Tier Two) | Archival (Tier Three)
• Tier Zero, Data Accelerator (concept): NVMe-based servers with a parallel file system for checkpointing and burst buffer workloads.
• HPC Lustre Storage: Lustre storage starting from 230 TB per object storage server (OSS) pair, with 22 GB/s of write and 22 GB/s of read throughput. Scale out performance and capacity with additional OSS pairs, up to 1 PB/s file read and write and 1 exabyte of capacity.
• HPC NFS Storage: High-availability storage system with up to 1 PB of storage capacity, 5 GB/s file write and 7 GB/s file read.
• Isilon: Scale-out NAS storage to store, protect and analyze unstructured data.
• Elastic Cloud Storage (ECS): All the benefits of a public cloud while keeping cost under control.
Designed to scale to an exabyte of capacity, with a petabyte per second of throughput
LUSTRE SCALABLE BUILDING BLOCK
DELL EMC HPC LUSTRE STORAGE SOLUTION
• Single file system namespace scalable to high capacities and performance
• Engineered by Dell HPC Engineering to provide maximum throughput per building block with on-the-fly storage expansion
• Solution design for Big Data workloads using Intel Hadoop Adapter for Lustre (HAL)
• Dell Networking 10/25/100GbE, InfiniBand, or Omni-Path Architecture
• ~22GB/s Read and Write per building block
[Diagram labels: MDS (metadata server), OSS (object storage server)]
Lustre Storage scalability
[Chart axes: Capacity (TB), Throughput (GB/s)]

Configuration                   S           M           L           L+S         L+M         2*L
Total U                         18U         23U         33U         42U         47U         58U
No. of ME4084                   1           2           4           5           6           8
Estimated usable space (1), 7.2K RPM NL SAS HDD:
  4 TB HDD                      231 TiB     461 TiB     922 TiB     1153 TiB    1383 TiB    1844 TiB
  8 TB HDD                      461 TiB     922 TiB     1844 TiB    2305 TiB    2766 TiB    3688 TiB
  10 TB HDD                     576 TiB     1152 TiB    2305 TiB    2881 TiB    3458 TiB    4610 TiB
  12 TB HDD                     691 TiB     1383 TiB    2766 TiB    3458 TiB    4149 TiB    5532 TiB
Peak read performance (4)       ≈ 5.6 GB/s  ≈ 11.3 GB/s 22.56 GB/s  ≈ 28.2 GB/s ≈ 33.8 GB/s ≈ 45.1 GB/s
Peak write performance (4)      ≈ 5.3 GB/s  ≈ 10.6 GB/s 21.27 GB/s  ≈ 26.6 GB/s ≈ 31.9 GB/s ≈ 42.5 GB/s
Sustained performance (2)(4)    ≈ 5 GB/s    ≈ 10 GB/s   ≈ 20 GB/s   ≈ 25 GB/s   ≈ 30 GB/s   ≈ 40 GB/s

(1) Estimated Lustre usable space in TiB ≈ 0.99 * #Arrays * 80 * 0.8 * HDD size in TB * 10^12/2^40
(2) Sustained performance (steady-state performance over a longer period of time/thread counts after the peak is attained) of this solution is very similar for read and write.
(3) Depending on customer’s power and weight DC restrictions.
(4) Performance for the L config is measured. The performance numbers for the rest of the configurations are an estimation/extrapolation based on the L config.
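The usable-space estimate in footnote 1 can be checked with a few lines of Python. This is a minimal sketch: the numeric factors are copied from the footnote, but the constant names and the meaning attached to each factor in the comments are assumptions, since the slide does not say what 0.99, 80 and 0.8 stand for. The function name `estimated_usable_tib` is hypothetical.

```python
# Factors copied verbatim from footnote 1. The interpretations in the
# comments are assumptions; the slide does not explain them.
FS_FACTOR = 0.99         # leading 0.99 factor (assumed: file system overhead)
DRIVES_PER_ARRAY = 80    # the "80" factor (assumed: data drives per ME4084)
RAID_FACTOR = 0.8        # the "0.8" factor (assumed: parity overhead)
TB_TO_TIB = 10**12 / 2**40  # decimal terabytes to binary tebibytes


def estimated_usable_tib(num_arrays: int, hdd_size_tb: float) -> float:
    """Estimated Lustre usable space in TiB, following footnote 1."""
    return (FS_FACTOR * num_arrays * DRIVES_PER_ARRAY
            * RAID_FACTOR * hdd_size_tb * TB_TO_TIB)


if __name__ == "__main__":
    # One row per array count, one value per drive size, as in the table.
    for arrays in (1, 2, 4, 5, 6, 8):
        row = [round(estimated_usable_tib(arrays, size)) for size in (4, 8, 10, 12)]
        print(arrays, row)
```

The printed values can be compared against the capacity rows of the table; a few table entries may differ by 1 TiB depending on how the original figures were rounded.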
Base configurations: S, M, L
Examples of scaling: L+S (3), L+M, 2*L
[Diagram labels: (MDS), (OSS)]
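Footnote 4 states that only the L configuration was measured and the other figures are extrapolated. The sketch below assumes that the extrapolation is linear in the number of ME4084 arrays, which the slide does not state explicitly. The mapping of configuration names to array counts is inferred from the order of the table columns, and the names `CONFIG_ARRAYS` and `extrapolate_gbps` are hypothetical.

```python
# Measured peak figures for the L configuration (4 arrays), from the table.
L_ARRAYS = 4
L_PEAK_READ_GBPS = 22.56
L_PEAK_WRITE_GBPS = 21.27

# Array counts per configuration. The name-to-count mapping is an
# assumption inferred from the column order of the scalability table.
CONFIG_ARRAYS = {"S": 1, "M": 2, "L": 4, "L+S": 5, "L+M": 6, "2*L": 8}


def extrapolate_gbps(measured_gbps: float, num_arrays: int) -> float:
    """Scale the measured L-config figure linearly by array count.

    Linear scaling is an assumption of this sketch, not a stated method.
    """
    return measured_gbps * num_arrays / L_ARRAYS


if __name__ == "__main__":
    for name, arrays in CONFIG_ARRAYS.items():
        read = extrapolate_gbps(L_PEAK_READ_GBPS, arrays)
        write = extrapolate_gbps(L_PEAK_WRITE_GBPS, arrays)
        print(f"{name:>4}: read ~{read:.2f} GB/s, write ~{write:.2f} GB/s")
```

Under this assumption the results agree with the estimated peak read and write values in the table to within rounding.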
Lustre with ZFS+JBOD Reference Architecture
PowerVault ME484, R740, CentOS 7.6, IB EDR CX5, Lustre Community Edition 2.12, ZFS 0.7.9 and IML 5
• IML: Dell EMC PowerEdge R640
• MDS1 and MDS2: Dell EMC PowerEdge R740, Active/Active HA, with 12 Gbps SAS failover connections to a Dell PowerVault MD1420 (second MD1420 optional)
• OSS1 and OSS2: Dell EMC PowerEdge R740, Active/Active HA, with 12 Gbps SAS failover connections to 4x Dell EMC PowerVault ME484
Data Protection and Backup Appliances
Protect your IT environment and the business value of your data. Industry-leading Dell EMC data protection appliances include cloud-enabled protection storage, integrated appliances and software-defined solutions.
Dell EMC PowerEdge HPC Server Nodes
powerful performance | density | efficiency
Compute - Purpose built for HPC
• C6420: Maximizes density, scalability, and energy efficiency per U for high-performance hyperscale workloads
• C4140: 2-socket, ultra-dense, 4-GPU rack server
Infrastructure and I/O
• R640/R440: Ideal combination for dense scale-out data center computing and storage in a 1U/2S platform
• R740, R740xd: Ideal for applications requiring best-in-class storage performance, high scalability, and density. Supports up to 3 double-wide GPUs.
Modular
• Dell EMC Blades: Dense compute and optimal memory throughput for demanding HPC workloads
Large memory
• R840/R940/R940xa: Ideal for mission-critical applications and real-time data and analytics
Additional options: DellEMC.com/servers
Redesigned and Reimagined with 2nd Gen AMD EPYC processors
One-Socket Rack Servers Powered by AMD
One-socket (single CPU) rack servers offer a cost-effective balance of performance and storage capacity to make IT easy. PowerEdge rack servers powered by AMD provide dual-socket performance in a single-socket 1U rack design.
• PowerEdge R6515
• PowerEdge R7515
• PowerEdge R6415
• PowerEdge R7415
Two-Socket Rack Servers Powered by AMD
Two-socket (dual CPU) rack servers offer a wide variety of features to accommodate more demanding workloads. PowerEdge servers powered by AMD deliver breakthrough performance, scalability, and outstanding TCO.
• PowerEdge R6525
• PowerEdge C6525
• PowerEdge R7425
Dell EMC GPU Servers
• PowerEdge R940xa (4U Server): Up to 4 GPUs per node
• PowerEdge C4140 (1U Server): Up to 4 GPUs per node
• PowerEdge T640 (Tower): Up to 4 GPUs per node
• PowerEdge R740 / R740xd (2U Server): Up to 3 GPUs per node
• PowerEdge DSS8440 (4U Server): Up to 10 GPUs per node
BCM Supports Lustre
OpenHPC Supports Lustre
Converged Platform for HPC/Big Data/AI: Magpie
Magpie Supports Lustre
HPC Solutions Support
• Deployment: ProDeploy for HPC
• Support, asset level: ProSupport or ProSupport Plus
• Support, solution level: ProSupport Add-on for HPC (HPC Add-on: Individual nodes; HPC Add-on: Storage; HPC Add-on: M1000e)
• Supported Hardware and Software Technology and Local Services Partners: Cluster Management Software, Storage, Networking, Operating System, Server, Local Partners
Thanks!